WO2010022661A1 - Audio encoding and decoding method, apparatus, and system - Google Patents

Audio encoding and decoding method, apparatus, and system

Info

Publication number
WO2010022661A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
signal
audio signal
harmonic
domain envelope
Prior art date
Application number
PCT/CN2009/073559
Other languages
English (en)
French (fr)
Inventor
张德明
李海婷
张立斌
克鲁格·霍克
凯瑟·本特
瓦里·皮特
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2010022661A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Definitions

  • Audio encoding and decoding method, apparatus, and system
  • The present application claims priority to Chinese patent application No. 200810119170.6, filed on August 28, 2008 and entitled "Audio encoding and decoding method, apparatus, and system", the entire contents of which are incorporated herein by reference.
  • The present invention relates to the field of audio coding and decoding technologies, and in particular to a method, device, and system for parametric audio coding and decoding.
  • Background of the invention
  • An audio signal usually refers to sound waves with a frequency of 20 Hz to 20 kHz that can be heard by the human ear.
  • A digital audio signal is an audio signal after analog-to-digital conversion.
  • Analog-to-digital conversion involves sampling at a specified sample rate and scalar quantization of the resulting discrete-time signal at a specified resolution.
  • Audio coding usually refers to removing statistical redundancy and perceptually insensitive components from audio signals (removing statistical redundancy and perception inaccuracy in audio signals, RIMC), for example by transform domain coding. Audio coding can represent the signal at a lower bit rate, but it also introduces coding noise into the signal. By exploiting the masking effect of the human auditory system, this noise becomes difficult or impossible to detect once it has been shaped in the frequency domain and the time domain. With this RIRAC encoding method, higher coding quality can be obtained with a higher number of bits, but when the bandwidth is unstable, the audio quality obtained with this encoding method degrades very significantly.
  • Parametric audio coding characterizes a signal with a simple parametric description, which allows higher coding quality to be obtained at a lower coding rate. The parameters may include a set of parameters describing the time domain and frequency domain characteristics of the signal.
  • Since such a set of parameters can be represented by a small number of bits, parametric audio coding is well suited to low-rate transmission mechanisms.
  • After the parameter description is transmitted to the decoder, the decoder can reconstruct the audio signal according to these parameters.
  • the methods for encoding audio signals using parameters are mainly:
  • the audio signals are analyzed with various models, such as harmonic models, transient models, single-spectral-line models, and noise models; the corresponding model parameters are extracted, and the audio signals are restored from these model parameters at the synthesis end.
  • Embodiments of the present invention provide an audio encoding and decoding method, apparatus, and system that reduce the number of bits required for encoding, so that a signal can be encoded with a small number of bits.
  • Embodiments of the present invention further provide a method and a device for encoding and decoding an audio signal band by band, so that the signal can be encoded with a small number of bits.
  • An audio coding method including:
  • An audio encoding device includes:
  • a parameter extraction unit configured to extract a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter used for characterizing the audio signal;
  • a sending unit configured to encode the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, and transmit them to the decoding end.
  • An audio decoding method includes:
  • An audio decoding device includes:
  • a decoding unit configured to decode the received data, and obtain a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing the audio signal;
  • a synthesizing unit configured to synthesize an audio signal according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter.
  • An audio codec system comprising:
  • An encoding device configured to extract a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing the audio signal, and to encode the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter and send them to the decoding device; and a decoding device configured to decode the data sent by the encoding device to obtain the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, and to synthesize an audio signal according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter.
  • An encoding processing method includes:
  • when the audio signal is encoded band by band, if the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band, extracting a time domain envelope parameter and a frequency domain envelope parameter for characterizing the audio signal, encoding and transmitting the time domain envelope parameter and the frequency domain envelope parameter, and simultaneously transmitting information indicating that the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band;
  • if the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band, extracting the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter for characterizing the audio signal, encoding and transmitting the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, and simultaneously sending information indicating that the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band.
  • An encoding processing device includes:
  • a determining unit configured to determine whether a spectral signal of the audio signal of the current frequency band is similar to a spectral signal of the audio signal of the previous frequency band;
  • a coding unit configured to, according to the judgment result obtained by the determining unit: when the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band, extract the time domain envelope parameter and the frequency domain envelope parameter for characterizing the audio signal; or, when the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band, extract the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter for characterizing the audio signal;
  • a transmission unit configured to send the information, obtained by the determining unit, indicating that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, and to encode and transmit the time domain envelope parameter and the frequency domain envelope parameter extracted by the coding unit; or to send the information, obtained by the determining unit, indicating that the spectral signal of the audio signal of the current frequency band is not similar to that of the previous frequency band, and to encode and transmit the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter extracted by the coding unit.
  • a decoding processing method includes:
  • Receiving data transmitted by the encoding end; if information indicating that the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band is received, synthesizing an audio signal according to the time domain envelope parameter and the frequency domain envelope parameter for characterizing the audio signal, wherein the time domain envelope parameter and the frequency domain envelope parameter are decoded from the received data;
  • if information indicating that the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band is received, synthesizing the audio signal according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, wherein the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter are decoded from the received data.
  • a decoding processing device comprising:
  • an information receiving unit configured to receive information indicating that the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band, and decode the received data to obtain a time domain envelope parameter and a frequency domain envelope parameter for characterizing the audio signal; or to receive information indicating that the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band, and decode the received data to obtain a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing the audio signal;
  • a decoding unit configured to synthesize the audio signal according to the similarity information received by the information receiving unit together with the time domain envelope parameter and the frequency domain envelope parameter; or to synthesize the audio signal according to the dissimilarity information together with the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter.
  • FIG. 1 is a schematic flowchart of an audio encoding method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of an audio decoding method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a coding processing method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a decoding processing method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a processing procedure at an encoding end according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a processing procedure at a decoding end according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an audio encoding apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an audio decoding apparatus according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an audio codec system according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of an apparatus for encoding processing according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a decoding processing apparatus according to an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of a decoding unit according to an embodiment of the present invention.
  • Mode for carrying out the invention
  • An embodiment of the present invention provides an audio coding method, which may specifically include: extracting a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing an audio signal; and encoding the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter and transmitting them to the decoding end.
  • When the harmonic interval of the audio signal differs from the value of the first harmonic offset, a first harmonic offset parameter of the audio signal is also extracted, encoded, and transmitted to the decoding end.
  • FIG. 1 is a schematic flow chart of an audio encoding method according to an embodiment of the present invention.
  • An audio encoding method according to an embodiment of the present invention is described below with reference to FIG. 1. As shown in FIG. 1, the method may specifically include:
  • the time domain envelope of the audio signal can be obtained by calculating the subframe energy of the audio signal; alternatively, the audio signal can be transformed into the frequency domain (or a transform domain) and autoregressive (AR) model parameters then extracted to characterize the time domain envelope of the audio signal;
  • the frequency domain envelope of the audio signal can be obtained by calculating the sub-band energy in the frequency domain (or transform domain), or can be extracted in the time domain.
  • the pitch parameter characterizes the ratio between the harmonic signal and the noise signal in the audio signal;
  • the pitch parameter can be represented in various ways; for example, it may be the ratio of the maximum value to the minimum value of the autocorrelation function of the audio signal in the frequency domain;
  • the harmonic interval parameter characterizes the interval between different harmonics of the audio signal; specifically, the harmonic interval parameter can be estimated by a peak extraction method;
  • Step 15: extracting a first harmonic offset parameter (P0, Pitch Offset) of the audio signal to be encoded; specifically, the first harmonic offset parameter is estimated according to the harmonic interval parameter and is then encoded and transmitted; the first harmonic offset parameter characterizes the position of the first harmonic of the audio signal. It should be noted that if the value of the first harmonic offset is equal to the harmonic interval, step 15 can be omitted; that is, the first harmonic offset parameter of the audio signal is extracted only when the harmonic interval of the audio signal differs from the value of the first harmonic offset;
  • the above time domain envelope parameters, the frequency domain envelope parameters, the pitch parameters, the harmonic interval parameters and the first harmonic offset parameters are encoded (which can also be quantized and encoded) and output.
  • pitch parameters, harmonic interval parameters and first harmonic offset parameters may be, but are not limited to, calculated in the frequency domain (or transform domain), for example, may also be calculated in the time domain.
  • the order of obtaining the above parameters is not fixed; that is, the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter of the audio signal may be acquired in any order.
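  • The two envelope computations mentioned above (subframe energy in the time domain, sub-band energy in the frequency domain) can be sketched as follows. This is only an illustrative sketch: the function names, the equal-length segmentation, and the use of root-mean-square energy are assumptions, not details taken from the patent text.

```python
import math


def subframe_energy_envelope(frame, num_subframes):
    """Time domain envelope: RMS energy of each equal-length subframe.

    Equal-length subframes and RMS energy are assumptions of this sketch.
    """
    size = len(frame) // num_subframes
    return [
        math.sqrt(sum(x * x for x in frame[i * size:(i + 1) * size]) / size)
        for i in range(num_subframes)
    ]


def subband_energy_envelope(spectrum, num_bands):
    """Frequency domain envelope: RMS magnitude of each equal-width sub-band.

    `spectrum` may hold real or complex transform coefficients.
    """
    width = len(spectrum) // num_bands
    return [
        math.sqrt(sum(abs(c) ** 2 for c in spectrum[b * width:(b + 1) * width]) / width)
        for b in range(num_bands)
    ]


# A frame that is silent in its first half and active in its second half.
frame = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
td_env = subframe_energy_envelope(frame, 2)
fd_env = subband_energy_envelope([3.0, 4.0, 0.0, 0.0], 2)
```

  • In a real codec these envelope values would then be quantized before transmission; the quantizer is outside the scope of this sketch.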
  • In the embodiments of the present invention, an audio signal is characterized by a set of parameters including time domain envelope parameters, frequency domain envelope parameters, pitch parameters, and harmonic interval parameters.
  • The set of parameters used in the embodiments of the present invention does not involve the parameters of the transient model or of the single-spectral-line model, nor any parameters of the noise model other than the pitch parameter. This reduces the number of parameters required for encoding and therefore the number of bits required for parametric encoding, which solves the problem of the high bit count of the prior-art RIMC encoding method;
  • Because the set of parameters of the embodiment of the present invention can be encoded with a smaller number of bits, the coding rate of the signal is further reduced; and when the transmission capability of the channel is constant, the low number of coded bits makes it possible to encode a signal with a higher bandwidth, so that a larger coding bandwidth and a higher coding quality are obtained at a lower coding rate.
  • An embodiment of the present invention further provides an audio decoding method, which may include: decoding the received data to obtain a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing the audio signal; and synthesizing an audio signal according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter.
  • The audio decoding method of the embodiment of the present invention may further include: when the received data includes a first harmonic offset parameter, decoding the received data to obtain the first harmonic offset parameter for characterizing the audio signal.
  • the step of synthesizing the audio signal includes:
  • FIG. 2 is a schematic flowchart of an audio decoding method according to an embodiment of the present invention.
  • An audio decoding method according to an embodiment of the present invention is described below with reference to FIG. 2. As shown in FIG. 2, the method may specifically include:
  • the method further includes: obtaining a first harmonic offset parameter;
  • according to the harmonic interval parameter, reconstructing the harmonic structure of the signal to obtain a harmonic signal (when the harmonic interval of the audio signal differs from the first harmonic offset, the harmonic signal is obtained according to both the harmonic interval parameter and the first harmonic offset parameter; otherwise the value of the first harmonic offset is equal to the value of the harmonic interval);
  • the harmonic structure can be represented by harmonics having a random phase, wherein the first harmonic offset parameter determines the position of the first harmonic, and the spacing between harmonics is determined by the harmonic interval parameter; the harmonic structure is the harmonic signal;
  • generating a noise signal; for example, a random number generator can be used to generate the noise signal;
  • alternatively, time domain shaping may first be performed on the reconstructed spectral signal according to the time domain envelope parameter, and the time-domain-shaped signal may then be subjected to frequency domain shaping according to the frequency domain envelope parameter to obtain the final synthesized audio signal.
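  • The harmonic and noise reconstruction described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the harmonics are placed as unit-magnitude spectral lines, and the pitch parameter is treated as a weight in [0, 1] that mixes the harmonic and noise components. The text only says that the pitch parameter characterizes the ratio between harmonic and noise signals, so the linear mix is an assumption.

```python
import cmath
import math
import random


def synthesize_spectrum(num_bins, harmonic_interval, harmonic_weight,
                        first_offset=None, seed=0):
    """Rebuild a spectrum from a harmonic interval and a harmonic/noise weight.

    If no first harmonic offset is given, it defaults to the harmonic
    interval, matching the case where the offset is not transmitted.
    """
    if harmonic_interval <= 0:
        raise ValueError("harmonic_interval must be positive")
    rng = random.Random(seed)
    if first_offset is None:
        first_offset = harmonic_interval

    # Harmonic structure: one random-phase line per harmonic position.
    harmonic = [0j] * num_bins
    position = first_offset
    while int(round(position)) < num_bins:
        phase = rng.uniform(0.0, 2.0 * math.pi)
        harmonic[int(round(position))] = cmath.exp(1j * phase)
        position += harmonic_interval

    # Noise component from a random number generator.
    noise = [complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
             for _ in range(num_bins)]

    # Assumed linear mix controlled by the (hypothetical) weight.
    return [harmonic_weight * h + (1.0 - harmonic_weight) * n
            for h, n in zip(harmonic, noise)]


def nonzero_bins(spectrum, tol=1e-9):
    """Indices of bins whose magnitude exceeds `tol`."""
    return [k for k, c in enumerate(spectrum) if abs(c) > tol]


with_offset = nonzero_bins(synthesize_spectrum(16, 5, 1.0, first_offset=3))
without_offset = nonzero_bins(synthesize_spectrum(16, 5, 1.0))
```

  • With a weight of 1.0 only the harmonic lines remain, which makes it easy to see that the first harmonic offset fixes the first line and the harmonic interval fixes the spacing. The envelope shaping steps would follow this reconstruction.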
  • the foregoing describes the flow of the audio decoding method according to the embodiment of the present invention.
  • The decoding uses a set of parameters consisting of the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter. This set does not involve the parameters of the transient model or of the single-spectral-line model, nor any parameters of the noise model other than the pitch parameter, which reduces the number of parameters required for encoding. An audio signal of higher quality can therefore be synthesized from a smaller number of bits; moreover, when the harmonic structure of the audio signal is pronounced, the decoded audio quality is better.
  • In the following embodiment, the encoding end extracts a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter of the audio signal, encodes them, and transmits them to the decoding end. Because the harmonic interval of the audio signal in this embodiment is the same as the first harmonic offset, the step of extracting the first harmonic offset parameter is omitted. The decoding end decodes the above parameters and then synthesizes the audio signal.
  • the implementation process of the coding end may specifically include:
  • after the time domain envelope parameter is extracted, it may be further quantized and encoded, and the quantized time domain envelope may be used to perform time domain normalization of the audio signal;
  • alternatively, the audio signal may be transformed into the frequency domain (or a transform domain), and autoregressive (AR) model parameters then extracted to represent the time domain envelope of the audio signal;
  • the frequency domain envelope parameter of the audio signal can also be obtained by calculating the subband energy in the frequency domain (or the transform domain);
  • extracting the pitch parameter of the audio signal; the pitch parameter characterizes the ratio between the harmonic signal and the noise signal in the audio signal;
  • the pitch parameter can be represented in various ways; for example, it can be the ratio of the maximum value to the minimum value of the autocorrelation function of the audio signal in the frequency domain.
  • $\mathrm{ACF}(k_0) = \sum_{k} S(k)\,S(k + k_0)$
  • AMDF: average magnitude difference function
  • the harmonic interval parameter characterizes the interval between different harmonics of the audio signal; specifically, the integer value of the harmonic interval parameter can be estimated by a peak extraction method;
  • the fractional value of the harmonic interval can be obtained by interpolating the autocorrelation function;
  • the interpolation of the autocorrelation function may be performed only in the vicinity of the integer harmonic interval obtained first, and the fractional value of the harmonic interval may be searched for in the interpolated autocorrelation function;
  • to obtain better performance, the obtained harmonic interval parameter can be further corrected before it is encoded and transmitted, in order to suppress frequency-multiple and frequency-fraction errors; for example, the obtained harmonic interval PG of the current frame is compared with the harmonic interval old-PG of the previous frame, and if the ratio between the harmonic interval of the current frame and the harmonic interval of the previous frame is less than a certain threshold
  • Step (5): extracting a first harmonic offset parameter of the audio signal. Since the value of the first harmonic offset in this embodiment is equal to the harmonic interval, step (5) may be omitted. When the value of the first harmonic offset is not equal to the harmonic interval, the first harmonic offset parameter is extracted: it is estimated according to the harmonic interval parameter and then encoded and transmitted. The first harmonic offset parameter characterizes the position of the first harmonic of the audio signal;
  • the time domain envelope parameters, frequency domain envelope parameters, pitch parameters, and harmonic interval parameters are encoded (or quantized and encoded) and output.
  • if step (5) is performed, the first harmonic offset parameter is also encoded and transmitted.
  • pitch parameters, harmonic interval parameters and first harmonic offset parameters may be, but are not limited to, calculated in the frequency domain (or transform domain), for example, may also be calculated in the time domain.
  • the order of obtaining the above parameters is not fixed; that is, the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter of the audio signal may be acquired in any order.
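  • The encoder-side pitch parameter and integer harmonic interval described above can be sketched with the frequency domain autocorrelation ACF(k0) = sum over k of S(k) S(k + k0). This is an illustrative sketch only: the function names are hypothetical, S(k) is taken to be a magnitude spectrum, the peak is picked on the autocorrelation itself, and a small `eps` guards the max/min ratio against division by zero. None of these details are confirmed by the text. Fractional interpolation and the frame-to-frame correction are not shown.

```python
def spectral_acf(magnitudes, max_lag):
    """ACF(k0) = sum_k S(k) * S(k + k0) for lags k0 = 1..max_lag."""
    n = len(magnitudes)
    return {
        k0: sum(magnitudes[k] * magnitudes[k + k0] for k in range(n - k0))
        for k0 in range(1, max_lag + 1)
    }


def estimate_harmonic_interval(magnitudes, max_lag):
    """Integer harmonic interval: the lag at which the ACF peaks (assumed method)."""
    acf = spectral_acf(magnitudes, max_lag)
    return max(acf, key=acf.get)


def pitch_parameter(magnitudes, max_lag, eps=1e-12):
    """Ratio of the maximum to the minimum ACF value over the searched lags."""
    acf = spectral_acf(magnitudes, max_lag)
    return max(acf.values()) / (min(acf.values()) + eps)


# Synthetic magnitude spectrum: a spectral line every 4 bins over a noise floor.
spectrum = [1.0 if k % 4 == 0 else 0.1 for k in range(32)]
interval = estimate_harmonic_interval(spectrum, max_lag=10)
tonality = pitch_parameter(spectrum, max_lag=10)
```

  • A strongly harmonic spectrum gives a large ratio, while a flat, noise-like spectrum gives a ratio close to 1, which is consistent with the pitch parameter describing the balance between harmonic and noise content.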
  • the decoding end decodes the received data, and obtains a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter and a harmonic interval parameter for characterizing the audio signal, and then synthesizes the audio signal.
  • the parameter decoded by the decoding end further includes a first harmonic offset parameter.
  • the specific processing procedure for decoding at the decoding end may include:
  • the decoding end can obtain the harmonic signal according to the harmonic interval parameter and the first harmonic offset parameter;
  • the harmonic structure can be represented by harmonics having a random phase, wherein the first harmonic offset parameter determines the position of the first harmonic, and the spacing between harmonics is determined by the harmonic interval parameter; the harmonic structure is the harmonic signal.
  • the first harmonic offset parameter (P0) is the position of the first pulse; starting from the first pulse position, harmonics having a random phase are placed at the harmonic interval represented by the harmonic interval parameter (PG);
  • generating a noise signal; for example, a random number generator can be used to generate the noise signal;
  • (11) performing time domain shaping on the frequency-domain-shaped signal according to the time domain envelope parameter to obtain the final synthesized audio signal; for example, the final synthesized audio signal is obtained after denormalization according to the decoded subframe energy envelope.
  • alternatively, time domain shaping may first be performed on the reconstructed spectral signal according to the time domain envelope parameter, and the time-domain-shaped signal may then be processed in the frequency domain according to the frequency domain envelope parameter to obtain the final synthesized audio signal.
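  • The frequency domain and time domain shaping steps can both be viewed as rescaling segments so that their energy matches a decoded envelope. The sketch below illustrates that idea with one helper used for either domain. The helper name, the equal-length segmentation, and the RMS matching rule are assumptions. The transform between the frequency domain and the time domain that sits between the two shaping steps is not shown.

```python
import math


def shape_by_envelope(values, envelope, eps=1e-12):
    """Scale each equal-length segment of `values` so its RMS matches `envelope`.

    Used on spectral coefficients (segments = sub-bands) for frequency domain
    shaping, or on time samples (segments = subframes) for time domain
    shaping / denormalization. Samples beyond the last full segment are left
    unchanged.
    """
    size = len(values) // len(envelope)
    shaped = list(values)
    for i, target in enumerate(envelope):
        segment = values[i * size:(i + 1) * size]
        rms = math.sqrt(sum(abs(v) ** 2 for v in segment) / size)
        gain = target / (rms + eps)  # eps avoids division by zero on silence
        for j in range(i * size, (i + 1) * size):
            shaped[j] = values[j] * gain
    return shaped


# Two segments with RMS 1 and 2 are reshaped to RMS 2 and 1.
shaped = shape_by_envelope([1.0, 1.0, 2.0, 2.0], [2.0, 1.0])
```

  • Because the same helper works on real samples and complex coefficients, either shaping order described above can be expressed with it.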
  • The set of parameters used in the embodiment of the present invention reduces the number of parameters required for encoding and therefore the number of bits required for parametric encoding, which solves the problem of the high bit count of the existing RIRAC coding method.
  • At the same time, compared with the prior-art multi-model parametric audio coding algorithms, the set of parameters of the embodiment of the present invention does not involve the parameters of the transient model or of the single-spectral-line model, nor any parameters of the noise model other than the pitch parameter. This reduces the number of parameters required for encoding, so the signal can be encoded with fewer bits and the coding rate is further reduced.
  • When the transmission capacity of the channel is constant, the low number of coded bits makes it possible to encode a signal with a higher bandwidth, thereby achieving a larger coding bandwidth and a higher coding quality at a lower coding rate.
  • An embodiment of the present invention further provides an encoding processing method, which may specifically include: when an audio signal is encoded band by band, if the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band, extracting a time domain envelope parameter and a frequency domain envelope parameter for characterizing the audio signal, encoding and transmitting the time domain envelope parameter and the frequency domain envelope parameter, and simultaneously transmitting information indicating that the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band;
  • if the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band, extracting the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter for characterizing the audio signal, encoding and transmitting the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, and simultaneously transmitting information indicating that the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band.
  • The information indicating that the spectral signal of the audio signal of the current frequency band is similar or dissimilar to the spectral signal of the audio signal of the previous frequency band may be specifically represented by an encoding mode parameter.
  • The encoding mode parameter either indicates to the decoding end that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, so that the audio signal of the current frequency band is decoded according to the time domain envelope parameter and the frequency domain envelope parameter of the audio signal; or indicates to the decoding end that the spectral signal of the audio signal of the current frequency band is not similar to that of the previous frequency band, so that the audio signal of the current frequency band is decoded according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter of the audio signal.
  • Further, if the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band, and the harmonic interval of the audio signal differs from the value of the first harmonic offset, a first harmonic offset parameter of the audio signal is extracted and transmitted to the decoding end.
  • the pitch parameter of the audio signal can also be extracted, and the pitch parameter can be transmitted to the decoding end.
  • FIG. 3 is a schematic flowchart of an encoding processing method according to an embodiment of the present invention. The encoding processing method is described below with reference to FIG. 3 and may specifically include:
  • the cross-correlation between the signal spectrum of the current frequency band and the signal spectrum of the previous frequency band can be calculated first, to determine the similarity between the harmonic structure of the current band and the harmonic structure of the previous band; when the cross-correlation is greater than a certain value, the two harmonic structures are judged to be similar and CM is set to 1; otherwise CM is set to 0; when the signal spectrum of the current band is similar to the signal spectrum of the previous band, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter described below need not be extracted;
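  • The similarity decision above can be sketched as follows. This is a minimal illustration, not the patented implementation: the zero-lag normalized cross-correlation, the function names, and the threshold of 0.5 are all assumptions, since the text only says "greater than a certain value".

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-lag normalized cross-correlation between two band spectra."""
    n = min(len(a), len(b))
    num = sum(a[i] * b[i] for i in range(n))
    den = math.sqrt(sum(x * x for x in a[:n]) * sum(y * y for y in b[:n]))
    return num / den if den > 0.0 else 0.0


def coding_mode(current_band, previous_band, threshold=0.5):
    """Return CM = 1 when the spectra are judged similar, otherwise CM = 0.

    The threshold is a hypothetical value chosen for illustration only.
    """
    corr = normalized_cross_correlation(current_band, previous_band)
    return 1 if corr > threshold else 0
```

  • With CM equal to 1 the encoder can skip the pitch, harmonic interval, and first harmonic offset extraction, which is where the bit saving comes from.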
  • specifically, the above parameters can be extracted as follows:
  • Extracting the time domain envelope parameter: for example, the subframe energy envelope and the global gain factor of the current frequency band signal are calculated, and whether the signal is a steady-state signal or a transient signal is determined from these two sets of values; if it is a steady-state signal, the global gain factor is quantized and the quantized value is used as the time domain envelope parameter; if it is a transient signal, the subframe energy envelope is quantized and the quantized value is used as the time domain envelope parameter; time domain normalization is then performed on the current frequency band signal according to the time domain envelope parameter to obtain a time-domain-normalized signal;
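  • A minimal sketch of the time domain envelope step, under stated assumptions: the text does not give the steady-state/transient criterion, so the rule used here (largest subframe RMS exceeding `transient_ratio` times the frame RMS), the number of subframes, and all function names are hypothetical; quantization is omitted.

```python
import math


def time_envelope(frame, num_subframes=8, transient_ratio=2.0):
    """Return (is_transient, envelope) for one frame.

    envelope is [global_gain] for a steady-state frame and the list of
    subframe RMS values for a transient frame.
    """
    size = len(frame) // num_subframes
    sub_rms = [
        math.sqrt(sum(x * x for x in frame[i * size:(i + 1) * size]) / size)
        for i in range(num_subframes)
    ]
    gain = math.sqrt(sum(x * x for x in frame) / len(frame))
    # Hypothetical criterion: one subframe is much louder than the frame average.
    is_transient = gain > 0.0 and max(sub_rms) > transient_ratio * gain
    return is_transient, (sub_rms if is_transient else [gain])


def normalize_time(frame, is_transient, envelope, num_subframes=8):
    """Divide each sample by the gain that applies to it."""
    size = len(frame) // num_subframes
    out = []
    for n, x in enumerate(frame):
        g = envelope[min(n // size, num_subframes - 1)] if is_transient else envelope[0]
        out.append(x / g if g > 0.0 else 0.0)
    return out
```

  • The decoder reverses `normalize_time` by multiplying with the same gains, which is the time domain denormalization mentioned later in the text.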
  • Extracting the frequency domain envelope parameter: for example, an MDCT (Modified Discrete Cosine Transform) is applied to the time-domain-normalized signal to obtain a set of MDCT coefficients, which form the frequency domain signal corresponding to the time-domain-normalized signal; the frequency domain signal is divided into N subbands, and the energy of each subband is extracted and quantized to obtain a set of quantized frequency domain envelope values, that is, the frequency domain envelope parameter; the frequency domain signal is then normalized according to the frequency domain envelope parameter to obtain a frequency-domain-normalized signal;
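  • The frequency domain envelope step can be sketched as below. The assumptions are: a direct, unwindowed O(N^2) MDCT is used for brevity, the subbands are of equal width, the subband "energy" is represented by its RMS value, and quantization is left out; none of these details is fixed by the text.

```python
import math


def mdct(block):
    """Direct MDCT: 2N time samples in, N coefficients out (no window)."""
    n_half = len(block) // 2
    return [
        sum(
            block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]


def subband_envelope(coeffs, num_subbands):
    """RMS value of each of num_subbands equal-width subbands."""
    width = len(coeffs) // num_subbands
    return [
        math.sqrt(sum(c * c for c in coeffs[b * width:(b + 1) * width]) / width)
        for b in range(num_subbands)
    ]


def normalize_frequency(coeffs, envelope):
    """Divide every coefficient by the envelope value of its subband."""
    width = len(coeffs) // len(envelope)
    out = []
    for i, c in enumerate(coeffs):
        e = envelope[min(i // width, len(envelope) - 1)]
        out.append(c / e if e > 0.0 else 0.0)
    return out
```

  • After this normalization the spectrum is roughly flat across subbands, so the later pitch and harmonic analysis sees the harmonic structure rather than the spectral tilt.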
  • Extracting the pitch parameter: the pitch parameter can be extracted directly in the MDCT domain; to further improve the performance of the encoder, the extraction may instead be based on a pseudo-spectrum signal calculated from the original frequency domain signal, with the pitch parameter calculated from this pseudo-spectrum signal; the pitch parameter can be expressed as the ratio between the maximum value and the minimum value of the autocorrelation function, where the maximum and minimum are searched within a desired range or within a range that is useful for calculating the harmonic interval parameter;
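  • As a hedged illustration of the max/min ratio described above: the sketch assumes a non-negative pseudo-spectrum (so the autocorrelation is non-negative), and the `floor` guard against division by zero as well as the function names are additions made for the example.

```python
def autocorrelation(spectrum, max_lag):
    """ACF(lag) = sum_k S(k) * S(k + lag) for lag = 0..max_lag."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * spectrum[k + lag] for k in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def pitch_parameter(pseudo_spectrum, min_lag, max_lag, floor=1e-9):
    """Ratio of the largest to the smallest ACF value for lags in [min_lag, max_lag]."""
    acf = autocorrelation(pseudo_spectrum, max_lag)
    window = acf[min_lag:max_lag + 1]
    return max(window) / max(min(window), floor)
```

  • A strongly harmonic spectrum gives a peaky autocorrelation and hence a large ratio, while a noise-like spectrum gives a ratio close to 1.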
  • Extracting the harmonic interval parameter PG: the harmonic interval parameter of the high-band signal is usually extracted in the frequency domain (or transform domain); the integer value of the harmonic interval can be estimated from the autocorrelation function by peak extraction, and the fractional value of the harmonic interval can be estimated from the interpolated autocorrelation function by peak extraction; the interpolation of the autocorrelation function need only be performed in the vicinity of the obtained integer harmonic interval, after which the fractional value of the harmonic interval is obtained by peak extraction;
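  • A minimal sketch of the harmonic interval estimate. The integer part follows the peak extraction described above; for the fractional part the sketch substitutes a three-point parabolic fit around the integer peak, which is one possible interpolation and not necessarily the one intended by the text. It assumes `min_lag >= 1` and `max_lag + 1 < len(pseudo_spectrum)`.

```python
def harmonic_interval(pseudo_spectrum, min_lag, max_lag):
    """Return (integer_interval, fractional_interval) from the ACF peak."""
    n = len(pseudo_spectrum)
    acf = [
        sum(pseudo_spectrum[k] * pseudo_spectrum[k + lag] for k in range(n - lag))
        for lag in range(max_lag + 2)
    ]
    # Integer harmonic interval: lag of the largest ACF value in the search range.
    peak = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    # Fractional refinement: vertex of the parabola through the three
    # ACF values around the integer peak (an assumed interpolation method).
    left, mid, right = acf[peak - 1], acf[peak], acf[peak + 1]
    denom = left - 2.0 * mid + right
    frac = 0.5 * (left - right) / denom if denom != 0.0 else 0.0
    return peak, peak + frac
```

  • Restricting the interpolation to the neighbourhood of the integer peak keeps the cost low, as the text points out.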
  • Extracting the first harmonic offset parameter: for example, the first harmonic offset parameter P0 is estimated according to the harmonic interval; specifically, within the harmonic interval range, that is, within [0, PG], the first harmonic component is placed at different offset positions and the other harmonics are placed in order according to the harmonic interval; the correlation between the generated spectrum and the pseudo-spectrum is calculated, and the offset position with the largest correlation is the desired first harmonic offset; the first harmonic offset parameter can also be used to further correct the estimated value of the harmonic interval parameter, so as to achieve a better parameter extraction effect; it should be noted that if the value of the first harmonic offset is always equal to the harmonic interval, this step can be omitted;
  • 33: transmitting the information indicating that the spectral signal of the audio signal of the current frequency band is similar or dissimilar to the spectral signal of the audio signal of the previous frequency band, and encoding and sending the extracted parameters;
  • when CM is equal to 1, a set of parameters including the encoding mode parameter, the time domain envelope parameter, and the frequency domain envelope parameter is quantized or encoded and transmitted to the decoding end; when CM is equal to 0, a set of parameters including the encoding mode parameter, the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter is quantized or encoded and transmitted to the decoding end;
  • when CM is equal to 1, the parameters transmitted to the decoding end may further include the pitch parameter; when CM is equal to 0, if the value of the first harmonic offset is not equal to the harmonic interval, the first harmonic offset parameter is also transmitted.
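  • The choice of transmitted parameter set can be summarized in a short sketch. The dictionary keys, the function name, and the `send_pitch_when_similar` switch are illustrative assumptions; actual quantization and bit packing are not shown.

```python
def select_parameters(cm, time_env, freq_env, pitch, interval, offset,
                      send_pitch_when_similar=False):
    """Return the parameters to quantize and transmit for one frame."""
    params = {"CM": cm, "time_envelope": time_env, "freq_envelope": freq_env}
    if cm == 1:
        # Similar spectra: envelopes only, optionally plus the pitch parameter.
        if send_pitch_when_similar:
            params["pitch"] = pitch
        return params
    # Dissimilar spectra: pitch and harmonic interval are always sent.
    params["pitch"] = pitch
    params["harmonic_interval"] = interval
    # The first harmonic offset is sent only when it differs from the interval.
    if offset != interval:
        params["first_harmonic_offset"] = offset
    return params
```

  • A decoder that finds no first harmonic offset in a CM = 0 frame takes the offset to be equal to the harmonic interval.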
  • the decoding end decodes the data to obtain either the set of parameters including the coding mode parameter, the time domain envelope parameter, and the frequency domain envelope parameter, or the set of parameters including the coding mode parameter, the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, and synthesizes the audio signal.
  • if the encoding end also transmits the pitch parameter when CM is equal to 1, the decoding end correspondingly also receives the pitch parameter; if the encoding end transmits the first harmonic offset parameter when CM is equal to 0, the decoding end correspondingly also receives the first harmonic offset parameter.
  • The embodiment of the present invention further provides a decoding processing method, which may specifically include: receiving data sent by the encoding end; if information indicating that the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band is received, synthesizing the audio signal according to the time domain envelope parameter and the frequency domain envelope parameter for characterizing the audio signal, where both parameters are decoded from the received data; if information indicating that the spectral signal of the audio signal of the current frequency band is not similar to the spectral signal of the audio signal of the previous frequency band is received, synthesizing the audio signal according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter for characterizing the audio signal, where these parameters are decoded from the received data.
  • The method may further include: receiving a first harmonic offset parameter of the audio signal; if the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band, then in addition to receiving the time domain envelope parameter and the frequency domain envelope parameter for characterizing the audio signal, the method may further include: receiving a pitch parameter for characterizing the audio signal.
  • FIG. 4 is a schematic flowchart of a decoding processing method according to an embodiment of the present invention. As shown in FIG. 4, the decoding processing may specifically include:
  • the coding mode parameter CM is decoded, and whether the spectral signals are similar is determined from the coding mode parameter CM;
  • if CM is equal to 1, the spectral signal of the previous frequency band may be used, by spectral replication, as the reconstructed spectral signal of the current frequency band, or the spectral signal may be reconstructed in another manner; if the encoding end also transmits the pitch parameter when CM is equal to 1, the pitch parameter can also be decoded from the code stream and the spectral signal of the current frequency band is reconstructed from the spectrum of the previous frequency band by spectral replication: the spectral signal of the previous frequency band may be shaped according to the pitch parameter, and the shaped spectral signal is used as the reconstructed spectral signal of the current frequency band;
  • if CM is equal to 0, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter are decoded from the code stream; a harmonic signal is obtained according to the harmonic interval parameter, or according to the harmonic interval parameter and the first harmonic offset parameter; the ratio between the harmonic signal and the noise signal is adjusted according to the pitch parameter; and a reconstructed spectral signal is obtained from the adjusted harmonic signal and noise signal; that is, the spectral signal of the high frequency band is reconstructed by an artificial reconstruction method based on the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter;
  • when the first harmonic offset parameter is not transmitted in the encoded code stream, the decoding end takes the first harmonic offset parameter to be equal to the harmonic interval parameter.
  • frequency domain shaping is performed on the reconstructed spectral signal according to the decoded frequency domain envelope; the shaped spectral signal may then be transformed into the time domain by an inverse MDCT or an inverse FFT, which must correspond to the transform used at the encoding end;
  • time domain shaping, for example time domain denormalization, is performed according to the decoded time domain envelope parameter to obtain the high frequency signal produced by parametric audio decoding, and the synthesized audio signal is obtained.
  • the order of frequency domain shaping and time domain shaping is not fixed: the reconstructed spectral signal may first undergo frequency domain shaping according to the frequency domain envelope parameter to obtain a frequency-domain-shaped signal, which then undergoes time domain shaping according to the time domain envelope parameter to obtain the synthesized audio signal; or the reconstructed spectral signal may first undergo time domain shaping according to the time domain envelope parameter to obtain a time-domain-shaped signal, which then undergoes frequency domain shaping according to the frequency domain envelope parameter to obtain the synthesized audio signal.
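  • The first of the two orders (frequency domain shaping, then time domain shaping) can be sketched as below. The inverse transform is passed in as a callable because the text leaves its choice (inverse MDCT or inverse FFT) to the encoder; the equal-width subbands, the per-segment gains, and all names are assumptions made for the example.

```python
def shape_frequency(coeffs, freq_envelope):
    """Frequency domain denormalization: scale each subband by its envelope value."""
    width = len(coeffs) // len(freq_envelope)
    return [
        c * freq_envelope[min(i // width, len(freq_envelope) - 1)]
        for i, c in enumerate(coeffs)
    ]


def shape_time(samples, gains):
    """Time domain denormalization: scale each segment by its gain."""
    size = len(samples) // len(gains)
    return [
        x * gains[min(n // size, len(gains) - 1)]
        for n, x in enumerate(samples)
    ]


def synthesize(spectrum, freq_envelope, time_gains, inverse_transform):
    """Frequency shaping, inverse transform, then time shaping."""
    shaped = shape_frequency(spectrum, freq_envelope)
    return shape_time(inverse_transform(shaped), time_gains)
```

  • For a steady-state frame `time_gains` holds the single global gain; for a transient frame it holds one gain per subframe.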
  • The above describes determining, when the audio signal is encoded by sub-band, whether the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band; when they are not similar, a set of parameters comprising the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter is extracted; when they are similar, only a set of parameters comprising the time domain envelope parameter, the frequency domain envelope parameter, and the pitch parameter is extracted, or only a set comprising the time domain envelope parameter and the frequency domain envelope parameter.
  • In this way, when decoding the audio signal by sub-band, the decoding end can apply different spectral signal reconstruction methods to signals with different characteristics; the method adapts better to signal features, and a consistently high synthesis quality can be obtained for different signals.
  • a specific implementation scheme of the encoding processing method and the decoding processing method in the embodiments of the present invention will be described in detail below.
  • the input audio signal is divided into a high frequency band signal and a low frequency band signal at the encoding end, and the high band signal and the low band signal are respectively encoded.
  • FIG. 5 is a schematic diagram of a processing procedure at the encoding end according to an embodiment of the present invention. As shown in FIG. 5, the encoding processing process includes:
  • the sampling rate of the input audio signal is 32 kHz and the processing frame length is 20 ms; after the input signal is split into sub-bands and down-sampled, the signal in the low frequency band (0~8 kHz) has 320 sampling points and the signal in the high frequency band (8~16 kHz) has 320 sampling points;
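  • The sample counts above follow directly from the sampling rate and frame length; the short calculation below makes the arithmetic explicit, assuming a two-band split with down-sampling by 2 (the variable names are illustrative).

```python
SAMPLE_RATE_HZ = 32_000
FRAME_MS = 20

# Samples in one 20 ms frame of the 32 kHz input.
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000

# Each of the two bands is down-sampled by 2 after the band split.
band_rate_hz = SAMPLE_RATE_HZ // 2
samples_per_band = band_rate_hz * FRAME_MS // 1000

# A band sampled at 16 kHz carries 8 kHz of audio bandwidth.
band_width_hz = band_rate_hz // 2
```

  • This gives 640 input samples per frame and 320 samples per band, with each band covering 8 kHz.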
  • the core coding can be completed by the G.729.1 codec or by another wideband signal codec; that is, whatever the coding mode, it suffices that the signal in the 0~8 kHz band can be encoded; the bit stream of the low frequency signal, that is, the output code stream, is then output;
  • the high frequency band (8~16 kHz) is the current frequency band described in the encoding processing method, and the low frequency band (0~8 kHz) is the previous frequency band; when the spectrum of the high frequency signal has no similarity with the spectrum of the low frequency signal, a set of parameters including the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, the first harmonic offset parameter, and the encoding mode parameter is extracted; when they are similar, only a set of parameters including the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the encoding mode parameter is extracted, or only a set including the time domain envelope parameter, the frequency domain envelope parameter, and the encoding mode parameter; the specific processing may include:
  • (1) determining the coding mode parameter CM: the cross-correlation between the low-band spectrum and the high-band spectrum may first be calculated to determine the similarity between the low-band harmonic structure and the high-band harmonic structure; when the cross-correlation is greater than a certain threshold, the low-band harmonic structure is judged similar to the high-band harmonic structure, CM is set to 1, and the spectral signal of the high frequency band is reconstructed from the spectral signal of the low frequency band by spectral copy shaping, or by a method other than spectral replication; when the cross-correlation is less than or equal to the threshold, the two harmonic structures are judged dissimilar, CM is set to 0, and the spectral signal of the high frequency band is artificially reconstructed from the parameters; in a practical application, a simpler coding mode decision can also be used, in which CM is set to 1 when the harmonic interval PG is smaller than a certain threshold, and to 0 otherwise;
  • the signal is determined to be a steady-state signal or a transient signal; if it is a steady-state signal, the global gain factor is quantized, and the obtained quantized value is used as the time domain envelope parameter and encoded.
  • the time-domain-normalized signal is transformed by an MDCT (Modified Discrete Cosine Transform) of, for example, 640 points to obtain a set of MDCT coefficients, that is, the frequency domain signal corresponding to the frequency band, {y_swb(0), y_swb(1), ..., y_swb(319)}; since the UWB encoder is only required to process signals in the 8~14 kHz band, only the frequency domain signal {y_swb(0), y_swb(1), ..., y_swb(239)} is processed;
  • the signal is divided into N sub-bands, and the energy of each sub-band is extracted and quantized to obtain a set of quantized frequency domain envelope values {spec_env(0), spec_env(1), ..., spec_env(N-1)}, which is the frequency domain envelope parameter within the 8~14 kHz band; since the core coder for wideband G. 729.
  • the autocorrelation function can be obtained from the pseudo-spectral signal by a frequency domain calculation, for example by means of the FFT (fast Fourier transform), or as the sum $ACF(k_0) = \sum_k S(k)\,S(k + k_0)$ over the pseudo-spectral signal $S(k)$;
  • the integer value of the harmonic interval is $PG = \arg\max_{k_0} ACF(k_0)$, where the search for the maximum can be limited to a desired range or a range of interest; the fractional value of the harmonic interval can be obtained by peak extraction from a suitably interpolated autocorrelation function; the interpolation of the autocorrelation function need only be performed in the vicinity of the obtained integer harmonic interval, after which the fractional value of the harmonic interval is obtained by peak extraction;
  • the estimated harmonic interval parameter value can also be corrected to suppress frequency multiples and sub-multiples; for example, given the harmonic interval PG obtained for the current frame and the harmonic interval old_PG of the previous frame,
  • the ratio between the harmonic interval of the current frame and the harmonic interval of the previous frame is less than a certain threshold (such as 0.1) and ACF(old_PG) > 0.95 ACF(PG), before use
  • the first harmonic component may be placed at different offset positions within the harmonic interval range, that is, within [0, PG], with the other harmonics placed in order according to the harmonic interval, and the correlation between the generated spectrum and the pseudo-spectrum is calculated;
  • the offset position with the largest correlation is the first harmonic offset obtained.
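  • A minimal sketch of the offset search described above, assuming an integer harmonic interval, unit-amplitude harmonics in the generated spectrum, and a correlation normalized by the number of harmonics placed (so that offsets yielding more harmonics are not favoured); these choices and the function name are illustrative, not taken from the text.

```python
import math


def first_harmonic_offset(pseudo_spectrum, interval):
    """Search offsets in [0, interval] for the best-matching harmonic comb."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    n = len(pseudo_spectrum)
    best_offset, best_corr = 0, float("-inf")
    for offset in range(interval + 1):
        # Generated spectrum: unit pulses at offset, offset + PG, offset + 2*PG, ...
        positions = range(offset, n, interval)
        if len(positions) == 0:
            continue
        corr = sum(pseudo_spectrum[k] for k in positions) / math.sqrt(len(positions))
        if corr > best_corr:
            best_offset, best_corr = offset, corr
    return best_offset
```

  • If the search always returned an offset equal to the interval, the parameter would carry no information, which is why the text allows the step to be skipped in that case.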
  • When CM is equal to 1, a set of parameters including the coding mode parameter, the time domain envelope parameter, the frequency domain envelope parameter, and the pitch parameter is quantized or encoded and transmitted to the decoding end (that is, the high frequency parameter bit stream is transmitted); when CM is equal to 0, a set of parameters including the coding mode parameter, the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter is quantized or encoded and transmitted to the decoding end (that is, the high frequency parameter bit stream is transmitted);
  • alternatively, when CM is equal to 1, only a set of parameters including the coding mode parameter, the time domain envelope parameter, and the frequency domain envelope parameter may be quantized or encoded and transmitted to the decoding end;
  • the enhancement method adopted in this embodiment is to transform-code the high-band signal in the MDCT domain; other methods may of course be used to enhance the high frequency signal after parametric audio encoding, such as transform coding of the residual signal between the original high-band signal and the parametrically coded high-band signal; the high frequency enhancement bit stream is then transmitted.
  • after receiving the low frequency bit stream, the high frequency parameter bit stream, and the high frequency enhancement bit stream, the decoding end performs decoding and synthesizes the audio signal.
  • FIG. 6 is a schematic diagram of the processing at the decoding end according to the embodiment of the present invention. As shown in FIG. 6, the decoding processing may specifically include:
  • synthesis of the 8~16 kHz signal is completed by parametric audio decoding; the specific processing includes: (1) decoding the coding mode parameter CM from the received data;
  • the pitch parameter can be decoded from the received data, and the spectral signal of the high frequency band can be reconstructed from the spectrum of the low frequency band by spectral copy shaping, or by a method other than spectral replication; for example, the spectral signal of the low-band signal obtained by core decoding may be shaped according to the pitch parameter, and the shaped spectral signal is used as the reconstructed high-band spectral signal;
  • the decoding end directly uses the spectrum signal of the low frequency band obtained by the core decoding as the reconstructed high-band spectrum signal;
  • the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter may be decoded from the received data, and the spectral signal of the high frequency band is reconstructed by an artificial reconstruction method based on the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter; this reconstruction is based on a harmonic signal plus a noise signal; specifically, harmonics with random phase are placed as pulses at the corresponding frequency points in the frequency domain to reconstruct the harmonic signal, where the spacing of the pulses is determined by the harmonic interval parameter and the position of the first pulse is obtained from the first harmonic offset; the noise signal can be obtained from a random number generator;
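  • The harmonic-plus-noise reconstruction can be sketched as follows. Several points are assumptions made for the example: the random phase of a real-valued coefficient is modelled as a random sign, the noise is Gaussian, and the mapping from the decoded pitch parameter to the mixing weight `harmonic_share` (between 0 and 1) is hypothetical, since the text only states that the pitch parameter adjusts the ratio between harmonic signal and noise signal.

```python
import random


def reconstruct_spectrum(num_bins, interval, offset=None,
                         harmonic_share=0.8, seed=0):
    """Build a high-band spectrum from pulses at offset + m * interval plus noise."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    rng = random.Random(seed)
    if offset is None:
        # Offset not transmitted: it equals the harmonic interval.
        offset = interval
    harmonic = [0.0] * num_bins
    position = float(offset)
    while round(position) < num_bins:
        # Random sign stands in for a random phase (assumption).
        harmonic[round(position)] = rng.choice((-1.0, 1.0))
        position += interval
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_bins)]
    return [
        harmonic_share * h + (1.0 - harmonic_share) * w
        for h, w in zip(harmonic, noise)
    ]
```

  • The result is a flat, normalized spectrum; the frequency domain and time domain envelopes are applied afterwards by the shaping steps.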
  • the trimmed spectral signal can also be transformed into the time domain by inverse FFT transformation;
  • an optional smoothing filter may also be applied to the time domain envelope and the frequency domain envelope. If the spectral signal of the high frequency band is artificially reconstructed, then once a harmonic is placed in the wrong subband, the denormalization applies the wrong envelope factor; even a slight deviation in harmonic position therefore introduces some distortion, which smoothing can mitigate.
  • the interpolated subband energy envelope factors can be used for frequency domain denormalization; the resulting signal is then transformed to the time domain, where a time domain gain function is interpolated from the adaptive subframe energy envelope (ATE); this time domain gain function is finally used to denormalize the time domain signal;
  • the synthesized signal of the 0~8 kHz band and the synthesized signal of the 8~16 kHz band are passed through QMF synthesis filtering to obtain the final synthesized audio signal at a sampling rate of 32 kHz.
  • in this embodiment the high frequency band signal is encoded and decoded parametrically: as indicated by the coding mode parameter, the codec is completed either with a set of parameters characterizing the signal that includes the time domain envelope, the frequency domain envelope, the pitch, the harmonic interval, and the first harmonic offset, or with a set of parameters that includes the time domain envelope, the frequency domain envelope, and the pitch of the signal.
  • the set of parameters used in the embodiment of the present invention reduces the number of parameters required for encoding and hence the number of bits required when encoding with parameters, thereby solving the problem that the existing RIRAC encoding method requires a high number of bits; compared with the existing multi-model based parametric audio coding algorithm, the set of parameters of the embodiment of the present invention involves neither the parameters of the transient model and the single-line model nor the parameters of the noise model other than the noise parameter,
  • which reduces the number of parameters required for encoding, so that encoding can be done with fewer bits, further reducing the encoding rate of the signal, and when the transmission capacity of the channel is constant, the number of encoded bits of the present invention is higher.
  • in contrast to the foregoing embodiment, which extracts the time domain envelope parameter first and the frequency domain envelope parameter afterwards, this embodiment extracts the frequency domain envelope parameter first (the sub-band splitting of the audio signal and the core coding in this embodiment are the same as in the above embodiment).
  • the process of processing the high-band signal at the encoding end may specifically include:
  • the time domain signal in the 8~16 kHz band is MDCT-transformed to obtain a set of MDCT coefficients. Since the ultra-wideband portion only processes signals in the 8~14 kHz band, only the frequency domain signal {y_swb(0), y_swb(1), ..., y_swb(239)} is processed; for the core coding, the 7~8 kHz part of the signal is no longer within its processing range, so to ensure the continuity of the decoded signal spectrum at the decoding end, the MDCT transform domain signal of the 7~8 kHz part, {y_wb(120), y_wb(121), ..., y_wb(159)}, needs to be extracted at the encoding end.
  • the process of processing the high-band signal by the decoding end may specifically include:
  • the quantized linear prediction coefficients, that is, the time domain envelope parameter, can be obtained by codebook search, to facilitate subsequent time domain shaping according to the obtained linear prediction coefficients; the quantized sub-band energies, that is, the frequency domain envelope parameter, are obtained by codebook search, to facilitate subsequent frequency domain shaping according to the obtained sub-band energies;
  • this embodiment implements the encoding by extracting the frequency domain envelope parameter first.
  • the order of obtaining the above parameters is not fixed; that is, the encoding mode parameter, the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter of the audio signal may be obtained in any order.
  • the set of parameters used in the embodiment of the present invention reduces the number of parameters required for encoding and hence the number of bits required when encoding with parameters, thereby solving the problem that the existing RIRAC encoding method requires a high number of bits; compared with the existing multi-model based parametric audio coding algorithm, the set of parameters of the embodiment of the present invention involves neither the parameters of the transient model and the single-line model nor the parameters of the noise model other than the noise parameter,
  • which reduces the number of parameters required for encoding, so that encoding can be done with fewer bits, further reducing the encoding rate of the signal, and when the transmission capacity of the channel is constant, the number of encoded bits of the present invention is higher.
  • the embodiment of the present invention further provides a corresponding audio coding device, and the structure thereof is as shown in FIG. 7.
  • the specific implementation structure may include:
  • a parameter extraction unit 71, configured to extract a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing the audio signal; when the harmonic interval of the audio signal differs from the value of the first harmonic offset, the unit is further configured to extract a first harmonic offset parameter for characterizing the audio signal; the extracted parameters are passed to the sending unit;
  • a sending unit 72, configured to encode the parameters extracted by the parameter extraction unit and transmit them to the decoding end; specifically, it either encodes the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter and transmits them to the decoding end, or encodes the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter and transmits them to the decoding end.
  • the embodiment of the present invention further provides a corresponding audio decoding device, and the structure thereof is as shown in FIG. 8.
  • the specific implementation structure may include:
  • a decoding unit 81, configured to decode the received data to obtain a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter for characterizing the audio signal; and further configured to decode data carrying a first harmonic offset parameter to obtain the first harmonic offset parameter for characterizing the audio signal;
  • a synthesizing unit 82, configured to synthesize the audio signal according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, or according to the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, the harmonic interval parameter, and the first harmonic offset parameter; the synthesizing unit 82 may include:
  • a harmonic reconstruction sub-unit 821, configured to obtain a harmonic signal according to the harmonic interval parameter; or, when the harmonic interval for characterizing the audio signal differs from the first harmonic offset, to obtain a harmonic signal according to the harmonic interval parameter and the first harmonic offset parameter;
  • a spectral signal reconstruction sub-unit 822, configured to adjust, according to the pitch parameter, the ratio between the harmonic signal obtained by the harmonic reconstruction sub-unit 821 and a noise signal, and to obtain a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
  • a shaping sub-unit 823, configured to process the spectral signal reconstructed by the spectral signal reconstruction sub-unit 822 according to the frequency domain envelope parameter and the time domain envelope parameter to obtain a synthesized audio signal; for example: performing frequency domain shaping on the reconstructed spectral signal according to the frequency domain envelope parameter to obtain a frequency-domain-shaped signal, and performing time domain shaping on the frequency-domain-shaped signal according to the time domain envelope parameter to obtain the synthesized audio signal; or performing time domain shaping on the reconstructed spectral signal according to the time domain envelope parameter to obtain a time-domain-shaped signal, and performing frequency domain shaping on the time-domain-shaped signal according to the frequency domain envelope parameter to obtain the synthesized audio signal.
  • the embodiment of the present invention further provides a corresponding audio codec system, whose structure is shown in FIG. 9; its specific implementation may include:
  • an encoding device 91 configured to extract the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal, to encode these parameters, and to send them to the decoding device;
  • a decoding device 92 configured to decode the data sent by the encoding device to obtain the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter, and to synthesize the audio signal according to these parameters;
  • the encoding device 91 may specifically include:
  • a parameter extraction unit 911 configured to extract the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter of the audio signal; when the harmonic interval of the audio signal differs in value from the first harmonic offset, it is also used to extract the first harmonic offset parameter of the audio signal;
  • a sending unit 912 configured to encode the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter, or the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter, and to transmit them to the decoding device;
  • the decoding device 92 may specifically include:
  • a decoding unit 921 configured to decode the received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter, or the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter;
  • a synthesizing unit 922 configured to synthesize the audio signal according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter, or according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter.
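As a rough illustration of the parameter set that passes from the encoding device 91 to the decoding device 92, the sketch below models one frame's parameters as a small record. The class name `ParamSet`, its field names, and the helper `effective_offset` are hypothetical and do not come from the source; the only behaviour taken from the text is that the first harmonic offset is optional and, when it is not transmitted, is taken to equal the harmonic interval.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParamSet:
    """Hypothetical container for the parameters of one frame."""
    time_envelope: List[float]       # time domain envelope parameter(s)
    freq_envelope: List[float]       # frequency domain envelope parameter(s)
    pitch: float                     # harmonic-to-noise ratio parameter
    harmonic_interval: float         # spacing between harmonics
    first_harmonic_offset: Optional[float] = None  # present only if it differs

    def effective_offset(self) -> float:
        # Without a transmitted offset, the first harmonic sits at the
        # harmonic interval.
        if self.first_harmonic_offset is None:
            return self.harmonic_interval
        return self.first_harmonic_offset
```

A synthesizing unit written against such a record would only ever read `effective_offset()`, so the two parameter-set variants need no separate code paths.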
  • the embodiment of the present invention further provides a corresponding encoding processing device, whose structure is shown in FIG. 10 and may include:
  • a determining unit 101 configured to determine whether the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band; specifically, the value of a coding mode parameter may be used to indicate whether they are similar;
  • an encoding unit 102 configured to, according to the determination result obtained by the determining unit 101, extract the time domain envelope parameter and the frequency domain envelope parameter characterizing the audio signal (and optionally also the pitch parameter) when the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band; or, when the two spectral signals are not similar, extract the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal;
  • a transmitting unit 103 configured to send the information, obtained by the determining unit 101, that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band (for example, by encoding and transmitting the coding mode parameter), and to encode and send the time domain envelope parameter and frequency domain envelope parameter (and possibly the pitch parameter) extracted by the encoding unit; or to send the information that the two spectral signals are not similar, and to encode and send the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter (and possibly the first harmonic offset parameter) extracted by the encoding unit.
  • the embodiment of the present invention further provides a corresponding decoding processing device, whose structure is shown in FIG. 11 and may include:
  • a receiving information unit 111 configured to receive information indicating that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, and to decode the received data to obtain the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal; or to receive information indicating that the two spectral signals are not similar, and to decode the received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal;
  • the receiving information unit 111 may determine, according to the received coding mode parameter, whether the spectral signal of the audio signal of the current frequency band is similar or dissimilar to that of the previous frequency band;
  • a decoding unit 112 configured to synthesize an audio signal according to the similarity information received by the receiving information unit 111 together with the time domain envelope parameter and the frequency domain envelope parameter characterizing the audio signal; or according to the dissimilarity information together with the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal. Specifically, when information is received that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, the decoding unit 112, as shown in FIG. 12, may include:
  • a reconstruction subunit 121 configured to reconstruct a spectral signal to obtain a reconstructed spectral signal;
  • a second shaping subunit 122 configured to shape the spectral signal reconstructed by the reconstruction subunit 121 according to the pitch parameter to obtain a shaped reconstructed spectral signal;
  • a first shaping subunit 123 configured to process the reconstructed spectral signal (or the shaped spectral signal produced by the second shaping subunit) according to the frequency domain envelope parameter and the time domain envelope parameter to obtain a synthesized audio signal; for example, performing frequency domain shaping on the reconstructed spectral signal according to the frequency domain envelope parameter to obtain a frequency domain shaped signal, and then performing time domain shaping on the frequency domain shaped signal according to the time domain envelope parameter to obtain the synthesized audio signal; or performing time domain shaping on the reconstructed spectral signal according to the time domain envelope parameter to obtain a time domain shaped signal, and then performing frequency domain shaping on the time domain shaped signal according to the frequency domain envelope parameter to obtain the synthesized audio signal;
  • when information is received that the spectral signal of the audio signal of the current frequency band is not similar to that of the previous frequency band, the decoding unit 112, as shown in FIG. 12, may include:
  • a harmonic reconstruction subunit 124 configured to obtain a harmonic signal according to the harmonic interval parameter, or according to the harmonic interval parameter and the first harmonic offset parameter;
  • a spectral signal reconstruction subunit 125 configured to adjust the ratio between the harmonic signal and a noise signal according to the pitch parameter, and to obtain a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
  • a third shaping subunit 126 configured to process the reconstructed spectral signal according to the frequency domain envelope parameter and the time domain envelope parameter to obtain a synthesized audio signal; for example, performing frequency domain shaping on the reconstructed spectral signal according to the frequency domain envelope parameter to obtain a frequency domain shaped signal, and then performing time domain shaping on the frequency domain shaped signal according to the time domain envelope parameter to obtain the synthesized audio signal; or performing time domain shaping on the reconstructed spectral signal according to the time domain envelope parameter to obtain a time domain shaped signal, and then performing frequency domain shaping on the time domain shaped signal according to the frequency domain envelope parameter to obtain the synthesized audio signal.
  • Each of the above embodiments of the present invention may be, but is not limited to, applied to an audio codec device.
  • the embodiments of the present invention characterize the audio signal with a set of parameters comprising the time domain envelope parameter, the frequency domain envelope parameter, the pitch parameter, and the harmonic interval parameter (and possibly the first harmonic offset parameter).
  • this reduces the number of bits required for parametric encoding relative to existing approaches, so the signal can be encoded with fewer bits and at a lower coding rate; a larger coding bandwidth and a higher coding quality are thereby obtained at a lower coding rate, and good coding quality is obtained in particular for signals with a pronounced harmonic structure.
  • when the audio signal is encoded band by band, it is determined whether the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band. When they are not similar, a set of parameters comprising the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter (and possibly the first harmonic offset parameter) is extracted; when they are similar, only a set comprising the time domain envelope parameter and the frequency domain envelope parameter (and possibly the pitch parameter) is extracted. This makes effective use of the spectral similarity between different frequency bands of the signal to reduce the coding rate further and obtain a larger coding bandwidth.
  • when decoding the audio signal band by band, the decoding end can apply different spectral signal reconstruction methods according to the characteristics of different signals; it therefore adapts better to signal characteristics and can achieve equally high synthesis quality for different signals.
  • when the transmission capability of the channel is fixed, the low number of coded bits of the present invention makes it possible to encode a signal of higher bandwidth. Because a signal of higher bandwidth gives a better listening experience, the method provided by the present invention achieves a higher coding bandwidth and a higher synthesis quality for a given channel capacity.
  • the embodiments of the present invention thus provide a technical solution for band-by-band encoding and decoding of audio signals that achieves a larger coding bandwidth and a higher coding quality at a lower coding rate.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Description

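Audio encoding and decoding methods, devices, and system

This application claims priority to Chinese application No. 200810119170.6, filed on August 28, 2008 and entitled "Audio encoding and decoding methods, devices, and system", the entire contents of which are incorporated herein by reference.

Technical Field

The present invention relates to the field of audio encoding and decoding technologies, and in particular to methods, devices, and systems for parametric audio encoding and decoding.

Background of the Invention

An audio signal generally refers to a sound wave audible to the human ear, with a frequency between 20 Hz and 20 kHz; a digital audio signal is an audio signal that has undergone analog-to-digital conversion. The conversion from analog to digital comprises digital sampling at a specified sampling rate and scalar quantization of the time-domain discrete signal at a specified resolution.

Audio coding usually refers to coding methods that remove the statistical redundancy and perceptual irrelevancy in the audio signal (Redundancy and Irrelevancy Removal Audio Coding, RIRAC for short), such as transform-domain coding. Audio coding can represent a signal at a lower bit rate, but coding noise is introduced into the signal at the same time. Owing to the masking effect of the human auditory system, this noise becomes hard or impossible to hear once the audio signal has been shaped in the frequency and time domains. With the RIRAC coding method, high-quality coding performance can be obtained with a relatively high number of bits, but when the bandwidth is unstable the audio quality obtained with this coding method degrades very noticeably.

In contrast to the RIRAC coding method, parametric audio coding represents the signal by a compact parametric description and can thereby achieve high coding quality at a lower coding rate; the parameters may be a set of parameters describing the time-domain and frequency-domain characteristics of the signal. Because such a set of parameters can be represented with few bits, parametric audio coding is well suited to low-rate transmission. After the parametric description has been transmitted to the decoding end, the decoding end can reconstruct the audio signal from these parameters.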
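At present, the main methods of encoding an audio signal with parameters are as follows:

The audio signal is analyzed with various models, such as a harmonic model, a transient model, a single spectral line model, and a noise model; the corresponding model parameters are extracted, and the synthesis end restores the audio signal from these model parameters.

In the course of implementing the present invention, the inventors found at least the following problem in the prior art:

A high-quality synthesized audio signal can be obtained at the decoding end only if multiple models are used at the encoding end and the signal is described with a relatively large number of model parameters. In practical applications this technique therefore also requires a relatively large number of bits to be transmitted, and when the transmission capability of the channel decreases further, the quality of audio encoded with this technique suffers.

Summary of the Invention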
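Embodiments of the present invention provide audio encoding and decoding methods, devices, and a system that reduce the number of bits required for encoding, so that a signal can be encoded with few bits. Embodiments of the present invention also provide methods and devices for band-by-band encoding and decoding of an audio signal, likewise encoding the signal with few bits.

An audio encoding method, comprising:

extracting a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing an audio signal; and encoding the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter and transmitting them to a decoding end.

An audio encoding device, comprising:

a parameter extraction unit, configured to extract a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing an audio signal;

a sending unit, configured to encode the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter and transmit them to a decoding end.

An audio decoding method, comprising:

decoding received data to obtain a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing an audio signal; and synthesizing the audio signal according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter.

An audio decoding device, comprising:

a decoding unit, configured to decode received data to obtain a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing an audio signal;

a synthesizing unit, configured to synthesize the audio signal according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter.

An audio codec system, comprising:

an encoding device, configured to extract a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing an audio signal, to encode these parameters, and to send them to a decoding device;

a decoding device, configured to decode the data sent by the encoding device to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter, and to synthesize the audio signal according to these parameters.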
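An encoding processing method, comprising:

when an audio signal is encoded band by band, if the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band, extracting a time domain envelope parameter and a frequency domain envelope parameter characterizing the audio signal, encoding and sending the time domain envelope parameter and frequency domain envelope parameter, and also sending information indicating that the two spectral signals are similar; if the spectral signal of the audio signal of the current frequency band is not similar to that of the previous frequency band, extracting a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing the audio signal, encoding and sending these parameters, and also sending information indicating that the two spectral signals are not similar.

An encoding processing device, comprising:

a determining unit, configured to determine whether the spectral signal of the audio signal of the current frequency band is similar to the spectral signal of the audio signal of the previous frequency band;

an encoding unit, configured to, according to the determination result obtained by the determining unit, extract the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal when the two spectral signals are similar; or extract the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal when the two spectral signals are not similar;

a transmitting unit, configured to send the information, obtained by the determining unit, that the two spectral signals are similar, and to encode and send the time domain envelope parameter and frequency domain envelope parameter of the audio signal extracted by the encoding unit; or to send the information that the two spectral signals are not similar, and to encode and send the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter of the audio signal extracted by the encoding unit.

A decoding processing method, comprising:

receiving data sent by an encoding end; if information is received indicating that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, synthesizing the audio signal according to the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal, where these parameters are decoded from the received data;

if information is received indicating that the two spectral signals are not similar, synthesizing the audio signal according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal, where these parameters are decoded from the received data.

A decoding processing device, comprising:

a receiving information unit, configured to receive information indicating that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, and to decode the received data to obtain the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal; or to receive information indicating that the two spectral signals are not similar, and to decode the received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal;

a decoding unit, configured to synthesize the audio signal according to the similarity information received by the receiving information unit together with the time domain envelope parameter and frequency domain envelope parameter; or according to the dissimilarity information together with the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter.

Brief Description of the Drawings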
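FIG. 1 is a schematic flowchart of an audio encoding method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an audio decoding method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of an encoding processing method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a decoding processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the processing at the encoding end according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the processing at the decoding end according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an audio encoding device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an audio decoding device according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an audio codec system according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an encoding processing device according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a decoding processing device according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a decoding unit according to an embodiment of the present invention.

Modes for Carrying Out the Invention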
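To obtain a larger coding bandwidth and a higher coding quality at a lower coding rate than existing audio coding, an embodiment of the present invention provides an audio encoding method, which may specifically include: extracting a time domain envelope parameter, a frequency domain envelope parameter, a pitch parameter, and a harmonic interval parameter characterizing the audio signal; and encoding these parameters and transmitting them to the decoding end.

Further, when the harmonic interval of the audio signal differs in value from the first harmonic offset, the first harmonic offset parameter of the audio signal is extracted, encoded, and transmitted to the decoding end.

FIG. 1 is a schematic flowchart of the audio encoding method according to an embodiment of the present invention, which is described below with reference to FIG. 1. As shown in FIG. 1, the method may specifically include:

11: Extract the time domain envelope parameter of the audio signal to be encoded. Specifically, the time domain envelope can be obtained by computing the subframe energies of the audio signal; alternatively, the audio signal can be transformed to the frequency domain (or a transform domain) and autoregressive (AR) model parameters extracted there to characterize its time domain envelope;

12: Extract the frequency domain envelope parameter of the audio signal to be encoded. Specifically, the frequency domain envelope can be obtained by computing the subband energies in the frequency domain (or a transform domain); alternatively, AR model parameters of the audio signal can be extracted in the time domain to characterize its frequency domain envelope;

13: Extract the pitch parameter of the audio signal to be encoded. The pitch parameter characterizes the ratio between the harmonic signal and the noise signal in the audio signal. It can be expressed in several ways; it may be, for example, the ratio of the maximum to the minimum of the autocorrelation function of the audio signal in the frequency domain;

14: Extract the harmonic interval (PG, Pitch Grid) parameter of the audio signal to be encoded. The harmonic interval parameter characterizes the spacing between the different harmonics of the audio signal; specifically, it can be estimated by peak picking;

15: Extract the first harmonic offset (PO, Pitch Offset) parameter of the audio signal to be encoded. Specifically, the first harmonic offset parameter can be estimated from the harmonic interval parameter and then encoded and transmitted. The first harmonic offset parameter characterizes the position of the first harmonic of the audio signal. If the value of the first harmonic offset equals the harmonic interval, step 15 can be omitted; that is, the first harmonic offset parameter is extracted only when the harmonic interval of the audio signal differs in value from the first harmonic offset;

The time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter are then encoded (possibly after quantization) and output.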
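Steps 11 and 12 both reduce to computing block energies, once over time samples and once over frequency coefficients. The sketch below shows that shared computation; the function names are hypothetical, and equal-length subframes and equal-width subbands are assumed for simplicity (the source does not fix the partition).

```python
from typing import List


def block_energies(x: List[float], n_blocks: int) -> List[float]:
    """Split x into n_blocks equal parts and return the energy of each part."""
    size = len(x) // n_blocks  # assumes len(x) is a multiple of n_blocks
    return [sum(v * v for v in x[i * size:(i + 1) * size])
            for i in range(n_blocks)]


def time_envelope(frame: List[float], n_subframes: int) -> List[float]:
    # Subframe energies of the time-domain frame (step 11).
    return block_energies(frame, n_subframes)


def freq_envelope(coeffs: List[float], n_subbands: int) -> List[float]:
    # Subband energies of the frequency-domain coefficients (step 12).
    return block_energies(coeffs, n_subbands)
```

In a codec these energies would still have to be quantized before transmission; that stage is left out here.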
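It should be noted that the pitch parameter, harmonic interval parameter, and first harmonic offset parameter can be computed in the frequency domain (or a transform domain) but are not limited to it; for example, they can also be computed in the time domain. Moreover, the order in which the parameters are obtained is not fixed: any order will do, as long as the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal are obtained.

The above describes the flow of the audio encoding method according to an embodiment of the present invention. With this method, the audio signal can be characterized by a set of parameters comprising the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter, or by a set comprising the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter. Compared with prior-art parametric audio coding based on multiple models, this set of parameters involves no parameters of the transient model or the single spectral line model, and no noise-model parameters other than the pitch parameter. This reduces the number of parameters needed for encoding and hence the number of bits required, which addresses the high bit count of the prior-art RIRAC coding method. At the same time, compared with prior-art parametric audio coding algorithms based on multiple models, this set of parameters can be encoded with fewer bits, which further lowers the coding rate of the signal; and when the transmission capability of the channel is fixed, the lower number of coded bits makes it possible to encode a signal of higher bandwidth, so that a larger coding bandwidth and a higher coding quality are obtained at a lower coding rate.

An embodiment of the present invention further provides an audio decoding method, which may specifically include: decoding received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal; and synthesizing the audio signal according to these parameters.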
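Further, the audio decoding method according to an embodiment of the present invention also includes: decoding received data containing the first harmonic offset parameter to obtain the first harmonic offset parameter characterizing the audio signal.

The step of synthesizing the audio signal includes:

obtaining a harmonic signal according to the harmonic interval parameter; or, when the harmonic interval of the audio signal differs from the first harmonic offset parameter, obtaining a harmonic signal according to the harmonic interval parameter and the first harmonic offset parameter;

adjusting the ratio between the harmonic signal and a noise signal according to the pitch parameter, and obtaining a reconstructed spectral signal from the adjusted harmonic signal and noise signal;

processing the reconstructed spectral signal according to the frequency domain envelope parameter and the time domain envelope parameter to obtain the synthesized audio signal.

FIG. 2 is a schematic flowchart of the audio decoding method according to an embodiment of the present invention, which is described below with reference to FIG. 2. As shown in FIG. 2, the method may specifically include:

21: Decode the received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal; when the harmonic interval of the audio signal differs in value from the first harmonic offset, the first harmonic offset parameter is obtained as well;

22: Reconstruct the harmonic structure of the signal according to the harmonic interval parameter to obtain the harmonic signal (when the harmonic interval of the audio signal differs from the first harmonic offset parameter, the harmonic signal is obtained according to the harmonic interval parameter and the first harmonic offset parameter; otherwise the value of the first harmonic offset equals the value of the harmonic interval). The harmonic structure can be represented by harmonics with random phase, where the first harmonic offset parameter determines the position of the first harmonic and the harmonic interval parameter determines the spacing between harmonics; this harmonic structure is the harmonic signal;

23: Generate a noise signal, for example with a random number generator;

24: Adjust the ratio between the harmonic signal and the noise signal according to the value of the pitch parameter, and obtain the reconstructed spectral signal from the adjusted harmonic signal and noise signal;

25: Perform frequency domain shaping on the reconstructed spectral signal according to the frequency domain envelope parameter to obtain a frequency domain shaped signal; for example, the reconstructed spectral signal can be de-normalized according to the decoded subband energy envelope;

26: Perform time domain shaping on the frequency domain shaped signal according to the time domain envelope parameter to obtain the final synthesized audio signal; for example, the frequency domain shaped signal can be transformed to the time domain and then de-normalized according to the decoded subframe energy envelope.

It should be noted that the order of frequency domain shaping and time domain shaping is not fixed: time domain shaping can instead be performed first on the reconstructed spectral signal according to the time domain envelope parameter, followed by frequency domain shaping of the time domain shaped signal according to the frequency domain envelope parameter, to obtain the final synthesized audio signal.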
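The de-normalization in step 25 can be pictured as rescaling each subband of the reconstructed spectrum to its decoded energy. The following is a minimal sketch under two assumptions that are not stated in the source: subbands have equal width, and the decoded envelope holds one energy value per subband. The function name is hypothetical.

```python
import math
from typing import List


def denormalize_subbands(spec: List[float], band_energy: List[float]) -> List[float]:
    """Scale each equal-width subband of spec to the decoded energy."""
    n_bands = len(band_energy)
    size = len(spec) // n_bands  # assumes equal-width subbands
    out: List[float] = []
    for b in range(n_bands):
        band = spec[b * size:(b + 1) * size]
        current = sum(v * v for v in band)
        # An all-zero band cannot be rescaled, so it stays zero.
        gain = math.sqrt(band_energy[b] / current) if current > 0.0 else 0.0
        out.extend(gain * v for v in band)
    return out
```

The same pattern, applied to subframes of the time-domain signal, would correspond to the time domain de-normalization of step 26.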
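The above describes the flow of the audio decoding method according to an embodiment of the present invention. The set of parameters provided by the embodiment, comprising the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter, involves no parameters of the transient model or the single spectral line model and no noise-model parameters other than the pitch parameter. This reduces the number of parameters needed for encoding, so the audio signal can be synthesized from fewer bits while keeping a high quality; moreover, when the harmonic structure of the audio signal is pronounced, the decoded audio quality is better still.

To aid understanding of the embodiments of the present invention, specific implementations of encoding and decoding are described in detail below.

In one embodiment of the present invention, the encoding end extracts the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter of the audio signal, encodes them, and transmits them to the decoding end. Because in this embodiment the harmonic interval of the audio signal is the same as the first harmonic offset parameter, the step of extracting the first harmonic offset parameter is omitted. The decoding end decodes these parameters and then synthesizes the audio signal.

The implementation at the encoding end may specifically include:

(1): Extract the time domain envelope parameter of the audio signal. For example, when the time domain envelope parameter is obtained from the subframe energies of the audio signal, the subframe energy envelope $\{temp\_env(0), temp\_env(1), \ldots, temp\_env(N-1)\}$ can be computed, where N is the number of subframes; with a frame length of 15 ms and a subframe length of 3 ms, N = 5. Quantizing this subframe energy envelope yields the time domain envelope parameter, which can then be encoded; the quantized time domain envelope can also be used to normalize the audio signal in the time domain;

In practical applications, the audio signal can of course also be transformed to the frequency domain (or a transform domain) and autoregressive (AR) model parameters extracted there to characterize its time domain envelope;

(2): Extract the frequency domain envelope parameter of the audio signal. For example, when AR model parameters extracted in the time domain are used to characterize the frequency domain envelope, the AR model parameters $\{a_0, a_1, \ldots, a_{M-1}\}$ of the audio signal are computed in the time domain, where M is the order of the AR model; these parameters can then be quantized, encoded, and transmitted. The signal is also filtered according to the quantized AR model parameters to obtain the residual signal err(n);

In a specific application, the frequency domain envelope parameter can also be obtained by computing the subband energies in the frequency domain (or a transform domain);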
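(3): Extract the pitch parameter of the audio signal. The pitch parameter characterizes the ratio between the harmonic signal and the noise signal in the audio signal. It can be expressed in several ways; it may be the ratio of the maximum to the minimum of the autocorrelation function of the audio signal in the frequency domain, for example $T = \max(ACF(k_0)) / \min(ACF(k_0))$, or take another form, as long as it characterizes the ratio between harmonics and noise. The autocorrelation function $ACF(k_0)$ can be computed with the FFT (Fast Fourier Transform) and the inverse FFT; for example, the residual signal err(n) from (2) is FFT-transformed to obtain the frequency-domain signal $S(k) = FFT(err(n))$, from which the autocorrelation function $ACF(k_0) = IFFT(|FFT(S(k))|^2)$ is obtained. It can of course also be computed directly, for example $ACF(k_0) = \sum_{k=0}^{L-1} S(k)\,S(k + k_0)$, where L is the number of frequency-domain transform coefficients within the coding bandwidth. In addition, the average magnitude difference function (AMDF) can be used to correct the autocorrelation function;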
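(4): Extract the harmonic interval (PG, Pitch Grid) parameter of the audio signal. The harmonic interval parameter characterizes the spacing between the different harmonics of the audio signal. Specifically, the integer part of the harmonic interval parameter can be estimated by peak picking, for example as $PG = \arg\max_{k_0}(ACF(k_0))$; the fractional value of the harmonic interval can be obtained by peak picking after interpolating the autocorrelation function $ACF(k_0)$. The interpolation can be restricted to the neighbourhood of the integer harmonic interval obtained first, and the fractional value is then searched for in the interpolated autocorrelation function. For better performance, the harmonic interval parameter can be corrected further before being encoded and transmitted, to suppress the occurrence of frequency multiples and fractional frequencies. For example, the harmonic interval PG of the current frame is compared with the harmonic interval old_PG of the previous frame; if the ratio between the harmonic interval of the current frame and that of the previous frame is below a threshold (e.g. 0.1) and $ACF(old\_PG) > 0.95\,ACF(PG)$, the harmonic interval of the previous frame replaces the one obtained for the current frame, PG = old_PG;

(5): Extract the first harmonic offset parameter of the audio signal. Because in this embodiment the value of the first harmonic offset equals the harmonic interval, step (5) can be omitted. When the value of the first harmonic offset does not equal the harmonic interval, the first harmonic offset parameter can be extracted as follows: it is estimated from the harmonic interval parameter and then encoded and transmitted. The first harmonic offset parameter characterizes the position of the first harmonic of the audio signal; it is extracted only when the harmonic interval of the audio signal differs in value from the first harmonic offset;

The time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter are encoded (or quantized) and output. If step (5) is not omitted, the first harmonic offset parameter is of course encoded and transmitted as well.

It should be noted that the pitch parameter, harmonic interval parameter, and first harmonic offset parameter can be computed in the frequency domain (or a transform domain) but are not limited to it; for example, they can also be computed in the time domain. Moreover, the order in which the parameters are obtained is not fixed: any order will do, as long as the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal are obtained;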
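Steps (3) and (4) of the encoding end can be sketched with the direct-sum form of the autocorrelation. The code below is an illustration only: the function names and the lag search range `lo..hi-1` are assumptions, products whose index runs past the end of the spectrum are simply dropped, and the guard against a non-positive minimum is an addition of this sketch.

```python
from typing import List, Tuple


def acf(spec: List[float], lag: int) -> float:
    """Direct-sum autocorrelation of a spectrum; terms past the end are dropped."""
    return sum(spec[k] * spec[k + lag] for k in range(len(spec) - lag))


def pitch_and_interval(spec: List[float], lo: int, hi: int) -> Tuple[float, int]:
    """Return (T, PG) from the ACF values at lags lo..hi-1."""
    values = {lag: acf(spec, lag) for lag in range(lo, hi)}
    pg = max(values, key=values.get)  # integer PG = arg max ACF(k0)
    smallest = min(values.values())
    # Guard against a zero or negative minimum (an assumption of this sketch).
    t = values[pg] / smallest if smallest > 0.0 else float("inf")
    return t, pg
```

For a spectrum with a peak every 8 bins on a small noise floor, `pitch_and_interval(spec, 2, 32)` picks `PG = 8`, because the lag-8 products line up all the peaks at once. Fractional refinement and the frame-to-frame correction are not shown.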
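Correspondingly, the decoding end decodes the received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal, and then synthesizes the audio signal. If step (5) at the encoding end was not omitted, the parameters decoded at the decoding end also include the first harmonic offset parameter. The decoding process at the decoding end may include:

(6): Decode the received data to obtain the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal; if the harmonic interval of the audio signal at the encoding end differs in value from the first harmonic offset, the first harmonic offset parameter is obtained as well;

(7): Obtain the harmonic signal according to the harmonic interval parameter. The harmonic structure can be represented by harmonics with random phase, where the position of the first harmonic equals the value of the harmonic interval and the spacing between harmonics is likewise determined by the harmonic interval parameter; this harmonic structure is the harmonic signal. For example, starting from the starting frequency bin, harmonics with random phase are placed as pulses, at the spacing given by the harmonic interval parameter (PG), at the corresponding frequency bins within the signal bandwidth, producing the harmonic signal $buf\_pulses(k)$, for example $buf\_pulses(k) = h(k) * \sum_{0 \le n \cdot PG \le L-1} \delta(k - n \cdot PG)$, where h(k) denotes a harmonic with random phase;

It should be noted that if the decoding end has also received the first harmonic offset parameter, it can obtain the harmonic signal according to the harmonic interval parameter and the first harmonic offset parameter. The harmonic structure can be represented by harmonics with random phase, where the first harmonic offset parameter determines the position of the first harmonic and the harmonic interval parameter determines the spacing between harmonics; this harmonic structure is the harmonic signal. For example, the first harmonic offset parameter (PO) gives the position of the first pulse; starting from that position, harmonics with random phase are placed as pulses, at the spacing given by the harmonic interval parameter (PG), at the corresponding frequency bins within the signal bandwidth, producing the harmonic signal $buf\_pulses(k)$, for example $buf\_pulses(k) = h(k) * \sum_{0 \le PO + n \cdot PG \le L-1} \delta(k - (PO + n \cdot PG))$, where h(k) denotes a harmonic with random phase;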
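(8): Generate a noise signal; for example, the noise signal $buf\_noise(k)$ can be produced by a random number generator;

(9): Adjust the ratio between the harmonic signal and the noise signal according to the value of the pitch parameter, and obtain the reconstructed spectral signal from the adjusted harmonic signal and noise signal. The adjustment can be done in several ways. For example, the energies of the harmonic signal and the noise signal are computed first and denoted enerP and enerN; the adjustment factors $\beta_1$ (a function of T) and $\beta_2$ (a function of T, enerP, and enerN) are then computed, where T is the pitch parameter; and the corrected reconstructed spectral signal is obtained as $\beta_1\, buf\_pulses(k) + \beta_2\, buf\_noise(k)$. The reconstructed spectral signal is transformed to the time domain by the inverse FFT and denoted efr(n);

(10): Perform frequency domain shaping on the reconstructed spectral signal according to the frequency domain envelope parameter to obtain a frequency domain shaped signal; for example, the signal efr(n) is inverse-filtered according to the decoded AR model parameters to obtain the frequency domain shaped signal;

(11): Perform time domain shaping on the frequency domain shaped signal according to the time domain envelope parameter to obtain the final synthesized audio signal; for example, the frequency domain shaped signal can be de-normalized according to the decoded subframe energy envelope to obtain the final synthesized audio signal.

It should be noted that the order of frequency domain shaping and time domain shaping is not fixed: time domain shaping can instead be performed first on the reconstructed spectral signal according to the time domain envelope parameter, followed by frequency domain shaping of the time domain shaped signal according to the frequency domain envelope parameter, to obtain the final synthesized audio signal.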
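The pulse placement of step (7) and the noise generation of step (8) can be sketched as follows. This is an illustration under stated assumptions rather than the patented procedure: PG and PO are taken as integer bin indices, the spectrum is real-valued so that a random sign stands in for the random phase, and the function names are hypothetical. Passing `po = pg` reproduces the case in which no offset is transmitted. The mixing of step (9) is not shown.

```python
import random
from typing import List


def place_harmonics(n_bins: int, pg: int, po: int, rng: random.Random) -> List[float]:
    """Put unit pulses with random sign at bins po, po+pg, po+2*pg, ..."""
    buf_pulses = [0.0] * n_bins
    for k in range(po, n_bins, pg):
        # A random sign stands in for the random phase of a real-valued spectrum.
        buf_pulses[k] = rng.choice((-1.0, 1.0))
    return buf_pulses


def make_noise(n_bins: int, rng: random.Random) -> List[float]:
    # Noise spectrum drawn from a random number generator (step (8)).
    return [rng.uniform(-1.0, 1.0) for _ in range(n_bins)]
```

With 32 bins, `pg = 8`, and `po = 3`, the pulses land on bins 3, 11, 19, and 27.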
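Compared with prior-art parametric audio coding based on given models, the set of parameters used in the embodiment of the present invention reduces the number of parameters needed for encoding and hence the number of bits required, which addresses the high bit count of the existing RIRAC coding method. At the same time, compared with prior-art parametric audio coding algorithms based on multiple models, this set of parameters involves no parameters of the transient model or the single spectral line model and no noise-model parameters other than the pitch parameter; fewer parameters are needed and they can be encoded with fewer bits, which further lowers the coding rate of the signal. When the transmission capability of the channel is fixed, the lower number of coded bits makes it possible to encode a signal of higher bandwidth, so that a larger coding bandwidth and a higher coding quality are obtained at a lower coding rate. The decoding end, in turn, can synthesize a high-quality audio signal from fewer bits; and when the harmonic structure of the audio signal is pronounced, the decoded audio quality is better still.

An embodiment of the present invention further provides an encoding processing method, which may specifically include: when an audio signal is encoded band by band, if the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, extracting the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal, encoding and sending them, and also sending information indicating that the two spectral signals are similar; if the two spectral signals are not similar, extracting the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal, encoding and sending them, and also sending information indicating that the two spectral signals are not similar.

Specifically, the information indicating whether the spectral signal of the audio signal of the current frequency band is similar or dissimilar to that of the previous frequency band can be expressed by a coding mode parameter. The coding mode parameter instructs the decoding end to decode the audio signal of the current frequency band according to the time domain envelope parameter and frequency domain envelope parameter when the two spectral signals are similar, or according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter when they are not similar.

Optionally, if the two spectral signals are not similar and the harmonic interval of the audio signal differs in value from the first harmonic offset, the first harmonic offset parameter of the audio signal is extracted and transmitted to the decoding end. In addition, if the two spectral signals are similar, the pitch parameter of the audio signal can also be extracted and transmitted to the decoding end.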
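FIG. 3 is a schematic flowchart of the encoding processing method according to an embodiment of the present invention, which is described below with reference to FIG. 3. As shown in FIG. 3, the method may specifically include:

31: When the audio signal is encoded band by band, determine whether the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band;

Specifically, a coding mode parameter CM can be determined to indicate similarity. For example, the cross-correlation between the signal spectrum of the current frequency band and that of the previous frequency band can be computed first, to establish the similarity between the harmonic structures of the two bands. When the cross-correlation exceeds a threshold, the harmonic structures of the current and previous frequency bands are judged to be similar and CM is set to 1; otherwise CM is set to 0. When the two signal spectra are similar, the pitch parameter, harmonic interval parameter, and first harmonic offset parameter below need not be extracted;

32: If they are similar, extract the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal; if not, extract the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal;

That is, when the signal spectrum of the current frequency band is similar to that of the previous frequency band, the pitch parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal need not be extracted. The parameters can be extracted as follows: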
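Extracting the time domain envelope parameter: for example, the subframe energy envelope and the global gain factor gain of the current frequency band signal are computed, and these two sets of values are used to decide whether the signal is stationary or transient. For a stationary signal, the global gain factor gain is quantized and the quantized value is used as the time domain envelope parameter; for a transient signal, the subframe energy envelope is quantized and the quantized values are used as the time domain envelope parameter. The current frequency band signal is then normalized in the time domain according to the time domain envelope parameter, giving the time-domain-normalized signal;

Extracting the frequency domain envelope parameter: for example, the time-domain-normalized signal is MDCT-transformed (Modified Discrete Cosine Transform), giving a set of MDCT coefficients, i.e. the frequency-domain signal of this band after time domain normalization. This frequency-domain signal is divided into N subbands; the subband energy of each subband is extracted and quantized, giving a set of quantized frequency domain envelopes, which are the frequency domain envelope parameter. The frequency-domain signal is then normalized in the frequency domain according to the frequency domain envelope parameter, giving the frequency-domain-normalized signal;

Extracting the pitch parameter: the parameter can be extracted directly in the MDCT domain; to improve encoder performance further, it is also possible not to extract it directly in the MDCT domain but to compute a pseudo-spectrum signal from the original frequency-domain signal and compute the pitch parameter from this pseudo-spectrum. The pitch parameter can be expressed as the ratio of the maximum to the minimum of the autocorrelation function, where the maximum and minimum are taken within a desired range or within a range that is useful for computing the harmonic interval parameter;

Extracting the harmonic interval parameter PG: the harmonic interval parameter of the high-band signal is usually extracted in the frequency domain (or a transform domain). The integer value of the harmonic interval can be estimated from the autocorrelation function by peak picking, and the fractional value from the interpolated autocorrelation function by peak picking; alternatively, the interpolation can be restricted to the neighbourhood of the integer harmonic interval obtained, with the fractional value then found by peak picking;

Extracting the first harmonic offset parameter: for example, the first harmonic offset parameter PO is estimated from the harmonic interval. Specifically, within the range of the harmonic interval, i.e. within [0, PG], the first harmonic component is placed at each candidate offset position in turn, the other harmonics are placed at successive harmonic intervals, and the correlation between the resulting spectrum and the pseudo-spectrum is computed; the offset position with the greatest correlation is the first harmonic offset sought. The first harmonic offset parameter can also be used to refine the estimate of the harmonic interval parameter further, for a better parameter extraction result. If the value of the first harmonic offset always equals the harmonic interval, this step can be omitted;

33: Send the information indicating whether the spectral signal of the audio signal of the current frequency band is similar or dissimilar to that of the previous frequency band, and encode and send the extracted parameters;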
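Specifically, when CM equals 1, a set of parameters comprising the coding mode parameter, the time domain envelope parameter, and the frequency domain envelope parameter is quantized or encoded and transmitted to the decoding end; when CM equals 0, a set of parameters comprising the coding mode parameter, time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter is quantized, encoded, and transmitted to the decoding end;

It should be noted that when CM equals 1, the parameters transmitted to the decoding end can also include the pitch parameter; when CM equals 0 and the value of the first harmonic offset does not equal the harmonic interval, the first harmonic offset parameter is transmitted as well.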
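Steps 31 to 33 can be condensed into a small decision sketch. Everything specific here is an assumption made for illustration: the similarity measure is a zero-lag normalized cross-correlation, the threshold value 0.5 is invented (the source only speaks of a threshold), and the function and parameter names are hypothetical. The optional pitch parameter for CM = 1 is left out.

```python
import math
from typing import List

THRESHOLD = 0.5  # hypothetical value; the source only speaks of "a threshold"


def coding_mode(prev_spec: List[float], cur_spec: List[float]) -> int:
    """Return CM = 1 if the two band spectra are similar, else 0."""
    num = sum(a * b for a, b in zip(prev_spec, cur_spec))
    den = math.sqrt(sum(a * a for a in prev_spec) * sum(b * b for b in cur_spec))
    xcorr = num / den if den > 0.0 else 0.0
    return 1 if xcorr > THRESHOLD else 0


def parameters_to_send(cm: int, offset_differs: bool) -> List[str]:
    """List the parameter names sent for the current band."""
    names = ["coding_mode", "time_envelope", "freq_envelope"]
    if cm == 0:
        names += ["pitch", "harmonic_interval"]
        if offset_differs:
            names.append("first_harmonic_offset")
    return names
```

The point of the split is visible in the returned lists: a band judged similar to its predecessor costs only the mode flag and the two envelopes.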
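Correspondingly, the decoding end decodes the set of parameters comprising the coding mode parameter, time domain envelope parameter, and frequency domain envelope parameter, or the set comprising the coding mode parameter, time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter, and synthesizes the audio signal.

It should be noted that if the encoding end also transmitted the pitch parameter when CM equals 1, the decoding end must receive the pitch parameter as well; if the encoding end also transmitted the first harmonic offset parameter when CM equals 0, the decoding end must receive the first harmonic offset parameter as well.

Corresponding to the encoding processing method provided by the embodiment of the present invention, an embodiment of the present invention further provides a decoding processing method, which may specifically include: receiving data sent by the encoding end; if information is received indicating that the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band, synthesizing the audio signal according to the time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal, where these parameters are decoded from the received data; if information is received indicating that the two spectral signals are not similar, synthesizing the audio signal according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal, where these parameters are decoded from the received data.

Specifically, whether the spectral signal of the audio signal of the current frequency band is similar or dissimilar to that of the previous frequency band is determined from the received coding mode parameter. If they are similar, the audio signal is synthesized according to the received time domain envelope parameter and frequency domain envelope parameter characterizing the audio signal; if they are not similar, the audio signal is synthesized according to the received time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter.

Optionally, if the two spectral signals are not similar, the first harmonic offset parameter of the audio signal may be received in addition to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter; if the two spectral signals are similar, the pitch parameter characterizing the audio signal may be received in addition to the time domain envelope parameter and frequency domain envelope parameter.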
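FIG. 4 is a schematic flowchart of the decoding processing method according to an embodiment of the present invention. As shown in FIG. 4, the decoding process may specifically include:

41: Receive information indicating that the spectral signal of the audio signal of the current frequency band is similar, or is not similar, to that of the previous frequency band;

For example, the coding mode parameter CM is decoded from the received data, and whether the spectra are similar is determined from it;

42: When the two spectral signals are similar, synthesize the audio signal according to the time domain envelope parameter and frequency domain envelope parameter decoded from the received data; when they are not similar, synthesize the audio signal according to the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and harmonic interval parameter decoded from the received data;

Specifically, when reconstructing the spectral signal:

If the signal spectrum of the current frequency band is similar to that of the previous frequency band, for example CM equals 1, spectral copying can be used, taking the spectral signal of the previous frequency band as the reconstructed spectral signal of the current frequency band; a method other than spectral copying can of course also be used. If the encoding end also transmitted the pitch parameter when CM equals 1, the pitch parameter can be decoded from the bitstream and the spectral signal of the current frequency band reconstructed from the spectrum of the previous frequency band by spectral copying; specifically, the spectral signal of the previous frequency band can be shaped according to the pitch parameter, and the shaped spectral signal used as the reconstructed spectral signal of the current frequency band;

If the signal spectrum of the current frequency band is not similar to that of the previous frequency band, for example CM equals 0, the pitch parameter, harmonic interval parameter, and first harmonic offset parameter are decoded from the bitstream. A harmonic signal is obtained according to the harmonic interval parameter, or according to the harmonic interval parameter and the first harmonic offset parameter; the ratio between the harmonic signal and a noise signal is adjusted according to the pitch parameter; and the reconstructed spectral signal is obtained from the adjusted harmonic signal and noise signal. In other words, the high-band spectral signal is rebuilt by artificial reconstruction based on the pitch parameter, harmonic interval parameter, and first harmonic offset parameter. When the first harmonic offset parameter is not transmitted in the encoded bitstream, the first harmonic offset parameter at the decoding end equals the harmonic interval parameter.

Frequency domain shaping, for example frequency domain de-normalization, is applied to the reconstructed spectral signal according to the decoded frequency domain envelope, and the shaped spectral signal is transformed to the time domain. The transform can be an inverse MDCT or an inverse FFT, but it must correspond to the transform used at the encoding end;

Time domain shaping, for example time domain de-normalization, is then applied according to the decoded time domain envelope parameter, yielding the high-frequency signal decoded by parametric audio decoding, i.e. the synthesized audio signal.

It should be noted that the order of frequency domain shaping and time domain shaping is not fixed; time domain shaping can also be applied to the reconstructed spectral signal first, followed by frequency domain shaping. For example: frequency domain shaping is performed on the reconstructed spectral signal according to the frequency domain envelope parameter to obtain a frequency domain shaped signal, and time domain shaping is then performed on that signal according to the time domain envelope parameter to obtain the synthesized audio signal; or time domain shaping is performed on the reconstructed spectral signal according to the time domain envelope parameter to obtain a time domain shaped signal, and frequency domain shaping is then performed on that signal according to the frequency domain envelope parameter to obtain the synthesized audio signal.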
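The two reconstruction branches of step 42 can be put side by side in a short sketch. It rests on several assumptions not found in the source: the function name and signature are hypothetical, PG and PO are integer bin indices, the noise floor scale of 0.1 is arbitrary, and the adjustment of the harmonic-to-noise ratio by the pitch parameter is deliberately left out. Envelope shaping would follow in either branch.

```python
import random
from typing import List, Optional


def reconstruct_band(cm: int, prev_band: List[float], pg: Optional[int] = None,
                     po: Optional[int] = None, seed: int = 0) -> List[float]:
    """Rebuild the current band's spectrum before envelope shaping."""
    if cm == 1:
        # Similar bands: spectral copy of the previous band's spectrum.
        return list(prev_band)
    if pg is None:
        raise ValueError("harmonic interval is required when CM == 0")
    # Without a transmitted offset, the first harmonic sits at PG.
    start = pg if po is None else po
    rng = random.Random(seed)
    # Low-level noise floor plus unit pulses; the 0.1 scale is arbitrary here.
    spec = [0.1 * rng.uniform(-1.0, 1.0) for _ in prev_band]
    for k in range(start, len(spec), pg):
        spec[k] = rng.choice((-1.0, 1.0))
    return spec
```

Keeping both branches behind one function mirrors the role of the coding mode parameter: the rest of the decoder does not need to know which reconstruction was used.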
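The above describes how, when the audio signal is encoded band by band, it is determined whether the spectral signal of the audio signal of the current frequency band is similar to that of the previous frequency band. When they are not similar, a set of parameters comprising the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter is extracted; when they are similar, only a set comprising the time domain envelope parameter, frequency domain envelope parameter, and pitch parameter is extracted, or only a set comprising the time domain envelope parameter and frequency domain envelope parameter. The embodiment of the present invention thereby reduces the number of parameters needed for encoding and the number of bits required, and it also makes effective use of the spectral similarity between different frequency bands of the signal to reduce the coding rate further and obtain a larger coding bandwidth. When decoding the audio signal band by band from these parameters, the decoding end can apply different spectral signal reconstruction methods according to the characteristics of different signals; it therefore adapts better to signal characteristics and can achieve equally high synthesis quality for different signals.

To aid understanding of the embodiments of the present invention, specific implementations of the encoding processing method and the decoding processing method are described in detail below.

In another embodiment of the present invention, the encoding end divides the input audio signal into a high-band signal and a low-band signal and encodes each of them separately.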
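FIG. 5 is a schematic diagram of the processing at the encoding end according to an embodiment of the present invention. As shown in FIG. 5, the encoding process includes:

51: Apply analysis filtering to the input audio signal;

Let the sampling rate of the input audio signal be 32 kHz and the processing frame length be 20 ms. After the input signal has been split into bands and downsampled, the signal of the low band, 0~8 kHz, has 320 samples, and the signal of the high band, 8~16 kHz, has 320 samples;

52: The signal in the 0~8 kHz band is encoded by the core encoder;

In a specific application, core encoding can be performed by a G.729.1 codec or by another wideband codec; any coding scheme can be used as long as it can encode the signal in the 0~8 kHz band. The bitstream of the low-frequency signal is output;

53: The signal in the 8~16 kHz band, for example the 320-sample time-domain signal of that band, is parametrically encoded with the encoding processing method provided by the embodiment of the present invention:

Here the high band, 8~16 kHz, is the current frequency band of the encoding processing method, and the low band, 0~8 kHz, is the previous frequency band. When the spectrum of the high-frequency signal is not similar to that of the low-frequency signal, a set of parameters comprising the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter, first harmonic offset parameter, and coding mode parameter is extracted; when they are similar, only the time domain envelope parameter, frequency domain envelope parameter, pitch parameter, and coding mode parameter are extracted, or only a set comprising the time domain envelope parameter, frequency domain envelope parameter, and coding mode parameter. The processing may specifically include: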
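(1) Determine the coding mode parameter CM. Specifically, the cross-correlation between the low-band signal spectrum and the high-band signal spectrum can be computed first, to establish the similarity between the low-band and high-band harmonic structures. When the cross-correlation exceeds a threshold, the two harmonic structures are judged to be similar, CM is set to 1, and the high-band spectral signal is reconstructed from the low-band spectral signal by spectral copying with shaping, or by some method other than spectral copying. When the cross-correlation is less than or equal to the threshold, the two harmonic structures are judged not to be similar, CM is set to 0, and the high-band spectral signal is reconstructed artificially from the parameters. In practical applications a simpler coding mode decision can of course also be used: CM is set to 1 when the harmonic interval PG is below a threshold, and to 0 otherwise;

(2) Compute the subframe energy envelope $\{temp\_env(0), temp\_env(1), \ldots, temp\_env(N-1)\}$ of the signal and the global gain factor gain; in this embodiment N = 8. These two sets of values are used to decide whether the signal is stationary or transient. For a stationary signal, the global gain factor gain is quantized, and the quantized value is used as the time domain envelope parameter, encoded, and written to the bitstream; for a transient signal, the subframe energy envelope is quantized, and the quantized values are used as the time domain envelope parameter, encoded, and written to the bitstream. The 8~16 kHz band signal is then normalized in the time domain according to the time domain envelope parameter, giving the time-domain-normalized signal;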
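(3) The time-domain-normalized signal is MDCT-transformed (Modified Discrete Cosine Transform; e.g. 640-point), giving a set of MDCT coefficients, i.e. the frequency-domain signal of this band, $\{y\_swb(0), y\_swb(1), \ldots, y\_swb(319)\}$. Because the super-wideband encoder is only required to process signals up to 14 kHz, only the part $\{y\_swb(0), y\_swb(1), \ldots, y\_swb(239)\}$ of the frequency-domain signal is processed. This set is divided into N subbands; the subband energy of each subband is extracted and quantized, giving a set of quantized frequency domain envelopes $\{spec\_env(0), spec\_env(1), \ldots, spec\_env(N-1)\}$, which are the frequency domain envelope parameters of the 8~14 kHz band.

For the wideband core encoder G.729.1, the 7~8 kHz part of the signal is outside its processing range, so, to ensure the spectral continuity of the decoded signal at the decoding end, characteristic parameters of the 7~8 kHz part also need to be extracted. The G.729.1 encoder applies an MDCT (e.g. 320-point) to the 4~8 kHz signal, giving the frequency-domain signal $\{y\_wb(0), y\_wb(1), \ldots, y\_wb(159)\}$; the part of this signal corresponding to 7~8 kHz is divided into M subbands, and the frequency domain envelope of each subband is extracted and quantized, giving a set of quantized frequency domain envelopes for the 7~8 kHz band, $\{spec\_env\_extra(0), spec\_env\_extra(1), \ldots, spec\_env\_extra(M-1)\}$. Together with the frequency domain envelope parameters of the 8~14 kHz band, these make up the complete frequency domain envelope parameter; the envelopes are encoded and can be transmitted to the decoding end. In this embodiment N = 15 and M = 3;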
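(4) Extract the pitch parameter. Specifically, the parameter can be extracted directly in the MDCT domain; to improve encoder performance further, it is also possible not to extract it directly in the MDCT domain but to compute a pseudo-spectrum signal from the original frequency-domain signal $\{y\_swb(0), y\_swb(1), \ldots, y\_swb(239)\}$ and compute the pitch parameter from this pseudo-spectrum. The pseudo-spectrum signal $S(k) = \{S(0), S(1), \ldots, S(239)\}$ can be computed as follows:

$$S(k) = \begin{cases} \sqrt{y\_swb^2(0) + y\_swb^2(1)}, & k = 0 \\ \sqrt{y\_swb^2(239) + y\_swb^2(238)}, & k = 239 \\ \sqrt{y\_swb^2(k) + \left(y\_swb(k+1) - y\_swb(k-1)\right)^2}, & \text{otherwise} \end{cases}$$

Other methods can of course also be used, such as taking the absolute value of the original frequency-domain signal directly, $S(k) = |y\_swb(k)|$.

The autocorrelation function can be computed from the pseudo-spectrum signal via the frequency domain, for example $ACF(k_0) = IFFT(|FFT(S(k))|^2)$, where FFT is the fast Fourier transform and IFFT its inverse; it can also be computed directly, for example $ACF(k_0) = \sum_{k=0}^{239} S(k)\,S(k + k_0)$. In addition, the average magnitude difference function (AMDF) can be used to enhance the autocorrelation function;

The pitch parameter can be expressed as the ratio of the maximum to the minimum of the autocorrelation function, for example $T = \max(ACF(k_0)) / \min(ACF(k_0))$, where the maximum and minimum are taken within a desired range or within a range that is useful for computing the harmonic interval parameter;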
( 5 ) 根据 ^^(^ ), 估计谐波间隔参数 PG; 高频带信号的谐波间隔参数, 通常是在频域 (或 变换域)下提取的; 谐波间隔的整数值可以通过峰值提取方法由自相关函数估计出来, 例如根据
PG = arg maX ACF k。 ))获得,其中最大值的获取可以是限定在一个期望的范围内或者是感兴趣的 范围内进行的, 谐波间隔的分数值可以在适当地内插自相关函数 ^^(^)之后, 通过峰值提取的方 法获得; 也可以只在求得的整数谐波间隔附近进行自相关函数的内插计算, 之后通过峰值提取的方 法获得谐波间隔的分数值;
(6 ) 还可以对估计的谐波间隔参数值进行修正, 以抑制倍频和分数频的产生; 例如, 将求得 的当前帧的谐波间隔 PG与前一帧的谐波间隔 old— PG进行比较, 如果当前帧的谐波间隔与前一帧谐波 间隔之间的比值小于某个域值 (如 0. 1 ) 且 ACF (old— PG) >0. 95ACF (PG), 则用前一帧的谐波间隔代替 本帧求得的谐波间隔 PG=old— PG;
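(7) Estimate the first harmonic offset parameter P0 from the harmonic interval. For example, within the harmonic interval range, i.e., [0, PG], the first harmonic component is placed at each candidate offset position in turn, the other harmonics are placed successively at the harmonic interval, and the correlation between the resulting spectrum and the pseudo-spectrum is computed; the offset position with the largest correlation is the first harmonic offset sought, for example
Figure imgf000017_0001
where ⌊·⌋ denotes rounding down. It should be noted that the harmonic interval parameter and the first harmonic offset parameter are in fact correlated to some degree, so the first harmonic offset parameter of the high-band signal can be estimated from the harmonic interval parameter; conversely, the first harmonic offset parameter can be used to further refine the estimate of the harmonic interval parameter, giving better parameter extraction;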
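The offset search in step (7) can be sketched as follows. The exact formula is in a figure that is not reproduced in the text, so this sketch assumes the synthetic spectrum is a comb of unit pulses, in which case its correlation with the pseudo-spectrum reduces to summing the pseudo-spectrum at the comb positions. It also assumes an integer interval and searches offsets 0..PG-1; the function name is hypothetical.

```python
def first_harmonic_offset(s, pg):
    """Offset in [0, pg) whose unit-pulse comb (spacing pg) best matches s.

    Assumes a unit-pulse comb and an integer interval pg >= 1.
    """
    def comb_correlation(offset):
        # Correlation of a unit-pulse comb with s = sum of s at the pulse bins.
        return sum(s[k] for k in range(offset, len(s), pg))

    return max(range(pg), key=comb_correlation)


# Toy pseudo-spectrum: peaks every 12 bins, starting at bin 5.
s = [0.0] * 240
for k in range(5, 240, 12):
    s[k] = 1.0
p0 = first_harmonic_offset(s, 12)
```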
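(8) When CM equals 1, a set of parameters comprising the coding mode parameter, the time-domain envelope parameters, the frequency-domain envelope parameters, and the tonality parameter is quantized or encoded and transmitted to the decoding end (i.e., the high-frequency parameter bitstream is transmitted). When CM equals 0, a set comprising the coding mode parameter, the time-domain envelope parameters, the frequency-domain envelope parameters, the tonality parameter, the harmonic interval parameter, and the first harmonic offset parameter is quantized or encoded and transmitted to the decoding end (i.e., the high-frequency parameter bitstream is transmitted);
It should be noted that when CM equals 1, it is also possible to quantize or encode only the set comprising the coding mode parameter, the time-domain envelope parameters, and the frequency-domain envelope parameters, and transmit it to the decoding end;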
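The choice of parameter set in step (8) can be summarized in a small sketch. The parameter labels and the function name are hypothetical; only the grouping by CM value comes from the text.

```python
# Parameters sent in every mode (labels are illustrative, not from the source).
COMMON = ("coding_mode", "time_envelope", "freq_envelope")


def parameters_to_transmit(cm, send_tonality_in_mode_1=True):
    """Return the names of the parameters placed in the high-frequency bitstream."""
    if cm == 1:
        # Spectra of the bands are similar: tonality is optional here.
        return COMMON + (("tonality",) if send_tonality_in_mode_1 else ())
    if cm == 0:
        # Spectra are not similar: harmonic structure must be described explicitly.
        return COMMON + ("tonality", "harmonic_interval", "first_harmonic_offset")
    raise ValueError("CM must be 0 or 1")
```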
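54: After parametric audio coding of the high-band signal is complete, whether to enhance the parametrically coded high-frequency signal with the optional RIRAC audio coding can be decided according to the number of coding bits remaining;
The enhancement used in this embodiment is transform coding of the high-band signal in the MDCT domain. Other methods may of course be chosen to enhance the parametrically coded high-frequency signal, such as transform coding of the residual between the original high-band signal and the parametrically coded high-band signal. The high-frequency enhancement bitstream is then transmitted.
Correspondingly, after receiving the low-frequency bitstream, the high-frequency parameter bitstream, and the high-frequency enhancement bitstream, the decoding end decodes them and synthesizes the audio signal. FIG. 6 is a schematic diagram of the processing at the decoding end in an embodiment of the present invention; as shown in FIG. 6, the decoding process may include: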
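61: Signal synthesis in the 0~8 kHz band is done by core decoding;
62: Signal synthesis in the 8~16 kHz band is done by parametric audio decoding, which specifically includes:
(1) Decode the coding mode parameter CM from the received data;
(2) Decode the time-domain envelope parameters and the frequency-domain envelope parameters from the data;
(3) If CM equals 1, the tonality parameter can be decoded from the received data, and the high-band spectral signal is reconstructed from the low-band spectrum by spectral replication with shaping, or by some method other than spectral replication. For example, the spectral signal of the low-band signal obtained by core decoding can be shaped according to the tonality parameter, and the shaped spectral signal used as the reconstructed high-band spectral signal;
It should be noted that if the encoding end did not transmit the tonality parameter when CM equals 1, the decoding end uses the low-band spectral signal obtained by core decoding directly as the reconstructed high-band spectral signal;
If CM equals 0, the tonality parameter, the harmonic interval parameter, and the first harmonic offset parameter can be decoded from the received data, and the high-band spectral signal is reconstructed by an artificial reconstruction method based on these three parameters. The reconstruction is based on a harmonic signal plus a noise signal. Specifically, harmonics with random phase are placed as pulses at certain frequency bins in the frequency domain to rebuild the harmonic signal; the pulse spacing is determined by the harmonic interval parameter, and the position of the first pulse is obtained from the first harmonic offset. The noise signal can be obtained from a random number generator. The ratio between the harmonic signal and the noise signal is adjusted according to the value of the tonality parameter T, and the adjusted harmonic and noise signals are added to give the reconstructed spectral signal. The adjustment can be done in several ways; for example, first compute the energies of the harmonic signal and the noise signal, denoted enerP and enerN, then compute the adjustment factors from T, enerP, and enerN, and obtain the reconstructed spectral signal as the scaled harmonic signal plus the scaled noise signal buf_noise(k):
Figure imgf000018_0001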
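The harmonic-plus-noise reconstruction for CM = 0 can be sketched as below. The patent's actual adjustment factors are not recoverable from the text, so the mixing rule here is an assumption: a hypothetical `tonal_fraction` in (0, 1] sets the share of total energy given to the harmonic part, and the noise is scaled to take the rest. Because MDCT coefficients are real, a random sign stands in for the random phase. All names are illustrative.

```python
import math
import random


def reconstruct_spectrum(num_bins, pg, p0, tonal_fraction, seed=0):
    """Place pulses every pg bins starting at p0, then add scaled noise.

    tonal_fraction (assumed, 0 < value <= 1) is the harmonic share of the
    total energy; it is NOT the patent's T, whose mapping is unknown here.
    """
    rng = random.Random(seed)

    # Harmonic part: unit pulses with random sign at p0, p0 + pg, p0 + 2*pg, ...
    harmonic = [0.0] * num_bins
    for k in range(p0, num_bins, pg):
        harmonic[k] = rng.choice((-1.0, 1.0))

    # Noise part from a random number generator.
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_bins)]

    ener_p = sum(h * h for h in harmonic)
    ener_n = sum(n * n for n in noise)

    # Scale the noise so that harmonic : noise energy = tonal_fraction : (1 - tonal_fraction).
    gain = math.sqrt(ener_p * (1.0 - tonal_fraction) / (tonal_fraction * ener_n))
    return [h + gain * n for h, n in zip(harmonic, noise)]
```

With `tonal_fraction=1.0` the noise gain is zero and the output is the pure pulse train; smaller values fill the gaps between harmonics with progressively more noise.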
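(4) Apply frequency-domain shaping to the reconstructed spectral signal according to the decoded frequency-domain envelope, for example frequency-domain de-normalization, and transform the shaped spectral signal to the time domain; the transform can be, for example, an inverse MDCT or an inverse FFT;
(5) Apply time-domain shaping according to the decoded time-domain envelope parameters, for example time-domain de-normalization, to obtain the decoded high-frequency signal;
It should be noted that in the time-domain and frequency-domain de-normalization, an optional smoothing filter can also be applied to the time-domain and frequency-domain envelopes. If the high-band spectral signal is artificially reconstructed, once a harmonic is placed in the wrong subband, de-normalization uses the wrong envelope factor. Even a slight deviation in harmonic position introduces some distortion, which smoothing can reduce. Specifically, if there is a very strong tonal component near a subband boundary, the interpolated subband energy envelope factors can be used for frequency-domain de-normalization; the resulting signal is then transformed to the time domain, and a time-domain gain function is interpolated in the time domain from the adaptive subframe energy envelope (ATE); this time-domain gain function is finally used to de-normalize the time-domain signal;
63: After the high-band signal has been decoded in 62, whether to enhance the decoded high-frequency signal can be decided according to the number of bits remaining in the received data; the method corresponds to the enhancement used at the encoding end and is not repeated here;
64: The synthesized signal of the 0~8 kHz band and the synthesized signal of the 8~16 kHz band are passed through QMF synthesis filtering to obtain the final synthesized audio signal at a 32 kHz sampling rate.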
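In this embodiment, with the audio signal divided into a low-band signal and a high-band signal, parametric encoding and decoding are applied to the high-band signal: a coding mode parameter indicates whether encoding and decoding use a set of parameters characterizing the signal that comprises the time-domain envelope, frequency-domain envelope, tonality, harmonic interval, and first harmonic offset, or a set that comprises the time-domain envelope, frequency-domain envelope, and tonality. The set of parameters used in the embodiments of the present invention reduces the number of parameters needed for encoding and the number of bits needed for parametric coding, which addresses the high bit count of the existing RIRAC coding method. Moreover, compared with existing parametric audio coding algorithms based on multiple models, this set of parameters involves no parameters of the transient model or the single-spectral-line model, and no parameters of the noise model other than the tonality parameter, so fewer parameters are needed and encoding can use fewer bits, further lowering the coding rate. For a given channel capacity, the lower bit count allows a signal of higher bandwidth to be encoded, so a larger coded bandwidth and higher coding quality are obtained at a lower coding rate. The decoding end can likewise synthesize a high-quality audio signal from fewer bits, and the decoded audio quality is better still when the harmonic structure of the audio signal is pronounced.
In another embodiment of the present invention, whereas the embodiment above extracts the time-domain envelope parameters first and the frequency-domain envelope parameters afterwards, encoding is implemented by extracting the frequency-domain envelope parameters first (taking as an example the same audio signal and band-splitting method as in the embodiment above).
In this embodiment, processing of the high-band signal at the encoding end may include:
(1): Determine the coding mode parameter CM by the method in (1) at the encoding end of the embodiment above;
(2): The time-domain signal in the 8~16 kHz band is transformed by MDCT to obtain a set of MDCT coefficients. Because the super-wideband part processes only the signal in the 8~14 kHz band, only the part {y_swb(0), y_swb(1), ..., y_swb(239)} of the frequency-domain signal is processed. For core coding, the 7~8 kHz part of the signal is outside its processing range; to ensure continuity of the decoded spectrum at the decoding end, the encoding end needs to extract the 7~8 kHz part of the MDCT-domain signal, {y_wb(120), y_wb(121), ..., y_wb(159)};
(3): Split the MDCT coefficients of the 8~14 kHz band into subbands and compute the energy of each subband as the frequency-domain envelope parameters, which are quantized, encoded, and transmitted;
(4): Apply frequency-domain normalization to the MDCT coefficients of the 8~14 kHz band, and extract linear prediction coefficients from the normalized MDCT coefficients as the time-domain envelope parameters; these linear prediction coefficients are quantized, encoded, and transmitted;
(5): Apply linear prediction filtering to the frequency-normalized MDCT coefficients to obtain the linear prediction residual in the MDCT domain;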
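Steps (4) and (5) of this embodiment amount to linear prediction along the frequency index of the normalized MDCT coefficients. The sketch below assumes the autocorrelation method with the Levinson-Durbin recursion, which the source does not specify; the prediction order and the function names are likewise hypothetical, and quantization of the coefficients is omitted.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the coefficient sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_j a[j] * x[n - j]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:
            break  # signal already perfectly predicted (or all zero)
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, coeffs):
    """Prediction residual e[n] = x[n] - sum_j coeffs[j-1] * x[n - j]."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - j - 1] for j, c in enumerate(coeffs) if n - j - 1 >= 0)
        out.append(x[n] - pred)
    return out
```

For a geometrically decaying sequence such as `[0.5 ** n for n in range(64)]`, a first-order predictor comes out very close to 0.5 and the residual is nearly zero after the first sample.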
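(6): Extract the tonality parameter, harmonic interval parameter, and first harmonic offset parameter of the high-frequency signal by the methods in (4)~(8) of step 53 at the encoding end of the embodiment above. When the coding mode is 1, only the coding mode parameter, time-domain envelope parameters, frequency-domain envelope parameters, and tonality parameter are transmitted to the decoding end; when the coding mode is 0, the coding mode parameter, time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter are all transmitted to the decoding end;
Correspondingly, processing of the high-band signal at the decoding end may include:
(7): Decode the coding mode parameter CM from the received bitstream, and decode the time-domain envelope parameters and frequency-domain envelope parameters from the bitstream. Specifically, the quantized linear prediction coefficients, i.e., the time-domain envelope parameters, can be obtained by codebook lookup, so that time-domain shaping can subsequently be performed with them; the quantized subband energies, i.e., the frequency-domain envelope parameters, can be obtained by codebook lookup, so that frequency-domain shaping can subsequently be performed with them;
(8): Reconstruct the high-band spectral signal by the method in (3) of step 62 at the decoding end of the embodiment above;
(9): Pass the reconstructed high-band spectral signal through the inverse linear prediction filter, which is equivalent to time-domain shaping of the reconstructed high-band spectral signal;
(10): Apply frequency-domain shaping to the reconstructed high-band spectral signal according to the quantized subband energies;
(11): Transform the shaped high-band spectral signal to the time domain by inverse MDCT to obtain the final synthesized high-band signal.
As described above, this embodiment implements encoding by extracting the frequency-domain envelope parameters first. The order in which the parameters are obtained is not fixed: in whatever order, it suffices to obtain the coding mode parameter, time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal.
The set of parameters used in the embodiments of the present invention reduces the number of parameters needed for encoding and the number of bits needed for parametric coding, which addresses the high bit count of the existing RIRAC coding method. Moreover, compared with existing parametric audio coding algorithms based on multiple models, this set of parameters involves no parameters of the transient model or the single-spectral-line model, and no parameters of the noise model other than the tonality parameter, so fewer parameters are needed and encoding can use fewer bits, further lowering the coding rate. For a given channel capacity, the lower bit count allows a signal of higher bandwidth to be encoded, so a larger coded bandwidth and higher coding quality are obtained at a lower coding rate. The decoding end can likewise synthesize a high-quality audio signal from fewer bits, and the decoded audio quality is better still when the harmonic structure of the audio signal is pronounced.
Corresponding to the audio encoding method provided by the embodiments of the present invention, an embodiment of the present invention also provides a corresponding audio encoding apparatus, whose structure is shown in FIG. 7 and which may include: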
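a parameter extraction unit 71, configured to extract the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter that characterize an audio signal; when the harmonic interval of the audio signal differs in value from the first harmonic offset, it is further configured to extract the first harmonic offset parameter that characterizes the audio signal and pass it to the sending unit;
a sending unit 72, configured to encode the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter and transmit them to the decoding end; or configured to encode the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter and transmit them to the decoding end.
Corresponding to the audio decoding method provided by the embodiments of the present invention, an embodiment of the present invention also provides a corresponding audio decoding apparatus, whose structure is shown in FIG. 8 and which may include:
a decoding unit 81, configured to decode the received data to obtain the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter that characterize an audio signal; and further configured to decode received data containing the first harmonic offset parameter to obtain the first harmonic offset parameter that characterizes the audio signal;
a synthesis unit 82, configured to synthesize the audio signal from the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter, or from the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter.
The synthesis unit 82 may specifically include:
a harmonic reconstruction subunit 821, configured to obtain a harmonic signal from the harmonic interval parameter; or, when the harmonic interval characterizing the audio signal differs from the first harmonic offset, to obtain the harmonic signal from the harmonic interval parameter and the first harmonic offset parameter;
a spectral signal reconstruction subunit 822, configured to adjust, according to the tonality parameter, the ratio between the harmonic signal obtained by the harmonic reconstruction subunit 821 and a noise signal, and to obtain the reconstructed spectral signal from the adjusted harmonic signal and noise signal;
a shaping subunit 823, configured to process the spectral signal reconstructed by the spectral signal reconstruction subunit 822 according to the frequency-domain envelope parameters and time-domain envelope parameters to obtain the synthesized audio signal. For example: apply frequency-domain shaping to the reconstructed spectral signal according to the frequency-domain envelope parameters, then apply time-domain shaping to the frequency-shaped signal according to the time-domain envelope parameters to obtain the synthesized audio signal; or apply time-domain shaping to the reconstructed spectral signal according to the time-domain envelope parameters, then apply frequency-domain shaping to the time-shaped signal according to the frequency-domain envelope parameters to obtain the synthesized audio signal.
Corresponding to the audio encoding apparatus and audio decoding apparatus provided by the embodiments of the present invention, an embodiment of the present invention also provides a corresponding audio codec system, whose structure is shown in FIG. 9 and which may include: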
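an encoding apparatus 91, configured to extract the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter that characterize an audio signal, encode them, and send them to the decoding apparatus; and
a decoding apparatus 92, configured to decode the data sent by the encoding apparatus to obtain the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter, and to synthesize the audio signal from them.
The encoding apparatus 91 may specifically include:
a parameter extraction unit 911, configured to extract the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter of the audio signal; when the harmonic interval of the audio signal differs in value from the first harmonic offset, it is further configured to extract the first harmonic offset parameter of the audio signal;
a sending unit 912, configured to encode the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter, or the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter, and transmit them to the decoding apparatus.
The decoding apparatus 92 may specifically include:
a decoding unit 921, configured to decode the received data to obtain the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter, or the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter;
a synthesis unit 922, configured to synthesize the audio signal from the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter, or from the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, harmonic interval parameter, and first harmonic offset parameter.
Corresponding to the encoding processing method provided by the embodiments of the present invention, an embodiment of the present invention also provides a corresponding encoding processing apparatus, whose structure is shown in FIG. 10 and which may include: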
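a judging unit 101, configured to judge whether the spectral signal of the audio signal of the current band is similar to the spectral signal of the audio signal of the previous band; the value of the coding mode parameter can be used to indicate whether they are similar;
an encoding unit 102, configured, according to the judgment result obtained by the judging unit 101, to extract the time-domain envelope parameters and frequency-domain envelope parameters that characterize the audio signal, and also the tonality parameter, when the spectral signal of the current band is similar to that of the previous band; or to extract the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter that characterize the audio signal when the spectral signal of the current band is not similar to that of the previous band; when the harmonic interval of the audio signal differs in value from the first harmonic offset, it is further configured to extract the first harmonic offset parameter of the audio signal;
a transmission unit 103, configured to send the information, obtained by the judging unit 101, that the spectral signal of the current band is similar to that of the previous band (for example, by encoding and sending the coding mode parameter), and to encode and send the time-domain envelope parameters and frequency-domain envelope parameters (optionally also the tonality parameter) extracted by the encoding unit; or to send the information, obtained by the judging unit, that the spectral signal of the current band is not similar to that of the previous band, and to encode and send the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter (optionally also the first harmonic offset parameter) extracted by the encoding unit.
Corresponding to the decoding processing method provided by the embodiments of the present invention, an embodiment of the present invention also provides a corresponding decoding processing apparatus, whose structure is shown in FIG. 11 and which may include:
an information receiving unit 111, configured to receive information indicating that the spectral signal of the audio signal of the current band is similar to that of the previous band, and to decode the received data to obtain the time-domain envelope parameters and frequency-domain envelope parameters that characterize the audio signal; or to receive information indicating that the spectral signal of the current band is not similar to that of the previous band, and to decode the received data to obtain the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter that characterize the audio signal; and further configured to decode data containing the first harmonic offset parameter to obtain the first harmonic offset parameter that characterizes the audio signal.
Specifically, the information receiving unit 111 can determine from the received coding mode parameter whether the spectral signal of the current band is similar or not similar to that of the previous band;
a decoding unit 112, configured to synthesize the audio signal according to the similarity information received by the information receiving unit 111 together with the time-domain envelope parameters and frequency-domain envelope parameters that characterize the audio signal; or according to the dissimilarity information together with the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter that characterize the audio signal.
Specifically, when information is received that the spectral signal of the current band is similar to that of the previous band, the decoding unit 112, as shown in FIG. 12, may include:
a reconstruction subunit 121, configured to reconstruct the spectral signal to obtain a reconstructed spectral signal;
a second shaping subunit 122, configured to shape the spectral signal reconstructed by the reconstruction subunit 121 according to the tonality parameter to obtain a shaped reconstructed spectral signal;
a first shaping subunit 123, configured to process the reconstructed spectral signal (or the shaped spectral signal) according to the frequency-domain envelope parameters and time-domain envelope parameters to obtain the synthesized audio signal. For example, processing the reconstructed spectral signal shaped by the second shaping subunit includes: applying frequency-domain shaping to the reconstructed spectral signal according to the frequency-domain envelope parameters, then applying time-domain shaping to the frequency-shaped signal according to the time-domain envelope parameters to obtain the synthesized audio signal; or applying time-domain shaping to the reconstructed spectral signal according to the time-domain envelope parameters, then applying frequency-domain shaping to the time-shaped signal according to the frequency-domain envelope parameters to obtain the synthesized audio signal.
When information is received that the spectral signal of the current band is not similar to that of the previous band, the decoding unit 112, as shown in FIG. 12, may include:
a harmonic reconstruction subunit 124, configured to obtain a harmonic signal from the harmonic interval parameter, or from the harmonic interval parameter and the first harmonic offset parameter;
a spectral signal reconstruction subunit 125, configured to adjust the ratio between the harmonic signal and a noise signal according to the tonality parameter, and to obtain the reconstructed spectral signal from the adjusted harmonic signal and noise signal;
a third shaping subunit 126, configured to process the reconstructed spectral signal according to the frequency-domain envelope parameters and time-domain envelope parameters to obtain the synthesized audio signal. For example: apply frequency-domain shaping to the reconstructed spectral signal according to the frequency-domain envelope parameters, then apply time-domain shaping to the frequency-shaped signal according to the time-domain envelope parameters to obtain the synthesized audio signal; or apply time-domain shaping to the reconstructed spectral signal according to the time-domain envelope parameters, then apply frequency-domain shaping to the time-shaped signal according to the frequency-domain envelope parameters to obtain the synthesized audio signal.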
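The embodiments of the present invention described above can be applied to, but are not limited to, audio codec devices.
In summary, compared with the prior art, the embodiments of the present invention characterize an audio signal with a set of parameters comprising the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter (optionally also the first harmonic offset parameter). When encoding an audio signal, this reduces the number of bits needed for parametric coding, so the signal can be encoded with fewer bits and at a lower coding rate, giving a larger coded bandwidth and higher coding quality at a lower rate; for signals with a pronounced harmonic structure in particular, the embodiments can achieve very good coding quality. In the encoding and decoding processing schemes provided by the embodiments, when the audio signal is encoded band by band, it is judged whether the spectral signal of the current band is similar to that of the previous band. When they are not similar, a set of parameters comprising the time-domain envelope parameters, frequency-domain envelope parameters, tonality parameter, and harmonic interval parameter (optionally also the first harmonic offset parameter) is extracted; when they are similar, only a set comprising the time-domain envelope parameters and frequency-domain envelope parameters (optionally also the tonality parameter) is extracted. This exploits the spectral similarity between bands of the signal to lower the coding rate further and obtain a larger coded bandwidth. Based on these parameters, the decoding end can use different spectral reconstruction methods for signals with different characteristics when decoding band by band, so it adapts better to signal characteristics and can achieve equally high synthesis quality for different signals. In other words, for a given channel capacity, the lower number of coding bits allows a signal of higher bandwidth to be encoded. Because a wider signal bandwidth gives a better listening experience, for a given channel capacity the method provided by the present invention can obtain a higher coded bandwidth and higher synthesis quality. The band-by-band encoding and decoding scheme provided by the embodiments can likewise obtain a larger coded bandwidth and higher coding quality at a lower coding rate.
Those of ordinary skill in the art will understand that all or part of the processes of the method embodiments above can be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above are only preferred specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be that of the claims.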

Claims

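1. An audio encoding method, characterized by comprising:
extracting a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize an audio signal; and encoding the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter and transmitting them to a decoding end.
2. The method according to claim 1, characterized in that the method further comprises: when the harmonic interval of the audio signal differs in value from the first harmonic offset, further extracting a first harmonic offset parameter that characterizes the audio signal, encoding it, and transmitting it to the decoding end.
3. An audio encoding apparatus, characterized by comprising:
a parameter extraction unit, configured to extract a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize an audio signal;
a sending unit, configured to encode the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter and transmit them to a decoding end.
4. The encoding apparatus according to claim 3, characterized in that the parameter extraction unit is further configured to:
when the harmonic interval of the audio signal differs in value from the first harmonic offset, further extract a first harmonic offset parameter that characterizes the audio signal and pass it to the sending unit;
the sending unit is further configured to encode the first harmonic offset parameter and transmit it to the decoding end.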
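5. An audio decoding method, characterized by comprising:
decoding received data to obtain a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize an audio signal;
synthesizing the audio signal from the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter.
6. The method according to claim 5, characterized in that the method further comprises: decoding received data containing a first harmonic offset parameter to obtain the first harmonic offset parameter that characterizes the audio signal.
7. The method according to claim 5, characterized in that the step of synthesizing the audio signal comprises:
obtaining a harmonic signal from the harmonic interval parameter;
adjusting the ratio between the harmonic signal and a noise signal according to the tonality parameter, and obtaining a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
8. The method according to claim 6, characterized in that the step of synthesizing the audio signal comprises:
obtaining a harmonic signal from the harmonic interval parameter and the first harmonic offset parameter;
adjusting the ratio between the harmonic signal and a noise signal according to the tonality parameter, and obtaining a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
9. The method according to claim 7 or 8, characterized in that processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal comprises:
applying frequency-domain shaping to the reconstructed spectral signal according to the frequency-domain envelope parameter to obtain a frequency-shaped signal, and applying time-domain shaping to the frequency-shaped signal according to the time-domain envelope parameter to obtain the synthesized audio signal;
or, applying time-domain shaping to the reconstructed spectral signal according to the time-domain envelope parameter to obtain a time-shaped signal, and applying frequency-domain shaping to the time-shaped signal according to the frequency-domain envelope parameter to obtain the synthesized audio signal.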
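10. An audio decoding apparatus, characterized by comprising:
a decoding unit, configured to decode received data to obtain a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize an audio signal;
a synthesis unit, configured to synthesize the audio signal from the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter.
11. The decoding apparatus according to claim 10, characterized in that the decoding unit is further configured to:
decode received data containing a first harmonic offset parameter to obtain the first harmonic offset parameter that characterizes the audio signal.
12. The decoding apparatus according to claim 11, characterized in that the synthesis unit comprises:
a harmonic reconstruction subunit, configured to obtain a harmonic signal from the harmonic interval parameter; or, when the harmonic interval characterizing the audio signal differs from the first harmonic offset, to obtain the harmonic signal from the harmonic interval parameter and the first harmonic offset parameter;
a spectral signal reconstruction subunit, configured to adjust, according to the tonality parameter, the ratio between the harmonic signal obtained by the harmonic reconstruction subunit and a noise signal, and to obtain a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
a shaping subunit, configured to process the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
13. An audio codec system, characterized by comprising:
an encoding apparatus, configured to extract a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize an audio signal, encode them, and send them to a decoding apparatus;
a decoding apparatus, configured to decode the data sent by the encoding apparatus to obtain the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter, and to synthesize the audio signal from them.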
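14. An encoding processing method, characterized by comprising:
when an audio signal is encoded band by band, if the spectral signal of the audio signal of the current band is similar to the spectral signal of the audio signal of the previous band, extracting a time-domain envelope parameter and a frequency-domain envelope parameter that characterize the audio signal, encoding and sending them, and also sending information indicating that the spectral signal of the current band is similar to that of the previous band;
if the spectral signal of the audio signal of the current band is not similar to that of the previous band, extracting a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize the audio signal, encoding and sending them, and also sending information indicating that the spectral signal of the current band is not similar to that of the previous band.
15. The method according to claim 14, characterized in that:
the information indicating that the spectral signal of the current band is similar or not similar to that of the previous band is represented by a coding mode parameter; the coding mode parameter instructs the decoding end, when the spectral signal of the current band is similar to that of the previous band, to decode the audio signal of the current band from the time-domain envelope parameter and frequency-domain envelope parameter of the audio signal, and, when the spectral signal of the current band is not similar to that of the previous band, to decode the audio signal of the current band from the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter that characterize the audio signal.
16. The method according to claim 14, characterized in that the method further comprises: when extracting the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter that characterize the audio signal, if the harmonic interval of the audio signal differs in value from the first harmonic offset, also extracting the first harmonic offset parameter of the audio signal, encoding it, and sending it.
17. The method according to claim 14, characterized in that the method further comprises: when extracting the time-domain envelope parameter and frequency-domain envelope parameter that characterize the audio signal, also extracting the tonality parameter of the audio signal, encoding it, and sending it.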
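18. An encoding processing apparatus, characterized by comprising:
a judging unit, configured to judge whether the spectral signal of the audio signal of the current band is similar to the spectral signal of the audio signal of the previous band;
an encoding unit, configured, according to the judgment result obtained by the judging unit, to extract a time-domain envelope parameter and a frequency-domain envelope parameter that characterize the audio signal when the spectral signal of the current band is similar to that of the previous band; or to extract a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize the audio signal when the spectral signal of the current band is not similar to that of the previous band;
a transmission unit, configured to send the information, obtained by the judging unit, that the spectral signal of the current band is similar to that of the previous band, and to encode and send the time-domain envelope parameter and frequency-domain envelope parameter extracted by the encoding unit; or to send the information, obtained by the judging unit, that the spectral signal of the current band is not similar to that of the previous band, and to encode and send the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter extracted by the encoding unit.
19. The encoding processing apparatus according to claim 18, characterized in that:
the encoding unit, when extracting the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter that characterize the audio signal because the spectral signal of the current band is not similar to that of the previous band, is further configured to:
extract the first harmonic offset parameter of the audio signal when the harmonic interval of the audio signal differs in value from the first harmonic offset;
the transmission unit is further configured to encode and send the first harmonic offset parameter.
20. The encoding processing apparatus according to claim 18, characterized in that:
the encoding unit, when extracting the time-domain envelope parameter and frequency-domain envelope parameter that characterize the audio signal because the spectral signal of the current band is similar to that of the previous band, is further configured to extract the tonality parameter of the audio signal;
the transmission unit is further configured to encode and send the tonality parameter.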
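21. A decoding processing method, characterized by comprising:
receiving data sent by an encoding end; if information is received indicating that the spectral signal of the audio signal of the current band is similar to the spectral signal of the audio signal of the previous band, synthesizing the audio signal from a time-domain envelope parameter and a frequency-domain envelope parameter that characterize the audio signal, wherein the time-domain envelope parameter and frequency-domain envelope parameter are decoded from the received data;
if information is received indicating that the spectral signal of the current band is not similar to that of the previous band, synthesizing the audio signal from a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize the audio signal, wherein these parameters are decoded from the received data.
22. The method according to claim 21, characterized in that the information indicating that the spectral signal of the current band is similar or not similar to that of the previous band is represented by a coding mode parameter, and the method comprises:
determining from the received coding mode parameter whether the spectral signal of the current band is similar or not similar to that of the previous band.
23. The method according to claim 22, characterized in that, if the spectral signal of the current band is similar to that of the previous band, the step of synthesizing the audio signal comprises:
reconstructing a spectral signal;
processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
24. The method according to claim 23, characterized in that reconstructing the spectral signal comprises:
reconstructing the spectral signal by spectral replication.
25. The method according to claim 23, characterized in that the data sent by the encoding end further comprises a tonality parameter that characterizes the audio signal;
after reconstructing the spectral signal, the method further comprises:
shaping the reconstructed spectral signal according to the tonality parameter to obtain a shaped reconstructed spectral signal;
and processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal is: processing the shaped reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
26. The method according to claim 21, characterized in that the data sent by the encoding end further comprises a first harmonic offset parameter.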
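27. The method according to claim 21, characterized in that, if the spectral signal of the audio signal of the current band is not similar to that of the previous band, the step of synthesizing the audio signal comprises:
obtaining a harmonic signal from the harmonic interval parameter;
adjusting the ratio between the harmonic signal and a noise signal according to the tonality parameter, and obtaining a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
28. The method according to claim 26, characterized in that, if the spectral signal of the audio signal of the current band is not similar to that of the previous band, the step of synthesizing the audio signal comprises:
obtaining a harmonic signal from the harmonic interval parameter and the first harmonic offset parameter;
adjusting the ratio between the harmonic signal and a noise signal according to the tonality parameter, and obtaining a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
29. The method according to claim 23, 27, or 28, characterized in that processing the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal comprises:
applying frequency-domain shaping to the reconstructed spectral signal according to the frequency-domain envelope parameter to obtain a frequency-shaped signal, and applying time-domain shaping to the frequency-shaped signal according to the time-domain envelope parameter to obtain the synthesized audio signal;
or, applying time-domain shaping to the reconstructed spectral signal according to the time-domain envelope parameter to obtain a time-shaped signal, and applying frequency-domain shaping to the time-shaped signal according to the frequency-domain envelope parameter to obtain the synthesized audio signal.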
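30. A decoding processing apparatus, characterized by comprising:
an information receiving unit, configured to receive information indicating that the spectral signal of the audio signal of the current band is similar to the spectral signal of the audio signal of the previous band, and to decode the received data to obtain a time-domain envelope parameter and a frequency-domain envelope parameter that characterize the audio signal; or to receive information indicating that the spectral signal of the current band is not similar to that of the previous band, and to decode the received data to obtain a time-domain envelope parameter, a frequency-domain envelope parameter, a tonality parameter, and a harmonic interval parameter that characterize the audio signal;
a decoding unit, configured to synthesize the audio signal according to the similarity information received by the information receiving unit together with the time-domain envelope parameter and frequency-domain envelope parameter; or according to the dissimilarity information together with the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter.
31. The apparatus according to claim 30, characterized in that, when information is received that the spectral signal of the current band is similar to that of the previous band, the decoding unit comprises:
a reconstruction subunit, configured to reconstruct a spectral signal;
a first shaping subunit, configured to process the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
32. The apparatus according to claim 31, characterized in that the information receiving unit is further configured to decode received data containing a tonality parameter to obtain the tonality parameter that characterizes the audio signal;
the decoding unit further comprises:
a second shaping subunit, configured to shape the spectral signal reconstructed by the reconstruction subunit according to the tonality parameter to obtain a shaped reconstructed spectral signal;
the first shaping subunit is further configured to process the reconstructed spectral signal shaped by the second shaping subunit according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.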
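33. The apparatus according to claim 30, characterized in that, when information is received that the spectral signal of the current band is not similar to that of the previous band, the information receiving unit, in addition to decoding the received data to obtain the time-domain envelope parameter, frequency-domain envelope parameter, tonality parameter, and harmonic interval parameter that characterize the audio signal, also receives data containing a first harmonic offset parameter and decodes it to obtain the first harmonic offset parameter that characterizes the audio signal.
34. The apparatus according to claim 30, characterized in that, when information is received that the spectral signal of the current band is not similar to that of the previous band, the decoding unit comprises:
a harmonic reconstruction subunit, configured to obtain a harmonic signal from the harmonic interval parameter;
a spectral signal reconstruction subunit, configured to adjust the ratio between the harmonic signal and a noise signal according to the tonality parameter, and to obtain a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
a third shaping subunit, configured to process the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.
35. The apparatus according to claim 33, characterized in that the decoding unit comprises:
a harmonic reconstruction subunit, configured to obtain a harmonic signal from the harmonic interval parameter and the first harmonic offset parameter;
a spectral signal reconstruction subunit, configured to adjust the ratio between the harmonic signal and a noise signal according to the tonality parameter, and to obtain a reconstructed spectral signal from the adjusted harmonic signal and noise signal;
a third shaping subunit, configured to process the reconstructed spectral signal according to the frequency-domain envelope parameter and time-domain envelope parameter to obtain the synthesized audio signal.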
PCT/CN2009/073559 2008-08-28 2009-08-27 Audio encoding and decoding method, apparatus, and system WO2010022661A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810119170.6 2008-08-28
CN2008101191706A CN101662288B (zh) 2008-08-28 2008-08-28 Audio encoding and decoding method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2010022661A1 true WO2010022661A1 (zh) 2010-03-04

Family

ID=41720840

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/073559 WO2010022661A1 (zh) 2008-08-28 2009-08-27 Audio encoding and decoding method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN101662288B (zh)
WO (1) WO2010022661A1 (zh)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5316896B2 (ja) * 2010-03-17 2013-10-16 ソニー株式会社 Encoding device and encoding method, decoding device and decoding method, and program
CN103516440B (zh) 2012-06-29 2015-07-08 华为技术有限公司 Speech/audio signal processing method and encoding apparatus
CN104243734B (zh) * 2013-06-18 2019-03-01 深圳市共进电子股份有限公司 Audio processing system and method
EP2916319A1 (en) * 2014-03-07 2015-09-09 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Concept for encoding of information
US9838700B2 (en) * 2014-11-27 2017-12-05 Nippon Telegraph And Telephone Corporation Encoding apparatus, decoding apparatus, and method and program for the same
EP3262639B1 (en) * 2015-02-26 2020-10-07 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for processing an audio signal to obtain a processed audio signal using a target time-domain envelope
US9768793B2 (en) * 2015-12-17 2017-09-19 Analog Devices Global Adaptive digital quantization noise cancellation filters for mash ADCs
CN106920559B (zh) * 2017-03-02 2020-10-30 奇酷互联网络科技(深圳)有限公司 Method and apparatus for optimizing call audio, and call terminal
CN113192521A (zh) * 2020-01-13 2021-07-30 华为技术有限公司 Audio encoding/decoding method and audio encoding/decoding device
CN113593586A (zh) * 2020-04-15 2021-11-02 华为技术有限公司 Audio signal encoding method, decoding method, encoding device, and decoding device
CN113963703A (zh) * 2020-07-03 2022-01-21 华为技术有限公司 Audio encoding method and codec device
CN113948094A (zh) * 2020-07-16 2022-01-18 华为技术有限公司 Audio encoding/decoding method, related apparatus, and computer-readable storage medium
CN113821934B (zh) * 2021-09-30 2024-01-19 国网青海省电力公司电力科学研究院 Method, apparatus, device, and storage medium for predicting operating-condition parameters
CN114550732B (zh) * 2022-04-15 2022-07-08 腾讯科技(深圳)有限公司 Encoding/decoding method for high-frequency audio signals and related apparatus
CN114566174B (zh) * 2022-04-24 2022-07-19 北京百瑞互联技术有限公司 Method, apparatus, system, medium, and device for optimizing speech coding

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
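CN1849648A (zh) * 2003-09-16 2006-10-18 松下电器产业株式会社 Encoding apparatus and decoding apparatus
CN101197577A (zh) * 2006-12-07 2008-06-11 展讯通信(上海)有限公司 Encoding and decoding method for use in an audio processing framework
CN101241736A (zh) * 2007-02-07 2008-08-13 三星电子株式会社 Method and apparatus for decoding a parametrically coded audio signal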

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
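JP3259759B2 (ja) * 1996-07-22 2002-02-25 日本電気株式会社 Speech signal transmission method and speech encoding/decoding system
JP4125322B2 (ja) * 2001-09-28 2008-07-30 日本電信電話株式会社 Fundamental frequency extraction apparatus, method, program, and recording medium storing the program
JP2004109809A (ja) * 2002-09-20 2004-04-08 Nippon Telegr & Teleph Corp <Ntt> Speech analysis-synthesis method and apparatus, speech analysis-synthesis program, and recording medium storing the program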


Also Published As

Publication number Publication date
CN101662288B (zh) 2012-07-04
CN101662288A (zh) 2010-03-03


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09809246

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09809246

Country of ref document: EP

Kind code of ref document: A1