WO2009115038A1 - Method and apparatus for generating a background noise excitation signal - Google Patents

Method and apparatus for generating a background noise excitation signal Download PDF

Info

Publication number
WO2009115038A1
Authority
WO
WIPO (PCT)
Prior art keywords
excitation signal
background noise
generating
frame
quasi
Prior art date
Application number
PCT/CN2009/070854
Other languages
English (en)
French (fr)
Inventor
代金良
张立斌
舒默特·艾雅
汪林
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to EP09722292A priority Critical patent/EP2261895B1/en
Priority to MX2010010226A priority patent/MX2010010226A/es
Publication of WO2009115038A1 publication Critical patent/WO2009115038A1/zh
Priority to US12/887,066 priority patent/US8370154B2/en

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/012 - Comfort noise or silence coding
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 - Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters

Definitions

  • the present invention relates to the field of communications, and in particular, to a method and apparatus for generating a background noise excitation signal.
  • the processing of speech is mainly performed by a speech codec. Since the speech signal is short-term stationary, the speech codec generally processes the speech signal frame by frame, each frame being 10 to 30 ms.
  • early speech codecs were all fixed-rate, that is, each speech codec had only one fixed coding rate; for example, the coding rate of the speech codec G.729 is 8 kbit/s, and the rate of G.728 is 16 kbit/s.
  • among these conventional fixed-rate speech codecs, in general, a codec with a higher coding rate easily guarantees coding quality but occupies more communication channel resources, while a codec with a lower coding rate occupies fewer channel resources but cannot easily guarantee coding quality.
  • the speech signal contains both the voiced signal generated when a person speaks and the silent signal (background noise) generated in the gaps between utterances.
  • the coding rate of the voiced signal is referred to as the speech (the speech at this time refers to the signal of the human voice) encoding rate, and the encoding rate of the background noise is called the noise encoding rate.
  • the background noise signal is encoded with a lower coding rate to effectively reduce the communication bandwidth; and the voiced signal generated by the human voice is encoded at a higher rate to ensure communication quality.
  • one existing method generates a G.729B background noise excitation signal by adding a Discontinuous Transmission (DTX) / Comfort Noise Generation (CNG) system to the G.729 prototype.
  • the CNG system, which is a background noise processing system, processes signals sampled at 8 kHz (narrowband) with a signal processing frame length of 10 ms.
  • the CNG algorithm uses a level-controllable pseudo white noise to excite an interpolated Linear Predictive Coding (LPC) synthesis filter to obtain comfortable background noise, where the excitation signal level and the LPC filter coefficients are obtained from the previous Silence Insertion Descriptor (SID) frame.
  • the excitation signal is a pseudo white noise excitation ex(n)
  • ex(n) is a mixture of a speech excitation ex1(n) and a Gaussian white noise excitation ex2(n).
  • the gain of ex1(n) is small, and the purpose of ex1(n) is to make the transition between speech and non-speech (such as noise) more natural.
  • the resulting ex(n) can then be used to excite the synthesis filter to obtain comfortable background noise.
  • the target excitation gain is defined as the square root of the average energy of the current frame excitation and is obtained by a smoothing algorithm involving the gain of the decoded SID frame.
  • the 80 sample points are divided into two sub-frames.
  • the excitation signal of the CNG module is synthesized in the following way:
  • ex(n) = α·ex1(n) + β·ex2(n)
  • the above is the excitation signal generation principle for the background noise of the CNG module of the G.729B codec. It can be seen from this process that, although some speech excitation ex1(n) is added when the G.729B background noise excitation signal is generated, the speech excitation ex1(n) is such only in form; its actual content, such as the adaptive codebook delay and the positions and signs of the fixed codebook pulses, is generated randomly and is highly random. The correlation between the background noise excitation signal and the excitation signal of the preceding speech frames is therefore very poor, which makes the transition from the synthesized speech signal to the synthesized background noise signal unnatural and uncomfortable to the human ear.
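  • the mixing step above can be pictured with a small Python sketch; this is a minimal illustration rather than the G.729B reference algorithm, and the energy constraint, the target_energy parameter, the fallback rule and the root-selection rule are assumptions made for the example:

      import numpy as np

      def mix_cng_excitation(ex1, ex2, alpha=0.6, target_energy=None):
          # Mix a speech-like excitation ex1 with Gaussian noise ex2:
          # ex(n) = alpha*ex1(n) + beta*ex2(n), with alpha fixed (0.6 here)
          # and beta chosen so the mixture meets an assumed energy target.
          ex1 = np.asarray(ex1, dtype=float)
          ex2 = np.asarray(ex2, dtype=float)
          if target_energy is None:
              target_energy = np.dot(ex1, ex1)      # assumed target: energy of ex1
          E1 = np.dot(ex1, ex1)                     # energy of ex1
          E2 = np.dot(ex2, ex2)                     # energy of ex2
          E3 = np.dot(ex1, ex2)                     # dot product of ex1 and ex2
          # beta^2*E2 + 2*alpha*beta*E3 + (alpha^2*E1 - target_energy) = 0
          a, b, c = E2, 2.0 * alpha * E3, alpha * alpha * E1 - target_energy
          disc = b * b - 4.0 * a * c
          if a <= 0.0 or disc < 0.0:
              return ex1.copy()                     # no usable root: alpha = 1, beta = 0
          roots = ((-b + np.sqrt(disc)) / (2 * a), (-b - np.sqrt(disc)) / (2 * a))
          beta = min(roots, key=abs)                # keep the smaller-magnitude root
          return alpha * ex1 + beta * ex2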
  • Embodiments of the present invention provide a method and apparatus for generating a background noise excitation signal, so that the transition of a signal frame from speech to background noise is more natural, smooth, and continuous.
  • an embodiment of the present invention provides a method for generating a background noise excitation signal, where the method includes:
  • the quasi-excitation signal is generated by using the coding parameters of the speech codec stage and the transition length of the excitation signal; and the quasi-excitation signal is weighted with the random excitation signal of the background noise coding frame to obtain an excitation signal of the background noise during the transition phase.
  • the embodiment of the invention further provides a device for generating a background noise excitation signal, the device comprising:
  • a quasi-excitation signal generating unit configured to generate a quasi-excitation signal by using a coding parameter of the speech coding and decoding stage and a transition length of the excitation signal
  • the transition phase excitation signal obtaining unit is configured to weight the quasi-excitation signal generated by the quasi-excitation signal generating unit and the random excitation signal of the background noise encoding frame to obtain an excitation signal of the background noise in the transition phase.
  • the generated quasi-excitation signal and the random excitation signal of the background noise are weighted, and the excitation signal of the background noise in the transition phase is obtained.
  • the excitation signal of the transition phase replaces the random excitation signal to synthesize the background noise. Since the information of both excitation signals is included in the transition phase, this method of synthesizing comfortable background noise makes the transition of the synthesized signal from speech to background noise more natural, smooth and continuous, and the human ear feels more comfortable.
  • FIG. 1 is a flowchart of a method for generating background noise excitation according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a device for generating a background noise excitation according to an embodiment of the present invention.
  • the excitation signal of the background noise is generated as follows: in the transition phase in which the signal frame changes from a speech coded frame to a background noise frame, the excitation signal and pitch delay of the speech coded frame and the random excitation signal of the background noise coded frame are used.
  • the quasi-excitation signal to be weighted is generated using the excitation signal of the previous speech coded frame and the pitch delay of the last subframe; the quasi-excitation signal and the random background noise excitation signal are then weighted point by point (for example with increasing or decreasing weights, though not limited to this method) to obtain the excitation signal of the background noise in the transition phase. The specific implementation process is detailed in the following drawings and embodiments.
  • FIG. 1 is a flowchart of a method for generating a background noise excitation according to an embodiment of the present invention, the method includes:
  • Step 101 Generate a quasi-excitation signal by using a coding parameter of a speech codec stage and a transition length of the excitation signal;
  • Step 102 Weighting the quasi-excitation signal and the random excitation signal of the background noise coded frame to obtain an excitation signal of the background noise in the transition phase.
  • the method further comprises: setting a transition length N of the excitation signal when the signal frame is converted from the voice coded frame to the background noise coded frame; or
  • the speech codec pre-stores encoding parameters of the speech encoded frame, the encoding parameters including an excitation signal and a pitch delay, which is also called an adaptive codebook delay.
  • the encoding parameters of the received speech coded frames are first saved, and the encoding parameters include: an excitation signal and a pitch delay.
  • the excitation signal is stored in the excitation signal memory old_exc(i) in real time, where i ∈ [0, T-1] and T is the maximum value of the pitch delay Pitch set by the speech codec.
  • if T exceeds the frame length, the excitation signal memory old_exc(i) will save the excitation signal of the last few frames. For example, if T is the length of two frames, old_exc(i) will save the excitation signals of the last two frames; that is, the size of the excitation signal memory old_exc(i) is determined by T.
  • the excitation signal memory old_exc(i) and the pitch delay Pitch are updated in real time, once per frame; since each frame contains multiple subframes, Pitch is actually the pitch delay of the last subframe.
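  • a minimal sketch of such an excitation signal memory, assuming a simple shift buffer whose size equals the maximum pitch delay T; the class and method names are illustrative only:

      import numpy as np

      class ExcitationMemory:
          # Keeps the most recent T excitation samples (old_exc) plus the
          # pitch delay of the last subframe, updated once per speech frame.
          def __init__(self, max_pitch_delay):
              self.T = max_pitch_delay          # e.g. 143 for G.729B or AMR
              self.old_exc = np.zeros(self.T)
              self.pitch = 1                    # pitch delay of the last subframe

          def update(self, frame_exc, last_subframe_pitch):
              # Shift in the excitation of the current frame and remember the
              # pitch delay of its last subframe.
              frame_exc = np.asarray(frame_exc, dtype=float)
              keep = max(self.T - len(frame_exc), 0)
              tail = self.old_exc[-keep:] if keep else np.empty(0)
              self.old_exc = np.concatenate([tail, frame_exc[-self.T:]])
              self.pitch = int(last_subframe_pitch)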
  • the transition length N of the excitation signal transition is set when the signal frame is converted from the speech coded frame to the background noise coded frame.
  • the value of the transition length N is set according to actual needs.
  • in the embodiment of the present invention, N is set to 160 as an example, but is not limited thereto.
  • step 101 is executed to generate a quasi-excitation signal pre_exc(n) by using the coding parameters of the speech coding and decoding stage and the transition length of the excitation signal, the formula being:
  • pre_exc(n) = old_exc(T - Pitch + n % Pitch)
  • where n is the data sample of the signal frame, n ∈ [0, N-1], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, Pitch is the pitch delay of the last subframe in the previous superframe, and N is the transition length of the excitation signal.
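  • a minimal sketch of the formula above, assuming old_exc holds the last T excitation samples with the most recent sample at index T - 1 (an assumption about buffer orientation that the text does not spell out):

      import numpy as np

      def quasi_excitation(old_exc, pitch, N):
          # pre_exc(n) = old_exc(T - Pitch + n % Pitch), n = 0..N-1:
          # the last pitch period of the stored speech excitation is repeated
          # over the transition length N, so the quasi-excitation stays
          # correlated with the preceding speech frames.
          old_exc = np.asarray(old_exc, dtype=float)
          T = len(old_exc)                      # maximum pitch delay, e.g. 143
          n = np.arange(N)
          return old_exc[T - pitch + (n % pitch)]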
  • in step 102, the quasi-excitation signal is weighted with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) of the transition-phase background noise: cur_exc(n) = α(n)·pre_exc(n) + β(n)·random_exc(n), where random_exc(n) is the randomly generated excitation signal of the background noise coded frame.
  • n is the sample point of the signal frame, and α(n) and β(n) are the weighting factors of the quasi-excitation signal and the random excitation signal.
  • α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1.
  • the weighting factors are calculated as α(n) = 1 - n/N and β(n) = n/N, where n ∈ [0, N-1].
  • the value of N is preferably 160.
  • the weighted sum in this embodiment is a point-by-point weighted sum as an example, but is not limited thereto; other weighting methods, such as an even-point weighted sum or an odd-point weighted sum, may also be used, and their implementation is similar to the point-by-point weighting and is not described here.
  • after the transition-phase excitation signal cur_exc(n) is obtained, the method may further include exciting the LPC synthesis filter with cur_exc(n) to obtain the final background noise signal.
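  • a compact sketch of step 102, the point-by-point weighted sum with α(n) = 1 - n/N and β(n) = n/N; quasi_excitation is the helper sketched above, and random_exc stands for whatever random excitation the noise coder produces:

      import numpy as np

      def transition_excitation(pre_exc, random_exc):
          # cur_exc(n) = alpha(n)*pre_exc(n) + beta(n)*random_exc(n), with
          # alpha(n) = 1 - n/N and beta(n) = n/N, so the excitation fades from
          # the speech-derived signal to the random noise excitation.
          pre_exc = np.asarray(pre_exc, dtype=float)
          random_exc = np.asarray(random_exc, dtype=float)
          N = len(pre_exc)                      # transition length, e.g. 160
          beta = np.arange(N) / N
          alpha = 1.0 - beta
          return alpha * pre_exc + beta * random_exc

      # Example (illustrative values): a 160-sample transition from the stored
      # speech excitation to a random CNG excitation.
      # cur_exc = transition_excitation(quasi_excitation(old_exc, pitch=57, N=160),
      #                                 np.random.randn(160))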
  • Embodiment 1 is an implementation process of the invention applied in the G.729B CNG. It should be noted that in G.729B the maximum value of the pitch delay T is 143. The specific process is:
  • the speech codec receives each speech encoded frame and saves the encoding parameters of the speech encoded frame, the encoding parameters including the excitation signal and the pitch delay Pitch of the last subframe.
  • the excitation signal can be saved in the excitation signal memory old_exc(i) in real time, where i ∈ [0, 142]; because the frame length of G.729B is 80 samples, old_exc(i) buffers the excitation signals of the last two frames.
  • of course, depending on the actual situation, old_exc(i) may instead buffer the last frame, multiple frames, or less than one frame.
  • let the excitation signal of the transition phase be cur_exc(n); the quasi-excitation signal is weighted with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) in the transition phase, expressed as:
  • cur_exc(n) = α(n)·pre_exc(n) + β(n)·ex(n)
  • where ex(n) is the pseudo white noise excitation, i.e., the excitation signal of the G.729B CNG, which is a mixture of the speech excitation ex1(n) and the Gaussian white noise excitation ex2(n).
  • the second embodiment is an implementation process of an embodiment of the present invention applied in the CNG of the Adaptive Multi-Rate (AMR) codec. It should be noted that in AMR the maximum value of the pitch delay T is 143. The specific implementation process is:
  • the speech codec receives each speech coded frame and saves the coding parameters of the speech coded frame, including the excitation signal and the pitch delay of the last sub-frame.
  • the excitation signal is stored in the excitation signal memory old_exc(i) in real time, where i ∈ [0, 142]; since the frame length of the AMR is 160 samples, only the excitation signal of the most recent frame is buffered in old_exc(i).
  • of course, depending on the actual situation, old_exc(i) may instead buffer the last frame, multiple frames, or less than one frame.
  • pre_exc(n) = old_exc(T - Pitch + n % Pitch)
  • where n is the data sample of the signal frame, n ∈ [0, 159]
  • n % Pitch denotes the remainder of n divided by Pitch
  • T is the maximum value of the pitch delay, and Pitch is the pitch delay of the last subframe in the previous superframe.
  • the transition-phase excitation signal is then cur_exc(n) = α(n)·pre_exc(n) + β(n)·ex(n)
  • where ex(n) is the fixed codebook excitation (with its final gain) of the AMR CNG: a gain-controllable random noise excites an interpolated LPC synthesis filter to obtain comfortable background noise; that is, for each sub-frame, the positions and signs of the non-zero pulses in the fixed codebook excitation are generated using uniformly distributed pseudo-random numbers, and the values of the excitation pulses are +1 and -1.
  • the process of generating the fixed codebook excitation is a known technique, and details are not described herein.
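  • a minimal sketch of such a randomly generated fixed codebook excitation; the subframe length and the number of pulses are illustrative parameters rather than the values mandated by the AMR specification:

      import numpy as np

      def random_fixed_codebook_excitation(subframe_len=40, num_pulses=4, rng=None):
          # Sparse excitation whose non-zero pulse positions and signs are drawn
          # from uniformly distributed pseudo-random numbers, pulse values +1/-1.
          rng = np.random.default_rng() if rng is None else rng
          exc = np.zeros(subframe_len)
          positions = rng.choice(subframe_len, size=num_pulses, replace=False)
          signs = rng.choice([-1.0, 1.0], size=num_pulses)
          exc[positions] = signs
          return exc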
  • the final background noise signal can be obtained by exciting the LPC synthesis filter with the excitation signal cur_exc(n) of the transition phase.
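  • exciting the LPC synthesis filter 1/A(z) can be sketched with scipy as below; lpc_coeffs is assumed to be the interpolated predictor polynomial [1, a1, ..., ap] obtained from the decoded parameters, and gain handling is omitted:

      import numpy as np
      from scipy.signal import lfilter

      def synthesize_background_noise(cur_exc, lpc_coeffs, filter_state=None):
          # Excite the all-pole LPC synthesis filter 1/A(z) with cur_exc(n) to
          # obtain the synthesized background noise for the transition phase.
          a = np.asarray(lpc_coeffs, dtype=float)   # [1, a1, ..., ap]
          if filter_state is None:
              filter_state = np.zeros(len(a) - 1)
          noise, filter_state = lfilter([1.0], a, cur_exc, zi=filter_state)
          return noise, filter_state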
  • thus, when the signal frame transitions from speech to background noise, introducing the quasi-excitation signal yields the excitation signal of the transition phase, which makes the conversion from speech to background noise more natural and continuous, and the human ear feels more comfortable.
  • the third embodiment is an implementation process of the invention applied in G.729.1 CNG.
  • G.729.1 is a speech encoder recently published by the International Telecommunication Union (ITU). It is a wideband speech coder that handles a speech signal bandwidth of 50 to 7000 Hz. In the specific processing, the input signal is divided into a high frequency band (4000 to 7000 Hz) and a low frequency band (50 to 4000 Hz), which are processed separately. The low frequency band uses the CELP model.
  • the CELP model is the basic model of speech processing, and encoders such as G.729 and AMR are both based on this model.
  • the basic signal processing frame length of G.729.1 is 20 ms, which is called a superframe. Each superframe has 320 signal samples, and after band division each frequency band of the superframe has 160 sample points.
  • G.729.1 also defines a CNG system for processing noise, which is likewise divided into a high frequency band and a low frequency band that are processed separately.
  • the low frequency band also uses the code-excited linear prediction (CELP) model.
  • the embodiment of the present invention can be used in the low-band processing flow in the G.729.1 CNG system.
  • the implementation process of the embodiment of the present invention applied in the G.729.1 CNG module is:
  • the speech codec receives each speech coding superframe and saves the coding parameters of the speech coding superframe, including the excitation signal and the pitch delay Pitch of the last subframe; the excitation signal can be saved in real time in the excitation signal memory old_exc(i), where i ∈ [0, 142], because the maximum value of the pitch delay T is 143.
  • pre_exc(n) = old_exc(T - Pitch + n % Pitch), where n is the data sample of the signal frame, n ∈ [0, 159], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, and Pitch is the pitch delay of the last subframe in the previous superframe.
  • cur_exc(n) = α(n)·pre_exc(n) + β(n)·ex(n)
  • where n ∈ [0, 159] and cur_exc(n) is the excitation signal of the currently calculated background noise; α(n) and β(n) are the weighting factors of the two excitation signals.
  • α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1, with α(n) = 1 - n/160 and β(n) = n/160, respectively.
  • the final background noise signal can be obtained by exciting the LPC synthesis filter with the excitation signal of the transition phase. It can be seen that in G.729.1, when the signal frame is converted from speech to background noise, introducing the quasi-excitation signal and obtaining the excitation signal of the transition phase makes the conversion from speech to background noise more natural and continuous, and the human ear feels more comfortable.
  • the embodiment of the present invention further provides a device for generating background noise excitation, which is shown in FIG. 2.
  • the device includes a quasi-excitation signal generating unit 22 and a transition phase excitation signal obtaining unit 23.
  • the setting unit 21 may also be included, wherein
  • the setting unit 21 is configured to set a transition length of the excitation signal when the signal frame is converted from the voice coded frame to the background noise frame.
  • the quasi-excitation signal generating unit 22 is configured to generate the quasi-excitation signal pre_exc(n) according to the transition length N set by the setting unit 21; the formula of the quasi-excitation signal pre_exc(n) is pre_exc(n) = old_exc(T - Pitch + n % Pitch).
  • the transition stage excitation signal obtaining unit 23 is configured to weight the quasi-excitation signal generated by the quasi-excitation signal generating unit 22 and the random excitation signal of the background noise coded frame to obtain an excitation signal cur _exc(n) of the transition phase background noise.
  • the formula of the excitation signal cur_exc(n) of the background noise of the transition phase is: cur_exc(n) = α(n)·pre_exc(n) + β(n)·random_exc(n)
  • random_exc(n) is a randomly generated excitation signal, and α(n) and β(n) are the weighting factors of the two excitation signals.
  • α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1; α(n) and β(n) are respectively expressed as α(n) = 1 - n/160 and β(n) = n/160.
  • the apparatus may further include an excitation unit 24 for exciting the synthesis filter to obtain a background noise signal by using the excitation signal obtained by the transition stage excitation signal obtaining unit 23.
  • the storage unit is configured to pre-save the coding parameters of the voice coded frame, where the coding parameters include an excitation signal and a pitch delay.
  • the generating device for the background noise excitation may be integrated at the encoding end or the decoding end, or may exist independently; for example, it may be integrated in the discontinuous transmission system (DTX) of the encoding end, or in the comfort noise generation system (CNG) of the decoding end.
  • for the various units in the device, refer to the implementation process of the corresponding steps in the foregoing method; details are not described herein again.
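  • the unit structure of FIG. 2 maps naturally onto a small class; the sketch below simply wires together the helpers sketched earlier (ExcitationMemory, quasi_excitation, transition_excitation and synthesize_background_noise, assumed to be defined in the same module) and illustrates the architecture rather than any product implementation:

      class BackgroundNoiseExcitationGenerator:
          # Setting unit (21), quasi-excitation generating unit (22),
          # transition-phase obtaining unit (23) and excitation unit (24),
          # plus a storage unit for the speech-frame coding parameters.
          def __init__(self, transition_length=160, max_pitch_delay=143):
              self.N = transition_length                       # setting unit
              self.memory = ExcitationMemory(max_pitch_delay)  # storage unit

          def on_speech_frame(self, frame_exc, last_subframe_pitch):
              self.memory.update(frame_exc, last_subframe_pitch)

          def on_transition_to_noise(self, random_exc, lpc_coeffs):
              # random_exc is expected to hold N samples of the noise coder's
              # random excitation for the transition phase.
              pre = quasi_excitation(self.memory.old_exc, self.memory.pitch, self.N)
              cur = transition_excitation(pre, random_exc)     # unit 23
              noise, _ = synthesize_background_noise(cur, lpc_coeffs)  # unit 24
              return noise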
  • the excitation signal of the transition phase replaces the random excitation signal to synthesize the background noise. Since the information of both excitation signals is included in the transition phase, this method of synthesizing comfortable background noise makes the transition of the synthesized signal from speech to background noise more natural, smooth and continuous, and the human ear feels more comfortable. It will be understood by those skilled in the art that all or part of the steps of the foregoing embodiments may be performed by a program instructing related hardware, and the program may be stored in a computer readable storage medium.
  • the method includes the following steps: generating a quasi-excitation signal by using a coding parameter of a speech codec stage and a transition length of the excitation signal; and weighting the quasi-excitation signal with a random excitation signal of the background noise coding frame to obtain an excitation signal in a transition phase.
  • the above mentioned storage medium may be a read only memory, a magnetic disk or an optical disk or the like.

Description

Method and apparatus for generating a background noise excitation signal
This application claims priority to Chinese Patent Application No. 200810084513.X, filed with the Chinese Patent Office on March 21, 2008 and entitled "Method and apparatus for generating a background noise excitation signal", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications, and in particular to a method and apparatus for generating a background noise excitation signal.
Background
In speech communication, speech is processed mainly by a speech codec. Because a speech signal is short-term stationary, a speech codec generally processes the speech signal frame by frame, each frame being 10 to 30 ms. Early speech codecs were all fixed-rate, that is, each speech codec had only one fixed coding rate; for example, the coding rate of the speech codec G.729 is 8 kbit/s, and the rate of G.728 is 16 kbit/s. On the whole, among these conventional fixed-rate speech codecs, a codec with a higher coding rate easily guarantees coding quality but occupies more communication channel resources, while a codec with a lower coding rate occupies fewer communication channel resources but cannot easily guarantee coding quality. A speech signal contains both the voiced signal produced when a person speaks and the silent signal (background noise) produced in the gaps between utterances. The coding rate for the voiced signal is called the speech coding rate (where "speech" refers specifically to the signal produced by a person speaking), and the coding rate for the background noise is called the noise coding rate. In speech communication, people care only about the useful voiced signal and do not wish to transmit the useless silent signal, so as to reduce the transmission bandwidth. However, if only the voiced signal is encoded and transmitted while the silent signal is not, the background noise becomes discontinuous; at the receiving end this makes the listener feel very uncomfortable, especially when the background noise is strong, and sometimes even makes the speech hard to understand. To address this, the silent signal also needs to be encoded and transmitted when no one is speaking, and silence compression was therefore introduced into speech codecs. With silence compression, the background noise signal is encoded at a lower coding rate to effectively reduce the communication bandwidth, while the voiced signal produced by a person speaking is encoded at a higher rate to guarantee communication quality.
At present, one method of generating a G.729B background noise excitation signal adds a Discontinuous Transmission (DTX) / Comfort Noise Generation (CNG) system, that is, a background noise processing system, to the G.729 prototype; the processed signal is sampled at 8 kHz (narrowband), and the signal processing frame length is 10 ms. The CNG algorithm excites an interpolated Linear Predictive Coding (LPC) synthesis filter with a level-controllable pseudo white noise to obtain comfortable background noise, where the excitation signal level and the LPC filter coefficients are obtained from the previous Silence Insertion Descriptor (SID) frame.
The excitation signal is a pseudo white noise excitation ex(n); ex(n) is a mixture of a speech excitation ex1(n) and a Gaussian white noise excitation ex2(n). The gain of ex1(n) is small, and the purpose of using ex1(n) is to make the transition between speech and non-speech (such as noise) more natural. The resulting pseudo white noise excitation ex(n) is then used to excite the synthesis filter to obtain comfortable background noise.
The excitation signal is generated as follows. First, a target excitation gain is defined as the square root of the average energy of the current-frame excitation; it is obtained by a smoothing algorithm involving the gain of the decoded SID frame (the smoothing equation is given as an image in the original).
The 80 sample points are divided into two sub-frames, and for each sub-frame the excitation signal of the CNG module is synthesized in the following way:
(1) a pitch delay is selected at random within the range [40, 103];
(2) the positions and signs of the non-zero pulses in the sub-frame's fixed codebook vector are selected at random (the structure of these pulse positions and signs is consistent with G.729);
(3) an adaptive codebook excitation with gain is selected and denoted ea(n), n = 0...39, and the selected fixed codebook excitation is denoted ef(n), n = 0...39. The adaptive gain Ga and the fixed codebook gain Gf are then computed on the basis of the sub-frame energy, so that the energy of Ga·ea(n) + Gf·ef(n) over the sub-frame matches the target excitation gain (the exact equations are given as images in the original). Note that Gf may take a negative value. If the adaptive codebook gain Ga is fixed, the equation becomes a quadratic equation in Gf. The value of Ga is bounded to ensure that this equation has a solution; further, the use of large adaptive codebook gain values can be restricted, so that Ga is selected at random within a bounded range (given as an image in the original), and the root of the quadratic equation with the smallest absolute value is taken as the value of Gf.
Finally, the G.729 excitation is constructed as:
e1(n) = Ga·ea(n) + Gf·ef(n), n = 0...39.
The mixed excitation ex(n) can then be synthesized as follows. Let E1 be the energy of e1(n), E2 the energy of ex2(n), and E3 the dot product of e1(n) and ex2(n):
E1 = Σn e1²(n), E2 = Σn ex2²(n), E3 = Σn e1(n)·ex2(n)
(the number of points over which these are computed exceeds the size of the excitation itself). Let α and β be the mixing coefficients of ex1(n) and ex2(n) in the mixed excitation, where α is set to 0.6 and β is determined from a quadratic equation of the form β²E2 + 2βE3 + (α² - 1)E1 = 0; if this equation has no solution, β is set to 0 and α to 1. The excitation of the CNG module finally becomes ex(n):
ex(n) = α·ex1(n) + β·ex2(n).
The above is the principle by which the CNG module of the G.729B codec generates the background noise excitation signal. It can be seen from this process that, although some speech excitation ex1(n) is added when the G.729B background noise excitation signal is generated, this speech excitation ex1(n) is such only in form; its actual content, such as the adaptive codebook delay and the positions and signs of the fixed codebook pulses, is generated randomly and is highly random. The correlation between the background noise excitation signal and the excitation signal of the preceding speech frames is therefore very poor, which makes the transition from the synthesized speech signal to the synthesized background noise signal very unnatural and uncomfortable to the human ear.
Summary of the Invention
Embodiments of the present invention provide a method and apparatus for generating a background noise excitation signal, so that when a signal frame changes from speech to background noise the transition is more natural, smooth and continuous.
To solve the above technical problem, an embodiment of the present invention provides a method for generating a background noise excitation signal, the method including:
generating a quasi-excitation signal by using coding parameters of the speech codec stage and a transition length of the excitation signal; and weighting the quasi-excitation signal with a random excitation signal of a background noise coded frame to obtain an excitation signal of the background noise in a transition phase.
Correspondingly, an embodiment of the present invention further provides an apparatus for generating a background noise excitation signal, the apparatus including:
a quasi-excitation signal generating unit, configured to generate a quasi-excitation signal by using coding parameters of the speech codec stage and a transition length of the excitation signal; and
a transition-phase excitation signal obtaining unit, configured to weight the quasi-excitation signal generated by the quasi-excitation signal generating unit with a random excitation signal of a background noise coded frame to obtain an excitation signal of the background noise in the transition phase.
In the embodiments of the present invention, when a signal frame changes from a speech coded frame to a background noise frame, the generated quasi-excitation signal and the random excitation signal of the background noise are weighted during the transition phase to obtain the excitation signal of the background noise in the transition phase, and this transition-phase excitation signal is used instead of the random excitation signal to synthesize the background noise. Because the information of both excitation signals is included during the transition phase, this method of synthesizing comfortable background noise makes the transition of the synthesized signal from speech to background noise more natural, smooth and continuous, and the human ear feels more comfortable.
Brief Description of the Drawings
FIG. 1 is a flowchart of a method for generating a background noise excitation according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for generating a background noise excitation according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
In the embodiments of the present invention, the excitation signal of the background noise is generated as follows: in the transition phase in which the signal frame changes from a speech coded frame to a background noise frame, the excitation signal and pitch delay of the speech coded frames and the random excitation signal of the background noise coded frame are used. That is, in the transition phase, the quasi-excitation signal to be weighted is generated from the excitation signal of the preceding speech coded frames and the pitch delay of the last sub-frame; the quasi-excitation signal and the random background noise excitation signal are then summed with point-by-point weights (for example increasing or decreasing weights, although the invention is not limited to this) to obtain the excitation signal of the background noise in the transition phase. The specific implementation is detailed in the following drawings and embodiments.
Referring to FIG. 1, which is a flowchart of a method for generating a background noise excitation according to an embodiment of the present invention, the method includes:
Step 101: generating a quasi-excitation signal by using the coding parameters of the speech codec stage and the transition length of the excitation signal;
Step 102: weighting the quasi-excitation signal with the random excitation signal of the background noise coded frame to obtain the excitation signal of the background noise in the transition phase.
Preferably, before step 101 the method further includes: setting the transition length N of the excitation signal when the signal frame changes from a speech coded frame to a background noise coded frame; or having the speech codec pre-store the coding parameters of the speech coded frames, the coding parameters including the excitation signal and the pitch delay, the pitch delay also being called the adaptive codebook delay.
That is, in the speech codec, the coding parameters of each received speech coded frame are stored first; the coding parameters include the excitation signal and the pitch delay. The excitation signal is stored in real time in the excitation signal memory old_exc(i), where i ∈ [0, T-1] and T is the maximum value of the pitch delay Pitch set by the speech codec. If T exceeds the frame length, old_exc(i) stores the excitation signals of the last several frames; for example, if T equals two frame lengths, old_exc(i) stores the excitation signals of the last two frames, that is, the size of old_exc(i) is determined by T. In addition, the excitation signal memory old_exc(i) and the pitch delay Pitch are updated in real time, once per frame; since each frame contains several sub-frames, Pitch is actually the pitch delay of the last sub-frame.
When the signal frame changes from a speech coded frame to a background noise coded frame, the transition length N of the excitation signal transition is set. In general, the value of N is set according to actual needs; in this embodiment of the present invention N is set to 160 as an example, but the invention is not limited thereto.
Step 101 is then executed: the quasi-excitation signal pre_exc(n) is generated by using the coding parameters of the speech codec stage and the transition length of the excitation signal, according to the formula:
pre_exc(n) = old_exc(T - Pitch + n % Pitch),
where n is a data sample of the signal frame, n ∈ [0, N-1], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, Pitch is the pitch delay of the last sub-frame of the previous superframe, and N is the transition length of the excitation signal.
In step 102, the quasi-excitation signal is weighted with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) of the background noise in the transition phase.
That is, if the excitation signal of the transition phase is denoted cur_exc(n), then cur_exc(n) is expressed as:
cur_exc(n) = α(n)·pre_exc(n) + β(n)·random_exc(n),
where random_exc(n) is the randomly generated excitation signal, n is a sample point of the signal frame, and α(n) and β(n) are the weighting factors of the quasi-excitation signal and the random excitation signal. α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1.
Preferably, the weighting factors are calculated as α(n) = 1 - n/N and β(n) = n/N, where n is a sample point of the signal frame, n ∈ [0, N-1], and N is the transition length of the excitation signal; in general, the value of N is preferably 160.
Of course, the weighted sum in this embodiment takes the point-by-point weighted sum as an example, but the invention is not limited thereto; other weighting methods, such as an even-point weighted sum or an odd-point weighted sum, may also be used, and their implementation is similar to point-by-point weighting and is not repeated here.
Preferably, after the excitation signal cur_exc(n) of the transition phase is obtained, the method may further include: exciting the LPC synthesis filter with the excitation signal cur_exc(n) of the transition phase to obtain the final background noise signal.
It can be seen from the above technical solution that, because the excitation signal of the speech coded frames is introduced in the transition phase, the embodiment of the present invention makes the conversion of the signal frame from speech to background noise more natural and continuous, and improves the listening comfort of the human ear.
For ease of understanding by those skilled in the art, specific embodiments are described below.
Embodiment 1 is an implementation of the invention in the G.729B CNG. It should be noted that in G.729B the maximum value of the pitch delay T is 143. The specific process is as follows:
(1) The speech codec receives each speech coded frame and stores the coding parameters of the speech coded frame, the coding parameters including the excitation signal and the pitch delay Pitch of the last sub-frame. The excitation signal may be stored in real time in the excitation signal memory old_exc(i), where i ∈ [0, 142]; since the frame length of G.729B is 80 samples, old_exc(i) buffers the excitation signals of the last two frames. Of course, depending on the actual situation, old_exc(i) may instead buffer the last frame, several frames, or less than one frame.
(2) When the signal frame changes from a speech coded frame to a background noise coded frame, the transition length N of the excitation signal transition is set, where N = 160; since each G.729B frame is 10 ms long and contains 80 data samples, the set transition length corresponds to two 10 ms frames.
(3) The quasi-excitation signal pre_exc(n) of the speech coded frame is generated from the excitation signal memory old_exc(i) according to the formula:
pre_exc(n) = old_exc(T - Pitch + n % Pitch),
where n is a data sample of the signal frame, n ∈ [0, 159], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, and Pitch is the pitch delay of the last sub-frame of the previous superframe.
(4) Let the excitation signal of the transition phase be cur_exc(n). The quasi-excitation signal is weighted with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) of the transition phase, expressed as:
cur_exc(n) = α(n)·pre_exc(n) + β(n)·ex(n),
where ex(n) is the pseudo white noise excitation, i.e., the excitation signal of the G.729B CNG, which is a mixture of the speech excitation ex1(n) and the Gaussian white noise excitation ex2(n); the gain of ex1(n) is small, and ex1(n) is used to make the transition between speech and non-speech more natural. The specific process of generating ex(n) is described in the Background and is not repeated here.
α(n) and β(n) are the weighting factors of the two excitation signals; α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1, with α(n) and β(n) respectively expressed as:
α(n) = 1 - n/160, β(n) = n/160.
(5) The final background noise signal is obtained by exciting the LPC synthesis filter with the excitation signal cur_exc(n) of the transition phase.
Therefore, in G.729B, after the above quasi-excitation signal is introduced during the transition phase in which the signal frame changes from speech to background noise, the embodiment of the present invention makes the conversion from speech to background noise more natural and continuous, and the human ear feels more comfortable.
Embodiment 2 is an implementation of an embodiment of the present invention in the CNG of the Adaptive Multi-Rate (AMR) codec. It should be noted that in AMR the maximum value of the pitch delay T is 143. The specific implementation process is as follows:
(1) The speech codec receives each speech coded frame and stores the coding parameters of the speech coded frame, including the excitation signal and the pitch delay Pitch of the last sub-frame. The excitation signal is stored in real time in the excitation signal memory old_exc(i), where i ∈ [0, 142]; since the AMR frame length is 160 samples, old_exc(i) buffers only the excitation signal of the most recent frame. Of course, depending on the actual situation, old_exc(i) may instead buffer the last frame, several frames, or less than one frame.
(2) When changing from a speech coded frame to a background noise coded frame, the transition length N of the excitation signal transition is set, where N = 160; since each AMR frame is 20 ms long and contains 160 data samples, the set transition length corresponds to one 20 ms frame.
(3) The quasi-excitation signal pre_exc(n) of the speech coded frame is generated from the excitation signal memory old_exc(i) according to the formula:
pre_exc(n) = old_exc(T - Pitch + n % Pitch),
where n is a data sample of the signal frame, n ∈ [0, 159], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, and Pitch is the pitch delay of the last sub-frame of the previous superframe.
(4) Let the excitation signal of the transition phase be cur_exc(n). The quasi-excitation signal is weighted with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) of the transition phase, expressed as:
cur_exc(n) = α(n)·pre_exc(n) + β(n)·ex(n),
where ex(n) is the fixed codebook excitation (with its final gain): a gain-controllable random noise is used to excite an interpolated LPC synthesis filter to obtain comfortable background noise. That is, for each sub-frame, the positions and signs of the non-zero pulses in the fixed codebook excitation are generated using uniformly distributed pseudo-random numbers, and the values of the excitation pulses are +1 and -1; for those skilled in the art, the generation of this fixed codebook excitation is a known technique and is not described here.
α(n) and β(n) are the weighting factors of the two excitation signals; α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1, specifically:
α(n) = 1 - n/160, β(n) = n/160.
(5) The final background noise signal is obtained by exciting the LPC synthesis filter with the excitation signal cur_exc(n) of the transition phase.
It can thus be seen that in this embodiment, as in G.729B, when the signal frame changes from speech to background noise in the CNG algorithm of AMR, introducing the quasi-excitation signal during the transition phase and obtaining the transition-phase excitation signal makes the conversion from speech to background noise more natural and continuous, and the human ear feels more comfortable.
Embodiment 3 is an implementation of the invention in the G.729.1 CNG.
G.729.1 is a speech coder recently published by the International Telecommunication Union (ITU). It is a wideband speech coder: the bandwidth of the speech signal it processes is 50 to 7000 Hz. In the specific processing, the input signal is divided into a high frequency band (4000 to 7000 Hz) and a low frequency band (50 to 4000 Hz), which are processed separately; the low frequency band uses the CELP model, which is the basic model of speech processing and is the model used by coders such as G.729 and AMR. The basic signal processing frame length of G.729.1 is 20 ms, called a superframe; each superframe contains 320 signal samples, and after band division each frequency band of a superframe contains 160 sample points. At the same time, G.729.1 also defines a CNG system for processing noise, which is likewise divided into a high frequency band and a low frequency band that are processed separately, and the low frequency band also uses the code-excited linear prediction (CELP) model. The embodiment of the present invention can be used in the low-band processing flow of the G.729.1 CNG system. The implementation of the embodiment of the present invention in the G.729.1 CNG module is as follows:
(1) The speech codec receives each speech coded superframe and stores the coding parameters of the speech coded superframe, including the excitation signal and the pitch delay Pitch of the last sub-frame. The excitation signal may be stored in real time in the excitation signal memory old_exc(i), where i ∈ [0, 142], because the maximum value of the pitch delay T is 143.
(2) When the signal frame changes from a speech coded superframe to a background noise coded superframe, the transition length N of the excitation signal transition is set, where N = 160, i.e., the transition phase is one superframe.
(3) The quasi-excitation signal pre_exc(n) of the speech coded frame is generated from old_exc(i) according to the formula:
pre_exc(n) = old_exc(T - Pitch + n % Pitch),
where n is a data sample of the signal frame, n ∈ [0, 159], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, and Pitch is the pitch delay of the last sub-frame of the previous superframe.
(4) Let the excitation signal of the transition phase be cur_exc(n). The quasi-excitation signal is weighted point by point with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) of the background noise in the transition phase, according to the formula:
cur_exc(n) = α(n)·pre_exc(n) + β(n)·ex(n),
where n ∈ [0, 159], cur_exc(n) is the excitation signal of the currently computed background noise, and α(n) and β(n) are the weighting factors of the two excitation signals; α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1, respectively expressed as:
α(n) = 1 - n/160, β(n) = n/160.
(5) The final background noise signal is obtained by exciting the LPC synthesis filter with the excitation signal of the transition phase.
It can thus be seen that in G.729.1, introducing the quasi-excitation signal during the transition phase in which the signal frame changes from speech to background noise and obtaining the transition-phase excitation signal makes the conversion from speech to background noise more natural and continuous, and the human ear feels more comfortable.
In addition, an embodiment of the present invention further provides an apparatus for generating a background noise excitation, whose schematic structure is shown in FIG. 2. The apparatus includes a quasi-excitation signal generating unit 22 and a transition-phase excitation signal obtaining unit 23. Preferably, a setting unit 21 may also be included, where:
the setting unit 21 is configured to set the transition length N of the excitation signal when the signal frame changes from a speech coded frame to a background noise frame;
the quasi-excitation signal generating unit 22 is configured to generate the quasi-excitation signal pre_exc(n) of the speech coded frame according to the transition length N set by the setting unit 21, the formula of the quasi-excitation signal pre_exc(n) being:
pre_exc(n) = old_exc(T - Pitch + n % Pitch),
where n is a data sample of the signal frame, n ∈ [0, N-1], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, and Pitch is the pitch delay of the last sub-frame of the previous superframe;
the transition-phase excitation signal obtaining unit 23 is configured to weight the quasi-excitation signal generated by the quasi-excitation signal generating unit 22 with the random excitation signal of the background noise coded frame to obtain the excitation signal cur_exc(n) of the background noise in the transition phase, the formula of the excitation signal cur_exc(n) of the background noise in the transition phase being:
cur_exc(n) = α(n)·pre_exc(n) + β(n)·random_exc(n),
where random_exc(n) is the randomly generated excitation signal, and α(n) and β(n) are the weighting factors of the two excitation signals; α(n) decreases as n increases, β(n) increases as n increases, and the sum of α(n) and β(n) is equal to 1, with α(n) and β(n) respectively expressed as:
α(n) = 1 - n/160, β(n) = n/160.
Preferably, the apparatus may further include an excitation unit 24, configured to excite the synthesis filter with the excitation signal obtained by the transition-phase excitation signal obtaining unit 23 to obtain the background noise signal.
Preferably, a storage unit is configured to pre-store the coding parameters of the speech coded frames, the coding parameters including the excitation signal and the pitch delay.
Preferably, the apparatus for generating the background noise excitation may be integrated at the encoding end or the decoding end, or may exist independently; for example, it may be integrated in the discontinuous transmission (DTX) system at the encoding end, or in the comfort noise generation (CNG) system at the decoding end.
For the functions and roles of the units in the apparatus, reference is made to the implementation of the corresponding steps in the above method, which is not repeated here.
When the signal frame changes from a speech coded frame to a background noise frame, the quasi-excitation signal generated from the speech coded frames is weighted with the random excitation signal of the background noise during the transition phase to obtain the excitation signal of the transition phase, and this transition-phase excitation signal is used instead of the random excitation signal to synthesize the background noise. Because the information of both excitation signals is included during the transition phase, this method of synthesizing comfortable background noise makes the transition of the synthesized signal from speech to background noise more natural, smooth and continuous, and the human ear feels more comfortable.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, includes the following steps: generating a quasi-excitation signal by using the coding parameters of the speech codec stage and the transition length of the excitation signal; and weighting the quasi-excitation signal with the random excitation signal of the background noise coded frame to obtain the excitation signal of the transition phase. The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also fall within the protection scope of the present invention.

Claims

Claims
1. A method for generating a background noise excitation signal, comprising:
generating a quasi-excitation signal by using coding parameters of a speech codec stage and a transition length of the excitation signal; and
weighting the quasi-excitation signal with a random excitation signal of a background noise coded frame to obtain an excitation signal of the background noise in a transition phase.
2. The method for generating a background noise excitation signal according to claim 1, further comprising: setting the transition length of the excitation signal when a signal frame changes from a speech coded frame to a background noise coded frame.
3. The method for generating a background noise excitation signal according to claim 2, further comprising:
pre-storing coding parameters of the speech coded frame, the coding parameters comprising an excitation signal and a pitch delay.
4. The method for generating a background noise excitation signal according to claim 3, wherein the excitation signal in the coding parameters is stored in real time in an excitation signal memory old_exc(i), where i ∈ [0, T-1] and T is the maximum value of the pitch delay set by the speech codec.
5. The method for generating a background noise excitation signal according to claim 4, wherein the size of the excitation signal memory old_exc(i) is determined by T.
6. The method for generating a background noise excitation signal according to claim 3, wherein the process of generating the quasi-excitation signal comprises:
generating the quasi-excitation signal of the speech coded frame by using the excitation signal comprised in the coding parameters, the pitch delay of the last sub-frame, and the transition length of the excitation signal.
7. The method for generating a background noise excitation signal according to claim 6, wherein the quasi-excitation signal of the speech coded frame is generated according to the formula:
pre_exc(n) = old_exc(T - Pitch + n % Pitch),
where n is a data sample of the signal frame, n ∈ [0, N-1], n % Pitch denotes the remainder of n divided by Pitch, T is the maximum value of the pitch delay, Pitch is the pitch delay of the last sub-frame of the previous superframe, and N is the transition length of the excitation signal.
8. The method for generating a background noise excitation signal according to claim 1 or 2, wherein
the excitation signal of the background noise in the transition phase is obtained by weighting the quasi-excitation signal with the random excitation signal of the background noise coded frame according to the formula:
cur_exc(n) = α(n)·pre_exc(n) + β(n)·random_exc(n),
where cur_exc(n) is the excitation signal of the background noise in the transition phase, random_exc(n) is the excitation signal randomly generated for the background noise coded frame, α(n) and β(n) are the weighting factors of the quasi-excitation signal and the random excitation signal respectively, and n is a sample point of the signal frame.
9. The method for generating a background noise excitation signal according to claim 8, wherein α(n) decreases as the value of n increases, β(n) increases as the value of n increases, and the sum of α(n) and β(n) is equal to 1.
10. The method for generating a background noise excitation signal according to claim 9, wherein the weighting factor α(n) is calculated as α(n) = 1 - n/N, and the weighting factor β(n) is calculated as β(n) = n/N,
where n is a sample point of the signal frame, n ∈ [0, N-1], and N is the transition length of the excitation signal.
11. The method for generating a background noise excitation signal according to any one of claims 1 to 7, further comprising:
exciting a synthesis filter with the excitation signal of the background noise in the transition phase to obtain a background noise signal.
12. An apparatus for generating a background noise excitation signal, comprising:
a quasi-excitation signal generating unit, configured to generate a quasi-excitation signal by using coding parameters of a speech codec stage and a transition length of the excitation signal; and
a transition-phase excitation signal obtaining unit, configured to weight the quasi-excitation signal generated by the quasi-excitation signal generating unit with a random excitation signal of a background noise coded frame to obtain an excitation signal of the background noise in a transition phase.
13. The apparatus for generating a background noise excitation signal according to claim 12, further comprising: a setting unit, configured to set the transition length of the excitation signal when a signal frame changes from a speech coded frame to a background noise coded frame.
14. The apparatus for generating a background noise excitation signal according to claim 13, further comprising:
an excitation unit, configured to excite a synthesis filter with the excitation signal obtained by the transition-phase excitation signal obtaining unit to obtain a background noise signal.
15. The apparatus for generating a background noise excitation signal according to claim 14, further comprising:
a storage unit, configured to pre-store coding parameters of the speech coded frame, the coding parameters comprising an excitation signal and a pitch delay.
16. The apparatus for generating a background noise excitation signal according to any one of claims 12 to 15, wherein the apparatus for generating the background noise excitation is integrated at an encoding end or a decoding end, or exists independently.
PCT/CN2009/070854 2008-03-21 2009-03-18 一种背景噪声激励信号的生成方法及装置 WO2009115038A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP09722292A EP2261895B1 (en) 2008-03-21 2009-03-18 A generating method and device of background noise excitation signal
MX2010010226A MX2010010226A (es) 2008-03-21 2009-03-18 Un metodo y dispositivo de generacion de señal de excitacion de ruido de fondo.
US12/887,066 US8370154B2 (en) 2008-03-21 2010-09-21 Method and apparatus for generating an excitation signal for background noise

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810084513.X 2008-03-21
CN200810084513A CN101339767B (zh) 2008-03-21 2008-03-21 一种背景噪声激励信号的生成方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/887,066 Continuation US8370154B2 (en) 2008-03-21 2010-09-21 Method and apparatus for generating an excitation signal for background noise

Publications (1)

Publication Number Publication Date
WO2009115038A1 true WO2009115038A1 (zh) 2009-09-24

Family

ID=40213816

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/070854 WO2009115038A1 (zh) 2008-03-21 2009-03-18 一种背景噪声激励信号的生成方法及装置

Country Status (5)

Country Link
US (1) US8370154B2 (zh)
EP (1) EP2261895B1 (zh)
CN (1) CN101339767B (zh)
MX (1) MX2010010226A (zh)
WO (1) WO2009115038A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101339767B (zh) 2008-03-21 2010-05-12 华为技术有限公司 一种背景噪声激励信号的生成方法及装置
US8775818B2 (en) * 2009-11-30 2014-07-08 Red Hat, Inc. Multifactor validation of requests to thwart dynamic cross-site attacks
CN109166588B (zh) * 2013-01-15 2022-11-15 韩国电子通信研究院 处理信道信号的编码/解码装置及方法
CN106204478B (zh) * 2016-07-06 2018-09-07 电子科技大学 基于背景噪声特征空间的磁光图像增强方法
CN106531175B (zh) * 2016-11-13 2019-09-03 南京汉隆科技有限公司 一种网络话机柔和噪声产生的方法


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5893056A (en) * 1997-04-17 1999-04-06 Northern Telecom Limited Methods and apparatus for generating noise signals from speech signals
JPH10341256A (ja) * 1997-06-10 1998-12-22 Logic Corp 音声から有音を抽出し、抽出有音から音声を再生する方法および装置
US6636829B1 (en) * 1999-09-22 2003-10-21 Mindspeed Technologies, Inc. Speech communication system and method for handling lost frames
US7146309B1 (en) * 2003-09-02 2006-12-05 Mindspeed Technologies, Inc. Deriving seed values to generate excitation values in a speech coder

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6055497A (en) * 1995-03-10 2000-04-25 Telefonaktiebolaget Lm Ericsson System, arrangement, and method for replacing corrupted speech frames and a telecommunications system comprising such arrangement
CN1470051A (zh) * 2000-10-17 2004-01-21 �����ɷ� 非话音语音的高性能低比特率编码方法和设备
CN101069231A (zh) * 2004-03-15 2007-11-07 英特尔公司 语音通信的舒适噪声生成方法
WO2007027291A1 (en) * 2005-08-31 2007-03-08 Motorola, Inc. Method and apparatus for comfort noise generation in speech communication systems
CN101339767A (zh) * 2008-03-21 2009-01-07 华为技术有限公司 一种背景噪声激励信号的生成方法及装置

Also Published As

Publication number Publication date
US8370154B2 (en) 2013-02-05
CN101339767A (zh) 2009-01-07
MX2010010226A (es) 2010-12-20
US20110022391A1 (en) 2011-01-27
EP2261895A4 (en) 2011-04-06
EP2261895A1 (en) 2010-12-15
CN101339767B (zh) 2010-05-12
EP2261895B1 (en) 2012-05-23

Similar Documents

Publication Publication Date Title
US10249313B2 (en) Adaptive bandwidth extension and apparatus for the same
CN1244907C (zh) 宽带语音编解码器中的高频增强层编码方法和装置
US8370135B2 (en) Method and apparatus for encoding and decoding
KR101295729B1 (ko) 비트 레이트­규모 가변적 및 대역폭­규모 가변적 오디오디코딩에서 비트 레이트 스위칭 방법
JP5571235B2 (ja) ピッチ調整コーディング及び非ピッチ調整コーディングを使用する信号符号化
TW448417B (en) Speech encoder adaptively applying pitch preprocessing with continuous warping
KR101960198B1 (ko) 시간 도메인 코딩과 주파수 도메인 코딩 간의 분류 향상
EP3352169B1 (en) Unvoiced decision for speech processing
CN104978970A (zh) 一种噪声信号的处理和生成方法、编解码器和编解码系统
CN105745705A (zh) 使用语音相关的频谱整形信息编码音频信号和解码音频信号的概念
CN105723456A (zh) 使用确定性及类噪声信息编码音频信号及解码音频信号的概念
WO2009115038A1 (zh) 一种背景噪声激励信号的生成方法及装置
CN105765653B (zh) 自适应高通后滤波器
JP2008503786A (ja) オーディオ信号の符号化及び復号化
JPH1198090A (ja) 音声符号化/復号化装置
JP4727413B2 (ja) 音声符号化・復号装置
JP2001154699A (ja) フレーム消去の隠蔽及びその方法
JP3199142B2 (ja) 音声の励振信号符号化方法および装置
WO2000074036A1 (fr) Dispositif de codage/decodage de la voix et codage des parties non vocales, procede de decodage, et support enregistre d&#39;enregistrement de programme
JP3475958B2 (ja) 無音声符号化を含む音声符号化・復号装置、復号化方法及びプログラムを記録した記録媒体
JP2004004946A (ja) 音声復号装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09722292

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: MX/A/2010/010226

Country of ref document: MX

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 3592/KOLNP/2010

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 2009722292

Country of ref document: EP