WO2009115032A1 - Voice signal processing method and apparatus - Google Patents

Voice signal processing method and apparatus

Info

Publication number
WO2009115032A1
WO2009115032A1 PCT/CN2009/070826 CN2009070826W WO2009115032A1 WO 2009115032 A1 WO2009115032 A1 WO 2009115032A1 CN 2009070826 W CN2009070826 W CN 2009070826W WO 2009115032 A1 WO2009115032 A1 WO 2009115032A1
Authority
WO
WIPO (PCT)
Prior art keywords
background noise
energy attenuation
frame
gain value
attenuation gain
Prior art date
Application number
PCT/CN2009/070826
Other languages
English (en)
French (fr)
Inventor
代金良
张立斌
舒默特·艾雅
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP09721810.1A (patent EP2234102B1)
Priority to CA2709790A (patent CA2709790C)
Publication of WO2009115032A1
Priority to US12/820,738 (patent US7890322B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm
    • G10L19/012 Comfort noise or silence coding

Definitions

  • The present invention relates to the field of communications, and in particular to a voice signal processing method and a voice signal processing apparatus.
  • In voice communication, voice signals are generally processed in frames, and each frame is generally 10 milliseconds (ms) to 30 ms long.
  • At the sending end, the speech encoder encodes each frame of the speech signal and encapsulates the encoded bits into a speech data frame.
  • The communication channel transmits the speech data frames sent by the sending end to the receiving end.
  • The receiving end decodes the received speech data frames with a speech decoder to recover the speech signal.
  • Whether the speech decoder can recover the speech signal depends on whether it accurately receives the speech data frames sent by the sending end, which in turn depends on the communication channel.
  • If communication channel resources are tight, speech data frames may be lost or corrupted.
  • Frame Erasure Concealment (FEC), which is widely used in speech codecs, can effectively reduce the impact on communication quality when the channel loses or corrupts speech data frames.
  • Different speech codecs may use different FEC techniques, but these generally include an operation that attenuates the amplitude of the recovered speech signal.
  • FEC is defined on the speech decoder, which applies FEC processing to speech data frames (yielding error concealed frames).
  • However, the speech signal is not purely the voiced signal produced when people speak; it may also include the background noise signal in the gaps between utterances (relative to the voiced signal, the background noise signal is a silent signal).
  • The appearance of a background noise signal (corresponding to a background noise frame generated by the speech encoder) causes a sudden energy change in the signal recovered after error concealment, which is uncomfortable for the listener. The discomfort is stronger when a background noise frame is lost.
  • The technical problem addressed by the embodiments of the present invention is to provide a voice signal processing method and apparatus that make the energy transition between the error concealment signal region and the background noise signal region natural and smooth, improving the listener's hearing comfort.
  • An embodiment of the present invention provides a voice signal processing method, which includes:
  • when the frame obtained after an error concealed frame is a background noise frame, setting an energy attenuation gain value for the background noise signal corresponding to that frame, such that this gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range; and
  • controlling the energy attenuation of the background noise signal corresponding to the background noise frame by using the energy attenuation gain value.
  • An embodiment of the present invention further provides a voice signal processing apparatus, including:
  • a background noise frame acquiring unit, configured to obtain a background noise frame after an error concealed frame;
  • an energy attenuation gain value setting unit, configured to set an energy attenuation gain value for the background noise signal corresponding to the obtained background noise frame, such that this gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range; and
  • a control unit, configured to control the energy attenuation of the background noise signal corresponding to the background noise frame by using the energy attenuation gain value.
  • By setting an energy attenuation gain value for the background noise signal of a background noise frame obtained after an error concealed frame, keeping that value within a threshold range of the previous frame's gain value, and using it to attenuate the energy of the background noise signal, the embodiments make the energy transition between the error concealment signal region and the background noise signal region natural and smooth, improving the listener's hearing comfort.
  • FIG. 1 is a schematic diagram of a voice signal processing method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of the speech signal amplitude obtained by voice signal processing according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of another speech signal amplitude obtained by voice signal processing according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of another speech signal amplitude obtained by voice signal processing according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a speech decoder according to an embodiment of the present invention.
  • The embodiments of the invention provide a voice signal processing method and apparatus that set a background noise signal energy attenuation gain and use it to attenuate the energy of the background noise signal, so that the energy transition between the error concealment signal region and the background noise signal region is natural and smooth, improving the listener's hearing comfort.
  • The method shown in FIG. 1 mainly includes:
  • 101: After an error concealed frame, obtain one or more background noise frames. If only one background noise frame is obtained, it is processed in the same way as background noise frame B described below.
  • Seven consecutive background noise frames B, C, D, E, F, G, and H are used as an example, without limitation. The frame before the first background noise frame B is the error concealed frame A, and the frame before every other background noise frame is itself a background noise frame (for example, the frame before D is C). The signal corresponding to a background noise frame is a background noise signal.
  • Whether the currently obtained frame is a background noise frame may be determined from a flag bit in the frame header.
  • 102: Set an energy attenuation gain value for the background noise signal corresponding to each obtained background noise frame B, C, D, E, F, G, H, such that each gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range.
  • 102 can be implemented as follows:
  • First, obtain the saved error concealment signal energy attenuation gain value $\alpha'$ corresponding to error concealed frame A.
  • Next, set the background noise frame starting energy attenuation gain value $\alpha_{start}$ according to $\alpha'$, such that $\alpha_{start}$ differs from $\alpha'$ by an amount within the threshold range; specifically, $\alpha_{start} = \alpha'$ may be used.
  • Then, set the sum of $\alpha_{start}$ and an energy attenuation gain increment $\Delta\alpha$ smaller than the threshold as the background noise signal energy attenuation gain value of the first background noise frame B. For every other background noise frame, set its gain value to the sum of the previous background noise frame's gain value and $\Delta\alpha$.
  • Specifically, for background noise frame B, $\alpha_{noise\_B} = \alpha_{start} + \Delta\alpha$, i.e., $\alpha_{noise\_B}$ is based on $\alpha_{start}$.
  • 103: Control the energy attenuation of the background noise signals corresponding to background noise frames B, C, D, E, F, G, H by using the energy attenuation gain values.
  • 103 may be implemented as follows:
  • Attenuate the amplitude of each background noise signal by using its energy attenuation gain value: for example, use $\alpha_{noise\_B}$ for the signal of background noise frame B, $\alpha_{noise\_C}$ for the signal of background noise frame C, and so on.
  • Specifically, when each background noise frame contains M background noise signal sampling points, the frame's gain value is applied to all M sampling points of that frame.
  • This can be expressed as follows, where noise(n) is the amplitude of the n-th of the M background noise signal samples: if $\alpha_{noise} < 1$, then for $n = 0, \dots, M-1$: $noise(n) = noise(n) \times \alpha_{noise}$.
  • In the voice signal processing method shown in FIG. 1, 102 ensures that the gain value $\alpha_{noise\_B}$ of the first background noise frame B differs little from the error concealment signal energy attenuation gain value $\alpha'$ of error concealed frame A, and that, when there are at least two background noise frames, the gain value of each of frames C, D, E, F, G, H differs little from that of the preceding background noise frame.
  • In 103, these gain values are used to attenuate the energy of the corresponding background noise signals, so that the energy transition between the error concealment signal region and the background noise signal region is natural and smooth, improving the listener's hearing comfort.
  • As an embodiment, setting the energy attenuation gain values in 102, such that the gain value of each of frames B, C, D, E, F, G, H differs from that of the previous frame by an amount within the threshold range, can also be achieved as follows:
  • Referring to the other speech signal amplitude shown in FIG. 3, the difference from FIG. 2 is that an "advance 2, retreat 1" approach is used. Note that $2\Delta\alpha$ should also be smaller than the threshold.
  • In this way, the gain value of each of frames B, C, D, E, F, G, H stays within the threshold range of the previous frame's gain value while the gain values increase in a roughly sequential order until the gain value reaches 1. Other similar approaches may therefore be regarded as other embodiments of the present invention, for example:
  • In the other speech signal amplitude shown in FIG. 4, the main difference from FIG. 2 is that the gain value $\alpha_{noise\_B}$ of background noise frame B equals $\alpha_{start}$, and the gain values of the other frames C, D, E, F, G, H increase from $\alpha_{noise\_B}$ in steps of $\Delta\alpha$.
  • A voice signal processing method according to another embodiment of the present invention includes:
  • 201: After an error concealed frame, obtain one or more background noise frames. If only one background noise frame is obtained, it is processed in the same way as background noise frame B described below.
  • Seven consecutive background noise frames B, C, D, E, F, G, and H are used as an example, without limitation. The frame before the first background noise frame B is the error concealed frame A, and the frame before every other background noise frame is itself a background noise frame (for example, the frame before D is C).
  • Whether the currently obtained frame is a background noise frame may be determined from a flag bit in the frame header.
  • 202: Set the gain values so that the gain value of each of frames B, C, D, E, F, G, H differs from that of the previous frame by an amount within a threshold range. The threshold range is the range of allowed differences between the gain value of a background noise frame and that of its previous frame, derived from the required quality of the speech signal; the threshold is the maximum of that range.
  • The voice signal processing apparatus of the embodiments is described below; it is not limited to the following speech decoder.
  • FIG. 5 is a schematic diagram of a speech decoder according to an embodiment of the present invention.
  • The apparatus shown in FIG. 5 mainly includes a background noise frame acquiring unit 51, an energy attenuation gain value setting unit 52, and a control unit 53.
  • The energy attenuation gain value setting unit 52 includes an obtaining unit 521, a first setting unit 522, a second setting unit 523, and a third setting unit 524.
  • The control unit 53 includes a background noise signal acquiring unit 531 and a processing unit 532. The units function as follows:
  • The background noise frame acquiring unit 51 obtains the background noise frames B, C, D, E, F, G, H after the error concealed frame; that is, the frame before the first background noise frame B is the error concealed frame A.
  • The frame before every background noise frame other than B is itself a background noise frame (for example, the frame before D is C), and the signal corresponding to a background noise frame is a background noise signal.
  • Whether the currently obtained frame is a background noise frame may be determined from a flag bit in the frame header; this is prior art and is not described further.
  • The obtaining unit 521 obtains the saved error concealment signal energy attenuation gain value $\alpha'$ corresponding to error concealed frame A.
  • The first setting unit 522 sets the background noise frame starting energy attenuation gain value $\alpha_{start}$ according to $\alpha'$, such that $\alpha_{start}$ differs from $\alpha'$ by an amount within the threshold range; specifically, $\alpha_{start} = \alpha'$ may be used.
  • The second setting unit 523 sets the sum of $\alpha_{start}$ and an energy attenuation gain increment $\Delta\alpha$ smaller than the threshold as the gain value of the first background noise frame B: $\alpha_{noise\_B} = \alpha_{start} + \Delta\alpha$, i.e., $\alpha_{noise\_B}$ is based on $\alpha_{start}$.
  • The third setting unit 524 sets the gain value of every background noise frame other than B to the sum of the previous background noise frame's gain value and $\Delta\alpha$. For example, $\alpha_{noise\_C} = \alpha_{noise\_B} + \Delta\alpha$ and $\alpha_{noise\_F} = \alpha_{noise\_E} + \Delta\alpha$.
  • The iterative process by which these units set the gain values of at least two background noise frames can be expressed as: $\alpha_{noise} = \alpha_{noise} + \Delta\alpha$; if $\alpha_{noise} > 1$, then $\alpha_{noise} = 1$.
  • $\Delta\alpha$ may take, without limitation, one of the following two values: $\Delta\alpha = 1/N$, where N is 256; or $\Delta\alpha = (1 - \alpha_{start})/L$,
  • where L is a preset number of background noise frames; specifically, L may be 100.
  • The control unit 53 controls the energy attenuation of the background noise signals corresponding to background noise frames B, C, D, E, F, G, H by using the energy attenuation gain values.
  • The control unit 53 may include:
  • the background noise signal acquiring unit 531, which recovers the background noise signals corresponding to background noise frames B, C, D, E, F, G, and H; and
  • the processing unit 532, which attenuates the amplitude of each background noise signal by using its energy attenuation gain value: for example, it uses $\alpha_{noise\_B}$ to attenuate the signal of background noise frame B, $\alpha_{noise\_C}$ to attenuate the signal of background noise frame C, and so on.
  • Specifically, when each background noise frame contains M background noise signal sampling points, the frame's gain value is applied to all M sampling points of that frame.
  • This can be expressed as follows, where noise(n) is the amplitude of the n-th of the M background noise signal samples: if $\alpha_{noise} < 1$, then for $n = 0, \dots, M-1$: $noise(n) = noise(n) \times \alpha_{noise}$.
  • In the speech decoder shown in FIG. 5, the energy attenuation gain value setting unit 52 ensures that the gain value $\alpha_{noise\_B}$ of the first background noise frame B differs little from the error concealment signal energy attenuation gain value $\alpha'$ of error concealed frame A, and that, when there are at least two background noise frames, the gain value of each of frames C, D, E, F, G, H differs little from that of the preceding background noise frame.
  • The control unit 53 uses these gain values to attenuate the energy of the corresponding background noise signals, so that the energy transition between the error concealment signal region and the background noise signal region is natural and smooth, improving the listener's hearing comfort.
  • When setting gain values for frames B, C, D, E, F, G, H so that each differs from the previous frame's gain value by an amount within the threshold range, the energy attenuation gain value setting unit 52 may also be configured so that the gain values increase in a roughly sequential order until the gain value reaches 1. Other similar approaches may therefore be regarded as other embodiments of the present invention, for example the other speech signal amplitude shown in FIG. 4 above.
  • the embodiment of the present invention is described by taking the background noise frames C, D, E, F, G, and H as an example, and the present invention can be equally applicable in the actual case where the number of background noise frames can be more or less;
  • the initial energy attenuation gain value and the value of the energy attenuation gain value increase value in the embodiment of the present invention;
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Noise Elimination (AREA)

Description

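Voice signal processing method and apparatus

This application claims priority to Chinese Patent Application No. 200810026901.2, filed with the Chinese Patent Office on March 20, 2008 and entitled "Voice signal processing method and apparatus", which is incorporated herein by reference in its entirety.

Technical field
The present invention relates to the field of communications, and in particular to a voice signal processing method and a voice signal processing apparatus.

Background
In voice communication, voice signals are generally processed in frames, and each frame is generally 10 milliseconds (ms) to 30 ms long. The basic processing flow for each frame is:
Sending end: the speech encoder encodes each frame of the speech signal and encapsulates the encoded bits into a speech data frame;
Communication channel: transmits the speech data frames sent by the sending end to the receiving end;
Receiving end: decodes the received speech data frames with a speech decoder to recover the speech signal.
Whether the speech decoder can recover the speech signal depends on whether it accurately receives the speech data frames sent by the sending end, which in turn depends on the communication channel. If communication channel resources are tight, speech data frames may be lost or corrupted. Frame Erasure Concealment (FEC), which is currently widely used in speech codecs, can effectively reduce the impact on communication quality when the channel loses or corrupts speech data frames.
Different speech codecs may use different FEC techniques, but these generally include an operation that attenuates the amplitude of the recovered speech signal.
FEC is defined on the speech decoder, which applies FEC processing to speech data frames (yielding error concealed frames). However, the speech signal is not purely the voiced signal produced when people speak; it may also include the background noise signal in the gaps between utterances (relative to the voiced signal, the background noise signal is a silent signal). The appearance of a background noise signal (corresponding to a background noise frame generated by the speech encoder) causes a sudden energy change in the signal recovered after error concealment, which is uncomfortable for the listener. The discomfort is stronger when a background noise frame is lost.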
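Summary of the invention
The technical problem addressed by the embodiments of the present invention is to provide a voice signal processing method and apparatus that make the energy transition between the error concealment signal region and the background noise signal region natural and smooth, improving the listener's hearing comfort.
To solve this technical problem, an embodiment of the present invention provides a voice signal processing method, including:
when the frame obtained after an error concealed frame is a background noise frame, setting an energy attenuation gain value for the background noise signal corresponding to that frame, such that this gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range; and
controlling the energy attenuation of the background noise signal corresponding to the background noise frame by using the energy attenuation gain value.
Correspondingly, an embodiment of the present invention further provides a voice signal processing apparatus, including:
a background noise frame acquiring unit, configured to obtain a background noise frame after an error concealed frame;
an energy attenuation gain value setting unit, configured to set an energy attenuation gain value for the background noise signal corresponding to the obtained background noise frame, such that this gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range; and
a control unit, configured to control the energy attenuation of the background noise signal corresponding to the background noise frame by using the energy attenuation gain value.
By setting an energy attenuation gain value for the background noise signal of a background noise frame obtained after an error concealed frame, keeping that value within a threshold range of the previous frame's gain value, and using it to attenuate the energy of the background noise signal, the embodiments make the energy transition between the error concealment signal region and the background noise signal region natural and smooth, improving the listener's hearing comfort.

Brief description of the drawings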
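FIG. 1 is a schematic diagram of a voice signal processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the speech signal amplitude obtained by voice signal processing according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another speech signal amplitude obtained by voice signal processing according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another speech signal amplitude obtained by voice signal processing according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a speech decoder according to an embodiment of the present invention.

Detailed description
The embodiments of the present invention provide a voice signal processing method and apparatus that set a background noise signal energy attenuation gain and use it to attenuate the energy of the background noise signal, so that the energy transition between the error concealment signal region and the background noise signal region is natural and smooth, improving the listener's hearing comfort.
The embodiments are described in detail below with reference to the drawings.
FIG. 1 is a schematic diagram of a voice signal processing method according to an embodiment of the present invention, and FIG. 2 is a schematic diagram of the resulting speech signal amplitude. Referring to FIG. 1 and FIG. 2, the method shown in FIG. 1 mainly includes: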
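101: After an error concealed frame, obtain one or more background noise frames. If only one background noise frame is obtained, it is processed in the same way as background noise frame B described below. Seven consecutive background noise frames B, C, D, E, F, G, H are used as an example, without limitation. The frame before the first background noise frame B is the error concealed frame A, and the frame before every other background noise frame is itself a background noise frame (for example, the frame before D is C). The signal corresponding to a background noise frame is a background noise signal. Whether the currently obtained frame is a background noise frame may be determined from a flag bit in the frame header.
102: Set an energy attenuation gain value for the background noise signal corresponding to each obtained background noise frame B, C, D, E, F, G, H, such that each gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range. Specifically, 102 can be implemented as follows:
First, obtain the saved error concealment signal energy attenuation gain value $\alpha'$ corresponding to error concealed frame A.
Next, set the background noise frame starting energy attenuation gain value $\alpha_{start}$ according to $\alpha'$, such that $\alpha_{start}$ differs from $\alpha'$ by an amount within the threshold range; specifically, $\alpha_{start} = \alpha'$ may be used.
Then, set the sum of $\alpha_{start}$ and an energy attenuation gain increment $\Delta\alpha$ smaller than the threshold as the gain value of the first background noise frame B. For every other background noise frame, set its gain value to the sum of the previous background noise frame's gain value and $\Delta\alpha$. Specifically:
Background noise frame B: $\alpha_{noise\_B} = \alpha_{start} + \Delta\alpha$, i.e., $\alpha_{noise\_B}$ is based on $\alpha_{start}$;
Background noise frame C: $\alpha_{noise\_C} = \alpha_{noise\_B} + \Delta\alpha$, i.e., $\alpha_{noise\_C}$ is based on $\alpha_{noise\_B}$;
Background noise frame D: $\alpha_{noise\_D} = \alpha_{noise\_C} + \Delta\alpha$, i.e., $\alpha_{noise\_D}$ is based on $\alpha_{noise\_C}$;
Background noise frame E: $\alpha_{noise\_E} = \alpha_{noise\_D} + \Delta\alpha$, i.e., $\alpha_{noise\_E}$ is based on $\alpha_{noise\_D}$;
Background noise frame F: $\alpha_{noise\_F} = \alpha_{noise\_E} + \Delta\alpha$, i.e., $\alpha_{noise\_F}$ is based on $\alpha_{noise\_E}$;
Background noise frame G: $\alpha_{noise\_G} = \alpha_{noise\_F} + \Delta\alpha$, i.e., $\alpha_{noise\_G}$ is based on $\alpha_{noise\_F}$;
Background noise frame H: $\alpha_{noise\_H} = \alpha_{noise\_G} + \Delta\alpha$, i.e., $\alpha_{noise\_H}$ is based on $\alpha_{noise\_G}$.
Note that when multiple consecutive background noise frames are obtained and the iteration above yields $\alpha_{noise} \geq 1$ for some frame, $\alpha_{noise}$ is set to 1 to meet speech signal processing requirements. For brevity, the iteration that sets the gain values of at least two background noise frames can be expressed as:
$\alpha_{noise} = \alpha_{noise} + \Delta\alpha$
if $(\alpha_{noise} > 1)$, $\alpha_{noise} = 1$
As an embodiment, $\Delta\alpha$ may take, without limitation, one of the following two values:
$\Delta\alpha = 1/N$, where N is 256;
$\Delta\alpha = (1 - \alpha_{start})/L$, where L is a preset number of background noise frames; specifically, L may be 100.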
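The gain iteration of step 102 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the choice $\alpha_{start} = \alpha'$ follows the text's suggestion, and the second increment rule is read here as $(1 - \alpha_{start})/L$, which is an assumption because that formula is garbled in the source.

```python
def gain_increment_from_n(n=256):
    """First rule in the text: delta_alpha = 1/N, with N = 256."""
    return 1.0 / n


def gain_increment_from_l(alpha_start, l=100):
    """Assumed reading of the second rule: (1 - alpha_start) / L, with L = 100."""
    return (1.0 - alpha_start) / l


def linear_gain_schedule(alpha_prime, delta_alpha, num_frames):
    """Gains for consecutive background noise frames that follow a concealed frame.

    alpha_prime is the saved gain of the error concealed frame (alpha' in the text).
    Each frame's gain is the previous gain plus delta_alpha, capped at 1.
    """
    if delta_alpha <= 0:
        raise ValueError("delta_alpha must be positive")
    alpha = alpha_prime  # alpha_start = alpha', the choice suggested in the text
    gains = []
    for _ in range(num_frames):
        alpha = min(alpha + delta_alpha, 1.0)  # "if alpha_noise > 1, alpha_noise = 1"
        gains.append(alpha)
    return gains
```

For instance, `linear_gain_schedule(0.5, 0.25, 4)` gives gains 0.75, 1.0, 1.0, 1.0: adjacent gains never differ by more than the increment, and the ramp stops at 1.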
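103: Control the energy attenuation of the background noise signals corresponding to background noise frames B, C, D, E, F, G, H by using the energy attenuation gain values. Specifically, 103 can be implemented as follows:
First, recover the background noise signals corresponding to background noise frames B, C, D, E, F, G, H.
Next, attenuate the amplitude of each background noise signal by using its energy attenuation gain value: for example, use $\alpha_{noise\_B}$ to attenuate the signal of background noise frame B, use $\alpha_{noise\_C}$ to attenuate the signal of background noise frame C, and so on. Specifically, when each background noise frame contains M background noise signal sampling points, the frame's gain value is applied to all M sampling points of that frame. For brevity, this can be expressed as follows, where noise(n) is the amplitude of the n-th of the M background noise signal samples and alpha_noise is the frame's gain value:
if (alpha_noise < 1)
    for (n = 0; n < M; n++)
        { noise(n) = noise(n) * alpha_noise; }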
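The per-sample attenuation in step 103 amounts to scaling every sample of a frame by that frame's gain. A minimal Python sketch follows, assuming samples are held in a plain list; the function name is hypothetical.

```python
def attenuate_frame(samples, alpha_noise):
    """Scale the M samples of one background noise frame by its gain value.

    Mirrors the pseudocode: samples are only modified when alpha_noise < 1;
    a gain of 1 leaves the frame unchanged.
    """
    if alpha_noise >= 1.0:
        return list(samples)
    return [sample * alpha_noise for sample in samples]
```

A gain of 0.5 halves each amplitude, for example turning the samples 100, -200, 50 into 50, -100, 25.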
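In the voice signal processing method shown in FIG. 1, 102 ensures that the gain value $\alpha_{noise\_B}$ of the first background noise frame B differs little from the error concealment signal energy attenuation gain value $\alpha'$ of error concealed frame A, and that, when there are at least two background noise frames, the gain value of each of frames C, D, E, F, G, H differs little from that of the preceding background noise frame. In 103, these gain values are used to attenuate the energy of the corresponding background noise signals, so that the energy transition between the error concealment signal region and the background noise signal region is natural and smooth, improving the listener's hearing comfort.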
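As an embodiment, setting the energy attenuation gain values in 102, such that the gain value of each of frames B, C, D, E, F, G, H differs from that of the previous frame by an amount within the threshold range, can also be implemented as follows:
Referring to the other speech signal amplitude shown in FIG. 3, the difference from the amplitude shown in FIG. 2 is that an "advance 2, retreat 1" approach is used here. Note that $2\Delta\alpha$ below should also be smaller than the threshold. For example, let:
Background noise frame B: $\alpha_{noise\_B} = \alpha_{start} + 2\Delta\alpha$, i.e., $\alpha_{noise\_B}$ is based on $\alpha_{start}$;
Background noise frame C: $\alpha_{noise\_C} = \alpha_{noise\_B} - \Delta\alpha$, i.e., $\alpha_{noise\_C}$ is based on $\alpha_{noise\_B}$;
Background noise frame D: $\alpha_{noise\_D} = \alpha_{noise\_C} + 2\Delta\alpha$, i.e., $\alpha_{noise\_D}$ is based on $\alpha_{noise\_C}$;
Background noise frame E: $\alpha_{noise\_E} = \alpha_{noise\_D} - \Delta\alpha$, i.e., $\alpha_{noise\_E}$ is based on $\alpha_{noise\_D}$;
Background noise frame F: $\alpha_{noise\_F} = \alpha_{noise\_E} + 2\Delta\alpha$, i.e., $\alpha_{noise\_F}$ is based on $\alpha_{noise\_E}$;
Background noise frame G: $\alpha_{noise\_G} = \alpha_{noise\_F} - \Delta\alpha$, i.e., $\alpha_{noise\_G}$ is based on $\alpha_{noise\_F}$;
Background noise frame H: $\alpha_{noise\_H} = \alpha_{noise\_G} + 2\Delta\alpha$, i.e., $\alpha_{noise\_H}$ is based on $\alpha_{noise\_G}$.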
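The "advance 2, retreat 1" variant of FIG. 3 can be sketched in Python as follows. The function name is hypothetical, and holding the gain at 1 once it is reached is an assumption based on the statement that the gains rise until the gain value is 1.

```python
def advance_two_retreat_one(alpha_start, delta_alpha, num_frames):
    """Gains that alternately rise by 2*delta_alpha and fall by delta_alpha.

    Frames at even positions (B, D, F, H) advance by 2*delta_alpha; frames at
    odd positions (C, E, G) retreat by delta_alpha. Once the gain reaches 1 it
    is held there (assumption, see lead-in).
    """
    gains = []
    alpha = alpha_start
    for index in range(num_frames):
        if alpha < 1.0:
            step = 2 * delta_alpha if index % 2 == 0 else -delta_alpha
            alpha = min(alpha + step, 1.0)
        gains.append(alpha)
    return gains
```

With `alpha_start = 0.5` and `delta_alpha = 0.125`, the seven frames B to H receive 0.75, 0.625, 0.875, 0.75, 1.0, 1.0, 1.0. The largest jump between neighbours is $2\Delta\alpha$, which is why the text requires $2\Delta\alpha$ to stay below the threshold.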
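In this way, the gain value of each of frames B, C, D, E, F, G, H stays within the threshold range of the previous frame's gain value while the gain values increase in a roughly sequential order until the gain value reaches 1. Other similar approaches may therefore also be regarded as other embodiments of the present invention, for example:
In the other speech signal amplitude shown in FIG. 4, the main difference from the amplitude shown in FIG. 2 is that the gain value $\alpha_{noise\_B}$ of background noise frame B equals $\alpha_{start}$, and the gain values of the other background noise frames C, D, E, F, G, H increase from $\alpha_{noise\_B}$ in steps of $\Delta\alpha$.
Referring to FIG. 2, a voice signal processing method according to another embodiment of the present invention includes:
201: After an error concealed frame, obtain one or more background noise frames. If only one background noise frame is obtained, it is processed in the same way as background noise frame B described below. Seven consecutive background noise frames B, C, D, E, F, G, H are used as an example, without limitation. The frame before the first background noise frame B is the error concealed frame A, and the frame before every other background noise frame is itself a background noise frame (for example, the frame before D is C). The signal corresponding to a background noise frame is a background noise signal. Whether the currently obtained frame is a background noise frame may be determined from a flag bit in the frame header.
202: Set an energy attenuation gain value for the background noise signal corresponding to each obtained background noise frame B, C, D, E, F, G, H, such that each gain value differs from the signal energy attenuation gain value of the previous frame by an amount within a threshold range. The threshold range is the range of allowed differences between the gain value of a background noise frame and that of its previous frame, derived from the required quality of the speech signal; the threshold is the maximum of that range. For the specific implementation of 202, see 102; details are not repeated here.
203: Control the energy attenuation of the background noise signals corresponding to background noise frames B, C, D, E, F, G, H by using the energy attenuation gain values. For the specific implementation of 203, see 103; details are not repeated here.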
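Steps 201 to 203 can be combined into one pass over a decoded frame sequence, sketched below in Python. The `Frame` structure, its field names, and the rule that any other frame type ends the ramp are all assumptions made for illustration; the source only states that a header flag bit identifies background noise frames and that the concealed frame's gain is saved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    is_background_noise: bool  # hypothetical stand-in for the header flag bit
    is_concealed: bool         # True for an error concealed frame
    samples: List[float]
    concealment_gain: Optional[float] = None  # saved alpha' for concealed frames


def smooth_after_concealment(frames, delta_alpha):
    """Return output samples per frame, ramping noise frames after concealment."""
    output = []
    prev_gain = None  # None means no ramp is in progress
    for frame in frames:
        if frame.is_concealed:
            prev_gain = frame.concealment_gain  # alpha_start = alpha'
            output.append(list(frame.samples))  # assumed already attenuated by FEC
        elif frame.is_background_noise and prev_gain is not None:
            gain = min(prev_gain + delta_alpha, 1.0)  # step 202
            if gain < 1.0:                            # step 203
                output.append([s * gain for s in frame.samples])
            else:
                output.append(list(frame.samples))
            prev_gain = gain
        else:
            prev_gain = None  # assumption: any other frame ends the ramp
            output.append(list(frame.samples))
    return output
```

Keeping only the previous gain as state means each background noise frame needs no knowledge of how many frames came before it, which matches the frame-by-frame iteration described in 102.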
下面相应地对本发明实施例的语音信号处理装置进行说明, 但本发明实施 例的语音信号处理装置并不仅限于下面的语音解码器。
图 5是本发明实施例的语音解码器的示意图, 参照该图 5与图 2, 图 5所示 装置主要包括背景噪声帧获取单元 51、 能量衰减增益值设置单元 52、 控制单元 53; 能量衰减增益值设置单元 52包括获取单元 521、 第一设置单元 522、 第二 设置单元 523、第三设置单元 524;控制单元 53包括背景噪声信号获取单元 531、 处理单元 532, 其中各单元功能如下述:
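Background noise frame obtaining unit 51 obtains the background noise frames B, C, D, E, F, G and H that follow an error concealment frame. That is, the frame preceding the first obtained background noise frame B is error concealment frame A, and the frame preceding each background noise frame other than B is a background noise frame, whose corresponding signal is a background noise signal; for example, the frame preceding background noise frame D is background noise frame C. Specifically, whether the currently obtained frame is a background noise frame may be determined from a flag bit in the frame header; this is prior art and is not described further;
Obtaining unit 521 obtains the stored error concealment signal energy attenuation gain value $\alpha'$ of error concealment frame A;
First setting unit 522 sets the initial energy attenuation gain value $\alpha_{start}$ for background noise frames according to the error concealment signal energy attenuation gain value $\alpha'$ of error concealment frame A. The initial value $\alpha_{start}$ differs from $\alpha'$ by an amount within the threshold range; specifically, $\alpha_{start} = \alpha'$ may be used;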
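Second setting unit 523 sets the sum of the initial energy attenuation gain value $\alpha_{start}$ and an energy attenuation gain increment $\Delta\alpha$ that is less than the threshold as the background noise signal energy attenuation gain value of the first background noise frame B. Specifically, let:
the gain value of background noise frame B be $\alpha_{noise\_B} = \alpha_{start} + \Delta\alpha$, that is, $\alpha_{noise\_B}$ is based on $\alpha_{start}$;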
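Third setting unit 524, for each background noise frame other than the first background noise frame B, sets the sum of the gain value of the preceding background noise frame and the energy attenuation gain increment as that frame's background noise signal energy attenuation gain value. Specifically, let:
the gain value of background noise frame C be $\alpha_{noise\_C} = \alpha_{noise\_B} + \Delta\alpha$, that is, $\alpha_{noise\_C}$ is based on $\alpha_{noise\_B}$;
the gain value of background noise frame D be $\alpha_{noise\_D} = \alpha_{noise\_C} + \Delta\alpha$, that is, $\alpha_{noise\_D}$ is based on $\alpha_{noise\_C}$;
the gain value of background noise frame E be $\alpha_{noise\_E} = \alpha_{noise\_D} + \Delta\alpha$, that is, $\alpha_{noise\_E}$ is based on $\alpha_{noise\_D}$;
the gain value of background noise frame F be $\alpha_{noise\_F} = \alpha_{noise\_E} + \Delta\alpha$, that is, $\alpha_{noise\_F}$ is based on $\alpha_{noise\_E}$;
the gain value of background noise frame G be $\alpha_{noise\_G} = \alpha_{noise\_F} + \Delta\alpha$, that is, $\alpha_{noise\_G}$ is based on $\alpha_{noise\_F}$;
the gain value of background noise frame H be $\alpha_{noise\_H} = \alpha_{noise\_G} + \Delta\alpha$, that is, $\alpha_{noise\_H}$ is based on $\alpha_{noise\_G}$;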
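Note that when multiple consecutive background noise frames are obtained and the gain value $\alpha_{noise}$ of some background noise frame, computed by the same iteration as above, satisfies $\alpha_{noise} \geq 1$, then $\alpha_{noise} = 1$ is used in order to meet the speech signal processing requirements. For brevity, the iteration by which the setting units above set the gain values for at least two background noise frames can be expressed as:
$\alpha_{noise} = \alpha_{noise} + \Delta\alpha$
if ($\alpha_{noise} > 1$), $\alpha_{noise} = 1$
As one implementation, $\Delta\alpha$ may take, without limitation, one of the following two values:
$\Delta\alpha = \frac{1}{N}$, where N is 256;
$\Delta\alpha = \frac{1 - \alpha_{start}}{L}$, where L is a preset number of background noise frames; specifically, L may be 100;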
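Control unit 53 uses the energy attenuation gain values to control the energy attenuation of the background noise signals corresponding to the background noise frames B, C, D, E, F, G and H. Specifically, control unit 53 may include:
Background noise signal obtaining unit 531, which recovers the background noise signals corresponding to the background noise frames B, C, D, E, F, G and H respectively;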
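Processing unit 532, which attenuates the amplitude of the background noise signals using the energy attenuation gain values. For example, it attenuates the background noise signal of background noise frame B using that frame's gain value $\alpha_{noise\_B}$, attenuates the background noise signal of background noise frame C using that frame's gain value $\alpha_{noise\_C}$, and so on. Specifically, when each background noise frame contains M background noise signal samples, the M samples of each frame are attenuated in amplitude using that frame's gain value. For brevity, this attenuation can be expressed as follows, where $noise(n)$ denotes the amplitude of the n-th of the M background noise signal samples:
if ($\alpha_{noise} < 1$),
for (n = 0; n < M; n++)
{ noise(n) = noise(n) × $\alpha_{noise}$ }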
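The gain iteration and the per-sample attenuation described for units 522 to 524 and 532 can be sketched as follows. This is an illustrative sketch rather than the decoder's implementation: the function names `gain_increment`, `next_gain` and `attenuate` and the mode strings `"fixed"` and `"spread"` are hypothetical, while the values N = 256 and L = 100 come from the text.

```python
def gain_increment(alpha_start, mode="fixed", n=256, frame_count=100):
    """Return delta-alpha using one of the two options given in the text.

    "fixed":  1 / N, with N = 256 by default.
    "spread": (1 - alpha_start) / L, with L = 100 preset frames by default.
    The mode names are hypothetical labels for this sketch.
    """
    if mode == "fixed":
        return 1.0 / n
    if mode == "spread":
        return (1.0 - alpha_start) / frame_count
    raise ValueError(f"unknown mode: {mode!r}")


def next_gain(prev_gain, delta):
    """One iteration: add delta to the preceding gain and cap the result at 1."""
    return min(prev_gain + delta, 1.0)


def attenuate(samples, gain):
    """Scale the M samples of one background noise frame by its gain.

    A gain of 1 or more leaves the samples unchanged, mirroring the
    `if (alpha_noise < 1)` guard in the text.
    """
    if gain >= 1.0:
        return list(samples)
    return [sample * gain for sample in samples]


# Usage: alpha_start equals alpha' = 0.5, then frame B is attenuated.
alpha_start = 0.5
delta = gain_increment(alpha_start)      # 1/256
gain_b = next_gain(alpha_start, delta)   # 0.50390625
frame_b = attenuate([256.0, -512.0], gain_b)
```

With the "spread" option and L = 100, a ramp that starts at 0.5 uses steps of 0.005 and reaches 1 after the preset 100 frames; with the fixed step of 1/256 the same ramp needs 128 frames.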
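In the speech decoder of the embodiment shown in Fig. 5, the energy attenuation gain value setting unit 52 ensures that the background noise signal energy attenuation gain value $\alpha_{noise\_B}$ of the first background noise frame B differs only slightly from the error concealment signal energy attenuation gain value $\alpha'$ of error concealment frame A. When at least two background noise frames exist, it also ensures that the gain value of each of the background noise frames C, D, E, F, G and H differs only slightly from the gain value of the preceding background noise frame. Control unit 53 uses these gain values to attenuate the energy of the background noise signals corresponding to the background noise frames. As a result, the energy transition between the error concealment signal region and the background noise signal region is natural and smooth, which improves listening comfort.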
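As one implementation, in order to set energy attenuation gain values for the background noise signals corresponding to the obtained background noise frames B, C, D, E, F, G and H such that each frame's gain value differs from the gain value of its preceding frame by no more than the threshold, the energy attenuation gain value setting unit 52 may also operate as follows:
Refer to Fig. 3, a schematic diagram of another speech signal amplitude obtained by the speech signal processing of an embodiment of the invention. Unlike the speech signal amplitude shown in Fig. 2, an "advance 2, retreat 1" approach is used here. Note that $2\Delta\alpha$ below should also be less than the threshold. For example, let: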
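the gain value of background noise frame B be $\alpha_{noise\_B} = \alpha_{start} + 2\Delta\alpha$, that is, $\alpha_{noise\_B}$ is based on $\alpha_{start}$;
the gain value of background noise frame C be $\alpha_{noise\_C} = \alpha_{noise\_B} - \Delta\alpha$, that is, $\alpha_{noise\_C}$ is based on $\alpha_{noise\_B}$;
the gain value of background noise frame D be $\alpha_{noise\_D} = \alpha_{noise\_C} + 2\Delta\alpha$, that is, $\alpha_{noise\_D}$ is based on $\alpha_{noise\_C}$;
the gain value of background noise frame E be $\alpha_{noise\_E} = \alpha_{noise\_D} - \Delta\alpha$, that is, $\alpha_{noise\_E}$ is based on $\alpha_{noise\_D}$;
the gain value of background noise frame F be $\alpha_{noise\_F} = \alpha_{noise\_E} + 2\Delta\alpha$, that is, $\alpha_{noise\_F}$ is based on $\alpha_{noise\_E}$;
the gain value of background noise frame G be $\alpha_{noise\_G} = \alpha_{noise\_F} - \Delta\alpha$, that is, $\alpha_{noise\_G}$ is based on $\alpha_{noise\_F}$;
the gain value of background noise frame H be $\alpha_{noise\_H} = \alpha_{noise\_G} + 2\Delta\alpha$, that is, $\alpha_{noise\_H}$ is based on $\alpha_{noise\_G}$.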
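In this way, the gain value of each of the background noise frames B, C, D, E, F, G and H stays within the threshold of the gain value of its preceding frame, while the gain values of the background noise frames C, D, E, F, G and H increase in a roughly ascending order until the gain value of a background noise frame reaches 1. Other similar approaches may therefore also be regarded as further implementations of the invention, for example the other speech signal amplitude shown in Fig. 4 above.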
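The "advance 2, retreat 1" schedule of Fig. 3 and the Fig. 4 variant can be compared with a short sketch. The function names below are hypothetical, and holding the gain at 1 once it is reached is an assumption based on the statement that the gains increase until a value of 1 is reached.

```python
def advance_two_retreat_one(alpha_start, delta, count):
    """Gains for `count` background noise frames: +2*delta, -delta, +2*delta, ...

    Assumption of this sketch: once the gain reaches 1 it is held there.
    """
    gains = []
    gain = alpha_start
    for index in range(count):
        if gain < 1.0:
            step = 2 * delta if index % 2 == 0 else -delta
            gain = min(gain + step, 1.0)
        gains.append(gain)
    return gains


def hold_then_ramp(alpha_start, delta, count):
    """Fig. 4 variant: the first frame keeps alpha_start, later frames add delta."""
    return [min(alpha_start + index * delta, 1.0) for index in range(count)]


# Seven frames B..H starting from alpha_start = 0.5 with delta = 1/256.
fig3_gains = advance_two_retreat_one(0.5, 1 / 256, 7)
fig4_gains = hold_then_ramp(0.5, 1 / 256, 7)
```

Over each pair of frames the "advance 2, retreat 1" schedule gains one net increment, so it climbs at roughly half the per-frame rate of the plain ramp while its largest single step is $2\Delta\alpha$.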
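The following points should be noted:
1. The embodiments above are described using background noise frames C, D, E, F, G and H as an example; the invention applies equally in practice, where the number of background noise frames may be larger or smaller;
2. The threshold may be chosen according to the actual situation from, without limitation, values such as $2\Delta\alpha$, $2.5\Delta\alpha$ and $3\Delta\alpha$, where $\Delta\alpha$ is the energy attenuation gain increment. Given the range of the threshold, the initial energy attenuation gain value and the energy attenuation gain increment in the embodiments above can be determined according to the actual situation;
3. When the lost frame is a background noise frame, the energy of the error concealment signal produced by prior-art FEC processing decays more sharply than when no background noise frame is lost. If a background noise frame is then obtained after the error concealment frame, the energy transition from the error concealment signal region to the background noise signal region is more abrupt than when no background noise frame is lost. In this case, applying the embodiments of the invention effectively makes the energy transition between the error concealment signal region and the background noise signal region natural and smooth, which improves listening comfort.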
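The threshold condition in note 2 can be checked mechanically for any gain schedule. The helper below is a hypothetical sketch (the name `within_threshold` is not from the source); it treats the threshold as the largest permitted difference between consecutive gains, as the text defines it.

```python
def within_threshold(alpha_prev, gains, threshold):
    """Return True if every gain differs from its predecessor by at most
    `threshold`; `alpha_prev` is the gain of the frame before the first one,
    for example alpha' of the error concealment frame."""
    prev = alpha_prev
    for gain in gains:
        if abs(gain - prev) > threshold:
            return False
        prev = gain
    return True


delta = 1 / 256
linear = [129 / 256, 130 / 256, 131 / 256]    # steps of delta
zigzag = [130 / 256, 129 / 256, 131 / 256]    # +2*delta, -delta, +2*delta
ok_linear = within_threshold(0.5, linear, 2 * delta)
ok_zigzag = within_threshold(0.5, zigzag, 2.5 * delta)
abrupt = within_threshold(0.5, [1.0], 3 * delta)  # jump straight to full level
```

Here the two gradual schedules pass, while the jump from 0.5 straight to 1 fails even the loosest of the three example thresholds; that jump is the abrupt transition the method is meant to avoid.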
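In addition, a person of ordinary skill in the art will understand that all or part of the processes of the method embodiments above may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM) or the like.
The above are specific embodiments of the invention. It should be noted that a person of ordinary skill in the art may make improvements and refinements without departing from the principles of the invention, and such improvements and refinements also fall within the scope of protection of the invention.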

Claims

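1. A speech signal processing method, comprising:
when a background noise frame is obtained after an error concealment frame, setting an energy attenuation gain value for the background noise signal corresponding to the obtained background noise frame, such that the background noise signal energy attenuation gain value of the background noise frame differs from the signal energy attenuation gain value of its preceding frame by an amount within a threshold range;
using the energy attenuation gain value to control the energy attenuation of the background noise signal corresponding to the background noise frame.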
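2. The speech signal processing method of claim 1, wherein setting an energy attenuation gain value for the background noise signal corresponding to the obtained background noise frame comprises:
obtaining the error concealment signal energy attenuation gain value of the error concealment frame;
setting an initial energy attenuation gain value for background noise frames according to the error concealment signal energy attenuation gain value of the error concealment frame, the initial energy attenuation gain value differing from the error concealment signal energy attenuation gain value of the error concealment frame by an amount within the threshold range;
setting the sum of the initial energy attenuation gain value and an energy attenuation gain increment that is less than the threshold as the background noise signal energy attenuation gain value of the first background noise frame obtained after the error concealment frame.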
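3. The speech signal processing method of claim 2, further comprising:
when at least two background noise frames are obtained after the error concealment frame, for each background noise frame other than the first background noise frame, setting the sum of the signal energy attenuation gain value of the preceding background noise frame and the energy attenuation gain increment as the background noise signal energy attenuation gain value of that background noise frame.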
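4. The speech signal processing method of claim 3, wherein the energy attenuation gain increment is 1/256, or is a set value, the set value being:
the difference between 1 and the initial energy attenuation gain value, divided by a preset number of background noise frames.
5. The speech signal processing method of claim 4, wherein the preset number of background noise frames is 100.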
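6. The speech signal processing method of claim 1 or 2, wherein the threshold is the maximum of the range of differences between the background noise signal energy attenuation gain value of the background noise frame and the signal energy attenuation gain value of its preceding frame, the range being derived from the required quality of the resulting speech signal.
7. The speech signal processing method of any one of claims 1 to 5, wherein the initial energy attenuation gain value equals the error concealment signal energy attenuation gain value of the error concealment frame.
8. The speech signal processing method of any one of claims 1 to 5, wherein using the energy attenuation gain value to control the energy attenuation of the background noise signal corresponding to the background noise frame comprises:
recovering the background noise signal corresponding to the background noise frame;
attenuating the amplitude of the background noise signal using the energy attenuation gain value.
9. The speech signal processing method of any one of claims 1 to 5, wherein the error concealment frame includes a background noise frame on which error concealment processing has been performed.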
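10. A speech signal processing apparatus, comprising:
a background noise frame obtaining unit, configured to obtain a background noise frame following an error concealment frame;
an energy attenuation gain value setting unit, configured to set an energy attenuation gain value for the background noise signal corresponding to the obtained background noise frame, such that the background noise signal energy attenuation gain value of the background noise frame differs from the signal energy attenuation gain value of its preceding frame by an amount within a threshold range;
a control unit, configured to use the energy attenuation gain value to control the energy attenuation of the background noise signal corresponding to the background noise frame.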
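11. The speech signal processing apparatus of claim 10, wherein the energy attenuation gain value setting unit comprises:
an obtaining unit, configured to obtain the error concealment signal energy attenuation gain value of the error concealment frame;
a first setting unit, configured to set an initial energy attenuation gain value for background noise frames according to the error concealment signal energy attenuation gain value of the error concealment frame, the initial energy attenuation gain value differing from the error concealment signal energy attenuation gain value of the error concealment frame by an amount within the threshold range;
a second setting unit, configured to set the sum of the initial energy attenuation gain value and an energy attenuation gain increment that is less than the threshold as the background noise signal energy attenuation gain value of the first background noise frame obtained after the error concealment frame.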
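12. The speech signal processing apparatus of claim 11, wherein, when at least two background noise frames are obtained after the error concealment frame, the energy attenuation gain value setting unit further comprises:
a third setting unit, configured to set, for each background noise frame other than the first background noise frame, the sum of the signal energy attenuation gain value of the preceding background noise frame and the energy attenuation gain increment as the background noise signal energy attenuation gain value of that background noise frame.
13. The speech signal processing apparatus of claim 10 or 11, wherein the threshold is the maximum of the range of differences between the background noise signal energy attenuation gain value of the background noise frame and the signal energy attenuation gain value of its preceding frame, the range being derived from the required quality of the resulting speech signal.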
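14. The speech signal processing apparatus of any one of claims 10 to 12, wherein the control unit comprises:
a background noise signal obtaining unit, configured to recover the background noise signal corresponding to the background noise frame;
a processing unit, configured to attenuate the amplitude of the background noise signal using the energy attenuation gain value.
15. The speech signal processing apparatus of any one of claims 10 to 12, wherein the error concealment frame includes a background noise frame on which error concealment processing has been performed.
16. The speech signal processing apparatus of any one of claims 10 to 12, wherein the speech signal processing apparatus is a speech decoder.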
PCT/CN2009/070826 2008-03-20 2009-03-17 Speech signal processing method and apparatus WO2009115032A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP09721810.1A EP2234102B1 (en) 2008-03-20 2009-03-17 A voice signal processing method and device
CA2709790A CA2709790C (en) 2008-03-20 2009-03-17 Method and apparatus for speech signal processing
US12/820,738 US7890322B2 (en) 2008-03-20 2010-06-22 Method and apparatus for speech signal processing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNB2008100269012A CN100550133C (zh) 2008-03-20 2008-03-20 Speech signal processing method and apparatus
CN200810026901.2 2008-03-20

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/820,738 Continuation US7890322B2 (en) 2008-03-20 2010-06-22 Method and apparatus for speech signal processing

Publications (1)

Publication Number Publication Date
WO2009115032A1 true WO2009115032A1 (zh) 2009-09-24

Family

ID=40213815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/070826 WO2009115032A1 (zh) 2008-03-20 2009-03-17 Speech signal processing method and apparatus

Country Status (6)

Country Link
US (1) US7890322B2 (zh)
EP (1) EP2234102B1 (zh)
CN (1) CN100550133C (zh)
CA (1) CA2709790C (zh)
RU (1) RU2435233C1 (zh)
WO (1) WO2009115032A1 (zh)


Also Published As

Publication number Publication date
CA2709790C (en) 2013-06-04
EP2234102A4 (en) 2011-04-27
US20100250247A1 (en) 2010-09-30
CA2709790A1 (en) 2009-09-24
CN101339766A (zh) 2009-01-07
US7890322B2 (en) 2011-02-15
EP2234102A1 (en) 2010-09-29
CN100550133C (zh) 2009-10-14
EP2234102B1 (en) 2014-05-07
RU2435233C1 (ru) 2011-11-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09721810

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 3507/CHENP/2010

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 2709790

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2009721810

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2010129857

Country of ref document: RU