CN1757060B - Voicing index controls for CELP speech coding - Google Patents

Voicing index controls for CELP speech coding

Info

Publication number
CN1757060B
Authority
CN
China
Prior art keywords
speech
decoder
index
encoder
celp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2004800060153A
Other languages
English (en)
Other versions
CN1757060A (zh)
Inventor
高扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mindspeed Technologies LLC
MACOM Technology Solutions Holdings Inc
Original Assignee
Mindspeed Technologies LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mindspeed Technologies LLC
Publication of CN1757060A
Application granted
Publication of CN1757060B
Anticipated expiration
Expired - Fee Related (current legal status)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26Pre-filtering or post-filtering
    • G10L19/265Pre-filtering, e.g. high frequency emphasis prior to encoding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005Correction of errors induced by the transmission channel, if related to the coding algorithm
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/087Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • G10L19/20Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90Pitch determination of speech signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/09Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)
  • Noise Elimination (AREA)
  • Image Analysis (AREA)
  • Measurement Of Optical Distance (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

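A method for improving the quality of speech synthesized by an analysis-by-synthesis (ABS) coder. Because the degree of periodicity in a voiced speech signal varies significantly between different voiced segments, analysis-by-synthesis speech coding (for example, CELP) can produce unstable perceptual quality. The invention therefore uses a voicing index, which indicates the degree of periodicity of the speech signal, to control and improve ABS-type speech coding. The voicing index may be used to improve quality stability by controlling the encoder and/or the decoder, for example in: fixed-codebook (301) short-term enhancement, including spectrum tilt; the perceptual weighting filter; sub fixed-codebook determination; LPC interpolation (304); fixed-codebook pitch enhancement; post pitch enhancement; noise injection into the high band at the decoder; the LTP sinc window; signal decomposition; and so on.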

Description

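Voicing index controls for CELP speech coding
Related applications
This application claims the benefit of U.S. Application Serial No. 60/455,435, filed March 15, 2003, which is incorporated herein by reference in its entirety.
The following related U.S. patent applications, filed on the same day as this application, are incorporated herein by reference:
U.S. Patent Application Serial No. 10/799,533, "SIGNAL DECOMPOSITION OF VOICED SPEECH FOR CELP SPEECH CODING", Attorney Docket No. 0160112.
U.S. Patent Application Serial No. 10/799,505, "SIMPLE NOISE SUPPRESSION MODEL", Attorney Docket No. 0160114.
U.S. Patent Application Serial No. 10/799,460, "ADAPTIVE CORRELATION WINDOW FOR OPEN-LOOP PITCH", Attorney Docket No. 0160115.
U.S. Patent Application Serial No. 10/799,504, "RECOVERING AN ERASED VOICE FRAME WITH TIME WARPING", Attorney Docket No. 0160116.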
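Technical field
The present invention relates generally to speech coding and, more particularly, to code-excited linear prediction (CELP) speech coding.
Background
In general, a speech signal can be band-limited to about 10 kHz without affecting its perception. In telecommunications, however, the speech signal bandwidth is usually limited much more severely. It is well known that the telephone network limits the bandwidth of the speech signal to between 300 Hz and 3400 Hz, which is known as "narrowband". This band limitation gives telephone speech its characteristic sound. Both the 300 Hz lower limit and the 3400 Hz upper limit affect speech quality.
In most digital speech coders, the speech signal is sampled at 8 kHz, resulting in a maximum signal bandwidth of 4 kHz. In practice, however, the signal is usually band-limited to about 3600 Hz at the high end. At the low end, the cut-off frequency is usually between 50 Hz and 200 Hz. This narrowband speech signal, which requires a sampling frequency of 8 kHz, provides a speech quality referred to as toll quality. Although toll quality is sufficient for telephone communication, better quality is needed for emerging applications such as teleconferencing, multimedia services and high-definition television.
By increasing the bandwidth, the communication quality can be improved for such applications. For example, by increasing the sampling frequency to 16 kHz, a wider bandwidth ranging from 50 Hz to about 7000 Hz can be accommodated, which is referred to as "wideband". Extending the lower frequency range to 50 Hz increases naturalness, presence and comfort. At the other end of the spectrum, extending the higher frequency range to 7000 Hz increases intelligibility and makes it easier to differentiate between fricative sounds.
Digitally, speech is synthesized by a well-known approach called analysis-by-synthesis (ABS). Analysis-by-synthesis is also referred to as the closed-loop approach or the waveform-matching approach. At medium or high bit rates it offers relatively better speech coding quality than other approaches. One known ABS approach is code-excited linear prediction (CELP). In CELP coding, speech is synthesized by using encoded excitation information to excite a linear predictive coding (LPC) filter. The output of the LPC filter is compared with the voiced speech and used to adjust the filter parameters in a closed-loop sense until the best parameters, based on the least error, are found. One factor that affects CELP coding is that the voicing degree can vary significantly between different voiced speech segments, which causes unstable perceptual quality in the coded speech.
The present invention addresses this problem of analysis-by-synthesis coding of voiced speech.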
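Summary of the invention
In accordance with the purpose of the present invention as broadly described herein, there are provided systems and methods that use a voicing index to control the speech coding process in order to improve the quality of the synthesized speech.
According to one embodiment of the invention, a voicing index that indicates the periodicity degree of the speech signal is used to control and improve ABS-type speech coding. The periodicity degree can vary significantly between different voiced speech segments, and this variation can cause unstable perceptual quality in analysis-by-synthesis speech coding such as CELP.
By controlling the encoder and/or the decoder, the voicing index can be used to improve quality stability, for example in the following areas: (a) fixed-codebook short-term enhancement, including the spectrum tilt; (b) the perceptual weighting filter; (c) sub fixed-codebook determination; (d) LPC interpolation; (e) fixed-codebook pitch enhancement; (f) post pitch enhancement; (g) noise injection into the high band at the decoder; (h) the LTP sinc window; (i) signal decomposition; and so on. In one embodiment for CELP speech coding, the voicing index may be based on a normalized pitch correlation.
These and other aspects of the invention will become more apparent with further reference to the drawings and the description that follow. All such additional systems, methods, features and advantages are included within this description, are within the scope of the invention, and are protected by the accompanying claims.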
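Brief description of the drawings
FIG. 1 illustrates the frequency-domain characteristics of a sample speech signal;
FIG. 2 illustrates a voicing index classification that both the encoder and the decoder can use;
FIG. 3 illustrates a basic CELP coding block diagram;
FIG. 4 illustrates a CELP coding process that uses an additional adaptive weighting filter for speech enhancement, according to an embodiment of the invention;
FIG. 5 illustrates a decoder implementation with a post-filter configuration, according to an embodiment of the invention;
FIG. 6 illustrates a CELP coding block diagram with several sub-codebooks;
FIG. 7A illustrates the samples used to generate a sinc window;
FIG. 7B illustrates a sinc window.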
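Detailed description
The present application is described herein in terms of functional block components and various processing steps. It should be appreciated that such functional blocks may be realized by any number of hardware and/or software components configured to perform the specified functions. For example, the present application may employ various integrated circuit components, such as memory elements, digital signal processing elements, transmitters, receivers, tone detectors, tone generators, logic elements and the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices. In addition, the present application may employ any number of conventional techniques for data transmission, signaling, signal processing and conditioning, tone generation and detection, and the like. Such general techniques, which are known to those skilled in the art, are not described in detail herein.
The voicing index is traditionally an important index that is sent to the decoder for harmonic speech coding. It generally represents the degree of periodicity of voiced speech and/or a periodic harmonic band boundary. The voicing index is not normally used in CELP coding systems. Embodiments of the present invention, however, use a voicing index to provide control and to improve the quality of the speech synthesized by a CELP or other analysis-by-synthesis coder.
FIG. 1 illustrates the frequency-domain characteristics of a sample speech signal. In this illustration, the wideband spectrum extends from slightly above 0 Hz to about 7.0 kHz. Although the highest possible frequency in the spectrum of a speech signal sampled at 16 kHz is 8.0 kHz (the Nyquist folding frequency), the illustration shows almost no energy in the region between 7.0 kHz and 8.0 kHz. It will be apparent to those skilled in the art that the signal ranges used here are for illustration only and that the principles described herein apply to other signal bands.
As FIG. 1 shows, the speech signal is quite harmonic at lower frequencies, but it does not remain harmonic at higher frequencies, because the probability of a noisy speech signal increases with frequency. In this illustration, for example, the speech signal becomes noisy at higher frequencies, such as above 5.0 kHz. This noisy signal makes waveform matching at higher frequencies very difficult. Consequently, ABS coding techniques such as CELP become unreliable if high-quality speech is required. In a CELP coder, for example, the synthesizer is designed to match the original speech signal by minimizing the error between the original speech and the synthesized speech. Because a noisy signal is unpredictable, error minimization becomes very difficult.
In view of the above problem, embodiments of the present invention use a voicing index, sent from the encoder to the decoder, to improve the quality of the speech synthesized by an ABS-type speech coder such as a CELP coder.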
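The voicing index, which the encoder transmits to the decoder, may represent the periodicity of the voiced speech or the harmonic structure of the signal. In another embodiment, the voicing index may be represented by three bits to provide eight classes of speech signal. For example, FIG. 2 illustrates a voicing index classification that both the encoder and the decoder can use. In this illustration, index 0 ("000") may indicate background noise, index 1 ("001") may indicate a noise-like or unvoiced speech signal, index 2 ("010") may indicate an irregular voiced signal, such as a voiced signal during an onset, and indices 3 to 7 ("011" to "111") may each indicate the periodicity of the speech signal. For example, index 3 ("011") may represent the least periodic signal and index 7 ("111") the most periodic signal.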
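The FIG. 2 classification can be written out as a small lookup. The following Python sketch is illustrative only: the function names and the "level N of 5" wording are assumptions, while the index-to-class assignments follow the paragraph above.

```python
def describe_voicing_index(index: int) -> str:
    """Map a 3-bit voicing index (0..7) to the class named in the description."""
    if not 0 <= index <= 7:
        raise ValueError("voicing index must fit in three bits (0..7)")
    if index == 0:
        return "background noise"
    if index == 1:
        return "noise-like or unvoiced speech"
    if index == 2:
        return "irregular voiced speech (e.g. onset)"
    # Indices 3..7 rank periodicity from least (3) to most (7).
    level = index - 2  # 1..5, hypothetical numbering used for display only
    return f"periodic voiced speech, periodicity level {level} of 5"


def to_bits(index: int) -> str:
    """Render the index as the 3-bit field carried in each frame."""
    if not 0 <= index <= 7:
        raise ValueError("voicing index must fit in three bits (0..7)")
    return format(index, "03b")
```

For instance, `to_bits(3)` gives `"011"`, the bit pattern the description associates with the least periodic voiced class.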
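The voicing index information may be transmitted by the encoder as part of each encoded frame. In other words, each frame may include voicing index bits (for example, three bits) that indicate the degree of periodicity of that particular frame. In one embodiment, the voicing index for CELP may be based on a normalized pitch correlation parameter, $R_p$, and may be derived from the expression $10\log(1 - R_p)^2$, where $-1.0 < R_p < 1.0$.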
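As a rough illustration of how a periodic class might be derived from the normalized pitch correlation, the Python sketch below computes Rp for a given lag, evaluates 10 log (1 - Rp)^2, and quantizes the result. The base-10 logarithm, the numerical floor, and the dB thresholds are all assumptions; the description gives only the expression itself.

```python
import math


def normalized_pitch_correlation(signal, lag):
    """Normalized correlation between the signal and itself delayed by `lag`."""
    cur = signal[lag:]
    past = signal[:-lag]
    num = sum(c * p for c, p in zip(cur, past))
    den = math.sqrt(sum(c * c for c in cur) * sum(p * p for p in past))
    return num / den if den > 0.0 else 0.0


def voicing_measure_db(rp, floor=1e-3):
    """Evaluate 10*log10((1 - Rp)^2); more negative means more periodic.

    `floor` is an assumption that keeps the logarithm finite as Rp -> 1.
    """
    return 10.0 * math.log10(max(1.0 - rp, floor) ** 2)


def periodic_class(rp, thresholds_db=(-4.0, -8.0, -12.0, -16.0)):
    """Map Rp to a voicing index in 3..7 using hypothetical dB thresholds."""
    measure = voicing_measure_db(rp)
    return 3 + sum(1 for t in thresholds_db if measure <= t)
```

A real coder would also need rules for classes 0 to 2 (background noise, unvoiced, irregular voiced), which this sketch leaves out.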
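In one example, the voicing index may be used for fixed-codebook short-term enhancement, including the spectrum tilt. FIG. 3 illustrates a basic CELP coding block diagram. As shown, CELP coding block 300 comprises fixed codebook 301, gain block 302, pitch filter block 303 and LPC filter 304. CELP coding block 300 further comprises comparison block 306, weighting filter block 320 and mean-squared error (MSE) computation block 308.
The basic idea behind CELP coding is that input speech 307 is compared with synthesized output 305 to generate error 309, which is a mean-squared error. The computation continues in a closed-loop sense, with new coding parameters selected each time, until error 309 is minimized.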
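The closed-loop search described above can be sketched in a few lines of Python. This is a deliberately reduced model, not the coder of FIG. 3: it omits the pitch filter and the weighting filter, uses a tiny hypothetical codebook, and solves for the gain in closed form rather than quantizing it.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis 1/A(z), A(z) = 1 + sum_k lpc[k] z^-(k+1).

    y[n] = x[n] - sum_k lpc[k] * y[n-1-k]
    """
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lpc):
    """Return (best_index, best_gain, best_error) over all code vectors."""
    best = (None, 0.0, float("inf"))
    for index, code in enumerate(codebook):
        synth = synthesize(code, lpc)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        # Least-squares gain for this code vector.
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if error < best[2]:
            best = (index, gain, error)
    return best
```

Each candidate is synthesized and compared against the target, and the entry with the smallest squared error wins, which is the analysis-by-synthesis principle in miniature.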
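At the receiving side, the decoder synthesizes the speech using similar blocks 301-304 (see FIG. 5). Thus, the encoder passes to the decoder the information needed to select the proper codebook entry, gain, filters and so on.
In a CELP speech coding system, when the speech signal is more periodic, the contribution of the pitch filter (e.g., 303) is stronger than that of the fixed codebook (e.g., 301). Embodiments of the invention can therefore use the voicing index to place more emphasis on the high-frequency region by implementing an adaptive high-pass filter that is controlled by the value of the voicing index. A configuration such as that shown in FIG. 4 may be implemented. For example, adaptive filter 310 may be an adaptive filter that emphasizes the power in the high-frequency region. In this illustration, weighting filter 420 may also be an adaptive filter that improves the CELP coding process.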
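One simple way to picture a voicing-controlled high-pass filter is a first-order tilt filter whose coefficient grows with the voicing index. The Python sketch below assumes that form; the filter order, the coefficient values, and the function names are hypothetical and are not taken from the description.

```python
def high_pass_coefficient(voicing_index):
    """Hypothetical mapping: more periodic frames get stronger high-frequency emphasis."""
    if not 0 <= voicing_index <= 7:
        raise ValueError("voicing index must be in 0..7")
    if voicing_index < 3:
        return 0.0  # no extra tilt for noise, unvoiced or irregular frames
    return 0.1 * (voicing_index - 2)  # 0.1 .. 0.5 for indices 3 .. 7


def adaptive_high_pass(code_vector, voicing_index):
    """Apply y[n] = x[n] - mu * x[n-1] to a fixed-codebook vector."""
    mu = high_pass_coefficient(voicing_index)
    out = []
    prev = 0.0
    for x in code_vector:
        out.append(x - mu * prev)
        prev = x
    return out
```

With `mu = 0` the vector passes through unchanged; larger `mu` attenuates low frequencies and so shifts the spectrum tilt toward the high band.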
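On the decoder side, the voicing index may be used to select appropriate parameters for post filter 520. FIG. 5 illustrates a decoder implementation with a post-filter configuration. In one or more embodiments, post filter 520 may have several configurations stored in a table, which can be selected using the information in the voicing index.
In another example, the voicing index may be used together with the perceptual weighting filter of CELP. The perceptual weighting filter may be represented, for example, by adaptive filter 420 in FIG. 4. As is well known, waveform matching by mean-squared-error minimization minimizes the error in the most important (that is, high-energy) portions of the speech signal and ignores the low-energy regions. Embodiments of the invention use an adaptive weighting process to improve the low-energy regions. For example, the voicing index may be used to define how aggressive weighting filter 420 is, depending on the degree of periodicity of the frame.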
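A common form for a CELP perceptual weighting filter is W(z) = A(z/g1) / A(z/g2), where A(z/g) is obtained by scaling the k-th LPC coefficient by g^k. The description does not state which form filter 420 takes, so the Python sketch below assumes this one, and the mapping from voicing index to g2 is purely hypothetical.

```python
def bandwidth_expand(lpc, gamma):
    """Return the coefficients of A(z/gamma): a_k is scaled by gamma**k, k = 1..p."""
    return [a * gamma ** (k + 1) for k, a in enumerate(lpc)]


def weighting_filter(lpc, voicing_index):
    """Return (numerator, denominator) coefficients of W(z) = A(z/g1) / A(z/g2).

    g1 is fixed; g2 depends on the voicing index (hypothetical mapping).
    """
    if not 0 <= voicing_index <= 7:
        raise ValueError("voicing index must be in 0..7")
    g1 = 0.9
    g2 = 0.4 + 0.04 * voicing_index  # 0.40 .. 0.68, assumed values
    return bandwidth_expand(lpc, g1), bandwidth_expand(lpc, g2)
```

The gap between g1 and g2 sets how strongly the error spectrum is shaped, which is one way to read the "aggressiveness" mentioned above.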
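In another embodiment, shown in FIG. 6, the voicing index may be used to determine the sub fixed-codebook. A fixed codebook may comprise several sub fixed-codebooks, for example sub fixed-codebook 601 with fewer pulses but higher position resolution, sub fixed-codebook 602 with more pulses but lower position resolution, and noise sub-codebook 603. Thus, if the voicing index indicates a noisy signal, sub-codebook 602 or noise codebook 603 may be used; if it does not, one of the sub-codebooks (for example, 601 or 602) may be used depending on the degree of periodicity of the given frame. Note that, in one or more embodiments, gain block (codebook) 302 may also be applied to each sub-codebook individually.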
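The sub-codebook choice reduces to a decision on the voicing index. The Python sketch below shows one possible rule; the cut points between classes and the string labels are assumptions, and only the general direction (noisy frames toward 602/603, periodic frames toward 601/602) comes from the paragraph above.

```python
def select_subcodebook(voicing_index):
    """Pick a sub fixed-codebook label from the voicing index (hypothetical cut points)."""
    if not 0 <= voicing_index <= 7:
        raise ValueError("voicing index must be in 0..7")
    if voicing_index <= 1:
        return "noise_603"        # background noise or noise-like speech
    if voicing_index <= 4:
        return "many_pulses_602"  # irregular or weakly periodic frames
    return "few_pulses_601"       # strongly periodic frames
```

Because the decoder receives the same index, it can make the same selection without any extra signalling bits.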
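In addition, the voicing index may be used together with LPC interpolation. For example, during linear interpolation, if the interpolated LPC lies midway between the previous LPC and the current LPC, the previous LPC and the current LPC are equally important. If the voicing index indicates, for example, that the previous frame is unvoiced and the current frame is voiced, then the LPC interpolation algorithm may favor the current frame over the previous frame.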
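The biased interpolation can be sketched as a linear blend whose weight shifts toward the current frame at an unvoiced-to-voiced transition. In the Python sketch below the amount of bias (0.25) and the index thresholds are assumptions; in practice the vectors would normally be in an interpolation-friendly representation such as line spectral frequencies rather than raw LPC coefficients.

```python
def interpolate_lpc(previous, current, position, prev_index, cur_index):
    """Blend two parameter vectors; `position` in [0, 1] is the weight on `current`.

    When the previous frame is unvoiced (index <= 1) and the current frame is
    voiced (index >= 2), the weight is pushed toward the current frame.
    """
    weight = position
    if prev_index <= 1 and cur_index >= 2:
        weight = min(1.0, position + 0.25)  # hypothetical bias
    return [(1.0 - weight) * p + weight * c for p, c in zip(previous, current)]
```

At the midpoint of the frame, a voiced-to-voiced pair is blended 50/50, while an unvoiced-to-voiced pair is blended 25/75 in favor of the current frame.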
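The voicing index may also be used for fixed-codebook pitch enhancement. Typically, the previous pitch gain is used to perform pitch enhancement. The voicing index, however, provides information about the current frame and is therefore a better indicator than the previous pitch gain. The magnitude of the pitch enhancement may be determined from the voicing index. In other words, the more periodic the frame (according to the voicing index value), the greater the magnitude of the enhancement. For example, the voicing index may be used together with U.S. Patent Application Serial No. 09/365,444, filed August 2, 1999, which is incorporated herein by reference, to determine the enhancement magnitude in the bidirectional pitch enhancement system defined therein.
As a further example, the voicing index may be used in place of the pitch gain for post pitch enhancement. This is advantageous because, as noted above, the voicing index can be derived from the normalized pitch correlation value, $R_p$, which typically lies between 0.0 and 1.0, whereas the pitch gain can exceed 1.0 and adversely affect the post pitch enhancement process.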
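To make the bounded-gain point concrete, the Python sketch below derives an enhancement gain from the voicing index, so it can never exceed a chosen maximum, and applies it in a forward-only comb c'[n] = c[n] + g c[n - T]. The gain mapping, the 0.8 ceiling, and the one-directional structure are assumptions; the bidirectional system of the referenced application is not reproduced here.

```python
def enhancement_gain(voicing_index, max_gain=0.8):
    """Hypothetical gain in [0, max_gain], rising with periodicity (indices 3..7)."""
    if not 0 <= voicing_index <= 7:
        raise ValueError("voicing index must be in 0..7")
    if voicing_index < 3:
        return 0.0
    return max_gain * (voicing_index - 2) / 5.0


def pitch_enhance(code_vector, lag, voicing_index):
    """Forward pitch enhancement: add a scaled copy delayed by `lag` samples."""
    gain = enhancement_gain(voicing_index)
    out = list(code_vector)
    for n in range(lag, len(out)):
        out[n] += gain * code_vector[n - lag]
    return out
```

Because the gain is a function of the index alone, the encoder and decoder stay in step even if the decoded pitch gain is large.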
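As another example, the voicing index may also be used to determine the amount of noise to be injected into the high band at the decoder side. This embodiment may be used when the input speech is decomposed into a voiced portion and a noise portion, as discussed in U.S. Patent Application Serial No. 10/799,533, filed concurrently herewith and entitled "SIGNAL DECOMPOSITION OF VOICED SPEECH FOR CELP SPEECH CODING", which is incorporated herein by reference.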
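A minimal Python sketch of voicing-controlled noise injection follows. It assumes the high-band samples are already available as a list and that less periodic frames receive more noise; the scale mapping, the uniform noise source, and the fixed seed are all hypothetical choices for illustration.

```python
import random


def noise_scale(voicing_index, max_scale=0.5):
    """Hypothetical mapping: index 7 (most periodic) gets no noise, index 0 the most."""
    if not 0 <= voicing_index <= 7:
        raise ValueError("voicing index must be in 0..7")
    return max_scale * (7 - voicing_index) / 7.0


def inject_noise(high_band, voicing_index, seed=0):
    """Add uniform noise in [-scale, scale] to each high-band sample."""
    rng = random.Random(seed)
    scale = noise_scale(voicing_index)
    return [x + scale * rng.uniform(-1.0, 1.0) for x in high_band]
```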
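The voicing index may also be used to control modification of the sinc window. The sinc window is used to generate the adaptive codebook contribution vector, that is, the LTP excitation vector, when CELP coding uses a fractional pitch lag. In wideband speech coding, it is known that strong harmonics appear in the low-frequency region of the band and noisy signals appear in the high-frequency region.
Long-term prediction, or LTP, produces the harmonics by taking the previous excitation and copying it into the current subframe according to the pitch period. Note that if the previous excitation is simply copied, the harmonics are also replicated at the high-frequency end of the spectrum. This is not an accurate representation of a true voiced signal, especially in wideband speech coding.
In one embodiment, for wideband speech signals, an adaptive low-pass filter is applied to the sinc interpolation window when the previous signal is used to represent the current signal, because of the high probability of noise in the high-frequency region.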
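In CELP coding, the fixed codebook contributes to the noisy or irregular portion of the speech signal, and the pitch adaptive codebook contributes to the voiced or regular portion. The adaptive codebook contribution is generated using a sinc window, which is needed because the pitch lag can be fractional. If the pitch lag were an integer, one excitation signal could be copied to the next; because the pitch lag is fractional, however, a straight copy of the previous excitation signal does not work. Once the sinc window has been modified, a straight copy does not work even for an integer pitch lag. To generate the pitch contribution, several samples are taken, as shown in FIG. 7A, weighted and then added together; the weights for the samples are called the sinc window, which is originally symmetric in shape, as shown in FIG. 7B. In practice the shape depends on the fractional part of the pitch lag and on the adaptive low-pass filter applied to the sinc window. Applying the sinc window is similar to convolution or filtering, but the sinc window is a non-causal filter. In the expression below, the window signal $w(n)$ is convolved with the signal $s(n)$ in the time domain, which is equivalent to multiplying the window spectrum $W(\omega)$ by the signal spectrum $S(\omega)$ in the frequency domain:
$$U_{ACB}(n) = w(n) * s(n) \;\leftrightarrow\; W(\omega)\,S(\omega)$$
According to this expression, low-pass filtering the sinc window is equivalent to low-pass filtering the final adaptive codebook contribution, $U_{ACB}(n)$, or the excitation signal; low-pass filtering the sinc window is advantageous, however, because the sinc window is shorter than the excitation. Modifying the sinc window is therefore easier than modifying the excitation; moreover, the filtered sinc window can be precomputed and stored.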
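In one embodiment of the invention, the voicing index may be used to provide the information that controls the modification of the low-pass filter for the sinc window. For example, the voicing index may indicate whether the harmonic structure is strong or weak. If the harmonic structure is strong, a weak low-pass filter is applied to the sinc window; if the harmonic structure is weak, a strong low-pass filter is applied.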
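One way to fold a low-pass filter into the interpolation weights is to narrow the bandwidth of the sinc itself. The Python sketch below builds a tapered sinc window for a fractional offset, with a cutoff (as a fraction of the Nyquist frequency) chosen from the voicing index. The Hann taper, the window length, the unity-gain normalization, and the cutoff mapping are assumptions rather than details from the description.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_window(frac, cutoff=1.0, half_len=8):
    """Weights for taps k = -half_len+1 .. half_len around offset `frac` in [0, 1).

    `cutoff` < 1 low-passes the window; the Hann taper is an assumed choice.
    """
    weights = []
    for k in range(-half_len + 1, half_len + 1):
        t = k - frac
        taper = 0.5 * (1.0 + math.cos(math.pi * t / half_len))
        weights.append(cutoff * sinc(cutoff * t) * taper)
    total = sum(weights)
    return [w / total for w in weights]  # normalize to unity gain at DC


def cutoff_for(voicing_index):
    """Hypothetical mapping: strong harmonics (7) -> weak low-pass (cutoff 1.0)."""
    return 0.6 + 0.4 * (max(voicing_index, 3) - 3) / 4.0
```

Since the window depends only on the fractional lag and the cutoff, a table of windows for each (fraction, voicing index) pair could be computed once and stored, in line with the precomputation remark above.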
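Although the above embodiments of the invention are described with reference to wideband speech signals, the invention applies equally to narrowband speech signals.
The methods and systems presented above may reside in software, hardware or firmware on a device, and may be implemented on a microprocessor, a digital signal processor, an application-specific IC or a field-programmable gate array ("FPGA"), or any combination thereof, without departing from the spirit of the invention. Furthermore, the invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive.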

Claims (45)

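1. A method of improving synthesized speech quality in a speech coding system comprising an encoder and a decoder, the method comprising:
obtaining an input speech signal;
encoding the input speech with a code-excited linear prediction (CELP) coder to generate CELP coding parameters for synthesis of the input speech signal;
generating a plurality of CELP speech frames, each of the plurality of CELP speech frames including the CELP coding parameters;
generating a voicing index, wherein the voicing index indicates one of a plurality of classifications of the input speech signal, each of the plurality of classifications representing a different degree of periodicity of the input speech signal; and
transmitting the voicing index to the decoder as part of each of the plurality of CELP speech frames to improve the synthesis of the input speech signal.
2. The method of claim 1, wherein the plurality of classifications of the input speech signal comprise a background noise class, an unvoiced class, a first voiced class and a second voiced class, the first voiced class having a lower degree of periodicity than the second voiced class.
3. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive high-pass filter.
4. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive perceptual weighting filter.
5. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive sinc window.
6. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control a spectrum tilt of the input speech signal through short-term enhancement of a fixed codebook.
7. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control a perceptual weighting filter.
8. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control a linear predictive coder.
9. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control a pitch-enhancement fixed codebook.
10. The method of claim 1, wherein the voicing index is transmitted from the encoder to the decoder to control post pitch enhancement.
11. The method of claim 1, wherein the voicing index is used by the decoder to select at least one sub-codebook from a plurality of sub-codebooks.
12. The method of claim 1, wherein the voicing index has a plurality of bits indicating a classification of each of the plurality of CELP speech frames.
13. The method of claim 12, wherein the plurality of bits is three bits.
14. The method of claim 12, wherein the classification indicates the periodicity of the input speech signal.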
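15. A method of improving synthesized speech quality in a speech coding system comprising an encoder and a decoder, the method comprising:
receiving, by the decoder, a plurality of code-excited linear prediction (CELP) speech frames from the encoder;
obtaining, by the decoder, a plurality of CELP coding parameters by decoding each of the plurality of CELP speech frames;
obtaining, by the decoder, a voicing index by decoding each of the plurality of CELP speech frames, for use by the decoder in improving synthesis of the input speech signal, wherein the voicing index indicates one of a plurality of classifications of the input speech signal, each of the plurality of classifications representing a different degree of periodicity of the input speech signal; and
generating, by the decoder, a synthesized version of the input speech signal using the plurality of CELP coding parameters and the voicing index.
16. The method of claim 15, wherein the plurality of classifications of the input speech signal comprise a background noise class, an unvoiced class, a first voiced class and a second voiced class, the first voiced class having a lower degree of periodicity than the second voiced class.
17. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive high-pass filter.
18. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive perceptual weighting filter.
19. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive sinc window for the pitch contribution.
20. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control a spectrum tilt of the input speech signal through short-term enhancement of a fixed codebook.
21. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control a linear predictive coding filter.
22. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control a pitch-enhancement fixed codebook.
23. The method of claim 15, wherein the voicing index is transmitted from the encoder to the decoder to control post pitch enhancement.
24. The method of claim 15, wherein the decoder uses the voicing index to select at least one sub-codebook from a plurality of sub-codebooks.
25. The method of claim 15, wherein the voicing index has a plurality of bits indicating a classification of each of the plurality of CELP speech frames.
26. The method of claim 25, wherein the plurality of bits is three bits.
27. The method of claim 25, wherein the classification indicates the periodicity of the input speech signal.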
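28. An encoder for improving synthesized speech quality of an input speech signal, the encoder comprising:
a receiver for receiving the input speech signal;
a code-excited linear prediction (CELP) coder for generating CELP coding parameters for synthesis of the input speech signal, for generating a plurality of CELP speech frames, each of the plurality of CELP speech frames including the CELP coding parameters, and further for generating a voicing index indicating one of a plurality of classifications of the input speech signal, each of the plurality of classifications representing a different degree of periodicity of the input speech signal;
a transmitter for transmitting the voicing index to the decoder as part of each of the plurality of CELP speech frames, for use in improving the synthesis of the input speech signal.
29. The encoder of claim 28, wherein the plurality of classifications of the input speech signal comprise a background noise class, an unvoiced class, a first voiced class and a second voiced class, the first voiced class having a lower degree of periodicity than the second voiced class.
30. The encoder of claim 28, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive high-pass filter.
31. The encoder of claim 28, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive perceptual weighting filter.
32. The encoder of claim 28, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive sinc window.
33. The encoder of claim 28, wherein the voicing index is used by the decoder to select at least one sub-codebook from a plurality of sub-codebooks.
34. The encoder of claim 28, wherein the voicing index has a plurality of bits indicating a classification of each of the plurality of CELP speech frames.
35. The encoder of claim 34, wherein the plurality of bits is three bits.
36. The encoder of claim 34, wherein the classification indicates a noisy speech signal.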
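37. A decoder for improving synthesized speech quality of an input speech signal, the decoder comprising:
a receiver for receiving from an encoder a plurality of code-excited linear prediction (CELP) speech frames based on the input speech signal,
wherein the decoder obtains a plurality of CELP coding parameters by decoding each of the plurality of CELP speech frames, and wherein the decoder obtains a voicing index by decoding each of the plurality of CELP speech frames, the voicing index indicating one of a plurality of classifications of the input speech signal, each of the plurality of classifications representing a different degree of periodicity of the input speech signal,
wherein the decoder generates a synthesized version of the input speech signal using the plurality of CELP coding parameters and the voicing index.
38. The decoder of claim 37, wherein the plurality of classifications of the input speech signal comprise a background noise class, an unvoiced class, a first voiced class and a second voiced class, the first voiced class having a lower degree of periodicity than the second voiced class.
39. The decoder of claim 37, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive high-pass filter.
40. The decoder of claim 37, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive perceptual weighting filter.
41. The decoder of claim 37, wherein the voicing index is transmitted from the encoder to the decoder to control an adaptive sinc window for the pitch contribution.
42. The decoder of claim 37, wherein the decoder uses the voicing index to select at least one sub-codebook from a plurality of sub-codebooks.
43. The decoder of claim 37, wherein the voicing index has a plurality of bits indicating a classification of each of the plurality of CELP speech frames.
44. The decoder of claim 43, wherein the classification indicates a periodicity index.
45. The decoder of claim 43, wherein the periodicity index ranges from a low periodicity index to a high periodicity index.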
CN2004800060153A 2003-03-15 2004-03-11 Voicing index controls for CELP speech coding Expired - Fee Related CN1757060B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US45543503P 2003-03-15 2003-03-15
US60/455,435 2003-03-15
PCT/US2004/007581 WO2004084180A2 (en) 2003-03-15 2004-03-11 Voicing index controls for celp speech coding

Publications (2)

Publication Number Publication Date
CN1757060A CN1757060A (zh) 2006-04-05
CN1757060B true CN1757060B (zh) 2012-08-15

Family

ID=33029999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2004800060153A Expired - Fee Related CN1757060B (zh) 2003-03-15 2004-03-11 Voicing index controls for CELP speech coding

Country Status (4)

Country Link
US (5) US7024358B2 (zh)
EP (2) EP1604354A4 (zh)
CN (1) CN1757060B (zh)
WO (5) WO2004084179A2 (zh)

Families Citing this family (95)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7742927B2 (en) * 2000-04-18 2010-06-22 France Telecom Spectral enhancing method and device
US20030187663A1 (en) 2002-03-28 2003-10-02 Truman Michael Mead Broadband frequency translation for high frequency regeneration
JP4178319B2 (ja) * 2002-09-13 2008-11-12 インターナショナル・ビジネス・マシーンズ・コーポレーション 音声処理におけるフェーズ・アライメント
US7933767B2 (en) * 2004-12-27 2011-04-26 Nokia Corporation Systems and methods for determining pitch lag for a current frame of information
US7702502B2 (en) * 2005-02-23 2010-04-20 Digital Intelligence, L.L.C. Apparatus for signal decomposition, analysis and reconstruction
US20060282264A1 (en) * 2005-06-09 2006-12-14 Bellsouth Intellectual Property Corporation Methods and systems for providing noise filtering using speech recognition
KR101116363B1 (ko) * 2005-08-11 2012-03-09 삼성전자주식회사 음성신호 분류방법 및 장치, 및 이를 이용한 음성신호부호화방법 및 장치
EP1772855B1 (en) * 2005-10-07 2013-09-18 Nuance Communications, Inc. Method for extending the spectral bandwidth of a speech signal
US7720677B2 (en) * 2005-11-03 2010-05-18 Coding Technologies Ab Time warped modified transform coding of audio signals
JP3981399B1 (ja) * 2006-03-10 2007-09-26 松下電器産業株式会社 固定符号帳探索装置および固定符号帳探索方法
KR100900438B1 (ko) * 2006-04-25 2009-06-01 삼성전자주식회사 음성 패킷 복구 장치 및 방법
US8010350B2 (en) * 2006-08-03 2011-08-30 Broadcom Corporation Decimated bisectional pitch refinement
US8239190B2 (en) * 2006-08-22 2012-08-07 Qualcomm Incorporated Time-warping frames of wideband vocoder
EP2063418A4 (en) * 2006-09-15 2010-12-15 Panasonic Corp AUDIO CODING DEVICE AND AUDIO CODING METHOD
GB2444757B (en) * 2006-12-13 2009-04-22 Motorola Inc Code excited linear prediction speech coding
US7521622B1 (en) 2007-02-16 2009-04-21 Hewlett-Packard Development Company, L.P. Noise-resistant detection of harmonic segments of audio signals
ES2533626T3 (es) * 2007-03-02 2015-04-13 Telefonaktiebolaget L M Ericsson (Publ) Métodos y adaptaciones en una red de telecomunicaciones
GB0704622D0 (en) * 2007-03-09 2007-04-18 Skype Ltd Speech coding system and method
CN101320565B (zh) * 2007-06-08 2011-05-11 华为技术有限公司 感知加权滤波方法及感知加权滤波器
CN101321033B (zh) * 2007-06-10 2011-08-10 华为技术有限公司 帧补偿方法及系统
US8868417B2 (en) * 2007-06-15 2014-10-21 Alon Konchitsky Handset intelligibility enhancement system using adaptive filters and signal buffers
US20080312916A1 (en) * 2007-06-15 2008-12-18 Mr. Alon Konchitsky Receiver Intelligibility Enhancement System
US8015002B2 (en) 2007-10-24 2011-09-06 Qnx Software Systems Co. Dynamic noise reduction using linear model fitting
US8606566B2 (en) * 2007-10-24 2013-12-10 Qnx Software Systems Limited Speech enhancement through partial speech reconstruction
US8326617B2 (en) 2007-10-24 2012-12-04 Qnx Software Systems Limited Speech enhancement with minimum gating
US8296136B2 (en) * 2007-11-15 2012-10-23 Qnx Software Systems Limited Dynamic controller for improving speech intelligibility
EP2242048B1 (en) * 2008-01-09 2017-06-14 LG Electronics Inc. Method and apparatus for identifying frame type
CN101483495B (zh) * 2008-03-20 2012-02-15 华为技术有限公司 一种背景噪声生成方法以及噪声处理装置
FR2929466A1 (fr) * 2008-03-28 2009-10-02 France Telecom Dissimulation d'erreur de transmission dans un signal numerique dans une structure de decodage hierarchique
US8768690B2 (en) 2008-06-20 2014-07-01 Qualcomm Incorporated Coding scheme selection for low-bit-rate applications
US20090319261A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
US20090319263A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
MY154452A (en) * 2008-07-11 2015-06-15 Fraunhofer Ges Forschung An apparatus and a method for decoding an encoded audio signal
CA2699316C (en) * 2008-07-11 2014-03-18 Max Neuendorf Apparatus and method for calculating bandwidth extension data using a spectral tilt controlled framing
KR101400484B1 (ko) 2008-07-11 2014-05-28 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. 시간 워프 활성 신호의 제공 및 이를 이용한 오디오 신호의 인코딩
US8407046B2 (en) * 2008-09-06 2013-03-26 Huawei Technologies Co., Ltd. Noise-feedback for spectral envelope quantization
WO2010028292A1 (en) * 2008-09-06 2010-03-11 Huawei Technologies Co., Ltd. Adaptive frequency prediction
WO2010028297A1 (en) 2008-09-06 2010-03-11 GH Innovation, Inc. Selective bandwidth extension
US8515747B2 (en) * 2008-09-06 2013-08-20 Huawei Technologies Co., Ltd. Spectrum harmonic/noise sharpness control
WO2010031003A1 (en) 2008-09-15 2010-03-18 Huawei Technologies Co., Ltd. Adding second enhancement layer to celp based core layer
US8577673B2 (en) * 2008-09-15 2013-11-05 Huawei Technologies Co., Ltd. CELP post-processing for music signals
CN101599272B (zh) * 2008-12-30 2011-06-08 华为技术有限公司 基音搜索方法及装置
GB2466668A (en) * 2009-01-06 2010-07-07 Skype Ltd Speech filtering
CN102016530B (zh) * 2009-02-13 2012-11-14 华为技术有限公司 一种基音周期检测方法和装置
CN102483926B (zh) 2009-07-27 2013-07-24 Scti控股公司 在处理语音信号中通过把语音作为目标和忽略噪声以降噪的系统及方法
ES2453098T3 (es) * 2009-10-20 2014-04-04 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Códec multimodo de audio
KR101666521B1 (ko) * 2010-01-08 2016-10-14 삼성전자 주식회사 입력 신호의 피치 주기 검출 방법 및 그 장치
US8321216B2 (en) * 2010-02-23 2012-11-27 Broadcom Corporation Time-warping of audio signals for packet loss concealment avoiding audible artifacts
US8473287B2 (en) 2010-04-19 2013-06-25 Audience, Inc. Method for jointly optimizing noise reduction and voice quality in a mono or multi-microphone system
US8538035B2 (en) 2010-04-29 2013-09-17 Audience, Inc. Multi-microphone robust noise suppression
US8798290B1 (en) 2010-04-21 2014-08-05 Audience, Inc. Systems and methods for adaptive signal equalization
US8781137B1 (en) 2010-04-27 2014-07-15 Audience, Inc. Wind noise detection and suppression
US9245538B1 (en) * 2010-05-20 2016-01-26 Audience, Inc. Bandwidth enhancement of speech signals assisted by noise reduction
US8447595B2 (en) * 2010-06-03 2013-05-21 Apple Inc. Echo-related decisions on automatic gain control of uplink speech signal in a communications device
US20110300874A1 (en) * 2010-06-04 2011-12-08 Apple Inc. System and method for removing tdma audio noise
US8447596B2 (en) 2010-07-12 2013-05-21 Audience, Inc. Monaural noise suppression based on computational auditory scene analysis
US8560330B2 (en) 2010-07-19 2013-10-15 Futurewei Technologies, Inc. Energy envelope perceptual correction for high band coding
US9047875B2 (en) 2010-07-19 2015-06-02 Futurewei Technologies, Inc. Spectrum flatness control for bandwidth extension
EP2645365B1 (en) * 2010-11-24 2018-01-17 LG Electronics Inc. Speech signal encoding method and speech signal decoding method
CN102201240B (zh) * 2011-05-27 2012-10-03 中国科学院自动化研究所 基于逆滤波的谐波噪声激励模型声码器
US8781023B2 (en) * 2011-11-01 2014-07-15 At&T Intellectual Property I, L.P. Method and apparatus for improving transmission of data on a bandwidth expanded channel
US8774308B2 (en) 2011-11-01 2014-07-08 At&T Intellectual Property I, L.P. Method and apparatus for improving transmission of data on a bandwidth mismatched channel
SI2774145T1 (sl) * 2011-11-03 2020-10-30 Voiceage Evs Llc Izboljšane negovorne vsebine v celp dekoderju z nizko frekvenco
WO2013096875A2 (en) * 2011-12-21 2013-06-27 Huawei Technologies Co., Ltd. Adaptively encoding pitch lag for voiced speech
US9972325B2 (en) * 2012-02-17 2018-05-15 Huawei Technologies Co., Ltd. System and method for mixed codebook excitation for speech coding
CN105976830B (zh) 2013-01-11 2019-09-20 华为技术有限公司 音频信号编码和解码方法、音频信号编码和解码装置
JP6218855B2 (ja) * 2013-01-29 2017-10-25 フラウンホーファーゲゼルシャフト ツール フォルデルング デル アンゲヴァンテン フォルシユング エー.フアー. 摩擦音または破擦音のオンセットまたはオフセットの時間的近接性における増大した時間分解能を使用するオーディオエンコーダ、オーディオデコーダ、システム、方法およびコンピュータプログラム
EP2830053A1 (en) * 2013-07-22 2015-01-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Multi-channel audio decoder, multi-channel audio encoder, methods and computer program using a residual-signal-based adjustment of a contribution of a decorrelated signal
US9418671B2 (en) * 2013-08-15 2016-08-16 Huawei Technologies Co., Ltd. Adaptive high-pass post-filter
SG10201609186UA (en) 2013-10-31 2016-12-29 Fraunhofer Ges Forschung Audio Decoder And Method For Providing A Decoded Audio Information Using An Error Concealment Modifying A Time Domain Excitation Signal
CN104637486B (zh) * 2013-11-07 2017-12-29 华为技术有限公司 一种数据帧的内插方法及装置
US9570095B1 (en) * 2014-01-17 2017-02-14 Marvell International Ltd. Systems and methods for instantaneous noise estimation
US9928850B2 (en) * 2014-01-24 2018-03-27 Nippon Telegraph And Telephone Corporation Linear predictive analysis apparatus, method, program and recording medium
PL3462453T3 (pl) * 2014-01-24 2020-10-19 Nippon Telegraph And Telephone Corporation Urządzenie, sposób i program do analizy liniowo-predykcyjnej oraz nośnik zapisu
US9524735B2 (en) * 2014-01-31 2016-12-20 Apple Inc. Threshold adaptation in two-channel noise estimation and voice activity detection
US9697843B2 (en) * 2014-04-30 2017-07-04 Qualcomm Incorporated High band excitation signal generation
US9467779B2 (en) 2014-05-13 2016-10-11 Apple Inc. Microphone partial occlusion detector
US10149047B2 (en) * 2014-06-18 2018-12-04 Cirrus Logic Inc. Multi-aural MMSE analysis techniques for clarifying audio signals
CN105335592A (zh) * 2014-06-25 2016-02-17 国际商业机器公司 生成时间数据序列的缺失区段中的数据的方法和设备
FR3024582A1 (fr) * 2014-07-29 2016-02-05 Orange Gestion de la perte de trame dans un contexte de transition fd/lpd
CN107113357B (zh) * 2014-12-23 2021-05-28 杜比实验室特许公司 与语音质量估计相关的改进方法和设备
US11295753B2 (en) 2015-03-03 2022-04-05 Continental Automotive Systems, Inc. Speech quality under heavy noise conditions in hands-free communication
US10847170B2 (en) 2015-06-18 2020-11-24 Qualcomm Incorporated Device and method for generating a high-band signal from non-linearly processed sub-ranges
US9837089B2 (en) * 2015-06-18 2017-12-05 Qualcomm Incorporated High-band signal generation
US9685170B2 (en) * 2015-10-21 2017-06-20 International Business Machines Corporation Pitch marking in speech processing
US9734844B2 (en) * 2015-11-23 2017-08-15 Adobe Systems Incorporated Irregularity detection in music
WO2017094862A1 (ja) * 2015-12-02 2017-06-08 日本電信電話株式会社 空間相関行列推定装置、空間相関行列推定方法および空間相関行列推定プログラム
US10482899B2 (en) 2016-08-01 2019-11-19 Apple Inc. Coordination of beamformers for noise estimation and noise suppression
US10761522B2 (en) * 2016-09-16 2020-09-01 Honeywell Limited Closed-loop model parameter identification techniques for industrial model-based process controllers
EP3324407A1 (en) * 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a ratio as a separation characteristic
EP3324406A1 (en) 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a variable threshold
US11602311B2 (en) 2019-01-29 2023-03-14 Murata Vios, Inc. Pulse oximetry system
US11404061B1 (en) * 2021-01-11 2022-08-02 Ford Global Technologies, Llc Speech filtering for masks
US11545143B2 (en) 2021-05-18 2023-01-03 Boris Fridman-Mintz Recognition or synthesis of human-uttered harmonic sounds
CN113872566B (zh) * 2021-12-02 2022-02-11 成都星联芯通科技有限公司 带宽连续可调的调制滤波装置和方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
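CN1189264A (zh) * 1996-02-15 1998-07-29 Philips Electronics N.V. Reduced-complexity signal transmission system
CN1272939A (zh) * 1998-06-09 2000-11-08 Matsushita Electric Industrial Co., Ltd. Speech coding apparatus and speech decoding apparatus
EP1105872A1 (en) * 1998-08-24 2001-06-13 Conexant Systems, Inc. Completed fixed codebook for speech encoder
CN1331826A (zh) * 1998-12-21 2002-01-16 Qualcomm Incorporated Variable rate speech coding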

Family Cites Families (66)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4989248A (en) * 1983-01-28 1991-01-29 Texas Instruments Incorporated Speaker-dependent connected speech word recognition method
US4831551A (en) * 1983-01-28 1989-05-16 Texas Instruments Incorporated Speaker-dependent connected speech word recognizer
US4751737A (en) * 1985-11-06 1988-06-14 Motorola Inc. Template generation method in a speech recognition system
US5086475A (en) * 1988-11-19 1992-02-04 Sony Corporation Apparatus for generating, recording or reproducing sound source data
US5371853A (en) * 1991-10-28 1994-12-06 University Of Maryland At College Park Method and system for CELP speech coding and codebook for use therewith
US5765127A (en) * 1992-03-18 1998-06-09 Sony Corp High efficiency encoding method
JP3277398B2 (ja) * 1992-04-15 2002-04-22 Sony Corporation Voiced sound discrimination method
US5734789A (en) * 1992-06-01 1998-03-31 Hughes Electronics Voiced, unvoiced or noise modes in a CELP vocoder
US5574825A (en) * 1994-03-14 1996-11-12 Lucent Technologies Inc. Linear prediction coefficient generation during frame erasure or packet loss
JP3557662B2 (ja) * 1994-08-30 2004-08-25 Sony Corporation Speech encoding method and speech decoding method, and speech encoding device and speech decoding device
US5699477A (en) 1994-11-09 1997-12-16 Texas Instruments Incorporated Mixed excitation linear prediction with fractional pitch
FI97612C (fi) * 1995-05-19 1997-01-27 Tamrock Oy Arrangement for controlling the winch of a rock drilling apparatus
US5706392A (en) * 1995-06-01 1998-01-06 Rutgers, The State University Of New Jersey Perceptual speech coder and method
US5732389A (en) * 1995-06-07 1998-03-24 Lucent Technologies Inc. Voiced/unvoiced classification of speech for excitation codebook selection in celp speech decoding during frame erasures
US5664055A (en) * 1995-06-07 1997-09-02 Lucent Technologies Inc. CS-ACELP speech compression system with adaptive pitch prediction filter gain based on a measure of periodicity
US5774837A (en) * 1995-09-13 1998-06-30 Voxware, Inc. Speech coding system and method using voicing probability determination
US5809459A (en) * 1996-05-21 1998-09-15 Motorola, Inc. Method and apparatus for speech excitation waveform coding using multiple error waveforms
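JPH1091194A (ja) * 1996-09-18 1998-04-10 Sony Corp Speech decoding method and device
JP3707153B2 (ja) * 1996-09-24 2005-10-19 Sony Corporation Vector quantization method, and speech encoding method and device
JP3707154B2 (ja) * 1996-09-24 2005-10-19 Sony Corporation Speech encoding method and device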
US6014622A (en) * 1996-09-26 2000-01-11 Rockwell Semiconductor Systems, Inc. Low bit rate speech coder using adaptive open-loop subframe pitch lag estimation and vector quantization
EP0878790A1 (en) * 1997-05-15 1998-11-18 Hewlett-Packard Company Voice coding system and method
WO1999010719A1 (en) 1997-08-29 1999-03-04 The Regents Of The University Of California Method and apparatus for hybrid coding of speech at 4kbps
US6263312B1 (en) * 1997-10-03 2001-07-17 Alaris, Inc. Audio compression and decompression employing subband decomposition of residual signal and distortion reduction
US6169970B1 (en) * 1998-01-08 2001-01-02 Lucent Technologies Inc. Generalized analysis-by-synthesis speech coding method and apparatus
US6182033B1 (en) * 1998-01-09 2001-01-30 At&T Corp. Modular approach to speech enhancement with an application to speech coding
US6272231B1 (en) * 1998-11-06 2001-08-07 Eyematic Interfaces, Inc. Wavelet-based facial motion capture for avatar animation
WO1999059139A2 (en) * 1998-05-11 1999-11-18 Koninklijke Philips Electronics N.V. Speech coding based on determining a noise contribution from a phase change
GB9811019D0 (en) * 1998-05-21 1998-07-22 Univ Surrey Speech coders
US6141638A (en) * 1998-05-28 2000-10-31 Motorola, Inc. Method and apparatus for coding an information signal
US6138092A (en) * 1998-07-13 2000-10-24 Lockheed Martin Corporation CELP speech synthesizer with epoch-adaptive harmonic generator for pitch harmonics below voicing cutoff frequency
US6330533B2 (en) * 1998-08-24 2001-12-11 Conexant Systems, Inc. Speech encoder adaptively applying pitch preprocessing with warping of target signal
US6260010B1 (en) * 1998-08-24 2001-07-10 Conexant Systems, Inc. Speech encoder using gain normalization that combines open and closed loop gains
JP4249821B2 (ja) * 1998-08-31 2009-04-08 Fujitsu Limited Digital audio playback device
US6308155B1 (en) * 1999-01-20 2001-10-23 International Computer Science Institute Feature extraction for automatic speech recognition
US6453287B1 (en) * 1999-02-04 2002-09-17 Georgia-Tech Research Corporation Apparatus and quality enhancement algorithm for mixed excitation linear predictive (MELP) and other speech coders
US7423983B1 (en) * 1999-09-20 2008-09-09 Broadcom Corporation Voice and data exchange over a packet based network
US6889183B1 (en) * 1999-07-15 2005-05-03 Nortel Networks Limited Apparatus and method of regenerating a lost audio segment
US6691082B1 (en) * 1999-08-03 2004-02-10 Lucent Technologies Inc Method and system for sub-band hybrid coding
US6910011B1 (en) * 1999-08-16 2005-06-21 Harman Becker Automotive Systems - Wavemakers, Inc. Noisy acoustic signal enhancement
US6111183A (en) * 1999-09-07 2000-08-29 Lindemann; Eric Audio signal synthesis system based on probabilistic estimation of time-varying spectra
SE9903223L (sv) * 1999-09-09 2001-05-08 Ericsson Telefon Ab L M Method and device in a telecommunication system
US6959274B1 (en) * 1999-09-22 2005-10-25 Mindspeed Technologies, Inc. Fixed rate speech compression system and method
US6581032B1 (en) * 1999-09-22 2003-06-17 Conexant Systems, Inc. Bitstream protocol for transmission of encoded voice signals
US6636829B1 (en) * 1999-09-22 2003-10-21 Mindspeed Technologies, Inc. Speech communication system and method for handling lost frames
US6574593B1 (en) 1999-09-22 2003-06-03 Conexant Systems, Inc. Codebook tables for encoding and decoding
EP1147515A1 (en) * 1999-11-10 2001-10-24 Koninklijke Philips Electronics N.V. Wide band speech synthesis by means of a mapping matrix
FI116643B (fi) * 1999-11-15 2006-01-13 Nokia Corp Noise suppression
US20070110042A1 (en) * 1999-12-09 2007-05-17 Henry Li Voice and data exchange over a packet based network
US6766292B1 (en) * 2000-03-28 2004-07-20 Tellabs Operations, Inc. Relative noise ratio weighting techniques for adaptive noise cancellation
FI115329B (fi) * 2000-05-08 2005-04-15 Nokia Corp Method and arrangement for changing the source signal bandwidth in a telecommunication connection with multiple bandwidth capability
US7136810B2 (en) * 2000-05-22 2006-11-14 Texas Instruments Incorporated Wideband speech coding system and method
US20020016698A1 (en) * 2000-06-26 2002-02-07 Toshimichi Tokuda Device and method for audio frequency range expansion
US6990453B2 (en) * 2000-07-31 2006-01-24 Landmark Digital Services Llc System and methods for recognizing sound and music signals in high noise and distortion
US6898566B1 (en) * 2000-08-16 2005-05-24 Mindspeed Technologies, Inc. Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal
DE10041512B4 (de) * 2000-08-24 2005-05-04 Infineon Technologies Ag Method and device for artificially expanding the bandwidth of speech signals
CA2327041A1 (en) * 2000-11-22 2002-05-22 Voiceage Corporation A method for indexing pulse positions and signs in algebraic codebooks for efficient coding of wideband signals
US6937904B2 (en) * 2000-12-13 2005-08-30 Alfred E. Mann Institute For Biomedical Engineering At The University Of Southern California System and method for providing recovery from muscle denervation
US20020133334A1 (en) * 2001-02-02 2002-09-19 Geert Coorman Time scale modification of digitally sampled waveforms in the time domain
ATE353503T1 (de) * 2001-04-24 2007-02-15 Nokia Corp Method for changing the size of a jitter buffer for time alignment, communication system, receiver side, and transcoder
US6766289B2 (en) * 2001-06-04 2004-07-20 Qualcomm Incorporated Fast code-vector searching
US6985857B2 (en) * 2001-09-27 2006-01-10 Motorola, Inc. Method and apparatus for speech coding using training and quantizing
SE521600C2 (sv) * 2001-12-04 2003-11-18 Global Ip Sound Ab Low bit rate codec
US7283585B2 (en) * 2002-09-27 2007-10-16 Broadcom Corporation Multiple data rate communication system
US7519530B2 (en) * 2003-01-09 2009-04-14 Nokia Corporation Audio signal processing
US7254648B2 (en) * 2003-01-30 2007-08-07 Utstarcom, Inc. Universal broadband server system and method

Also Published As

Publication number Publication date
EP1604352A2 (en) 2005-12-14
US7529664B2 (en) 2009-05-05
US7379866B2 (en) 2008-05-27
US20040181405A1 (en) 2004-09-16
WO2004084179A3 (en) 2006-08-24
EP1604352A4 (en) 2007-12-19
US20040181399A1 (en) 2004-09-16
WO2004084181A3 (en) 2004-12-09
WO2004084180A2 (en) 2004-09-30
WO2004084181B1 (en) 2005-01-20
WO2004084180A3 (en) 2004-12-23
WO2004084179A2 (en) 2004-09-30
US7024358B2 (en) 2006-04-04
US20050065792A1 (en) 2005-03-24
WO2004084467A3 (en) 2005-12-01
WO2004084180B1 (en) 2005-01-27
CN1757060A (zh) 2006-04-05
WO2004084467A2 (en) 2004-09-30
US20040181397A1 (en) 2004-09-16
WO2004084182A1 (en) 2004-09-30
WO2004084181A2 (en) 2004-09-30
EP1604354A4 (en) 2008-04-02
US7155386B2 (en) 2006-12-26
EP1604354A2 (en) 2005-12-14
US20040181411A1 (en) 2004-09-16

Similar Documents

Publication Publication Date Title
CN1757060B (zh) Voicing index controls for CELP speech coding
US10249313B2 (en) Adaptive bandwidth extension and apparatus for the same
EP0832482B1 (en) Speech coder
Bessette et al. The adaptive multirate wideband speech codec (AMR-WB)
US8630864B2 (en) Method for switching rate and bandwidth scalable audio decoding rate
EP1509903B1 (en) Method and device for efficient frame erasure concealment in linear predictive based speech codecs
EP1979895B1 (en) Method and device for efficient frame erasure concealment in speech codecs
US6556966B1 (en) Codebook structure for changeable pulse multimode speech coding
EP1141946B1 (en) Coded enhancement feature for improved performance in coding communication signals
DE60012760T2 (de) Multimodal speech coder
US20020007269A1 (en) Codebook structure and search for speech coding
EP3352169B1 (en) Unvoiced decision for speech processing
JP2005528647A (ja) Method and device for frequency-selective pitch enhancement of synthesized speech
KR20020052191A (ko) Variable bit-rate CELP coding method for speech using speech classification
EP1756807B1 (en) Audio encoding
McCree et al. A 1.7 kb/s MELP coder with improved analysis and quantization
US6826527B1 (en) Concealment of frame erasures and method
US7596491B1 (en) Layered CELP system and method
WO2014131260A1 (en) System and method for post excitation enhancement for low bit rate speech coding
KR20010075491A (ko) Method for quantizing speech coder parameters
Paksoy et al. A variable rate multimodal speech coder with gain-matched analysis-by-synthesis
EP2951824B1 (en) Adaptive high-pass post-filter
KR102138320B1 (ko) Apparatus and method for signal codec in a communication system
US6415252B1 (en) Method and apparatus for coding and decoding speech
Tandel et al. Implementation of CELP CODER and to evaluate the performance in terms of bit rate, coding delay and quality of speech

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: California, USA

Patentee after: Mindspeed Technologies LLC

Address before: California, USA

Patentee before: Mindspeed Technologies, Inc.

TR01 Transfer of patent right

Effective date of registration: 20180329

Address after: Massachusetts, USA

Patentee after: MACOM Technology Solutions Holdings, Inc.

Address before: California, USA

Patentee before: Mindspeed Technologies LLC

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120815

Termination date: 20190311
