WO2001026094A1 - Voice encoding device and voice encoding method - Google Patents

Voice encoding device and voice encoding method

Info

Publication number
WO2001026094A1
Authority
WO
WIPO (PCT)
Prior art keywords
noise
parameter
input signal
information source
model
Prior art date
Application number
PCT/JP2000/006689
Other languages
English (en)
Japanese (ja)
Inventor
Tadashi Yonezaki
Original Assignee
Matsushita Electric Industrial Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co., Ltd. filed Critical Matsushita Electric Industrial Co., Ltd.
Priority to AU74473/00A priority Critical patent/AU7447300A/en
Publication of WO2001026094A1 publication Critical patent/WO2001026094A1/fr

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition

Definitions

  • the present invention relates to a voice coding device and a voice coding method used in a communication device of a wireless communication system, such as a mobile phone.
  • FIG. 1 is a block diagram showing a configuration of a conventional speech encoding device.
  • noise section detection section 11 separates an input signal into a speech section and another section, and detects a signal other than the speech section as background noise.
  • the noise model estimating unit 12 estimates a noise model such as the amplitude frequency characteristic of the noise signal in the noise section detected by the noise section detecting unit 11.
  • the noise removing unit 13 removes noise from the input signal using the noise model estimated by the noise model estimating unit 12.
  • noise can be removed by using a spectral subtraction method or the like.
  • the noise elimination processing is described in Japanese Patent Application Laid-Open Nos. 10-133689 and 10-187193.
  • the speech analysis unit 14 extracts parameters by analyzing the noise-removed signal output by the noise removing unit 13.
  • the parameter quantizer 15 quantizes the parameters extracted by the speech analyzer 14: the quantized value that minimizes the error under a measure such as the Euclidean distance is selected, and the code corresponding to that quantized value is output.
  • the conventional speech coding apparatus thus removes the noise signal component from the input signal and extracts parameters specific to the speech signal, thereby realizing high-quality speech coding at a low bit rate.
  • An object of the present invention is to provide a speech coding apparatus and a speech coding method which are less dependent on the accuracy of a noise model, are robust against noise signal components, and can realize high-quality speech coding even in a background noise environment. This object is achieved by performing parameter quantization using a noise level or a noise model together with an information source model.
  • FIG. 1 is a block diagram showing the configuration of a conventional speech coding apparatus.
  • FIG. 2 is a block diagram showing a configuration of the speech coding apparatus according to Embodiment 1 of the present invention.
  • FIG. 3 is a block diagram showing the internal configuration of a parameter quantization unit of the speech coding apparatus according to the above embodiment.
  • FIG. 4 is a block diagram showing a configuration of a speech coding apparatus according to Embodiment 2 of the present invention.
  • FIG. 5 is a block diagram showing an internal configuration of a parameter quantization unit of the speech coding apparatus according to Embodiment 2 of the present invention.
  • FIG. 6 is a block diagram showing an internal configuration of a parameter quantization unit of a speech coding apparatus according to Embodiment 3 of the present invention.
  • FIG. 7 is a block diagram showing the internal configuration of the parameter quantization unit of the speech coding apparatus according to Embodiment 4 of the present invention.
  • FIG. 2 is a block diagram showing a configuration of the speech coding apparatus according to Embodiment 1 of the present invention.
  • a noise section detection unit 101 separates an input signal into a speech section and other sections, and detects a signal outside the speech section as background noise.
  • the noise level estimator 102 estimates the noise level of the noise section detected by the noise section detector 101.
  • the information source model storage unit 103 stores an information source model that models a sequence of parameters for a speech input signal containing no noise.
  • the speech analyzer 104 analyzes the input signal and extracts parameters.
  • the parameter quantizer 105 quantizes the parameters extracted by the speech analyzer 104 based on the information source model and the noise level, and outputs a code corresponding to the quantized value.
  • FIG. 3 is a block diagram showing an internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to the present embodiment.
  • an allowable error level determiner 201 determines an allowable error according to the noise level estimated by the noise level estimator 102.
  • the codebook 202 stores a quantized value corresponding to the transmission code.
  • the code extractor 203 extracts from the codebook 202 the codes whose error from the parameter extracted by the speech analyzer 104 is equal to or less than the allowable error.
  • the code selector 204 selects the most probable code among the codes extracted by the code extractor 203 as a transmission code based on the information source model.
  • the present invention is robust against noise signal components and can realize high-quality speech encoding processing even in a background noise environment.
  • FIG. 4 is a block diagram showing a configuration of the speech coding apparatus according to Embodiment 2 of the present invention. Compared with FIG. 2, the apparatus of FIG. 4 has a noise model estimator 301 instead of the noise level estimator 102.
  • the noise model estimating unit 301 estimates a noise model such as the amplitude frequency characteristic of a noise signal in the noise section detected by the noise section detecting unit 101.
  • the parameter quantizer 105 quantizes the parameter extracted by the speech analyzer 104 based on the likelihood of the parameter sequence obtained from the information source model and the noise model, and outputs the code corresponding to the quantized value.
  • FIG. 5 is a block diagram showing an internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to the present embodiment.
  • the parameter quantization unit 105 in FIG. 5 has an allowable error range determiner 401 instead of the allowable error level determiner 201 of FIG. 3.
  • the allowable error range determiner 401 determines an allowable error range based on the noise model estimated by the noise model estimator 301.
  • the noise model estimator 301 estimates the variance of the degree of noise superposition for each element in the vector quantization.
  • the code extractor 203 extracts, from the codebook 202, a code whose error from the parameter extracted by the voice analysis unit 104 is within the allowable error range.
  • the transmission level is further improved as compared with the case where the noise level is used.
  • High-quality speech coding can be realized.
  • FIG. 6 is a block diagram showing the internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to Embodiment 3 of the present invention.
  • the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
  • an error calculation weight determiner 501 determines, based on the noise level estimated by the noise level estimator 102 and the information source model, the weight applied to each element of the parameter when the error between an input parameter and a quantized value is calculated.
  • weighting is performed so that the error value of a parameter element having a correlation with the power envelope of the adaptive source becomes small.
  • Codebook 502 stores the quantization value corresponding to the transmission code.
  • the quantizer 503 quantizes the parameters extracted by the speech analyzer 104 using the codebook 502, according to the weights determined by the error calculation weight determiner 501. Because each parameter element is weighted based on the noise level and the information source model before quantization, the processing is robust against noise signal components without degrading performance for a signal containing no noise, and high-quality speech encoding can be realized even in a background noise environment.
  • the weighting process can be performed using the noise model described in the second embodiment.
  • FIG. 7 is a block diagram showing an internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to Embodiment 4 of the present invention.
  • the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
  • the code appearance probability calculator 601 estimates, from the noise level estimated by the noise level estimation unit 102 and the information source model, the appearance probability of each parameter quantization value for the case where the input signal contains no noise.
  • Codebook 602 stores the quantization value corresponding to the transmission code.
  • the quantizer 603 quantizes the parameters extracted by the speech analysis unit 104 using the codebook 602, according to the likelihood obtained by combining the appearance probability estimated by the code appearance probability calculator 601 with the error value.
  • as a result, performance for a signal containing no noise is not degraded, the processing is robust against noise signal components, and high-quality speech coding can be achieved even in a background noise environment.
  • the weighting process can be performed using the noise model described in the second embodiment.
  • the dependence on the accuracy of the noise model is small, the processing is robust against noise signal components, and high-quality speech encoding can be realized even in a background noise environment.
  • the present specification is based on Japanese Patent Application No. 11-281466, filed on Oct. 1, 1999, the content of which is incorporated herein.
  • the present invention is suitable for use in a communication device of a wireless communication system, such as a mobile phone.
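
The quantization flow of Embodiment 1 (allowable error level determiner 201, codebook 202, code extractor 203, code selector 204) can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration and are not stated in the description: the allowable error is taken to be proportional to the noise level (`scale` is a hypothetical factor), the error measure is the Euclidean distance, the information source model is represented as a first-order transition probability table between codes, and the nearest code is used as a fallback when no code lies within the allowable error.

```python
import math


def allowable_error(noise_level, scale=1.0):
    """Allowable error level determiner (201).

    Assumption: the tolerance grows linearly with the noise level.
    """
    return scale * noise_level


def extract_candidates(param, codebook, tolerance):
    """Code extractor (203): codes whose error is within the tolerance."""
    candidates = [
        code
        for code, vector in enumerate(codebook)
        if math.dist(param, vector) <= tolerance
    ]
    if not candidates:
        # Assumed fallback: use the nearest code when nothing is in range.
        nearest = min(
            range(len(codebook)),
            key=lambda c: math.dist(param, codebook[c]),
        )
        candidates = [nearest]
    return candidates


def select_code(candidates, prev_code, transition_prob):
    """Code selector (204): most probable candidate under the source model.

    Assumption: the source model is a table transition_prob[prev][next].
    """
    return max(candidates, key=lambda c: transition_prob[prev_code][c])


def quantize(param, codebook, noise_level, prev_code, transition_prob):
    """Parameter quantizer (105) of Embodiment 1, as sketched here."""
    tolerance = allowable_error(noise_level)
    candidates = extract_candidates(param, codebook, tolerance)
    return select_code(candidates, prev_code, transition_prob)
```

With a noise level of zero the sketch reduces to ordinary nearest-neighbor quantization; as the noise level grows, more codes fall within the allowable error and the source model has more influence on the selected code.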
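
For Embodiment 2, where the allowable error range determiner 401 works from a noise model instead of a single noise level, a comparable sketch uses one tolerance per parameter element. It assumes that the noise model supplies a variance per element and that the allowable range is `k` standard deviations wide; both `k` and this mapping are hypothetical choices, not taken from the description.

```python
import math


def allowable_error_range(noise_variance, k=2.0):
    """Allowable error range determiner (401).

    Assumption: the range for each element is k standard deviations
    of the noise estimated for that element.
    """
    return [k * math.sqrt(v) for v in noise_variance]


def extract_in_range(param, codebook, error_range):
    """Code extractor (203): codes whose per-element error is in range."""
    return [
        code
        for code, vector in enumerate(codebook)
        if all(
            abs(p - q) <= r
            for p, q, r in zip(param, vector, error_range)
        )
    ]
```

The extracted candidates could then be passed to a code selector driven by the information source model, in the same way as in Embodiment 1.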
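
The weighted quantization of Embodiment 3 (error calculation weight determiner 501, codebook 502, quantizer 503) might look like the following sketch. The description does not give a weighting formula, so the one used here is purely an assumption: each element has a hypothetical `sensitivity` value describing how strongly noise affects it, and its weight shrinks as the noise level rises. The role of the information source model in choosing the weights is not modeled.

```python
def error_weights(noise_level, sensitivity):
    """Error calculation weight determiner (501).

    Assumption: elements that are more sensitive to noise receive a
    smaller weight as the noise level increases.
    """
    return [1.0 / (1.0 + noise_level * s) for s in sensitivity]


def weighted_quantize(param, codebook, weights):
    """Quantizer (503): code with the smallest weighted squared error."""

    def weighted_error(vector):
        return sum(
            w * (p - q) ** 2
            for w, p, q in zip(weights, param, vector)
        )

    return min(
        range(len(codebook)),
        key=lambda c: weighted_error(codebook[c]),
    )
```

With a noise level of zero all weights equal one, so the result matches an unweighted search and performance on noise-free input is unchanged.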
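
Embodiment 4 combines a code appearance probability with the error value into a single likelihood. One way to sketch this, assuming Gaussian noise with standard deviation `noise_std` derived from the noise level and appearance probabilities `priors` supplied by the information source model (both assumptions, as the description does not specify how the two terms are combined), is to add the log probability to a Gaussian log-likelihood of the error.

```python
import math


def log_likelihood(param, vector, prior, noise_std):
    """Combine appearance probability and error value into one score.

    Assumption: Gaussian noise, so the error contributes
    -squared_error / (2 * noise_std ** 2). prior and noise_std must be > 0.
    """
    squared_error = sum((p - q) ** 2 for p, q in zip(param, vector))
    return math.log(prior) - squared_error / (2.0 * noise_std ** 2)


def likelihood_quantize(param, codebook, priors, noise_std):
    """Quantizer (603): code with the highest combined likelihood."""
    return max(
        range(len(codebook)),
        key=lambda c: log_likelihood(param, codebook[c], priors[c], noise_std),
    )
```

When `noise_std` is small the error term dominates and the nearest code wins; when it is large the appearance probability dominates.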

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

In this invention, a noise section detection unit (101) separates an input signal into a speech section and other sections, and detects the signal in the sections other than the speech section as background noise. A noise level estimation unit (102) estimates the noise level of the noise section. An information source model storage unit (103) stores an information source model obtained by modeling a sequence of parameters for a noise-free speech input signal. A speech analysis unit (104) analyzes the input signal and extracts a parameter. A parameter quantization unit (105) quantizes the parameter extracted by the speech analysis unit (104) on the basis of the information source model and the noise level, and outputs a code corresponding to the quantized value. This makes it possible to realize speech coding that is less dependent on the accuracy of the noise model, is robust against noise signal components, and maintains high quality even in background noise conditions.
PCT/JP2000/006689 1999-10-01 2000-09-28 Dispositif de codage vocal et procede de codage vocal WO2001026094A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU74473/00A AU7447300A (en) 1999-10-01 2000-09-28 Voice encoding device and voice encoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP11/281466 1999-10-01
JP28146699A JP3315956B2 (ja) 1999-10-01 1999-10-01 音声符号化装置及び音声符号化方法

Publications (1)

Publication Number Publication Date
WO2001026094A1 true WO2001026094A1 (fr) 2001-04-12

Family

ID=17639585

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2000/006689 WO2001026094A1 (fr) 1999-10-01 2000-09-28 Dispositif de codage vocal et procede de codage vocal

Country Status (3)

Country Link
JP (1) JP3315956B2 (fr)
AU (1) AU7447300A (fr)
WO (1) WO2001026094A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3079151A1 (fr) * 2015-04-09 2016-10-12 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Codeur audio et procédé de codage d'un signal audio

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08328598A (ja) * 1995-05-26 1996-12-13 Sanyo Electric Co Ltd 音声符号化・復号化装置
JPH10307598A (ja) * 1997-05-09 1998-11-17 Hitachi Ltd 音声符号化送信装置

Also Published As

Publication number Publication date
JP3315956B2 (ja) 2002-08-19
AU7447300A (en) 2001-05-10
JP2001109496A (ja) 2001-04-20

Similar Documents

Publication Publication Date Title
KR100636317B1 (ko) 분산 음성 인식 시스템 및 그 방법
JP4491210B2 (ja) 再帰的構成における反復ノイズ推定法
KR101201146B1 (ko) 최적의 추정을 위한 중요한 양으로서 순간적인 신호 대 잡음비를 사용하는 잡음 감소 방법
FI110726B (fi) Äänen aktiivisuuden ilmaisu
EP2617029B1 (fr) Estimation de pitch lag
JP4316583B2 (ja) 特徴量補正装置、特徴量補正方法および特徴量補正プログラム
KR20070001276A (ko) 신호 인코딩
AU2004229048A1 (en) Method and apparatus for multi-sensory speech enhancement
JP4875249B2 (ja) 自動音声認識実行方法
US8046215B2 (en) Method and apparatus to detect voice activity by adding a random signal
US6389389B1 (en) Speech recognition using unequally-weighted subvector error measures for determining a codebook vector index to represent plural speech parameters
JP2005535920A (ja) バックエンドの音声検出装置を有する配信音声認識および方法
CN101421780A (zh) 音频编码和解码中的激励处理
JP3478209B2 (ja) 音声信号復号方法及び装置と音声信号符号化復号方法及び装置と記録媒体
JPH04270398A (ja) 音声符号化方式
EP1672619A2 (fr) Dispositif et procédé de codage de la parole
EP1199712B1 (fr) Procédé pour la réduction du bruit
JP4543283B2 (ja) 無線機同定方法及び装置
JP2002530704A (ja) 分散音声認識プロセスにおけるエラーの軽減方法および装置
US20090198489A1 (en) Method and apparatus for frequency encoding, and method and apparatus for frequency decoding
EP2617034B1 (fr) Détermination d'énergie de cycle de fréquence fondamentale et mise à l'échelle d'un signal d'excitation
US7164719B2 (en) System to reduce distortion due to coding with a sample-by-sample quantizer
JP3315956B2 (ja) 音声符号化装置及び音声符号化方法
JPH0844395A (ja) 音声ピッチ検出装置
JP4603429B2 (ja) クライアント・サーバ音声認識方法、サーバ計算機での音声認識方法、音声特徴量抽出・送信方法、これらの方法を用いたシステム、装置、プログラムおよび記録媒体

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 09856553

Country of ref document: US

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase