WO2001026094A1 - Voice encoding device and voice encoding method - Google Patents
Voice encoding device and voice encoding method
- Publication number
- WO2001026094A1 (PCT/JP2000/006689)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- noise
- parameter
- input signal
- information source
- model
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
Definitions
- the present invention relates to a voice coding device and a voice coding method used in a communication device of a wireless communication system, such as a mobile phone.
- FIG. 1 is a block diagram showing a configuration of a conventional speech encoding device.
- noise section detection section 11 separates an input signal into a speech section and another section, and detects a signal other than the speech section as background noise.
- the noise model estimating unit 12 estimates a noise model such as the amplitude frequency characteristic of the noise signal in the noise section detected by the noise section detecting unit 11.
- the noise removing unit 13 removes noise from the input signal using the noise model estimated by the noise model estimating unit 12.
- noise can be removed by using a spectral subtraction method or the like.
- the noise elimination processing is described in Japanese Patent Application Laid-Open Nos. 10-133689 and 10-187193.
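As a rough illustration of the spectral subtraction mentioned above, the following sketch subtracts an estimated noise magnitude spectrum from one frame's magnitude spectrum. It is not taken from the cited applications: the function name, the `over_subtraction` factor, and the spectral `floor` are assumptions chosen for the example.

```python
def spectral_subtract(frame_mag, noise_mag, over_subtraction=1.0, floor=0.01):
    """Magnitude-domain spectral subtraction for a single frame.

    frame_mag: magnitude spectrum of the noisy frame (one value per bin)
    noise_mag: estimated noise magnitude spectrum (same length)
    over_subtraction, floor: hypothetical tuning parameters, not from the patent
    """
    if len(frame_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for s, n in zip(frame_mag, noise_mag):
        diff = s - over_subtraction * n
        # Keep a small fraction of the original bin instead of going negative.
        cleaned.append(max(diff, floor * s))
    return cleaned


frame = [1.0, 0.5, 0.2, 0.05]
noise = [0.1, 0.1, 0.1, 0.1]
cleaned = spectral_subtract(frame, noise)
```

The last bin shows why this approach depends on the accuracy of the noise model: where the estimate exceeds the observed magnitude, the result falls back to the floor and any speech energy in that bin is lost.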
- the speech analysis unit 14 extracts parameters by analyzing the noise-removed signal that is output by the noise removal unit 13.
- the parameter quantizer 15 quantizes the parameters extracted by the speech analyzer 14: it extracts the quantized value that minimizes the error under a single measure, typified by the Euclidean distance, and outputs the code corresponding to that quantized value.
- the conventional speech coding apparatus removes a noise signal component from an input signal and extracts parameters specific to the speech signal, thereby realizing high-quality speech coding at a low bit rate.
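The conventional quantization step described above amounts to a nearest-neighbor search over a codebook. A minimal sketch follows, assuming a codebook represented as a list of vectors; the names `squared_euclidean` and `quantize_nearest` are illustrative and do not appear in the source.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize_nearest(param, codebook):
    """Return the index (code) of the codebook entry closest to param."""
    return min(range(len(codebook)),
               key=lambda i: squared_euclidean(param, codebook[i]))


codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
code = quantize_nearest([0.9, 0.2], codebook)
```

This search uses only the distance to the input parameter, so any noise left in the parameter directly moves the selected code.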
- An object of the present invention is to provide a speech coding apparatus and a speech coding method that are less dependent on the accuracy of a noise model, are robust against noise signal components, and can realize high-quality speech coding even in a background noise environment. This object is achieved by performing parameter quantization using a noise level or a noise model together with an information source model.
- FIG. 1 is a block diagram showing the configuration of a conventional speech coding apparatus.
- FIG. 2 is a block diagram showing a configuration of the speech coding apparatus according to Embodiment 1 of the present invention.
- FIG. 3 is a block diagram showing the internal configuration of a parameter quantization unit of the speech coding apparatus according to the above embodiment.
- FIG. 4 is a block diagram showing a configuration of a speech coding apparatus according to Embodiment 2 of the present invention.
- FIG. 5 is a block diagram showing an internal configuration of a parameter quantization unit of the speech coding apparatus according to Embodiment 2 of the present invention.
- FIG. 6 is a block diagram showing an internal configuration of a parameter quantization unit of a speech coding apparatus according to Embodiment 3 of the present invention.
- FIG. 7 is a block diagram showing the internal configuration of the parameter quantization unit of the speech coding apparatus according to Embodiment 4 of the present invention.
- FIG. 2 is a block diagram showing a configuration of the speech coding apparatus according to Embodiment 1 of the present invention.
- a noise section detection unit 101 separates an input signal into a speech section and other sections, and detects a signal outside the speech section as background noise.
- the noise level estimator 102 estimates the noise level of the noise section detected by the noise section detector 101.
- the information source model storage unit 103 stores an information source model that models a sequence of parameters for a speech input signal containing no noise.
- the voice analyzer 104 analyzes the input signal and extracts parameters.
- the parameter quantizer 105 quantizes the parameter extracted by the speech analyzer 104 based on the information source model and the noise level, and outputs a code corresponding to the quantized value.
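The text does not say how the noise section is detected or how its level is measured. Purely as an illustration, the sketch below treats frames whose RMS falls under a fixed threshold as noise sections and averages their RMS to obtain a noise level; the threshold rule and all names are assumptions.

```python
import math


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_noise_level(frames, speech_threshold):
    """Average RMS over frames classified as noise (RMS below the threshold).

    The fixed-threshold detector is a stand-in for the noise section
    detection unit; it is an assumption, not the patent's method.
    Returns None when no frame is classified as noise.
    """
    noise_rms = [r for r in map(frame_rms, frames) if r < speech_threshold]
    if not noise_rms:
        return None
    return sum(noise_rms) / len(noise_rms)


frames = [
    [0.01, -0.01, 0.01, -0.01],  # quiet frame, treated as noise
    [0.5, -0.4, 0.6, -0.5],      # loud frame, treated as speech
    [0.02, -0.02, 0.02, -0.02],  # quiet frame, treated as noise
]
noise_level = estimate_noise_level(frames, speech_threshold=0.1)
```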
- FIG. 3 is a block diagram showing an internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to the present embodiment.
- an allowable error level determiner 201 determines an allowable error according to the noise level estimated by the noise level estimator 102.
- the codebook 202 stores a quantized value corresponding to the transmission code.
- the code extractor 203 extracts, from the codebook 202, the codes whose error from the parameter extracted by the speech analyzer 104 is within the allowable error.
- the code selector 204 selects the most probable code among the codes extracted by the code extractor 203 as a transmission code based on the information source model.
- the present invention is robust against noise signal components and can realize high-quality speech encoding processing even in a background noise environment.
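The Embodiment 1 quantizer can be sketched as three steps: derive an allowable error from the noise level, collect the codes within that error, and pick the most probable candidate under the information source model. In the sketch below the linear mapping in `allowable_error`, the first-order transition table used as the source model, and the nearest-neighbor fallback for an empty candidate set are all assumptions made for illustration.

```python
def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def allowable_error(noise_level, base=0.2, gain=1.0):
    """Hypothetical mapping: the tolerance grows linearly with the noise level."""
    return base + gain * noise_level


def extract_candidates(param, codebook, tolerance):
    """Codes whose error from the input parameter is within the tolerance."""
    return [i for i, cv in enumerate(codebook)
            if squared_error(param, cv) <= tolerance]


def select_code(param, codebook, noise_level, prev_code, transition_prob):
    """Pick the most probable candidate given the previously sent code.

    transition_prob[prev][next] stands in for the information source model
    (an assumed first-order model of code sequences in noise-free speech).
    """
    candidates = extract_candidates(param, codebook, allowable_error(noise_level))
    if not candidates:
        # Assumed fallback: plain nearest-neighbor search.
        return min(range(len(codebook)),
                   key=lambda i: squared_error(param, codebook[i]))
    return max(candidates, key=lambda i: transition_prob[prev_code][i])


codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
transition_prob = [[0.8, 0.1, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.1, 0.1, 0.8]]
param = [0.6, 0.0]
quiet_code = select_code(param, codebook, 0.0, 0, transition_prob)
noisy_code = select_code(param, codebook, 0.3, 0, transition_prob)
```

With no noise the tolerance is tight and only the nearest code qualifies, so the result matches ordinary quantization. With a higher noise level more codes qualify and the source model decides, which here keeps the previously sent code.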
- FIG. 4 is a block diagram showing a configuration of the speech coding apparatus according to Embodiment 2 of the present invention. The apparatus of FIG. 4 adopts a configuration having a noise model estimator 301 instead of the noise level estimator 102 of FIG. 2.
- the noise model estimating unit 301 estimates a noise model such as the amplitude frequency characteristic of a noise signal in the noise section detected by the noise section detecting unit 101.
- the parameter quantizer 105 quantizes the parameter extracted by the speech analyzer 104 based on the likelihood of the parameter sequence obtained from the information source model and the noise model, and outputs the code corresponding to the quantized value.
- FIG. 5 is a block diagram showing an internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to the present embodiment.
- the parameter quantization unit 105 in FIG. 5 employs a configuration having an allowable error range determiner 401 instead of the allowable error level determiner 201 of FIG. 3.
- the allowable error range determiner 401 determines an allowable error range based on the noise model estimated by the noise model estimator 301.
- the noise model estimator 301 estimates the variance of the degree of noise superposition for each element in the vector quantization.
- the code extractor 203 extracts, from the codebook 202, a code whose error from the parameter extracted by the voice analysis unit 104 is within the allowable error range.
- the transmission level is further improved as compared with the case where the noise level is used.
- High-quality speech coding can be realized.
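Embodiment 2 replaces the single allowable error with a per-element range derived from the noise model. A minimal sketch, assuming the noise model supplies a variance per vector element and that the range is a multiple of the standard deviation; `scale`, `minimum`, the per-code probability table, and the fallback are hypothetical.

```python
import math


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def allowable_ranges(noise_variance, scale=2.0, minimum=0.05):
    """Per-element tolerance: wider where the noise model reports more variance."""
    return [max(minimum, scale * math.sqrt(v)) for v in noise_variance]


def within_ranges(param, codevector, ranges):
    return all(abs(p - c) <= r for p, c, r in zip(param, codevector, ranges))


def select_code_with_noise_model(param, codebook, noise_variance, code_prob):
    """code_prob[i] stands in for the probability given by the source model."""
    ranges = allowable_ranges(noise_variance)
    candidates = [i for i, cv in enumerate(codebook)
                  if within_ranges(param, cv, ranges)]
    if not candidates:
        # Assumed fallback: plain nearest-neighbor search.
        return min(range(len(codebook)),
                   key=lambda i: squared_error(param, codebook[i]))
    return max(candidates, key=lambda i: code_prob[i])


codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
code_prob = [0.6, 0.3, 0.1]
param = [0.55, 0.02]
noise_on_first = select_code_with_noise_model(param, codebook, [0.09, 0.0001], code_prob)
noise_on_second = select_code_with_noise_model(param, codebook, [0.0001, 0.09], code_prob)
```

The two calls differ only in which element the noise model marks as noisy. When the uncertain element is the one in which the input deviates, more codes qualify and the source model chooses among them; otherwise the search stays close to the input.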
- FIG. 6 is a block diagram showing the internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to Embodiment 3 of the present invention.
- the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
- an error calculation weight determiner 501 determines, based on the noise level estimated by the noise level estimator 102 and the information source model, the weight for each element of the parameter that is used when calculating the error between an input parameter and a quantized value.
- weighting is performed so that the error value of a parameter element having a correlation with the power envelope of the adaptive source becomes small.
- Codebook 502 stores the quantization value corresponding to the transmission code.
- the quantizer 503 quantizes the parameters extracted by the speech analyzer 104 using the codebook 502, according to the weights determined by the error calculation weight determiner 501. In this way, the elements of the parameter are weighted based on the noise level and the information source model before the parameters are quantized, so that, without degrading performance for signals that contain no noise, the apparatus is robust against noise signal components and can realize high-quality speech encoding even in a background noise environment.
- the weighting process can be performed using the noise model described in the second embodiment.
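A weighted-error quantizer in the spirit of Embodiment 3 can be sketched as follows. The source does not give the weighting rule, so the formula in `error_weights` and the per-element `sensitivity` values are assumptions: elements assumed to be more affected by noise receive smaller weights as the noise level rises, and all weights equal 1 when the noise level is zero.

```python
def error_weights(noise_level, sensitivity):
    """Hypothetical rule: down-weight noise-sensitive elements as noise grows."""
    return [1.0 / (1.0 + noise_level * s) for s in sensitivity]


def weighted_error(param, codevector, weights):
    return sum(w * (p - c) ** 2 for w, p, c in zip(weights, param, codevector))


def quantize_weighted(param, codebook, weights):
    """Return the code that minimizes the weighted squared error."""
    return min(range(len(codebook)),
               key=lambda i: weighted_error(param, codebook[i], weights))


codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sensitivity = [10.0, 0.0]  # assumed: only the first element is noise-sensitive
param = [0.6, 0.55]
quiet_code = quantize_weighted(param, codebook, error_weights(0.0, sensitivity))
noisy_code = quantize_weighted(param, codebook, error_weights(1.0, sensitivity))
```

At zero noise the search reduces to the ordinary Euclidean one, which matches the stated goal of not degrading performance on noise-free input. At the higher noise level the unreliable first element counts for less, and the code that best matches the second element wins.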
- FIG. 7 is a block diagram showing an internal configuration of the parameter quantization unit 105 of the speech coding apparatus according to Embodiment 4 of the present invention.
- the configuration of the speech coding apparatus according to the present embodiment is the same as the configuration of the speech coding apparatus shown in FIG. 2 of Embodiment 1, and a description thereof will be omitted.
- the code appearance probability calculator 601 estimates, from the noise level estimated by the noise level estimation unit 102 and the information source model, the appearance probability of each parameter quantization value for the case where the input signal contains no noise.
- Codebook 602 stores the quantization value corresponding to the transmission code.
- the quantizer 603 quantizes the parameters extracted by the speech analysis unit 104 using the codebook 602, according to the likelihood obtained by combining the appearance probability estimated by the code appearance probability calculator 601 with the error value.
- performance for signals that contain no noise is not degraded, and the apparatus is robust against noise signal components and can achieve high-quality speech coding even in a background noise environment.
- the weighting process can be performed using the noise model described in the second embodiment.
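One possible reading of the Embodiment 4 quantizer is a maximum-likelihood search that adds a log appearance probability to an error term. The sketch below assumes a Gaussian error model whose variance is set by the noise level, and it takes the appearance probabilities as given; both choices, along with `min_variance`, are assumptions rather than details from the source.

```python
import math


def log_likelihood(param, codevector, prob, noise_variance):
    """Log appearance probability plus a Gaussian log-likelihood of the error."""
    err = sum((p - c) ** 2 for p, c in zip(param, codevector))
    return math.log(prob) - err / (2.0 * noise_variance)


def quantize_map(param, codebook, code_prob, noise_variance, min_variance=1e-3):
    """Return the code with the highest combined likelihood.

    code_prob[i] must be positive; min_variance keeps the error term finite
    when the estimated noise variance is zero.
    """
    var = max(noise_variance, min_variance)
    return max(range(len(codebook)),
               key=lambda i: log_likelihood(param, codebook[i], code_prob[i], var))


codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
code_prob = [0.8, 0.1, 0.1]
param = [0.6, 0.0]
quiet_code = quantize_map(param, codebook, code_prob, noise_variance=0.0)
noisy_code = quantize_map(param, codebook, code_prob, noise_variance=0.5)
```

When the noise variance is small the error term dominates and the result is the nearest code. As the variance grows the appearance probability carries more weight, so a likely code can be chosen over a slightly closer but unlikely one.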
- the dependence on the accuracy of the noise model is small, the apparatus is robust against noise signal components, and high-quality speech encoding processing can be realized even in a background noise environment.
- the present specification is based on Japanese Patent Application No. 11-281466 filed on Oct. 1, 1999, the entire content of which is incorporated herein.
- the present invention is suitable for use in a communication device of a wireless communication system, such as a mobile phone.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU74473/00A AU7447300A (en) | 1999-10-01 | 2000-09-28 | Voice encoding device and voice encoding method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP11/281466 | 1999-10-01 | ||
JP28146699A JP3315956B2 (ja) | 1999-10-01 | 1999-10-01 | 音声符号化装置及び音声符号化方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001026094A1 (fr) | 2001-04-12 |
Family
ID=17639585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2000/006689 WO2001026094A1 (fr) | Voice encoding device and voice encoding method | 1999-10-01 | 2000-09-28 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP3315956B2 (fr) |
AU (1) | AU7447300A (fr) |
WO (1) | WO2001026094A1 (fr) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3079151A1 (fr) * | 2015-04-09 | 2016-10-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Codeur audio et procédé de codage d'un signal audio |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08328598A (ja) * | 1995-05-26 | 1996-12-13 | Sanyo Electric Co Ltd | 音声符号化・復号化装置 |
JPH10307598A (ja) * | 1997-05-09 | 1998-11-17 | Hitachi Ltd | 音声符号化送信装置 |
1999
- 1999-10-01 JP JP28146699A patent/JP3315956B2/ja not_active Expired - Fee Related
2000
- 2000-09-28 AU AU74473/00A patent/AU7447300A/en not_active Abandoned
- 2000-09-28 WO PCT/JP2000/006689 patent/WO2001026094A1/fr active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JP3315956B2 (ja) | 2002-08-19 |
AU7447300A (en) | 2001-05-10 |
JP2001109496A (ja) | 2001-04-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR100636317B1 (ko) | 분산 음성 인식 시스템 및 그 방법 | |
JP4491210B2 (ja) | 再帰的構成における反復ノイズ推定法 | |
KR101201146B1 (ko) | 최적의 추정을 위한 중요한 양으로서 순간적인 신호 대 잡음비를 사용하는 잡음 감소 방법 | |
FI110726B (fi) | Äänen aktiivisuuden ilmaisu | |
EP2617029B1 (fr) | Estimation de pitch lag | |
JP4316583B2 (ja) | 特徴量補正装置、特徴量補正方法および特徴量補正プログラム | |
KR20070001276A (ko) | 신호 인코딩 | |
AU2004229048A1 (en) | Method and apparatus for multi-sensory speech enhancement | |
JP4875249B2 (ja) | 自動音声認識実行方法 | |
US8046215B2 (en) | Method and apparatus to detect voice activity by adding a random signal | |
US6389389B1 (en) | Speech recognition using unequally-weighted subvector error measures for determining a codebook vector index to represent plural speech parameters | |
JP2005535920A (ja) | バックエンドの音声検出装置を有する配信音声認識および方法 | |
CN101421780A (zh) | 音频编码和解码中的激励处理 | |
JP3478209B2 (ja) | 音声信号復号方法及び装置と音声信号符号化復号方法及び装置と記録媒体 | |
JPH04270398A (ja) | 音声符号化方式 | |
EP1672619A2 (fr) | Dispositif et procédé de codage de la parole | |
EP1199712B1 (fr) | Procédé pour la réduction du bruit | |
JP4543283B2 (ja) | 無線機同定方法及び装置 | |
JP2002530704A (ja) | 分散音声認識プロセスにおけるエラーの軽減方法および装置 | |
US20090198489A1 (en) | Method and apparatus for frequency encoding, and method and apparatus for frequency decoding | |
EP2617034B1 (fr) | Détermination d'énergie de cycle de fréquence fondamentale et mise à l'échelle d'un signal d'excitation | |
US7164719B2 (en) | System to reduce distortion due to coding with a sample-by-sample quantizer | |
JP3315956B2 (ja) | 音声符号化装置及び音声符号化方法 | |
JPH0844395A (ja) | 音声ピッチ検出装置 | |
JP4603429B2 (ja) | クライアント・サーバ音声認識方法、サーバ計算機での音声認識方法、音声特徴量抽出・送信方法、これらの方法を用いたシステム、装置、プログラムおよび記録媒体 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| WWE | Wipo information: entry into national phase | Ref document number: 09856553. Country of ref document: US |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| 122 | Ep: pct application non-entry in european phase | |