CA2219358A1 - Quantization of speech signals using human auditory models in predictive coding systems
- Publication number
- CA2219358A1
- Authority
- CA
- Canada
- Prior art keywords
- signal
- speech
- pitch
- lpc
- processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
  - G10L19/002—Dynamic bit allocation
  - G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    - G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
  - G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    - G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
      - G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The present invention relates to a speech compression system called "Transform Predictive Coding" (TPC), which encodes 7 kHz wideband speech (sampled at 16 kHz) at a target bit rate of 16 to 32 kbit/s, that is, 1 to 2 bits per sample. The system uses short-term and long-term prediction to remove redundancy in the speech signal. The prediction residual is transformed and coded in the frequency domain, as shown in the figure, by the transform processor (110), which takes time-domain data from the adder (60) and parameters from the shaping filter magnitude response processor (100), so that the spectrum is shaped with human auditory perception in mind. The TPC coder uses only open-loop quantization, as shown by the inclusion of the pitch extractor/interpolator (70), and therefore has fairly low complexity. Speech quality is transparent at 32 kbit/s, very good at 24 kbit/s, and acceptable at 16 kbit/s.
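The abstract's arithmetic and its first two processing stages can be illustrated with a minimal Python sketch. It is not the patented coder. It assumes an order-2 short-term predictor fitted by the autocorrelation method (Levinson-Durbin), a synthetic sinusoidal frame in place of speech, and a naive DFT as the transform. Long-term (pitch) prediction, perceptual shaping, and quantization are omitted. All function names are hypothetical.

```python
import cmath
import math


def bit_rate_kbps(sample_rate_hz, bits_per_sample):
    """Bit rate in kbit/s implied by a sampling rate and an average bit budget."""
    return sample_rate_hz * bits_per_sample / 1000.0


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame (zeros assumed outside it)."""
    return [sum(x[i] * x[i - lag] for i in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return a[1..order] so that x_hat[n] = sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predicted frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def prediction_residual(x, coeffs):
    """Short-term prediction residual e[n] = x[n] - sum_k a[k] * x[n - k]."""
    res = []
    for n in range(len(x)):
        pred = sum(c * x[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        res.append(x[n] - pred)
    return res


def dft(x):
    """Naive O(N^2) discrete Fourier transform, enough for a short demo frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]


def energy(x):
    """Sum of squared magnitudes of a real or complex sequence."""
    return sum(abs(v) ** 2 for v in x)


# Hypothetical test frame: a 64-sample sinusoid standing in for voiced speech.
frame = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
lpc = levinson_durbin(autocorrelation(frame, 2), 2)  # order 2 is a demo choice
residual = prediction_residual(frame, lpc)
spectrum = dft(residual)  # the coefficients a transform coder would quantize

print(bit_rate_kbps(16000, 1), bit_rate_kbps(16000, 2))  # 16.0 32.0
print(energy(residual) < energy(frame))
```

At 16 kHz sampling, 1 and 2 bits per sample give 16 and 32 kbit/s, which matches the range quoted in the abstract, and 1.5 bits per sample gives the intermediate 24 kbit/s point. The residual carries less energy than the input frame, which is why the coder transforms and quantizes the residual and not the waveform itself.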
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US1229696P | 1996-02-26 | 1996-02-26 | |
US60/012,296 | 1996-02-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2219358A1 | 1997-08-28 |
Family
ID=21754300
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA 2219358 Abandoned CA2219358A1 | 1996-02-26 | 1997-02-26 | Quantization of speech signals using human auditory models in predictive coding systems |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0954851A1 (fr) |
JP (1) | JPH11504733A (fr) |
CA (1) | CA2219358A1 (fr) |
MX (1) | MX9708203A (fr) |
WO (1) | WO1997031367A1 (fr) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6397178B1 (en) | 1998-09-18 | 2002-05-28 | Conexant Systems, Inc. | Data organizational scheme for enhanced selection of gain parameters for speech coding |
US6778953B1 (en) * | 2000-06-02 | 2004-08-17 | Agere Systems Inc. | Method and apparatus for representing masked thresholds in a perceptual audio coder |
WO2002091363A1 * | 2001-05-08 | 2002-11-14 | Koninklijke Philips Electronics N.V. | Audio coding |
US7451091B2 (en) | 2003-10-07 | 2008-11-11 | Matsushita Electric Industrial Co., Ltd. | Method for determining time borders and frequency resolutions for spectral envelope coding |
DE102006022346B4 * | 2006-05-12 | 2008-02-28 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Information signal coding |
KR101696632B1 | 2010-07-02 | 2017-01-16 | 돌비 인터네셔널 에이비 | Selective bass post filter |
WO2012161675A1 * | 2011-05-20 | 2012-11-29 | Google Inc. | Redundant coding unit for audio codec |
WO2013062201A1 * | 2011-10-24 | 2013-05-02 | 엘지전자 주식회사 | Method and device for quantizing speech signals by band selection |
CN111862995A * | 2020-06-22 | 2020-10-30 | 北京达佳互联信息技术有限公司 | Bit rate determination model training method, bit rate determination method, and apparatus |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5012517A (en) * | 1989-04-18 | 1991-04-30 | Pacific Communication Science, Inc. | Adaptive transform coder having long term predictor |
FR2700632B1 * | 1993-01-21 | 1995-03-24 | France Telecom | Predictive coding-decoding system for a digital speech signal using an adaptive transform with embedded codes. |
1997
- 1997-02-26 EP EP97907830A (EP0954851A1): Withdrawn (not active)
- 1997-02-26 WO PCT/US1997/002898 (WO1997031367A1): Application Discontinuation (not active)
- 1997-02-26 JP JP9530382A (JPH11504733A): Pending (active)
- 1997-02-26 CA CA 2219358 (CA2219358A1): Abandoned (not active)
- 1997-02-26 MX MX9708203A (MX9708203A): status unknown
Also Published As
Publication number | Publication date |
---|---|
JPH11504733A (ja) | 1999-04-27 |
EP0954851A4 (fr) | 1999-11-10 |
EP0954851A1 (fr) | 1999-11-10 |
MX9708203A (es) | 1997-12-31 |
WO1997031367A1 (fr) | 1997-08-28 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CA2185746C | Perceptual noise masking method based on the frequency response of a synthesis filter | |
CA2185731C | Quantization of speech signals using human auditory models in predictive coding systems | |
US6014621A | Synthesis of speech signals in the absence of coded parameters | |
Gersho | Advances in speech and audio compression | |
US6735567B2 | Encoding and decoding speech signals variably based on signal classification | |
US6574593B1 | Codebook tables for encoding and decoding | |
Paliwal et al. | Vector quantization of LPC parameters in the presence of channel errors | |
Chen et al. | Real-time vector APC speech coding at 4800 bps with adaptive postfiltering | |
US6961698B1 | Multi-mode bitstream transmission protocol of encoded voice signals with embedded characteristics | |
JP3490685B2 | Method and apparatus for adaptive bandwidth pitch search in coding wideband signals | |
US4969192A | Vector adaptive predictive coder for speech and audio | |
US6098036A | Speech coding system and method including spectral formant enhancer | |
CA2140329C | Decomposition into noise and periodic signals in waveform interpolation | |
KR100304092B1 | Audio signal coding apparatus, audio signal decoding apparatus, and audio signal coding/decoding apparatus | |
US5699382A | Method for noise weighting filtering | |
US6119082A | Speech coding system and method including harmonic generator having an adaptive phase off-setter | |
US6081776A | Speech coding system and method including adaptive finite impulse response filter | |
JP4176349B2 | Multimode speech encoder | |
US6138092A | CELP speech synthesizer with epoch-adaptive harmonic generator for pitch harmonics below voicing cutoff frequency | |
MXPA96004161A | Quantization of speech signals using human auditory models in predictive coding systems | |
KR20030046451A | Codebook structure and search method for speech coding | |
Ordentlich et al. | Low-delay code-excited linear-predictive coding of wideband speech at 32 kbps | |
CA2219358A1 | Quantization of speech signals using human auditory models in predictive coding systems | |
JPH01261930A | Post noise shaping filter for speech decoder | |
CA2303711C | Filtering method for noise weighting | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | EEER | Examination request | |
 | FZDE | Dead | |