CA2450230A1 - Speech feature extraction system - Google Patents
Speech feature extraction system
- Publication number
- CA2450230A1
- Authority
- CA
- Canada
- Prior art keywords
- signal
- frequency
- filter
- filters
- band
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 238000000605 extraction Methods 0.000 title description 31
- 238000012545 processing Methods 0.000 claims abstract description 35
- 238000000034 method Methods 0.000 claims description 32
- 239000013598 vector Substances 0.000 claims description 18
- 238000001914 filtration Methods 0.000 claims description 11
- 238000005070 sampling Methods 0.000 claims description 6
- 230000003111 delayed effect Effects 0.000 claims description 3
- 239000000284 extract Substances 0.000 abstract 1
- 238000010586 diagram Methods 0.000 description 4
- 230000008569 process Effects 0.000 description 3
- 230000001133 acceleration Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000010606 normalization Methods 0.000 description 2
- 238000000513 principal component analysis Methods 0.000 description 2
- 238000001228 spectrum Methods 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 230000002411 adverse Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000005236 sound signal Effects 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 238000013179 statistical model Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Alarm Systems (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
- Sorting Of Articles (AREA)
- Machine Translation (AREA)
- Telephonic Communication Services (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/882,744 US6493668B1 (en) | 2001-06-15 | 2001-06-15 | Speech feature extraction system |
| US09/882,744 | 2001-06-15 | | |
| PCT/US2002/019182 WO2002103676A1 (en) | 2001-06-15 | 2002-06-14 | Speech feature extraction system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CA2450230A1 (en) | 2002-12-27 |
Family
ID=25381249
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CA002450230A Abandoned CA2450230A1 (en) | 2001-06-15 | 2002-06-14 | Speech feature extraction system |
Country Status (7)
| Country | Link |
|---|---|
| US (2) | US6493668B1 |
| EP (1) | EP1402517B1 |
| JP (1) | JP4177755B2 |
| AT (1) | ATE421137T1 |
| CA (1) | CA2450230A1 |
| DE (1) | DE60230871D1 |
| WO (1) | WO2002103676A1 |
Families Citing this family (37)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP3673507B2 (ja) * | 2002-05-16 | 2005-07-20 | 独立行政法人科学技術振興機構 | Apparatus and program for determining a part of a speech waveform that indicates its features with high reliability, apparatus and program for determining a part of a speech signal that indicates its features with high reliability, and pseudo-syllable nucleus extraction apparatus and program |
| JP4265908B2 (ja) * | 2002-12-12 | 2009-05-20 | アルパイン株式会社 | Speech recognition device and method for improving speech recognition performance |
| DE102004008225B4 (de) * | 2004-02-19 | 2006-02-16 | Infineon Technologies Ag | Method and device for determining feature vectors from a signal for pattern recognition, method and device for pattern recognition, and computer-readable storage media |
| US20070041517A1 (en) * | 2005-06-30 | 2007-02-22 | Pika Technologies Inc. | Call transfer detection method using voice identification techniques |
| US20070118364A1 (en) * | 2005-11-23 | 2007-05-24 | Wise Gerald B | System for generating closed captions |
| US20070118372A1 (en) * | 2005-11-23 | 2007-05-24 | General Electric Company | System and method for generating closed captions |
| US8345890B2 (en) | 2006-01-05 | 2013-01-01 | Audience, Inc. | System and method for utilizing inter-microphone level differences for speech enhancement |
| US8204252B1 (en) | 2006-10-10 | 2012-06-19 | Audience, Inc. | System and method for providing close microphone adaptive array processing |
| US8744844B2 (en) | 2007-07-06 | 2014-06-03 | Audience, Inc. | System and method for adaptive intelligent noise suppression |
| US8194880B2 (en) | 2006-01-30 | 2012-06-05 | Audience, Inc. | System and method for utilizing omni-directional microphones for speech enhancement |
| US9185487B2 (en) | 2006-01-30 | 2015-11-10 | Audience, Inc. | System and method for providing noise suppression utilizing null processing noise subtraction |
| US7778831B2 (en) * | 2006-02-21 | 2010-08-17 | Sony Computer Entertainment Inc. | Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch |
| US8204253B1 (en) | 2008-06-30 | 2012-06-19 | Audience, Inc. | Self calibration of audio device |
| US20080010067A1 (en) * | 2006-07-07 | 2008-01-10 | Chaudhari Upendra V | Target specific data filter to speed processing |
| US8259926B1 (en) | 2007-02-23 | 2012-09-04 | Audience, Inc. | System and method for 2-channel and 3-channel acoustic echo cancellation |
| US8189766B1 (en) | 2007-07-26 | 2012-05-29 | Audience, Inc. | System and method for blind subband acoustic echo cancellation postfiltering |
| PL2186086T3 (pl) | 2007-08-27 | 2013-07-31 | Ericsson Telefon Ab L M | Adaptive transition frequency between noise fill and bandwidth extension |
| US20090150164A1 (en) * | 2007-12-06 | 2009-06-11 | Hu Wei | Tri-model audio segmentation |
| US8180064B1 (en) | 2007-12-21 | 2012-05-15 | Audience, Inc. | System and method for providing voice equalization |
| US8194882B2 (en) | 2008-02-29 | 2012-06-05 | Audience, Inc. | System and method for providing single microphone noise suppression fallback |
| US8355511B2 (en) | 2008-03-18 | 2013-01-15 | Audience, Inc. | System and method for envelope-based acoustic echo cancellation |
| US8521530B1 (en) | 2008-06-30 | 2013-08-27 | Audience, Inc. | System and method for enhancing a monaural audio signal |
| US8626516B2 (en) * | 2009-02-09 | 2014-01-07 | Broadcom Corporation | Method and system for dynamic range control in an audio processing system |
| US9838784B2 (en) | 2009-12-02 | 2017-12-05 | Knowles Electronics, Llc | Directional audio capture |
| US9008329B1 (en) | 2010-01-26 | 2015-04-14 | Audience, Inc. | Noise reduction using multi-feature cluster tracker |
| US8767978B2 (en) | 2011-03-25 | 2014-07-01 | The Intellisis Corporation | System and method for processing sound signals implementing a spectral motion transform |
| US8548803B2 (en) * | 2011-08-08 | 2013-10-01 | The Intellisis Corporation | System and method of processing a sound signal including transforming the sound signal into a frequency-chirp domain |
| US9183850B2 (en) | 2011-08-08 | 2015-11-10 | The Intellisis Corporation | System and method for tracking sound pitch across an audio signal |
| US8620646B2 (en) | 2011-08-08 | 2013-12-31 | The Intellisis Corporation | System and method for tracking sound pitch across an audio signal using harmonic envelope |
| WO2013184667A1 (en) | 2012-06-05 | 2013-12-12 | Rank Miner, Inc. | System, method and apparatus for voice analytics of recorded audio |
| US9536540B2 (en) | 2013-07-19 | 2017-01-03 | Knowles Electronics, Llc | Speech signal separation and synthesis based on auditory scene analysis and speech modeling |
| US9280968B2 (en) * | 2013-10-04 | 2016-03-08 | At&T Intellectual Property I, L.P. | System and method of using neural transforms of robust audio features for speech processing |
| DE112015004185T5 (de) | 2014-09-12 | 2017-06-01 | Knowles Electronics, Llc | Systems and methods for restoration of speech components |
| US9922668B2 (en) | 2015-02-06 | 2018-03-20 | Knuedge Incorporated | Estimating fractional chirp rate with multiple frequency representations |
| US9870785B2 (en) | 2015-02-06 | 2018-01-16 | Knuedge Incorporated | Determining features of harmonic signals |
| US9842611B2 (en) | 2015-02-06 | 2017-12-12 | Knuedge Incorporated | Estimating pitch using peak-to-peak distances |
| US9820042B1 (en) | 2016-05-02 | 2017-11-14 | Knowles Electronics, Llc | Stereo separation and directional suppression with omni-directional microphones |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4300229A (en) * | 1979-02-21 | 1981-11-10 | Nippon Electric Co., Ltd. | Transmitter and receiver for an othogonally multiplexed QAM signal of a sampling rate N times that of PAM signals, comprising an N/2-point offset fourier transform processor |
| US4221934A (en) * | 1979-05-11 | 1980-09-09 | Rca Corporation | Compandor for group of FDM signals |
| GB8307702D0 (en) * | 1983-03-21 | 1983-04-27 | British Telecomm | Digital band-split filter means |
| NL8400677A (nl) * | 1984-03-02 | 1985-10-01 | Philips Nv | Transmission system for the transfer of data signals in a modulation band. |
2001
- 2001-06-15 US US09/882,744 patent/US6493668B1/en not_active Expired - Lifetime
2002
- 2002-06-14 EP EP02744395A patent/EP1402517B1/en not_active Expired - Lifetime
- 2002-06-14 US US10/173,247 patent/US7013274B2/en not_active Expired - Lifetime
- 2002-06-14 CA CA002450230A patent/CA2450230A1/en not_active Abandoned
- 2002-06-14 WO PCT/US2002/019182 patent/WO2002103676A1/en not_active Ceased
- 2002-06-14 JP JP2003505912A patent/JP4177755B2/ja not_active Expired - Fee Related
- 2002-06-14 DE DE60230871T patent/DE60230871D1/de not_active Expired - Lifetime
- 2002-06-14 AT AT02744395T patent/ATE421137T1/de not_active IP Right Cessation
Also Published As
| Publication number | Publication date |
|---|---|
| DE60230871D1 (de) | 2009-03-05 |
| US7013274B2 (en) | 2006-03-14 |
| US20020198711A1 (en) | 2002-12-26 |
| JP4177755B2 (ja) | 2008-11-05 |
| JP2004531767A (ja) | 2004-10-14 |
| WO2002103676A1 (en) | 2002-12-27 |
| EP1402517B1 (en) | 2009-01-14 |
| US20030014245A1 (en) | 2003-01-16 |
| EP1402517A1 (en) | 2004-03-31 |
| EP1402517A4 (en) | 2007-04-25 |
| ATE421137T1 (de) | 2009-01-15 |
| US6493668B1 (en) | 2002-12-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US7013274B2 (en) | | Speech feature extraction system |
| JP2004531767A5 | | |
| CA2247364C (en) | | Method and recognizer for recognizing a sampled sound signal in noise |
| US10134409B2 (en) | | Segmenting audio signals into auditory events |
| US6804643B1 (en) | | Speech recognition |
| CN109256127B (zh) | | Robust speech feature extraction method based on a nonlinear power-transform Gammachirp filter |
| CA2184256A1 (en) | | Speaker identification and verification system |
| AU2002252143A1 (en) | | Segmenting audio signals into auditory events |
| EP1393300A1 (en) | | Segmenting audio signals into auditory events |
| EP1093112B1 (en) | | A method for generating speech feature signals and an apparatus for carrying through this method |
| CN108564956A (zh) | | Voiceprint recognition method and apparatus, server, and storage medium |
| CN110767238B (zh) | | Blacklist identification method, apparatus, device, and storage medium based on address information |
| CN118522271B (zh) | | Immersive digital doctor evaluation method based on AI technology |
| Hammam et al. | | Blind signal separation with noise reduction for efficient speaker identification |
| Zouhir et al. | | Bionic Cepstral coefficients (BCC): A new auditory feature extraction to noise-robust speaker identification |
| CN112863517A (zh) | | Speech recognition method based on perceptual spectrum convergence rate |
| Huizen et al. | | Feature extraction with mel scale separation method on noise audio recordings |
| Zeremdini et al. | | Multi-pitch estimation based on multi-scale product analysis, improved comb filter and dynamic programming |
| Lalitha et al. | | An encapsulation of vital non-linear frequency features for various speech applications |
| Meriem et al. | | New front end based on multitaper and gammatone filters for robust speaker verification |
| JPS6229799B2 | | |
| Han et al. | | Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition. |
| JPH03122699A (ja) | | Noise removal device and speech recognition device using the same |
| Haddad et al. | | The matrix pencil and its applications to speech processing |
| CN117079666A (zh) | | Song scoring method, apparatus, terminal device, and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | EEER | Examination request | |
| | FZDE | Discontinued | |