AU2001273410A1 - Method and apparatus for constructing voice templates for a speaker-independent voice recognition system

Info

Publication number
AU2001273410A1
AU2001273410A1 AU2001273410A AU7341001A AU2001273410A1 AU 2001273410 A1 AU2001273410 A1 AU 2001273410A1 AU 2001273410 A AU2001273410 A AU 2001273410A AU 7341001 A AU7341001 A AU 7341001A AU 2001273410 A1 AU2001273410 A1 AU 2001273410A1
Authority
AU
Australia
Prior art keywords
utterances
generate
speaker
recognition system
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2001273410A
Inventor
Ning Bi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-07-13
Filing date
2001-07-11
Publication date
2002-01-30
Application filed by Qualcomm Inc
Publication of AU2001273410A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Machine Translation (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Audible-Bandwidth Dynamoelectric Transducers Other Than Pickups (AREA)

Abstract

A method and apparatus for constructing voice templates for a speaker-independent voice recognition system includes segmenting a training utterance to generate time-clustered segments, each segment being represented by a mean. The means for all utterances of a given word are quantized to generate template vectors. Each template vector is compared with testing utterances to generate a comparison result. The comparison is typically a dynamic time warping computation. The training utterances are matched with the template vectors if the comparison result exceeds at least one predefined threshold value, to generate an optimal path result, and the training utterances are partitioned in accordance with the optimal path result. The partitioning is typically a K-means segmentation computation. The partitioned utterances may then be re-quantized and re-compared with the testing utterances until the at least one predefined threshold value is not exceeded.
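
The loop described in the abstract (segment, average, compare by dynamic time warping, re-partition along the optimal path, repeat) can be illustrated with a minimal Python sketch. This is not the patented implementation, and every function name here is hypothetical. It makes several simplifying assumptions: the initial segmentation is a uniform split rather than time clustering, the quantizer is reduced to one pooled mean per segment, the warping path may only stay in a segment or advance by one, and the loop stops when the total distance over the training utterances no longer decreases, standing in for the threshold test against testing utterances.

```python
from math import dist, inf

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def uniform_segments(n_frames, k):
    """Label each frame with one of k contiguous, near-equal segments (needs n_frames >= k)."""
    return [t * k // n_frames for t in range(n_frames)]

def template_from(utterances, labels, k):
    """Build one template vector per segment by pooling frames over all utterances."""
    pools = [[] for _ in range(k)]
    for frames, frame_labels in zip(utterances, labels):
        for frame, seg in zip(frames, frame_labels):
            pools[seg].append(frame)
    return [mean_vector(pool) for pool in pools]

def dtw_align(template, frames):
    """Return (total distance, segment label per frame) for the best monotonic path."""
    k, n = len(template), len(frames)
    cost = [[inf] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    cost[0][0] = dist(frames[0], template[0])  # the path must start in segment 0
    for t in range(1, n):
        for s in range(k):
            stay = cost[t - 1][s]
            advance = cost[t - 1][s - 1] if s > 0 else inf
            if advance < stay:
                best, back[t][s] = advance, s - 1
            else:
                best, back[t][s] = stay, s
            cost[t][s] = best + dist(frames[t], template[s])
    labels = [k - 1]  # the path must end in the last segment
    for t in range(n - 1, 0, -1):
        labels.append(back[t][labels[-1]])
    labels.reverse()
    return cost[n - 1][k - 1], labels

def train_template(utterances, k, max_iters=20):
    """Alternate DTW alignment and re-averaging until the total distance stops improving."""
    labels = [uniform_segments(len(u), k) for u in utterances]
    template = template_from(utterances, labels, k)
    best_template, best_total = template, inf
    for _ in range(max_iters):
        results = [dtw_align(template, u) for u in utterances]
        total = sum(c for c, _ in results)
        if total >= best_total:
            break
        best_template, best_total = template, total
        # Re-partition every utterance along its optimal path, then re-average.
        template = template_from(utterances, [lab for _, lab in results], k)
    return best_template, best_total

# Toy data: two utterances of one word, with 1-D "feature vectors" per frame.
utterances = [
    [[0.0], [0.1], [5.0], [5.1], [5.2], [9.9]],
    [[0.2], [4.9], [10.0], [10.1]],
]
template, total = train_template(utterances, k=3)
print(template, total)
```

On the toy data the uniform split places some frames in the wrong segment, and the alignment pass moves them so that each template vector settles near one of the three plateaus. Real features would be multi-dimensional spectral vectors, and a production system would need the time-clustering and quantization steps that this sketch replaces with simpler stand-ins.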

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US09/615,572 US6735563B1 (en) 2000-07-13 2000-07-13 Method and apparatus for constructing voice templates for a speaker-independent voice recognition system
US09615572 2000-07-13
PCT/US2001/022009 WO2002007145A2 (en) 2000-07-13 2001-07-11 Method and apparatus for constructing voice templates for a speaker-independent voice recognition system

Publications (1)

Publication Number Publication Date
AU2001273410A1 (en) 2002-01-30

Family

ID=24465970

Family Applications (1)

Application Number Priority Date Filing Date Title
AU2001273410A Abandoned AU2001273410A1 (en) 2000-07-13 2001-07-11 Method and apparatus for constructing voice templates for a speaker-independent voice recognition system

Country Status (13)

Country Link
US (1) US6735563B1 (en)
EP (1) EP1301919B1 (en)
JP (1) JP4202124B2 (en)
KR (1) KR100766761B1 (en)
CN (1) CN1205601C (en)
AT (1) ATE345562T1 (en)
AU (1) AU2001273410A1 (en)
BR (1) BR0112405A (en)
DE (1) DE60124551T2 (en)
ES (1) ES2275700T3 (en)
HK (1) HK1056427A1 (en)
TW (1) TW514867B (en)
WO (1) WO2002007145A2 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6990446B1 (en) * 2000-10-10 2006-01-24 Microsoft Corporation Method and apparatus using spectral addition for speaker recognition
DE10127559A1 (en) 2001-06-06 2002-12-12 Philips Corp Intellectual Pty User group-specific pattern processing system, e.g. for telephone banking systems, involves using specific pattern processing data record for the user group
TW541517B (en) * 2001-12-25 2003-07-11 Univ Nat Cheng Kung Speech recognition system
KR100533601B1 (en) * 2002-12-05 2005-12-06 베스티안파트너스(주) A method for deciding a gender of a speaker in a speaker-independent speech recognition system of a mobile phone
US7509257B2 (en) * 2002-12-24 2009-03-24 Marvell International Ltd. Method and apparatus for adapting reference templates
WO2005026043A2 (en) 2003-07-29 2005-03-24 Intelligent Energy, Inc. Methods for providing thin hydrogen separation membranes and associated uses
US7389233B1 (en) * 2003-09-02 2008-06-17 Verizon Corporate Services Group Inc. Self-organizing speech recognition for information extraction
KR100827074B1 (en) * 2004-04-06 2008-05-02 삼성전자주식회사 Apparatus and method for automatic dialling in a mobile portable telephone
US7914468B2 (en) * 2004-09-22 2011-03-29 Svip 4 Llc Systems and methods for monitoring and modifying behavior
US8219391B2 (en) * 2005-02-15 2012-07-10 Raytheon Bbn Technologies Corp. Speech analyzing system with speech codebook
CN1963918A (en) * 2005-11-11 2007-05-16 株式会社东芝 Compress of speaker cyclostyle, combination apparatus and method and authentication of speaker
US8612229B2 (en) * 2005-12-15 2013-12-17 Nuance Communications, Inc. Method and system for conveying an example in a natural language understanding application
JP4745094B2 (en) * 2006-03-20 2011-08-10 富士通株式会社 Clustering system, clustering method, clustering program, and attribute estimation system using clustering system
US20070276668A1 (en) * 2006-05-23 2007-11-29 Creative Technology Ltd Method and apparatus for accessing an audio file from a collection of audio files using tonal matching
US8532984B2 (en) 2006-07-31 2013-09-10 Qualcomm Incorporated Systems, methods, and apparatus for wideband encoding and decoding of active frames
US8239190B2 (en) 2006-08-22 2012-08-07 Qualcomm Incorporated Time-warping frames of wideband vocoder
TWI349266B (en) * 2007-04-13 2011-09-21 Qisda Corp Voice recognition system and method
CN101465123B (en) * 2007-12-20 2011-07-06 株式会社东芝 Verification method and device for speaker authentication and speaker authentication system
US20120168331A1 (en) * 2010-12-30 2012-07-05 Safecode Drug Technologies Corp. Voice template protector for administering medicine
CN102623008A (en) * 2011-06-21 2012-08-01 中国科学院苏州纳米技术与纳米仿生研究所 Voiceprint identification method
CN105989849B (en) * 2015-06-03 2019-12-03 乐融致新电子科技(天津)有限公司 A kind of sound enhancement method, audio recognition method, clustering method and device
US10134425B1 (en) * 2015-06-29 2018-11-20 Amazon Technologies, Inc. Direction-based speech endpointing
KR101901965B1 (en) * 2017-01-12 2018-09-28 엘에스산전 주식회사 apparatus FOR CREATING SCREEN OF PROJECT
KR102509821B1 (en) * 2017-09-18 2023-03-14 삼성전자주식회사 Method and apparatus for generating oos(out-of-service) sentence
CN110706710A (en) * 2018-06-25 2020-01-17 普天信息技术有限公司 Voice recognition method and device, electronic equipment and storage medium
CN109801622B (en) * 2019-01-31 2020-12-22 嘉楠明芯(北京)科技有限公司 Speech recognition template training method, speech recognition method and speech recognition device
CN111063348B (en) * 2019-12-13 2022-06-07 腾讯科技(深圳)有限公司 Information processing method, device and equipment and computer storage medium

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4415767A (en) * 1981-10-19 1983-11-15 Votan Method and apparatus for speech recognition and reproduction
CA1261472A (en) 1985-09-26 1989-09-26 Yoshinao Shiraki Reference speech pattern generating method
US4797929A (en) * 1986-01-03 1989-01-10 Motorola, Inc. Word recognition in a speech recognition system using data reduced word templates
CA1299750C (en) * 1986-01-03 1992-04-28 Ira Alan Gerson Optimal method of data reduction in a speech recognition system
US4855910A (en) * 1986-10-22 1989-08-08 North American Philips Corporation Time-clustered cardio-respiratory encoder and method for clustering cardio-respiratory signals
US5226084A (en) * 1990-12-05 1993-07-06 Digital Voice Systems, Inc. Methods for speech quantization and error correction
AU671952B2 (en) 1991-06-11 1996-09-19 Qualcomm Incorporated Variable rate vocoder
US5337394A (en) * 1992-06-09 1994-08-09 Kurzweil Applied Intelligence, Inc. Speech recognizer
US5682464A (en) * 1992-06-29 1997-10-28 Kurzweil Applied Intelligence, Inc. Word model candidate preselection for speech recognition using precomputed matrix of thresholded distance values
JP3336754B2 (en) * 1994-08-19 2002-10-21 ソニー株式会社 Digital video signal recording method and recording apparatus
US5839103A (en) * 1995-06-07 1998-11-17 Rutgers, The State University Of New Jersey Speaker verification system using decision fusion logic
JP3180655B2 (en) * 1995-06-19 2001-06-25 日本電信電話株式会社 Word speech recognition method by pattern matching and apparatus for implementing the method
KR0169414B1 (en) * 1995-07-01 1999-01-15 김광호 Multi-channel serial interface control circuit
US6519561B1 (en) * 1997-11-03 2003-02-11 T-Netix, Inc. Model adaptation of neural tree networks and other fused models for speaker verification
US6278972B1 (en) * 1999-01-04 2001-08-21 Qualcomm Incorporated System and method for segmentation and recognition of speech signals
US6266643B1 (en) * 1999-03-03 2001-07-24 Kenneth Canfield Speeding up audio without changing pitch by comparing dominant frequencies
US6510534B1 (en) * 2000-06-29 2003-01-21 Logicvision, Inc. Method and apparatus for testing high performance circuits

Also Published As

Publication number Publication date
ES2275700T3 (en) 2007-06-16
US6735563B1 (en) 2004-05-11
ATE345562T1 (en) 2006-12-15
EP1301919B1 (en) 2006-11-15
KR20030014332A (en) 2003-02-15
KR100766761B1 (en) 2007-10-17
TW514867B (en) 2002-12-21
JP4202124B2 (en) 2008-12-24
WO2002007145A2 (en) 2002-01-24
DE60124551T2 (en) 2007-09-06
CN1441947A (en) 2003-09-10
WO2002007145A3 (en) 2002-05-23
DE60124551D1 (en) 2006-12-28
CN1205601C (en) 2005-06-08
HK1056427A1 (en) 2004-02-13
BR0112405A (en) 2003-12-30
JP2004504641A (en) 2004-02-12
EP1301919A2 (en) 2003-04-16

Similar Documents

Publication Publication Date Title
AU2001273410A1 (en) Method and apparatus for constructing voice templates for a speaker-independent voice recognition system
US4918732A (en) Frame comparison method for word recognition in high noise environments
EP0413361B1 (en) Speech-recognition circuitry employing nonlinear processing, speech element modelling and phoneme estimation
Stolcke et al. Highly accurate phonetic segmentation using boundary correction models and system fusion
Sugamura et al. Isolated word recognition using phoneme-like templates
Paliwal Lexicon-building methods for an acoustic sub-word based speech recognizer
EP1005019A3 (en) Segment-based similarity measurement method for speech recognition
Karnjanadecha et al. Signal modeling for high-performance robust isolated word recognition
Hon et al. Towards large vocabulary Mandarin Chinese speech recognition
Yokoya et al. Recovery of superquadric primitives from a range image using simulated annealing
Euler et al. Statistical segmentation and word modeling techniques in isolated word recognition
Tian et al. Tone recognition with fractionized models and outlined features
Barai et al. An ASR system using MFCC and VQ/GMM with emphasis on environmental dependency
EP0255529A4 (en) Frame comparison method for word recognition in high noise environments.
JPH01202798A (en) Voice recognizing method
Makino et al. Utilizing state-level distance vector representation for improved spoken term detection by text and spoken queries.
Ban et al. Speaking rate dependent multiple acoustic models using continuous frame rate normalization
Kockmann et al. Contour modeling of prosodic and acoustic features for speaker recognition
Ma et al. Acoustic segment modeling for speaker recognition
Aktas et al. Large-vocabulary isolated word recognition with fast coarse time alignment
Singh et al. Effect of MFCC based features for speech signal alignments
Diwakar et al. Repetition detection in dysarthric speech
Hattori et al. Supplementation of HMM for articulatory variation in speaker adaptation
Radfar et al. A joint identification-separation technique for single channel speech separation
Zhou et al. Multisegment multiple VQ codebooks-based speaker independent isolated-word recognition using unbiased mel cepstrum