JP2009527024A - Communication device having speaker-independent speech recognition - Google Patents

Communication device having speaker-independent speech recognition

Info

Publication number
JP2009527024A
Authority
JP
Japan
Prior art keywords
feature vector
vector
likelihood
word model
phonetic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2008555320A
Other languages
English (en)
Japanese (ja)
Inventor
Dietmar Ruwisch
Original Assignee
Intellectual Ventures Fund 21 LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2006-02-14
Filing date
2007-02-13
Publication date
2009-07-23
Application filed by Intellectual Ventures Fund 21 LLC
Publication of JP2009527024A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/12 Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271 Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Telephonic Communication Services (AREA)
  • Telephone Function (AREA)
JP2008555320A 2006-02-14 2007-02-13 Communication device having speaker-independent speech recognition Pending JP2009527024A (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US77357706P 2006-02-14 2006-02-14
PCT/US2007/003876 WO2007095277A2 (fr) 2006-02-14 2007-02-13 Communication device having speaker-independent speech recognition

Publications (1)

Publication Number Publication Date
JP2009527024A (ja) 2009-07-23

Family

ID=38328169

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
JP2008555320A Pending JP2009527024A (ja) 2006-02-14 2007-02-13 Communication device having speaker-independent speech recognition

Country Status (7)

Country Link
US (1) US20070203701A1 (fr)
EP (1) EP1994529B1 (fr)
JP (1) JP2009527024A (fr)
KR (1) KR20080107376A (fr)
CN (1) CN101385073A (fr)
AT (1) ATE536611T1 (fr)
WO (1) WO2007095277A2 (fr)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070225049A1 (en) * 2006-03-23 2007-09-27 Andrada Mauricio P Voice controlled push to talk system
US8521235B2 (en) * 2008-03-27 2013-08-27 General Motors Llc Address book sharing system and method for non-verbally adding address book contents using the same
US8515749B2 (en) * 2009-05-20 2013-08-20 Raytheon Bbn Technologies Corp. Speech-to-speech translation
US8626511B2 (en) * 2010-01-22 2014-01-07 Google Inc. Multi-dimensional disambiguation of voice commands
WO2013167934A1 (fr) 2012-05-07 2013-11-14 Mls Multimedia S.A. Methods and system implementing intelligent vocal name selection from directory lists composed in non-Latin alphabet languages
EP3825471A1 (fr) 2012-07-19 2021-05-26 Sumitomo (S.H.I.) Construction Machinery Co., Ltd. Shovel with multifunctional portable information device
US9401140B1 (en) * 2012-08-22 2016-07-26 Amazon Technologies, Inc. Unsupervised acoustic model training
CN107210038B (zh) * 2015-02-11 2020-11-10 Bang & Olufsen A/S Speaker recognition in a multimedia system
KR101684554B1 (ko) * 2015-08-20 2016-12-08 Hyundai Motor Company Voice dialing system and method thereof
EP3496090A1 (fr) * 2017-12-07 2019-06-12 Thomson Licensing Device and method for privacy-preserving vocal interaction
JP7173049B2 (ja) * 2018-01-10 2022-11-16 Sony Group Corporation Information processing device, information processing system, information processing method, and program
US11410642B2 (en) * 2019-08-16 2022-08-09 Soundhound, Inc. Method and system using phoneme embedding
CN113673235A (zh) * 2020-08-27 2021-11-19 Google LLC Energy-based language models

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
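JPH08202385A (ja) * 1995-01-26 1996-08-09 Nec Corp Speech adaptation device, word speech recognition device, continuous speech recognition device, and word spotting device
JPH1165590A (ja) * 1997-08-25 1999-03-09 Nec Corp Speech recognition dialing device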
US5930751A (en) * 1997-05-30 1999-07-27 Lucent Technologies Inc. Method of implicit confirmation for automatic speech recognition
EP1327976A1 (fr) * 2001-12-21 2003-07-16 Cortologic AG Method and device for speech recognition in the presence of noise
US20030156723A1 (en) * 2000-09-01 2003-08-21 Dietmar Ruwisch Process and apparatus for eliminating loudspeaker interference from microphone signals
EP1369847A1 (fr) * 2002-06-04 2003-12-10 Cortologic AG Speech recognition system and method
JP2004109464A (ja) * 2002-09-18 2004-04-08 Pioneer Electronic Corp Speech recognition device and speech recognition method

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4908865A (en) * 1984-12-27 1990-03-13 Texas Instruments Incorporated Speaker independent speech recognition method and system
US6236964B1 (en) * 1990-02-01 2001-05-22 Canon Kabushiki Kaisha Speech recognition apparatus and method for matching inputted speech and a word generated from stored referenced phoneme data
US5390278A (en) * 1991-10-08 1995-02-14 Bell Canada Phoneme based speech recognition
US5353376A (en) * 1992-03-20 1994-10-04 Texas Instruments Incorporated System and method for improved speech acquisition for hands-free voice telecommunication in a noisy environment
FI97919C (fi) * 1992-06-05 1997-03-10 Nokia Mobile Phones Ltd Speech recognition method and system for a voice-controlled telephone
US5758021A (en) * 1992-06-12 1998-05-26 Alcatel N.V. Speech recognition combining dynamic programming and neural network techniques
US5675706A (en) * 1995-03-31 1997-10-07 Lucent Technologies Inc. Vocabulary independent discriminative utterance verification for non-keyword rejection in subword based speech recognition
US5799276A (en) * 1995-11-07 1998-08-25 Accent Incorporated Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals
US5963903A (en) * 1996-06-28 1999-10-05 Microsoft Corporation Method and system for dynamically adjusted training for speech recognition
FI972723A0 (fi) * 1997-06-24 1997-06-24 Nokia Mobile Phones Ltd Mobile communication devices
KR100277105B1 (ko) * 1998-02-27 2001-01-15 Yun Jong-yong Apparatus and method for determining speech recognition data
US6321195B1 (en) * 1998-04-28 2001-11-20 Lg Electronics Inc. Speech recognition method
US6389393B1 (en) * 1998-04-28 2002-05-14 Texas Instruments Incorporated Method of adapting speech recognition models for speaker, microphone, and noisy environment
US6289309B1 (en) * 1998-12-16 2001-09-11 Sarnoff Corporation Noise spectrum tracking for speech enhancement
US6418411B1 (en) * 1999-03-12 2002-07-09 Texas Instruments Incorporated Method and system for adaptive speech recognition in a noisy environment
US6487530B1 (en) * 1999-03-30 2002-11-26 Nortel Networks Limited Method for recognizing non-standard and standard speech by speaker independent and speaker dependent word models
US7457750B2 (en) * 2000-10-13 2008-11-25 At&T Corp. Systems and methods for dynamic re-configurable speech recognition
GB0028277D0 (en) * 2000-11-20 2001-01-03 Canon Kk Speech processing system
FI114051B (fi) * 2001-11-12 2004-07-30 Nokia Corp Method for compressing dictionary data
US20050197837A1 (en) * 2004-03-08 2005-09-08 Janne Suontausta Enhanced multilingual speech recognition system
JP4551915B2 (ja) 2007-07-03 2010-09-29 Hosiden Corporation Multi-operation input device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08202385A (ja) * 1995-01-26 1996-08-09 Nec Corp Speech adaptation device, word speech recognition device, continuous speech recognition device, and word spotting device
US5930751A (en) * 1997-05-30 1999-07-27 Lucent Technologies Inc. Method of implicit confirmation for automatic speech recognition
JPH1165590A (ja) * 1997-08-25 1999-03-09 Nec Corp Speech recognition dialing device
US20030156723A1 (en) * 2000-09-01 2003-08-21 Dietmar Ruwisch Process and apparatus for eliminating loudspeaker interference from microphone signals
EP1327976A1 (fr) * 2001-12-21 2003-07-16 Cortologic AG Method and device for speech recognition in the presence of noise
EP1369847A1 (fr) * 2002-06-04 2003-12-10 Cortologic AG Speech recognition system and method
JP2004109464A (ja) * 2002-09-18 2004-04-08 Pioneer Electronic Corp Speech recognition device and speech recognition method

Also Published As

Publication number Publication date
CN101385073A (zh) 2009-03-11
EP1994529A2 (fr) 2008-11-26
KR20080107376A (ko) 2008-12-10
WO2007095277A3 (fr) 2007-10-11
ATE536611T1 (de) 2011-12-15
US20070203701A1 (en) 2007-08-30
EP1994529B1 (fr) 2011-12-07
WO2007095277A2 (fr) 2007-08-23

Similar Documents

Publication Publication Date Title
EP1994529B1 (fr) Communication device having speaker-independent speech recognition
US7689417B2 (en) Method, system and apparatus for improved voice recognition
KR100277105B1 (ko) Apparatus and method for determining speech recognition data
KR100984528B1 (ko) System and method for speech recognition in a distributed speech recognition system
US20060215821A1 (en) Voice nametag audio feedback for dialing a telephone call
US20040199388A1 (en) Method and apparatus for verbal entry of digits or commands
US20070005206A1 (en) Automobile interface
JPH07210190A (ja) Speech recognition method and system
JP4520596B2 (ja) Speech recognition method and speech recognition device
JPH09106296A (ja) Speech recognition device and method
US20050273334A1 (en) Method for automatic speech recognition
CN101345055A (zh) Speech processor and communication terminal device
US6788767B2 (en) Apparatus and method for providing call return service
EP1110207B1 (fr) Method and system for voice dialing
US20050049858A1 (en) Methods and systems for improving alphabetic speech recognition accuracy
US20020069064A1 (en) Method and apparatus for testing user interface integrity of speech-enabled devices
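KR100467593B1 (ko) Voice-recognition key-input wireless terminal, method for using voice instead of key input in a wireless terminal, and recording medium therefor
WO2007067837A2 (fr) Voice quality control for high-quality speech reconstruction
KR100433550B1 (ko) Speed voice dialing apparatus and method
JP6811865B2 (ja) Speech recognition device and speech recognition method
EP1385148B1 (fr) Method for improving the recognition rate of a speech recognition system, and voice server using this method
JP2007194833A (ja) Mobile phone with hands-free function
JP2004004182A (ja) Speech recognition device, speech recognition method, and speech recognition program
KR20190041108A (ko) Voice generation system and method for a vehicle
JP2020034832A (ja) Dictionary generation device, speech recognition system, and dictionary generation method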

Legal Events

Date Code Title Description
2010-07-27 RD03 Notification of appointment of power of attorney Free format text: JAPANESE INTERMEDIATE CODE: A7423
2010-08-02 RD04 Notification of resignation of power of attorney Free format text: JAPANESE INTERMEDIATE CODE: A7424
2011-05-17 A131 Notification of reasons for refusal Free format text: JAPANESE INTERMEDIATE CODE: A131
2011-08-10 A521 Request for written amendment filed Free format text: JAPANESE INTERMEDIATE CODE: A523
2011-10-13 A131 Notification of reasons for refusal Free format text: JAPANESE INTERMEDIATE CODE: A131
2011-12-21 A601 Written request for extension of time Free format text: JAPANESE INTERMEDIATE CODE: A601
2012-01-04 A602 Written permission of extension of time Free format text: JAPANESE INTERMEDIATE CODE: A602
2012-04-06 A02 Decision of refusal Free format text: JAPANESE INTERMEDIATE CODE: A02