JP2009527024A - Communication device having speaker-independent speech recognition - Google Patents
- Publication number
- JP2009527024A (application JP2008555320A)
- Authority
- JP
- Japan
- Prior art keywords
- feature vector
- vector
- likelihood
- word model
- phonetic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Concept | Type | Sections | Count |
---|---|---|---|
communication | Methods | title, claims, abstract, description | 69 |
vector | Substances | claims, abstract, description | 347 |
method | Methods | claims, abstract, description | 76 |
distribution | Methods | claims, description | 65 |
adaptation | Effects | claims, description | 53 |
spectral | Effects | claims, description | 17 |
calculation method | Methods | claims, description | 16 |
computer program | Methods | claims, description | 12 |
function | Effects | claims, description | 12 |
environmental | Effects | claims, description | 11 |
chemical reaction | Methods | claims, description | 7 |
Averaging | Methods | claims, description | 4 |
processing | Methods | claims, description | 3 |
synthesizing | Effects | claims | 2 |
transformation | Effects | claims | 1 |
transformations | Methods | claims | 1 |
training | Methods | description | 6 |
correction | Methods | description | 5 |
diagram | Methods | description | 5 |
pressing | Methods | description | 5 |
cellular | Effects | description | 3 |
confirmation | Methods | description | 2 |
dependent | Effects | description | 2 |
engineering process | Methods | description | 2 |
mixing | Methods | description | 2 |
mobile communication | Methods | description | 2 |
composite material | Substances | description | 1 |
construction | Methods | description | 1 |
dual | Effects | description | 1 |
extract | Substances | description | 1 |
extraction | Methods | description | 1 |
model calculation | Methods | description | 1 |
optimization | Methods | description | 1 |
vocal | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/12—Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/26—Devices for calling a subscriber
- H04M1/27—Devices whereby a plurality of signals may be stored simultaneously
- H04M1/271—Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US77357706P | 2006-02-14 | 2006-02-14 | |
PCT/US2007/003876 WO2007095277A2 (fr) | 2006-02-14 | 2007-02-13 | Communication device having speaker-independent speech recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
JP2009527024A (ja) | 2009-07-23 |
Family
ID=38328169
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2008555320A (Pending), published as JP2009527024A (ja) | Communication device having speaker-independent speech recognition | 2006-02-14 | 2007-02-13 |
Country Status (7)
Country | Link |
---|---|
US (1) | US20070203701A1 (fr) |
EP (1) | EP1994529B1 (fr) |
JP (1) | JP2009527024A (fr) |
KR (1) | KR20080107376A (fr) |
CN (1) | CN101385073A (fr) |
AT (1) | ATE536611T1 (fr) |
WO (1) | WO2007095277A2 (fr) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070225049A1 (en) * | 2006-03-23 | 2007-09-27 | Andrada Mauricio P | Voice controlled push to talk system |
US8521235B2 (en) * | 2008-03-27 | 2013-08-27 | General Motors Llc | Address book sharing system and method for non-verbally adding address book contents using the same |
US8515749B2 (en) * | 2009-05-20 | 2013-08-20 | Raytheon Bbn Technologies Corp. | Speech-to-speech translation |
US8626511B2 (en) * | 2010-01-22 | 2014-01-07 | Google Inc. | Multi-dimensional disambiguation of voice commands |
WO2013167934A1 (fr) | 2012-05-07 | 2013-11-14 | Mls Multimedia S.A. | Methods and system implementing intelligent voice name selection from directory lists composed in non-Latin alphabet languages |
EP3825471A1 (fr) | 2012-07-19 | 2021-05-26 | Sumitomo (S.H.I.) Construction Machinery Co., Ltd. | Shovel having a multifunctional portable information device |
US9401140B1 (en) * | 2012-08-22 | 2016-07-26 | Amazon Technologies, Inc. | Unsupervised acoustic model training |
CN107210038B (zh) * | 2015-02-11 | 2020-11-10 | Bang & Olufsen A/S (邦及欧路夫森有限公司) | Speaker recognition in a multimedia system |
KR101684554B1 (ko) * | 2015-08-20 | 2016-12-08 | Hyundai Motor Company (현대자동차 주식회사) | Voice dialing system and method thereof |
EP3496090A1 (fr) * | 2017-12-07 | 2019-06-12 | Thomson Licensing | Device and method for privacy-preserving voice interaction |
JP7173049B2 (ja) * | 2018-01-10 | 2022-11-16 | Sony Group Corporation (ソニーグループ株式会社) | Information processing device, information processing system, information processing method, and program |
US11410642B2 (en) * | 2019-08-16 | 2022-08-09 | Soundhound, Inc. | Method and system using phoneme embedding |
CN113673235A (zh) * | 2020-08-27 | 2021-11-19 | Google LLC (谷歌有限责任公司) | Energy-based language model |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
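JPH08202385A (ja) * | 1995-01-26 | 1996-08-09 | Nec Corp | Speech adaptation device, word speech recognition device, continuous speech recognition device, and word spotting device |
JPH1165590A (ja) * | 1997-08-25 | 1999-03-09 | Nec Corp | Speech recognition dialing device |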
US5930751A (en) * | 1997-05-30 | 1999-07-27 | Lucent Technologies Inc. | Method of implicit confirmation for automatic speech recognition |
EP1327976A1 (fr) * | 2001-12-21 | 2003-07-16 | Cortologic AG | Method and device for speech recognition in the presence of noise |
US20030156723A1 (en) * | 2000-09-01 | 2003-08-21 | Dietmar Ruwisch | Process and apparatus for eliminating loudspeaker interference from microphone signals |
EP1369847A1 (fr) * | 2002-06-04 | 2003-12-10 | Cortologic AG | Speech recognition system and method |
JP2004109464A (ja) * | 2002-09-18 | 2004-04-08 | Pioneer Electronic Corp | Speech recognition device and speech recognition method |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4908865A (en) * | 1984-12-27 | 1990-03-13 | Texas Instruments Incorporated | Speaker independent speech recognition method and system |
US6236964B1 (en) * | 1990-02-01 | 2001-05-22 | Canon Kabushiki Kaisha | Speech recognition apparatus and method for matching inputted speech and a word generated from stored referenced phoneme data |
US5390278A (en) * | 1991-10-08 | 1995-02-14 | Bell Canada | Phoneme based speech recognition |
US5353376A (en) * | 1992-03-20 | 1994-10-04 | Texas Instruments Incorporated | System and method for improved speech acquisition for hands-free voice telecommunication in a noisy environment |
FI97919C (fi) * | 1992-06-05 | 1997-03-10 | Nokia Mobile Phones Ltd | Speech recognition method and system for a voice-controlled telephone |
US5758021A (en) * | 1992-06-12 | 1998-05-26 | Alcatel N.V. | Speech recognition combining dynamic programming and neural network techniques |
US5675706A (en) * | 1995-03-31 | 1997-10-07 | Lucent Technologies Inc. | Vocabulary independent discriminative utterance verification for non-keyword rejection in subword based speech recognition |
US5799276A (en) * | 1995-11-07 | 1998-08-25 | Accent Incorporated | Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals |
US5963903A (en) * | 1996-06-28 | 1999-10-05 | Microsoft Corporation | Method and system for dynamically adjusted training for speech recognition |
FI972723A0 (fi) * | 1997-06-24 | 1997-06-24 | Nokia Mobile Phones Ltd | Mobile communication devices |
KR100277105B1 (ko) * | 1998-02-27 | 2001-01-15 | Yun Jong-yong (윤종용) | Apparatus and method for determining speech recognition data |
US6321195B1 (en) * | 1998-04-28 | 2001-11-20 | Lg Electronics Inc. | Speech recognition method |
US6389393B1 (en) * | 1998-04-28 | 2002-05-14 | Texas Instruments Incorporated | Method of adapting speech recognition models for speaker, microphone, and noisy environment |
US6289309B1 (en) * | 1998-12-16 | 2001-09-11 | Sarnoff Corporation | Noise spectrum tracking for speech enhancement |
US6418411B1 (en) * | 1999-03-12 | 2002-07-09 | Texas Instruments Incorporated | Method and system for adaptive speech recognition in a noisy environment |
US6487530B1 (en) * | 1999-03-30 | 2002-11-26 | Nortel Networks Limited | Method for recognizing non-standard and standard speech by speaker independent and speaker dependent word models |
US7457750B2 (en) * | 2000-10-13 | 2008-11-25 | At&T Corp. | Systems and methods for dynamic re-configurable speech recognition |
GB0028277D0 (en) * | 2000-11-20 | 2001-01-03 | Canon Kk | Speech processing system |
FI114051B (fi) * | 2001-11-12 | 2004-07-30 | Nokia Corp | Method for compressing dictionary data |
US20050197837A1 (en) * | 2004-03-08 | 2005-09-08 | Janne Suontausta | Enhanced multilingual speech recognition system |
JP4551915B2 (ja) | 2007-07-03 | 2010-09-29 | Hosiden Corporation (ホシデン株式会社) | Compound-operation input device |
2007
- 2007-02-13 WO PCT/US2007/003876, published as WO2007095277A2 (fr): active, Application Filing
- 2007-02-13 EP EP07750697A, published as EP1994529B1 (fr): not active, Not-in-force
- 2007-02-13 AT AT07750697T, published as ATE536611T1 (de): active
- 2007-02-13 US US11/674,424, published as US20070203701A1 (en): not active, Abandoned
- 2007-02-13 JP JP2008555320A, published as JP2009527024A (ja): active, Pending
- 2007-02-13 KR KR1020087020244A, published as KR20080107376A (ko): not active, Application Discontinuation
- 2007-02-13 CN CNA2007800054635A, published as CN101385073A (zh): active, Pending
Also Published As
Publication number | Publication date |
---|---|
CN101385073A (zh) | 2009-03-11 |
EP1994529A2 (fr) | 2008-11-26 |
KR20080107376A (ko) | 2008-12-10 |
WO2007095277A3 (fr) | 2007-10-11 |
ATE536611T1 (de) | 2011-12-15 |
US20070203701A1 (en) | 2007-08-30 |
EP1994529B1 (fr) | 2011-12-07 |
WO2007095277A2 (fr) | 2007-08-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1994529B1 (fr) | Communication device having speaker-independent speech recognition | |
US7689417B2 (en) | Method, system and apparatus for improved voice recognition | |
KR100277105B1 (ko) | Apparatus and method for determining speech recognition data | |
KR100984528B1 (ko) | System and method for speech recognition in a distributed speech recognition system | |
US20060215821A1 (en) | Voice nametag audio feedback for dialing a telephone call | |
US20040199388A1 (en) | Method and apparatus for verbal entry of digits or commands | |
US20070005206A1 (en) | Automobile interface | |
JPH07210190A (ja) | Speech recognition method and system | |
JP4520596B2 (ja) | Speech recognition method and speech recognition device | |
JPH09106296A (ja) | Speech recognition device and method | |
US20050273334A1 (en) | Method for automatic speech recognition | |
CN101345055A (zh) | Speech processor and communication terminal device | |
US6788767B2 (en) | Apparatus and method for providing call return service | |
EP1110207B1 (fr) | Voice dialing method and system | |
US20050049858A1 (en) | Methods and systems for improving alphabetic speech recognition accuracy | |
US20020069064A1 (en) | Method and apparatus for testing user interface integrity of speech-enabled devices | |
KR100467593B1 (ko) | Wireless terminal with speech-recognition key input, method of using speech instead of key input in a wireless terminal, and recording medium therefor | |
WO2007067837A2 (fr) | Voice quality control for high-quality speech reconstruction | |
KR100433550B1 (ko) | Speed voice dialing apparatus and method | |
JP6811865B2 (ja) | Speech recognition device and speech recognition method | |
EP1385148B1 (fr) | Method for increasing the recognition rate of a speech recognition system, and voice server implementing the method | |
JP2007194833A (ja) | Mobile phone with hands-free function | |
JP2004004182A (ja) | Speech recognition device, speech recognition method, and speech recognition program | |
KR20190041108A (ko) | Vehicle voice generation system and method | |
JP2020034832A (ja) | Dictionary generation device, speech recognition system, and dictionary generation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| RD03 | Notification of appointment of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7423; Effective date: 20100727 |
| RD04 | Notification of resignation of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7424; Effective date: 20100802 |
| A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20110517 |
| A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20110810 |
| A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20111013 |
| A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601; Effective date: 20111221 |
| A602 | Written permission of extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A602; Effective date: 20120104 |
| A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02; Effective date: 20120406 |