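WO2008087934A1 - Extended recognition dictionary learning device and speech recognition system - Google Patents

Extended recognition dictionary learning device and speech recognition system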

Info

Publication number
WO2008087934A1
Authority
WO
WIPO (PCT)
Prior art keywords
utterance
speech recognition
recognition dictionary
extended
variations
Prior art date
Application number
PCT/JP2008/050346
Other languages
English (en)
French (fr)
Inventor
Yoshifumi Onishi
Original Assignee
Nec Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nec Corporation
Priority to JP2008554032A (patent JP5240457B2)
Priority to US12/523,302 (patent US8918318B2)
Publication of WO2008087934A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065: Adaptation
    • G10L15/07: Adaptation to the speaker
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training
    • G10L2015/0635: Training updating or merging of old and new templates; Mean values; Weighting

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Image Analysis (AREA)

Abstract

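Recognition using an extended recognition dictionary suited to a new speaker is made possible without requiring prior training on that speaker's speech and the corresponding utterance labels. The extended recognition dictionary learning device has: an utterance variation data calculation means that compares the acoustic model sequence output as a speech recognition result with an input correct acoustic model sequence and calculates the correspondence between those models as utterance variation data; an utterance variation data classification means that classifies, within the calculated utterance variation data, utterance variations that appear widely and utterance variations that appear unevenly; and a recognition dictionary extension means that combines the classified utterance variations into a plurality of utterance variation sets and extends the recognition dictionary for each utterance variation set using the utterance variations that the set contains. The speech recognition device outputs speech recognition results using the extended recognition dictionary for each utterance variation set.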
PCT/JP2008/050346 2007-01-16 2008-01-15 Extended recognition dictionary learning device and speech recognition system WO2008087934A1 (ja)
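
The three steps named in the abstract (calculating utterance variation data, classifying widely and unevenly appearing variations, and extending the dictionary per variation set) can be sketched roughly in Python as follows. This is only an illustrative sketch, not the method claimed in the patent. All function names are hypothetical, each acoustic model sequence is assumed to be a list of phoneme-like labels, the alignment uses Python's standard difflib, the "widely appearing" test is a simple speaker-coverage threshold, and one variation set is built per speaker. None of these choices is specified in the abstract.

```python
from collections import Counter
from difflib import SequenceMatcher


def utterance_variations(recognized, correct):
    """Align two label sequences and return (correct_part, recognized_part)
    pairs wherever they differ. Alignment via difflib is an assumption."""
    matcher = SequenceMatcher(a=correct, b=recognized, autojunk=False)
    return [
        (tuple(correct[i1:i2]), tuple(recognized[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def classify(variations_by_speaker, wide_ratio=0.5):
    """Split variations into those seen for many speakers ("wide") and the
    remainder kept per speaker ("biased"). The threshold is hypothetical."""
    speakers_per_variation = Counter()
    for pairs in variations_by_speaker.values():
        speakers_per_variation.update(set(pairs))
    threshold = wide_ratio * len(variations_by_speaker)
    wide = {v for v, n in speakers_per_variation.items() if n >= threshold}
    biased = {s: set(p) - wide for s, p in variations_by_speaker.items()}
    return wide, biased


def variation_sets(wide, biased):
    """Combine classified variations into several named sets (assumed scheme:
    the wide set alone, plus wide + each speaker's biased variations)."""
    sets = {"wide": set(wide)}
    for speaker, extra in biased.items():
        if extra:
            sets[f"wide+{speaker}"] = set(wide) | extra
    return sets


def apply_rule(pron, src, dst):
    """Replace every occurrence of the non-empty sub-sequence src with dst."""
    out, i, n = [], 0, len(src)
    while i < len(pron):
        if tuple(pron[i:i + n]) == src:
            out.extend(dst)
            i += n
        else:
            out.append(pron[i])
            i += 1
    return tuple(out)


def extend_dictionary(dictionary, rules):
    """Add variant pronunciations produced by each rule to every word."""
    extended = {}
    for word, prons in dictionary.items():
        variants = list(prons)
        for src, dst in sorted(rules):
            if not src:
                continue  # pure insertions are skipped in this sketch
            for pron in prons:
                variant = apply_rule(pron, src, dst)
                if variant not in variants:
                    variants.append(variant)
        extended[word] = variants
    return extended


# Toy data: (recognized sequence, correct sequence) pairs per speaker.
utterances = {
    "A": [(["a", "r", "i", "ng", "a", "t", "o"], ["a", "r", "i", "g", "a", "t", "o"])],
    "B": [(["ng", "a", "k", "u"], ["g", "a", "k", "u"]),
          (["s", "k", "i"], ["s", "u", "k", "i"])],
}
variations_by_speaker = {
    spk: [v for rec, cor in utts for v in utterance_variations(rec, cor)]
    for spk, utts in utterances.items()
}
wide, biased = classify(variations_by_speaker, wide_ratio=1.0)
sets = variation_sets(wide, biased)
dictionary = {"suki": [("s", "u", "k", "i")], "gaku": [("g", "a", "k", "u")]}
extended = {name: extend_dictionary(dictionary, rules) for name, rules in sets.items()}
```

In this toy run, the "g" to "ng" substitution is seen for both speakers and so lands in the wide set, while the dropped "u" is seen only for speaker B and appears only in the dictionary built for the "wide+B" set. A recognizer could then try each extended dictionary and keep the best-scoring result, which is one possible reading of the final sentence of the abstract.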

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2008554032A JP5240457B2 (ja) 2007-01-16 2008-01-15 Extended recognition dictionary learning device and speech recognition system
US12/523,302 US8918318B2 (en) 2007-01-16 2008-01-15 Extended recognition dictionary learning device and speech recognition system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2007006977 2007-01-16
JP2007-006977 2007-01-16

Publications (1)

Publication Number Publication Date
WO2008087934A1 (ja) 2008-07-24

Family

ID=39635938

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2008/050346 WO2008087934A1 (ja) 2007-01-16 2008-01-15 Extended recognition dictionary learning device and speech recognition system

Country Status (3)

Country Link
US (1) US8918318B2 (ja)
JP (1) JP5240457B2 (ja)
WO (1) WO2008087934A1 (ja)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
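JP2010117651A (ja) * 2008-11-14 2010-05-27 Nec Corp Extended recognition dictionary learning device, speech recognition system using the same, and method and program therefor
JP2010145784A (ja) * 2008-12-19 2010-07-01 Casio Computer Co Ltd Speech recognition device, acoustic model learning device, speech recognition method, and program
JP2010176067A (ja) * 2009-02-02 2010-08-12 Fujitsu Ltd Speech recognition device and speech recognition method
JP2010176103A (ja) * 2009-02-02 2010-08-12 Nippon Hoso Kyokai <Nhk> Pronunciation dictionary correction device, speech recognition device, and computer program
JP2011053312A (ja) * 2009-08-31 2011-03-17 Nippon Hoso Kyokai <Nhk> Adapted acoustic model generation device and program
JP2016011995A (ja) * 2014-06-27 2016-01-21 International Business Machines Corporation Pronunciation dictionary extension system, extension program, and extension method, and acoustic model learning method, learning program, and learning system using the extended pronunciation dictionary obtained by the extension method
JP2016161765A (ja) * 2015-03-02 2016-09-05 Nippon Hoso Kyokai Pronunciation sequence extension device and program therefor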

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009078256A1 (ja) * 2007-12-18 2009-06-25 Nec Corporation Pronunciation variation rule extraction device, pronunciation variation rule extraction method, and pronunciation variation rule extraction program
GB2471811B (en) * 2008-05-09 2012-05-16 Fujitsu Ltd Speech recognition dictionary creating support device, computer readable medium storing processing program, and processing method
US9634855B2 (en) 2010-05-13 2017-04-25 Alexander Poltorak Electronic personal interactive device that determines topics of interest using a conversational agent
US10811004B2 (en) * 2013-03-28 2020-10-20 Nuance Communications, Inc. Auto-generation of parsing grammars from a concept ontology
JP6390264B2 (ja) * 2014-08-21 2018-09-19 Toyota Motor Corporation Response generation method, response generation device, and response generation program
US10332505B2 (en) * 2017-03-09 2019-06-25 Capital One Services, Llc Systems and methods for providing automated natural language dialogue with customers
US9741337B1 (en) * 2017-04-03 2017-08-22 Green Key Technologies Llc Adaptive self-trained computer engines with associated databases and methods of use thereof
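KR102012404B1 (ko) * 2017-08-18 2019-08-20 Dong-A University Research Foundation for Industry-Academy Cooperation Natural language understanding method using the distribution of correct-answer labels per language analyzer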
US20190295541A1 (en) * 2018-03-23 2019-09-26 Polycom, Inc. Modifying spoken commands

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
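JPS6153699A (ja) * 1984-08-24 1986-03-17 Matsushita Electric Industrial Co., Ltd. Speech recognition device
JPS62235992A (ja) * 1986-04-05 1987-10-16 Sharp Corporation Speech recognition system
WO2006126649A1 (ja) * 2005-05-27 2006-11-30 Matsushita Electric Industrial Co., Ltd. Speech editing device, speech editing method, and speech editing program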

Family Cites Families (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4843389A (en) * 1986-12-04 1989-06-27 International Business Machines Corp. Text compression and expansion method and apparatus
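JP2701500B2 (ja) 1990-01-17 1998-01-21 NEC Corporation Standard pattern learning method for speech recognition devices
JP2596869B2 (ja) * 1992-04-30 1997-04-02 Matsushita Electric Industrial Co., Ltd. Concept dictionary management device
JPH0720889A (ja) 1993-06-30 1995-01-24 Omron Corp Speaker-independent speech recognition device and method
JPH08123470A (ja) 1994-10-25 1996-05-17 Nippon Hoso Kyokai <Nhk> Speech recognition device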
US5875443A (en) * 1996-01-30 1999-02-23 Sun Microsystems, Inc. Internet-based spelling checker dictionary system with automatic updating
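JP2974621B2 (ja) 1996-09-19 1999-11-10 ATR Interpreting Telecommunications Research Laboratories Word dictionary creation device for speech recognition and continuous speech recognition device
JP3466857B2 (ja) * 1997-03-06 2003-11-17 Toshiba Corporation Dictionary updating method and dictionary updating system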
US6078886A (en) * 1997-04-14 2000-06-20 At&T Corporation System and method for providing remote automatic speech recognition services via a packet network
US6061646A (en) * 1997-12-18 2000-05-09 International Business Machines Corp. Kiosk for multiple spoken languages
JPH11344992A (ja) 1998-06-01 1999-12-14 Ntt Data Corp Speech dictionary creation method, personal authentication device, and recording medium
US6744860B1 (en) * 1998-12-31 2004-06-01 Bell Atlantic Network Services Methods and apparatus for initiating a voice-dialing operation
WO2000067162A1 (en) * 1999-05-05 2000-11-09 West Publishing Company Document-classification system, method and software
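JP2001101185A (ja) * 1999-09-24 2001-04-13 Internatl Business Mach Corp <Ibm> Machine translation method and device capable of automatic dictionary switching, and program storage medium storing a program for executing such a machine translation method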
US7392185B2 (en) * 1999-11-12 2008-06-24 Phoenix Solutions, Inc. Speech based learning/training system using semantic decoding
US7725307B2 (en) * 1999-11-12 2010-05-25 Phoenix Solutions, Inc. Query engine for processing voice based queries including semantic decoding
US6456975B1 (en) * 2000-01-13 2002-09-24 Microsoft Corporation Automated centralized updating of speech recognition systems
US7113910B1 (en) * 2000-02-18 2006-09-26 At&T Corp. Document expansion in speech retrieval
US6272464B1 (en) * 2000-03-27 2001-08-07 Lucent Technologies Inc. Method and apparatus for assembling a prediction list of name pronunciation variations for use during speech recognition
US6912498B2 (en) * 2000-05-02 2005-06-28 Scansoft, Inc. Error correction in speech recognition by correcting text around selected area
US7031908B1 (en) * 2000-06-01 2006-04-18 Microsoft Corporation Creating a language model for a language processing system
US6810376B1 (en) * 2000-07-11 2004-10-26 Nusuara Technologies Sdn Bhd System and methods for determining semantic similarity of sentences
US7042443B2 (en) * 2001-10-11 2006-05-09 Woodard Scott E Speed Writer program and device with Speed Writer program installed
US7567953B2 (en) * 2002-03-01 2009-07-28 Business Objects Americas System and method for retrieving and organizing information from disparate computer network information sources
US7257531B2 (en) * 2002-04-19 2007-08-14 Medcom Information Systems, Inc. Speech to text system using controlled vocabulary indices
US7197460B1 (en) * 2002-04-23 2007-03-27 At&T Corp. System for handling frequently asked questions in a natural language dialog service
US7606714B2 (en) * 2003-02-11 2009-10-20 Microsoft Corporation Natural language classification within an automated response system
US7283997B1 (en) * 2003-05-14 2007-10-16 Apple Inc. System and method for ranking the relevance of documents retrieved by a query
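WO2005066837A1 (ja) * 2003-12-26 2005-07-21 Matsushita Electric Industrial Co., Ltd. Dictionary creation device and dictionary creation method
JP4218758B2 (ja) * 2004-12-21 2009-02-04 International Business Machines Corporation Caption generation device, caption generation method, and program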
US7693267B2 (en) * 2005-12-30 2010-04-06 Microsoft Corporation Personalized user specific grammars

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NANJO H. ET AL.: "Koen Onsei Ninshiki no Tameno Kyoshi Nashi Gengo Model Tekio to Hatsuwa Sokudo ni Tekio shita Decoding", THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS D-II, vol. J87-D-II, 1 August 2004 (2004-08-01), pages 1581 - 1592, XP003016028 *
SAMEJIMA M. ET AL.: "Kodomo Onsei ni Taisuru Jubun Tokeiryo ni Motozuku Kyoshi Nashi Washa Tekio no Kento", THE ACOUSTICAL SOCIETY OF JAPAN (ASJ) 2004 NEN SHUKI KENKYU HAPPYOKAI KOEN RONBUNSHU -I-, 21 September 2004 (2004-09-21), pages 109 - 110 *
SATO S. ET AL.: "Jikkyo. Taidan ni Okeru Hassei Henkei o Koryo shita Onkyo Model no Kento", IEICE TECHNICAL REPORT, vol. 105, no. 493, 14 December 2005 (2005-12-14), pages 31 - 36 *

Also Published As

Publication number Publication date
JPWO2008087934A1 (ja) 2010-05-06
JP5240457B2 (ja) 2013-07-17
US8918318B2 (en) 2014-12-23
US20100023329A1 (en) 2010-01-28

Similar Documents

Publication Publication Date Title
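WO2008087934A1 (ja) Extended recognition dictionary learning device and speech recognition system
WO2009025356A1 (ja) Speech recognition device and speech recognition method
CN106531185B (zh) Speech evaluation method and system based on speech similarity
ATE434252T1 (de) Speech recognition with speaker adaptation based on fundamental frequency classification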
WO2017218243A3 (en) Intent recognition and emotional text-to-speech learning system
WO2007103520A3 (en) Codebook-less speech conversion method and system
WO2008073850A3 (en) Method and apparatus for reading education
MX2008002500A (es) Incorporation of speech training into an interactive user tutorial
WO2007115088A3 (en) A system and method for applying dynamic contextual grammars and language models to improve automatic speech recognition accuracy
ATE457510T1 (de) Speech recognition system with a huge vocabulary
DE60134395D1 (de) Discriminative training of hidden Markov models for continuous speech recognition
WO2007117814A3 (en) Voice signal perturbation for speech recognition
EP4235648A3 (en) Language model biasing
WO2015009586A3 (en) Performing an operation relative to tabular data based upon voice input
WO2010030129A3 (en) Multimodal unification of articulation for device interfacing
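WO2008142836A1 (ja) Voice quality conversion device and voice quality conversion method
WO2008114448A1 (ja) Speech recognition system, speech recognition program, and speech recognition method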
TW200638337A (en) Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system
TW200601263A (en) Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition
WO2004063902A3 (en) Speech training method with color instruction
WO2015057907A3 (en) System and method for learning alternate pronunciations for speech recognition
KR20120054845A (ko) Speech recognition method for a robot
EP2306345A3 (en) Speech retrieval apparatus and speech retrieval method
WO2012064408A3 (en) Method for tone/intonation recognition using auditory attention cues
ATE514162T1 (de) Dynamic generation of contexts for speech recognition

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 08703210; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2008554032; Country of ref document: JP; Kind code of ref document: A
WWE WIPO information: entry into national phase. Ref document number: 12523302; Country of ref document: US
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 08703210; Country of ref document: EP; Kind code of ref document: A1