WO2006034152A3 - Discriminative training of document transcription system - Google Patents

Discriminative training of document transcription system

Publication number
WO2006034152A3
Authority
WO
WIPO (PCT)
Prior art keywords
transcript
audio stream
spoken
training
acoustic model
Prior art date
Application number
PCT/US2005/033403
Other languages
French (fr)
Other versions
WO2006034152A2 (en)
Inventor
Girija Yegnanarayanan
Juergen Fritsch
Lambert Mathias
Original Assignee
Multimodal Technologies Inc
Girija Yegnanarayanan
Juergen Fritsch
Lambert Mathias
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Multimodal Technologies Inc, Girija Yegnanarayanan, Juergen Fritsch, Lambert Mathias
Publication of WO2006034152A2
Publication of WO2006034152A3

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/18: Speech classification or search using natural language modelling
    • G10L15/183: Speech classification or search using natural language modelling using context dependencies, e.g. language models

Abstract

A system is provided for training an acoustic model (330) for use in speech recognition. In particular, such a system may be used to perform training (328) based on a spoken audio stream (302) and a non-literal transcript (304) of the spoken audio stream (302). Such a system may identify (204) text (308) in the non-literal transcript (304) which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream (302) which produced the corresponding text in the non-literal transcript (304), and thereby produce a revised transcript (326) which more accurately represents the spoken audio stream (302). The revised, and more accurate, transcript (326) may be used to train (328) the acoustic model (330) using discriminative training techniques (1414), thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript (304).
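
The transcript-revision step described in the abstract can be illustrated with a small sketch. This is not the patented method: the abstract identifies the spoken form from the audio stream itself, whereas this toy stands in a plain-text word sequence (`heard`) for the audio and picks the candidate by string similarity. The `SPOKEN_FORMS` table, the function names, and the sample sentences are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical table: written concept -> candidate spoken forms.
# A real system would derive these from grammars, not a fixed dictionary.
SPOKEN_FORMS = {
    "10/1/1993": [
        "october first nineteen ninety three",
        "ten one ninety three",
        "the first of october nineteen ninety three",
    ],
    "CT": ["c t", "cat scan", "computed tomography"],
}


def pick_spoken_form(candidates, heard_words):
    """Return the candidate whose words best match some window of heard_words."""

    def best_window_score(candidate):
        cand = candidate.split()
        n = len(cand)
        # Slide a window of the candidate's length over the heard words.
        starts = range(max(1, len(heard_words) - n + 1))
        return max(
            SequenceMatcher(None, cand, heard_words[i:i + n]).ratio()
            for i in starts
        )

    return max(candidates, key=best_window_score)


def revise_transcript(transcript, heard):
    """Replace multi-form concepts in a non-literal transcript with the
    spoken form that best matches the (assumed) heard word sequence."""
    heard_words = heard.lower().split()
    revised = []
    for token in transcript.split():
        forms = SPOKEN_FORMS.get(token)
        if forms:
            revised.extend(pick_spoken_form(forms, heard_words).split())
        else:
            revised.append(token.lower())
    return " ".join(revised)


if __name__ == "__main__":
    print(revise_transcript(
        "CT performed 10/1/1993",
        "uh cat scan was performed on ten one ninety three",
    ))
```

The revised word sequence, rather than the original non-literal text, is what would then be paired with the audio for acoustic model training; the training step itself is outside the scope of this sketch.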

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US61117104P 2004-09-17 2004-09-17
US60/611,171 2004-09-17

Publications (2)

Publication Number Publication Date
WO2006034152A2 (en) 2006-03-30
WO2006034152A3 (en) 2007-03-01

Family

ID=36090556

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/033403 WO2006034152A2 (en) 2004-09-17 2005-09-16 Discriminative training of document transcription system

Country Status (1)

Country Link
WO (1) WO2006034152A2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2026327A4 (en) * 2006-05-31 2012-03-07 Nec Corp Language model learning system, language model learning method, and language model learning program
CN113689860A (en) * 2021-07-29 2021-11-23 北京捷通华声科技股份有限公司 Training method, device and equipment of voice recognition model and voice recognition method, device and equipment
CN113781853B (en) * 2021-08-23 2023-04-25 安徽教育出版社 Teacher-student remote interactive education platform based on terminal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6055494A (en) * 1996-10-28 2000-04-25 The Trustees Of Columbia University In The City Of New York System and method for medical language extraction and encoding
US6263308B1 (en) * 2000-03-20 2001-07-17 Microsoft Corporation Methods and apparatus for performing speech recognition using acoustic models which are improved through an interactive process
US6535849B1 (en) * 2000-01-18 2003-03-18 Scansoft, Inc. Method and system for generating semi-literal transcripts for speech recognition systems
US6691088B1 (en) * 1998-10-21 2004-02-10 Koninklijke Philips Electronics N.V. Method of determining parameters of a statistical language model

Also Published As

Publication number Publication date
WO2006034152A2 (en) 2006-03-30

Similar Documents

Publication Publication Date Title
WO2006023631A3 (en) Document transcription system training
US7881930B2 (en) ASR-aided transcription with segmented feedback training
TW200601263A (en) Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition
US20110313762A1 (en) Speech output with confidence indication
WO2009025356A1 (en) Voice recognition device and voice recognition method
WO2008142836A1 (en) Voice tone converting device and voice tone converting method
WO2007018842A3 (en) Content-based audio playback emphasis
EP1557821A3 (en) Segmental tonal modeling for tonal languages
WO2002054033A3 (en) Hierarchical language models for speech recognition
WO2008030756A3 (en) Method and system for training a text-to-speech synthesis system using a specific domain speech database
WO2007118100A3 (en) Automatic language model update
EP4235649A3 (en) Language model biasing
DE602005001125D1 (en) Learn the pronunciation of new words using a pronunciation graph
WO2008073850A3 (en) Method and apparatus for reading education
WO2011133766A3 (en) Methods and systems for training dictation-based speech-to-text systems using recorded samples
EP1696421A3 (en) Learning in automatic speech recognition
EP1022722A3 (en) Speaker adaptation based on eigenvoices
GB0207343D0 (en) Signal processing system
WO2007117814A3 (en) Voice signal perturbation for speech recognition
WO2007034478A3 (en) System and method for correcting speech
WO2006076280A3 (en) Method and system for assessing pronunciation difficulties of non-native speakers
WO2006033044A3 (en) Method of training a robust speaker-dependent speech recognition system with speaker-dependent expressions and robust speaker-dependent speech recognition system
DE59904741D1 (en) ARRANGEMENT AND METHOD FOR RECOGNIZING A PRESET VOCABULARY IN SPOKEN LANGUAGE BY A COMPUTER
ATE401644T1 (en) METHOD FOR VOICE RECOGNITION
US20070294082A1 (en) Voice Recognition Method and System Adapted to the Characteristics of Non-Native Speakers

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 05798604

Country of ref document: EP

Kind code of ref document: A2