WO2006034152A3 - Discriminative training of document transcription system

Info

Publication number: WO2006034152A3
Application number: PCT/US2005/033403
Authority: WO
Inventors: Girija Yegnanarayanan; Juergen Fritsch; Lambert Mathias
Original assignee: Multimodal Technologies Inc; Girija Yegnanarayanan; Juergen Fritsch; Lambert Mathias
Priority date: 2004-09-17
Filing date: 2005-09-16
Publication date: 2007-03-01
Also published as: WO2006034152A2

Abstract

A system is provided for training an acoustic model (330) for use in speech recognition. In particular, such a system may be used to perform training (328) based on a spoken audio stream (302) and a non-literal transcript (304) of the spoken audio stream (302). Such a system may identify (204) text (308) in the non-literal transcript (304) that represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream (302) that produced the corresponding text in the non-literal transcript (304), and thereby produce a revised transcript (326) that more accurately represents the spoken audio stream (302). The revised, more accurate transcript (326) may be used to train (328) the acoustic model (330) using discriminative training techniques (1414), thereby producing a better acoustic model than would be produced using conventional techniques, which train directly on the original non-literal transcript (304).
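
Illustrative sketch (not part of the publication)

The transcript revision step described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method. The table `SPOKEN_FORMS`, the function `revise_transcript`, and the sample sentences are hypothetical. The sketch also assumes that a recognizer word hypothesis is available as a stand-in for the spoken audio stream (302), whereas the abstract describes identifying the spoken form from the audio itself. The discriminative training step (1414) is not shown.

```python
# Hypothetical table mapping a written concept to candidate spoken forms.
# The entries are illustrative only and do not come from the publication.
SPOKEN_FORMS = {
    "10/1/1993": [
        "october first nineteen ninety three",
        "ten one ninety three",
        "one october nineteen ninety three",
    ],
    "5 mg": ["five milligrams", "five m g"],
}


def revise_transcript(non_literal, recognized):
    """Return a revised transcript in which each known concept is replaced
    by the candidate spoken form found in `recognized`.

    `recognized` is an assumed recognizer hypothesis standing in for the
    audio stream. Concepts with no matching candidate are left unchanged.
    """
    revised = non_literal.lower()
    padded = f" {recognized.lower()} "  # padding gives whole-word matching
    for written, candidates in SPOKEN_FORMS.items():
        if written not in revised:
            continue
        # Try longer candidates first so a short form cannot shadow a longer one.
        for candidate in sorted(candidates, key=len, reverse=True):
            if f" {candidate} " in padded:
                revised = revised.replace(written, candidate)
                break
    return revised


non_literal = "Patient seen on 10/1/1993 and given 5 mg"
recognized = ("the patient was seen on october first nineteen ninety three "
              "and was given five milligrams")
print(revise_transcript(non_literal, recognized))
```

In this sketch a concept whose spoken form cannot be confirmed stays in its written form. A training pipeline might instead exclude such regions from training, but that choice is an assumption and is not stated in the abstract.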