CN1205599C - System for using silence in speech recognition - Google Patents

System for using silence in speech recognition

Info

Publication number
CN1205599C
Authority
CN
China
Prior art keywords
silence
speech
branch
prefix trees
phoneme
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CNB998030759A
Other languages
English (en)
Chinese (zh)
Other versions
CN1307715A (zh)
Inventor
江丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1998-02-20
Filing date
1999-02-09
Publication date
2005-06-08
Application filed by Microsoft Corp
Publication of CN1307715A
Application granted
Publication of CN1205599C
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G10L15/05 Word boundary detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/085 Methods for reducing search complexity, pruning

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Telephonic Communication Services (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CNB998030759A 1998-02-20 1999-02-09 System for using silence in speech recognition Expired - Lifetime CN1205599C (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/026,841 US6374219B1 (en) 1997-09-19 1998-02-20 System for using silence in speech recognition
US09/026,841 1998-02-20

Publications (2)

Publication Number Publication Date
CN1307715A (zh) 2001-08-08
CN1205599C (zh) 2005-06-08

Family

ID=21834100

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB998030759A Expired - Lifetime CN1205599C (zh) 1998-02-20 1999-02-09 System for using silence in speech recognition

Country Status (7)

Country Link
US (1) US6374219B1 (en)
EP (1) EP1055226B1 (en)
JP (1) JP4414088B2 (en)
KR (1) KR100651957B1 (en)
CN (1) CN1205599C (en)
CA (1) CA2315832C (en)
WO (1) WO1999042991A1 (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19635754A1 (de) * 1996-09-03 1998-03-05 Siemens Ag Speech processing system and method for speech processing
US7392185B2 (en) 1999-11-12 2008-06-24 Phoenix Solutions, Inc. Speech based learning/training system using semantic decoding
US7050977B1 (en) 1999-11-12 2006-05-23 Phoenix Solutions, Inc. Speech-enabled server for internet website and method
US7725307B2 (en) 1999-11-12 2010-05-25 Phoenix Solutions, Inc. Query engine for processing voice based queries including semantic decoding
US6665640B1 (en) 1999-11-12 2003-12-16 Phoenix Solutions, Inc. Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries
US9076448B2 (en) * 1999-11-12 2015-07-07 Nuance Communications, Inc. Distributed real time speech recognition system
US6615172B1 (en) 1999-11-12 2003-09-02 Phoenix Solutions, Inc. Intelligent query engine for processing voice based queries
US6633846B1 (en) 1999-11-12 2003-10-14 Phoenix Solutions, Inc. Distributed realtime speech recognition system
US7526431B2 (en) * 2001-09-05 2009-04-28 Voice Signal Technologies, Inc. Speech recognition using ambiguous or phone key spelling and/or filtering
US7444286B2 (en) 2001-09-05 2008-10-28 Roth Daniel L Speech recognition using re-utterance recognition
US7467089B2 (en) * 2001-09-05 2008-12-16 Roth Daniel L Combined speech and handwriting recognition
WO2004023455A2 (en) * 2002-09-06 2004-03-18 Voice Signal Technologies, Inc. Methods, systems, and programming for performing speech recognition
US7809574B2 (en) 2001-09-05 2010-10-05 Voice Signal Technologies Inc. Word recognition using choice lists
US7313526B2 (en) * 2001-09-05 2007-12-25 Voice Signal Technologies, Inc. Speech recognition using selectable recognition modes
US7505911B2 (en) * 2001-09-05 2009-03-17 Roth Daniel L Combined speech recognition and sound recording
US20040064315A1 (en) * 2002-09-30 2004-04-01 Deisher Michael E. Acoustic confidence driven front-end preprocessing for speech recognition in adverse environments
US7389230B1 (en) * 2003-04-22 2008-06-17 International Business Machines Corporation System and method for classification of voice signals
US9117460B2 (en) * 2004-05-12 2015-08-25 Core Wireless Licensing S.A.R.L. Detection of end of utterance in speech recognition system
US8032374B2 (en) * 2006-12-05 2011-10-04 Electronics And Telecommunications Research Institute Method and apparatus for recognizing continuous speech using search space restriction based on phoneme recognition
US8165877B2 (en) * 2007-08-03 2012-04-24 Microsoft Corporation Confidence measure generation for speech related searching
JP4757936B2 (ja) * 2009-07-23 2011-08-24 KDDI Corporation Pattern recognition method and apparatus, pattern recognition program, and recording medium therefor
US9224384B2 (en) * 2012-06-06 2015-12-29 Cypress Semiconductor Corporation Histogram based pre-pruning scheme for active HMMS
US9514739B2 (en) * 2012-06-06 2016-12-06 Cypress Semiconductor Corporation Phoneme score accelerator
US20140365221A1 (en) * 2012-07-31 2014-12-11 Novospeech Ltd. Method and apparatus for speech recognition
JP6235280B2 (ja) * 2013-09-19 2017-11-22 Toshiba Corporation Simultaneous speech processing apparatus, method, and program
US8719032B1 (en) 2013-12-11 2014-05-06 Jefferson Audio Video Systems, Inc. Methods for presenting speech blocks from a plurality of audio input data streams to a user in an interface
US10134425B1 (en) * 2015-06-29 2018-11-20 Amazon Technologies, Inc. Direction-based speech endpointing
US10121471B2 (en) * 2015-06-29 2018-11-06 Amazon Technologies, Inc. Language model speech endpointing
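CN105427870B (zh) * 2015-12-23 2019-08-30 Beijing Qihoo Technology Co., Ltd. Speech recognition method and apparatus for pauses
KR102435750B1 (ko) * 2017-12-14 2022-08-25 Hyundai Motor Company Multimedia device, vehicle including the same, and broadcast listening method of the multimedia device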
US11893983B2 (en) * 2021-06-23 2024-02-06 International Business Machines Corporation Adding words to a prefix tree for improving speech recognition
CN117351963A (zh) * 2023-11-21 2024-01-05 Jingdong City (Beijing) Digital Technology Co., Ltd. Method, apparatus, device, and readable medium for speech recognition

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4336421A (en) 1980-04-08 1982-06-22 Threshold Technology, Inc. Apparatus and method for recognizing spoken words
US4977599A (en) * 1985-05-29 1990-12-11 International Business Machines Corporation Speech recognition employing a set of Markov models that includes Markov models representing transitions to and from silence
US4852173A (en) 1987-10-29 1989-07-25 International Business Machines Corporation Design and construction of a binary-tree system for language modelling
US5159637A (en) 1988-07-27 1992-10-27 Fujitsu Limited Speech word recognizing apparatus using information indicative of the relative significance of speech features
US5202952A (en) * 1990-06-22 1993-04-13 Dragon Systems, Inc. Large-vocabulary continuous speech prefiltering and processing system
DE4130632A1 (de) 1991-09-14 1993-03-18 Philips Patentverwaltung Method for recognizing the spoken words in a speech signal
US5848388A (en) * 1993-03-25 1998-12-08 British Telecommunications Plc Speech recognition with sequence parsing, rejection and pause detection options
JPH0728487A (ja) * 1993-03-26 1995-01-31 Texas Instr Inc <Ti> Speech recognition method
US5623609A (en) * 1993-06-14 1997-04-22 Hal Trust, L.L.C. Computer system and computer-implemented process for phonology-based automatic speech recognition
US5794197A (en) * 1994-01-21 1998-08-11 Microsoft Corporation Senone tree representation and evaluation
DE69616466T2 (de) 1995-08-18 2002-12-12 Gsbs Development Corp., Muskegon Fire alarm system
GB2305288A (en) * 1995-09-15 1997-04-02 Ibm Speech recognition system
US6076056A (en) * 1997-09-19 2000-06-13 Microsoft Corporation Speech recognition system for recognizing continuous and isolated speech

Also Published As

Publication number Publication date
EP1055226B1 (en) 2017-08-16
JP2002504719A (ja) 2002-02-12
KR100651957B1 (ko) 2006-12-01
US6374219B1 (en) 2002-04-16
CA2315832A1 (en) 1999-08-26
WO1999042991A1 (en) 1999-08-26
CN1307715A (zh) 2001-08-08
JP4414088B2 (ja) 2010-02-10
KR20010034367A (ko) 2001-04-25
CA2315832C (en) 2004-11-16
EP1055226A1 (en) 2000-11-29

Similar Documents

Publication Publication Date Title
CN1205599C (zh) System for using silence in speech recognition
CN1202512C (zh) Speech recognition system for recognizing continuous and isolated speech
US8280733B2 (en) Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections
CN110364171B (zh) Speech recognition method, speech recognition system, and storage medium
US10210862B1 (en) Lattice decoding and result confirmation using recurrent neural networks
JP4974510B2 (ja) System and method for identifying semantic intent from acoustic information
JP6550068B2 (ja) Pronunciation prediction in speech recognition
US9292487B1 (en) Discriminative language model pruning
US6542866B1 (en) Speech recognition method and apparatus utilizing multiple feature streams
ES2291440T3 (es) Method, module, device, and server for speech recognition
US8532990B2 (en) Speech recognition of a list entry
WO2018207390A1 (en) Speech recognition system and method for speech recognition
CN1254787C (zh) Speech recognition method and apparatus using a discrete language model
JP2002507010A (ja) Apparatus and method for simultaneous multimode dictation
US12165640B2 (en) Response method, terminal, and storage medium for speech response
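CN1551103B (zh) System with composite statistical and rule-based grammar model for speech recognition and natural language understanding
JPH09127978A (ja) Speech recognition method and apparatus, and computer control apparatus
CN116778914A (zh) Training method for a command word recognition model, and command word recognition method and apparatus
CN1298171A (zh) Speech recognition device implementing syntactic permutation rules
CN1369830A (zh) Disambiguation language model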
Sarikaya et al. Word level confidence measurement using semantic features
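JP2000148178A (ja) Speech recognition system using a composite grammar network
JPH07104780A (ja) Speaker-independent continuous speech recognition method
JP2005091504A (ja) Speech recognition device
JPH11190999A (ja) Speech spotting device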

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150429

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20150429

Address after: Washington State

Patentee after: Microsoft Technology Licensing LLC

Address before: Washington, USA

Patentee before: Microsoft Corp.

CX01 Expiry of patent term

Granted publication date: 20050608