JPH0756675B2 - Speech recognition method and apparatus - Google Patents

Speech recognition method and apparatus

Info

Publication number
JPH0756675B2
Authority
JP
Japan
Prior art keywords
predictor
opt
feature
word
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP2244139A
Other languages
English (en)
Japanese (ja)
Other versions
JPH03147079A (ja)
Inventor
Arthur J. Nadas
David Nahamoo
Original Assignee
International Business Machines Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corporation
Publication of JPH03147079A
Publication of JPH0756675B2
Anticipated expiration
Expired - Lifetime (current legal status)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/183Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197Probabilistic grammars, e.g. word n-grams

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
  • Machine Translation (AREA)
JP2244139A 1989-10-26 1990-09-17 Speech recognition method and apparatus Expired - Lifetime JPH0756675B2 (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/427,420 US5263117A (en) 1989-10-26 1989-10-26 Method and apparatus for finding the best splits in a decision tree for a language model for a speech recognizer
US427420 2003-05-01

Publications (2)

Publication Number Publication Date
JPH03147079A (ja) 1991-06-24
JPH0756675B2 (ja) 1995-06-14

Family

ID=23694801

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2244139A Expired - Lifetime JPH0756675B2 (ja) 1989-10-26 1990-09-17 Speech recognition method and apparatus

Country Status (4)

Country Link
US (1) US5263117A
EP (1) EP0424665A2
JP (1) JPH0756675B2
CA (1) CA2024382C

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5745649A (en) * 1994-07-07 1998-04-28 Nynex Science & Technology Corporation Automated speech recognition using a plurality of different multilayer perception structures to model a plurality of distinct phoneme categories
US5680509A (en) * 1994-09-27 1997-10-21 International Business Machines Corporation Method and apparatus for estimating phone class probabilities a-posteriori using a decision tree
US5729656A (en) 1994-11-30 1998-03-17 International Business Machines Corporation Reduction of search space in speech recognition using phone boundaries and phone ranking
AU5738296A (en) * 1995-05-26 1996-12-11 Applied Language Technologies Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system
US5822730A (en) * 1996-08-22 1998-10-13 Dragon Systems, Inc. Lexical tree pre-filtering in speech recognition
US5864819A (en) * 1996-11-08 1999-01-26 International Business Machines Corporation Internal window object tree method for representing graphical user interface applications for speech navigation
US6167377A (en) * 1997-03-28 2000-12-26 Dragon Systems, Inc. Speech recognition language models
US6418431B1 (en) * 1998-03-30 2002-07-09 Microsoft Corporation Information retrieval and speech recognition based on language models
US6304773B1 (en) 1998-05-21 2001-10-16 Medtronic Physio-Control Manufacturing Corp. Automatic detection and reporting of cardiac asystole
US6865528B1 (en) 2000-06-01 2005-03-08 Microsoft Corporation Use of a unified language model
US7031908B1 (en) * 2000-06-01 2006-04-18 Microsoft Corporation Creating a language model for a language processing system
US6859774B2 (en) * 2001-05-02 2005-02-22 International Business Machines Corporation Error corrective mechanisms for consensus decoding of speech
US7711570B2 (en) * 2001-10-21 2010-05-04 Microsoft Corporation Application abstraction with dialog purpose
US8229753B2 (en) 2001-10-21 2012-07-24 Microsoft Corporation Web server controls for web enabled recognition and/or audible prompting
US7133856B2 (en) * 2002-05-17 2006-11-07 The Board Of Trustees Of The Leland Stanford Junior University Binary tree for complex supervised learning
US7292982B1 (en) 2003-05-29 2007-11-06 At&T Corp. Active labeling for spoken language understanding
US8301436B2 (en) * 2003-05-29 2012-10-30 Microsoft Corporation Semantic object synchronous understanding for highly interactive interface
US7200559B2 (en) 2003-05-29 2007-04-03 Microsoft Corporation Semantic object synchronous understanding implemented with speech application language tags
US8160883B2 (en) 2004-01-10 2012-04-17 Microsoft Corporation Focus tracking in dialogs

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4181813A (en) * 1978-05-08 1980-01-01 John Marley System and method for speech recognition
JPS57211338A (en) * 1981-06-24 1982-12-25 Tokyo Shibaura Electric Co Total image diagnosis data treating apparatus
JPS58115497A (ja) * 1981-12-28 1983-07-09 Sharp Corporation Speech recognition method
US4658429A (en) * 1983-12-29 1987-04-14 Hitachi, Ltd. System and method for preparing a recognition dictionary
JPS60262290A (ja) * 1984-06-08 1985-12-25 Hitachi Ltd Information recognition system
US4759068A (en) * 1985-05-29 1988-07-19 International Business Machines Corporation Constructing Markov models of words from multiple utterances
FR2591005B1 (fr) * 1985-12-04 1988-01-08 Thomson Csf Method for identifying tree structures in digital images and its application to an image processing device
US4719571A (en) * 1986-03-05 1988-01-12 International Business Machines Corporation Algorithm for constructing tree structured classifiers
US4852173A (en) * 1987-10-29 1989-07-25 International Business Machines Corporation Design and construction of a binary-tree system for language modelling

Also Published As

Publication number Publication date
EP0424665A2 (en) 1991-05-02
CA2024382A1 (en) 1991-04-27
CA2024382C (en) 1994-08-02
US5263117A (en) 1993-11-16
EP0424665A3 1994-01-12
JPH03147079A (ja) 1991-06-24

Similar Documents

Publication Publication Date Title
JPH0756675B2 (ja) Speech recognition method and apparatus
Sigtia et al. An end-to-end neural network for polyphonic piano music transcription
CN111291183B (zh) Method and apparatus for classification prediction using a text classification model
US9728183B2 (en) System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification
US20140067735A1 (en) Computer-implemented deep tensor neural network
JP7332024B2 (ja) Recognition device, learning device, methods therefor, and program
WO2019158927A1 (en) A method of generating music data
JP6704585B2 (ja) Information processing device
Harere et al. Mispronunciation detection of basic Quranic recitation rules using deep learning
Altınçay et al. An information theoretic framework for weight estimation in the combination of probabilistic classifiers for speaker identification
Shetty et al. Bi-directional long short-term memory neural networks for music composition
Noroozi et al. Speech-based emotion recognition and next reaction prediction
Campbell Analog I/O nets for syllable timing
US6430532B2 (en) Determining an adequate representative sound using two quality criteria, from sound models chosen from a structure including a set of sound models
KR102159988B1 (ko) Method and system for generating a voice montage
Zagagy et al. Ackem: automatic classification, using knn based ensemble modeling
JP2006201265A (ja) Speech recognition device
EP4167227B1 (en) System and method for recognising chords in music
Blomqvist et al. Swedish Dialect Classification using Artificial Neural Networks and Gaussian Mixture Models
Johansson Generative AI for Time-Dependent Data
Naaman et al. Learning Similarity Functions for Pronunciation Variations
Shankar et al. Spoken term detection from continuous speech using ANN posteriors and image processing techniques
Do Neural networks for automatic speaker, language, and sex identification
Puthran et al. A Multimodal Method for Detecting Language through Speech in Ten Indian Languages
CN119830949A (zh) Processing method, apparatus, device, and storage medium for natural language processing tasks