JPH0756675B2 - Speech recognition method and apparatus - Google Patents
Speech recognition method and apparatus
- Publication number
- JPH0756675B2
- Authority
- JP
- Japan
- Prior art keywords
- predictor
- opt
- feature
- word
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US07/427,420 US5263117A (en) | 1989-10-26 | 1989-10-26 | Method and apparatus for finding the best splits in a decision tree for a language model for a speech recognizer |
| US427420 | 1989-10-26 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JPH03147079A (ja) | 1991-06-24 |
| JPH0756675B2 (ja) | 1995-06-14 |
Family
ID=23694801
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2244139A, JPH0756675B2 (ja), Expired - Lifetime | 1989-10-26 | 1990-09-17 | Speech recognition method and apparatus |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US5263117A |
| EP (1) | EP0424665A2 |
| JP (1) | JPH0756675B2 |
| CA (1) | CA2024382C |
Families Citing this family (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5745649A (en) * | 1994-07-07 | 1998-04-28 | Nynex Science & Technology Corporation | Automated speech recognition using a plurality of different multilayer perception structures to model a plurality of distinct phoneme categories |
| US5680509A (en) * | 1994-09-27 | 1997-10-21 | International Business Machines Corporation | Method and apparatus for estimating phone class probabilities a-posteriori using a decision tree |
| US5729656A (en) | 1994-11-30 | 1998-03-17 | International Business Machines Corporation | Reduction of search space in speech recognition using phone boundaries and phone ranking |
| AU5738296A (en) * | 1995-05-26 | 1996-12-11 | Applied Language Technologies | Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system |
| US5822730A (en) * | 1996-08-22 | 1998-10-13 | Dragon Systems, Inc. | Lexical tree pre-filtering in speech recognition |
| US5864819A (en) * | 1996-11-08 | 1999-01-26 | International Business Machines Corporation | Internal window object tree method for representing graphical user interface applications for speech navigation |
| US6167377A (en) * | 1997-03-28 | 2000-12-26 | Dragon Systems, Inc. | Speech recognition language models |
| US6418431B1 (en) * | 1998-03-30 | 2002-07-09 | Microsoft Corporation | Information retrieval and speech recognition based on language models |
| US6304773B1 (en) | 1998-05-21 | 2001-10-16 | Medtronic Physio-Control Manufacturing Corp. | Automatic detection and reporting of cardiac asystole |
| US6865528B1 (en) | 2000-06-01 | 2005-03-08 | Microsoft Corporation | Use of a unified language model |
| US7031908B1 (en) * | 2000-06-01 | 2006-04-18 | Microsoft Corporation | Creating a language model for a language processing system |
| US6859774B2 (en) * | 2001-05-02 | 2005-02-22 | International Business Machines Corporation | Error corrective mechanisms for consensus decoding of speech |
| US7711570B2 (en) * | 2001-10-21 | 2010-05-04 | Microsoft Corporation | Application abstraction with dialog purpose |
| US8229753B2 (en) | 2001-10-21 | 2012-07-24 | Microsoft Corporation | Web server controls for web enabled recognition and/or audible prompting |
| US7133856B2 (en) * | 2002-05-17 | 2006-11-07 | The Board Of Trustees Of The Leland Stanford Junior University | Binary tree for complex supervised learning |
| US7292982B1 (en) | 2003-05-29 | 2007-11-06 | At&T Corp. | Active labeling for spoken language understanding |
| US8301436B2 (en) * | 2003-05-29 | 2012-10-30 | Microsoft Corporation | Semantic object synchronous understanding for highly interactive interface |
| US7200559B2 (en) | 2003-05-29 | 2007-04-03 | Microsoft Corporation | Semantic object synchronous understanding implemented with speech application language tags |
| US8160883B2 (en) | 2004-01-10 | 2012-04-17 | Microsoft Corporation | Focus tracking in dialogs |
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4181813A (en) * | 1978-05-08 | 1980-01-01 | John Marley | System and method for speech recognition |
| JPS57211338A (en) * | 1981-06-24 | 1982-12-25 | Tokyo Shibaura Electric Co | Total image diagnosis data treating apparatus |
| JPS58115497A (ja) * | 1981-12-28 | 1983-07-09 | Sharp Corporation | Speech recognition method |
| US4658429A (en) * | 1983-12-29 | 1987-04-14 | Hitachi, Ltd. | System and method for preparing a recognition dictionary |
| JPS60262290A (ja) * | 1984-06-08 | 1985-12-25 | Hitachi Ltd | Information recognition system |
| US4759068A (en) * | 1985-05-29 | 1988-07-19 | International Business Machines Corporation | Constructing Markov models of words from multiple utterances |
| FR2591005B1 (fr) * | 1985-12-04 | 1988-01-08 | Thomson Csf | Method for identifying tree structures in digital images and its application to an image-processing device |
| US4719571A (en) * | 1986-03-05 | 1988-01-12 | International Business Machines Corporation | Algorithm for constructing tree structured classifiers |
| US4852173A (en) * | 1987-10-29 | 1989-07-25 | International Business Machines Corporation | Design and construction of a binary-tree system for language modelling |
1989
- 1989-10-26: US application US07/427,420, patent US5263117A (not active; Expired - Lifetime)
1990
- 1990-08-31: CA application CA002024382A, patent CA2024382C (not active; Expired - Fee Related)
- 1990-09-17: JP application JP2244139A, patent JPH0756675B2 (not active; Expired - Lifetime)
- 1990-09-19: EP application EP90118023A, patent EP0424665A2 (not active; Withdrawn)
Also Published As
| Publication number | Publication date |
|---|---|
| EP0424665A2 (en) | 1991-05-02 |
| CA2024382A1 (en) | 1991-04-27 |
| CA2024382C (en) | 1994-08-02 |
| US5263117A (en) | 1993-11-16 |
| EP0424665A3 | 1994-01-12 |
| JPH03147079A (ja) | 1991-06-24 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| JPH0756675B2 (ja) | Speech recognition method and apparatus | |
| Sigtia et al. | An end-to-end neural network for polyphonic piano music transcription | |
| CN111291183B (zh) | Method and apparatus for classification prediction using a text classification model | |
| US9728183B2 (en) | System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification | |
| US20140067735A1 (en) | Computer-implemented deep tensor neural network | |
| JP7332024B2 (ja) | Recognition device, learning device, methods therefor, and program | |
| WO2019158927A1 (en) | A method of generating music data | |
| JP6704585B2 (ja) | Information processing device | |
| Harere et al. | Mispronunciation detection of basic Quranic recitation rules using deep learning | |
| Altınçay et al. | An information theoretic framework for weight estimation in the combination of probabilistic classifiers for speaker identification | |
| Shetty et al. | Bi-directional long short-term memory neural networks for music composition | |
| Noroozi et al. | Speech-based emotion recognition and next reaction prediction | |
| Campbell | Analog I/O nets for syllable timing | |
| US6430532B2 (en) | Determining an adequate representative sound using two quality criteria, from sound models chosen from a structure including a set of sound models | |
| KR102159988B1 (ko) | Method and system for generating a voice montage | |
| Zagagy et al. | Ackem: automatic classification, using knn based ensemble modeling | |
| JP2006201265A (ja) | Speech recognition device | |
| EP4167227B1 (en) | System and method for recognising chords in music | |
| Blomqvist et al. | Swedish Dialect Classification using Artificial Neural Networks and Gaussian Mixture Models | |
| Johansson | Generative AI for Time-Dependent Data | |
| Naaman et al. | Learning Similarity Functions for Pronunciation Variations | |
| Shankar et al. | Spoken term detection from continuous speech using ANN posteriors and image processing techniques | |
| Do | Neural networks for automatic speaker, language, and sex identification | |
| Puthran et al. | A Multimodal Method for Detecting Language through Speech in Ten Indian Languages | |
| CN119830949A (zh) | Processing method, apparatus, device, and storage medium for natural language processing tasks | |