ATE545130T1 - Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames
- Publication number
- ATE545130T1
- Authority
- AT
- Austria
- Prior art keywords
- frames
- distance
- optimizing
- recognition system
- variable number
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Time-Division Multiplex Systems (AREA)
- Telephonic Communication Services (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2002/014718 WO2004057573A1 (en) | 2002-12-23 | 2002-12-23 | Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| ATE545130T1 (de) | 2012-02-15 |
Family
ID=32668683
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AT02798359T ATE545130T1 (de) | 2002-12-23 | 2002-12-23 | Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US7769580B2 (de) |
| EP (1) | EP1576580B1 (de) |
| AT (1) | ATE545130T1 (de) |
| AU (1) | AU2002363894A1 (de) |
| WO (1) | WO2004057573A1 (de) |
Families Citing this family (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101351019B1 (ko) | 2007-04-13 | 2014-01-13 | 엘지전자 주식회사 | Broadcast signal transmitting and receiving apparatus and broadcast signal transmitting and receiving method |
| US8401331B2 (en) * | 2007-12-06 | 2013-03-19 | Alcatel Lucent | Video quality analysis using a linear approximation technique |
| US20100057452A1 (en) * | 2008-08-28 | 2010-03-04 | Microsoft Corporation | Speech interfaces |
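| CN103021408B (zh) * | 2012-12-04 | 2014-10-22 | 中国科学院自动化研究所 | Speech recognition optimized decoding method and device assisted by stable pronunciation segments |
| JP6596924B2 (ja) * | 2014-05-29 | 2019-10-30 | 日本電気株式会社 | Speech data processing device, speech data processing method, and speech data processing program |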
| US9627532B2 (en) * | 2014-06-18 | 2017-04-18 | Nuance Communications, Inc. | Methods and apparatus for training an artificial neural network for use in speech recognition |
| US9520128B2 (en) * | 2014-09-23 | 2016-12-13 | Intel Corporation | Frame skipping with extrapolation and outputs on demand neural network for automatic speech recognition |
| US10304440B1 (en) * | 2015-07-10 | 2019-05-28 | Amazon Technologies, Inc. | Keyword spotting using multi-task configuration |
| KR102423302B1 (ko) * | 2015-10-06 | 2022-07-19 | 삼성전자주식회사 | Apparatus and method for calculating acoustic scores in speech recognition, and apparatus and method for training an acoustic model |
| BR102016007265B1 (pt) * | 2016-04-01 | 2022-11-16 | Samsung Eletrônica da Amazônia Ltda. | Multimodal, real-time method for filtering sensitive content |
| KR102565274B1 (ko) | 2016-07-07 | 2023-08-09 | 삼성전자주식회사 | Automatic interpretation method and apparatus, and machine translation method and apparatus |
| CN109697977B (zh) * | 2017-10-23 | 2023-10-31 | 三星电子株式会社 | Speech recognition method and device |
| KR102676221B1 (ko) | 2017-10-23 | 2024-06-19 | 삼성전자주식회사 | Speech recognition method and apparatus |
| US10628486B2 (en) * | 2017-11-15 | 2020-04-21 | Google Llc | Partitioning videos |
| KR102424514B1 (ko) | 2017-12-04 | 2022-07-25 | 삼성전자주식회사 | Language processing method and apparatus |
| WO2020218634A1 (ko) * | 2019-04-23 | 2020-10-29 | 엘지전자 주식회사 | Method and apparatus for determining a responding device |
| US12266351B2 (en) * | 2022-08-26 | 2025-04-01 | Qualcomm Incorporated | Adaptive frame skipping for speech recognition |
| US12525226B2 (en) * | 2023-02-10 | 2026-01-13 | Qualcomm Incorporated | Latency reduction for multi-stage speech recognition |
Family Cites Families (25)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4379949A (en) * | 1981-08-10 | 1983-04-12 | Motorola, Inc. | Method of and means for variable-rate coding of LPC parameters |
| US4489434A (en) * | 1981-10-05 | 1984-12-18 | Exxon Corporation | Speech recognition method and apparatus |
| US5317673A (en) * | 1992-06-22 | 1994-05-31 | Sri International | Method and apparatus for context-dependent estimation of multiple probability distributions of phonetic classes with multilayer perceptrons in a speech recognition system |
| US5461696A (en) * | 1992-10-28 | 1995-10-24 | Motorola, Inc. | Decision directed adaptive neural network |
| US5461699A (en) * | 1993-10-25 | 1995-10-24 | International Business Machines Corporation | Forecasting using a neural network and a statistical forecast |
| US5689616A (en) * | 1993-11-19 | 1997-11-18 | Itt Corporation | Automatic language identification/verification system |
| US6061652A (en) * | 1994-06-13 | 2000-05-09 | Matsushita Electric Industrial Co., Ltd. | Speech recognition apparatus |
| US5638487A (en) * | 1994-12-30 | 1997-06-10 | Purespeech, Inc. | Automatic speech recognition |
| IT1280816B1 (it) * | 1995-03-22 | 1998-02-11 | Cselt Centro Studi Lab Telecom | Method for speeding up the execution of neural networks for processing correlated signals. |
| US6064958A (en) * | 1996-09-20 | 2000-05-16 | Nippon Telegraph And Telephone Corporation | Pattern recognition scheme using probabilistic models based on mixtures distribution of discrete distribution |
| US5924065A (en) * | 1997-06-16 | 1999-07-13 | Digital Equipment Corporation | Environmently compensated speech processing |
| US6032116A (en) * | 1997-06-27 | 2000-02-29 | Advanced Micro Devices, Inc. | Distance measure in a speech recognition system for speech recognition using frequency shifting factors to compensate for input signal frequency shifts |
| US6253178B1 (en) | 1997-09-22 | 2001-06-26 | Nortel Networks Limited | Search and rescoring method for a speech recognition system |
| US6118392A (en) * | 1998-03-12 | 2000-09-12 | Liquid Audio Inc. | Lossless data compression with low complexity |
| US6801895B1 (en) * | 1998-12-07 | 2004-10-05 | At&T Corp. | Method and apparatus for segmenting a multi-media program based upon audio events |
| US6236942B1 (en) * | 1998-09-15 | 2001-05-22 | Scientific Prediction Incorporated | System and method for delineating spatially dependent objects, such as hydrocarbon accumulations from seismic data |
| US6404925B1 (en) * | 1999-03-11 | 2002-06-11 | Fuji Xerox Co., Ltd. | Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition |
| US6418407B1 (en) * | 1999-09-30 | 2002-07-09 | Motorola, Inc. | Method and apparatus for pitch determination of a low bit rate digital voice message |
| US6496798B1 (en) * | 1999-09-30 | 2002-12-17 | Motorola, Inc. | Method and apparatus for encoding and decoding frames of voice model parameters into a low bit rate digital voice message |
| US6418405B1 (en) * | 1999-09-30 | 2002-07-09 | Motorola, Inc. | Method and apparatus for dynamic segmentation of a low bit rate digital voice message |
| US6567775B1 (en) * | 2000-04-26 | 2003-05-20 | International Business Machines Corporation | Fusion of audio and video based speaker identification for multimedia information access |
| US6418378B1 (en) * | 2000-06-26 | 2002-07-09 | Westerngeco, L.L.C. | Neural net prediction of seismic streamer shape |
| US6944315B1 (en) * | 2000-10-31 | 2005-09-13 | Intel Corporation | Method and apparatus for performing scale-invariant gesture recognition |
| US7433820B2 (en) * | 2004-05-12 | 2008-10-07 | International Business Machines Corporation | Asynchronous Hidden Markov Model method and system |
| US7391907B1 (en) * | 2004-10-01 | 2008-06-24 | Objectvideo, Inc. | Spurious object detection in a video surveillance system |
2002
- 2002-12-23: EP application EP02798359A, patent EP1576580B1 (de), status: not active (Expired - Lifetime)
- 2002-12-23: WO application PCT/EP2002/014718, patent WO2004057573A1 (en), status: not active (Ceased)
- 2002-12-23: AT application AT02798359T, patent ATE545130T1 (de), status: active
- 2002-12-23: AU application AU2002363894A, patent AU2002363894A1 (en), status: not active (Abandoned)
- 2002-12-23: US application US10/538,876, patent US7769580B2 (en), status: not active (Expired - Lifetime)
Also Published As
| Publication number | Publication date |
|---|---|
| EP1576580B1 (de) | 2012-02-08 |
| US20060111897A1 (en) | 2006-05-25 |
| WO2004057573A1 (en) | 2004-07-08 |
| AU2002363894A1 (en) | 2004-07-14 |
| US7769580B2 (en) | 2010-08-03 |
| EP1576580A1 (de) | 2005-09-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| ATE545130T1 (de) | | Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames |
| ATE262723T1 (de) | | Improved methods for recovering lost data frames for an LPC-based parametric speech coding system. |
| CN111899759B (zh) | | Pre-training and model training method, apparatus, device, and medium for audio data |
| CN110473529A (zh) | | Streaming speech transcription system based on a self-attention mechanism |
| DE60123999D1 (de) | | Gain factor quantization for a CELP speech coder |
| CA2463230A1 (en) | | A method and apparatus for decoding handwritten characters |
| CA2486128A1 (en) | | System and method for using meta-data dependent language modeling for automatic speech recognition |
| ATE345562T1 (de) | | Method and device for generating reference patterns for a speaker-independent speech recognition system |
| DE60017763D1 (de) | | Method and device for maintaining a target bit rate in a speech coder |
| DE60023851T2 (de) | | Method and device for generating random numbers for speech coders operating at 1/8 bit rate |
| DK1328927T3 (da) | | Method and system for estimating an artificial high-band signal in a speech codec |
| ATE349750T1 (de) | | Method for accelerating the execution of speech recognition with neural networks, and corresponding device |
| ATE319160T1 (de) | | Method for noise-robust classification in speech coding |
| TW200818802A (en) | | Systems, methods, and apparatus for signal change detection |
| KR101535135B1 (ko) | | Method and system for sound enhancement using non-negative matrix factorization and basis matrix update |
| US7797156B2 (en) | | Speech analyzing system with adaptive noise codebook |
| CN1145140C (zh) | | Method for selectively assigning a penalty to a probability associated with a speech recognition system |
| CN1342969A (zh) | | Method for recognizing speech |
| Chetouani et al. | | Neural predictive coding for speech discriminant feature extraction: The DFE-NPC. |
| GB2419204A (en) | | Conditional rate modelling |
| WO2002025452A3 (fr) | | Method and device for traffic prediction with a neural network |
| Turunen et al. | | Hammerstein model for speech coding |
| Lee et al. | | Recognizing low/high anger in speech for call centers |
| CN112614515A (zh) | | Audio processing method, apparatus, electronic device, and storage medium |
| JP2008262523A (ja) | | Dialogue system evaluation method |