ATE545130T1 - Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames

Info

Publication number
ATE545130T1
Authority
AT
Austria
Prior art keywords
frames
distance
optimizing
recognition system
variable number
Prior art date
Application number
AT02798359T
Other languages
English (en)
Inventor
Roberto Gemello
Dario Albesano
Original Assignee
Loquendo Spa
Priority date
Filing date
Publication date
Application filed by Loquendo Spa
Application granted
Publication of ATE545130T1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/16: Speech classification or search using artificial neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Time-Division Multiplex Systems (AREA)
  • Telephonic Communication Services (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
AT02798359T 2002-12-23 2002-12-23 Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames ATE545130T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2002/014718 WO2004057573A1 (en) 2002-12-23 2002-12-23 Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames

Publications (1)

Publication Number Publication Date
ATE545130T1 (de) 2012-02-15

Family

ID=32668683

Family Applications (1)

Application Number Title Priority Date Filing Date
AT02798359T ATE545130T1 (de) 2002-12-23 2002-12-23 Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames

Country Status (5)

Country Link
US (1) US7769580B2 (de)
EP (1) EP1576580B1 (de)
AT (1) ATE545130T1 (de)
AU (1) AU2002363894A1 (de)
WO (1) WO2004057573A1 (de)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101351019B1 2007-04-13 2014-01-13 LG Electronics Inc. Apparatus for transmitting and receiving a broadcast signal and method of transmitting and receiving a broadcast signal
US8401331B2 (en) * 2007-12-06 2013-03-19 Alcatel Lucent Video quality analysis using a linear approximation technique
US20100057452A1 (en) * 2008-08-28 2010-03-04 Microsoft Corporation Speech interfaces
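CN103021408B * 2012-12-04 2014-10-22 Institute of Automation, Chinese Academy of Sciences Method and device for optimized speech recognition decoding assisted by stable pronunciation segments
JP6596924B2 * 2014-05-29 2019-10-30 NEC Corporation Speech data processing device, speech data processing method, and speech data processing program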
US9627532B2 (en) * 2014-06-18 2017-04-18 Nuance Communications, Inc. Methods and apparatus for training an artificial neural network for use in speech recognition
US9520128B2 (en) * 2014-09-23 2016-12-13 Intel Corporation Frame skipping with extrapolation and outputs on demand neural network for automatic speech recognition
US10304440B1 (en) * 2015-07-10 2019-05-28 Amazon Technologies, Inc. Keyword spotting using multi-task configuration
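KR102423302B1 * 2015-10-06 2022-07-19 Samsung Electronics Co., Ltd. Apparatus and method for calculating acoustic score in speech recognition, and apparatus and method for training acoustic model
BR102016007265B1 * 2016-04-01 2022-11-16 Samsung Eletrônica da Amazônia Ltda. Multimodal and real-time method for filtering sensitive content
KR102565274B1 2016-07-07 2023-08-09 Samsung Electronics Co., Ltd. Automatic interpretation method and apparatus, and machine translation method and apparatus
CN109697977B * 2017-10-23 2023-10-31 Samsung Electronics Co., Ltd. Speech recognition method and device
KR102676221B1 2017-10-23 2024-06-19 Samsung Electronics Co., Ltd. Speech recognition method and apparatus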
US10628486B2 (en) * 2017-11-15 2020-04-21 Google Llc Partitioning videos
KR102424514B1 2017-12-04 2022-07-25 Samsung Electronics Co., Ltd. Language processing method and apparatus
WO2020218634A1 * 2019-04-23 2020-10-29 LG Electronics Inc. Method and apparatus for determining a responding device
US12266351B2 (en) * 2022-08-26 2025-04-01 Qualcomm Incorporated Adaptive frame skipping for speech recognition
US12525226B2 (en) * 2023-02-10 2026-01-13 Qualcomm Incorporated Latency reduction for multi-stage speech recognition

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4379949A (en) * 1981-08-10 1983-04-12 Motorola, Inc. Method of and means for variable-rate coding of LPC parameters
US4489434A (en) * 1981-10-05 1984-12-18 Exxon Corporation Speech recognition method and apparatus
US5317673A (en) * 1992-06-22 1994-05-31 Sri International Method and apparatus for context-dependent estimation of multiple probability distributions of phonetic classes with multilayer perceptrons in a speech recognition system
US5461696A (en) * 1992-10-28 1995-10-24 Motorola, Inc. Decision directed adaptive neural network
US5461699A (en) * 1993-10-25 1995-10-24 International Business Machines Corporation Forecasting using a neural network and a statistical forecast
US5689616A (en) * 1993-11-19 1997-11-18 Itt Corporation Automatic language identification/verification system
US6061652A (en) * 1994-06-13 2000-05-09 Matsushita Electric Industrial Co., Ltd. Speech recognition apparatus
US5638487A (en) * 1994-12-30 1997-06-10 Purespeech, Inc. Automatic speech recognition
IT1280816B1 * 1995-03-22 1998-02-11 Cselt Centro Studi Lab Telecom Method for speeding up the execution of neural networks for processing correlated signals.
US6064958A (en) * 1996-09-20 2000-05-16 Nippon Telegraph And Telephone Corporation Pattern recognition scheme using probabilistic models based on mixtures distribution of discrete distribution
US5924065A (en) * 1997-06-16 1999-07-13 Digital Equipment Corporation Environmently compensated speech processing
US6032116A (en) * 1997-06-27 2000-02-29 Advanced Micro Devices, Inc. Distance measure in a speech recognition system for speech recognition using frequency shifting factors to compensate for input signal frequency shifts
US6253178B1 (en) 1997-09-22 2001-06-26 Nortel Networks Limited Search and rescoring method for a speech recognition system
US6118392A (en) * 1998-03-12 2000-09-12 Liquid Audio Inc. Lossless data compression with low complexity
US6801895B1 (en) * 1998-12-07 2004-10-05 At&T Corp. Method and apparatus for segmenting a multi-media program based upon audio events
US6236942B1 (en) * 1998-09-15 2001-05-22 Scientific Prediction Incorporated System and method for delineating spatially dependent objects, such as hydrocarbon accumulations from seismic data
US6404925B1 (en) * 1999-03-11 2002-06-11 Fuji Xerox Co., Ltd. Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition
US6418407B1 (en) * 1999-09-30 2002-07-09 Motorola, Inc. Method and apparatus for pitch determination of a low bit rate digital voice message
US6496798B1 (en) * 1999-09-30 2002-12-17 Motorola, Inc. Method and apparatus for encoding and decoding frames of voice model parameters into a low bit rate digital voice message
US6418405B1 (en) * 1999-09-30 2002-07-09 Motorola, Inc. Method and apparatus for dynamic segmentation of a low bit rate digital voice message
US6567775B1 (en) * 2000-04-26 2003-05-20 International Business Machines Corporation Fusion of audio and video based speaker identification for multimedia information access
US6418378B1 (en) * 2000-06-26 2002-07-09 Westerngeco, L.L.C. Neural net prediction of seismic streamer shape
US6944315B1 (en) * 2000-10-31 2005-09-13 Intel Corporation Method and apparatus for performing scale-invariant gesture recognition
US7433820B2 (en) * 2004-05-12 2008-10-07 International Business Machines Corporation Asynchronous Hidden Markov Model method and system
US7391907B1 (en) * 2004-10-01 2008-06-24 Objectvideo, Inc. Spurious object detection in a video surveillance system

Also Published As

Publication number Publication date
EP1576580B1 (de) 2012-02-08
US20060111897A1 (en) 2006-05-25
WO2004057573A1 (en) 2004-07-08
AU2002363894A1 (en) 2004-07-14
US7769580B2 (en) 2010-08-03
EP1576580A1 (de) 2005-09-21

Similar Documents

Publication Publication Date Title
ATE545130T1 (de) Method of optimising the execution of a neural network in a speech recognition system through conditionally skipping a variable number of frames
ATE262723T1 (de) Improved method for recovering lost data frames for an LPC-based parametric speech coding system.
CN111899759B (zh) Audio data pre-training and model training method, apparatus, device, and medium
CN110473529A (zh) Streaming speech transcription system based on a self-attention mechanism
DE60123999D1 (de) Gain factor quantization for a CELP speech coder
CA2463230A1 (en) A method and apparatus for decoding handwritten characters
CA2486128A1 (en) System and method for using meta-data dependent language modeling for automatic speech recognition
ATE345562T1 (de) Method and device for generating reference patterns for a speaker-independent speech recognition system
DE60017763D1 (de) Method and device for maintaining a target bit rate in a speech coder
DE60023851T2 (de) Method and device for generating random numbers for speech coders operating at 1/8 bit rate
DK1328927T3 (da) Method and system for estimating an artificial high-band signal in a speech codec
ATE349750T1 (de) Method for accelerating the execution of speech recognition with neural networks, and corresponding device
ATE319160T1 (de) Method for noise-robust classification in speech coding
TW200818802A (en) Systems, methods, and apparatus for signal change detection
KR101535135B1 (ko) Method and system for sound enhancement using non-negative matrix factorization and basis matrix update
US7797156B2 (en) Speech analyzing system with adaptive noise codebook
CN1145140C (zh) Method for selectively assigning a penalty to a probability associated with a speech recognition system
CN1342969A (zh) Method for recognizing speech
Chetouani et al. Neural predictive coding for speech discriminant feature extraction: The DFE-NPC.
GB2419204A (en) Conditional rate modelling
WO2002025452A3 (fr) Method and device for traffic prediction with a neural network
Turunen et al. Hammerstein model for speech coding
Lee et al. Recognizing low/high anger in speech for call centers
CN112614515A (zh) Audio processing method, apparatus, electronic device, and storage medium
JP2008262523A (ja) Dialogue system evaluation method