ATE464635T1 - Method for generating and using a vector codebook, method and device for compressing data, and distributed speech recognition system

Info

Publication number
ATE464635T1
Authority
AT
Austria
Prior art keywords
sub
sets
generating
compressing data
feature
Prior art date
Application number
AT04763512T
Other languages
English (en)
Inventor
Maurizio Fodrini
Donato Ettorre
Gianmario Bollano
Original Assignee
Telecom Italia Spa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telecom Italia Spa
Application granted
Publication of ATE464635T1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
AT04763512T 2004-07-23 2004-07-23 Method for generating and using a vector codebook, method and device for compressing data, and distributed speech recognition system ATE464635T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2004/008372 WO2006007871A1 (en) 2004-07-23 2004-07-23 Method for generating a vector codebook, method and device for compressing data, and distributed speech recognition system

Publications (1)

Publication Number Publication Date
ATE464635T1 (de) 2010-04-15

Family

ID=34958455

Family Applications (1)

Application Number Title Priority Date Filing Date
AT04763512T ATE464635T1 (de) Method for generating and using a vector codebook, method and device for compressing data, and distributed speech recognition system

Country Status (8)

Country Link
US (1) US8214204B2 (de)
EP (1) EP1771841B1 (de)
JP (1) JP4703648B2 (de)
KR (1) KR101010585B1 (de)
CN (1) CN101019171B (de)
AT (1) ATE464635T1 (de)
DE (1) DE602004026645D1 (de)
WO (1) WO2006007871A1 (de)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7587314B2 (en) * 2005-08-29 2009-09-08 Nokia Corporation Single-codebook vector quantization for multiple-rate applications
US20070299667A1 (en) * 2006-06-22 2007-12-27 Texas Instruments, Incorporated System and method for reducing storage requirements for a model containing mixed weighted distributions and automatic speech recognition model incorporating the same
CN101335004B (zh) * 2007-11-02 2010-04-21 华为技术有限公司 Multi-stage quantization method and device
GB0901262D0 (en) * 2009-01-26 2009-03-11 Mitsubishi Elec R&D Ct Europe Video identification
KR101711158B1 (ko) * 2010-12-22 2017-03-14 한국전자통신연구원 Method for controlling interference between adjacent cells in a cellular system
US9779731B1 (en) * 2012-08-20 2017-10-03 Amazon Technologies, Inc. Echo cancellation based on shared reference signals
US10147441B1 (en) 2013-12-19 2018-12-04 Amazon Technologies, Inc. Voice controlled system
CN103837890B (zh) * 2014-02-26 2016-07-06 中国石油集团川庆钻探工程有限公司地球物理勘探公司 Method and device for acquiring seismic data
CA3001839C (en) * 2015-10-14 2018-10-23 Pindrop Security, Inc. Call detail record analysis to identify fraudulent activity and fraud detection in interactive voice response systems
CN107564535B (zh) * 2017-08-29 2020-09-01 中国人民解放军理工大学 Distributed low-rate voice call method
US11470194B2 (en) 2019-08-19 2022-10-11 Pindrop Security, Inc. Caller verification via carrier metadata
CN112445943A (zh) * 2019-09-05 2021-03-05 阿里巴巴集团控股有限公司 Data processing method, device and system

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4958225A (en) * 1989-06-09 1990-09-18 Utah State University Foundation Full-search-equivalent method for matching data and a vector quantizer utilizing such method
US5061924B1 (en) 1991-01-25 1996-04-30 American Telephone & Telegraph Efficient vector codebook
US5651026A (en) * 1992-06-01 1997-07-22 Hughes Electronics Robust vector quantization of line spectral frequencies
JP3093879B2 (ja) * 1992-07-27 2000-10-03 オリンパス光学工業株式会社 Vector quantization codebook creation and search device
US5774839A (en) * 1995-09-29 1998-06-30 Rockwell International Corporation Delayed decision switched prediction multi-stage LSF vector quantization
GB9622055D0 (en) * 1996-10-23 1996-12-18 Univ Strathclyde Vector quantisation
US6009387A (en) * 1997-03-20 1999-12-28 International Business Machines Corporation System and method of compression/decompressing a speech signal by using split vector quantization and scalar quantization
US6161086A (en) * 1997-07-29 2000-12-12 Texas Instruments Incorporated Low-complexity speech coding with backward and inverse filtered target matching and a tree structured mutitap adaptive codebook search
US5946653A (en) 1997-10-01 1999-08-31 Motorola, Inc. Speaker independent speech recognition system and method
US6067515A (en) * 1997-10-27 2000-05-23 Advanced Micro Devices, Inc. Split matrix quantization with split vector quantization error compensation and selective enhanced processing for robust speech recognition
US5966688A (en) * 1997-10-28 1999-10-12 Hughes Electronics Corporation Speech mode based multi-stage vector quantizer
US6148283A (en) * 1998-09-23 2000-11-14 Qualcomm Inc. Method and apparatus using multi-path multi-stage vector quantizer
AU1445100A (en) 1998-10-13 2000-05-01 Hadasit Medical Research Services & Development Company Ltd Method and system for determining a vector index to represent a plurality of speech parameters in signal processing for identifying an utterance
US7389227B2 (en) * 2000-01-14 2008-06-17 C & S Technology Co., Ltd. High-speed search method for LSP quantizer using split VQ and fixed codebook of G.729 speech encoder
JP3483513B2 (ja) * 2000-03-02 2004-01-06 沖電気工業株式会社 Voice recording and playback device
JP3367931B2 (ja) * 2000-03-06 2003-01-20 日本電信電話株式会社 Conjugate structure vector quantization method
US6633839B2 (en) * 2001-02-02 2003-10-14 Motorola, Inc. Method and apparatus for speech reconstruction in a distributed speech recognition system
US7003454B2 (en) * 2001-05-16 2006-02-21 Nokia Corporation Method and system for line spectral frequency vector quantization in speech codec
CN1190772C (zh) * 2002-09-30 2005-02-23 中国科学院声学研究所 Speech recognition system and compression method for feature vector sets used in a speech recognition system
US20040176950A1 (en) * 2003-03-04 2004-09-09 Docomo Communications Laboratories Usa, Inc. Methods and apparatuses for variable dimension vector quantization

Also Published As

Publication number Publication date
WO2006007871A1 (en) 2006-01-26
US20090037172A1 (en) 2009-02-05
CN101019171B (zh) 2011-08-10
EP1771841A1 (de) 2007-04-11
EP1771841B1 (de) 2010-04-14
JP4703648B2 (ja) 2011-06-15
CN101019171A (zh) 2007-08-15
KR20070047795A (ko) 2007-05-07
KR101010585B1 (ko) 2011-01-24
DE602004026645D1 (de) 2010-05-27
WO2006007871A8 (en) 2006-03-16
US8214204B2 (en) 2012-07-03
JP2008507718A (ja) 2008-03-13

Similar Documents

Publication Publication Date Title
CN108630193B (zh) Speech recognition method and device
US7835910B1 (en) Exploiting unlabeled utterances for spoken language understanding
ATE464635T1 (de) Method for generating and using a vector codebook, method and device for compressing data, and distributed speech recognition system
WO2019191556A1 (en) Knowledge transfer in permutation invariant training for single-channel multi-talker speech recognition
EP4235369A3 (de) Modality learning on mobile devices
ATE417346T1 (de) Speech recognition and correction system, correction device and method for creating a lexicon of alternatives
CN111326168B (zh) Speech separation method, apparatus, electronic device and storage medium
JP2004536330A5 (de)
CN108986798B (zh) Method, apparatus and device for processing speech data
MX9505296A (es) Method and apparatus for speech recognition with bias noise reduction.
WO2007005098A3 (en) Method and apparatus for generating and updating a voice tag
CA3158930A1 (en) Arousal model generating method, intelligent terminal arousing method, and corresponding devices
MX2008002500A (es) Incorporation of speech training into an interactive user tutorial.
Weninger et al. Recognition of nonprototypical emotions in reverberated and noisy speech by nonnegative matrix factorization
US9378735B1 (en) Estimating speaker-specific affine transforms for neural network based speech recognition systems
CN104143335B (zh) Audio coding method and related device
DE69923026D1 (de) Speaker and environment adaptation based on voice eigenvectors and the maximum likelihood method
CA2947957A1 (en) Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
CN106875944A (zh) System for voice control of a home smart terminal
ATE540382T1 (de) Object registration
CN110544472B (zh) Method for improving the performance of speech tasks that use a CNN network structure
Tu et al. Mutual Information Enhanced Training for Speaker Embedding.
CN106297769A (zh) Discriminative feature extraction method applied to language identification
Mysore et al. A non-negative approach to language informed speech separation
CN111640450A (zh) Multi-voice audio processing method, apparatus, device and readable storage medium

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties