ATE339756T1 - Method and apparatus for determining formants using a residual signal model - Google Patents

Method and apparatus for determining formants using a residual signal model

Info

Publication number
ATE339756T1
Authority
AT
Austria
Prior art keywords
formants
identified
residual signal
signal model
model
Prior art date
Application number
AT04007986T
Other languages
English (en)
Inventor
Issam Bazzi
Li Deng
Alejandro Acero
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-04-01
Filing date
2004-04-01
Publication date
2006-10-15
Application filed by Microsoft Corp
Application granted
Publication of ATE339756T1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/15 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being formant information

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Telephonic Communication Services (AREA)
AT04007986T 2003-04-01 2004-04-01 Method and apparatus for determining formants using a residual signal model ATE339756T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/404,411 US7424423B2 (en) 2003-04-01 2003-04-01 Method and apparatus for formant tracking using a residual model

Publications (1)

Publication Number Publication Date
ATE339756T1 (de) 2006-10-15

Family

ID=32850595

Family Applications (1)

Application Number Title Priority Date Filing Date
AT04007986T ATE339756T1 (de) 2003-04-01 2004-04-01 Method and apparatus for determining formants using a residual signal model

Country Status (7)

Country Link
US (1) US7424423B2 (de)
EP (1) EP1465153B1 (de)
JP (1) JP4718789B2 (de)
KR (1) KR101026632B1 (de)
CN (1) CN100562926C (de)
AT (1) ATE339756T1 (de)
DE (1) DE602004002312T2 (de)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7475011B2 (en) * 2004-08-25 2009-01-06 Microsoft Corporation Greedy algorithm for identifying values for vocal tract resonance vectors
KR100634526B1 (ko) * 2004-11-24 2006-10-16 삼성전자주식회사 Formant tracking apparatus and method
KR100717625B1 (ko) 2006-02-10 2007-05-15 삼성전자주식회사 Method and apparatus for estimating formant frequency in speech recognition
US8010356B2 (en) * 2006-02-17 2011-08-30 Microsoft Corporation Parameter learning in a hidden trajectory model
US7877255B2 (en) * 2006-03-31 2011-01-25 Voice Signal Technologies, Inc. Speech recognition using channel verification
EP1930879B1 (de) * 2006-09-29 2009-07-29 Honda Research Institute Europe GmbH Joint estimation of formant trajectories using Bayesian techniques and adaptive segmentation
CN101067929B (zh) * 2007-06-05 2011-04-20 南京大学 Method for extracting speech formant trajectories using formant enhancement
EP2232700B1 (de) 2007-12-21 2014-08-13 Dts Llc System for adjusting the perceived loudness of audio signals
US8538042B2 (en) 2009-08-11 2013-09-17 Dts Llc System for increasing perceived loudness of speakers
US8204742B2 (en) * 2009-09-14 2012-06-19 Srs Labs, Inc. System for processing an audio signal to enhance speech intelligibility
US20120078625A1 (en) * 2010-09-23 2012-03-29 Waveform Communications, Llc Waveform analysis of speech
US20140207456A1 (en) * 2010-09-23 2014-07-24 Waveform Communications, Llc Waveform analysis of speech
JP6147744B2 (ja) 2011-07-29 2017-06-14 Dts Llc Adaptive voice intelligibility processing system and method
US9312829B2 (en) 2012-04-12 2016-04-12 Dts Llc System for adjusting loudness of audio signals in real time
US9728200B2 (en) * 2013-01-29 2017-08-08 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for adaptive formant sharpening in linear prediction coding
US9520141B2 (en) * 2013-02-28 2016-12-13 Google Inc. Keyboard typing detection and suppression
US9805714B2 (en) * 2016-03-22 2017-10-31 Asustek Computer Inc. Directional keyword verification method applicable to electronic device and electronic device using the same

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3649765A (en) * 1969-10-29 1972-03-14 Bell Telephone Labor Inc Speech analyzer-synthesizer system employing improved formant extractor
JPH0785200B2 (ja) * 1986-11-13 1995-09-13 日本電気株式会社 Method for creating spectral standard patterns
US5799276A (en) * 1995-11-07 1998-08-25 Accent Incorporated Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals
US5729694A (en) * 1996-02-06 1998-03-17 The Regents Of The University Of California Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
US6064958A (en) 1996-09-20 2000-05-16 Nippon Telegraph And Telephone Corporation Pattern recognition scheme using probabilistic models based on mixtures distribution of discrete distribution
US5815090A (en) * 1996-10-31 1998-09-29 University Of Florida Research Foundation, Inc. Remote monitoring system for detecting termites
JP2986792B2 (ja) * 1998-03-16 1999-12-06 株式会社エイ・ティ・アール音声翻訳通信研究所 Speaker normalization processing apparatus and speech recognition apparatus
US6980952B1 (en) * 1998-08-15 2005-12-27 Texas Instruments Incorporated Source normalization training for HMM modeling of speech
US6502066B2 (en) 1998-11-24 2002-12-31 Microsoft Corporation System for generating formant tracks by modifying formants synthesized from speech units
US20010044719A1 (en) * 1999-07-02 2001-11-22 Mitsubishi Electric Research Laboratories, Inc. Method and system for recognizing, indexing, and searching acoustic signals
US6910007B2 (en) * 2000-05-31 2005-06-21 At&T Corp Stochastic modeling of spectral adjustment for high quality pitch modification
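JP2002133411A (ja) * 2000-08-17 2002-05-10 Canon Inc Information processing method, information processing apparatus, and program
JP2002278592A (ja) * 2001-03-21 2002-09-27 Fujitsu Ltd Data collation program, data collation method, and data collation apparatus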
US6931374B2 (en) 2003-04-01 2005-08-16 Microsoft Corporation Method of speech recognition using variational inference with switching state space models

Also Published As

Publication number Publication date
JP4718789B2 (ja) 2011-07-06
DE602004002312D1 (de) 2006-10-26
EP1465153A2 (de) 2004-10-06
KR101026632B1 (ko) 2011-04-04
JP2004310091A (ja) 2004-11-04
US20040199382A1 (en) 2004-10-07
CN1534596A (zh) 2004-10-06
EP1465153A3 (de) 2005-01-19
US7424423B2 (en) 2008-09-09
EP1465153B1 (de) 2006-09-13
KR20040088364A (ko) 2004-10-16
CN100562926C (zh) 2009-11-25
DE602004002312T2 (de) 2006-12-28

Similar Documents

Publication Publication Date Title
ATE339756T1 (de) Method and apparatus for determining formants using a residual signal model
TW200601263A (en) Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition
DE60309142D1 (de) Apparatus for determining parameters of a Gaussian mixture model (GMM) or a GMM-based hidden Markov model
CN105206258A (zh) Method and apparatus for generating an acoustic model, and speech synthesis method and apparatus
WO2007027989A3 (en) Dynamic speech sharpening
ATE405919T1 (de) Phonetically based speech recognition system and method
ATE134275T1 (de) Method for speaker-adaptive speech recognition
RU2009119491A (ru) Method and apparatus for coding transition frames in speech signals
ATE265083T1 (de) Method and apparatus for discriminative training of acoustic models in a speech recognition system
DE60310785D1 (de) Method and apparatus for translating spoken language
FR2522179B1 (fr) Speech recognition method and apparatus for recognizing particular phonemes of the voice signal regardless of the speaker
DE69635655D1 (de) Speaker-adapted speech recognition
DE60325881D1 (de) Method for operating a speech recognition system
DE60128479D1 (de) Method and apparatus for determining a synthetic higher-band signal in a speech coder
ATE401644T1 (de) Method for speech recognition
JP2016539355A5 (de)
Yarra et al. Automatic detection of syllable stress using sonority based prominence features for pronunciation evaluation
DE602004004572D1 (de) Tracking vocal tract resonances using a target-guided constraint
CN105206264B (zh) Speech synthesis method and apparatus
DE69937854D1 (de) Verfahren und Vorrichtung zur Spracherkennung unter Verwendung von phonetischen Transkriptionen
CN111862939B (zh) Prosodic phrase annotation method and apparatus
ATE357723T1 (de) Method for multilingual speech recognition
Hillenbrand et al. Perception of sinewave vowels
Gong et al. Score-informed syllable segmentation for jingju a cappella singing voice with mel-frequency intensity profiles
Wang et al. Improved Mandarin speech recognition by lattice rescoring with enhanced tone models

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties