ATE430359T1 - System and method for identifying semantic intent from acoustic information

System and method for identifying semantic intent from acoustic information

Info

Publication number
ATE430359T1
Authority
AT
Austria
Prior art keywords
acoustics
order
cluster
acoustic information
semantic
Prior art date
Application number
AT05111074T
Other languages
English (en)
Inventor
Alejandro Acero
Asela J Gunawardana
Dong Yu
Milind Mahajan
Xiao Li
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2004-12-10
Filing date
2005-11-22
Publication date
2009-05-15
Application filed by Microsoft Corp
Application granted
Publication of ATE430359T1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
AT05111074T 2004-12-10 2005-11-22 System and method for identifying semantic intent from acoustic information ATE430359T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/009,630 US7634406B2 (en) 2004-12-10 2004-12-10 System and method for identifying semantic intent from acoustic information

Publications (1)

Publication Number Publication Date
ATE430359T1 (de) 2009-05-15

Family

ID=36021832

Family Applications (1)

Application Number Title Priority Date Filing Date
AT05111074T ATE430359T1 (de) 2004-12-10 2005-11-22 System and method for identifying semantic intent from acoustic information

Country Status (5)

Country Link
US (1) US7634406B2 (de)
EP (1) EP1669980B1 (de)
JP (1) JP4974510B2 (de)
AT (1) ATE430359T1 (de)
DE (1) DE602005014189D1 (de)

Families Citing this family (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060277525A1 (en) * 2005-06-06 2006-12-07 Microsoft Corporation Lexical, grammatical, and semantic inference mechanisms
US8700404B1 (en) * 2005-08-27 2014-04-15 At&T Intellectual Property Ii, L.P. System and method for using semantic and syntactic graphs for utterance classification
US8032375B2 (en) * 2006-03-17 2011-10-04 Microsoft Corporation Using generic predictive models for slot values in language modeling
US7752152B2 (en) * 2006-03-17 2010-07-06 Microsoft Corporation Using predictive user models for language modeling on a personal device with user behavior models based on statistical modeling
JP4734155B2 (ja) * 2006-03-24 2011-07-27 Toshiba Corp Speech recognition apparatus, speech recognition method, and speech recognition program
US7930183B2 (en) * 2006-03-29 2011-04-19 Microsoft Corporation Automatic identification of dialog timing problems for an interactive speech dialog application using speech log data indicative of cases of barge-in and timing problems
US20070239453A1 (en) * 2006-04-06 2007-10-11 Microsoft Corporation Augmenting context-free grammars with back-off grammars for processing out-of-grammar utterances
US7689420B2 (en) * 2006-04-06 2010-03-30 Microsoft Corporation Personalizing a context-free grammar using a dictation language model
US7707027B2 (en) * 2006-04-13 2010-04-27 Nuance Communications, Inc. Identification and rejection of meaningless input during natural language classification
US20080091423A1 (en) * 2006-10-13 2008-04-17 Shourya Roy Generation of domain models from noisy transcriptions
US8108205B2 (en) * 2006-12-01 2012-01-31 Microsoft Corporation Leveraging back-off grammars for authoring context-free grammars
US8712757B2 (en) * 2007-01-10 2014-04-29 Nuance Communications, Inc. Methods and apparatus for monitoring communication through identification of priority-ranked keywords
GB2453366B (en) * 2007-10-04 2011-04-06 Toshiba Res Europ Ltd Automatic speech recognition method and apparatus
US8660844B2 (en) * 2007-10-24 2014-02-25 At&T Intellectual Property I, L.P. System and method of evaluating user simulations in a spoken dialog system with a diversion metric
JP2010224194A (ja) * 2009-03-23 2010-10-07 Sony Corp Speech recognition device and speech recognition method, language model generating device and language model generating method, and computer program
US20130219333A1 (en) * 2009-06-12 2013-08-22 Adobe Systems Incorporated Extensible Framework for Facilitating Interaction with Devices
KR101615262B1 (ko) * 2009-08-12 2016-04-26 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding multi-channel audio using semantic information
US8457968B2 (en) * 2009-12-08 2013-06-04 At&T Intellectual Property I, L.P. System and method for efficient tracking of multiple dialog states with incremental recombination
US9378202B2 (en) * 2010-03-26 2016-06-28 Virtuoz Sa Semantic clustering
US8694304B2 (en) 2010-03-26 2014-04-08 Virtuoz Sa Semantic clustering and user interfaces
US8880399B2 (en) * 2010-09-27 2014-11-04 Rosetta Stone, Ltd. Utterance verification and pronunciation scoring by lattice transduction
US9524291B2 (en) 2010-10-06 2016-12-20 Virtuoz Sa Visual display of semantic information
US8688453B1 (en) * 2011-02-28 2014-04-01 Nuance Communications, Inc. Intent mining via analysis of utterances
US8798995B1 (en) 2011-09-23 2014-08-05 Amazon Technologies, Inc. Key word determinations from voice data
US9214157B2 (en) * 2011-12-06 2015-12-15 At&T Intellectual Property I, L.P. System and method for machine-mediated human-human conversation
US9082403B2 (en) * 2011-12-15 2015-07-14 Microsoft Technology Licensing, Llc Spoken utterance classification training for a speech recognition system
US8983840B2 (en) * 2012-06-19 2015-03-17 International Business Machines Corporation Intent discovery in audio or text-based conversation
US9158760B2 (en) * 2012-12-21 2015-10-13 The Nielsen Company (Us), Llc Audio decoding with supplemental semantic audio recognition and report generation
US9183849B2 (en) 2012-12-21 2015-11-10 The Nielsen Company (Us), Llc Audio matching with semantic audio recognition and report generation
US9195649B2 (en) 2012-12-21 2015-11-24 The Nielsen Company (Us), Llc Audio processing techniques for semantic audio recognition and report generation
US9047268B2 (en) * 2013-01-31 2015-06-02 Google Inc. Character and word level language models for out-of-vocabulary text input
US9454240B2 (en) 2013-02-05 2016-09-27 Google Inc. Gesture keyboard input of non-dictionary character strings
US10354677B2 (en) * 2013-02-28 2019-07-16 Nuance Communications, Inc. System and method for identification of intent segment(s) in caller-agent conversations
US9626960B2 (en) * 2013-04-25 2017-04-18 Nuance Communications, Inc. Systems and methods for providing metadata-dependent language models
US8756499B1 (en) * 2013-04-29 2014-06-17 Google Inc. Gesture keyboard input of non-dictionary character strings using substitute scoring
US9733894B2 (en) 2013-07-02 2017-08-15 24/7 Customer, Inc. Method and apparatus for facilitating voice user interface design
US9842586B2 (en) 2014-07-09 2017-12-12 Genesys Telecommunications Laboratories, Inc. System and method for semantically exploring concepts
US9953646B2 (en) 2014-09-02 2018-04-24 Belleau Technologies Method and system for dynamic speech recognition and tracking of prewritten script
US9858923B2 (en) * 2015-09-24 2018-01-02 Intel Corporation Dynamic adaptation of language models and semantic tracking for automatic speech recognition
KR102429260B1 (ko) * 2015-10-12 2022-08-05 Samsung Electronics Co., Ltd. Apparatus and method for processing control commands based on a voice agent, and agent device
US10083451B2 (en) 2016-07-08 2018-09-25 Asapp, Inc. Using semantic processing for customer support
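CN109792402B (zh) 2016-07-08 2020-03-06 Asapp, Inc. Automatically responding to a request of a user
JP6886651B2 (ja) * 2016-12-08 2021-06-16 Advanced Telecommunications Research Institute International Action command generation system, response system, and action command generation method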
US10216832B2 (en) * 2016-12-19 2019-02-26 Interactions Llc Underspecification of intents in a natural language processing system
US10665228B2 (en) * 2018-05-23 2020-05-26 Bank of America Corporation Quantum technology for use with extracting intents from linguistics
US10477028B1 (en) 2018-07-13 2019-11-12 Bank Of America Corporation System for processing voice responses using a natural language processing engine
US11315256B2 (en) * 2018-12-06 2022-04-26 Microsoft Technology Licensing, Llc Detecting motion in video using motion vectors
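CN109657186A (zh) * 2018-12-27 2019-04-19 Guangzhou Speakin Network Technology Co., Ltd. People counting method, system, and related apparatus
CN112396444A (zh) * 2019-08-15 2021-02-23 Alibaba Group Holding Ltd. Intelligent robot response method and apparatus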
US11587551B2 (en) 2020-04-07 2023-02-21 International Business Machines Corporation Leveraging unpaired text data for training end-to-end spoken language understanding systems
US20230419979A1 (en) * 2022-06-28 2023-12-28 Samsung Electronics Co., Ltd. Online speaker diarization using local and global clustering

Family Cites Families (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0494573A1 (de) * 1991-01-08 1992-07-15 International Business Machines Corporation Method for automatically disambiguating synonym links in an electronic dictionary for a natural language processing system
NZ248751A (en) * 1994-03-23 1997-11-24 Ryan John Kevin Text analysis and coding
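JP3745403B2 (ja) * 1994-04-12 2006-02-15 Xerox Corporation Method for clustering audio data segments
JP3453456B2 (ja) * 1995-06-19 2003-10-06 Canon Inc. Method and apparatus for designing a state-sharing model, and speech recognition method and apparatus using the state-sharing model
JP3627299B2 (ja) * 1995-07-19 2005-03-09 Sony Corporation Speech recognition method and apparatus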
US5835893A (en) * 1996-02-15 1998-11-10 Atr Interpreting Telecommunications Research Labs Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity
US5806030A (en) * 1996-05-06 1998-09-08 Matsushita Electric Ind Co Ltd Low complexity, high accuracy clustering method for speech recognizer
US6601055B1 (en) * 1996-12-27 2003-07-29 Linda M. Roberts Explanation generation system for a diagnosis support tool employing an inference system
US5860063A (en) * 1997-07-11 1999-01-12 At&T Corp Automated meaningful phrase clustering
US6449612B1 (en) * 1998-03-17 2002-09-10 Microsoft Corporation Varying cluster number in a scalable clustering system for use with large databases
US20030154072A1 (en) * 1998-03-31 2003-08-14 Scansoft, Inc., A Delaware Corporation Call analysis
US6725195B2 (en) * 1998-08-25 2004-04-20 Sri International Method and apparatus for probabilistic recognition using small number of state clusters
US6393460B1 (en) * 1998-08-28 2002-05-21 International Business Machines Corporation Method and system for informing users of subjects of discussion in on-line chats
WO2000025299A1 (de) * 1998-10-27 2000-05-04 Siemens Aktiengesellschaft Method and arrangement for forming classes for a language model based on linguistic classes
US6317707B1 (en) * 1998-12-07 2001-11-13 At&T Corp. Automatic clustering of tokens from a corpus for grammar acquisition
US6665681B1 (en) * 1999-04-09 2003-12-16 Entrieva, Inc. System and method for generating a taxonomy from a plurality of documents
WO2000073936A1 (en) * 1999-05-28 2000-12-07 Sehda, Inc. Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces
US6526379B1 (en) * 1999-11-29 2003-02-25 Matsushita Electric Industrial Co., Ltd. Discriminative clustering methods for automatic speech recognition
GB0000735D0 (en) 2000-01-13 2000-03-08 Eyretel Ltd System and method for analysing communication streams
US6751621B1 (en) * 2000-01-27 2004-06-15 Manning & Napier Information Services, Llc. Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors
US7275033B1 (en) * 2000-09-30 2007-09-25 Intel Corporation Method and system for using rule-based knowledge to build a class-based domain specific statistical language model
DE60002584D1 (de) * 2000-11-07 2003-06-12 Ericsson Telefon Ab L M Use of reference data for speech recognition
US6937983B2 (en) * 2000-12-20 2005-08-30 International Business Machines Corporation Method and system for semantic speech recognition
JP2002358095A (ja) * 2001-03-30 2002-12-13 Sony Corp Speech processing apparatus and speech processing method, program, and recording medium
US20040120472A1 (en) 2001-04-19 2004-06-24 Popay Paul I Voice response system
US7031909B2 (en) * 2002-03-12 2006-04-18 Verity, Inc. Method and system for naming a cluster of words and phrases
US7085771B2 (en) * 2002-05-17 2006-08-01 Verity, Inc System and method for automatically discovering a hierarchy of concepts from a corpus of documents
US7107207B2 (en) * 2002-06-19 2006-09-12 Microsoft Corporation Training machine learning by sequential conditional generalized iterative scaling
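JP2004198597A (ja) * 2002-12-17 2004-07-15 Advanced Telecommunication Research Institute International Computer program for operating a computer as a speech recognition device and sentence classification device, computer program for operating a computer to implement a method of creating a hierarchical language model, and storage medium
JP4392581B2 (ja) * 2003-02-20 2010-01-06 Sony Corporation Language processing apparatus and language processing method, program, and recording medium
JP4828091B2 (ja) * 2003-03-05 2011-11-30 Hewlett-Packard Company Clustering method, program, and apparatus
JP4223841B2 (ja) * 2003-03-17 2009-02-12 Fujitsu Limited Spoken dialogue system and method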
US7103553B2 (en) * 2003-06-04 2006-09-05 Matsushita Electric Industrial Co., Ltd. Assistive call center interface
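JP4191021B2 (ja) * 2003-12-01 2008-12-03 Advanced Telecommunications Research Institute International Training apparatus for a domain verifier, domain verification apparatus for input data, and computer program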

Also Published As

Publication number Publication date
US7634406B2 (en) 2009-12-15
EP1669980A2 (de) 2006-06-14
EP1669980A3 (de) 2007-11-28
JP2006171710A (ja) 2006-06-29
US20060129397A1 (en) 2006-06-15
DE602005014189D1 (de) 2009-06-10
JP4974510B2 (ja) 2012-07-11
EP1669980B1 (de) 2009-04-29

Similar Documents

Publication Publication Date Title
ATE430359T1 (de) System and method for identifying semantic intent from acoustic information
Johnsrude et al. Swinging at a cocktail party: Voice familiarity aids speech perception in the presence of a competing voice
CN105096941B (zh) Speech recognition method and apparatus
US10068588B2 (en) Real-time emotion recognition from audio signals
CN104252864B (zh) Real-time speech analysis method and system
Masataka Music, evolution and language
CN105975569A (zh) Speech processing method and terminal
CN102831891B (zh) Speech data processing method and system
CN104123115B (zh) Audio information processing method and electronic device
ATE440334T1 (de) System for voice-controlled selection of an audio file and method therefor
ATE426233T1 (de) Interactive speech recognition system
CN107210034A (zh) Selective conference summary
WO2007056344A3 (en) Techniques for model optimization for statistical pattern recognition
CN106782615A (zh) Speech data emotion detection method, apparatus, and system
SG135951A1 (en) Presentation of data based on user input
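CN104205215B (zh) Automatic real-time speech impediment correction
MX2008002500A (es) Incorporation of speech training into an interactive user tutorial.
JP6980411B2 (ja) Information processing apparatus, dialogue processing method, and dialogue processing program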
Stecker Moderate actual intentionalism defended
ATE514162T1 (de) Dynamic generation of contexts for speech recognition
ATE407411T1 (de) Method for providing account information and system for transcribing dictated text
CN109754788A (zh) Voice control method, apparatus, device, and storage medium
DE60214850D1 (de) Pattern processing system specific to a user group
CN107403620A (zh) Speech recognition method and apparatus
US10522135B2 (en) System and method for segmenting audio files for transcription

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties