CN112639965A - Speech recognition method and apparatus in environment including plurality of apparatuses - Google Patents

Speech recognition method and apparatus in environment including plurality of apparatuses

Info

Publication number
CN112639965A
Authority
CN
China
Prior art keywords
speaker
speech recognition
speech
recognition
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980055917.2A
Other languages
English (en)
Chinese (zh)
Inventor
曹根硕
卢在英
邢知远
张东韩
李在原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Priority claimed from PCT/KR2019/013903 (WO2020085769A1)
Publication of CN112639965A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/12 Score normalisation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/221 Announcement of recognition results

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Telephonic Communication Services (AREA)
CN201980055917.2A 2018-10-24 2019-10-22 Speech recognition method and apparatus in environment including plurality of apparatuses Pending CN112639965A (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
KR20180127696 2018-10-24
KR10-2018-0127696 2018-10-24
KR10-2019-0110772 2019-09-06
KR1020190110772A KR20200047311A (ko) 2018-10-24 2019-09-06 Speech recognition method and apparatus in an environment with a plurality of devices
PCT/KR2019/013903 WO2020085769A1 (fr) 2018-10-24 2019-10-22 Speech recognition method and apparatus in environment including plurality of apparatuses

Publications (1)

Publication Number Publication Date
CN112639965A (zh) 2021-04-09

Family

ID=70733911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980055917.2A Pending CN112639965A (zh) 2018-10-24 2019-10-22 Speech recognition method and apparatus in environment including plurality of apparatuses

Country Status (3)

Country Link
EP (1) EP3797414A4 (fr)
KR (1) KR20200047311A (fr)
CN (1) CN112639965A (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
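CN116097348A (zh) 2020-11-11 2023-05-09 Samsung Electronics Co., Ltd. Electronic device, system, and control method thereof
KR20220099831A (ko) * 2021-01-07 2022-07-14 Samsung Electronics Co., Ltd. Electronic device and method for processing user utterance in electronic device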

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130073293A1 (en) * 2011-09-20 2013-03-21 Lg Electronics Inc. Electronic device and method for controlling the same
US10026399B2 (en) * 2015-09-11 2018-07-17 Amazon Technologies, Inc. Arbitration between voice-enabled devices
WO2018067528A1 (fr) * 2016-10-03 2018-04-12 Google Llc Device leadership negotiation among voice interface devices
US10559309B2 (en) * 2016-12-22 2020-02-11 Google Llc Collaborative voice controlled devices

Also Published As

Publication number Publication date
EP3797414A4 (fr) 2021-08-25
EP3797414A1 (fr) 2021-03-31
KR20200047311A (ko) 2020-05-07

Similar Documents

Publication Publication Date Title
US10607597B2 (en) Speech signal recognition system and method
US11687319B2 (en) Speech recognition method and apparatus with activation word based on operating environment of the apparatus
CN110288987B (zh) System for processing sound data and method of controlling the system
US9443527B1 (en) Speech recognition capability generation and control
US20200135212A1 (en) Speech recognition method and apparatus in environment including plurality of apparatuses
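TWI619114B (zh) Method and system of environment-sensitive automatic speech recognition
CN109643549B (zh) Speech recognition method and apparatus based on speaker recognition
JP7173758B2 (ja) Personalized speech recognition method, and user terminal and server performing the same
CN110310623B (zh) Sample generation method, model training method, apparatus, medium, and electronic device
KR102655628B1 (ko) Method and apparatus for processing voice data of an utterance
CN112074900B (zh) Audio analysis for natural language processing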
US11380326B2 (en) Method and apparatus for performing speech recognition with wake on voice (WoV)
EP3533052B1 (fr) Speech recognition method and apparatus
KR102531654B1 (ko) Voice input authentication device and method thereof
CN114762038A (zh) Automatic turn delineation in multi-turn dialogue
US11830501B2 (en) Electronic device and operation method for performing speech recognition
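KR20200051462A (ko) Electronic device and operation method thereof
CN112639965A (zh) Speech recognition method and apparatus in environment including plurality of apparatuses
CN112384974A (zh) Electronic device and method for providing or obtaining data for training the electronic device
CN111145735B (zh) Electronic device and operation method thereof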
US10803868B2 (en) Sound output system and voice processing method
US20230126305A1 (en) Method of identifying target device based on reception of utterance and electronic device therefor
US20230127543A1 (en) Method of identifying target device based on utterance and electronic device therefor
KR20200021400A (ko) Electronic device for performing speech recognition and operation method thereof
CN116686046A (zh) Electronic device and control method thereof

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210409