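KR20200047311A - Speech recognition method and apparatus in environment including plurality of apparatuses - Google Patents

Speech recognition method and apparatus in environment including plurality of apparatuses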

Info

Publication number
KR20200047311A
Authority
KR
South Korea
Prior art keywords
speaker
recognition
speech recognition
speech
score
Prior art date
Application number
KR1020190110772A
Other languages
English (en)
Korean (ko)
Inventor
조근석
노재영
형지원
장동한
이재원
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-10-24
Filing date
2019-09-06
Publication date
2020-05-07
Application filed by Samsung Electronics Co., Ltd.
Priority to CN201980055917.2A (CN112639965A)
Priority to PCT/KR2019/013903 (WO2020085769A1)
Priority to EP19874900.4A (EP3797414A4)
Priority to US16/662,387 (US20200135212A1)
Publication of KR20200047311A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/12 Score normalisation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/221 Announcement of recognition results

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)
KR1020190110772A 2018-10-24 2019-09-06 Speech recognition method and apparatus in environment including plurality of apparatuses KR20200047311A (ko)

Priority Applications (4)

Application Number | Publication Number | Priority Date | Filing Date | Title
CN201980055917.2A | CN112639965A (zh) | 2018-10-24 | 2019-10-22 | Speech recognition method and device in an environment including multiple devices
PCT/KR2019/013903 | WO2020085769A1 (en) | 2018-10-24 | 2019-10-22 | Speech recognition method and apparatus in environment including plurality of apparatuses
EP19874900.4A | EP3797414A4 (de) | 2018-10-24 | 2019-10-22 | Method and device for speech recognition in an environment with multiple devices
US16/662,387 | US20200135212A1 (en) | 2018-10-24 | 2019-10-24 | Speech recognition method and apparatus in environment including plurality of apparatuses

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180127696 2018-10-24
KR20180127696 2018-10-24

Publications (1)

Publication Number Publication Date
KR20200047311A (ko) 2020-05-07

Family

ID=70733911

Family Applications (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
KR1020190110772A | KR20200047311A (ko) | 2018-10-24 | 2019-09-06 | Speech recognition method and apparatus in environment including plurality of apparatuses

Country Status (3)

Country Link
EP (1) EP3797414A4 (de)
KR (1) KR20200047311A (ko)
CN (1) CN112639965A (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2022149693A1 (en) * | 2021-01-07 | 2022-07-14 | Samsung Electronics Co., Ltd. | Electronic device and method for processing user utterance in the electronic device
US11915697B2 (en) | 2020-11-11 | 2024-02-27 | Samsung Electronics Co., Ltd. | Electronic device, system and control method thereof

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20130073293A1 (en) * | 2011-09-20 | 2013-03-21 | Lg Electronics Inc. | Electronic device and method for controlling the same
US10026399B2 (en) * | 2015-09-11 | 2018-07-17 | Amazon Technologies, Inc. | Arbitration between voice-enabled devices
WO2018067528A1 (en) * | 2016-10-03 | 2018-04-12 | Google Llc | Device leadership negotiation among voice interface devices
US10559309B2 (en) * | 2016-12-22 | 2020-02-11 | Google Llc | Collaborative voice controlled devices

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11915697B2 (en) | 2020-11-11 | 2024-02-27 | Samsung Electronics Co., Ltd. | Electronic device, system and control method thereof
WO2022149693A1 (en) * | 2021-01-07 | 2022-07-14 | Samsung Electronics Co., Ltd. | Electronic device and method for processing user utterance in the electronic device
US11769503B2 (en) | 2021-01-07 | 2023-09-26 | Samsung Electronics Co., Ltd. | Electronic device and method for processing user utterance in the electronic device

Also Published As

Publication number Publication date
EP3797414A4 (de) 2021-08-25
CN112639965A (zh) 2021-04-09
EP3797414A1 (de) 2021-03-31

Similar Documents

Publication Publication Date Title
KR102513297B1 (ko) Electronic device and method for executing function of electronic device
US10607597B2 (en) Speech signal recognition system and method
US11577379B2 (en) Robot and method for recognizing wake-up word thereof
US20200135212A1 (en) Speech recognition method and apparatus in environment including plurality of apparatuses
US11380326B2 (en) Method and apparatus for performing speech recognition with wake on voice (WoV)
KR102531654B1 (ko) Voice input authentication device and method thereof
KR102490916B1 (ko) Electronic device, control method thereof, and non-transitory computer-readable recording medium
US20200257496A1 (en) Electronic device for providing voice-based service using external device, external device and operation method thereof
US20220254369A1 (en) Electronic device supporting improved voice activity detection
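KR102544249B1 (ko) Electronic device that performs translation by sharing the context of an utterance and operation method thereof
CN114223029A (zh) Server supporting speech recognition by a device and operation method of the server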
US20220284906A1 (en) Electronic device and operation method for performing speech recognition
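KR20200033707A (ko) Electronic device, and method for providing or obtaining training data thereof
CN111640429A (zh) Method for providing speech recognition service and electronic device for the method
KR20200047311A (ko) Speech recognition method and apparatus in environment including plurality of apparatuses
EP3654170B1 (de) Electronic device and WiFi connection method thereof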
US20220270605A1 (en) Electronic apparatus and assistant service providing method thereof
US20200143807A1 (en) Electronic device and operation method thereof
US20220270617A1 (en) Electronic device for supporting artificial intelligence agent services to talk to users
KR20220033325A (ko) Electronic device for speech recognition and control method thereof
KR102677052B1 (ko) System and method for providing voice assistant service
US20240212681A1 (en) Voice recognition device having barge-in function and method thereof
CN111971670A (zh) Generating responses in a conversation
US20230016465A1 (en) Electronic device and speaker verification method of electronic device
US12001808B2 (en) Method and apparatus for providing interpretation situation information to one or more devices based on an accumulated delay among three devices in three different languages