KR20200047311A - Speech recognition method and apparatus in environment including plurality of apparatuses - Google Patents
Speech recognition method and apparatus in environment including plurality of apparatuses
- Publication number
- KR20200047311A (application number KR1020190110772A)
- Authority
- KR
- South Korea
- Prior art keywords
- speaker
- recognition
- speech recognition
- speech
- score
- Prior art date
Concept ID | Concept | Category | Query match | Sections | Count |
---|---|---|---|---|---|
238000000034 | method | Methods | 0.000 | title, claims, abstract, description | 49 |
230000005236 | sound signal | Effects | 0.000 | claims, abstract, description | 41 |
238000004891 | communication | Methods | 0.000 | claims, description | 20 |
238000004590 | computer program | Methods | 0.000 | claims, description | 16 |
230000008859 | change | Effects | 0.000 | claims, description | 15 |
230000004044 | response | Effects | 0.000 | claims, description | 4 |
238000013473 | artificial intelligence | Methods | 0.000 | abstract, description | 22 |
238000004422 | calculation algorithm | Methods | 0.000 | abstract, description | 7 |
238000010801 | machine learning | Methods | 0.000 | abstract, description | 6 |
238000013135 | deep learning | Methods | 0.000 | abstract, description | 5 |
230000006870 | function | Effects | 0.000 | description | 15 |
238000003860 | storage | Methods | 0.000 | description | 13 |
238000013528 | artificial neural network | Methods | 0.000 | description | 9 |
238000005516 | engineering process | Methods | 0.000 | description | 9 |
238000010586 | diagram | Methods | 0.000 | description | 7 |
230000008569 | process | Effects | 0.000 | description | 5 |
238000012545 | processing | Methods | 0.000 | description | 5 |
239000013598 | vector | Substances | 0.000 | description | 5 |
238000009826 | distribution | Methods | 0.000 | description | 4 |
230000003213 | activating effect | Effects | 0.000 | description | 3 |
230000003044 | adaptive effect | Effects | 0.000 | description | 3 |
238000004364 | calculation method | Methods | 0.000 | description | 3 |
230000005540 | biological transmission | Effects | 0.000 | description | 2 |
238000007796 | conventional method | Methods | 0.000 | description | 2 |
238000013527 | convolutional neural network | Methods | 0.000 | description | 2 |
238000000605 | extraction | Methods | 0.000 | description | 2 |
238000012423 | maintenance | Methods | 0.000 | description | 2 |
238000007726 | management method | Methods | 0.000 | description | 2 |
238000007781 | pre-processing | Methods | 0.000 | description | 2 |
230000000306 | recurrent effect | Effects | 0.000 | description | 2 |
238000012549 | training | Methods | 0.000 | description | 2 |
238000005406 | washing | Methods | 0.000 | description | 2 |
244000141359 | Malus pumila | Species | 0.000 | description | 1 |
230000009471 | action | Effects | 0.000 | description | 1 |
235000021016 | apples | Nutrition | 0.000 | description | 1 |
230000008901 | benefit | Effects | 0.000 | description | 1 |
230000002457 | bidirectional effect | Effects | 0.000 | description | 1 |
230000015572 | biosynthetic process | Effects | 0.000 | description | 1 |
230000007423 | decrease | Effects | 0.000 | description | 1 |
235000013601 | eggs | Nutrition | 0.000 | description | 1 |
238000002372 | labelling | Methods | 0.000 | description | 1 |
238000010295 | mobile communication | Methods | 0.000 | description | 1 |
238000003058 | natural language processing | Methods | 0.000 | description | 1 |
238000005457 | optimization | Methods | 0.000 | description | 1 |
238000012856 | packing | Methods | 0.000 | description | 1 |
238000003825 | pressing | Methods | 0.000 | description | 1 |
230000002787 | reinforcement | Effects | 0.000 | description | 1 |
238000001228 | spectrum | Methods | 0.000 | description | 1 |
238000003786 | synthesis reaction | Methods | 0.000 | description | 1 |
230000009466 | transformation | Effects | 0.000 | description | 1 |
238000013519 | translation | Methods | 0.000 | description | 1 |
230000000007 | visual effect | Effects | 0.000 | description | 1 |
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/12—Score normalisation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/221—Announcement of recognition results
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Telephonic Communication Services (AREA)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201980055917.2A CN112639965A (zh) | 2018-10-24 | 2019-10-22 | Speech recognition method and apparatus in environment including plurality of apparatuses |
PCT/KR2019/013903 WO2020085769A1 (en) | 2018-10-24 | 2019-10-22 | Speech recognition method and apparatus in environment including plurality of apparatuses |
EP19874900.4A EP3797414A4 (de) | 2018-10-24 | 2019-10-22 | Method and apparatus for speech recognition in an environment with multiple devices |
US16/662,387 US20200135212A1 (en) | 2018-10-24 | 2019-10-24 | Speech recognition method and apparatus in environment including plurality of apparatuses |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020180127696 | 2018-10-24 | | |
KR20180127696 | 2018-10-24 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20200047311A (ko) | 2020-05-07 |
Family
ID=70733911
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020190110772A KR20200047311A (ko) | 2018-10-24 | 2019-09-06 | Speech recognition method and apparatus in environment including plurality of apparatuses |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP3797414A4 (de) |
KR (1) | KR20200047311A (de) |
CN (1) | CN112639965A (de) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022149693A1 (en) * | 2021-01-07 | 2022-07-14 | Samsung Electronics Co., Ltd. | Electronic device and method for processing user utterance in the electronic device |
US11915697B2 (en) | 2020-11-11 | 2024-02-27 | Samsung Electronics Co., Ltd. | Electronic device, system and control method thereof |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130073293A1 (en) * | 2011-09-20 | 2013-03-21 | Lg Electronics Inc. | Electronic device and method for controlling the same |
US10026399B2 (en) * | 2015-09-11 | 2018-07-17 | Amazon Technologies, Inc. | Arbitration between voice-enabled devices |
WO2018067528A1 (en) * | 2016-10-03 | 2018-04-12 | Google Llc | Device leadership negotiation among voice interface devices |
US10559309B2 (en) * | 2016-12-22 | 2020-02-11 | Google Llc | Collaborative voice controlled devices |
2019
- 2019-09-06: KR application KR1020190110772A (patent KR20200047311A, ko), status unknown
- 2019-10-22: CN application CN201980055917.2A (patent CN112639965A, zh), status active, pending
- 2019-10-22: EP application EP19874900.4A (patent EP3797414A4, de), status not active, withdrawn
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11915697B2 (en) | 2020-11-11 | 2024-02-27 | Samsung Electronics Co., Ltd. | Electronic device, system and control method thereof |
WO2022149693A1 (en) * | 2021-01-07 | 2022-07-14 | Samsung Electronics Co., Ltd. | Electronic device and method for processing user utterance in the electronic device |
US11769503B2 (en) | 2021-01-07 | 2023-09-26 | Samsung Electronics Co., Ltd. | Electronic device and method for processing user utterance in the electronic device |
Also Published As
Publication number | Publication date |
---|---|
EP3797414A4 (de) | 2021-08-25 |
CN112639965A (zh) | 2021-04-09 |
EP3797414A1 (de) | 2021-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102513297B1 (ko) | | Electronic device and method for executing function of electronic device |
US10607597B2 (en) | | Speech signal recognition system and method |
US11577379B2 (en) | | Robot and method for recognizing wake-up word thereof |
US20200135212A1 (en) | | Speech recognition method and apparatus in environment including plurality of apparatuses |
US11380326B2 (en) | | Method and apparatus for performing speech recognition with wake on voice (WoV) |
KR102531654B1 (ko) | | Voice input authentication device and method thereof |
KR102490916B1 (ko) | | Electronic device, control method thereof, and non-transitory computer-readable recording medium |
US20200257496A1 (en) | | Electronic device for providing voice-based service using external device, external device and operation method thereof |
US20220254369A1 (en) | | Electronic device supporting improved voice activity detection |
KR102544249B1 (ko) | | Electronic device for performing translation by sharing context of utterance and operation method thereof |
CN114223029A (zh) | | Server that supports speech recognition of a device, and operation method of the server |
US20220284906A1 (en) | | Electronic device and operation method for performing speech recognition |
KR20200033707A (ko) | | Electronic device, and method for providing or obtaining training data thereof |
CN111640429A (zh) | | Method of providing speech recognition service and electronic device for the method |
KR20200047311A (ko) | | Speech recognition method and apparatus in environment including plurality of apparatuses |
EP3654170B1 (de) | | Electronic device and WiFi connection method thereof |
US20220270605A1 (en) | | Electronic apparatus and assistant service providing method thereof |
US20200143807A1 (en) | | Electronic device and operation method thereof |
US20220270617A1 (en) | | Electronic device for supporting artificial intelligence agent services to talk to users |
KR20220033325A (ko) | | Electronic apparatus for speech recognition and control method thereof |
KR102677052B1 (ko) | | System and method for providing voice assistant service |
US20240212681A1 (en) | | Voice recognition device having barge-in function and method thereof |
CN111971670A (zh) | | Generating responses in a conversation |
US20230016465A1 (en) | | Electronic device and speaker verification method of electronic device |
US12001808B2 (en) | | Method and apparatus for providing interpretation situation information to one or more devices based on an accumulated delay among three devices in three different languages |