KR20180084394A - Method for sensing end of speech, and electronic apparatus implementing same

Method for sensing end of speech, and electronic apparatus implementing same

Info

Publication number
KR20180084394A
Authority
KR
South Korea
Prior art keywords
electronic device
voice input
sensing
memory
utterance completion
Prior art date
Application number
KR1020170007951A
Inventor
김용호
파테리야 사우라브
김선아
주가현
황상웅
장세이
Original Assignee
삼성전자주식회사
Priority date 2017-01-17
Filing date 2017-01-17
Publication date 2018-07-25
Application filed by 삼성전자주식회사
Priority to KR1020170007951A (KR20180084394A)
Priority to EP17892640.8A (EP3570275B1)
Priority to AU2017394767A (AU2017394767A1)
Priority to PCT/KR2017/013397 (WO2018135743A1)
Priority to US16/478,702 (US11211048B2)
Priority to CN201780083799.7A (CN110199350B)
Publication of KR20180084394A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G10L15/05 Word boundary detection
    • G06K9/00335
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G10L15/25 Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephone Function (AREA)

Abstract

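Various embodiments provide an electronic device and a method, the electronic device comprising a microphone, a memory, and a processor functionally connected to the microphone or the memory, wherein the processor is configured to: count an end point detection (EPD) time based on a voice input; when the EPD time expires, determine whether the last word of the voice input corresponds to a preset word stored in the memory; and, when the last word corresponds to the preset word, extend the EPD time and wait for reception of a voice input. Other embodiments are also possible.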
KR1020170007951A 2017-01-17 2017-01-17 Method for sensing end of speech, and electronic apparatus implementing same KR20180084394A (ko)
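
The behaviour summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name, the timer durations, the preset word list, and the choice to extend only once per trailing word are all assumptions made for illustration, since the abstract specifies none of them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical values: the abstract gives no concrete durations or word list.
BASE_EPD_SECONDS = 1.0
EXTENDED_EPD_SECONDS = 2.0
PRESET_WORDS = frozenset({"um", "uh", "and", "so"})


@dataclass
class UtteranceEndDetector:
    """Counts an EPD time after each word and decides when speech has ended."""

    preset_words: frozenset = PRESET_WORDS
    base_epd: float = BASE_EPD_SECONDS
    extended_epd: float = EXTENDED_EPD_SECONDS
    words: List[str] = field(default_factory=list)
    deadline: Optional[float] = None  # time at which the EPD count expires
    extended: bool = False            # assumption: extend at most once per word

    def on_word(self, word: str, now: float) -> None:
        """A recognized word arrived: store it and restart the EPD count."""
        self.words.append(word.lower())
        self.deadline = now + self.base_epd
        self.extended = False

    def on_tick(self, now: float) -> bool:
        """Return True once the utterance is judged complete."""
        if self.deadline is None or now < self.deadline:
            return False
        # EPD time expired: compare the last word with the preset words.
        last_is_preset = bool(self.words) and self.words[-1] in self.preset_words
        if last_is_preset and not self.extended:
            # Extend the EPD time and keep waiting for further voice input.
            self.deadline += self.extended_epd
            self.extended = True
            return False
        return True
```

In this sketch, an utterance ending in "and" at t = 0.5 s is not closed at t = 1.5 s; the detector waits until t = 3.5 s, and any word arriving in between restarts the count through `on_word`.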

Priority Applications (6)

Application Number Priority Date Filing Date Title
KR1020170007951A KR20180084394A (ko) 2017-01-17 2017-01-17 Method for sensing end of speech, and electronic apparatus implementing same
EP17892640.8A EP3570275B1 (en) 2017-01-17 2017-11-23 Method for sensing end of speech, and electronic apparatus implementing same
AU2017394767A AU2017394767A1 (en) 2017-01-17 2017-11-23 Method for sensing end of speech, and electronic apparatus implementing same
PCT/KR2017/013397 WO2018135743A1 (ko) 2017-01-17 2017-11-23 Method for sensing end of speech, and electronic apparatus implementing same
US16/478,702 US11211048B2 (en) 2017-01-17 2017-11-23 Method for sensing end of speech, and electronic apparatus implementing same
CN201780083799.7A CN110199350B (zh) 2017-01-17 2017-11-23 Method for sensing end of speech, and electronic apparatus implementing same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020170007951A KR20180084394A (ko) 2017-01-17 2017-01-17 Method for sensing end of speech, and electronic apparatus implementing same

Publications (1)

Publication Number Publication Date
KR20180084394A 2018-07-25

Family

ID=62909023

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020170007951A KR20180084394A (ko) 2017-01-17 2017-01-17 Method for sensing end of speech, and electronic apparatus implementing same

Country Status (6)

Country Link
US (1) US11211048B2 (ko)
EP (1) EP3570275B1 (ko)
KR (1) KR20180084394A (ko)
CN (1) CN110199350B (ko)
AU (1) AU2017394767A1 (ko)
WO (1) WO2018135743A1 (ko)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020085784A1 (en) * 2018-10-23 2020-04-30 Samsung Electronics Co., Ltd. Electronic device and system which provides service based on voice recognition
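WO2020091187A1 (ko) * 2018-10-31 2020-05-07 삼성전자주식회사 Electronic apparatus and controlling method therefor
KR20200125034A (ko) * 2019-04-25 2020-11-04 에스케이텔레콤 주식회사 Voice analysis apparatus and operating method of the voice analysis apparatus
WO2022169301A1 (ko) * 2021-02-04 2022-08-11 삼성전자 주식회사 Electronic device supporting speech recognition and operating method thereof
WO2024005226A1 (ko) * 2022-06-29 2024-01-04 엘지전자 주식회사 Display device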

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7500746B1 (en) 2004-04-15 2009-03-10 Ip Venture, Inc. Eyewear with radiation detection system
US8109629B2 (en) 2003-10-09 2012-02-07 Ipventure, Inc. Eyewear supporting electrical components and apparatus therefor
US11630331B2 (en) 2003-10-09 2023-04-18 Ingeniospec, Llc Eyewear with touch-sensitive input surface
US11829518B1 (en) 2004-07-28 2023-11-28 Ingeniospec, Llc Head-worn device with connection region
US11644693B2 (en) 2004-07-28 2023-05-09 Ingeniospec, Llc Wearable audio system supporting enhanced hearing support
US11852901B2 (en) 2004-10-12 2023-12-26 Ingeniospec, Llc Wireless headset supporting messages and hearing enhancement
US11733549B2 (en) 2005-10-11 2023-08-22 Ingeniospec, Llc Eyewear having removable temples that support electrical components
CN109559759B (zh) * 2017-09-27 2021-10-08 华硕电脑股份有限公司 Electronic device having incremental registration unit and method thereof
KR20190084789A (ko) * 2018-01-09 2019-07-17 엘지전자 주식회사 Electronic device and control method thereof
US10777048B2 (en) 2018-04-12 2020-09-15 Ipventure, Inc. Methods and apparatus regarding electronic eyewear applicable for seniors
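KR102612835B1 (ko) * 2018-04-20 2023-12-13 삼성전자주식회사 Electronic device and method for executing function of electronic device
CN108769432B (zh) * 2018-07-27 2020-02-11 Oppo广东移动通信有限公司 Primary earphone switching method and mobile terminal
CN109524001A (zh) * 2018-12-28 2019-03-26 北京金山安全软件有限公司 Information processing method and apparatus, and children's wearable device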
US11741951B2 (en) * 2019-02-22 2023-08-29 Lenovo (Singapore) Pte. Ltd. Context enabled voice commands
KR102221963B1 (ko) * 2019-05-02 2021-03-04 엘지전자 주식회사 Artificial intelligence device for providing image information and method therefor
US11770872B2 (en) * 2019-07-19 2023-09-26 Jvckenwood Corporation Radio apparatus, radio communication system, and radio communication method
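CN110459224B (zh) * 2019-07-31 2022-02-25 北京百度网讯科技有限公司 Speech recognition result processing method and apparatus, computer device, and storage medium
CN110689877A (zh) * 2019-09-17 2020-01-14 华为技术有限公司 Speech end point detection method and apparatus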
EP4037328A4 (en) * 2019-09-27 2023-08-30 LG Electronics Inc. ARTIFICIAL INTELLIGENCE DISPLAY DEVICE AND SYSTEM
US11749265B2 (en) * 2019-10-04 2023-09-05 Disney Enterprises, Inc. Techniques for incremental computer-based natural language understanding
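KR20210050901A (ko) * 2019-10-29 2021-05-10 엘지전자 주식회사 Speech recognition method and speech recognition apparatus
CN112825248A (zh) * 2019-11-19 2021-05-21 阿里巴巴集团控股有限公司 Speech processing method, model training method, interface display method, and device
KR20210089347A (ko) * 2020-01-08 2021-07-16 엘지전자 주식회사 Speech recognition apparatus and method for learning speech data
CN113362828B (zh) * 2020-03-04 2022-07-05 阿波罗智联(北京)科技有限公司 Method and apparatus for recognizing speech
CN111554287B (zh) * 2020-04-27 2023-09-05 佛山市顺德区美的洗涤电器制造有限公司 Speech processing method and apparatus, household appliance, and readable storage medium
KR20210148580A (ko) * 2020-06-01 2021-12-08 엘지전자 주식회사 Server and system including same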
US20220101827A1 (en) * 2020-09-30 2022-03-31 Qualcomm Incorporated Target keyword selection
CN112466296A (zh) * 2020-11-10 2021-03-09 北京百度网讯科技有限公司 Voice interaction processing method and apparatus, electronic device, and storage medium
US11984124B2 (en) * 2020-11-13 2024-05-14 Apple Inc. Speculative task flow execution
US11870835B2 (en) * 2021-02-23 2024-01-09 Avaya Management L.P. Word-based representation of communication session quality
CN113744726A (zh) * 2021-08-23 2021-12-03 阿波罗智联(北京)科技有限公司 Speech recognition method and apparatus, electronic device, and storage medium
EP4152322A1 (de) * 2021-09-16 2023-03-22 Siemens Healthcare GmbH Method for processing an audio signal, method for controlling a device, and associated system
CN114203204B (zh) * 2021-12-06 2024-04-05 北京百度网讯科技有限公司 End point detection method, apparatus, device, and storage medium
WO2023182718A1 (en) * 2022-03-24 2023-09-28 Samsung Electronics Co., Ltd. Systems and methods for dynamically adjusting a listening time of a voice assistant device
US11908473B2 (en) * 2022-05-10 2024-02-20 Apple Inc. Task modification after task initiation

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6873953B1 (en) 2000-05-22 2005-03-29 Nuance Communications Prosody based endpoint detection
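JP2007057844A (ja) * 2005-08-24 2007-03-08 Fujitsu Ltd Speech recognition system and speech processing system
JP4906379B2 (ja) 2006-03-22 2012-03-28 富士通株式会社 Speech recognition apparatus, speech recognition method, and computer program
KR101556594B1 (ko) * 2009-01-14 2015-10-01 삼성전자 주식회사 Signal processing apparatus and speech recognition method in signal processing apparatus
JP5382780B2 (ja) 2009-03-17 2014-01-08 株式会社国際電気通信基礎技術研究所 Utterance intention information detection apparatus and computer program
KR101581883B1 (ko) 2009-04-30 2016-01-11 삼성전자주식회사 Voice detection apparatus and method using motion information
JP2011257529A (ja) 2010-06-08 2011-12-22 Nippon Telegr & Teleph Corp <Ntt> Hold-related utterance extraction method, apparatus, and program
KR20130134620A (ko) 2012-05-31 2013-12-10 한국전자통신연구원 End point detection apparatus using decoding information and method thereof
KR101992676B1 (ko) 2012-07-26 2019-06-25 삼성전자주식회사 Method and apparatus for performing speech recognition using image recognition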
US9437186B1 (en) 2013-06-19 2016-09-06 Amazon Technologies, Inc. Enhanced endpoint detection for speech recognition
KR102229972B1 (ko) 2013-08-01 2021-03-19 엘지전자 주식회사 Speech recognition apparatus and method thereof
CN104780263A (zh) * 2015-03-10 2015-07-15 广东小天才科技有限公司 Method and apparatus for determining extension of a speech breakpoint
US9666192B2 (en) * 2015-05-26 2017-05-30 Nuance Communications, Inc. Methods and apparatus for reducing latency in speech recognition applications
US10186254B2 (en) * 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10339917B2 (en) * 2015-09-03 2019-07-02 Google Llc Enhanced speech endpointing
US9747926B2 (en) * 2015-10-16 2017-08-29 Google Inc. Hotword recognition
US10269341B2 (en) * 2015-10-19 2019-04-23 Google Llc Speech endpointing
KR102495517B1 (ko) * 2016-01-26 2023-02-03 삼성전자 주식회사 Electronic device and speech recognition method of the electronic device
US10339918B2 (en) * 2016-09-27 2019-07-02 Intel IP Corporation Adaptive speech endpoint detector

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020085784A1 (en) * 2018-10-23 2020-04-30 Samsung Electronics Co., Ltd. Electronic device and system which provides service based on voice recognition
WO2020091187A1 (ko) * 2018-10-31 2020-05-07 삼성전자주식회사 Electronic apparatus and controlling method therefor
CN112912954A (zh) * 2018-10-31 2021-06-04 三星电子株式会社 Electronic apparatus and controlling method therefor
US11893982B2 (en) 2018-10-31 2024-02-06 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method therefor
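CN112912954B (zh) * 2018-10-31 2024-05-24 三星电子株式会社 Electronic apparatus and controlling method therefor
KR20200125034A (ko) * 2019-04-25 2020-11-04 에스케이텔레콤 주식회사 Voice analysis apparatus and operating method of the voice analysis apparatus
WO2022169301A1 (ko) * 2021-02-04 2022-08-11 삼성전자 주식회사 Electronic device supporting speech recognition and operating method thereof
WO2024005226A1 (ko) * 2022-06-29 2024-01-04 엘지전자 주식회사 Display device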

Also Published As

Publication number Publication date
US11211048B2 (en) 2021-12-28
AU2017394767A1 (en) 2019-08-29
CN110199350A (zh) 2019-09-03
EP3570275B1 (en) 2022-04-20
EP3570275A4 (en) 2020-04-08
EP3570275A1 (en) 2019-11-20
US20190378493A1 (en) 2019-12-12
CN110199350B (zh) 2023-09-26
WO2018135743A1 (ko) 2018-07-26

Similar Documents

Publication Publication Date Title
KR20180084394A (ko) Method for sensing end of speech, and electronic apparatus implementing same
EP3923277A3 (en) Delayed responses by computational assistant
EP4235395A3 (en) Device voice control
KR20180084392A (ko) Electronic device and operating method thereof
GB2552623A (en) Systems and methods for automated evaluation of human speech
EP4235647A3 (en) Determining dialog states for language models
AU2019268195A1 (en) Zero latency digital assistant
WO2018038385A3 (ko) Speech recognition method and electronic device performing same
PH12017550013A1 (en) Updating language understanding classifier models for a digital personal assistant based on crowd-sourcing
EP3751561A3 (en) Hotword recognition
WO2015161240A3 (en) Speaker verification
GB2566215A (en) Voice user interface
EP4276819A3 (en) Electronic device and voice recognition method thereof
EP4033358A3 (en) Remote invocation of mobile device actions
MY179900A (en) Speech recognition method and speech recognition apparatus
PH12019502079A1 (en) Electronic device for controlling audio output and operation method thereof
WO2016028628A3 (en) System and method for speech validation
EP4280210A3 (en) Hotword detection on multiple devices
CN106687908A8 (zh) Gesture shortcuts for invoking voice input
EP4243013A3 (en) Method, apparatus and computer-readable media for touch and speech interface with audio location
EP3851972A3 (en) Display apparatus and control methods thereof
MX2017008246A (es) Scaling digital personal assistant agents across devices.
MX2015009812A (es) Method and system for voice command recognition.
JP2016535312A5 (ko)
MX2014010795A (es) Device for extracting information from a dialog.

Legal Events

Date Code Title Description
A201 Request for examination