TWI871343B - Activating speech recognition - Google Patents

Activating speech recognition

Info

Publication number
TWI871343B
Authority
TW
Taiwan
Prior art keywords
hand
detecting
detector
audio signal
response
Prior art date
Application number
TW109125736A
Other languages
English (en)
Chinese (zh)
Other versions
TW202121115A (zh)
Inventor
尹盛萊
姜勇莫
張輝珍
金秉根
黃圭雄
Original Assignee
美商高通公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-07-30
Filing date
2020-07-30
Publication date
2025-02-01
Application filed by 美商高通公司
Publication of TW202121115A
Application granted
Publication of TWI871343B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206 Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F1/3231 Monitoring the presence, absence or movement of users
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/12 Details of acquisition arrangements; Constructional details thereof
    • G06V10/14 Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/143 Sensing or illuminating at different wavelengths
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/227 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of the speaker; Human-factor methodology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)
  • Circuit For Audible Band Transducer (AREA)
TW109125736A 2019-07-30 2020-07-30 Activating speech recognition TWI871343B (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/526,608 US11437031B2 (en) 2019-07-30 2019-07-30 Activating speech recognition based on hand patterns detected using plurality of filters
US16/526,608 2019-07-30

Publications (2)

Publication Number Publication Date
TW202121115A (zh) 2021-06-01
TWI871343B (zh) 2025-02-01

Family

ID=72087256

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109125736A TWI871343B (zh) 2019-07-30 2020-07-30 Activating speech recognition

Country Status (9)

Country Link
US (1) US11437031B2
EP (1) EP4004908B1
JP (1) JP7645230B2
KR (1) KR102926603B1
CN (1) CN114144831B
BR (1) BR112022000922A2
PH (1) PH12021553299A1
TW (1) TWI871343B
WO (1) WO2021021970A1

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20210015234A (ko) * 2019-08-01 2021-02-10 삼성전자주식회사 Electronic device and method for controlling execution of a function according to a voice command thereof
US11682391B2 (en) * 2020-03-30 2023-06-20 Motorola Solutions, Inc. Electronic communications device having a user interface including a single input interface for electronic digital assistant and voice control access
US11862189B2 (en) * 2020-04-01 2024-01-02 Qualcomm Incorporated Method and apparatus for target sound detection
US11590929B2 (en) * 2020-05-05 2023-02-28 Nvidia Corporation Systems and methods for performing commands in a vehicle using speech and image recognition
US20220201083A1 (en) * 2020-12-22 2022-06-23 Cerence Operating Company Platform for integrating disparate ecosystems within a vehicle
KR20230092180A (ko) * 2021-12-17 2023-06-26 현대자동차주식회사 Vehicle and control method thereof
US12412565B2 (en) * 2022-01-28 2025-09-09 Syntiant Corp. Prediction based wake-word detection and methods for use therewith

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020077830A1 (en) * 2000-12-19 2002-06-20 Nokia Corporation Method for activating context sensitive speech recognition in a terminal
US20090253463A1 (en) * 2008-04-08 2009-10-08 Jong-Ho Shin Mobile terminal and menu control method thereof
CN102575943A (zh) * 2009-08-28 2012-07-11 罗伯特·博世有限公司 Gesture-based information and command entry for motor vehicles
US20120179472A1 (en) * 2011-01-06 2012-07-12 Samsung Electronics Co., Ltd. Electronic device controlled by a motion and controlling method thereof
US20150105976A1 (en) * 2013-10-11 2015-04-16 Panasonic Intellectual Property Corporation Of America Processing method, program, processing apparatus, and detection system
CN105373227B (zh) * 2015-10-29 2019-03-26 小米科技有限责任公司 Method and device for intelligently turning off an electronic device

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE9902229L (sv) * 1999-06-07 2001-02-05 Ericsson Telefon Ab L M Apparatus and method of controlling a voice controlled operation
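KR20090107364A (ko) * 2008-04-08 2009-10-13 엘지전자 주식회사 Mobile terminal and menu control method thereof
KR101502003B1 (ko) * 2008-07-08 2015-03-12 엘지전자 주식회사 Mobile terminal and text input method thereof
KR20100007625A (ko) * 2008-07-14 2010-01-22 엘지전자 주식회사 Mobile terminal and menu display method thereof
KR101537693B1 (ko) * 2008-11-24 2015-07-20 엘지전자 주식회사 Terminal and control method thereof
JP5229083B2 (ja) * 2009-04-14 2013-07-03 ソニー株式会社 Information processing apparatus, information processing method, and program
JP2013080015A (ja) * 2011-09-30 2013-05-02 Toshiba Corp Speech recognition device and speech recognition method
JP6211256B2 (ja) * 2012-09-26 2017-10-11 株式会社ナビタイムジャパン Information processing apparatus, information processing method, and information processing program
JP6030430B2 (ja) * 2012-12-14 2016-11-24 クラリオン株式会社 Control device, vehicle, and portable terminal
KR101546709B1 (ko) * 2013-11-25 2015-08-24 현대자동차주식회사 Speech recognition apparatus, vehicle having the same, and method thereof
KR101643560B1 (ko) * 2014-12-17 2016-08-10 현대자동차주식회사 Speech recognition apparatus, vehicle having the same, and method thereof
CN105869637B (zh) * 2016-05-26 2019-10-15 百度在线网络技术(北京)有限公司 Voice wake-up method and device
CN107197090B (zh) * 2017-05-18 2020-07-14 维沃移动通信有限公司 Method for receiving a voice signal and mobile terminal
CN207758675U (zh) 2017-12-29 2018-08-24 广州视声光电有限公司 Trigger-type vehicle-mounted rearview mirror
JP7091983B2 (ja) * 2018-10-01 2022-06-28 トヨタ自動車株式会社 Equipment control device
CN209571226U (zh) * 2018-12-20 2019-11-01 深圳市朗强科技有限公司 Speech recognition device and system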

Also Published As

Publication number Publication date
KR20220041831A (ko) 2022-04-01
JP2022543201A (ja) 2022-10-11
CN114144831B (zh) 2025-07-25
US11437031B2 (en) 2022-09-06
BR112022000922A2 (pt) 2022-03-08
TW202121115A (zh) 2021-06-01
WO2021021970A1 (en) 2021-02-04
PH12021553299A1 (en) 2022-08-01
CN114144831A (zh) 2022-03-04
EP4004908B1 (en) 2024-10-09
EP4004908C0 (en) 2024-10-09
KR102926603B1 (ko) 2026-02-11
EP4004908A1 (en) 2022-06-01
JP7645230B2 (ja) 2025-03-13
US20210035571A1 (en) 2021-02-04

Similar Documents

Publication Publication Date Title
TWI871343B (zh) Activating speech recognition
EP3179474B1 (en) User focus activated voice recognition
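CN108615526B (zh) Method, device, terminal, and storage medium for detecting keywords in a voice signal
CN111696570B (zh) Voice signal processing method, device, equipment, and storage medium
KR102216048B1 (ko) Apparatus and method for recognizing voice commands
CN111933112B (zh) Wake-up voice determination method, device, equipment, and medium
CN111833872B (zh) Voice control method, device, equipment, system, and medium for an elevator
CN113380275B (zh) Voice processing method, device, smart device, and storage medium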
WO2014130463A2 (en) Hybrid performance scaling or speech recognition
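CN113744736B (zh) Command word recognition method, device, electronic device, and storage medium
WO2019107145A1 (ja) Information processing device and information processing method
CN114220420A (zh) Multimodal voice wake-up method, device, and computer-readable storage medium
WO2024179425A1 (zh) Voice interaction method and related device
CN111681655A (zh) Voice control method, device, electronic device, and storage medium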
US11682392B2 (en) Information processing apparatus
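CN111681654A (zh) Voice control method, device, electronic device, and storage medium
CN114333821A (zh) Elevator control method, device, electronic device, storage medium, and product
CN114299945B (zh) Voice signal recognition method, device, electronic device, storage medium, and product
CN116189718B (zh) Voice activity detection method, device, equipment, and storage medium
CN112151017A (zh) Voice processing method, device, system, equipment, and storage medium
CN108966094A (zh) Sound emission control method, device, electronic device, and computer-readable medium
WO2023151360A1 (zh) Electronic device control method and apparatus, and electronic device
KR20240099616A (ko) Speech recognition apparatus and method having a barge-in function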