CN113628613A - Two stage user customizable wake word detection - Google Patents

Two stage user customizable wake word detection

Info

Publication number
CN113628613A
Authority
CN
China
Prior art keywords
model
training
detected utterance
determining
utterance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110467675.7A
Other languages
English (en)
Chinese (zh)
Inventor
R·措普夫
A·潘迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cypress Semiconductor Corp
Original Assignee
Cypress Semiconductor Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-05-06
Filing date
2021-04-28
Publication date
2021-11-09
Priority claimed from US17/032,653 (US11783818B2)
Application filed by Cypress Semiconductor Corp
Publication of CN113628613A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Machine Translation (AREA)
CN202110467675.7A 2020-05-06 2021-04-28 Two stage user customizable wake word detection Pending CN113628613A (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202063020984P 2020-05-06 2020-05-06
US63/020,984 2020-05-06
US17/032,653 US11783818B2 (en) 2020-05-06 2020-09-25 Two stage user customizable wake word detection
US17/032,653 2020-09-25

Publications (1)

Publication Number Publication Date
CN113628613A (zh) 2021-11-09

Family

ID=78232057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110467675.7A Pending CN113628613A (zh) Two stage user customizable wake word detection 2020-05-06 2021-04-28

Country Status (2)

Country Link
CN (1) CN113628613A (de)
DE (1) DE102021111594A1 (de)

Also Published As

Publication number Publication date
DE102021111594A1 (de) 2021-11-11
DE102021111594A9 (de) 2022-01-27

Similar Documents

Publication Publication Date Title
US20230082944A1 (en) Techniques for language independent wake-up word detection
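CN110364143B (zh) Voice wake-up method and apparatus, and intelligent electronic device thereof
CN107481718B (zh) Speech recognition method, apparatus, storage medium, and electronic device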
US9775113B2 (en) Voice wakeup detecting device with digital microphone and associated method
US10304440B1 (en) Keyword spotting using multi-task configuration
US10485049B1 (en) Wireless device connection handover
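CN105765650B (zh) Speech recognizer with multi-directional decoding
JP3674990B2 (ja) Speech recognition dialogue device and speech recognition dialogue processing method
CN109346075A (zh) Method and system for recognizing user speech through human body vibration to control an electronic device
CN105704298A (zh) Voice wake-up detection device and method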
US20140201639A1 (en) Audio user interface apparatus and method
CN109272991B (zh) Voice interaction method, apparatus, device, and computer-readable storage medium
US10477294B1 (en) Multi-device audio capture
US11308946B2 (en) Methods and apparatus for ASR with embedded noise reduction
WO2023029615A1 (zh) Voice wake-up method, apparatus, device, storage medium, and program product
US11064281B1 (en) Sending and receiving wireless data
CN101350196A (zh) Task-dependent speaker identity verification system-on-chip and verification method thereof
US11783818B2 (en) Two stage user customizable wake word detection
CN113628613A (zh) Two stage user customizable wake word detection
Adnene et al. Design and implementation of an automatic speech recognition based voice control system
US20210210109A1 (en) Adaptive decoder for highly compressed grapheme model
JP3846500B2 (ja) Speech recognition dialogue device and speech recognition dialogue processing method
US20230386458A1 (en) Pre-wakeword speech processing
US20240079004A1 (en) System and method for receiving a voice command
US12027156B2 (en) Noise robust representations for keyword spotting systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination