KR101986354B1 - Voice control device for preventing keyword misrecognition and method of operating the same - Google Patents

Voice control device for preventing keyword misrecognition and method of operating the same

Info

Publication number
KR101986354B1
Authority
KR
South Korea
Prior art keywords
keyword
feature vector
audio data
speaker feature
audio
Prior art date
Application number
KR1020170062391A
Other languages
English (en)
Korean (ko)
Other versions
KR20180127065A (ko)
Inventor
김병열
한익상
권오혁
이봉진
오명우
최민석
이찬규
임정희
최지수
강한용
김수환
최정아
Original Assignee
네이버 주식회사
라인 가부시키가이샤
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 네이버 주식회사, 라인 가부시키가이샤
Priority to KR1020170062391A (KR101986354B1)
Priority to JP2018094704A (JP6510117B2)
Publication of KR20180127065A
Priority to JP2019071410A (JP2019133182A)
Application granted
Publication of KR101986354B1
Priority to JP2022000145A (JP2022033258A)

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/02: Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/32: Means for saving power
    • G06F1/3203: Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206: Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F1/3228: Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/04: Training, enrolment or model building
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032: Quantisation or dequantisation of spectral components
    • G10L19/038: Vector quantisation, e.g. TwinVQ audio
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223: Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science
  • Physics & Mathematics
  • Health & Medical Sciences
  • Audiology, Speech & Language Pathology
  • Human Computer Interaction
  • Acoustics & Sound
  • Multimedia
  • Computational Linguistics
  • Spectroscopy & Molecular Physics
  • Theoretical Computer Science
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Computer Vision & Pattern Recognition
  • Signal Processing
  • Telephonic Communication Services
  • User Interface Of Digital Computer
  • Telephone Function
KR1020170062391A (priority date 2017-05-19, filing date 2017-05-19): Voice control device for preventing keyword misrecognition and method of operating the same, KR101986354B1 (ko)

Priority Applications (4)

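Application Number | Publication | Priority Date | Filing Date | Title
KR1020170062391A | KR101986354B1 (ko) | 2017-05-19 | 2017-05-19 | Voice control device for preventing keyword misrecognition and method of operating the same
JP2018094704A | JP6510117B2 (ja) | 2017-05-19 | 2018-05-16 | Voice control device, method of operating voice control device, computer program, and recording medium
JP2019071410A | JP2019133182A (ja) | 2017-05-19 | 2019-04-03 | Voice control device, voice control method, computer program, and recording medium
JP2022000145A | JP2022033258A (ja) | 2017-05-19 | 2022-01-04 | Voice control device, operating method, and computer program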

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
KR1020170062391A | KR101986354B1 (ko) | 2017-05-19 | 2017-05-19 | Voice control device for preventing keyword misrecognition and method of operating the same

Related Child Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
KR1020190064068A | Division | KR102061206B1 (ko) | 2019-05-30 | 2019-05-30 | Voice control device for preventing keyword misrecognition and method of operating the same

Publications (2)

Publication Number | Publication Date
KR20180127065A (ko) | 2018-11-28
KR101986354B1 (ko) | 2019-09-30

Family

ID=64561798

Family Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
KR1020170062391A | KR101986354B1 (ko) | 2017-05-19 | 2017-05-19 | Voice control device for preventing keyword misrecognition and method of operating the same

Country Status (2)

Country | Link
JP (3) | JP6510117B2 (ja)
KR (1) | KR101986354B1 (ko)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
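Publication number | Priority date | Publication date | Assignee | Title
KR101986354B1 (ko) * | 2017-05-19 | 2019-09-30 | 네이버 주식회사 | Voice control device for preventing keyword misrecognition and method of operating the same
JP7035476B2 (ja) * | 2017-11-20 | 2022-03-15 | 富士通株式会社 | Speech processing program, speech processing device, and speech processing method
CN109637531B (zh) * | 2018-12-06 | 2020-09-15 | 珠海格力电器股份有限公司 | Voice control method and device, storage medium, and air conditioner
WO2020122271A1 (ko) * | 2018-12-11 | 2020-06-18 | 엘지전자 주식회사 | Display device
CN109785836B (zh) * | 2019-01-28 | 2021-03-30 | 三星电子(中国)研发中心 | Interaction method and device
CN109992239A (zh) * | 2019-04-15 | 2019-07-09 | 北京百度网讯科技有限公司 | Voice travel method, device, terminal, and storage medium
KR102225001B1 (ko) * | 2019-05-21 | 2021-03-08 | 엘지전자 주식회사 | Speech recognition method and speech recognition device
KR20220071591A (ko) * | 2020-11-24 | 2022-05-31 | 삼성전자주식회사 | Electronic device and control method thereof
KR20220136750A (ko) | 2021-04-01 | 2022-10-11 | 삼성전자주식회사 | Electronic device for processing user utterances, and method of controlling the electronic device
CN113450828B (zh) * | 2021-06-25 | 2024-07-09 | 平安科技(深圳)有限公司 | Music genre identification method, device, equipment, and storage medium
CN115731926A (zh) * | 2021-08-30 | 2023-03-03 | 佛山市顺德区美的电子科技有限公司 | Control method and device for a smart device, smart device, and readable storage medium
CN114038457B (zh) * | 2021-11-04 | 2022-09-13 | 贝壳找房(北京)科技有限公司 | Method, electronic device, storage medium, and program for voice wake-up
JP2023128711A (ja) | 2022-03-04 | 2023-09-14 | キヤノン株式会社 | Medical system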

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2012073361A (ja) * | 2010-09-28 | 2012-04-12 | Fujitsu Ltd | Speech recognition device and speech recognition method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
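Publication number | Priority date | Publication date | Assignee | Title
JP2996019B2 (ja) * | 1992-07-22 | 1999-12-27 | 日本電気株式会社 | Speech recognition device
JP3524370B2 (ja) * | 1998-02-19 | 2004-05-10 | 富士通テン株式会社 | Voice activation system
JP4577543B2 (ja) * | 2000-11-21 | 2010-11-10 | ソニー株式会社 | Model adaptation device and model adaptation method, recording medium, and speech recognition device
JP2006039382A (ja) * | 2004-07-29 | 2006-02-09 | Nissan Motor Co Ltd | Speech recognition device
KR20140139982A (ko) * | 2013-05-28 | 2014-12-08 | 삼성전자주식회사 | Method for performing voice recognition in an electronic device and electronic device using the same
KR101986354B1 (ko) * | 2017-05-19 | 2019-09-30 | 네이버 주식회사 | Voice control device for preventing keyword misrecognition and method of operating the same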

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2012073361A (ja) * | 2010-09-28 | 2012-04-12 | Fujitsu Ltd | Speech recognition device and speech recognition method

Also Published As

Publication number | Publication date
KR20180127065A (ko) | 2018-11-28
JP6510117B2 (ja) | 2019-05-08
JP2022033258A (ja) | 2022-02-28
JP2019133182A (ja) | 2019-08-08
JP2018194844A (ja) | 2018-12-06

Similar Documents

Publication | Title
KR101986354B1 (ko) | Voice control device for preventing keyword misrecognition and method of operating the same
US10504511B2 (en) | Customizable wake-up voice commands
US12094461B2 (en) | Processing spoken commands to control distributed audio outputs
US11600291B1 (en) | Device selection from audio data
US11875820B1 (en) | Context driven device arbitration
US12033632B2 (en) | Context-based device arbitration
US11720326B2 (en) | Audio output control
US12125483B1 (en) | Determining device groups
KR102596430B1 (ko) | Speech recognition method and apparatus based on speaker recognition
CN110140168B (zh) | Contextual hotwords
US11669300B1 (en) | Wake word detection configuration
US10332513B1 (en) | Voice enablement and disablement of speech processing functionality
KR20210120960A (ko) | Server for determining a target device based on a user's input and controlling the target device, and operating method thereof
US10685664B1 (en) | Analyzing noise levels to determine usability of microphones
JP2022013610A (ja) | Voice interaction control method, device, electronic apparatus, storage medium, and system
KR20190096308A (ko) | Electronic device
US12125489B1 (en) | Speech recognition using multiple voice-enabled devices
KR20200007530A (ko) | Method for processing user voice input and electronic device supporting the same
KR102061206B1 (ko) | Voice control device for preventing keyword misrecognition and method of operating the same
US11348579B1 (en) | Volume initiated communications
US11693622B1 (en) | Context configurable keywords
US11133004B1 (en) | Accessory for an audio output device
CN116524916A (zh) | Voice processing method, device, and vehicle

Legal Events

Code | Title
A201 | Request for examination
E902 | Notification of reason for refusal
E701 | Decision to grant or registration of patent right
A107 | Divisional application of patent
GRNT | Written decision to grant