JP6673276B2 - Voice detection device, voice detection method, and program - Google Patents

Voice detection device, voice detection method, and program

Info

Publication number
JP6673276B2
JP6673276B2 JP2017062756A JP2017062756A JP6673276B2 JP 6673276 B2 JP6673276 B2 JP 6673276B2 JP 2017062756 A JP2017062756 A JP 2017062756A JP 2017062756 A JP2017062756 A JP 2017062756A JP 6673276 B2 JP6673276 B2 JP 6673276B2
Authority
JP
Japan
Prior art keywords
sound source
sound
voice
detected
robot
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2017062756A
Other languages
English (en)
Japanese (ja)
Other versions
JP2018165759A5 (ja)
JP2018165759A (ja)
Inventor
敬輔 島田
浩一 中込
崇史 山谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Casio Computer Co Ltd
Original Assignee
Casio Computer Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-03-28
Filing date
2017-03-28
Publication date
2020-03-25
Application filed by Casio Computer Co Ltd
Priority to JP2017062756A (JP6673276B2, ja)
Priority to US15/885,031 (US10424320B2, en)
Priority to CN201810198308.XA (CN108665891B, zh)
Publication of JP2018165759A (ja)
Publication of JP2018165759A5 (ja)
Application granted
Publication of JP6673276B2 (ja)
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/227 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of the speaker; Human-factor methodology
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L2025/783 Detection of presence or absence of voice signals based on threshold decision

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Manipulator (AREA)
  • Toys (AREA)
  • Studio Devices (AREA)
  • Computer Vision & Pattern Recognition (AREA)
JP2017062756A 2017-03-28 2017-03-28 Voice detection device, voice detection method, and program Active JP6673276B2 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2017062756A JP6673276B2 (ja) 2017-03-28 Voice detection device, voice detection method, and program
US15/885,031 US10424320B2 (en) 2017-03-28 2018-01-31 Voice detection, apparatus, voice detection method, and non-transitory computer-readable storage medium
CN201810198308.XA CN108665891B (zh) 2017-03-28 2018-03-09 Sound detection device, sound detection method, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2017062756A JP6673276B2 (ja) 2017-03-28 Voice detection device, voice detection method, and program

Publications (3)

Publication Number Publication Date
JP2018165759A JP2018165759A (ja) 2018-10-25
JP2018165759A5 JP2018165759A5 (ja) 2019-01-24
JP6673276B2 JP6673276B2 (ja) 2020-03-25

Family

ID=63670988

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2017062756A Active JP6673276B2 (ja) Voice detection device, voice detection method, and program 2017-03-28 2017-03-28

Country Status (3)

Country Link
US (1) US10424320B2 (en)
JP (1) JP6673276B2 (ja)
CN (1) CN108665891B (zh)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
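JP6686977B2 (ja) * 2017-06-23 2020-04-22 カシオ計算機株式会社 Sound source separation information detection device, robot, sound source separation information detection method, and program
JP7107017B2 (ja) * 2018-06-21 2022-07-27 カシオ計算機株式会社 Robot, robot control method, and program
KR102093822B1 (ko) * 2018-11-12 2020-03-26 한국과학기술연구원 Sound source separation apparatus
CN109543578B (zh) * 2018-11-13 2020-12-22 北京猎户星空科技有限公司 Smart device control method, apparatus, and storage medium
JP2020089947A (ja) * 2018-12-06 2020-06-11 ソニー株式会社 Information processing device, information processing method, and program
JP2020135110A (ja) * 2019-02-14 2020-08-31 本田技研工業株式会社 Agent device, agent device control method, and program
CN112148742B (zh) * 2019-06-28 2024-09-10 Oppo广东移动通信有限公司 Map update method and apparatus, terminal, and storage medium
KR102280803B1 (ko) * 2019-07-02 2021-07-21 엘지전자 주식회사 Robot and method for driving the same
KR102704312B1 (ko) * 2019-07-09 2024-09-06 엘지전자 주식회사 Communication robot and method for driving the same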
US11565426B2 (en) * 2019-07-19 2023-01-31 Lg Electronics Inc. Movable robot and method for tracking position of speaker by movable robot
CN110509292A (zh) * 2019-09-05 2019-11-29 南京法法法信息科技有限公司 Mobile legal-education robot for public places
TWI714303B (zh) * 2019-10-09 2020-12-21 宇智網通股份有限公司 Sound source localization method and sound system
US11501794B1 (en) * 2020-05-15 2022-11-15 Amazon Technologies, Inc. Multimodal sentiment detection
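CN111916061B (zh) * 2020-07-22 2024-05-07 北京地平线机器人技术研发有限公司 Voice endpoint detection method and apparatus, readable storage medium, and electronic device
CN112017633B (zh) * 2020-09-10 2024-04-26 北京地平线信息技术有限公司 Speech recognition method and apparatus, storage medium, and electronic device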
US11380302B2 (en) * 2020-10-22 2022-07-05 Google Llc Multi channel voice activity detection
KR20240076871A (ko) * 2022-11-23 2024-05-31 삼성전자주식회사 Robot that travels through a space using a map, and method for identifying its position

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000326274A (ja) * 1999-05-24 2000-11-28 Nec Corp Autonomous behavior robot
TWI236610B (en) * 2000-12-06 2005-07-21 Sony Corp Robotic creature device
JP4689107B2 (ja) * 2001-08-22 2011-05-25 本田技研工業株式会社 Autonomous behavior robot
JP3714268B2 (ja) * 2002-03-18 2005-11-09 ソニー株式会社 Robot apparatus
JP2005250233A (ja) * 2004-03-05 2005-09-15 Sanyo Electric Co Ltd Robot apparatus
JP2006239844A (ja) 2005-03-04 2006-09-14 Sony Corp Obstacle avoidance device, obstacle avoidance method, obstacle avoidance program, and mobile robot apparatus
JP4453596B2 (ja) * 2005-04-06 2010-04-21 株式会社安川電機 Robot control method and robot apparatus
JP4456561B2 (ja) * 2005-12-12 2010-04-28 本田技研工業株式会社 Autonomous mobile robot
JP2007221300A (ja) * 2006-02-15 2007-08-30 Fujitsu Ltd Robot and robot control method
JP5170440B2 (ja) * 2006-05-10 2013-03-27 本田技研工業株式会社 Sound source tracking system, method, and robot
JP2008126329A (ja) * 2006-11-17 2008-06-05 Toyota Motor Corp Speech recognition robot and method for controlling a speech recognition robot
JP4560078B2 (ja) * 2007-12-06 2010-10-13 本田技研工業株式会社 Communication robot
CN201210187Y (zh) * 2008-06-13 2009-03-18 河北工业大学 Robot that autonomously searches for a sound source
US20120046788A1 (en) * 2009-01-24 2012-02-23 Tek Electrical (Suzhou) Co., Ltd. Speech system used for robot and robot with speech system
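CN101786274A (zh) * 2009-01-24 2010-07-28 泰怡凯电器(苏州)有限公司 Speech system for a robot and robot equipped with the speech system
CN101786272A (zh) * 2010-01-05 2010-07-28 深圳先进技术研究院 Multi-sensing robot for home intelligent monitoring services
JP2011149782A (ja) 2010-01-20 2011-08-04 National Institute Of Advanced Industrial Science & Technology Method for creating a two-dimensional sound source map from a mobile robot using particle filtering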
US9014848B2 (en) * 2010-05-20 2015-04-21 Irobot Corporation Mobile robot system
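JP2012040655A (ja) * 2010-08-20 2012-03-01 Nec Corp Robot control method, program, and robot
JP6240995B2 (ja) * 2013-01-15 2017-12-06 株式会社国際電気通信基礎技術研究所 Mobile body, acoustic source map creation system, and acoustic source map creation method
CN105283775B (zh) * 2013-04-12 2018-04-06 株式会社日立制作所 Mobile robot and sound source position estimation system
KR102104896B1 (ko) * 2014-01-17 2020-05-29 엘지전자 주식회사 Robot cleaner and human care method using the same
JP6221158B2 (ja) * 2014-08-27 2017-11-01 本田技研工業株式会社 Autonomous behavior robot and method for controlling an autonomous behavior robot
CN104967726B (zh) * 2015-04-30 2018-03-23 努比亚技术有限公司 Voice command processing method and apparatus, and mobile terminal
CN105835064B (zh) * 2016-05-03 2018-05-01 北京光年无限科技有限公司 Multimodal output method for an intelligent robot, and intelligent robot system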

Also Published As

Publication number Publication date
US20180286432A1 (en) 2018-10-04
JP2018165759A (ja) 2018-10-25
CN108665891A (zh) 2018-10-16
US10424320B2 (en) 2019-09-24
CN108665891B (zh) 2023-05-02

Similar Documents

Publication Publication Date Title
JP6673276B2 (ja) Voice detection device, voice detection method, and program
US10665249B2 (en) Sound source separation for robot from target voice direction and noise voice direction
CN108664889B (zh) Object detection device, object detection method, and recording medium
US11501794B1 (en) Multimodal sentiment detection
US11646009B1 (en) Autonomously motile device with noise suppression
US7536029B2 (en) Apparatus and method performing audio-video sensor fusion for object localization, tracking, and separation
Chen et al. Smart homecare surveillance system: Behavior identification based on state-transition support vector machines and sound directivity pattern analysis
CN105979442B (zh) Noise suppression method and apparatus, and movable device
KR102463806B1 (ko) Movable electronic device and operating method thereof
US11605179B2 (en) System for determining anatomical feature orientation
US11714157B2 (en) System to determine direction toward user
US12288566B1 (en) Beamforming using multiple sensor data
JP2009157767A (ja) Face image recognition device, face image recognition method, face image recognition program, and recording medium storing the program
Rekik et al. Human machine interaction via visual speech spotting
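CN111176443B (zh) In-vehicle intelligent system and control method thereof
CN112925235A (zh) Sound source localization method during interaction, device, and computer-readable storage medium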
Fransen et al. Using vision, acoustics, and natural language for disambiguation
CN111421557A (zh) Electronic device and control method thereof
CN111241922A (zh) Robot, control method thereof, and computer-readable storage medium
Chau et al. Audio-visual slam towards human tracking and human-robot interaction in indoor environments
US11412133B1 (en) Autonomously motile device with computer vision
Yang et al. Camera pose estimation and localization with active audio sensing
Sasaki et al. Online spatial sound perception using microphone array on mobile robot
CN110730378A (zh) Information processing method and system
US12456469B1 (en) Beamforming using image data

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20181207

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20181207

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20191028

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20191126

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20200116

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20200204

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20200217

R150 Certificate of patent or registration of utility model

Ref document number: 6673276

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150