CN108665891B - Sound detection device, sound detection method, and recording medium - Google Patents

Sound detection device, sound detection method, and recording medium

Info

Publication number
CN108665891B
Authority
CN
China
Prior art keywords
sound
generation source
unit
sound generation
robot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810198308.XA
Other languages
English (en)
Chinese (zh)
Other versions
CN108665891A (zh)
Inventor
岛田敬辅
中込浩一
山谷崇史
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Casio Computer Co Ltd
Original Assignee
Casio Computer Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Casio Computer Co Ltd
Publication of CN108665891A
Application granted
Publication of CN108665891B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/227 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of the speaker; Human-factor methodology
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L2025/783 Detection of presence or absence of voice signals based on threshold decision

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Manipulator (AREA)
  • Toys (AREA)
  • Studio Devices (AREA)
  • Computer Vision & Pattern Recognition (AREA)
CN201810198308.XA 2017-03-28 2018-03-09 Sound detection device, sound detection method, and recording medium Active CN108665891B (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017062756A JP6673276B2 (ja) 2017-03-28 2017-03-28 Voice detection device, voice detection method, and program
JP2017-062756 2017-03-28

Publications (2)

Publication Number Publication Date
CN108665891A (zh) 2018-10-16
CN108665891B (zh) 2023-05-02

Family

ID=63670988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810198308.XA Active CN108665891B (zh) 2017-03-28 2018-03-09 Sound detection device, sound detection method, and recording medium

Country Status (3)

Country Link
US (1) US10424320B2
JP (1) JP6673276B2
CN (1) CN108665891B

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6686977B2 (ja) * 2017-06-23 2020-04-22 カシオ計算機株式会社 Sound source separation information detection device, robot, sound source separation information detection method, and program
JP7107017B2 (ja) * 2018-06-21 2022-07-27 カシオ計算機株式会社 Robot, robot control method, and program
KR102093822B1 (ko) * 2018-11-12 2020-03-26 한국과학기술연구원 Sound source separation device
CN109543578B (zh) * 2018-11-13 2020-12-22 北京猎户星空科技有限公司 Smart device control method, apparatus, and storage medium
JP2020089947A (ja) * 2018-12-06 2020-06-11 ソニー株式会社 Information processing device, information processing method, and program
JP2020135110A (ja) * 2019-02-14 2020-08-31 本田技研工業株式会社 Agent device, agent device control method, and program
CN112148742B (zh) * 2019-06-28 2024-09-10 Oppo广东移动通信有限公司 Map updating method and apparatus, terminal, and storage medium
KR102280803B1 (ko) * 2019-07-02 2021-07-21 엘지전자 주식회사 Robot and method of driving the same
KR102704312B1 (ko) * 2019-07-09 2024-09-06 엘지전자 주식회사 Communication robot and method of driving the same
US11565426B2 (en) * 2019-07-19 2023-01-31 Lg Electronics Inc. Movable robot and method for tracking position of speaker by movable robot
CN110509292A (zh) * 2019-09-05 2019-11-29 南京法法法信息科技有限公司 Mobile law-popularization robot for public places
TWI714303B (zh) * 2019-10-09 2020-12-21 宇智網通股份有限公司 Sound source localization method and sound system
US11501794B1 (en) * 2020-05-15 2022-11-15 Amazon Technologies, Inc. Multimodal sentiment detection
CN111916061B (zh) * 2020-07-22 2024-05-07 北京地平线机器人技术研发有限公司 Voice endpoint detection method and apparatus, readable storage medium, and electronic device
CN112017633B (zh) * 2020-09-10 2024-04-26 北京地平线信息技术有限公司 Speech recognition method and apparatus, storage medium, and electronic device
US11380302B2 (en) * 2020-10-22 2022-07-05 Google Llc Multi channel voice activity detection
KR20240076871A (ko) * 2022-11-23 2024-05-31 삼성전자주식회사 Robot that travels through a space using a map and method for identifying its position

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
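CN1416381A (zh) * 2000-12-06 2003-05-07 索尼公司 Robot apparatus, method for controlling motion of robot apparatus, and system for controlling motion of robot apparatus
JP2006289514A (ja) * 2005-04-06 2006-10-26 Yaskawa Electric Corp Robot control system and control method
CN101786272A (zh) * 2010-01-05 2010-07-28 深圳先进技术研究院 Multi-sensory robot for home intelligent monitoring service
CN101786274A (zh) * 2009-01-24 2010-07-28 泰怡凯电器(苏州)有限公司 Speech system for a robot and robot with the speech system
JP2012040655A (ja) * 2010-08-20 2012-03-01 Nec Corp Robot control method, program, and robot
CN104783736A (zh) * 2014-01-17 2015-07-22 Lg电子株式会社 Robot cleaner and method of caring for a person using the robot cleaner
CN105835064A (zh) * 2016-05-03 2016-08-10 北京光年无限科技有限公司 Multimodal output method for an intelligent robot and intelligent robot system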

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000326274A (ja) * 1999-05-24 2000-11-28 Nec Corp Autonomous behavior robot
JP4689107B2 (ja) * 2001-08-22 2011-05-25 本田技研工業株式会社 Autonomous behavior robot
JP3714268B2 (ja) * 2002-03-18 2005-11-09 ソニー株式会社 Robot apparatus
JP2005250233A (ja) * 2004-03-05 2005-09-15 Sanyo Electric Co Ltd Robot apparatus
JP2006239844A (ja) 2005-03-04 2006-09-14 Sony Corp Obstacle avoidance device, obstacle avoidance method, obstacle avoidance program, and mobile robot apparatus
JP4456561B2 (ja) * 2005-12-12 2010-04-28 本田技研工業株式会社 Autonomous mobile robot
JP2007221300A (ja) * 2006-02-15 2007-08-30 Fujitsu Ltd Robot and robot control method
JP5170440B2 (ja) * 2006-05-10 2013-03-27 本田技研工業株式会社 Sound source tracking system, method, and robot
JP2008126329A (ja) * 2006-11-17 2008-06-05 Toyota Motor Corp Speech recognition robot and method for controlling speech recognition robot
JP4560078B2 (ja) * 2007-12-06 2010-10-13 本田技研工業株式会社 Communication robot
CN201210187Y (zh) * 2008-06-13 2009-03-18 河北工业大学 Robot that autonomously searches for a sound source
US20120046788A1 (en) * 2009-01-24 2012-02-23 Tek Electrical (Suzhou) Co., Ltd. Speech system used for robot and robot with speech system
JP2011149782A (ja) 2010-01-20 2011-08-04 National Institute Of Advanced Industrial Science & Technology Method for creating a two-dimensional sound source map from a mobile robot by particle filtering
US9014848B2 (en) * 2010-05-20 2015-04-21 Irobot Corporation Mobile robot system
JP6240995B2 (ja) * 2013-01-15 2017-12-06 株式会社国際電気通信基礎技術研究所 Mobile body, acoustic source map creation system, and acoustic source map creation method
CN105283775B (zh) * 2013-04-12 2018-04-06 株式会社日立制作所 Mobile robot and sound source position estimation system
JP6221158B2 (ja) * 2014-08-27 2017-11-01 本田技研工業株式会社 Autonomous behavior robot and method for controlling autonomous behavior robot
CN104967726B (zh) * 2015-04-30 2018-03-23 努比亚技术有限公司 Voice command processing method and apparatus, and mobile terminal

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
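CN1416381A (zh) * 2000-12-06 2003-05-07 索尼公司 Robot apparatus, method for controlling motion of robot apparatus, and system for controlling motion of robot apparatus
JP2006289514A (ja) * 2005-04-06 2006-10-26 Yaskawa Electric Corp Robot control system and control method
CN101786274A (zh) * 2009-01-24 2010-07-28 泰怡凯电器(苏州)有限公司 Speech system for a robot and robot with the speech system
CN101786272A (zh) * 2010-01-05 2010-07-28 深圳先进技术研究院 Multi-sensory robot for home intelligent monitoring service
JP2012040655A (ja) * 2010-08-20 2012-03-01 Nec Corp Robot control method, program, and robot
CN104783736A (zh) * 2014-01-17 2015-07-22 Lg电子株式会社 Robot cleaner and method of caring for a person using the robot cleaner
CN105835064A (zh) * 2016-05-03 2016-08-10 北京光年无限科技有限公司 Multimodal output method for an intelligent robot and intelligent robot system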

Also Published As

Publication number Publication date
US20180286432A1 (en) 2018-10-04
JP2018165759A (ja) 2018-10-25
CN108665891A (zh) 2018-10-16
JP6673276B2 (ja) 2020-03-25
US10424320B2 (en) 2019-09-24

Similar Documents

Publication Publication Date Title
CN108665891B (zh) Sound detection device, sound detection method, and recording medium
US10665249B2 (en) Sound source separation for robot from target voice direction and noise voice direction
CN108664889B (zh) Object detection device, object detection method, and recording medium
EP3373202B1 (en) Verification method and system
US8155394B2 (en) Wireless location and facial/speaker recognition system
WO2019089432A1 (en) System and method associated with user authentication based on an acoustic-based echo-signature
Shu et al. Application of extended Kalman filter for improving the accuracy and smoothness of Kinect skeleton-joint estimates
JP2006192563A (ja) Identification target identification device and robot including the same
US11714157B2 (en) System to determine direction toward user
JP2011237621A (ja) Robot
Qian et al. 3D audio-visual speaker tracking with an adaptive particle filter
US20040190754A1 (en) Image transmission system for a mobile robot
CN110073673A (zh) Facial recognition system
US12288566B1 (en) Beamforming using multiple sensor data
Fransen et al. Using vision, acoustics, and natural language for disambiguation
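CN112767520A (zh) Digital human generation method and apparatus, electronic device, and storage medium
WO2021166811A1 (ja) Information processing device and behavior mode setting method
KR101520446B1 (ko) Monitoring system for preventing beatings and cruel treatment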
Zhu et al. Speaker localization based on audio-visual bimodal fusion
CN110730378A (zh) Information processing method and system
US12456469B1 (en) Beamforming using image data
Tapu et al. Face recognition in video streams for mobile assistive devices dedicated to visually impaired
Kale et al. Active Multi-Modal Approach for Enhanced User Recognition in Social Robots
CN116012910A (zh) Hover tracking method, hover tracking apparatus, hover tracking device, and storage medium
Martinson et al. Guiding computational perception through a shared auditory space

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment