CN116457879B - Adaptive sound event classification - Google Patents

Adaptive sound event classification

Info

Publication number
CN116457879B
Authority
CN
China
Prior art keywords
model
event classification
sound
sound event
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202180077242.9A
Other languages
English (en)
Chinese (zh)
Other versions
CN116457879A (zh)
Inventor
F·萨基
Y·郭
E·维斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN116457879A
Application granted
Publication of CN116457879B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/65 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/08 Mouthpieces; Microphones; Attachments therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Circuit For Audible Band Transducer (AREA)
CN202180077242.9A 2020-11-24 2021-11-19 Adaptive sound event classification Active CN116457879B (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/102,724 US11410677B2 (en) 2020-11-24 2020-11-24 Adaptive sound event classification
US17/102,724 2020-11-24
PCT/US2021/072520 WO2022115838A1 (en) 2020-11-24 2021-11-19 Adaptive sound event classification

Publications (2)

Publication Number Publication Date
CN116457879A (zh) 2023-07-18
CN116457879B (zh) 2026-01-09

Family

ID=79231098

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180077242.9A Active CN116457879B (zh) 2020-11-24 2021-11-19 Adaptive sound event classification

Country Status (6)

Country Link
US (1) US11410677B2
EP (1) EP4252231B1
JP (1) JP7757405B2
KR (1) KR20230110513A
CN (1) CN116457879B
WO (1) WO2022115838A1

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
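CN112669867B (zh) * 2020-12-15 2023-04-11 Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. Debugging method and apparatus for a noise cancellation algorithm, and electronic device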
US12585506B2 (en) * 2021-01-28 2026-03-24 Oracle International Corporation System and method for determination of model fitness and stability for model deployment in automated model generation
WO2022182356A1 (en) * 2021-02-26 2022-09-01 Hewlett-Packard Development Company, L.P. Noise suppression controls
US11508395B1 (en) * 2021-05-03 2022-11-22 Dell Products, L.P. Intelligent selection of audio signatures based upon contextual information to perform management actions
US12244994B2 (en) 2021-07-27 2025-03-04 Qualcomm Incorporated Processing of audio signals from multiple microphones
CN115016760B (zh) * 2022-06-02 2023-04-14 Beijing Baidu Netcom Science and Technology Co., Ltd. Data processing method, apparatus, device, and medium
US11740905B1 (en) * 2022-07-25 2023-08-29 Dimaag-Ai, Inc. Drift detection in static processes
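KR102717465B1 (ko) * 2022-09-08 2024-10-15 Seoul National University of Science and Technology Industry-Academic Cooperation Foundation CNN-based sound source recognition system and method using an incremental machine learning technique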
US20240253655A1 (en) * 2023-02-01 2024-08-01 Global Sense Inc. Acoustic Artificial Intelligence Model for Detecting Events Associated with a Vehicle
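CN121368798A (zh) * 2023-06-27 2026-01-20 Dolby Laboratories Licensing Corporation Fusing audio, visual, and sensor context information in mobile capture
KR20250063887A (ko) * 2023-10-27 2025-05-09 Samsung Electronics Co., Ltd. Electronic device for performing vision recognition from an image acquired using a meta lens, and operating method thereof
CN117711436B (zh) * 2024-02-05 2024-04-09 The 15th Research Institute of China Electronics Technology Group Corporation Far-field sound classification method and apparatus based on multi-sensor fusion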
US12542707B2 (en) * 2024-02-22 2026-02-03 Dell Products L.P. Facilitating intelligent concept drift mitigation in advanced communication networks
CN118538235B (zh) * 2024-05-15 2024-11-15 Yancheng Institute of Technology Clustering and classification method and system for audio data

Family Cites Families (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4827521A (en) 1986-03-27 1989-05-02 International Business Machines Corporation Training of markov models used in a speech recognition system
CA2345661A1 (en) 1998-10-02 2000-04-13 International Business Machines Corporation Conversational browser and conversational systems
DE60108373T2 (de) 2001-08-02 2005-12-22 Sony International (Europe) Gmbh Method for detecting emotions in speech signals using speaker identification
US7620547B2 (en) 2002-07-25 2009-11-17 Sony Deutschland Gmbh Spoken man-machine interface with speaker identification
US7640160B2 (en) 2005-08-05 2009-12-29 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7949529B2 (en) 2005-08-29 2011-05-24 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
US20070183604A1 (en) 2006-02-09 2007-08-09 St-Infonox Response to anomalous acoustic environments
US7877335B2 (en) 2007-10-18 2011-01-25 Yahoo! Inc. System and method for learning a network of categories using prediction
JP5084591B2 (ja) 2008-04-17 2012-11-28 JX Nippon Oil & Energy Corporation Anomaly detection device
US8788270B2 (en) 2009-06-16 2014-07-22 University Of Florida Research Foundation, Inc. Apparatus and method for determining an emotion state of a speaker
US8600743B2 (en) 2010-01-06 2013-12-03 Apple Inc. Noise profile determination for voice-related feature
US8381107B2 (en) 2010-01-13 2013-02-19 Apple Inc. Adaptive audio feedback system and method
US9165556B1 (en) 2012-02-01 2015-10-20 Predictive Business Intelligence, LLC Methods and systems related to audio data processing to provide key phrase notification and potential cost associated with the key phrase
KR101356165B1 (ko) * 2012-03-09 2014-01-24 LG Electronics Inc. Robot cleaner and control method thereof
US9575963B2 (en) 2012-04-20 2017-02-21 Maluuba Inc. Conversational agent
US8463648B1 (en) 2012-05-04 2013-06-11 Pearl.com LLC Method and apparatus for automated topic extraction used for the creation and promotion of new categories in a consultation system
US20140074466A1 (en) 2012-09-10 2014-03-13 Google Inc. Answering questions using environmental context
US10134401B2 (en) 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
US9449613B2 (en) 2012-12-06 2016-09-20 Audeme Llc Room identification using acoustic features in a recording
US10013483B2 (en) 2014-01-30 2018-07-03 Microsoft Technology Licensing, Llc System and method for identifying trending topics in a social network
US9466316B2 (en) 2014-02-06 2016-10-11 Otosense Inc. Device, method and system for instant real time neuro-compatible imaging of a signal
US20180158288A1 (en) * 2014-04-10 2018-06-07 Twin Harbor Labs Llc Methods and apparatus for notifying a user of the operating condition of a household appliance
US10410630B2 (en) 2014-06-19 2019-09-10 Robert Bosch Gmbh System and method for speech-enabled personalized operation of devices and services in multiple operating environments
US10073673B2 (en) 2014-07-14 2018-09-11 Samsung Electronics Co., Ltd. Method and system for robust tagging of named entities in the presence of source or translation errors
US9412361B1 (en) 2014-09-30 2016-08-09 Amazon Technologies, Inc. Configuring system operation using image data
US9643511B2 (en) 2014-12-17 2017-05-09 Samsung Electronics Co., Ltd. Method and apparatus for estimating state of charge (SOC) of battery in electric vehicle
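JP5956624B1 (ja) 2015-02-02 2016-07-27 West Nippon Expressway Engineering Shikoku Co., Ltd. Method for detecting abnormal sound and method for determining structural abnormality using the detected value, and method for detecting similarity of vibration waves and speech recognition method using the detected value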
US10482184B2 (en) 2015-03-08 2019-11-19 Google Llc Context-based natural language processing
EP3093846A1 (en) * 2015-05-12 2016-11-16 Nxp B.V. Accoustic context recognition using local binary pattern method and apparatus
JP6556575B2 (ja) 2015-09-15 2019-08-07 Toshiba Corporation Speech processing device, speech processing method, and speech processing program
US9668073B2 (en) * 2015-10-07 2017-05-30 Robert Bosch Gmbh System and method for audio scene understanding of physical object sound sources
US9847000B2 (en) 2015-10-29 2017-12-19 Immersion Corporation Ambient triggered notifications for rendering haptic effects
US9946862B2 (en) 2015-12-01 2018-04-17 Qualcomm Incorporated Electronic device generating notification based on context data in response to speech phrase from user
US10026401B1 (en) 2015-12-28 2018-07-17 Amazon Technologies, Inc. Naming devices via voice commands
US10902043B2 (en) 2016-01-03 2021-01-26 Gracenote, Inc. Responding to remote media classification queries using classifier models and context parameters
US10373612B2 (en) 2016-03-21 2019-08-06 Amazon Technologies, Inc. Anchored speech detection and speech recognition
US10304444B2 (en) 2016-03-23 2019-05-28 Amazon Technologies, Inc. Fine-grained natural language understanding
WO2017187712A1 (ja) 2016-04-26 2017-11-02 Sony Interactive Entertainment Inc. Information processing device
US10026405B2 (en) 2016-05-03 2018-07-17 SESTEK Ses velletisim Bilgisayar Tekn. San. Ve Tic A.S. Method for speaker diarization
DE112017001830B4 (de) * 2016-05-06 2024-02-22 Robert Bosch Gmbh Speech enhancement and audio event detection for an environment with non-stationary noise
US10705683B2 (en) 2016-10-31 2020-07-07 Microsoft Technology Licensing, Llc Changing visual aspects of a graphical user interface to bring focus to a message
EP3545374B1 (en) 2016-11-23 2024-11-06 Alarm.com Incorporated Detection of authorized user presence and handling of unauthenticated monitoring system commands
US10713703B2 (en) 2016-11-30 2020-07-14 Apple Inc. Diversity in media item recommendations
US10311454B2 (en) 2017-06-22 2019-06-04 NewVoiceMedia Ltd. Customer interaction and experience system using emotional-semantic computing
US10665223B2 (en) 2017-09-29 2020-05-26 Udifi, Inc. Acoustic and other waveform event detection and correction systems and methods
US10481858B2 (en) 2017-12-06 2019-11-19 Harman International Industries, Incorporated Generating personalized audio content based on mood
US10832009B2 (en) 2018-01-02 2020-11-10 International Business Machines Corporation Extraction and summarization of decision elements from communications
WO2019150813A1 (ja) 2018-01-30 2019-08-08 Fujifilm Corporation Data processing device and method, recognition device, learning data storage device, machine learning device, and program
US10832672B2 (en) * 2018-07-13 2020-11-10 International Business Machines Corporation Smart speaker system with cognitive sound analysis and response
US11231905B2 (en) * 2019-03-27 2022-01-25 Intel Corporation Vehicle with external audio speaker and microphone
US11568731B2 (en) * 2019-07-15 2023-01-31 Apple Inc. Systems and methods for identifying an acoustic source based on observed sound
US10783434B1 (en) * 2019-10-07 2020-09-22 Audio Analytic Ltd Method of training a sound event recognition system
CN111341343B (zh) * 2020-03-02 2023-06-30 Espressif Systems (Shanghai) Co., Ltd. Online update system and method for abnormal sound detection

Also Published As

Publication number Publication date
KR20230110513A (ko) 2023-07-24
EP4252231B1 (en) 2025-09-03
US11410677B2 (en) 2022-08-09
JP2023550092A (ja) 2023-11-30
EP4252231A1 (en) 2023-10-04
US20220165292A1 (en) 2022-05-26
WO2022115838A1 (en) 2022-06-02
CN116457879A (zh) 2023-07-18
JP7757405B2 (ja) 2025-10-21

Similar Documents

Publication Publication Date Title
CN116457879B (zh) 自适应声音事件分类
CN114730566B (zh) 声音事件检测学习
US20250103888A1 (en) Context-based model selection
US20220164667A1 (en) Transfer learning for sound event classification
CN111931946B (zh) Data processing method, apparatus, computer device, and storage medium
US20240232258A9 (en) Sound search
US11869228B2 (en) System and a method for generating an image recognition model and classifying an input image
US12417761B2 (en) Dummy prototypical networks for few-shot open-set keyword spotting
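CN115222991A (zh) Training method for a classification model, image classification method, apparatus, and electronic device
WO2020207316A1 (zh) Device resource configuration method, apparatus, storage medium, and electronic device
CN112230829B (zh) System and method for automatic service activation on a computing device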
WO2023183663A1 (en) Dummy prototypical networks for few-shot open-set keyword spotting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant