JP7757405B2 - Adaptive sound event classification - Google Patents
Adaptive sound event classification
Info
- Publication number
- JP7757405B2 (application JP2023529961A)
- Authority
- JP
- Japan
- Prior art keywords
- model
- sound
- sec
- event classification
- sound event
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/65—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/08—Mouthpieces; Microphones; Attachments therefor
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Circuit For Audible Band Transducer (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/102,724 US11410677B2 (en) | 2020-11-24 | 2020-11-24 | Adaptive sound event classification |
| US17/102,724 | 2020-11-24 | | |
| PCT/US2021/072520 WO2022115838A1 (en) | 2020-11-24 | 2021-11-19 | Adaptive sound event classification |
Publications (4)
| Publication Number | Publication Date |
|---|---|
| JP2023550092A (ja) | 2023-11-30 |
| JP2023550092A5 | 2024-10-29 |
| JPWO2022115838A5 | 2024-10-29 |
| JP7757405B2 (ja) | 2025-10-21 |
Family
ID=79231098
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023529961A Active JP7757405B2 (ja) | 2020-11-24 | 2021-11-19 | Adaptive sound event classification |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US11410677B2 |
| EP (1) | EP4252231B1 |
| JP (1) | JP7757405B2 |
| KR (1) | KR20230110513A |
| CN (1) | CN116457879B |
| WO (1) | WO2022115838A1 |
Families Citing this family (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112669867B (zh) * | 2020-12-15 | 2023-04-11 | 阿波罗智联(北京)科技有限公司 | Debugging method and apparatus for noise cancellation algorithm, and electronic device |
| US12585506B2 (en) * | 2021-01-28 | 2026-03-24 | Oracle International Corporation | System and method for determination of model fitness and stability for model deployment in automated model generation |
| WO2022182356A1 (en) * | 2021-02-26 | 2022-09-01 | Hewlett-Packard Development Company, L.P. | Noise suppression controls |
| US11508395B1 (en) * | 2021-05-03 | 2022-11-22 | Dell Products, L.P. | Intelligent selection of audio signatures based upon contextual information to perform management actions |
| US12244994B2 (en) | 2021-07-27 | 2025-03-04 | Qualcomm Incorporated | Processing of audio signals from multiple microphones |
| CN115016760B (zh) * | 2022-06-02 | 2023-04-14 | 北京百度网讯科技有限公司 | Data processing method, apparatus, device, and medium |
| US11740905B1 (en) * | 2022-07-25 | 2023-08-29 | Dimaag-Ai, Inc. | Drift detection in static processes |
| KR102717465B1 (ko) * | 2022-09-08 | 2024-10-15 | 서울과학기술대학교 산학협력단 | CNN-based sound source recognition system and method using an incremental machine learning technique |
| US20240253655A1 (en) * | 2023-02-01 | 2024-08-01 | Global Sense Inc. | Acoustic Artificial Intelligence Model for Detecting Events Associated with a Vehicle |
| CN121368798A (zh) * | 2023-06-27 | 2026-01-20 | 杜比实验室特许公司 | Fusing audio, visual, and sensor context information in mobile capture |
| KR20250063887A (ko) * | 2023-10-27 | 2025-05-09 | 삼성전자주식회사 | Electronic device for performing vision recognition from an image acquired using a meta lens, and operating method thereof |
| CN117711436B (zh) * | 2024-02-05 | 2024-04-09 | 中国电子科技集团公司第十五研究所 | Far-field sound classification method and apparatus based on multi-sensor fusion |
| US12542707B2 (en) * | 2024-02-22 | 2026-02-03 | Dell Products L.P. | Facilitating intelligent concept drift mitigation in advanced communication networks |
| CN118538235B (zh) * | 2024-05-15 | 2024-11-15 | 盐城工学院 | Clustering and classification method and system for audio data |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2009259020A (ja) | 2008-04-17 | 2009-11-05 | Japan Energy Corp | Abnormality detection device |
| US20190103094A1 (en) | 2017-09-29 | 2019-04-04 | Udifi, Inc. | Acoustic and Other Waveform Event Detection and Correction Systems and Methods |
| WO2019150813A1 (ja) | 2018-01-30 | 2019-08-08 | 富士フイルム株式会社 | Data processing device and method, recognition device, learning data storage device, machine learning device, and program |
Family Cites Families (50)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4827521A (en) | 1986-03-27 | 1989-05-02 | International Business Machines Corporation | Training of markov models used in a speech recognition system |
| CA2345661A1 (en) | 1998-10-02 | 2000-04-13 | International Business Machines Corporation | Conversational browser and conversational systems |
| DE60108373T2 (de) | 2001-08-02 | 2005-12-22 | Sony International (Europe) Gmbh | Method for detecting emotions in speech signals using speaker identification |
| US7620547B2 (en) | 2002-07-25 | 2009-11-17 | Sony Deutschland Gmbh | Spoken man-machine interface with speaker identification |
| US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
| US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
| US20070183604A1 (en) | 2006-02-09 | 2007-08-09 | St-Infonox | Response to anomalous acoustic environments |
| US7877335B2 (en) | 2007-10-18 | 2011-01-25 | Yahoo! Inc. | System and method for learning a network of categories using prediction |
| US8788270B2 (en) | 2009-06-16 | 2014-07-22 | University Of Florida Research Foundation, Inc. | Apparatus and method for determining an emotion state of a speaker |
| US8600743B2 (en) | 2010-01-06 | 2013-12-03 | Apple Inc. | Noise profile determination for voice-related feature |
| US8381107B2 (en) | 2010-01-13 | 2013-02-19 | Apple Inc. | Adaptive audio feedback system and method |
| US9165556B1 (en) | 2012-02-01 | 2015-10-20 | Predictive Business Intelligence, LLC | Methods and systems related to audio data processing to provide key phrase notification and potential cost associated with the key phrase |
| KR101356165B1 (ko) * | 2012-03-09 | 2014-01-24 | 엘지전자 주식회사 | Robot cleaner and control method thereof |
| US9575963B2 (en) | 2012-04-20 | 2017-02-21 | Maluuba Inc. | Conversational agent |
| US8463648B1 (en) | 2012-05-04 | 2013-06-11 | Pearl.com LLC | Method and apparatus for automated topic extraction used for the creation and promotion of new categories in a consultation system |
| US20140074466A1 (en) | 2012-09-10 | 2014-03-13 | Google Inc. | Answering questions using environmental context |
| US10134401B2 (en) | 2012-11-21 | 2018-11-20 | Verint Systems Ltd. | Diarization using linguistic labeling |
| US9449613B2 (en) | 2012-12-06 | 2016-09-20 | Audeme Llc | Room identification using acoustic features in a recording |
| US10013483B2 (en) | 2014-01-30 | 2018-07-03 | Microsoft Technology Licensing, Llc | System and method for identifying trending topics in a social network |
| US9466316B2 (en) | 2014-02-06 | 2016-10-11 | Otosense Inc. | Device, method and system for instant real time neuro-compatible imaging of a signal |
| US20180158288A1 (en) * | 2014-04-10 | 2018-06-07 | Twin Harbor Labs Llc | Methods and apparatus for notifying a user of the operating condition of a household appliance |
| US10410630B2 (en) | 2014-06-19 | 2019-09-10 | Robert Bosch Gmbh | System and method for speech-enabled personalized operation of devices and services in multiple operating environments |
| US10073673B2 (en) | 2014-07-14 | 2018-09-11 | Samsung Electronics Co., Ltd. | Method and system for robust tagging of named entities in the presence of source or translation errors |
| US9412361B1 (en) | 2014-09-30 | 2016-08-09 | Amazon Technologies, Inc. | Configuring system operation using image data |
| US9643511B2 (en) | 2014-12-17 | 2017-05-09 | Samsung Electronics Co., Ltd. | Method and apparatus for estimating state of charge (SOC) of battery in electric vehicle |
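| JP5956624B1 (ja) | 2015-02-02 | 2016-07-27 | 西日本高速道路エンジニアリング四国株式会社 | Method for detecting abnormal sound and method for determining structural abnormality using the detected value, and method for detecting similarity of vibration waves and speech recognition method using the detected value |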
| US10482184B2 (en) | 2015-03-08 | 2019-11-19 | Google Llc | Context-based natural language processing |
| EP3093846A1 (en) * | 2015-05-12 | 2016-11-16 | Nxp B.V. | Accoustic context recognition using local binary pattern method and apparatus |
| JP6556575B2 (ja) | 2015-09-15 | 2019-08-07 | 株式会社東芝 | Speech processing device, speech processing method, and speech processing program |
| US9668073B2 (en) * | 2015-10-07 | 2017-05-30 | Robert Bosch Gmbh | System and method for audio scene understanding of physical object sound sources |
| US9847000B2 (en) | 2015-10-29 | 2017-12-19 | Immersion Corporation | Ambient triggered notifications for rendering haptic effects |
| US9946862B2 (en) | 2015-12-01 | 2018-04-17 | Qualcomm Incorporated | Electronic device generating notification based on context data in response to speech phrase from user |
| US10026401B1 (en) | 2015-12-28 | 2018-07-17 | Amazon Technologies, Inc. | Naming devices via voice commands |
| US10902043B2 (en) | 2016-01-03 | 2021-01-26 | Gracenote, Inc. | Responding to remote media classification queries using classifier models and context parameters |
| US10373612B2 (en) | 2016-03-21 | 2019-08-06 | Amazon Technologies, Inc. | Anchored speech detection and speech recognition |
| US10304444B2 (en) | 2016-03-23 | 2019-05-28 | Amazon Technologies, Inc. | Fine-grained natural language understanding |
| WO2017187712A1 (ja) | 2016-04-26 | 2017-11-02 | 株式会社ソニー・インタラクティブエンタテインメント | Information processing device |
| US10026405B2 (en) | 2016-05-03 | 2018-07-17 | SESTEK Ses velletisim Bilgisayar Tekn. San. Ve Tic A.S. | Method for speaker diarization |
| DE112017001830B4 (de) * | 2016-05-06 | 2024-02-22 | Robert Bosch Gmbh | Speech enhancement and audio event detection for an environment with non-stationary noise |
| US10705683B2 (en) | 2016-10-31 | 2020-07-07 | Microsoft Technology Licensing, Llc | Changing visual aspects of a graphical user interface to bring focus to a message |
| EP3545374B1 (en) | 2016-11-23 | 2024-11-06 | Alarm.com Incorporated | Detection of authorized user presence and handling of unauthenticated monitoring system commands |
| US10713703B2 (en) | 2016-11-30 | 2020-07-14 | Apple Inc. | Diversity in media item recommendations |
| US10311454B2 (en) | 2017-06-22 | 2019-06-04 | NewVoiceMedia Ltd. | Customer interaction and experience system using emotional-semantic computing |
| US10481858B2 (en) | 2017-12-06 | 2019-11-19 | Harman International Industries, Incorporated | Generating personalized audio content based on mood |
| US10832009B2 (en) | 2018-01-02 | 2020-11-10 | International Business Machines Corporation | Extraction and summarization of decision elements from communications |
| US10832672B2 (en) * | 2018-07-13 | 2020-11-10 | International Business Machines Corporation | Smart speaker system with cognitive sound analysis and response |
| US11231905B2 (en) * | 2019-03-27 | 2022-01-25 | Intel Corporation | Vehicle with external audio speaker and microphone |
| US11568731B2 (en) * | 2019-07-15 | 2023-01-31 | Apple Inc. | Systems and methods for identifying an acoustic source based on observed sound |
| US10783434B1 (en) * | 2019-10-07 | 2020-09-22 | Audio Analytic Ltd | Method of training a sound event recognition system |
| CN111341343B (zh) * | 2020-03-02 | 2023-06-30 | 乐鑫信息科技(上海)股份有限公司 | Online updating system and method for abnormal sound detection |
2020
- 2020-11-24: US application US17/102,724, patent US11410677B2 (Active)
2021
- 2021-11-19: EP application EP21836306.7A, patent EP4252231B1 (Active)
- 2021-11-19: JP application JP2023529961A, patent JP7757405B2 (Active)
- 2021-11-19: WO application PCT/US2021/072520, publication WO2022115838A1 (Ceased)
- 2021-11-19: CN application CN202180077242.9A, patent CN116457879B (Active)
- 2021-11-19: KR application KR1020237016431A, publication KR20230110513A (Pending)
Non-Patent Citations (1)
| Title |
|---|
| KOH, Eunjeong et al., INCREMENTAL LEARNING ALGORITHM FOR SOUND EVENT DETECTION, arXiv [online], 2020-03-26, pp. 1-6, Retrieved from <https://arxiv.org/pdf/2003.12175v1>, [retrieved on 2025-08-28] |
Also Published As
| Publication number | Publication date |
|---|---|
| KR20230110513A (ko) | 2023-07-24 |
| EP4252231B1 (en) | 2025-09-03 |
| US11410677B2 (en) | 2022-08-09 |
| JP2023550092A (ja) | 2023-11-30 |
| EP4252231A1 (en) | 2023-10-04 |
| US20220165292A1 (en) | 2022-05-26 |
| WO2022115838A1 (en) | 2022-06-02 |
| CN116457879A (zh) | 2023-07-18 |
| CN116457879B (zh) | 2026-01-09 |
Similar Documents
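| Publication | Publication Date | Title |
|---|---|---|
| JP7757405B2 (ja) | | Adaptive sound event classification |
| JP7840941B2 (ja) | | Context-based model selection |
| EP4066243B1 (en) | | Sound event detection learning |
| KR102428920B1 (ko) | | Electronic device and operating method thereof |
| US20220164667A1 (en) | | Transfer learning for sound event classification |
| CN115699036A (zh) | | Intelligence layer supporting cross-platform, edge-cloud hybrid artificial intelligence services |
| US20240232258A9 (en) | | Sound search |
| US12417761B2 (en) | | Dummy prototypical networks for few-shot open-set keyword spotting |
| KR102795306B1 (ko) | | Learning processing system, and apparatus and method for determining the number of local parameters |
| KR20180049787A (ko) | | Electronic device and control method thereof |
| CN119948490A (zh) | | Apparatus and method for sharing and pruning weights of vision and language models |
| CN118871984A (zh) | | Multi-task learning for personalized keyword spotting |
| US12229343B2 (en) | | System, device and method for real time gesture prediction |
| WO2020207316A1 (zh) | | Device resource configuration method and apparatus, storage medium, and electronic device |
| CN115099422A (zh) | | Pipeline training method, electronic device, and medium |
| WO2023183663A1 (en) | | Dummy prototypical networks for few-shot open-set keyword spotting |
| CN112153461A (zh) | | Method and apparatus for locating a sound-emitting object, electronic device, and readable storage medium |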
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2024-10-21 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 2024-10-21 |
| | TRDD | Decision of grant or rejection written | |
| | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 2025-08-29 |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 2025-09-09 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 2025-10-08 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7757405; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |