KR20230110513A - Adaptive sound event classification - Google Patents
Adaptive sound event classification
- Publication number
- KR20230110513A
- Authority
- KR
- South Korea
- Prior art keywords
- model
- event classification
- sound
- sound event
- sec
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/65—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/08—Mouthpieces; Microphones; Attachments therefor
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Circuit For Audible Band Transducer (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/102,724 US11410677B2 (en) | 2020-11-24 | 2020-11-24 | Adaptive sound event classification |
| US17/102,724 | 2020-11-24 | | |
| PCT/US2021/072520 WO2022115838A1 (en) | 2020-11-24 | 2021-11-19 | Adaptive sound event classification |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| KR20230110513A (ko) | 2023-07-24 |
Family
ID=79231098
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| KR1020237016431A Pending KR20230110513A (ko) | Adaptive sound event classification | 2020-11-24 | 2021-11-19 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US11410677B2 |
| EP (1) | EP4252231B1 |
| JP (1) | JP7757405B2 |
| KR (1) | KR20230110513A |
| CN (1) | CN116457879B |
| WO (1) | WO2022115838A1 |
Families Citing this family (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112669867B (zh) * | 2020-12-15 | 2023-04-11 | 阿波罗智联(北京)科技有限公司 | Debugging method and apparatus for a noise cancellation algorithm, and electronic device |
| US12585506B2 (en) * | 2021-01-28 | 2026-03-24 | Oracle International Corporation | System and method for determination of model fitness and stability for model deployment in automated model generation |
| WO2022182356A1 (en) * | 2021-02-26 | 2022-09-01 | Hewlett-Packard Development Company, L.P. | Noise suppression controls |
| US11508395B1 (en) * | 2021-05-03 | 2022-11-22 | Dell Products, L.P. | Intelligent selection of audio signatures based upon contextual information to perform management actions |
| US12244994B2 (en) | 2021-07-27 | 2025-03-04 | Qualcomm Incorporated | Processing of audio signals from multiple microphones |
| CN115016760B (zh) * | 2022-06-02 | 2023-04-14 | 北京百度网讯科技有限公司 | Data processing method, apparatus, device, and medium |
| US11740905B1 (en) * | 2022-07-25 | 2023-08-29 | Dimaag-Ai, Inc. | Drift detection in static processes |
| KR102717465B1 (ko) * | 2022-09-08 | 2024-10-15 | 서울과학기술대학교 산학협력단 | CNN-based sound source recognition system and method using an incremental machine learning technique |
| US20240253655A1 (en) * | 2023-02-01 | 2024-08-01 | Global Sense Inc. | Acoustic Artificial Intelligence Model for Detecting Events Associated with a Vehicle |
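| CN121368798A (zh) * | 2023-06-27 | 2026-01-20 | 杜比实验室特许公司 | Fusing audio, visual, and sensor context information in mobile capture |
| KR20250063887A (ko) * | 2023-10-27 | 2025-05-09 | 삼성전자주식회사 | Electronic device for performing vision recognition from an image acquired using a meta lens, and operating method thereof |
| CN117711436B (zh) * | 2024-02-05 | 2024-04-09 | 中国电子科技集团公司第十五研究所 | Far-field sound classification method and apparatus based on multi-sensor fusion |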
| US12542707B2 (en) * | 2024-02-22 | 2026-02-03 | Dell Products L.P. | Facilitating intelligent concept drift mitigation in advanced communication networks |
| CN118538235B (zh) * | 2024-05-15 | 2024-11-15 | 盐城工学院 | Clustering and classification method and system for audio data |
Family Cites Families (53)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4827521A (en) | 1986-03-27 | 1989-05-02 | International Business Machines Corporation | Training of markov models used in a speech recognition system |
| CA2345661A1 (en) | 1998-10-02 | 2000-04-13 | International Business Machines Corporation | Conversational browser and conversational systems |
| DE60108373T2 (de) | 2001-08-02 | 2005-12-22 | Sony International (Europe) Gmbh | Method for detecting emotions in speech signals using speaker identification |
| US7620547B2 (en) | 2002-07-25 | 2009-11-17 | Sony Deutschland Gmbh | Spoken man-machine interface with speaker identification |
| US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
| US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
| US20070183604A1 (en) | 2006-02-09 | 2007-08-09 | St-Infonox | Response to anomalous acoustic environments |
| US7877335B2 (en) | 2007-10-18 | 2011-01-25 | Yahoo! Inc. | System and method for learning a network of categories using prediction |
| JP5084591B2 (ja) | 2008-04-17 | 2012-11-28 | Jx日鉱日石エネルギー株式会社 | Abnormality detection device |
| US8788270B2 (en) | 2009-06-16 | 2014-07-22 | University Of Florida Research Foundation, Inc. | Apparatus and method for determining an emotion state of a speaker |
| US8600743B2 (en) | 2010-01-06 | 2013-12-03 | Apple Inc. | Noise profile determination for voice-related feature |
| US8381107B2 (en) | 2010-01-13 | 2013-02-19 | Apple Inc. | Adaptive audio feedback system and method |
| US9165556B1 (en) | 2012-02-01 | 2015-10-20 | Predictive Business Intelligence, LLC | Methods and systems related to audio data processing to provide key phrase notification and potential cost associated with the key phrase |
| KR101356165B1 (ko) * | 2012-03-09 | 2014-01-24 | 엘지전자 주식회사 | Robot cleaner and control method thereof |
| US9575963B2 (en) | 2012-04-20 | 2017-02-21 | Maluuba Inc. | Conversational agent |
| US8463648B1 (en) | 2012-05-04 | 2013-06-11 | Pearl.com LLC | Method and apparatus for automated topic extraction used for the creation and promotion of new categories in a consultation system |
| US20140074466A1 (en) | 2012-09-10 | 2014-03-13 | Google Inc. | Answering questions using environmental context |
| US10134401B2 (en) | 2012-11-21 | 2018-11-20 | Verint Systems Ltd. | Diarization using linguistic labeling |
| US9449613B2 (en) | 2012-12-06 | 2016-09-20 | Audeme Llc | Room identification using acoustic features in a recording |
| US10013483B2 (en) | 2014-01-30 | 2018-07-03 | Microsoft Technology Licensing, Llc | System and method for identifying trending topics in a social network |
| US9466316B2 (en) | 2014-02-06 | 2016-10-11 | Otosense Inc. | Device, method and system for instant real time neuro-compatible imaging of a signal |
| US20180158288A1 (en) * | 2014-04-10 | 2018-06-07 | Twin Harbor Labs Llc | Methods and apparatus for notifying a user of the operating condition of a household appliance |
| US10410630B2 (en) | 2014-06-19 | 2019-09-10 | Robert Bosch Gmbh | System and method for speech-enabled personalized operation of devices and services in multiple operating environments |
| US10073673B2 (en) | 2014-07-14 | 2018-09-11 | Samsung Electronics Co., Ltd. | Method and system for robust tagging of named entities in the presence of source or translation errors |
| US9412361B1 (en) | 2014-09-30 | 2016-08-09 | Amazon Technologies, Inc. | Configuring system operation using image data |
| US9643511B2 (en) | 2014-12-17 | 2017-05-09 | Samsung Electronics Co., Ltd. | Method and apparatus for estimating state of charge (SOC) of battery in electric vehicle |
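| JP5956624B1 (ja) | 2015-02-02 | 2016-07-27 | 西日本高速道路エンジニアリング四国株式会社 | Method for detecting abnormal sound and method for determining abnormality of a structure using the detected value, and method for detecting similarity of vibration waves and speech recognition method using the detected value |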
| US10482184B2 (en) | 2015-03-08 | 2019-11-19 | Google Llc | Context-based natural language processing |
| EP3093846A1 (en) * | 2015-05-12 | 2016-11-16 | Nxp B.V. | Accoustic context recognition using local binary pattern method and apparatus |
| JP6556575B2 (ja) | 2015-09-15 | 2019-08-07 | 株式会社東芝 | Speech processing device, speech processing method, and speech processing program |
| US9668073B2 (en) * | 2015-10-07 | 2017-05-30 | Robert Bosch Gmbh | System and method for audio scene understanding of physical object sound sources |
| US9847000B2 (en) | 2015-10-29 | 2017-12-19 | Immersion Corporation | Ambient triggered notifications for rendering haptic effects |
| US9946862B2 (en) | 2015-12-01 | 2018-04-17 | Qualcomm Incorporated | Electronic device generating notification based on context data in response to speech phrase from user |
| US10026401B1 (en) | 2015-12-28 | 2018-07-17 | Amazon Technologies, Inc. | Naming devices via voice commands |
| US10902043B2 (en) | 2016-01-03 | 2021-01-26 | Gracenote, Inc. | Responding to remote media classification queries using classifier models and context parameters |
| US10373612B2 (en) | 2016-03-21 | 2019-08-06 | Amazon Technologies, Inc. | Anchored speech detection and speech recognition |
| US10304444B2 (en) | 2016-03-23 | 2019-05-28 | Amazon Technologies, Inc. | Fine-grained natural language understanding |
| WO2017187712A1 (ja) | 2016-04-26 | 2017-11-02 | 株式会社ソニー・インタラクティブエンタテインメント | Information processing device |
| US10026405B2 (en) | 2016-05-03 | 2018-07-17 | SESTEK Ses velletisim Bilgisayar Tekn. San. Ve Tic A.S. | Method for speaker diarization |
| DE112017001830B4 (de) * | 2016-05-06 | 2024-02-22 | Robert Bosch Gmbh | Speech enhancement and audio event detection for an environment with non-stationary noise |
| US10705683B2 (en) | 2016-10-31 | 2020-07-07 | Microsoft Technology Licensing, Llc | Changing visual aspects of a graphical user interface to bring focus to a message |
| EP3545374B1 (en) | 2016-11-23 | 2024-11-06 | Alarm.com Incorporated | Detection of authorized user presence and handling of unauthenticated monitoring system commands |
| US10713703B2 (en) | 2016-11-30 | 2020-07-14 | Apple Inc. | Diversity in media item recommendations |
| US10311454B2 (en) | 2017-06-22 | 2019-06-04 | NewVoiceMedia Ltd. | Customer interaction and experience system using emotional-semantic computing |
| US10665223B2 (en) | 2017-09-29 | 2020-05-26 | Udifi, Inc. | Acoustic and other waveform event detection and correction systems and methods |
| US10481858B2 (en) | 2017-12-06 | 2019-11-19 | Harman International Industries, Incorporated | Generating personalized audio content based on mood |
| US10832009B2 (en) | 2018-01-02 | 2020-11-10 | International Business Machines Corporation | Extraction and summarization of decision elements from communications |
| WO2019150813A1 (ja) | 2018-01-30 | 2019-08-08 | 富士フイルム株式会社 | Data processing device and method, recognition device, learning data storage device, machine learning device, and program |
| US10832672B2 (en) * | 2018-07-13 | 2020-11-10 | International Business Machines Corporation | Smart speaker system with cognitive sound analysis and response |
| US11231905B2 (en) * | 2019-03-27 | 2022-01-25 | Intel Corporation | Vehicle with external audio speaker and microphone |
| US11568731B2 (en) * | 2019-07-15 | 2023-01-31 | Apple Inc. | Systems and methods for identifying an acoustic source based on observed sound |
| US10783434B1 (en) * | 2019-10-07 | 2020-09-22 | Audio Analytic Ltd | Method of training a sound event recognition system |
| CN111341343B (zh) * | 2020-03-02 | 2023-06-30 | 乐鑫信息科技(上海)股份有限公司 | Online updating system and method for abnormal sound detection |
2020
- 2020-11-24: US application US17/102,724, patent US11410677B2, status Active
2021
- 2021-11-19: EP application EP21836306.7A, patent EP4252231B1, status Active
- 2021-11-19: JP application JP2023529961A, patent JP7757405B2, status Active
- 2021-11-19: WO application PCT/US2021/072520, publication WO2022115838A1, status Ceased
- 2021-11-19: CN application CN202180077242.9A, patent CN116457879B, status Active
- 2021-11-19: KR application KR1020237016431A, publication KR20230110513A, status Pending
Also Published As
| Publication number | Publication date |
|---|---|
| EP4252231B1 (en) | 2025-09-03 |
| US11410677B2 (en) | 2022-08-09 |
| JP2023550092A (ja) | 2023-11-30 |
| EP4252231A1 (en) | 2023-10-04 |
| US20220165292A1 (en) | 2022-05-26 |
| WO2022115838A1 (en) | 2022-06-02 |
| CN116457879A (zh) | 2023-07-18 |
| CN116457879B (zh) | 2026-01-09 |
| JP7757405B2 (ja) | 2025-10-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11410677B2 (en) | | Adaptive sound event classification |
| EP4066243B1 (en) | | Sound event detection learning |
| US20250103888A1 (en) | | Context-based model selection |
| KR102428920B1 (ko) | | Electronic device and operating method thereof |
| US20220164667A1 (en) | | Transfer learning for sound event classification |
| CN111931946B (zh) | | Data processing method and apparatus, computer device, and storage medium |
| US20240232258A9 (en) | | Sound search |
| US12417761B2 (en) | | Dummy prototypical networks for few-shot open-set keyword spotting |
| CN111581958A (zh) | | Dialogue state determination method and apparatus, computer device, and storage medium |
| US12347439B2 (en) | | Multi-task learning for personalized keyword spotting |
| CN115222991A (zh) | | Training method for a classification model, image classification method, apparatus, and electronic device |
| KR20210003491A (ko) | | Robot and driving method thereof |
| US20190163436A1 (en) | | Electronic device and method for controlling the same |
| US12229343B2 (en) | | System, device and method for real time gesture prediction |
| WO2023183664A1 (en) | | Multi-task learning for personalized keyword spotting |
| WO2023183663A1 (en) | | Dummy prototypical networks for few-shot open-set keyword spotting |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | Patent event date: 20230515; Patent event code: PA01051R01D; Comment text: International Patent Application |
| | PG1501 | Laying open of application | |
| | A201 | Request for examination | |
| | PA0201 | Request for examination | Patent event date: 20241104; Patent event code: PA02012R01D; Comment text: Request for Examination of Application |