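CN109616104B - Environmental sound recognition method based on key point encoding and multi-spike learning - Google Patents
Environmental sound recognition method based on key point encoding and multi-spike learning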
- Publication number
- CN109616104B (application CN201910101670.5A)
- Authority
- CN
- China
- Prior art keywords
- pulse
- key point
- neuron
- learning
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/061—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/01—Assessment or evaluation of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Theoretical Computer Science (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Neurology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
- Complex Calculations (AREA)
Abstract
Description
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910101670.5A CN109616104B (zh) | 2019-01-31 | 2019-01-31 | Environmental sound recognition method based on key point encoding and multi-spike learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910101670.5A CN109616104B (zh) | 2019-01-31 | 2019-01-31 | Environmental sound recognition method based on key point encoding and multi-spike learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109616104A (zh) | 2019-04-12 |
CN109616104B (zh) | 2022-12-30 |
Family
ID=66019509
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910101670.5A Active CN109616104B (zh) | Environmental sound recognition method based on key point encoding and multi-spike learning | 2019-01-31 | 2019-01-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109616104B (zh) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
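CN111028861B (zh) * | 2019-12-10 | 2022-02-22 | AI Speech Co., Ltd. | Spectral mask model training method, audio scene recognition method, and system |
CN111310816B (zh) * | 2020-02-07 | 2023-04-07 | Tianjin University | Brain-inspired architecture image recognition method based on unsupervised matching pursuit coding |
CN111681648A (zh) * | 2020-03-10 | 2020-09-18 | Tianjin University | Sound recognition method based on enhanced spikes |
CN112749637B (zh) * | 2020-12-29 | 2023-09-08 | University of Electronic Science and Technology of China | SNN-based distributed optical fiber sensing signal recognition method |
CN112734012B (zh) * | 2021-01-07 | 2024-03-05 | Beijing Lynxi Technology Co., Ltd. | Spiking neural network training method, data processing method, electronic device, and medium |
CN113257282B (zh) * | 2021-07-15 | 2021-10-08 | Chengdu SynSense Technology Co., Ltd. | Speech emotion recognition method and apparatus, electronic device, and storage medium |
CN113974607B (zh) * | 2021-11-17 | 2024-04-26 | Hangzhou Dianzi University | Sleep snore detection system based on a spiking neural network |
CN115906960A (zh) * | 2022-11-18 | 2023-04-04 | Tianjin University | Sound recognition method based on a biological learning neural network |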
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
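CN106709997A (zh) * | 2016-04-29 | 2017-05-24 | University of Electronic Science and Technology of China | Three-dimensional key point detection method based on a deep neural network and a sparse autoencoder |
CN106845541A (zh) * | 2017-01-17 | 2017-06-13 | Hangzhou Dianzi University | Image recognition method based on biological vision and a precise-spike-driven neural network |
CN108596195A (zh) * | 2018-05-09 | 2018-09-28 | Fujian Yirong Information Technology Co., Ltd. | Scene recognition method based on sparse coding feature extraction |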
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4823001B2 (ja) * | 2006-09-27 | 2011-11-24 | Fujitsu Semiconductor Ltd. | Audio encoding device |
Non-Patent Citations (5)
Title |
---|
"A Spiking Neural Network System for Robust Sequence Recognition"; Qiang Yu et al.; IEEE Transactions on Neural Networks and Learning Systems; 20150414; vol. 27, no. 3; full text * |
"A Supervised Multi-Spike Learning Algorithm for Spiking Neural Networks"; Yu Miao et al.; 2018 International Joint Conference on Neural Networks (IJCNN); 20181014; full text * |
"Combining robust spike coding with spiking neural networks for sound event classification"; Jonathan Dennis et al.; 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 20150806; pp. 176-179 * |
"Spike Timing or Rate? Neurons Learn to Make Decisions for Both Through Threshold-Driven Plasticity"; Qiang Yu et al.; IEEE Transactions on Cybernetics; 20180427; vol. 49, no. 6; pp. 2178-2188 * |
"Research on feedforward multi-spike neural network algorithms based on visual hierarchy"; Jin Xin; China Master's Theses Full-text Database, Information Science and Technology; 20190115 (no. 01); full text * |
Also Published As
Publication number | Publication date |
---|---|
CN109616104A (zh) | 2019-04-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109616104B (zh) | Environmental sound recognition method based on key point encoding and multi-spike learning | |
Sarangi et al. | Optimization of data-driven filterbank for automatic speaker verification | |
Shahamiri et al. | Real-time frequency-based noise-robust Automatic Speech Recognition using Multi-Nets Artificial Neural Networks: A multi-views multi-learners approach | |
US11694696B2 (en) | Method and apparatus for implementing speaker identification neural network | |
Verma et al. | Frequency Estimation from Waveforms Using Multi-Layered Neural Networks. | |
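CN113571067B (zh) | Voiceprint recognition adversarial example generation method based on boundary attack | |
CN109448749A (zh) | Speech extraction method, system, and apparatus based on supervised-learning auditory attention | |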
Song et al. | A machine learning-based underwater noise classification method | |
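CN109903749B (zh) | Robust sound recognition method based on key point encoding and a convolutional neural network | |
CN115424620A (zh) | Voiceprint recognition backdoor sample generation method based on adaptive triggers | |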
Alamsyah et al. | Speech gender classification using bidirectional long short term memory | |
Sertsi et al. | Robust voice activity detection based on LSTM recurrent neural networks and modulation spectrum | |
Shi et al. | Deep neural network and noise classification-based speech enhancement | |
Tan et al. | Digit recognition using neural networks | |
Tawaqal et al. | Recognizing five major dialects in Indonesia based on MFCC and DRNN | |
Kato et al. | Statistical regression models for noise robust F0 estimation using recurrent deep neural networks | |
Nicolson et al. | Sum-product networks for robust automatic speaker identification | |
CN115602156A (zh) | Speech recognition method based on a multi-synaptic-connection optical spiking neural network | |
Bourouba et al. | Feature extraction algorithm using new cepstral techniques for robust speech recognition | |
Nayem et al. | Incorporating intra-spectral dependencies with a recurrent output layer for improved speech enhancement | |
Malekzadeh et al. | Persian vowel recognition with MFCC and ANN on PCVC speech dataset | |
Mendelev et al. | Robust voice activity detection with deep maxout neural networks | |
Wu et al. | Audio-based expansion learning for aerial target recognition | |
Shanmugapriya et al. | Deep neural network based speaker verification system using features from glottal activity regions | |
Gade et al. | Hybrid Deep Convolutional Neural Network based Speaker Recognition for Noisy Speech Environments |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20231008. Address after: 14th, 15th, 16th, and 17th floors, 18th floor, Building 1, Nord Center, No. 168 Luwei Road, Hongshunli Street, Hebei District, Tianjin, 300000. Patentee after: HUIYAN TECHNOLOGY (TIANJIN) Co.,Ltd. Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92. Patentee before: Tianjin University |
CP02 | Change in the address of a patent holder | Address after: No.14,15,16,17, 18th Floor, Building 1, Nord Center, No. 168 Luwei Road, Hongshunli Street, Hebei District, Tianjin, 300000. Patentee after: HUIYAN TECHNOLOGY (TIANJIN) Co.,Ltd. Address before: 14th, 15th, 16th, and 17th floors, 18th floor, Building 1, Nord Center, No. 168 Luwei Road, Hongshunli Street, Hebei District, Tianjin, 300000. Patentee before: HUIYAN TECHNOLOGY (TIANJIN) Co.,Ltd. |