CN110265002A - Audio recognition method, device, computer equipment and computer readable storage medium - Google Patents
- Publication number: CN110265002A
- Application number: CN201910480466.9A
- Authority: CN (China)
- Prior art keywords: layer, carry, neural network, bit, convolution
- Prior art date
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910480466.9A CN110265002B (en) | 2019-06-04 | 2019-06-04 | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910480466.9A CN110265002B (en) | 2019-06-04 | 2019-06-04 | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110265002A (en) | 2019-09-20 |
CN110265002B (en) | 2021-07-23 |
Family
ID=67916581
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910480466.9A Active CN110265002B (en) | 2019-06-04 | 2019-06-04 | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110265002B (en) |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101097509A (en) * | 2006-06-26 | 2008-01-02 | 英特尔公司 | Sparse tree adder |
CN103259529A (en) * | 2012-02-17 | 2013-08-21 | 京微雅格(北京)科技有限公司 | Integrated circuit using carry skip chains |
WO2018048907A1 (en) * | 2016-09-06 | 2018-03-15 | Neosensory, Inc. C/O Tmc+260 | Method and system for providing adjunct sensory information to a user |
CN109643228A (en) * | 2016-10-01 | 2019-04-16 | 英特尔公司 | Low energy consumption mantissa multiplication for floating point multiplication addition operation |
WO2018102240A1 (en) * | 2016-12-02 | 2018-06-07 | Microsoft Technology Licensing, Llc | Joint language understanding and dialogue management |
CN106909970A (en) * | 2017-01-12 | 2017-06-30 | 南京大学 | A kind of two-value weight convolutional neural networks hardware accelerator computing module based on approximate calculation |
CN106816147A (en) * | 2017-01-25 | 2017-06-09 | 上海交通大学 | Speech recognition system based on binary neural network acoustic model |
CN107203808A (en) * | 2017-05-08 | 2017-09-26 | 中国科学院计算技术研究所 | A kind of two-value Convole Unit and corresponding two-value convolutional neural networks processor |
CN107153873A (en) * | 2017-05-08 | 2017-09-12 | 中国科学院计算技术研究所 | A kind of two-value convolutional neural networks processor and its application method |
CN109214502A (en) * | 2017-07-03 | 2019-01-15 | 清华大学 | Neural network weight discretization method and system |
CN108010515A (en) * | 2017-11-21 | 2018-05-08 | 清华大学 | A kind of speech terminals detection and awakening method and device |
CN109100142A (en) * | 2018-06-26 | 2018-12-28 | 北京交通大学 | A kind of semi-supervised method for diagnosing faults of bearing based on graph theory |
CN109787929A (en) * | 2019-02-20 | 2019-05-21 | 深圳市宝链人工智能科技有限公司 | Signal modulate method, electronic device and computer readable storage medium |
Non-Patent Citations (7)
Title |
---|
BENJAMIN GRAHAM: "Spatially-sparse convolutional neural networks", 《COMPUTER VISION AND PATTERN RECOGNITION》 * |
DANDAN SONG: "Low Bits: Binary Neural Network For Vad and Wakeup", 《2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING》 * |
MATTHIEU COURBARIAUX: "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1", 《ARXIV》 * |
SHOUYI YIN: "A 141 uW, 2.46 pJ/Neuron Binarized Convolutional Neural Network based Self-learning", 《IEEE》 * |
SONG HAN: "Deep compression: compressing deep neural networks with pruning", 《ICLR 2016》 * |
YAN-MIN QIAN: "Binary neural networks for speech recognition", 《FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING》 * |
YU PAN: "A Multilevel Cell STT-MRAM-Based Computing In-Memory Accelerator for Binary Convolutional Neural Networks", 《IEEE TRANSACTIONS ON MAGNETICS》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110852361A (en) * | 2019-10-30 | 2020-02-28 | 清华大学 | Image classification method and device based on improved deep neural network and electronic equipment |
CN111583940A (en) * | 2020-04-20 | 2020-08-25 | 东南大学 | Very low power consumption keyword awakening neural network circuit |
CN111860778A (en) * | 2020-07-08 | 2020-10-30 | 北京灵汐科技有限公司 | Full-additive convolution method and device |
CN112863520A (en) * | 2021-01-18 | 2021-05-28 | 东南大学 | Binary weight convolution neural network module and method for voiceprint recognition by using same |
CN112863520B (en) * | 2021-01-18 | 2023-10-24 | 东南大学 | Binary weight convolutional neural network module and method for identifying voiceprint by using same |
CN113409773A (en) * | 2021-08-18 | 2021-09-17 | 中科南京智能技术研究院 | Binaryzation neural network voice awakening method and system |
CN114822510A (en) * | 2022-06-28 | 2022-07-29 | 中科南京智能技术研究院 | Voice awakening method and system based on binary convolutional neural network |
CN114822510B (en) * | 2022-06-28 | 2022-10-04 | 中科南京智能技术研究院 | Voice awakening method and system based on binary convolutional neural network |
Also Published As
Publication number | Publication date |
---|---|
CN110265002B (en) | 2021-07-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110265002B (en) | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium | |
Lu et al. | Evaluations on deep neural networks training using posit number system | |
CN110378468B (en) | Neural network accelerator based on structured pruning and low bit quantization | |
CN111062472B (en) | Sparse neural network accelerator based on structured pruning and acceleration method thereof | |
Kim | DeepX: Deep learning accelerator for restricted boltzmann machine artificial neural networks | |
CN107340993B (en) | Arithmetic device and method | |
CN110163353B (en) | Computing device and method | |
US20180018555A1 (en) | System and method for building artificial neural network architectures | |
US20210192009A1 (en) | Apparatus and method for generating efficient convolution | |
KR20190107766A (en) | Computing device and method | |
Li et al. | Quantized neural networks with new stochastic multipliers | |
TWI738048B (en) | Arithmetic framework system and method for operating floating-to-fixed arithmetic framework | |
CN107402905B (en) | Neural network-based computing method and device | |
Chong et al. | A 2.5 μW KWS engine with pruned LSTM and embedded MFCC for IoT applications | |
CN109145107A (en) | Subject distillation method, apparatus, medium and equipment based on convolutional neural networks | |
Jiang et al. | A low-latency LSTM accelerator using balanced sparsity based on FPGA | |
Li et al. | E-sparse: Boosting the large language model inference through entropy-based n: M sparsity | |
Fuketa et al. | Image-classifier deep convolutional neural network training by 9-bit dedicated hardware to realize validation accuracy and energy efficiency superior to the half precision floating point format | |
Walia et al. | Fast and low-power quantized fixed posit high-accuracy DNN implementation | |
Wu et al. | GBC: An energy-efficient LSTM accelerator with gating units level balanced compression strategy | |
Temenos et al. | A stochastic computing sigma-delta adder architecture for efficient neural network design | |
Hsieh et al. | A multiplier-less convolutional neural network inference accelerator for intelligent edge devices | |
Sun et al. | HSIM-DNN: Hardware simulator for computation-, storage-and power-efficient deep neural networks | |
CN110990776B (en) | Coding distributed computing method, device, computer equipment and storage medium | |
Devnath et al. | A mathematical approach towards quantization of floating point weights in low power neural networks | |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2023-07-17. Patentee after: Zhongguancun Technology Leasing Co.,Ltd. (610, Floor 6, Block A, No. 2, Lize Middle Second Road, Chaoyang District, Beijing 100102). Patentee before: Beijing Qingwei Intelligent Technology Co.,Ltd. (2212, 22/F, No.9, North Fourth Ring Road West, Haidian District, Beijing 100056). |
TR01 | Transfer of patent right | Effective date of registration: 2023-11-14. Patentee after: Beijing Qingwei Intelligent Technology Co.,Ltd. (201, 2nd floor, building 26, yard 1, Baosheng South Road, Haidian District, Beijing 100192). Patentee before: Zhongguancun Technology Leasing Co.,Ltd. (610, Floor 6, Block A, No. 2, Lize Middle Second Road, Chaoyang District, Beijing 100102). |
CB03 | Change of inventor or designer information | Inventors after: Liu Ling, OuYang Peng, Li Xiudong, Wang Bo. Inventors before: Liu Ling, OuYang Peng, Yin Shouyi, Li Xiudong, Wang Bo. |