CN106816147A - Speech recognition system based on binary neural network acoustic model - Google Patents
Speech recognition system based on binary neural network acoustic model
- Publication number
- CN106816147A
- Authority
- CN
- China
- Prior art keywords
- neural network
- input
- binary neural
- module
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Signal Processing (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Probability & Statistics with Applications (AREA)
- Complex Calculations (AREA)
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710055681.5A CN106816147A (en) | 2017-01-25 | 2017-01-25 | Speech recognition system based on binary neural network acoustic model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710055681.5A CN106816147A (en) | 2017-01-25 | 2017-01-25 | Speech recognition system based on binary neural network acoustic model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106816147A (en) | 2017-06-09 |
Family
ID=59113098
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710055681.5A (Pending), CN106816147A (en) | Speech recognition system based on binary neural network acoustic model | 2017-01-25 | 2017-01-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106816147A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107220702A (en) * | 2017-06-21 | 2017-09-29 | 北京图森未来科技有限公司 | Neural network optimization method and device |
CN108874754A (en) * | 2018-05-30 | 2018-11-23 | 苏州思必驰信息科技有限公司 | Language model compression method and system |
CN109754789A (en) * | 2017-11-07 | 2019-05-14 | 北京国双科技有限公司 | Method and device for recognizing voice phonemes |
CN110033766A (en) * | 2019-04-17 | 2019-07-19 | 重庆大学 | Speech recognition method based on binarized recurrent neural network |
CN110085255A (en) * | 2019-03-27 | 2019-08-02 | 河海大学常州校区 | Speech conversion Gaussian process regression modeling method based on deep kernel learning |
CN110265002A (en) * | 2019-06-04 | 2019-09-20 | 北京清微智能科技有限公司 | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium |
WO2019179036A1 (en) * | 2018-03-19 | 2019-09-26 | 平安科技(深圳)有限公司 | Deep neural network model, electronic device, identity authentication method, and storage medium |
WO2020061884A1 (en) * | 2018-09-27 | 2020-04-02 | Intel Corporation | Composite binary decomposition network |
CN111160534A (en) * | 2019-12-31 | 2020-05-15 | 中山大学 | Binary neural network forward propagation frame suitable for mobile terminal |
CN111239597A (en) * | 2020-01-14 | 2020-06-05 | 温州大学乐清工业研究院 | Method for representing electric life of alternating current contactor based on audio signal characteristics |
CN113270091A (en) * | 2020-02-14 | 2021-08-17 | 声音猎手公司 | Audio processing system and method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103117060A (en) * | 2013-01-18 | 2013-05-22 | 中国科学院声学研究所 | Modeling approach and modeling system of acoustic model used in speech recognition |
CN105702250A (en) * | 2016-01-06 | 2016-06-22 | 福建天晴数码有限公司 | Voice recognition method and device |
- 2017-01-25: CN application CN201710055681.5A (publication CN106816147A), status: Pending
Non-Patent Citations (2)
Title |
---|
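COURBARIAUX M. ET AL: "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1", IEEE * |
祝嘉声 (ZHU JIASHENG): "Research on DNN-based acoustic models for Chinese speech recognition" (基于DNN的汉语语音识别声学模型的研究), China Master's Theses Full-text Database, Information Science and Technology * |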
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107220702B (en) * | 2017-06-21 | 2020-11-24 | 北京图森智途科技有限公司 | Computer vision processing method and device of low-computing-capacity processing equipment |
CN107220702A (en) * | 2017-06-21 | 2017-09-29 | 北京图森未来科技有限公司 | Neural network optimization method and device |
CN109754789A (en) * | 2017-11-07 | 2019-05-14 | 北京国双科技有限公司 | Method and device for recognizing voice phonemes |
CN109754789B (en) * | 2017-11-07 | 2021-06-08 | 北京国双科技有限公司 | Method and device for recognizing voice phonemes |
WO2019179036A1 (en) * | 2018-03-19 | 2019-09-26 | 平安科技(深圳)有限公司 | Deep neural network model, electronic device, identity authentication method, and storage medium |
CN108874754A (en) * | 2018-05-30 | 2018-11-23 | 苏州思必驰信息科技有限公司 | Language model compression method and system |
US11934949B2 (en) | 2018-09-27 | 2024-03-19 | Intel Corporation | Composite binary decomposition network |
WO2020061884A1 (en) * | 2018-09-27 | 2020-04-02 | Intel Corporation | Composite binary decomposition network |
CN110085255A (en) * | 2019-03-27 | 2019-08-02 | 河海大学常州校区 | Speech conversion Gaussian process regression modeling method based on deep kernel learning |
CN110085255B (en) * | 2019-03-27 | 2021-05-28 | 河海大学常州校区 | Speech conversion Gaussian process regression modeling method based on deep kernel learning |
CN110033766A (en) * | 2019-04-17 | 2019-07-19 | 重庆大学 | Speech recognition method based on binarized recurrent neural network |
CN110265002A (en) * | 2019-06-04 | 2019-09-20 | 北京清微智能科技有限公司 | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium |
CN110265002B (en) * | 2019-06-04 | 2021-07-23 | 北京清微智能科技有限公司 | Speech recognition method, speech recognition device, computer equipment and computer readable storage medium |
CN111160534A (en) * | 2019-12-31 | 2020-05-15 | 中山大学 | Binary neural network forward propagation frame suitable for mobile terminal |
CN111239597A (en) * | 2020-01-14 | 2020-06-05 | 温州大学乐清工业研究院 | Method for representing electric life of alternating current contactor based on audio signal characteristics |
CN113270091A (en) * | 2020-02-14 | 2021-08-17 | 声音猎手公司 | Audio processing system and method |
CN113270091B (en) * | 2020-02-14 | 2024-04-16 | 声音猎手公司 | Audio processing system and method |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106816147A (en) | Speech recognition system based on binary neural network acoustic model | |
US20220415452A1 (en) | Method and apparatus for determining drug molecule property, and storage medium | |
Zhang et al. | Platon: Pruning large transformer models with upper confidence bound of weight importance | |
CN110705294A (en) | Named entity recognition model training method, named entity recognition method and device | |
Zhang et al. | Fast spoken query detection using lower-bound dynamic time warping on graphical processing units | |
CN110287961A (en) | Chinese word segmentation method, electronic device and readable storage medium | |
CN103810999A (en) | Linguistic model training method and system based on distributed neural networks | |
CN114970522B (en) | Pre-training method, device, equipment and storage medium of language model | |
CN108846120A (en) | Method, system and storage medium for classifying to text set | |
Scanzio et al. | Parallel implementation of artificial neural network training | |
CN105229625B (en) | Method for voice recognition and acoustic processing device | |
WO2023020522A1 (en) | Methods for natural language processing and training natural language processing model, and device | |
CN110580458A (en) | Music score image recognition method combining multi-scale residual CNN and SRU | |
Patel et al. | Deep learning for natural language processing | |
CN112883722A (en) | Distributed text summarization method based on cloud data center | |
CN107506345A (en) | Method and device for constructing a language model | |
EP3340066A1 (en) | Fft accelerator | |
CN112132281B (en) | Model training method, device, server and medium based on artificial intelligence | |
CN105340005B (en) | Histogram-based predictive pruning scheme for obtaining an efficient HMM | |
Gomes et al. | Deep learning brasil at absapt 2022: Portuguese transformer ensemble approaches | |
WO2022174499A1 (en) | Method and apparatus for predicting text prosodic boundaries, computer device, and storage medium | |
Chong et al. | Exploring recognition network representations for efficient speech inference on highly parallel platforms. | |
CN112287667A (en) | Text generation method and equipment | |
CN110222339B (en) | Intention recognition method and device based on improved XGBoost algorithm | |
Cai et al. | Fast learning of deep neural networks via singular value decomposition | |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20200616. Address after: Room 223, old administration building, 800 Dongchuan Road, Minhang District, Shanghai, 200240. Applicant after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; AI SPEECH Ltd. Address before: 200240 Dongchuan Road, Shanghai, No. 800, No. Applicant before: Shanghai Jiao Tong University; AI SPEECH Ltd. |
TA01 | Transfer of patent application right | Effective date of registration: 20201021. Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Applicant after: AI SPEECH Ltd. Address before: Room 223, old administration building, 800 Dongchuan Road, Minhang District, Shanghai, 200240. Applicant before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; AI SPEECH Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170609 |