CN106782513B - Speech recognition realization method and system based on confidence level - Google Patents
- Publication number
- CN106782513B (application CN201710060942.2A)
- Authority
- CN
- China
- Prior art keywords
- speech recognition
- phoneme
- probability
- information
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
Abstract
Description
System component | Conventional method | Present invention | Advantage |
---|---|---|---|
Word lattice generation | Frame-synchronous decoding | Phoneme-synchronous decoding | More accurate and more efficient lattice generation |
Construction of the full search space | Based on filler models or the word lattice | Auxiliary search space | More comprehensive search space |
Confidence calculation | Word-lattice posterior probability | Competing probability in the confusion network | More accurate recognition information |
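The last row of the table contrasts word-lattice posteriors with competing probabilities in a confusion network. As a rough illustration only, the Python sketch below treats a confusion network as a list of slots, each mapping candidate words to scores, and takes a word's confidence to be its share of the score mass in its slot. The `Slot` representation, the function names, the example scores, and the 0.5 threshold are all assumptions made for this sketch; the patent's own formula is not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one slot maps each competing word to a score.
Slot = Dict[str, float]


def slot_confidence(slot: Slot) -> Tuple[str, float]:
    """Return the top-scoring word and its share of the slot's total score."""
    total = sum(slot.values())
    if total <= 0:
        raise ValueError("slot has no positive score mass")
    best = max(slot, key=slot.get)
    return best, slot[best] / total


def decode_with_confidence(
    network: List[Slot], threshold: float = 0.5
) -> List[Tuple[str, float, bool]]:
    """For each slot, report (word, confidence, accepted)."""
    results = []
    for slot in network:
        word, conf = slot_confidence(slot)
        # A word is accepted only if it clearly beats its competitors.
        results.append((word, conf, conf >= threshold))
    return results


# Made-up scores for illustration; not taken from the patent.
network = [
    {"turn": 0.9, "burn": 0.1},
    {"on": 0.6, "off": 0.4},
    {"light": 0.45, "night": 0.35, "right": 0.20},
]

for word, conf, accepted in decode_with_confidence(network):
    print(f"{word}: confidence={conf:.2f}, accepted={accepted}")
```

In this toy setup a word with several strong competitors in its slot gets a low confidence even when it is the best candidate, which is the intuition behind using competition within a slot rather than a raw lattice posterior.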
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710060942.2A CN106782513B (en) | 2017-01-25 | 2017-01-25 | Speech recognition realization method and system based on confidence level |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710060942.2A CN106782513B (en) | 2017-01-25 | 2017-01-25 | Speech recognition realization method and system based on confidence level |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106782513A (en) | 2017-05-31 |
CN106782513B (en) | 2019-08-23 |
Family
ID=58943125
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710060942.2A Active CN106782513B (en) | 2017-01-25 | 2017-01-25 | Speech recognition realization method and system based on confidence level |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106782513B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107871499B (en) * | 2017-10-27 | 2020-06-16 | 珠海市杰理科技股份有限公司 | Speech recognition method, system, computer device and computer-readable storage medium |
CN110164416B (en) * | 2018-12-07 | 2023-05-09 | 腾讯科技(深圳)有限公司 | Voice recognition method and device, equipment and storage medium thereof |
CN110808032B (en) * | 2019-09-20 | 2023-12-22 | 平安科技(深圳)有限公司 | Voice recognition method, device, computer equipment and storage medium |
CN113192535B (en) * | 2021-04-16 | 2022-09-09 | 中国科学院声学研究所 | Voice keyword retrieval method, system and electronic device |
CN115394288B (en) * | 2022-10-28 | 2023-01-24 | 成都爱维译科技有限公司 | Language identification method and system for civil aviation multi-language radio land-air conversation |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7571098B1 (en) * | 2003-05-29 | 2009-08-04 | At&T Intellectual Property Ii, L.P. | System and method of spoken language understanding using word confusion networks |
CN101763855A (en) * | 2009-11-20 | 2010-06-30 | 安徽科大讯飞信息科技股份有限公司 | Method and device for judging confidence of speech recognition |
CN101887725A (en) * | 2010-04-30 | 2010-11-17 | 中国科学院声学研究所 | Phoneme confusion network-based phoneme posterior probability calculation method |
CN104157285A (en) * | 2013-05-14 | 2014-11-19 | 腾讯科技(深圳)有限公司 | Voice recognition method and device, and electronic equipment |
CN105895081A (en) * | 2016-04-11 | 2016-08-24 | 苏州思必驰信息科技有限公司 | Speech recognition decoding method and speech recognition decoding device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102371188B1 (en) * | 2015-06-30 | 2022-03-04 | 삼성전자주식회사 | Apparatus and method for speech recognition, and electronic device |
Non-Patent Citations (2)
Title |
---|
《一种广播新闻语音的关键词检测系统》;张鹏远等;《全国网络与信息安全技术研讨会》;20070731;第398-405页 |
《基于环境特征的语音识别置信度研究》;国玉晶等;《第十届全国人机语音通讯学术会议暨国际语音语言处理研讨会》;20090816;第298-302页 |
Also Published As
Publication number | Publication date |
---|---|
CN106782513A (en) | 2017-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106782513B (en) | Speech recognition realization method and system based on confidence level | |
JP6916264B2 (en) | Real-time speech recognition methods based on disconnection attention, devices, equipment and computer readable storage media | |
CN111739508B (en) | End-to-end speech synthesis method and system based on DNN-HMM bimodal alignment network | |
CN106098059B (en) | Customizable voice awakening method and system | |
Zayats et al. | Disfluency detection using a bidirectional LSTM | |
CN102982811B (en) | Voice endpoint detection method based on real-time decoding | |
JP7070894B2 (en) | Time series information learning system, method and neural network model | |
US10714076B2 (en) | Initialization of CTC speech recognition with standard HMM | |
KR20190125463A (en) | Method and apparatus for detecting voice emotion, computer device and storage medium | |
CN108899013A (en) | Voice search method, device and speech recognition system | |
CN106611597A (en) | Voice wakeup method and voice wakeup device based on artificial intelligence | |
CN101604520A (en) | Spoken language voice recognition method based on statistical model and syntax rule | |
CN111539199B (en) | Text error correction method, device, terminal and storage medium | |
CN108364634A (en) | Spoken language pronunciation evaluating method based on deep neural network posterior probability algorithm | |
JP2018060047A (en) | Learning device for acoustic model and computer program therefor | |
CN113823257B (en) | Speech synthesizer construction method, speech synthesis method and device | |
CN112309398A (en) | Working time monitoring method and device, electronic equipment and storage medium | |
CN100431003C (en) | Voice decoding method based on mixed network | |
CN111179941A (en) | Intelligent device awakening method, registration method and device | |
Deng et al. | History utterance embedding transformer lm for speech recognition | |
Huang et al. | Towards word-level end-to-end neural speaker diarization with auxiliary network | |
CN113096646B (en) | Audio recognition method and device, electronic equipment and storage medium | |
CN115240713A (en) | Voice emotion recognition method and device based on multi-modal features and contrast learning | |
JP2905674B2 (en) | Unspecified speaker continuous speech recognition method | |
CN113936642A (en) | Pronunciation dictionary construction method, voice recognition method and related device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2020-06-28. Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120. Co-patentee after: AI SPEECH Co.,Ltd. Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd. Address before: 200240 Dongchuan Road, Shanghai, No. 800, No. Co-patentee before: AI SPEECH Co.,Ltd. Patentee before: SHANGHAI JIAO TONG University |
TR01 | Transfer of patent right | Effective date of registration: 2020-11-05. Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Patentee after: AI SPEECH Co.,Ltd. Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120. Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd. Patentee before: AI SPEECH Co.,Ltd. |
CP01 | Change in the name or title of a patent holder | Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Patentee after: Sipic Technology Co.,Ltd. Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu. Patentee before: AI SPEECH Co.,Ltd. |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: Implementation Method and System of Confidence Based Speech Recognition. Effective date of registration: 2023-07-26. Granted publication date: 2019-08-23. Pledgee: CITIC Bank Limited by Share Ltd. Suzhou branch. Pledgor: Sipic Technology Co.,Ltd. Registration number: Y2023980049433 |