CN107452374B - Multi-view language recognition method based on unidirectional self-labeling auxiliary information - Google Patents
- Publication number
- CN107452374B (application number CN201710561261.4A)
- Authority
- CN
- China
- Prior art keywords
- model
- labeling
- auxiliary
- language model
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concepts
Concept | Type | Sections | Count |
---|---|---|---|
labelling | Methods | title, claims, abstract, description | 38 |
method | Methods | title, claims, abstract, description | 26 |
artificial neural network | Methods | claims, abstract, description | 28 |
bidirectional | Effects | claims, abstract, description | 8 |
recurrent | Effects | claims, description | 13 |
vector | Substances | claims, description | 13 |
cyclic group | Chemical group | claims, description | 7 |
matrix material | Substances | claims, description | 4 |
convolutional neural network | Methods | claims, description | 3 |
stabilizing | Effects | claims, description | 2 |
effects | Effects | abstract, description | 3 |
adaptation | Effects | abstract | 1 |
layer | Substances | description | 19 |
function | Effects | description | 2 |
neural | Effects | description | 1 |
normalization | Methods | description | 1 |
single layer | Substances | description | 1 |
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/18—Speech classification or search using natural language modelling
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
Abstract
Description
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710561261.4A CN107452374B (zh) | 2017-07-11 | 2017-07-11 | Multi-view language recognition method based on unidirectional self-labeling auxiliary information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107452374A (zh) | 2017-12-08 |
CN107452374B (zh) | 2020-05-05 |
Family
ID=60488802
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710561261.4A Active CN107452374B (zh) | Multi-view language recognition method based on unidirectional self-labeling auxiliary information | 2017-07-11 | 2017-07-11 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107452374B (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
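CN108417201B (zh) * | 2018-01-19 | 2020-11-06 | 苏州思必驰信息科技有限公司 | Single-channel multi-speaker identity recognition method and system |
JP7258988B2 (ja) * | 2019-02-08 | 2023-04-17 | ヤフー株式会社 | Information processing device, information processing method, and information processing program |
CN110738984B (zh) * | 2019-05-13 | 2020-12-11 | 苏州闪驰数控系统集成有限公司 | Artificial intelligence CNN and LSTM neural network speech recognition system |
CN111179910A (zh) * | 2019-12-17 | 2020-05-19 | 深圳追一科技有限公司 | Speech rate recognition method and apparatus, server, and computer-readable storage medium |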
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106328122A (zh) * | 2016-08-19 | 2017-01-11 | 深圳市唯特视科技有限公司 | Speech recognition method using a long short-term memory recurrent neural network |
US9607616B2 (en) * | 2015-08-17 | 2017-03-28 | Mitsubishi Electric Research Laboratories, Inc. | Method for using a multi-scale recurrent neural network with pretraining for spoken language understanding tasks |
CN106682220A (zh) * | 2017-01-04 | 2017-05-17 | 华南理工大学 | Deep-learning-based named entity recognition method for online traditional Chinese medicine text |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9607616B2 (en) * | 2015-08-17 | 2017-03-28 | Mitsubishi Electric Research Laboratories, Inc. | Method for using a multi-scale recurrent neural network with pretraining for spoken language understanding tasks |
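CN106328122A (zh) * | 2016-08-19 | 2017-01-11 | 深圳市唯特视科技有限公司 | Speech recognition method using a long short-term memory recurrent neural network |
CN106682220A (zh) * | 2017-01-04 | 2017-05-17 | 华南理工大学 | Deep-learning-based named entity recognition method for online traditional Chinese medicine text |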
Non-Patent Citations (3)
Title |
---|
"A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding"; Peilu Wang et al.; arXiv:1511.00215 [cs.CL]; 2015-11-01; entire document * |
"Joint Online Spoken Language Understanding and Language Modeling with Recurrent Neural Networks"; Bing Liu et al.; arXiv:1609.01462v1 [cs.CL]; 2016-09-06; entire document * |
"Research on Chinese Zero Anaphora Resolution Based on Word Vectors and LSTM"; 吴兵兵 (Wu Bingbing); China Master's Theses Full-text Database, Information Science and Technology; 2017-02-15 (No. 02); entire document * |
Also Published As
Publication number | Publication date |
---|---|
CN107452374A (zh) | 2017-12-08 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10854193B2 (en) | Methods, devices and computer-readable storage media for real-time speech recognition | |
Audhkhasi et al. | End-to-end ASR-free keyword search from speech | |
Audhkhasi et al. | Direct acoustics-to-word models for english conversational speech recognition | |
US11145293B2 (en) | Speech recognition with sequence-to-sequence models | |
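CN108492820B (zh) | Chinese speech recognition method based on a recurrent neural network language model and a deep neural network acoustic model | |
CN111883110B (zh) | Acoustic model training method, system, device, and medium for speech recognition | |
CN110556100B (zh) | Training method and system for an end-to-end speech recognition model | |
CN108170686B (zh) | Text translation method and apparatus | |
CN107452374B (zh) | Multi-view language recognition method based on unidirectional self-labeling auxiliary information | |
US20180349327A1 (en) | Text error correction method and apparatus based on recurrent neural network of artificial intelligence | |
US10714076B2 (en) | Initialization of CTC speech recognition with standard HMM | |
CN112331183B (zh) | Non-parallel corpus voice conversion method and system based on an autoregressive network | |
CN112037773B (zh) | N-best spoken language semantic recognition method, apparatus, and electronic device | |
CN106340297A (zh) | Speech recognition method and system based on cloud computing and confidence calculation | |
CN102063900A (zh) | Speech recognition method and system for overcoming confusable pronunciations | |
JP2023545988A (ja) | Transformer transducer: one model unifying streaming and non-streaming speech recognition | |
Tanaka et al. | Neural Error Corrective Language Models for Automatic Speech Recognition. | |
CN112509560B (zh) | Speech recognition adaptation method and system based on a cache language model | |
CN114596844A (zh) | Acoustic model training method, speech recognition method, and related device | |
KR20210141115A (ko) | Method and apparatus for estimating utterance time | |
WO2022028378A1 (zh) | Speech intent recognition method, apparatus, and device | |
Collobert et al. | Word-level speech recognition with a letter to word encoder | |
CN110992943A (zh) | Semantic understanding method and system based on word confusion networks | |
KR20230158608A (ko) | Multi-task learning for end-to-end automatic speech recognition confidence and deletion estimation | |
CN112967720B (zh) | End-to-end speech-to-text model optimization method with a small amount of heavily accented data |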
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20200629; Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120; Co-patentee after: AI SPEECH Co.,Ltd.; Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; Address before: 200240 Dongchuan Road, Shanghai, No. 800, No.; Co-patentee before: AI SPEECH Co.,Ltd.; Patentee before: SHANGHAI JIAO TONG University |
TR01 | Transfer of patent right | Effective date of registration: 20201030; Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu; Patentee after: AI SPEECH Co.,Ltd.; Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120; Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; Patentee before: AI SPEECH Co.,Ltd. |
CP01 | Change in the name or title of a patent holder | Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu; Patentee after: Sipic Technology Co.,Ltd.; Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu; Patentee before: AI SPEECH Co.,Ltd. |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: A Multi-perspective Language Recognition Method Based on Unidirectional Self-labeling Assisted Information; Effective date of registration: 20230726; Granted publication date: 20200505; Pledgee: CITIC Bank Limited by Share Ltd. Suzhou branch; Pledgor: Sipic Technology Co.,Ltd.; Registration number: Y2023980049433 |