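CN104112447A - Method and system for improving the accuracy of a statistical language model - Google Patents
Method and system for improving the accuracy of a statistical language model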
- Publication number
- CN104112447A (application number CN201410366038.0A)
- Authority
- CN
- China
- Prior art keywords
- language model
- training set
- parameter
- language
- probability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410366038.0A CN104112447B (zh) | 2014-07-28 | 2014-07-28 | Method and system for improving the accuracy of a statistical language model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410366038.0A CN104112447B (zh) | 2014-07-28 | 2014-07-28 | Method and system for improving the accuracy of a statistical language model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104112447A (zh) | 2014-10-22 |
CN104112447B (zh) | 2017-08-25 |
Family
ID=51709208
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410366038.0A Active CN104112447B (zh) | Method and system for improving the accuracy of a statistical language model | 2014-07-28 | 2014-07-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104112447B (zh) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
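CN101833547A (zh) * | 2009-03-09 | 2010-09-15 | Samsung Electronics (China) R&D Center | Method for phrase-level predictive input based on a personal corpus |
CN102509549A (zh) * | 2011-09-28 | 2012-06-20 | Shengle Information Technology (Shanghai) Co., Ltd. | Language model training method and system |
WO2012151255A1 (en) * | 2011-05-02 | 2012-11-08 | Vistaprint Technologies Limited | Statistical spell checker |
CN103294817A (zh) * | 2013-06-13 | 2013-09-11 | East China Normal University | Text feature extraction method based on category distribution probability |
CN103870447A (zh) * | 2014-03-11 | 2014-06-18 | Beijing Youjie Xinda Information Technology Co., Ltd. | Keyword extraction method based on a latent Dirichlet model |
CN103885938A (zh) * | 2014-04-14 | 2014-06-25 | Southeast University | Industry spelling error checking method based on user feedback |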
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
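CN101833547A (zh) * | 2009-03-09 | 2010-09-15 | Samsung Electronics (China) R&D Center | Method for phrase-level predictive input based on a personal corpus |
WO2012151255A1 (en) * | 2011-05-02 | 2012-11-08 | Vistaprint Technologies Limited | Statistical spell checker |
CN102509549A (zh) * | 2011-09-28 | 2012-06-20 | Shengle Information Technology (Shanghai) Co., Ltd. | Language model training method and system |
CN103294817A (zh) * | 2013-06-13 | 2013-09-11 | East China Normal University | Text feature extraction method based on category distribution probability |
CN103870447A (zh) * | 2014-03-11 | 2014-06-18 | Beijing Youjie Xinda Information Technology Co., Ltd. | Keyword extraction method based on a latent Dirichlet model |
CN103885938A (zh) * | 2014-04-14 | 2014-06-25 | Southeast University | Industry spelling error checking method based on user feedback |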
Also Published As
Publication number | Publication date |
---|---|
CN104112447B (zh) | 2017-08-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11741366B2 (en) | Compressed recurrent neural network models | |
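CN108363790B (zh) | Method, apparatus, device, and storage medium for evaluating comments | |
CN107480143B (zh) | Dialogue topic segmentation method and system based on contextual relevance | |
CN102156551B (zh) | Error correction method and system for character and word input | |
CN103678282B (zh) | Word segmentation method and device | |
KR101715118B1 (ko) | Deep learning encoding device and method for document sentiment classification | |
WO2019164818A1 (en) | Question answering from minimal context over documents | |
CN111079442A (zh) | Vectorized representation method and apparatus for documents, and computer device | |
CN111310440B (zh) | Text error correction method, device, and system | |
CN109829162A (zh) | Text word segmentation method and device | |
CN112395385B (zh) | Artificial-intelligence-based text generation method, apparatus, computer device, and medium | |
CN107292382A (zh) | Fixed-point quantization method for activation functions of neural network acoustic models | |
EP4131076A1 (en) | Serialized data processing method and device, and text processing method and device | |
JP6517537B2 (ja) | Word vector learning device, natural language processing device, method, and program | |
CN108021551B (zh) | Corpus expansion method and device | |
TWI567569B (zh) | Natural language processing systems, natural language processing methods, and natural language processing programs | |
CN110427608A (zh) | Chinese word vector representation learning method incorporating hierarchical pictophonetic features | |
CN108363688A (zh) | Named entity linking method incorporating prior information | |
US20230154161A1 (en) | Memory-optimized contrastive learning | |
Pham et al. | NNVLP: A neural network-based Vietnamese language processing toolkit | |
CN113763937A (zh) | Method, apparatus, device, and storage medium for generating a speech processing model | |
CN113505583A (zh) | Emotion-cause clause pair extraction method based on a semantic decision graph neural network | |
CN108021544B (zh) | Method, apparatus, and electronic device for classifying semantic relations of entity words | |
CN105335375A (zh) | Topic mining method and device | |
CN110705217A (zh) | Typo detection method and apparatus, computer storage medium, and electronic device |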
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2017-07-07; Address after: Room 288, Building H2, Phase II, Innovation Industrial Park, 2800 Innovation Avenue, High-tech Zone, Hefei, Anhui Province, 230088; Applicant after: Anhui Puji Information Technology Co., Ltd.; Address before: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province, 230088; Applicant before: IFLYTEK Co., Ltd. |
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | Address after: Room 288, Building H2, Phase II, Innovation Industrial Park, 2800 Innovation Avenue, High-tech Zone, Hefei, Anhui Province, 230088; Patentee after: ANHUI IFLYTEK MEDICAL INFORMATION TECHNOLOGY CO., LTD.; Address before: Room 288, Building H2, Phase II, Innovation Industrial Park, 2800 Innovation Avenue, High-tech Zone, Hefei, Anhui Province, 230088; Patentee before: Anhui Puji Information Technology Co., Ltd. |
CP03 | Change of name, title or address | Address after: Floors 23-24, Building A5, No. 666 Wangjiang West Road, High-tech Zone, Hefei, Anhui Province, 230088; Patentee after: Anhui Xunfei Medical Co., Ltd.; Address before: Room 288, Building H2, Phase II, Innovation Industrial Park, 2800 Innovation Avenue, High-tech Zone, Hefei, Anhui Province, 230088; Patentee before: ANHUI IFLYTEK MEDICAL INFORMATION TECHNOLOGY CO., LTD. |
CP01 | Change in the name or title of a patent holder | Address after: Floors 23-24, Building A5, No. 666 Wangjiang West Road, High-tech Zone, Hefei, Anhui Province, 230088; Patentee after: IFLYTEK Medical Technology Co., Ltd.; Address before: Floors 23-24, Building A5, No. 666 Wangjiang West Road, High-tech Zone, Hefei, Anhui Province, 230088; Patentee before: Anhui Xunfei Medical Co., Ltd. |