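CN102945232B - Training corpus quality evaluation and selection method for statistical machine translation - Google Patents
Training corpus quality evaluation and selection method for statistical machine translation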
- Publication number
- CN102945232B (application CN201210469172.4A; also published as CN102945232A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- quality
- phrase
- translation
- quality assessment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
Description
Data | 2 | 1 | 0 | ALL |
---|---|---|---|---|
CWMT | 156,544 | 474,356 | 104,476 | 735,376 |
NIST | 919,143 | 121,460 | 8,670 | 1,049,273 |
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210469172.4A CN102945232B (zh) | 2012-11-16 | 2012-11-16 | Training corpus quality evaluation and selection method for statistical machine translation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210469172.4A CN102945232B (zh) | 2012-11-16 | 2012-11-16 | Training corpus quality evaluation and selection method for statistical machine translation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102945232A (zh) | 2013-02-27 |
CN102945232B (zh) | 2015-01-21 |
Family
ID=47728179
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210469172.4A Active CN102945232B (zh) | Training corpus quality evaluation and selection method for statistical machine translation | 2012-11-16 | 2012-11-16 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102945232B (zh) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
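JP2014235599A (ja) * | 2013-06-03 | 2014-12-15 | 独立行政法人情報通信研究機構 | Translation device, learning device, translation method, and program |
US9678939B2 (en) | 2013-12-04 | 2017-06-13 | International Business Machines Corporation | Morphology analysis for machine translation |
CN103631773A (zh) * | 2013-12-16 | 2014-03-12 | 哈尔滨工业大学 | Statistical machine translation method based on a domain similarity measure |
CN105446958A (zh) * | 2014-07-18 | 2016-03-30 | 富士通株式会社 | Word alignment method and word alignment device |
CN104731777B (zh) * | 2015-03-31 | 2019-02-01 | 网易有道信息技术(北京)有限公司 | Translation evaluation method and device |
CN105335358B (zh) * | 2015-11-18 | 2018-07-06 | 成都优译信息技术有限公司 | Corpus grade evaluation method for use in a translation system |
CN105512114B (zh) * | 2015-12-14 | 2018-06-15 | 清华大学 | Method and system for filtering parallel sentence pairs |
JP6620934B2 (ja) * | 2016-01-29 | 2019-12-18 | パナソニックIpマネジメント株式会社 | Translation support method, translation support device, translation device, and translation support program |
CN105930432B (zh) * | 2016-04-19 | 2020-01-07 | 北京百度网讯科技有限公司 | Training method and device for a sequence labeling tool |
CN107526727B (zh) * | 2017-07-31 | 2021-01-19 | 苏州大学 | Language generation method based on statistical machine translation |
CN107491444B (zh) * | 2017-08-18 | 2020-10-27 | 南京大学 | Parallelized word alignment method based on bilingual word embedding |
JP6969443B2 (ja) * | 2018-02-27 | 2021-11-24 | 日本電信電話株式会社 | Learning quality estimation device, method, and program |
CN108537246A (zh) * | 2018-02-28 | 2018-09-14 | 成都优译信息技术股份有限公司 | Method and system for classifying parallel corpora by translation quality |
CN110874536B (zh) * | 2018-08-29 | 2023-06-27 | 阿里巴巴集团控股有限公司 | Method for generating a corpus quality assessment model and method for assessing the mutual-translation quality of bilingual sentence pairs |
CN110929532B (zh) * | 2019-11-21 | 2023-03-21 | 腾讯科技(深圳)有限公司 | Data processing method, apparatus, device, and storage medium |
CN111178091B (zh) * | 2019-12-20 | 2023-05-09 | 沈阳雅译网络技术有限公司 | Multi-dimensional Chinese-English bilingual data cleaning method |
CN111159356B (zh) * | 2019-12-31 | 2023-06-09 | 重庆和贯科技有限公司 | Knowledge graph construction method based on teaching content |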
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102193912A (zh) * | 2010-03-12 | 2011-09-21 | 富士通株式会社 | Method for building a phrase segmentation model, statistical machine translation method, and decoder |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8655640B2 (en) * | 2011-03-02 | 2014-02-18 | Raytheon Bbn Technologies Corp. | Automatic word alignment |
Non-Patent Citations (4)
Title |
---|
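The Impact of Parsing Accuracy on Syntax-based SMT; Hao ZHANG et al.; Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on; 2010-08-23; pp. 1-4 * |
Training data selection and optimization for statistical translation systems based on information retrieval methods; 黄瑾 et al.; 《中文信息学报》 (Journal of Chinese Information Processing); 2008-05-26; vol. 22, no. 2; pp. 40-46 * |
Training corpus selection for statistical machine translation based on sentence-pair quality and coverage; 姚树杰 et al.; 《中文信息学报》 (Journal of Chinese Information Processing); 2011-08-04; vol. 25, no. 2; pp. 72-77 * |
A preliminary study of parallel corpus processing: a ranking model; 陈毅东 et al.; 《中文信息学报》 (Journal of Chinese Information Processing); 2006-04-25; vol. 20, no. z1; pp. 66-70 * |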
Also Published As
Publication number | Publication date |
---|---|
CN102945232A (zh) | 2013-02-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN102945232B (zh) | Training corpus quality evaluation and selection method for statistical machine translation | |
US20190188263A1 (en) | Word semantic embedding apparatus and method using lexical semantic network and homograph disambiguating apparatus and method using lexical semantic network and word embedding | |
CN107818164A (zh) | Intelligent question answering method and system | |
CN104731777A (zh) | Translation evaluation method and device | |
CN109344236A (zh) | Question similarity calculation method based on multiple features | |
KR20080021017A (ko) | Text-based document comparison | |
Chen et al. | Improving distributed representation of word sense via wordnet gloss composition and context clustering | |
US9646512B2 (en) | System and method for automated teaching of languages based on frequency of syntactic models | |
CN103869998B (zh) | Method and device for ranking candidates generated by an input method | |
CN111460820A (zh) | Named entity recognition method and device for the cyberspace security domain based on the pre-trained model BERT | |
CN110825850B (zh) | Natural language topic classification method and device | |
Shah et al. | Sentimental Analysis Using Supervised Learning Algorithms | |
CN111221962A (zh) | Text sentiment analysis method based on new-word expansion and complex sentence pattern expansion | |
Zhang et al. | HANSpeller++: A unified framework for Chinese spelling correction | |
CN110059220A (zh) | Movie recommendation method based on deep learning and Bayesian probabilistic matrix factorization | |
Jiang et al. | Enriching word embeddings with domain knowledge for readability assessment | |
Gomaa et al. | Arabic short answer scoring with effective feedback for students | |
Dubossarsky et al. | Coming to your senses: on controls and evaluation sets in polysemy research | |
CN112417119A (zh) | Open-domain question answering prediction method based on deep learning | |
Sadr et al. | Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures | |
Guo et al. | IJCNLP-2017 task 5: Multi-choice question answering in examinations | |
Pratap et al. | Talla at SemEval-2018 task 7: Hybrid loss optimization for relation classification using convolutional neural networks | |
Tran et al. | Ijs at textgraphs-16 natural language premise selection task: Will contextual information improve natural language premise selection? | |
Mahmoodvand et al. | Semi-supervised approach for Persian word sense disambiguation | |
Alwaneen et al. | Stacked dynamic memory-coattention network for answering why-questions in Arabic |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2022-02-14. Address after: 110004 1001 - (1103), block C, No. 78, Sanhao Street, Heping District, Shenyang City, Liaoning Province. Patentee after: Calf Yazhi (Shenyang) Technology Co.,Ltd. Address before: Room 1517, No. 55, Sanhao Street, Heping District, Shenyang, Liaoning 110003. Patentee before: SHENYANG YAYI NETWORK TECHNOLOGY CO.,LTD. |
TR01 | Transfer of patent right | Effective date of registration: 2022-07-15. Address after: 110004 11 / F, block C, Neusoft computer city, 78 Sanhao Street, Heping District, Shenyang City, Liaoning Province. Patentee after: SHENYANG YAYI NETWORK TECHNOLOGY CO.,LTD. Address before: 110004 1001 - (1103), block C, No. 78, Sanhao Street, Heping District, Shenyang City, Liaoning Province. Patentee before: Calf Yazhi (Shenyang) Technology Co.,Ltd. |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: Quality Evaluation and Selection of Training Corpus for statistical machine translation. Effective date of registration: 2023-05-08. Granted publication date: 2015-01-21. Pledgee: China Construction Bank Shenyang Hunnan sub branch. Pledgor: SHENYANG YAYI NETWORK TECHNOLOGY CO.,LTD. Registration number: Y2023210000101 |