CN102184171B - Machine translation checking method - Google Patents

Machine translation checking method

Info

Publication number
CN102184171B
Authority
CN
China
Prior art keywords
translation
similarity
threshold values
word
mechanical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110098977
Other languages
English (en)
Other versions
CN102184171A (zh)
Inventor
江潮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transn Iol Technology Co ltd
Original Assignee
TRANSN (BEIJING) INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TRANSN (BEIJING) INFORMATION TECHNOLOGY Co Ltd
Priority to CN 201110098977
Publication of CN102184171A
Application granted
Publication of CN102184171B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)

Abstract

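The invention discloses a machine translation checking method, comprising: starting a machine translation check, selecting the corpus to be checked, and defining the translation direction of the source text; saving the source text to a text file and translating it with a machine translation module, the machine translation module being used to translate the specified text content according to the translation direction; and checking the corpus translation against the machine translation result, marking the source text content that reaches the similarity according to the similarity threshold of the words in each sentence and the marking settings, and outputting a report. The invention reduces the time spent on interactive access among the modules of the system.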

Description

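Machine translation checking method
Technical Field
The invention relates to computer technology, and specifically to a machine translation checking method.
Background Art
Machine translation (MT) is the process of using a computer to convert one natural language into another. The software that performs this process is called a machine translation system. Ever since the French scientist Artsrouni proposed the idea of machine translation in the 1930s, using computers to produce accurate and fluent translations automatically between documents in different languages has been a goal pursued by scientists. Designing more intelligent translation software with statistical analysis has set off a new wave of research and development. Franz Och, a machine translation expert at Google, holds that the key to improving machine translation is to build into the software a database equivalent to roughly one million books. Sabatakakis, the president of Systran, has likewise said that the company uses statistical analysis when translating highly specialized technical documents such as patents, and that the current enthusiasm for statistical analysis in the machine translation community reflects, to some extent, the technical direction the market demands. At the same time, the two technical approaches to designing translation software, statistical analysis and syntactic analysis of sentence structure, have gradually begun to merge; statistical analysis based on phrases rather than single words can also handle semantic problems.
In the machine translation process described above, the machine translation result is used directly at translation time, yet such results are often inaccurate and fall short of commercial quality.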
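Summary of the Invention
The technical problem solved by the invention is to provide a machine translation checking method that reduces the time spent on interactive access among the modules of the system.
The technical solution is as follows:
A machine translation checking method, comprising:
starting a machine translation check, selecting the corpus to be checked, and defining the translation direction of the source text;
saving the source text to a text file and translating it with a machine translation module, the machine translation module being used to translate the specified text content according to the translation direction;
checking the corpus translation against the machine translation result, marking the source text content that reaches the similarity according to the similarity threshold of the words in each sentence and the marking settings, and outputting a report;
calculating a match rate, the match rate being computed comprehensively from the lengths of the two sentences, the word order, and the number of words;
in the checking and marking process, comparing the corpus translation with the machine translation result item by item, calculating the similarity threshold, and judging from the similarity threshold whether the corpus translation and the machine translation result exceed the similarity; if so, marking the source text content according to the marking settings;
the similarity comprises high similarity and general similarity: a similarity threshold between 0.8 and 1 is high similarity, and a similarity threshold between 0.6 and 0.8 is general similarity.
Further: in the checking and marking process, it is judged whether the loop has ended; if not, calculation of the match rate continues.
Further: if the current item is the last corpus entry, the loop has ended; if it is not the last corpus entry, the loop has not ended.
Further: the content of the output report comprises a serial number, the source text, the corpus translation, the machine translation, or the similarity.
Further: the similarity threshold lies in the range 0-100%.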
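The technical effects include:
1. The invention uses file-based translation: a single submission translates everything, which reduces the time spent on interactive access among the modules of the system and is convenient and fast.
2. Respective weight ratios are set, the similarity of the sentences is computed comprehensively, colors are marked according to the similarity, and the result is output as a report, which makes the presentation more intuitive.
3. The invention translates accurately and shortens translation time.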
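Brief Description of the Drawings
Fig. 1 is a flowchart of the machine translation checking method of the invention;
Fig. 2 is a flowchart of the checking and marking process of the invention.
Detailed Description
The invention segments the original translation and the machine translation into words and, from the similarity of the words and the corresponding positions of the words, generates a two-dimensional matrix that is used to mark the source text. The technical solution of the invention is described in detail below with reference to the drawings and preferred embodiments.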
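The two-dimensional matrix mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how words are segmented or how word similarity is measured, so whitespace splitting and exact-match similarity are used here, and the names `tokenize`, `word_similarity`, and `similarity_matrix` are hypothetical.

```python
from typing import List


def tokenize(sentence: str) -> List[str]:
    """Lowercase and split on whitespace; a stand-in for real word segmentation."""
    return sentence.lower().split()


def word_similarity(a: str, b: str) -> float:
    """Assumed word-level similarity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def similarity_matrix(reference: str, machine: str) -> List[List[float]]:
    """Rows follow the corpus-translation words, columns the machine-translation words."""
    ref_words = tokenize(reference)
    mt_words = tokenize(machine)
    return [[word_similarity(r, m) for m in mt_words] for r in ref_words]


# Each cell records how similar one word pair is at its (row, column) position.
for row in similarity_matrix("the cat sat", "the cat is sitting"):
    print(row)
```

Cells on or near the diagonal indicate words that agree in both content and position, which is the kind of information a marking step could use.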
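Fig. 1 shows the flowchart of the machine translation checking method of the invention, which is described in detail below. The source text of the corpus is translated by a machine translation engine, and the machine translation is then compared with the corpus translation.
Step 101: start the machine translation check;
Step 102: select the corpus to be checked, then select a translation engine and define the translation direction of the source text;
Step 103: machine translation.
The source text is saved to a text file and translated by the machine translation module, whose role is to translate the specified text content according to the translation direction.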
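The invention uses file-based translation, in which a single submission translates the whole file. The source text is saved to a text file, one sentence per line; the reference code is as follows: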
[Figure GDA00002009303100031: reference code listing, image not reproduced]
[Figure GDA00002009303100041: reference code listing, image not reproduced]
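Since the original listing survives only as an image reference, the following is a minimal sketch of the "one sentence per line" step, not the patent's own code. The function name `save_source_sentences`, the UTF-8 encoding, and the whitespace collapsing are assumptions.

```python
import os
import tempfile
from typing import Iterable


def save_source_sentences(sentences: Iterable[str], path: str) -> int:
    """Write one sentence per line (UTF-8) and return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for sentence in sentences:
            # Collapse internal whitespace, including line breaks,
            # so that every sentence occupies exactly one line.
            handle.write(" ".join(sentence.split()) + "\n")
            count += 1
    return count


demo_path = os.path.join(tempfile.mkdtemp(), "source.txt")
written = save_source_sentences(["第一句。", "第二句\n跨行。"], demo_path)
print(written)
```

Keeping a strict one-line-per-sentence layout means line N of the translated output can later be paired with line N of the source without extra bookkeeping.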
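The text file is then submitted to the backend for translation (the reference code listing is not reproduced in this text).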
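The single-submission idea can be sketched as below. The backend interface is not described in the text, so the engine is passed in as a callable; `BatchTranslator`, `translate_file`, `fake_engine`, and the direction string `"zh-en"` are all hypothetical names chosen for illustration.

```python
import os
import tempfile
from typing import Callable, List

# Hypothetical engine interface: receives every line plus a direction such as "zh-en".
BatchTranslator = Callable[[List[str], str], List[str]]


def translate_file(path: str, direction: str, translate_batch: BatchTranslator) -> List[str]:
    """Read every line of the file and hand them to the engine in one call."""
    with open(path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
    results = translate_batch(lines, direction)
    if len(results) != len(lines):
        raise ValueError("the engine must return one translation per source line")
    return results


def fake_engine(lines: List[str], direction: str) -> List[str]:
    """Placeholder engine for the demo: tags each line instead of translating it."""
    return [f"[{direction}] {line}" for line in lines]


demo_path = os.path.join(tempfile.mkdtemp(), "source.txt")
with open(demo_path, "w", encoding="utf-8") as handle:
    handle.write("第一句。\n第二句。\n")
print(translate_file(demo_path, "zh-en", fake_engine))
```

Calling the engine once for the whole file, rather than once per sentence, is what keeps the number of interactions between modules low.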
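Step 104: check, comparing the corpus translation with the machine translation result item by item;
Step 105: mark. According to the similarity threshold of the words in each sentence (the threshold lies in the range 0-100%) and the marking settings, mark the source text content that reaches the similarity (high similarity: 1 > threshold > 80%; general similarity: 80% > threshold > 60%). The mark may be a color (it appears in the report, not in the original corpus).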
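The two similarity bands of step 105 can be sketched as a small classifier. The text leaves the boundary values ambiguous, so this sketch assumes that a score of exactly 0.8 or 1.0 counts as high and exactly 0.6 counts as general; the color names and the function names `similarity_band` and `mark_color` are likewise assumptions.

```python
from typing import Optional

HIGH_THRESHOLD = 0.8     # from the text: high similarity lies between 0.8 and 1
GENERAL_THRESHOLD = 0.6  # from the text: general similarity lies between 0.6 and 0.8

# Assumed palette; the text only says that a color may serve as the mark.
MARK_COLORS = {"high": "red", "general": "yellow"}


def similarity_band(score: float) -> Optional[str]:
    """Return "high", "general", or None when the score is below both bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= GENERAL_THRESHOLD:
        return "general"
    return None


def mark_color(score: float) -> Optional[str]:
    """Map a score to its report color, or None when no mark applies."""
    band = similarity_band(score)
    return MARK_COLORS[band] if band else None


for value in (0.95, 0.7, 0.3):
    print(value, similarity_band(value), mark_color(value))
```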
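Step 106: output the report (contents include the serial number, source text, corpus translation, machine translation, similarity, and so on).
Colors are marked and the report is output (the reference code listing is not reproduced in this text).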
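A report with the listed columns could be rendered roughly as follows. The output format is not specified in the text; HTML, the color palette, and the names `row_color` and `render_report` are assumptions made for this sketch.

```python
import html
from typing import Iterable, Tuple

# (source text, corpus translation, machine translation, similarity in [0, 1])
Row = Tuple[str, str, str, float]


def row_color(score: float) -> str:
    """Assumed palette: red from 0.8 upward, yellow from 0.6 upward, otherwise none."""
    if score >= 0.8:
        return "red"
    if score >= 0.6:
        return "yellow"
    return ""


def render_report(rows: Iterable[Row]) -> str:
    """Build an HTML table; the color is applied in the report, not in the corpus."""
    lines = [
        "<table>",
        "<tr><th>No.</th><th>Source</th><th>Translation</th>"
        "<th>Machine translation</th><th>Similarity</th></tr>",
    ]
    for number, (source, reference, machine, score) in enumerate(rows, start=1):
        color = row_color(score)
        style = f' style="background-color:{color}"' if color else ""
        cells = [str(number), source, reference, machine, f"{score:.0%}"]
        tds = "".join(f"<td>{html.escape(cell)}</td>" for cell in cells)
        lines.append(f"<tr{style}>{tds}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(render_report([("你好", "Hello", "Hello", 1.0), ("再见", "Goodbye", "See <you>", 0.3)]))
```

Escaping every cell with `html.escape` keeps sentence text that contains `<` or `&` from breaking the table.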
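Fig. 2 shows the flowchart of the checking and marking process of the invention, which is described in detail below.
Step 201: receive the corpus translation and the machine translation result;
Step 202: calculate the match rate (match rate range: 0-100%, computed comprehensively from the lengths of the two sentences, the word order, and the number of words);
The match rate is calculated; the reference code is as follows: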
[Figure GDA00002009303100052: reference code listing, image not reproduced]
[Figure GDA00002009303100061: reference code listing, image not reproduced]
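Because the reference listing is not available, the following is only one possible way to combine sentence length, word count, and word order into a match rate in [0, 1]. It is not the patent's formula: the longest-common-subsequence order score, the weights 0.2/0.2/0.6, and the names `lcs_length`, `ratio`, and `match_rate` are all assumptions, and the weights are deliberately not the ratios recited in claim 1.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, used as an order-aware overlap."""
    previous = [0] * (len(b) + 1)
    for word_a in a:
        current = [0]
        for j, word_b in enumerate(b, start=1):
            if word_a == word_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def ratio(x: int, y: int) -> float:
    """Smaller value divided by the larger one; 1.0 when both are zero."""
    return 1.0 if max(x, y) == 0 else min(x, y) / max(x, y)


def match_rate(reference: str, machine: str,
               w_length: float = 0.2, w_count: float = 0.2, w_order: float = 0.6) -> float:
    """Weighted mix of character-length ratio, word-count ratio, and word-order overlap."""
    ref_words: List[str] = reference.lower().split()
    mt_words: List[str] = machine.lower().split()
    length_score = ratio(len(reference), len(machine))
    count_score = ratio(len(ref_words), len(mt_words))
    longest = max(len(ref_words), len(mt_words))
    order_score = 1.0 if longest == 0 else lcs_length(ref_words, mt_words) / longest
    return w_length * length_score + w_count * count_score + w_order * order_score


print(f"{match_rate('the cat sat on the mat', 'the cat sat on a mat'):.2f}")
```

With weights that sum to 1, identical sentences score 1.0, and reordering words lowers the score even when the same words are present.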
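Step 203: judge whether the similarity between the corpus translation and the machine translation result is greater than similarity N (high similarity: similarity threshold between 80% and 1; general similarity: similarity threshold between 60% and 80%; the threshold is the match rate calculated above); if so, go to step 204, otherwise go to step 205;
Step 204: mark the source text content according to the marking settings (the mark appears in the report, not in the original corpus);
Step 205: judge whether the loop has ended; if the current item is the last corpus entry, the loop ends and the process goes to step 206; if it is not the last corpus entry, the loop has not ended and the process returns to step 202.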
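Steps 201 to 205 amount to a loop over corpus entries, which can be sketched as follows. The scorer here is a stand-in built on the standard library's `difflib.SequenceMatcher`, not the match rate of step 202; the names `CheckedEntry`, `default_scorer`, and `check_and_mark`, and the default value 0.6 for N, are assumptions.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List, NamedTuple, Tuple


class CheckedEntry(NamedTuple):
    number: int
    source: str
    score: float
    marked: bool


def default_scorer(reference: str, machine: str) -> float:
    """Stand-in match rate: difflib ratio over whitespace-separated words."""
    return SequenceMatcher(None, reference.split(), machine.split()).ratio()


def check_and_mark(entries: Iterable[Tuple[str, str, str]],
                   threshold: float = 0.6,
                   scorer: Callable[[str, str], float] = default_scorer) -> List[CheckedEntry]:
    """Each entry is (source text, corpus translation, machine translation result)."""
    results = []
    for number, (source, reference, machine) in enumerate(entries, start=1):
        score = scorer(reference, machine)   # step 202: calculate the match rate
        marked = score > threshold           # step 203: compare with similarity N
        # Step 204: the mark is recorded for the report, the corpus itself is untouched.
        results.append(CheckedEntry(number, source, score, marked))
    # Step 205: the loop ends after the last corpus entry.
    return results


demo = [
    ("原文一", "the cat sat", "the cat sat"),
    ("原文二", "the cat sat", "a dog ran away"),
]
for entry in check_and_mark(demo):
    print(entry)
```

Passing the scorer as a parameter keeps the loop independent of how the match rate is actually computed.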

Claims (5)

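1. A machine translation checking method, comprising:
starting a machine translation check, selecting the corpus to be checked, and defining the translation direction of the source text;
saving the source text to a text file and translating it with a machine translation module, the machine translation module being used to translate the specified text content according to the translation direction;
checking the corpus translation against the machine translation result, marking the source text content that reaches the similarity according to the similarity threshold of the words in each sentence and the marking settings, and outputting a report;
calculating a match rate, the match rate being computed comprehensively from the lengths of the two sentences, the word order, and the number of words; wherein the effect on the match rate produced by differing words has a weighting of 0.65; when there is one extra word, the effect on the match rate produced by that word has a weighting of 0.75; and when a different word is found, the effect on the match rate produced by that word has a weighting of 0.1;
in the checking and marking process, comparing the corpus translation with the machine translation result item by item, calculating the similarity threshold, and judging from the similarity threshold whether the corpus translation and the machine translation result exceed the similarity; if so, marking the source text content according to the marking settings;
the similarity comprises high similarity and general similarity: a similarity threshold between 0.8 and 1 is the high similarity, and a similarity threshold between 0.6 and 0.8 is the general similarity.
2. The machine translation checking method according to claim 1, characterized in that: in the checking and marking process, it is judged whether the loop has ended; if not, calculation of the match rate continues.
3. The machine translation checking method according to claim 2, characterized in that: if the current item is the last corpus entry, the loop has ended; if it is not the last corpus entry, the loop has not ended.
4. The machine translation checking method according to any one of claims 1 to 3, characterized in that: the content of the output report comprises a serial number, the source text, the corpus translation, the machine translation, or the similarity.
5. The machine translation checking method according to claim 1, characterized in that: the similarity threshold lies in the range 0-100%.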
CN 201110098977 2011-04-20 2011-04-20 Machine translation checking method Active CN102184171B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110098977 CN102184171B (zh) 2011-04-20 2011-04-20 Machine translation checking method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110098977 CN102184171B (zh) 2011-04-20 2011-04-20 Machine translation checking method

Publications (2)

Publication Number Publication Date
CN102184171A (zh) 2011-09-14
CN102184171B (zh) 2013-08-14

Family

ID=44570348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110098977 Active CN102184171B (zh) 2011-04-20 2011-04-20 Machine translation checking method

Country Status (1)

Country Link
CN (1) CN102184171B (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4918174B1 (ja) 2011-09-20 2012-04-18 株式会社Pijin Information providing device, information providing method, and computer program
CN107301174B (zh) * 2017-06-22 2019-12-24 北京理工大学 Splicing-based integrated automatic post-editing system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
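CN1573739A (zh) * 2003-06-04 2005-02-02 株式会社国际电气通信基础技术研究所 Method and apparatus for improving translation knowledge for machine translation
CN1641631A (zh) * 2004-01-13 2005-07-20 中国科学院计算技术研究所 Automatic evaluation method for machine translation and system thereof
CN101520779A (zh) * 2009-04-17 2009-09-02 哈尔滨工业大学 Automatic diagnostic evaluation method for machine translation
CN101777044A (zh) * 2010-01-29 2010-07-14 中国科学院声学研究所 Automatic machine translation evaluation system using sentence structure information and implementation method
CN101923540A (zh) * 2010-07-20 2010-12-22 陈洁 Method for reviewing language translation quality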

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1661593B (zh) * 2004-02-24 2010-04-28 北京中专翻译有限公司 Computer language translation method and translation system thereof

Also Published As

Publication number Publication date
CN102184171A (zh) 2011-09-14

Similar Documents

Publication Publication Date Title
Ive et al. DeepQuest: a framework for neural-based quality estimation
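CN105808530B (zh) Translation method and device in statistical machine translation
CN102708098B (zh) Automatic bilingual word alignment method based on dependency coherence constraints
CN103365838A (zh) Method for automatically correcting grammatical errors in English compositions based on multiple features
CN107329961A (zh) Method for fast incremental fuzzy matching in a cloud translation memory
CN103020044A (zh) Machine-assisted web page translation method and system thereof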
Padó et al. Optimal constituent alignment with edge covers for semantic projection
Pagé-Perron et al. Machine translation and automated analysis of the Sumerian language
CN104731774A (zh) Personalized translation method and device for general-purpose machine translation engines
Lim et al. Multilingual dependency parsing for low-resource languages: Case studies on north saami and komi-zyrian
CN106339371B (zh) English-Chinese word sense mapping method and device based on word vectors
Hlaing et al. Improving neural machine translation with POS-tag features for low-resource language pairs
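CN106055633A (zh) Method for classifying subjective and objective sentences in Chinese microblogs
CN112836525A (zh) Machine translation system based on human-computer interaction and automatic optimization method thereof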
López-Ludeña et al. Automatic categorization for improving Spanish into Spanish Sign Language machine translation
CN108664464B (zh) Method and device for determining semantic relatedness
Sánchez-Martínez et al. Inferring shallow-transfer machine translation rules from small parallel corpora
Wax Automated grammar engineering for verbal morphology
Callison-Burch et al. Co-training for statistical machine translation
Zhang et al. Improved statistical machine translation by multiple Chinese word segmentation
Lavie Stat-XFER: A general search-based syntax-driven framework for machine translation
Li et al. Cultural concept adaptation on multimodal reasoning
CN102184171B (zh) Machine translation checking method
Dušek et al. Robust multilingual statistical morphological generation models
Hudík et al. The integration of moses into localization industry

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: TRANSN (CHINA) NETWORK TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: TRANSN (BEIJING) INFORMATION TECHNOLOGY CO., LTD.

Effective date: 20150624

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20150624

Address after: 430070, Optics Valley Software Park, East Lake Development Zone, Wuhan, Hubei, six South, South Lake Road, Optics Valley Software Park, 2, 4, 204 rooms

Patentee after: Vivid (China) Network Technology Co.,Ltd.

Address before: 100085 Beijing city Haidian District Qingyun aromatic garden Ting Building 9, Tsing Wun contemporary building seventeen 1707A1 room

Patentee before: TRANSN (BEIJING) INFORMATION TECHNOLOGY Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method for checking mechanical translation

Effective date of registration: 20150818

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: Vivid (China) Network Technology Co.,Ltd.

Registration number: 2015420000011

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20160823

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: Vivid (China) Network Technology Co.,Ltd.

Registration number: 2015420000011

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 430070, Optics Valley Software Park, East Lake Development Zone, Wuhan, Hubei, six South, South Lake Road, Optics Valley Software Park, two, 4, 204 rooms

Patentee after: TRANSN IOL TECHNOLOGY Co.,Ltd.

Address before: 430070, Optics Valley Software Park, East Lake Development Zone, Wuhan, Hubei, six South, South Lake Road, Optics Valley Software Park, 2, 4, 204 rooms

Patentee before: Vivid (China) Network Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method for checking mechanical translation

Effective date of registration: 20160926

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: 2016420000038

PLDC Enforcement, change and cancellation of contracts on pledge of patent right or utility model
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20170921

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: 2016420000038

PC01 Cancellation of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method for checking mechanical translation

Effective date of registration: 20170927

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: 2017420000031

PE01 Entry into force of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20180927

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: 2017420000031

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method for checking mechanical translation

Effective date of registration: 20180930

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: 2018420000053

PE01 Entry into force of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20190926

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: 2018420000053

PC01 Cancellation of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method for checking mechanical translation

Effective date of registration: 20190929

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: Y2019420000021

PC01 Cancellation of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20201009

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: Y2019420000021

PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Machine translation checking method

Effective date of registration: 20201016

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: Y2020420000071

PC01 Cancellation of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20211105

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: Y2020420000071

PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Machine translation checking method

Effective date of registration: 20211203

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: Y2021420000136

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20221227

Granted publication date: 20130814

Pledgee: Guanggu Branch of Wuhan Rural Commercial Bank Co.,Ltd.

Pledgor: TRANSN IOL TECHNOLOGY Co.,Ltd.

Registration number: Y2021420000136

PC01 Cancellation of the registration of the contract for pledge of patent right