CN103729350A - Multi-dimension preprocessing method for files to be translated - Google Patents
- Publication number
- CN103729350A (application number CN201310752261.4A)
- Authority
- CN
- China
- Prior art keywords
- translated
- document
- paragraph
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
Description
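Feature word | Term frequency of the feature word | Paragraph attribute of the feature word | Positions of the feature word within the paragraph |
---|---|---|---|
keyword1 | tf1 | SegNum1 | Loc1-1, Loc1-2, … |
keyword2 | tf2 | SegNum1 | Loc2-1, Loc2-2, … |
… | … | … | … |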
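The table above can be read as one record per feature word and paragraph. The following is a minimal sketch of how such records might be built, not the patented procedure itself. The names `FeatureRecord` and `build_feature_table` are hypothetical, and two assumptions are made that the source does not confirm: term frequency (`tf`) is counted per paragraph, and tokens are split on whitespace (Chinese text would need a word segmenter instead).

```python
from dataclasses import dataclass, field


@dataclass
class FeatureRecord:
    """One table row: a feature word within one paragraph (hypothetical layout)."""
    word: str
    seg_num: int  # 1-based paragraph number (SegNum)
    positions: list[int] = field(default_factory=list)  # 1-based token offsets (Loc)

    @property
    def tf(self) -> int:
        # Assumption: term frequency is counted within the paragraph.
        return len(self.positions)


def build_feature_table(paragraphs: list[str], feature_words: set[str]) -> list[FeatureRecord]:
    """Scan each paragraph and record where each feature word occurs."""
    records: list[FeatureRecord] = []
    for seg_num, text in enumerate(paragraphs, start=1):
        by_word: dict[str, FeatureRecord] = {}
        # Assumption: whitespace tokenization stands in for real word segmentation.
        for loc, token in enumerate(text.lower().split(), start=1):
            if token in feature_words:
                rec = by_word.setdefault(token, FeatureRecord(token, seg_num))
                rec.positions.append(loc)
        records.extend(by_word.values())
    return records


if __name__ == "__main__":
    doc = ["the pump feeds the valve and the pump stops", "the valve closes"]
    for rec in build_feature_table(doc, {"pump", "valve"}):
        print(rec.word, rec.tf, rec.seg_num, rec.positions)
```

Keeping the positions as the primary data and deriving `tf` from them avoids the two columns drifting out of sync.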
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310752261.4A CN103729350B (zh) | 2013-12-30 | 2013-12-30 | Multi-dimension preprocessing method for files to be translated |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103729350A (zh) | 2014-04-16 |
CN103729350B (zh) | 2017-01-04 |
Family
ID=50453428
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310752261.4A Active CN103729350B (zh) | Multi-dimension preprocessing method for files to be translated | 2013-12-30 | 2013-12-30 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103729350B (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104484323A (zh) * | 2014-12-26 | 2015-04-01 | Wuhan Transn Information Technology Co., Ltd. | Translation processing method based on document fragments |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
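CN101079028A (zh) * | 2007-05-29 | 2007-11-28 | Institute of Computing Technology, Chinese Academy of Sciences | Online translation model selection method for statistical machine translation |
CN103049568A (zh) * | 2012-12-31 | 2013-04-17 | Wuhan Transn Information Technology Co., Ltd. | Method for classifying documents in a massive document library |
CN103064970A (zh) * | 2012-12-31 | 2013-04-24 | Wuhan Transn Information Technology Co., Ltd. | Retrieval method for optimizing translators |
CN103106245A (zh) * | 2012-12-31 | 2013-05-15 | Wuhan Transn Information Technology Co., Ltd. | Method for automatic fragmentation and classification of translation manuscripts based on a large-scale terminology corpus |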
Non-Patent Citations (2)
Title |
---|
BAKER MONA: "Corpora in translation studies: an overview and some suggestions for future research", 《TARGET》, 31 December 1995 (1995-12-31) * |
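QIAN ZHIYING: "Design of a Chinese-English/English-Chinese parallel translation corpus and its application in translation", 《China Excellent Doctoral and Master's Theses Full-text Database, Philosophy and Humanities Series》, no. 5, 15 September 2005 (2005-09-15) * |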
Also Published As
Publication number | Publication date |
---|---|
CN103729350B (zh) | 2017-01-04 |
Similar Documents
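Publication | Publication Date | Title |
---|---|---|
CN106055538B (zh) | | Automatic text label extraction method combining topic model and semantic analysis |
CN103744834B (zh) | | Method for accurate assignment of translation tasks |
CN106649260B (zh) | | Method for constructing a product feature structure tree based on review text mining |
CN109446423B (zh) | | System and method for sentiment judgment of news and text |
CN106651696B (zh) | | Method and system for pushing similar questions |
CN103729421B (zh) | | Method for precise matching of translators and documents |
CN107122413A (zh) | | Keyword extraction method and device based on a graph model |
CN106997382A (zh) | | Big-data-based method and system for automatic tagging of innovation and creative ideas |
CN108920482B (zh) | | Microblog short-text classification method based on lexical chain feature expansion and the LDA model |
CN110472203B (zh) | | Article duplicate-checking detection method, device, equipment, and storage medium |
Noaman et al. | | Naive Bayes classifier based Arabic document categorization |
Kumar et al. | | Legal document summarization using latent dirichlet allocation |
CN107526841A (zh) | | Web-based automatic summary generation method for Tibetan text |
CN112667806B (zh) | | Text classification and screening method using LDA |
Wahbeh et al. | | Comparative assessment of the performance of three WEKA text classifiers applied to arabic text |
CN111695358A (zh) | | Method and device for generating word vectors, computer storage medium, and electronic equipment |
CN102360436B (zh) | | Component-based recognition method for online handwritten Tibetan characters |
CN109062895A (zh) | | Intelligent semantic processing method |
CN110929022A (zh) | | Text summary generation method and system |
CN103744840B (zh) | | Method for analyzing document translation difficulty |
Glaser et al. | | Sentence Boundary Detection in German Legal Documents. |
CN109815328B (zh) | | Summary generation method and device |
CN103729348B (zh) | | Method for analyzing sentence translation complexity |
CN103714051B (zh) | | Preprocessing method for documents to be translated |
Saini et al. | | Intrinsic plagiarism detection system using stylometric features and DBSCAN |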
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB02 | Change of applicant information | Address after: 430073 East Lake Hubei Development Zone, Optics Valley Software Park, a phase of the west, South Lake Road South, Optics Valley Software Park, No. 2, No. 5, layer 205, six; Applicant after: Language network (Wuhan) Information Technology Co., Ltd.; Address before: 430073 East Lake Hubei Development Zone, Optics Valley Software Park, a phase of the west, South Lake Road South, Optics Valley Software Park, No. 2, No. 5, layer 205, six; Applicant before: Wuhan Transn Information Technology Co., Ltd. |
| | CB03 | Change of inventor or designer information | Inventor after: Jiang Chao, Zhang Pi; Inventor before: Jiang Chao |
| | COR | Change of bibliographic data | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: Multi-dimension preprocessing method for files to be translated; Effective date of registration: 20181115; Granted publication date: 20170104; Pledgee: Bank of Communications Co., Ltd. Wuhan Branch of Hubei Free Trade Experimental Zone; Pledgor: Language network (Wuhan) Information Technology Co., Ltd.; Registration number: 2018420000061 |
| | PC01 | Cancellation of the registration of the contract for pledge of patent right | Date of cancellation: 20200617; Granted publication date: 20170104; Pledgee: Bank of Communications Co., Ltd. Wuhan Branch of Hubei Free Trade Experimental Zone; Pledgor: IOL (WUHAN) INFORMATION TECHNOLOGY Co., Ltd.; Registration number: 2018420000061 |