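CN102550049A - Acquisition of out-of-vocabulary translations by dynamically learning extraction rules - Google Patents
Acquisition of out-of-vocabulary translations by dynamically learning extraction rules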
- Publication number
- CN102550049A
- Authority
- CN
- China
- Prior art keywords
- term
- bilingual
- translation
- candidate
- pattern
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/45—Example-based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/51—Translation evaluation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
Description
Claims (18)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2009/001078 WO2011035455A1 (en) | 2009-09-25 | 2009-09-25 | Acquisition of out-of-vocabulary translations by dynamically learning extraction rules |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102550049A (zh) | 2012-07-04 |
CN102550049B (zh) | 2016-05-25 |
Family
ID=43795271
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200980161654.XA Active CN102550049B (zh) | Acquisition of out-of-vocabulary translations by dynamically learning extraction rules | 2009-09-25 | 2009-09-25 |
Country Status (3)
Country | Link |
---|---|
US (1) | US8670974B2 (zh) |
CN (1) | CN102550049B (zh) |
WO (1) | WO2011035455A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111563387A (zh) * | 2019-02-12 | 2020-08-21 | Alibaba Group Holding Limited | Sentence similarity determination method and apparatus, and sentence translation method and apparatus |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8332205B2 (en) * | 2009-01-09 | 2012-12-11 | Microsoft Corporation | Mining transliterations for out-of-vocabulary query terms |
US9471565B2 (en) * | 2011-07-29 | 2016-10-18 | At&T Intellectual Property I, L.P. | System and method for locating bilingual web sites |
US8990066B2 (en) * | 2012-01-31 | 2015-03-24 | Microsoft Corporation | Resolving out-of-vocabulary words during machine translation |
US9176936B2 (en) * | 2012-09-28 | 2015-11-03 | International Business Machines Corporation | Transliteration pair matching |
CN103646117B (zh) * | 2013-12-27 | 2016-09-28 | Soochow University | Link-based method and system for identifying bilingual parallel web pages |
US10831999B2 (en) * | 2019-02-26 | 2020-11-10 | International Business Machines Corporation | Translation of ticket for resolution |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
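CN101308512A (zh) * | 2008-06-25 | 2008-11-19 | Beijing Kingsoft Software Co., Ltd. | Method and apparatus for extracting mutual translation pairs from web pages |
CN101425087A (zh) * | 2008-09-16 | 2009-05-06 | NetEase Youdao Information Technology (Beijing) Co., Ltd. | Method and system for constructing a dictionary |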
US20090182547A1 (en) * | 2008-01-16 | 2009-07-16 | Microsoft Corporation | Adaptive Web Mining of Bilingual Lexicon for Query Translation |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
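CN1452093A (zh) | 2003-04-21 | 2003-10-29 | 北京嘉盛联侨信息工程技术有限公司 | Method for bidirectional vocabulary translation using a single lexicon |
CN1452101A (zh) | 2003-04-21 | 2003-10-29 | 北京嘉盛联侨信息工程技术有限公司 | Method for bidirectional vocabulary translation and grouped word memorization using one lexicon |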
US7805289B2 (en) * | 2006-07-10 | 2010-09-28 | Microsoft Corporation | Aligning hierarchal and sequential document trees to identify parallel data |
US8306806B2 (en) * | 2008-12-02 | 2012-11-06 | Microsoft Corporation | Adaptive web mining of bilingual lexicon |
US8275604B2 (en) * | 2009-03-18 | 2012-09-25 | Microsoft Corporation | Adaptive pattern learning for bilingual data mining |
-
2009
- 2009-09-25 WO PCT/CN2009/001078 patent/WO2011035455A1/en active Application Filing
- 2009-09-25 US US12/922,154 patent/US8670974B2/en not_active Expired - Fee Related
- 2009-09-25 CN CN200980161654.XA patent/CN102550049B/zh active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090182547A1 (en) * | 2008-01-16 | 2009-07-16 | Microsoft Corporation | Adaptive Web Mining of Bilingual Lexicon for Query Translation |
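CN101308512A (zh) * | 2008-06-25 | 2008-11-19 | Beijing Kingsoft Software Co., Ltd. | Method and apparatus for extracting mutual translation pairs from web pages |
CN101425087A (zh) * | 2008-09-16 | 2009-05-06 | NetEase Youdao Information Technology (Beijing) Co., Ltd. | Method and system for constructing a dictionary |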
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
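CN111563387A (zh) * | 2019-02-12 | 2020-08-21 | Alibaba Group Holding Limited | Sentence similarity determination method and apparatus, and sentence translation method and apparatus |
CN111563387B (zh) * | 2019-02-12 | 2023-05-02 | Alibaba Group Holding Limited | Sentence similarity determination method and apparatus, and sentence translation method and apparatus |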
Also Published As
Publication number | Publication date |
---|---|
US20110178792A1 (en) | 2011-07-21 |
US8670974B2 (en) | 2014-03-11 |
HK1172186A1 (zh) | 2013-04-12 |
WO2011035455A1 (en) | 2011-03-31 |
CN102550049B (zh) | 2016-05-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
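CN108287858B (zh) | Semantic extraction method and apparatus for natural language | |
CN102053991B (zh) | Method and system for multilingual document retrieval | |
CN103198057B (zh) | Method and apparatus for automatically adding tags to documents | |
CN101694668B (zh) | Method and apparatus for determining web page structure similarity | |
CN102227724A (zh) | Machine learning for transliteration | |
Hämäläinen et al. | From the paft to the fiiture: a fully automatic NMT and word embeddings method for OCR post-correction | |
CN102550049A (zh) | Acquisition of out-of-vocabulary translations by dynamically learning extraction rules | |
US20150286634A1 (en) | Method and system for providing translated result | |
CN102402584A (zh) | Language identification in multilingual text | |
CN104102721A (zh) | Information recommendation method and apparatus | |
WO2009035863A2 (en) | Mining bilingual dictionaries from monolingual web pages | |
CN101996210A (zh) | Method and system for searching electronic maps | |
CN102779140A (zh) | Keyword acquisition method and apparatus | |
CN110516011B (zh) | Multi-source entity data fusion method, apparatus, and device | |
CN114067343B (zh) | Dataset construction method, model training method, and corresponding apparatus | |
CN110209781B (zh) | Text processing method, apparatus, and related device | |
CN104008093A (zh) | Method and system for Chinese name transliteration | |
CN107111618A (zh) | Linking thumbnails of images to web pages | |
CN103605690A (zh) | Apparatus and method for identifying advertising messages in instant messaging | |
US10296635B2 (en) | Auditing and augmenting user-generated tags for digital content | |
CN112765965A (zh) | Text multi-label classification method, apparatus, device, and storage medium | |
CN110866407B (zh) | Method, apparatus, and device for determining mutually translated texts and analyzing inter-text similarity | |
CN103455572A (zh) | Method and apparatus for obtaining the film and television subject in a web page | |
CN104462151A (zh) | Method and related apparatus for estimating web page publication time | |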
Sundriyal et al. | DESYR: definition and syntactic representation based claim detection on the web |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1172186; Country of ref document: HK |
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C41 | Transfer of patent application or patent right or utility model | ||
TR01 | Transfer of patent right | Effective date of registration: 20160802; Address after: California, USA; Patentee after: EXCALIBUR IP LLC; Address before: California, USA; Patentee before: Yahoo Corp. |
REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1172186; Country of ref document: HK |