CN101194253B - Collocation translation from monolingual and available bilingual corpora - Google Patents
- Publication number
- CN101194253B
- Authority
- CN
- China
- Prior art keywords
- collocation
- translation
- language
- context
- probability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/45—Example-based machine translation; Alignment
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
Description
Claims (11)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/152,540 US20060282255A1 (en) | 2005-06-14 | 2005-06-14 | Collocation translation from monolingual and available bilingual corpora |
US11/152,540 | 2005-06-14 | ||
PCT/US2006/023182 WO2006138386A2 (en) | 2005-06-14 | 2006-06-14 | Collocation translation from monolingual and available bilingual corpora |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101194253A (zh) | 2008-06-04 |
CN101194253B (zh) | 2012-08-29 |
Family
ID=37525132
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2006800206987A Expired - Fee Related CN101194253B (zh) | 2005-06-14 | 2006-06-14 | Collocation translation from monolingual and available bilingual corpora |
Country Status (8)
Country | Link |
---|---|
US (1) | US20060282255A1 (zh) |
EP (1) | EP1889180A2 (zh) |
JP (1) | JP2008547093A (zh) |
KR (1) | KR20080014845A (zh) |
CN (1) | CN101194253B (zh) |
BR (1) | BRPI0611592A2 (zh) |
MX (1) | MX2007015438A (zh) |
WO (1) | WO2006138386A2 (zh) |
Families Citing this family (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060116865A1 (en) | 1999-09-17 | 2006-06-01 | Www.Uniscape.Com | E-services translation utilizing machine translation and translation memory |
US7904595B2 (en) | 2001-01-18 | 2011-03-08 | Sdl International America Incorporated | Globalization management system and method therefor |
US7574348B2 (en) * | 2005-07-08 | 2009-08-11 | Microsoft Corporation | Processing collocation mistakes in documents |
US20070016397A1 (en) * | 2005-07-18 | 2007-01-18 | Microsoft Corporation | Collocation translation using monolingual corpora |
US10319252B2 (en) | 2005-11-09 | 2019-06-11 | Sdl Inc. | Language capability assessment and training apparatus and techniques |
US7865352B2 (en) * | 2006-06-02 | 2011-01-04 | Microsoft Corporation | Generating grammatical elements in natural language sentences |
US8209163B2 (en) * | 2006-06-02 | 2012-06-26 | Microsoft Corporation | Grammatical element generation in machine translation |
US7774193B2 (en) * | 2006-12-05 | 2010-08-10 | Microsoft Corporation | Proofing of word collocation errors based on a comparison with collocations in a corpus |
US20080168049A1 (en) * | 2007-01-08 | 2008-07-10 | Microsoft Corporation | Automatic acquisition of a parallel corpus from a network |
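JP5342760B2 (ja) * | 2007-09-03 | 2013-11-13 | 株式会社東芝 | Apparatus, method, and program for creating data for translation-word learning |
KR100911619B1 (ko) | 2007-12-11 | 2009-08-12 | 한국전자통신연구원 | Method and apparatus for constructing English vocabulary patterns in an automatic translation system |
TWI403911B (zh) * | 2008-11-28 | 2013-08-01 | Inst Information Industry | Chinese dictionary construction apparatus and method, and storage medium |
CN102117284A (zh) * | 2009-12-30 | 2011-07-06 | 安世亚太科技(北京)有限公司 | Method for cross-language knowledge retrieval |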
US10417646B2 (en) | 2010-03-09 | 2019-09-17 | Sdl Inc. | Predicting the cost associated with translating textual content |
KR101762866B1 (ko) * | 2010-11-05 | 2017-08-16 | 에스케이플래닛 주식회사 | Machine translation device and machine translation method in which a syntax conversion model and a word translation model are combined |
US10657540B2 (en) | 2011-01-29 | 2020-05-19 | Sdl Netherlands B.V. | Systems, methods, and media for web content management |
US9547626B2 (en) | 2011-01-29 | 2017-01-17 | Sdl Plc | Systems, methods, and media for managing ambient adaptability of web applications and web services |
US8838433B2 (en) | 2011-02-08 | 2014-09-16 | Microsoft Corporation | Selection of domain-adapted translation subcorpora |
US10580015B2 (en) | 2011-02-25 | 2020-03-03 | Sdl Netherlands B.V. | Systems, methods, and media for executing and optimizing online marketing initiatives |
US8527259B1 (en) * | 2011-02-28 | 2013-09-03 | Google Inc. | Contextual translation of digital content |
US10140320B2 (en) | 2011-02-28 | 2018-11-27 | Sdl Inc. | Systems, methods, and media for generating analytical data |
US9984054B2 (en) | 2011-08-24 | 2018-05-29 | Sdl Inc. | Web interface including the review and manipulation of a web document and utilizing permission based control |
US9773270B2 (en) | 2012-05-11 | 2017-09-26 | Fredhopper B.V. | Method and system for recommending products based on a ranking cocktail |
US10261994B2 (en) | 2012-05-25 | 2019-04-16 | Sdl Inc. | Method and system for automatic management of reputation of translators |
US10452740B2 (en) | 2012-09-14 | 2019-10-22 | Sdl Netherlands B.V. | External content libraries |
US11308528B2 (en) | 2012-09-14 | 2022-04-19 | Sdl Netherlands B.V. | Blueprinting of multimedia assets |
US11386186B2 (en) | 2012-09-14 | 2022-07-12 | Sdl Netherlands B.V. | External content library connector systems and methods |
US9916306B2 (en) | 2012-10-19 | 2018-03-13 | Sdl Inc. | Statistical linguistic analysis of source content |
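CN102930031B (zh) * | 2012-11-08 | 2015-10-07 | 哈尔滨工业大学 | Method and system for extracting bilingual parallel text from web pages |
CN103577399B (zh) * | 2013-11-05 | 2018-01-23 | 北京百度网讯科技有限公司 | Data expansion method and apparatus for bilingual corpora |
CN103714055B (zh) * | 2013-12-30 | 2017-03-15 | 北京百度网讯科技有限公司 | Method and apparatus for automatically extracting a bilingual dictionary from pictures |
CN103678714B (zh) * | 2013-12-31 | 2017-05-10 | 北京百度网讯科技有限公司 | Method and apparatus for constructing an entity knowledge base |
CN105068998B (zh) * | 2015-07-29 | 2017-12-15 | 百度在线网络技术(北京)有限公司 | Translation method and apparatus based on a neural network model |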
US10614167B2 (en) | 2015-10-30 | 2020-04-07 | Sdl Plc | Translation review workflow systems and methods |
JP6705318B2 (ja) * | 2016-07-14 | 2020-06-03 | 富士通株式会社 | Bilingual dictionary creation apparatus, bilingual dictionary creation method, and bilingual dictionary creation program |
US10635863B2 (en) | 2017-10-30 | 2020-04-28 | Sdl Inc. | Fragment recall and adaptive automated translation |
US10817676B2 (en) | 2017-12-27 | 2020-10-27 | Sdl Inc. | Intelligent routing services and systems |
US10984196B2 (en) * | 2018-01-11 | 2021-04-20 | International Business Machines Corporation | Distributed system for evaluation and feedback of digital text-based content |
CN108549637A (zh) * | 2018-04-19 | 2018-09-18 | 京东方科技集团股份有限公司 | Pinyin-based semantic recognition method and apparatus, and human-machine dialogue system |
US11256867B2 (en) | 2018-10-09 | 2022-02-22 | Sdl Inc. | Systems and methods of machine learning for digital assets and message creation |
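CN111428518B (zh) * | 2019-01-09 | 2023-11-21 | 科大讯飞股份有限公司 | Low-frequency word translation method and apparatus |
CN110728154B (zh) * | 2019-08-28 | 2023-05-26 | 云知声智能科技股份有限公司 | Method for constructing a semi-supervised general-purpose neural machine translation model |
WO2023128170A1 (ko) * | 2021-12-28 | 2023-07-06 | 삼성전자 주식회사 | Electronic device, method for controlling an electronic device, and recording medium on which a program is recorded |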
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4868750A (en) * | 1987-10-07 | 1989-09-19 | Houghton Mifflin Company | Collocational grammar system |
US5850561A (en) * | 1994-09-23 | 1998-12-15 | Lucent Technologies Inc. | Glossary construction tool |
GB2334115A (en) * | 1998-01-30 | 1999-08-11 | Sharp Kk | Processing text eg for approximate translation |
US6092034A (en) * | 1998-07-27 | 2000-07-18 | International Business Machines Corporation | Statistical translation system and method for fast sense disambiguation and translation of large corpora using fertility models and sense models |
GB9821787D0 (en) * | 1998-10-06 | 1998-12-02 | Data Limited | Apparatus for classifying or processing data |
US6885985B2 (en) * | 2000-12-18 | 2005-04-26 | Xerox Corporation | Terminology translation for unaligned comparable corpora using category based translation probabilities |
US7734459B2 (en) * | 2001-06-01 | 2010-06-08 | Microsoft Corporation | Automatic extraction of transfer mappings from bilingual corpora |
JP4304268B2 (ja) * | 2001-08-10 | 2009-07-29 | 独立行政法人情報通信研究機構 | Algorithm, apparatus, and program for generating third-language text from multilingual parallel-text input |
US20030154071A1 (en) * | 2002-02-11 | 2003-08-14 | Shreve Gregory M. | Process for the document management and computer-assisted translation of documents utilizing document corpora constructed by intelligent agents |
CA2487739A1 (en) * | 2002-05-28 | 2003-12-04 | Vladimir Vladimirovich Nasypny | Method for synthesising a self-learning system for knowledge acquisition for text-retrieval systems |
KR100530154B1 (ko) * | 2002-06-07 | 2005-11-21 | 인터내셔널 비지네스 머신즈 코포레이션 | Method and apparatus for generating a transfer dictionary used in a transfer-based machine translation system |
US7031911B2 (en) * | 2002-06-28 | 2006-04-18 | Microsoft Corporation | System and method for automatic detection of collocation mistakes in documents |
US7349839B2 (en) * | 2002-08-27 | 2008-03-25 | Microsoft Corporation | Method and apparatus for aligning bilingual corpora |
US7194455B2 (en) * | 2002-09-19 | 2007-03-20 | Microsoft Corporation | Method and system for retrieving confirming sentences |
US7249012B2 (en) * | 2002-11-20 | 2007-07-24 | Microsoft Corporation | Statistical method and apparatus for learning translation relationships among phrases |
JP2004326584A (ja) * | 2003-04-25 | 2004-11-18 | Nippon Telegr & Teleph Corp <Ntt> | Bilingual named-entity extraction apparatus and method, and bilingual named-entity extraction program |
US7346487B2 (en) * | 2003-07-23 | 2008-03-18 | Microsoft Corporation | Method and apparatus for identifying translations |
US7454393B2 (en) * | 2003-08-06 | 2008-11-18 | Microsoft Corporation | Cost-benefit approach to automatically composing answers to questions by extracting information from large unstructured corpora |
US7689412B2 (en) * | 2003-12-05 | 2010-03-30 | Microsoft Corporation | Synonymous collocation extraction using translation information |
US20070016397A1 (en) * | 2005-07-18 | 2007-01-18 | Microsoft Corporation | Collocation translation using monolingual corpora |
2005
- 2005-06-14: US application US11/152,540, published as US20060282255A1 (en); status: not active, Abandoned
2006
- 2006-06-14: JP application JP2008517071A, published as JP2008547093A (ja); status: not active, Ceased
- 2006-06-14: WO application PCT/US2006/023182, published as WO2006138386A2 (en); status: active, Application Filing
- 2006-06-14: MX application MX2007015438A, published as MX2007015438A (es); status: not active, Application Discontinuation
- 2006-06-14: CN application CN2006800206987A, published as CN101194253B (zh); status: not active, Expired - Fee Related
- 2006-06-14: KR application KR1020077028750A, published as KR20080014845A (ko); status: not active, Application Discontinuation
- 2006-06-14: BR application BRPI0611592-6A, published as BRPI0611592A2 (pt); status: not active, IP Right Cessation
- 2006-06-14: EP application EP06784886A, published as EP1889180A2 (en); status: not active, Withdrawn
Non-Patent Citations (7)
Title |
---|
----.----. ACL’03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics. 2003, vol. 1, ----. * |
----.----. Presentations at DARPA IAO Machine Translation Workshop. 2002, ----. * |
----.----. Computer Science (计算机科学). 1995, vol. 22, no. 4, ----. * |
Franz Josef Och, Hermann Ney. Discriminative Training and Maximum Entropy Models for Statistical Machine Translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL). 2002, pp. 295-302. * |
Hua Wu, Ming Zhou. Synonymous Collocation Extraction Using Translation Information. ACL’03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 2003, vol. 1, pp. 1-8. * |
Philipp Koehn, Franz Josef Och, Daniel Marcu. Statistical Phrase-Based Translation. Presentations at DARPA IAO Machine Translation Workshop. 2002, main text, Section 4.5, paragraph 2. * |
Zhou Qiang. Corpus-based and statistics-oriented natural language processing techniques. Computer Science (计算机科学). 1995, vol. 22, no. 4, pp. 36-40. * |
Also Published As
Publication number | Publication date |
---|---|
US20060282255A1 (en) | 2006-12-14 |
WO2006138386A3 (en) | 2007-12-27 |
JP2008547093A (ja) | 2008-12-25 |
WO2006138386A2 (en) | 2006-12-28 |
EP1889180A2 (en) | 2008-02-20 |
BRPI0611592A2 (pt) | 2010-09-21 |
KR20080014845A (ko) | 2008-02-14 |
CN101194253A (zh) | 2008-06-04 |
MX2007015438A (es) | 2008-02-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101194253B (zh) | Collocation translation from monolingual and available bilingual corpora | |
US7689412B2 (en) | Synonymous collocation extraction using translation information | |
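JP4237001B2 (ja) | System and method for automatically detecting collocation mistakes in documents | |
CN102084417B (zh) | System and method for maintaining speech-to-speech translation in the field | |
KR101004515B1 (ko) | Computer-implemented method of providing sentences from a sentence database to a user, tangible computer-readable recording medium storing computer-executable instructions for performing the method, and computer-readable recording medium storing a system for retrieving confirming sentences from a sentence database | |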
US8209163B2 (en) | Grammatical element generation in machine translation | |
US8543563B1 (en) | Domain adaptation for query translation | |
CN103154939B (zh) | Statistical machine translation method using dependency forest | |
US20130006954A1 (en) | Translation system adapted for query translation via a reranking framework | |
US20130226556A1 (en) | Machine translation device and machine translation method in which a syntax conversion model and a word translation model are combined | |
US8874433B2 (en) | Syntax-based augmentation of statistical machine translation phrase tables | |
Tsvetkov et al. | Cross-lingual bridges with models of lexical borrowing | |
US9311299B1 (en) | Weakly supervised part-of-speech tagging with coupled token and type constraints | |
US9442922B2 (en) | System and method for incrementally updating a reordering model for a statistical machine translation system | |
KR20160133349A (ko) | Method for generating a phrase table and machine translation method using the phrase table | |
Kouremenos et al. | A novel rule based machine translation scheme from Greek to Greek Sign Language: Production of different types of large corpora and Language Models evaluation | |
Prabhakar et al. | Machine transliteration and transliterated text retrieval: a survey | |
Fung et al. | Multilingual spoken language processing | |
US20070016397A1 (en) | Collocation translation using monolingual corpora | |
Chung et al. | Sentence‐Chain Based Seq2seq Model for Corpus Expansion | |
Musleh et al. | Enabling medical translation for low-resource languages | |
Tyers et al. | Developing prototypes for machine translation between two Sámi languages | |
JP2005284723A (ja) | Natural language processing system, natural language processing method, and computer program | |
Weiner | Pronominal anaphora in machine translation | |
KR102143158B1 (ko) | Information processing system using Korean syntactic analysis | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
ASS | Succession or assignment of patent right | Owner name: MICROSOFT TECHNOLOGY LICENSING LLC; Free format text: FORMER OWNER: MICROSOFT CORP.; Effective date: 20150422 |
C41 | Transfer of patent application or patent right or utility model | |
TR01 | Transfer of patent right | Effective date of registration: 20150422; Address after: Washington State; Patentee after: Microsoft Technology Licensing LLC; Address before: Washington State; Patentee before: Microsoft Corp. |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120829; Termination date: 20190614 |