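WO2012030053A3 - Apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus
Apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus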
- Publication number
- WO2012030053A3 (PCT/KR2011/003832)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- idiomatic expression
- recognizing
- parallel corpus
- expression
- phrase alignment
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/191—Automatic line break hyphenation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/44—Statistical methods, e.g. probability models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/45—Example-based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/49—Data-driven translation using very large corpora, e.g. the web
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/820,199 US20140303955A1 (en) | 2010-09-02 | 2011-05-25 | Apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2010-0085959 | 2010-09-02 | ||
KR1020100085959A KR101745349B1 (ko) | 2010-09-02 | 2010-09-02 | Apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2012030053A2 (ko) | 2012-03-08 |
WO2012030053A3 (ko) | 2012-04-19 |
Family
ID=45773336
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2011/003832 WO2012030053A2 (ko) | 2010-09-02 | 2011-05-25 | Apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus |
Country Status (3)
Country | Link |
---|---|
US (1) | US20140303955A1 (ko) |
KR (1) | KR101745349B1 (ko) |
WO (1) | WO2012030053A2 (ko) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9785704B2 (en) * | 2012-01-04 | 2017-10-10 | Microsoft Technology Licensing, Llc | Extracting query dimensions from search results |
KR102013230B1 (ko) | 2012-10-31 | 2019-08-23 | Eleven Street Co., Ltd. | Apparatus and method for syntactic parsing based on syntactic preprocessing |
US10347240B2 (en) * | 2015-02-26 | 2019-07-09 | Nantmobile, Llc | Kernel-based verbal phrase splitting devices and methods |
CN106202068B (zh) * | 2016-07-25 | 2019-01-22 | Harbin Institute of Technology | Machine translation method using semantic vectors based on multilingual parallel corpora |
US11288452B2 (en) * | 2019-07-26 | 2022-03-29 | Beijing Didi Infinity Technology And Development Co., Ltd. | Dual monolingual cross-entropy-delta filtering of noisy parallel data and use thereof |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
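KR19990047856A (ko) * | 1997-12-05 | 1999-07-05 | 정선종 | Multilingual idiom recognition system for a multilingual machine translation apparatus |
KR20010027882A (ko) * | 1999-09-16 | 2001-04-06 | 정선종 | Apparatus and method for recognizing phrase-level idioms based on translation sentence frames |
KR20030094632A (ko) * | 2002-06-07 | 2003-12-18 | International Business Machines Corporation | Method and apparatus for generating a transfer dictionary used in a transfer-based machine translation system |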
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6161083A (en) * | 1996-05-02 | 2000-12-12 | Sony Corporation | Example-based translation method and system which calculates word similarity degrees, a priori probability, and transformation probability to determine the best example for translation |
JP2005527894A (ja) * | 2002-03-28 | 2005-09-15 | University of Southern California | Statistical machine translation |
US7249012B2 (en) * | 2002-11-20 | 2007-07-24 | Microsoft Corporation | Statistical method and apparatus for learning translation relationships among phrases |
US7765098B2 (en) * | 2005-04-26 | 2010-07-27 | Content Analyst Company, Llc | Machine translation using vector space representations |
US7536295B2 (en) * | 2005-12-22 | 2009-05-19 | Xerox Corporation | Machine translation using non-contiguous fragments of text |
US7657421B2 (en) * | 2006-06-28 | 2010-02-02 | International Business Machines Corporation | System and method for identifying and defining idioms |
US8594992B2 (en) * | 2008-06-09 | 2013-11-26 | National Research Council Of Canada | Method and system for using alignment means in matching translation |
US8244519B2 (en) * | 2008-12-03 | 2012-08-14 | Xerox Corporation | Dynamic translation memory using statistical machine translation |
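KR101266361B1 (ko) * | 2009-09-10 | 2013-05-22 | Electronics and Telecommunications Research Institute | Automatic translation system and automatic translation method based on a structured translation memory |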
US8548796B2 (en) * | 2010-01-20 | 2013-10-01 | Xerox Corporation | Statistical machine translation system and method for translation of text into languages which produce closed compound words |
US8543374B2 (en) * | 2010-08-12 | 2013-09-24 | Xerox Corporation | Translation system combining hierarchical and phrase-based models |
2010
- 2010-09-02: KR application KR1020100085959A, patent KR101745349B1 (ko), active (IP Right Grant)
2011
- 2011-05-25: US application US13/820,199, publication US20140303955A1 (en), not active (Abandoned)
- 2011-05-25: WO application PCT/KR2011/003832, publication WO2012030053A2 (ko), active (Application Filing)
Also Published As
Publication number | Publication date |
---|---|
KR101745349B1 (ko) | 2017-06-09 |
KR20120022390A (ko) | 2012-03-12 |
US20140303955A1 (en) | 2014-10-09 |
WO2012030053A2 (ko) | 2012-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009035863A3 (en) | | Mining bilingual dictionaries from monolingual web pages |
WO2012030053A3 (ko) | | Apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus |
WO2013064752A3 (en) | | Machine translation quality measurement |
IL240549B (en) | | Device, system and method for imaging and labeling whole blood |
MX2016005225A (es) | | Fingerprint recognition method and apparatus |
MX2018003490A (es) | | Universal translation |
EP2466541A3 (en) | | Image processing apparatus, image processing method and image processing program |
BR112012011091A2 (pt) | | Method and apparatus for word quality extraction and evaluation |
CL2014002526A1 (es) | | Method for detecting a face, comprising preprocessing an image, extracting corners from the preprocessed image, obtaining a connected component of the corners, extracting centroids of the component, matching the centroids against a template, calculating a matching probability of the centroids with the template, and identifying regions formed by centroids having a matching probability greater than or equal to a predetermined value; system; storage medium |
MX340339B (es) | | Calibration transfer methods for a test instrument |
WO2011051817A3 (en) | | System and method for increasing the accuracy of optical character recognition (OCR) |
MX340429B (es) | | System and method for contextual and free-format address matching |
MX347895B (es) | | Device and method for obtaining and processing measurement readings of a living being |
WO2010140779A3 (ko) | | Sample collection/injection device and biometric data measurement set including the same |
WO2011021198A3 (en) | | Gas chromatographic analysis method and system |
MX357547B (es) | | Methods and apparatus for identifying fluid attributes |
BR112014010208A2 (pt) | | Method, apparatus and system for enabling retrieval of content of interest for later review |
WO2015050321A8 (ko) | | Apparatus and method for generating an aligned corpus based on unsupervised-learning alignment, and apparatus and method for morphological analysis of destructed expressions using the aligned corpus |
BR112012011377A2 (pt) | | Apparatus and computer-implemented method for image feature recognition independent of image orientation or scale, and non-transitory computer-readable storage medium for storing instructions |
GB2547350A (en) | | Molecular cell imaging using optical spectroscopy |
WO2013159972A3 (de) | | Sensor with timestamp for the sampling time |
NZ589039A (en) | | Recognition of a word image with a plurality of characters by way of comparing two possible candidates based on an evaluation value |
WO2011074772A3 (ko) | | Grammatical error simulation apparatus and method |
BR112012031056A2 (pt) | | Evaluation information identification device, method and program, and computer-readable recording medium |
WO2011143141A3 (en) | | Method and apparatus for performing asynchronous and synchronous reset removal during synthesis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 11822028; Country of ref document: EP; Kind code of ref document: A2 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | WIPO information: entry into national phase | Ref document number: 13820199; Country of ref document: US |
| 32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.06.2013) |
| 122 | Ep: PCT application non-entry in European phase | Ref document number: 11822028; Country of ref document: EP; Kind code of ref document: A2 |