WO2012026667A3 - Integrated decoding apparatus integrating tokenization and translation processes, and method therefor - Google Patents

Integrated decoding apparatus integrating tokenization and translation processes, and method therefor

Info

Publication number
WO2012026667A3
WO2012026667A3 PCT/KR2011/003830 KR2011003830W WO2012026667A3 WO 2012026667 A3 WO2012026667 A3 WO 2012026667A3 KR 2011003830 W KR2011003830 W KR 2011003830W WO 2012026667 A3 WO2012026667 A3 WO 2012026667A3
Authority
WO
WIPO (PCT)
Prior art keywords
categorization
token
method therefor
decoding apparatus
interpretation
Prior art date
Application number
PCT/KR2011/003830
Other languages
English (en)
French (fr)
Other versions
WO2012026667A2 (ko)
Inventor
황영숙
김상범
윤창호
시아오시얀
리우양
리우췬
린쇼우슌
Original Assignee
에스케이텔레콤 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-08-23
Filing date
2011-05-25
Publication date
2012-04-19
Application filed by 에스케이텔레콤 주식회사
Priority to US13/813,463 (US8543376B2)
Publication of WO2012026667A2
Publication of WO2012026667A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/53 Processing of non-Latin text

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Software Systems (AREA)
  • Algebra (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Machine Translation (AREA)
  • Error Detection And Correction (AREA)

Abstract

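The present invention relates to an integrated decoding apparatus that integrates tokenization and translation, and to a method therefor. More specifically, tokenization and translation are performed together while an input character sequence is being decoded, so that decoding is carried out jointly. This makes it possible to generate all possible candidate tokens, reduce translation errors, and obtain an optimal translation result.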
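
The idea in the abstract, searching over tokenizations and translations in the same decoding pass instead of fixing a tokenization first, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy probability tables `TOKEN_LOGP` and `TRANSLATIONS`, the limit `MAX_TOKEN_LEN`, the function name `joint_decode`, and the simplification to monotone translation with no reordering and no target language model.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical toy models (illustrative numbers, not from the patent).
# TOKEN_LOGP scores a candidate source token; TRANSLATIONS lists target
# words for a token together with a translation log-probability.
TOKEN_LOGP: Dict[str, float] = {
    "ab": math.log(0.6), "a": math.log(0.2), "b": math.log(0.2),
    "c": math.log(0.5), "bc": math.log(0.4),
}
TRANSLATIONS: Dict[str, List[Tuple[str, float]]] = {
    "ab": [("X", math.log(0.1))],
    "a": [("A", math.log(0.9))],
    "b": [("B", math.log(0.9))],
    "c": [("C", math.log(0.9))],
    "bc": [("BC", math.log(0.9))],
}
MAX_TOKEN_LEN = 2  # longest candidate token, in characters

Hypothesis = Tuple[float, List[str], List[str]]  # (score, tokens, targets)


def joint_decode(chars: str) -> Hypothesis:
    """Pick tokens and their translations in a single search.

    best[i] is the highest-scoring hypothesis covering chars[:i]. Each step
    proposes every candidate token ending at position i and every translation
    of that token, so tokenization and translation scores compete together.
    """
    best: List[Optional[Hypothesis]] = [None] * (len(chars) + 1)
    best[0] = (0.0, [], [])
    for end in range(1, len(chars) + 1):
        for start in range(max(0, end - MAX_TOKEN_LEN), end):
            prev = best[start]
            token = chars[start:end]
            if prev is None or token not in TOKEN_LOGP:
                continue
            for target, trans_logp in TRANSLATIONS.get(token, []):
                score = prev[0] + TOKEN_LOGP[token] + trans_logp
                if best[end] is None or score > best[end][0]:
                    best[end] = (score, prev[1] + [token], prev[2] + [target])
    result = best[-1]
    if result is None:
        raise ValueError("no tokenization covers the input")
    return result


score, tokens, targets = joint_decode("abc")
print(tokens, targets)
```

With these toy numbers, the token model alone would prefer the split `ab` + `c` (probability 0.3), but `ab` translates poorly, so the joint search returns the split `a` + `bc` with targets `A`, `BC`. A pipeline that commits to one tokenization before translating could not recover from that choice.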
PCT/KR2011/003830 2010-08-23 2011-05-25 Integrated decoding apparatus integrating tokenization and translation processes, and method therefor WO2012026667A2 (ko)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/813,463 US8543376B2 (en) 2010-08-23 2011-05-25 Apparatus and method for decoding using joint tokenization and translation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2010-0081677 2010-08-23
KR1020100081677A KR101682207B1 (ko) 2010-08-23 2010-08-23 Integrated decoding apparatus integrating tokenization and translation processes, and method therefor

Publications (2)

Publication Number Publication Date
WO2012026667A2 (ko) 2012-03-01
WO2012026667A3 (ko) 2012-04-19

Family

ID=45723875

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2011/003830 WO2012026667A2 (ko) 2010-08-23 2011-05-25 Integrated decoding apparatus integrating tokenization and translation processes, and method therefor

Country Status (3)

Country Link
US (1) US8543376B2 (ko)
KR (1) KR101682207B1 (ko)
WO (1) WO2012026667A2 (ko)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101356417B1 (ko) * 2010-11-05 2014-01-28 고려대학교 산학협력단 병렬 말뭉치를 이용한 동사구 번역 패턴 구축 장치 및 그 방법
JP2014078132A (ja) * 2012-10-10 2014-05-01 Toshiba Corp 機械翻訳装置、方法およびプログラム
CN104216892B (zh) * 2013-05-31 2018-01-02 亿览在线网络技术(北京)有限公司 歌曲搜索中非语义、非词组的切换方法
WO2016043539A1 (ko) * 2014-09-18 2016-03-24 특허법인 남앤드남 소번역메모리를 포함하는 번역 메모리, 그를 이용한 역방향 번역메모리 및 이들을 기록한 컴퓨터 판독가능한 저장매체
US9953171B2 (en) * 2014-09-22 2018-04-24 Infosys Limited System and method for tokenization of data for privacy
EP3210132A1 (en) * 2014-10-24 2017-08-30 Google, Inc. Neural machine translation systems with rare word processing
US9940324B2 (en) * 2015-03-10 2018-04-10 International Business Machines Corporation Performance detection and enhancement of machine translation
US9934203B2 (en) 2015-03-10 2018-04-03 International Business Machines Corporation Performance detection and enhancement of machine translation
US10140983B2 (en) * 2015-08-28 2018-11-27 International Business Machines Corporation Building of n-gram language model for automatic speech recognition (ASR)
US10430485B2 (en) 2016-05-10 2019-10-01 Go Daddy Operating Company, LLC Verifying character sets in domain name requests
US10180930B2 (en) * 2016-05-10 2019-01-15 Go Daddy Operating Company, Inc. Auto completing domain names comprising multiple languages
US10735736B2 (en) * 2017-08-29 2020-08-04 Google Llc Selective mixing for entropy coding in video compression
KR102069692B1 (ko) * 2017-10-26 2020-01-23 한국전자통신연구원 신경망 기계번역 방법 및 장치
CN110263304B (zh) * 2018-11-29 2023-01-10 腾讯科技(深圳)有限公司 语句编码方法、语句解码方法、装置、存储介质及设备
CN110263348A (zh) * 2019-03-06 2019-09-20 腾讯科技(深圳)有限公司 翻译方法、装置、计算机设备和存储介质
KR20210037307A (ko) 2019-09-27 2021-04-06 삼성전자주식회사 전자 장치 및 전자 장치의 제어 방법
US11797781B2 (en) 2020-08-06 2023-10-24 International Business Machines Corporation Syntax-based multi-layer language translation
KR20220093653A (ko) 2020-12-28 2022-07-05 삼성전자주식회사 Electronic device and control method therefor
US20240062021A1 (en) * 2022-08-22 2024-02-22 Oracle International Corporation Calibrating confidence scores of a machine learning model trained as a natural language interface

Citations (3)

Publication number Priority date Publication date Assignee Title
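KR19990001034A (ko) * 1997-06-12 1999-01-15 윤종용 Sentence extraction method using context information and local document form
KR20000056245A (ko) * 1999-02-18 2000-09-15 윤종용 Method for selecting translation examples using similarity that reflects discriminability in example-based machine translation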
US20090248422A1 (en) * 2008-03-28 2009-10-01 Microsoft Corporation Intra-language statistical machine translation

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8612205B2 (en) * 2010-06-14 2013-12-17 Xerox Corporation Word alignment method and system for improved vocabulary coverage in statistical machine translation
US9098488B2 (en) * 2011-04-03 2015-08-04 Microsoft Technology Licensing, Llc Translation of multilingual embedded phrases

Also Published As

Publication number Publication date
KR20120018687A (ko) 2012-03-05
KR101682207B1 (ko) 2016-12-12
US20130132064A1 (en) 2013-05-23
US8543376B2 (en) 2013-09-24
WO2012026667A2 (ko) 2012-03-01

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 11820095

Country of ref document: EP

Kind code of ref document: A2

WWE WIPO information: entry into national phase

Ref document number: 13813463

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: Public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/06/2013)

122 EP: PCT application non-entry in European phase

Ref document number: 11820095

Country of ref document: EP

Kind code of ref document: A2