WO2012026668A2 - Statistical machine translation method using dependency forest - Google Patents
Statistical machine translation method using dependency forest
- Publication number
- WO2012026668A2 WO2012026668A2 PCT/KR2011/003968 KR2011003968W WO2012026668A2 WO 2012026668 A2 WO2012026668 A2 WO 2012026668A2 KR 2011003968 W KR2011003968 W KR 2011003968W WO 2012026668 A2 WO2012026668 A2 WO 2012026668A2
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dependency
- forest
- structures
- dependent
- node
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/191—Automatic line break hyphenation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/44—Statistical methods, e.g. probability models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/45—Example-based machine translation; Alignment
Definitions
- the present invention relates to a statistical machine translation method using a dependency forest.
- a plurality of dependency trees is generated by dependency-parsing a bilingual (language-pair) corpus, and a dependency forest is generated by combining the generated dependency trees.
- translation performance can be improved by applying the generated translation rules and dependency language model when converting source-language text into target-language text.
- Figure 1 shows the dependency tree of the English sentence "He saw a boy with a telescope." An arrow points from a child to its parent. The parent is often called the head of the child; for example, in Figure 1, 'saw' is the head of 'he'.
- dependency parsing is less complex than constituent (phrase-structure) analysis because the phrase structure of the sentence need not be analyzed.
- well-formed (qualified) dependency structures are either fixed or floating. A fixed structure consists of a head together with all of its children, forming a complete dependency subtree. A floating structure consists of sibling nodes sharing a common head, but the head itself is unspecified (floating). For example, Figures 2(a) and (b) are two fixed structures and (c) is a floating structure.
- extracting string-to-dependency rules from an aligned pair of a source string and a target dependency structure is similar to SCFG extraction, except that the target-language side is a well-formed structure. For example, a string-to-dependency rule that matches the word alignment can first be extracted as follows:
- X is a non-terminal symbol, and the subscript indicates the correspondence between non-terminals on the source- and target-language sides.
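The word-alignment consistency requirement just mentioned (a rule is extracted only if its structure matches the alignment) can be sketched in Python. This is a hedged illustration under our own naming; `consistent` and the toy alignment are not the patent's.

```python
def consistent(span, alignment):
    """Return the target span matched by source span (i, j) under the
    alignment (a list of (source_idx, target_idx) links), or None if any
    link crosses the span boundary, i.e. the spans are not consistent."""
    i, j = span
    linked = [t for s, t in alignment if i <= s <= j]
    if not linked:
        return None
    lo, hi = min(linked), max(linked)
    # no target word inside [lo, hi] may be linked to a source word outside [i, j]
    for s, t in alignment:
        if lo <= t <= hi and not (i <= s <= j):
            return None
    return (lo, hi)

# toy alignment: (source word index, target word index) links
alignment = [(0, 1), (1, 2), (3, 0)]
target_span = consistent((0, 1), alignment)   # source words 0..1
```

The same check is applied to every candidate well-formed structure during extraction; only spans for which `consistent` returns a target span yield rules.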
- the tree probability in FIG. 1 may be calculated as follows.
- P_T(x) is the probability of the word x being the root of the dependency tree.
- P_L and P_R are the probabilities of generating the dependents on the left and right sides, respectively.
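The decomposition above, a tree probability as the product of the root probability and the left/right generation probabilities, can be sketched with toy numbers. All probability values below are invented for illustration; only the factorization mirrors the description.

```python
import math

# Toy dependency-LM tables; the values are made up for illustration only.
P_T = {"saw": 0.1}                            # probability of the root word
P_L = {("he", "saw-as-head"): 0.3}            # left-side generation
P_R = {("boy", "saw-as-head"): 0.2,           # right-side generation
       ("with", "boy, saw-as-head"): 0.1}

def tree_log_prob(root, left_events, right_events):
    """log P(T) = log P_T(root) + sum of left/right generation log-probs."""
    lp = math.log(P_T[root])
    lp += sum(math.log(P_L[ev]) for ev in left_events)
    lp += sum(math.log(P_R[ev]) for ev in right_events)
    return lp

lp = tree_log_prob("saw",
                   [("he", "saw-as-head")],
                   [("boy", "saw-as-head"), ("with", "boy, saw-as-head")])
```

Working in log space avoids underflow when the products run over long sentences.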
- the present invention has been made to solve the above problems, and aims to improve the quality of the rule table and the dependency language model by using a dependency forest, a new concept that combines multiple dependency trees instead of the single best dependency tree.
- an object of the present invention is to improve translation performance by applying a rule table and a dependency language model generated using a dependency forest.
- the translation rule generation method according to the present invention is characterized in that the translation rule is extracted using a dependency forest in which a plurality of dependency trees are combined.
- a translation rule generation method comprises the steps of: performing dependency analysis on a bilingual corpus; generating dependency trees from the analysis and combining the multiple dependency trees to generate a dependency forest; searching for a plurality of well-formed (qualified) structures for each node in the dependency forest; and extracting a translation rule when a dependency structure among the well-formed structures matches the word alignment.
- the statistical machine translation method according to the present invention is characterized by translating a source language using translation rules and a dependency language model generated from a dependency forest in which a plurality of dependency trees are combined.
- a translation rule generating apparatus comprises: means for generating dependency trees by performing dependency analysis on a bilingual corpus and combining the multiple dependency trees to generate a dependency forest; means for searching a plurality of well-formed structures for each node in the dependency forest; and means for extracting a translation rule when a dependency structure among the well-formed structures matches the word alignment.
- a statistical machine translation apparatus comprises: a dependency parser that generates dependency trees by dependency-parsing the source and target sentences of a parallel corpus and combines the multiple dependency trees to generate dependency forests for the source and target sentences; a translation rule extractor that extracts translation rules using the dependency forest; a language model trainer that generates a dependency language model using the dependency forest of the target sentences; and a decoder that converts source-language text into target-language text by applying the translation rules and the dependency language model.
- a rule table and a dependency language model are generated from a dependency forest created by combining a plurality of dependency trees, and the translation is performed using them.
- FIG. 1 is a diagram showing a training example consisting of the dependency tree of an English sentence, its Chinese translation, and the word alignment.
- FIG. 2 shows fixed structures and a floating structure among well-formed (qualified) dependency structures.
- FIG. 5 shows a statistical machine translation apparatus according to the present invention.
- the present invention uses, during the training phase of a tree-based statistical machine translation framework, a source-sentence string and a plurality of dependency trees for the corresponding target sentence.
- the present invention presents a compact representation of multiple dependency trees, called a dependency forest, to handle them effectively.
- Dependency forests have the same structure of hypergraphs as packed forests.
- hypergraph-based dependency forests are aligned to source-sentence strings.
- a plurality of translation rules are extracted by checking whether the target phrase is a qualified structure from a string-to-forest aligned corpus.
- each node is a word.
- a span is added to each node to identify the nodes.
- nodes are connected by hyperedges.
- an edge points to its head from a dependent, but the hyperedge binds all dependent words with a common head.
- the rule extraction algorithm searches for a well-formed structure for each node in a bottom-up manner.
- the algorithm maintains k best-formed structures for each node.
- the qualified structure of the head can be constructed from its dependent words.
- the k best fixed and floating structures for each node in the dependency forest can be found by manipulating the fixed structures of its dependent words. Then, if a dependency structure matches the word alignment, a string-to-dependency rule is extracted.
- FIGS. 3(a) and 3(b) show two dependency trees for the English sentence of FIG. 1.
- the preposition phrase 'with a telescope' may depend on 'saw' or 'boy'.
- FIG. 4 shows a dependency forest that compactly represents the two dependency trees by sharing common nodes and edges.
- each node is a word. To distinguish nodes, a span is attached to each node. For example, the span of 'a' is (2, 2), since 'a' is the third word in the sentence. Because the fourth word 'boy' dominates the node a2,2, it can be represented as boy2,3; note that the position of 'boy' itself is taken into account here. Similarly, in FIG. 3(b), 'boy' may be represented as boy2,6.
- nodes are connected by hyperedges. In a dependency tree, an edge simply points to its head from the dependent word, but the hyperedge binds all dependent words with a common head.
- hyperedge e1 indicates that he0,0, boy2,3, and with4,6 are, from left to right, the dependent words of saw0,6.
- dependency forests can be represented as <V, E> pairs.
- V is the node set and E is the hyperedge set.
- each node belongs to V (v ∈ V) and is represented in the form wi,j.
- wi,j denotes the word w dominating the substring from position i to j (the position of w itself included).
- each hyperedge belongs to E (e ∈ E) and is represented as a pair <tail(e), head(e)>.
- head(e) belongs to V as the head (head(e) ∈ V).
- tail(e) also belongs to V as the set of dependent words of the head (tail(e) ⊂ V).
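The ⟨V, E⟩ hypergraph just defined can be sketched as a small data structure. This is an illustrative sketch; the class names are ours, and the nodes follow the spans given for FIG. 4 in the description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    word: str
    span: tuple            # (i, j): the substring positions the node dominates

@dataclass(frozen=True)
class Hyperedge:
    head: Node             # head(e)
    tail: tuple            # tail(e): dependent words, left to right

# nodes of FIG. 4, with the spans from the description
saw = Node("saw", (0, 6))
he = Node("he", (0, 0))
boy = Node("boy", (2, 3))
with_ = Node("with", (4, 6))

e1 = Hyperedge(head=saw, tail=(he, boy, with_))   # hyperedge e1

V = {saw, he, boy, with_}
E = {e1}
```

Freezing the dataclasses makes nodes hashable, so two mentions of the same word are shared (merged) exactly when their spans coincide, which is what lets the forest pack multiple trees.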
- dependency forests have the same hypergraph structure as packed forests. In a packed forest, each hyperedge carries the probability of the corresponding PCFG rule as its weight; in a dependency forest, however, assigning hyperedge weights is less direct, because a dependency parser outputs a positive or negative score for each edge of a dependency tree rather than for each hyperedge of the forest. For example, in FIG. 3(a), the scores of the edges he→saw, boy→saw, and with→saw are 13, 22, and -12, respectively.
- C(e) is the count of hyperedge e.
- head(e) is the head.
- tail(e) is the set of dependent words of the head.
- v is one dependent word.
- s(v, head(e)) is the score of the edge from v to head(e).
- the count of the hyperedge e1 in FIG. 4 is as follows.
- the probabilities p(e) of the hyperedges can then be obtained by normalizing the counts over all hyperedges with the same head, collected from the training corpus.
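The normalization step above can be sketched as follows. The counts c(e) below are made-up numbers, since computing real counts requires the edge scores discussed above; only the normalization over hyperedges sharing a head follows the description.

```python
from collections import defaultdict

# made-up hyperedge counts c(e), keyed by (head, tail)
counts = {
    ("saw", ("he", "boy", "with")): 3.0,
    ("saw", ("he", "boy")): 1.0,
    ("boy", ("a",)): 2.0,
}

# p(e) = c(e) / sum of counts of all hyperedges sharing the same head
totals = defaultdict(float)
for (head, _tail), c in counts.items():
    totals[head] += c

p = {edge: c / totals[edge[0]] for edge, c in counts.items()}
```

With these toy counts, the two hyperedges headed by 'saw' receive probabilities 0.75 and 0.25, and the lone hyperedge headed by 'boy' receives probability 1.0.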
- the GHKM algorithm cannot be applied to extract string-to-dependency rules from a dependency forest, because the algorithm requires that a complete subtree be present in a rule, while neither fixed nor floating dependency structures are guaranteed to be complete subtrees. For example, the floating structure of FIG. 2(c) actually contains two trees.
- the algorithm according to the present invention searches for a qualified structure for each node in a bottom-up manner.
- the algorithm maintains the k best well-formed (qualified) structures for each node; the structures of a head can be constructed from those of its dependent words. For example, in FIG. 4, since the fixed structure rooted at telescope5,6 is 'a telescope', the fixed structure rooted at the node with4,6 can be obtained by attaching the fixed structures of its dependent words to the node.
- FIG. 2(b) shows the resulting structure.
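The bottom-up construction just described, keeping the k best fixed structures per node and building a head's structure from its dependents', can be sketched as follows. This is a simplified illustration under our own names: scoring here just sums hypothetical edge scores, whereas the real system would rank candidates with forest probabilities, and floating structures are omitted.

```python
import heapq

def k_best_fixed(node, dependents, edge_score, cache, k=2):
    """Keep the k best (score, structure) pairs for the fixed structure
    rooted at `node`, built bottom-up from its dependents' structures."""
    candidates = [(0.0, (node,))]
    for dep in dependents:                     # attach each dependent in turn
        expanded = []
        for score, struct in candidates:
            # leaves default to a single-node structure with score 0
            for d_score, d_struct in cache.get(dep, [(0.0, (dep,))]):
                expanded.append((score + edge_score[(dep, node)] + d_score,
                                 struct + d_struct))
        candidates = heapq.nlargest(k, expanded)
    cache[node] = heapq.nlargest(k, candidates)
    return cache[node]

# toy edge scores for the fragment 'with a telescope'
scores = {("a", "telescope"): 1.0, ("telescope", "with"): 2.0}
cache = {}
k_best_fixed("telescope", ["a"], scores, cache)        # process bottom-up
best = k_best_fixed("with", ["telescope"], scores, cache)
```

Because each node's k-best list is cached, every node is expanded once and the search stays linear in the number of hyperedges for fixed k.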
- a method of evaluating the qualified structure extracted from the node will be described.
- a fractional count is assigned to each well-formed (qualified) structure, following Mi and Huang (2008).
- root (t) is the root of the tree
- e is the edge
- leaves (t) is the leaf (component) set of the tree
- ⁇ ( ⁇ ) is the outside probability
- ⁇ ( ⁇ ) is the inside probability
- the subtree rooted at boy2,6 has the following posterior probability.
- TOP represents the root node of the forest.
- the fractional count of the minimum tree fragment containing the qualified structure is used to approximate the fractional count.
- the fractional count of the qualified structure can be used to calculate the relative frequency of the rule with the qualified structure on the target language side.
- the posterior probability of hyperedge e2 in FIG. 4 is calculated as follows.
- each n-gram (e.g., "boy-as-head a") is assigned the same fractional count as the hyperedge to which it belongs.
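Collecting fractional counts for dependency-LM n-grams can be sketched as follows. The posterior counts attached to each hyperedge below are invented numbers; in the real system they come from the inside/outside computation described above. Only the rule that each n-gram inherits its hyperedge's fractional count follows the description.

```python
from collections import defaultdict

# (head, dependents, fractional posterior count of the hyperedge)
hyperedges = [
    ("saw", ("he", "boy", "with"), 0.7),
    ("boy", ("a",), 1.0),
    ("boy", ("a", "with"), 0.3),
]

ngram_counts = defaultdict(float)
for head, deps, post in hyperedges:
    for dep in deps:
        # each n-gram (e.g. "boy-as-head a") inherits the fractional
        # count of the hyperedge it belongs to
        ngram_counts[(f"{head}-as-head", dep)] += post
```

Summing over all hyperedges then yields the fractional n-gram frequencies from which the dependency language model is estimated.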
- Table 1 shows the BLEU scores and average decoding times for the Chinese-English test set.
- the first translation system (the baseline) used the dependency language model and rule table learned from the single best dependency tree, while the remaining systems applied a dependency forest to at least one of the dependency language model and the rule table. * and ** denote results significantly better than the baseline system.
- Table 1 shows the BLEU scores on the test set.
- the first column "Rule" indicates whether the string-to-dependency rules were learned from the single best dependency tree or from the dependency forest.
- the second column "DepLM" likewise distinguishes the training data of the dependency language model (tree vs. forest).
- the basic translation system used the dependency language model and rule table learned from the best one dependency tree.
- adding the rule table and dependency language model obtained from the dependency forest improves string-to-dependency translation consistently and significantly, by +1.3 to +1.4 BLEU points. Moreover, even with the rule table and dependency language model learned from the dependency forest, the decoding time increases only slightly.
- Table 2 shows the BLEU scores for the Korean-Chinese test set.
- the learning corpus contains about 8.2M Korean words and 7.3M Chinese words.
- Chinese sentences were used to train the 5-gram language model as well as the dependency language model.
- Both the development and test sets contain 1,006 sentences with a single reference.
- Table 2 shows the BLEU scores on the test set. The forest-based approach according to the present invention again provides a significant improvement over the baseline translation system.
- the statistical machine translation apparatus is largely composed of a training part and a decoding part.
- a dependency parser first parses the source and target sentences of the bilingual corpus.
- the dependency analysis generates dependency trees for the source and target sentences.
- the dependency parser then combines the generated multiple dependency trees to create a dependency forest for each of the source and target sentences.
- the translation rule extractor creates a translation rule using the dependency forest and stores the generated translation rule in a translation table.
- the dependency language model learner creates the dependency language model using the dependency forest for the object sentence and stores the generated dependency language model in the language model database (DLM).
- the source-language text is input to the decoder, and the decoder generates the target-language text using the translation rules and the dependency language model.
- the decoder can thus improve translation performance by using the translation rules and the dependency language model generated from the dependency forest.
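The training flow of the apparatus described above can be summarized in a short sketch. Every function below is a placeholder we introduce for illustration (none of these names come from the patent): the parser, forest builder, rule extractor, and dependency-LM collector are stubbed to show only the data flow.

```python
def train(parallel_corpus, parse, combine, extract_rules, collect_deplm):
    """Build the rule table and dependency-LM events from a parallel corpus:
    parse each side into k-best trees, combine them into forests, then
    extract rules and (from the target side only) dependency-LM events."""
    rule_table, deplm_events = [], []
    for src, tgt, alignment in parallel_corpus:
        src_forest = combine(parse(src))
        tgt_forest = combine(parse(tgt))
        rule_table += extract_rules(src_forest, tgt_forest, alignment)
        deplm_events += collect_deplm(tgt_forest)
    return rule_table, deplm_events

# stub components, just to show the data flow
rules, events = train(
    [("source sentence", "target sentence", [])],
    parse=lambda s: [s],                       # stands in for k-best parsing
    combine=lambda trees: trees,               # stands in for forest building
    extract_rules=lambda s, t, a: [(s[0], t[0])],
    collect_deplm=lambda forest: [forest[0]],
)
```

The decoder then consumes `rule_table` and the language model estimated from `deplm_events` to translate new source-language text.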
- the present invention can be applied to a variety of devices by implementing the translation rule generation using the dependency forest and the machine translation method using the same as a software program and recording the program on a computer-readable recording medium.
- such devices may be PCs, laptops, portable terminals, and the like.
- the recording medium may be internal to each device, such as a hard disk, flash memory, RAM, or ROM, or external, such as an optical disc (e.g., CD-R or CD-RW), CompactFlash card, SmartMedia, Memory Stick, or MultiMedia Card.
- the present invention analyzes the dependencies from the parallel corpus, creates a plurality of dependency trees, combines the generated multiple dependency trees to create dependency forests, and generates translation rules and dependency language models using dependency forests.
- the translation performance can be improved by applying the generated translation rule and dependency language model, and thus it can be widely used in the field of statistical machine translation.
Abstract
Description
Rule | DepLM | NIST2004 | NIST2005 | NIST2006 | time |
tree | tree | 32.99 | 29.55 | 30.10 | 18.6 |
tree | forest | 33.55* | 30.12* | 30.88* | 23.3 |
forest | tree | 33.43* | 30.10* | 30.55* | 20.9 |
forest | forest | 34.37** | 30.90** | 31.51** | 27.7 |
Claims (19)
- A translation rule generation method characterized in that translation rules are extracted using a dependency forest in which a plurality of dependency trees are combined.
- The method of claim 1, wherein the nodes of the dependency forest are connected by hyperedges, and a hyperedge binds all dependent words having a common head.
- The method of claim 2, wherein the nodes are distinguished by spans.
- The method of claim 1, wherein the dependency forest is aligned with a source-sentence string, and the translation rules are extracted from a string-to-forest aligned corpus.
- The method of claim 2, wherein well-formed (qualified) structures are searched for each node, and a plurality of best well-formed structures is maintained for each node.
- The method of claim 5, wherein the plurality of best well-formed structures is obtained by concatenating the fixed structures of the dependent words of the node.
- The method of claim 6, wherein a translation rule is extracted when a dependency structure in the well-formed structures matches the word alignment.
- A translation rule generation method comprising the steps of: performing dependency analysis on a bilingual corpus; generating dependency trees by the dependency analysis and combining a plurality of dependency trees to generate a dependency forest; searching a plurality of well-formed structures for each node in the dependency forest; and extracting a translation rule when a dependency structure in the plurality of well-formed structures matches the word alignment.
- The method of claim 8, wherein the plurality of well-formed structures are the k best fixed and floating structures, obtained by manipulating the fixed structures of the dependent words of the node.
- A statistical machine translation method characterized in that a source language is translated using translation rules and a dependency language model generated from a dependency forest in which a plurality of dependency trees are combined.
- The method of claim 10, wherein the nodes of the dependency forest are connected by hyperedges, and a hyperedge binds all dependent words having a common head.
- The method of claim 11, wherein all hyperedges of the dependency forest are enumerated to collect all heads and their dependent words, and the dependency language model is generated from the collected information.
- A translation rule generation apparatus comprising: means for performing dependency analysis on a bilingual corpus to generate dependency trees and combining a plurality of dependency trees to generate a dependency forest; means for searching a plurality of well-formed structures for each node in the dependency forest; and means for extracting a translation rule when a dependency structure in the plurality of well-formed structures matches the word alignment.
- The apparatus of claim 13, wherein the plurality of well-formed structures are the k best fixed and floating structures, obtained by manipulating the fixed structures of the dependent words of the node.
- A statistical machine translation apparatus comprising: a dependency parser that generates dependency trees by dependency-parsing the source and target sentences of a bilingual corpus and combines a plurality of dependency trees to generate dependency forests for the source and target sentences; a translation rule extractor that extracts translation rules using the dependency forest; a language model trainer that generates a dependency language model using the dependency forest of the target sentences; and a decoder that converts source-language text into target-language text by applying the translation rules and the dependency language model.
- The apparatus of claim 15, wherein the dependency parser generates the dependency forest by connecting the nodes of the multiple dependency trees with hyperedges, and a hyperedge binds all dependent words having a common head.
- The apparatus of claim 16, wherein the translation rule extractor searches a plurality of well-formed structures for each node in the dependency forest and extracts a translation rule when a dependency structure in the plurality of well-formed structures matches the word alignment.
- The apparatus of claim 16, wherein the language model trainer enumerates all hyperedges of the dependency forest, collects all heads and their dependent words, and generates the dependency language model from the collected information.
- A computer-readable recording medium on which a program for executing the process according to any one of claims 1 to 12 is recorded.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201180040952.0A CN103154939B (zh) | 2010-08-23 | 2011-05-31 | Statistical machine translation method using dependency forest |
US13/818,137 US20130158975A1 (en) | 2010-08-23 | 2011-05-31 | Statistical machine translation method using dependency forest |
US15/968,078 US10303775B2 (en) | 2010-08-23 | 2018-05-01 | Statistical machine translation method using dependency forest |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2010-0081378 | 2010-08-23 | ||
KR1020100081378A KR101732634B1 (ko) | 2010-08-23 | 2010-08-23 | 의존관계 포레스트를 이용한 통계적 기계 번역 방법 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/818,137 A-371-Of-International US20130158975A1 (en) | 2010-08-23 | 2011-05-31 | Statistical machine translation method using dependency forest |
US15/968,078 Continuation US10303775B2 (en) | 2010-08-23 | 2018-05-01 | Statistical machine translation method using dependency forest |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2012026668A2 true WO2012026668A2 (ko) | 2012-03-01 |
WO2012026668A3 WO2012026668A3 (ko) | 2012-04-19 |
Family
ID=45723876
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2011/003968 WO2012026668A2 (ko) | 2010-08-23 | 2011-05-31 | 의존관계 포레스트를 이용한 통계적 기계 번역 방법 |
Country Status (4)
Country | Link |
---|---|
US (2) | US20130158975A1 (ko) |
KR (1) | KR101732634B1 (ko) |
CN (1) | CN103154939B (ko) |
WO (1) | WO2012026668A2 (ko) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9996567B2 (en) * | 2014-05-30 | 2018-06-12 | Georgetown University | Process and framework for facilitating data sharing using a distributed hypergraph |
US11226945B2 (en) * | 2008-11-14 | 2022-01-18 | Georgetown University | Process and framework for facilitating information sharing using a distributed hypergraph |
KR101356417B1 (ko) * | 2010-11-05 | 2014-01-28 | Korea University Industry-Academic Cooperation Foundation | Apparatus and method for constructing verb-phrase translation patterns using a parallel corpus |
US8935151B1 (en) * | 2011-12-07 | 2015-01-13 | Google Inc. | Multi-source transfer of delexicalized dependency parsers |
CN104239290B (zh) * | 2014-08-08 | 2017-02-15 | 中国科学院计算技术研究所 | 基于依存树的统计机器翻译方法及系统 |
CN104991890A (zh) * | 2015-07-15 | 2015-10-21 | 昆明理工大学 | 一种基于汉越词对齐语料构建越南语依存树库的方法 |
CN106383818A (zh) * | 2015-07-30 | 2017-02-08 | 阿里巴巴集团控股有限公司 | 一种机器翻译方法及装置 |
CN105573994B (zh) * | 2016-01-26 | 2019-03-22 | 沈阳雅译网络技术有限公司 | 基于句法骨架的统计机器翻译系统 |
US10740348B2 (en) * | 2016-06-06 | 2020-08-11 | Georgetown University | Application programming interface and hypergraph transfer protocol supporting a global hypergraph approach to reducing complexity for accelerated multi-disciplinary scientific discovery |
US11417415B2 (en) * | 2018-08-10 | 2022-08-16 | International Business Machines Corporation | Molecular representation |
US11030168B2 (en) * | 2018-12-11 | 2021-06-08 | Sap Se | Parallelization of order dependent procedures during software change processes |
US11003880B1 (en) | 2020-08-05 | 2021-05-11 | Georgetown University | Method and system for contact tracing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8249856B2 (en) * | 2008-03-20 | 2012-08-21 | Raytheon Bbn Technologies Corp. | Machine translation |
CN101398815B (zh) * | 2008-06-13 | 2011-02-16 | 中国科学院计算技术研究所 | 一种机器翻译方法 |
US8285536B1 (en) * | 2009-07-31 | 2012-10-09 | Google Inc. | Optimizing parameters for machine translation |
-
2010
- 2010-08-23 KR KR1020100081378A patent/KR101732634B1/ko active IP Right Grant
-
2011
- 2011-05-31 US US13/818,137 patent/US20130158975A1/en not_active Abandoned
- 2011-05-31 WO PCT/KR2011/003968 patent/WO2012026668A2/ko active Application Filing
- 2011-05-31 CN CN201180040952.0A patent/CN103154939B/zh active Active
-
2018
- 2018-05-01 US US15/968,078 patent/US10303775B2/en active Active
Non-Patent Citations (3)
Title |
---|
HAITAO MI ET AL.: 'Constituency to Dependency Translation with Forests' PROCEEDINGS OF ACL 2010 16 July 2010, pages 1433 - 1442 * |
HAITAO MI ET AL.: 'Forest-based Translation Rule Extraction' PROCEEDING OF EMNLP 2008 27 October 2008, pages 206 - 214 * |
YANG LIU ET AL.: 'Improving Tree-to-Tree Translation with Packed Forest' PROCEEDINGS OF ACL/IJCNLP 2009 07 August 2009, pages 558 - 566 * |
Also Published As
Publication number | Publication date |
---|---|
US20130158975A1 (en) | 2013-06-20 |
WO2012026668A3 (ko) | 2012-04-19 |
KR20120021933A (ko) | 2012-03-09 |
CN103154939A (zh) | 2013-06-12 |
KR101732634B1 (ko) | 2017-05-08 |
CN103154939B (zh) | 2016-04-27 |
US10303775B2 (en) | 2019-05-28 |
US20180314690A1 (en) | 2018-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2012026668A2 (ko) | Statistical machine translation method using dependency forest | |
WO2012060540A1 (ko) | Machine translation apparatus and method combining a syntactic-structure transfer model and a lexical transfer model | |
WO2014025135A1 (ko) | Method for detecting grammatical errors, error detection apparatus therefor, and computer-readable recording medium on which the method is recorded | |
WO2012026667A2 (ko) | Integrated decoding apparatus unifying token segmentation and translation, and method thereof | |
WO2017010652A1 (ko) | Automatic question-answering method and apparatus | |
KR100530154B1 (ko) | Method and apparatus for generating a transfer dictionary used in a transfer-based machine translation system | |
WO2014069779A1 (ko) | Syntactic analysis apparatus based on syntactic preprocessing and method thereof | |
WO2016208941A1 (ko) | Text preprocessing method and preprocessing system for performing the same | |
WO2015050321A1 (ko) | Apparatus and method for generating an aligned corpus based on unsupervised-learning alignment, and apparatus and method for morphological analysis of corrupted expressions using the aligned corpus | |
Huang et al. | Soft syntactic constraints for hierarchical phrase-based translation using latent syntactic distributions | |
Fung et al. | BiFrameNet: bilingual frame semantics resource construction by cross-lingual induction | |
KR20080052282A | Apparatus and method for autonomously learning translation relations between words and phrases in a statistical machine translation system | |
Banik et al. | Statistical-based system combination approach to gain advantages over different machine translation systems | |
WO2012008684A2 (ko) | Method and apparatus for translation-rule filtering and target-word generation in hierarchical phrase-based statistical machine translation | |
WO2012060534A1 (ko) | Apparatus and method for constructing verb-phrase translation patterns using a parallel corpus | |
WO2012030053A2 (ko) | Apparatus and method for recognizing idiomatic expressions using phrase alignment of a parallel corpus | |
Volk et al. | Parallel corpora, terminology extraction and machine translation | |
JP5924677B2 (ja) | 機械翻訳装置、機械翻訳方法、およびプログラム | |
Nguyen et al. | A tree-to-string phrase-based model for statistical machine translation | |
Arora et al. | Comparative analysis of phrase based, hierarchical and syntax based statistical machine translation | |
JP4588657B2 (ja) | 翻訳装置 | |
Nakazawa et al. | EBMT System of KYOTO Team in PatentMT Task at NTCIR-9. | |
Sanguinetti et al. | Dependency and constituency in translation shift analysis | |
Sánchez-Martínez et al. | Using alignment templates to infer shallow-transfer machine translation rules | |
JP2632806B2 (ja) | 言語解析装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201180040952.0 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11820096 Country of ref document: EP Kind code of ref document: A2 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13818137 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/06/2013) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 11820096 Country of ref document: EP Kind code of ref document: A2 |