CN110232193B - Structured text translation method and device - Google Patents

Publication number
CN110232193B
Authority
CN
China
Prior art keywords
text
translation
target
structured
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910349677.9A
Other languages
Chinese (zh)
Other versions
CN110232193A (en
Inventor
刘洋
张嘉成
栾焕博
孙茂松
翟飞飞
许静芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Original Assignee
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beijing Sogou Technology Development Co Ltd filed Critical Tsinghua University
Priority to CN201910349677.9A priority Critical patent/CN110232193B/en
Publication of CN110232193A publication Critical patent/CN110232193A/en
Application granted granted Critical
Publication of CN110232193B publication Critical patent/CN110232193B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a method and a device for translating a structured text, wherein the method comprises the following steps: removing the structural mark of the target structural text to be translated to obtain the target text; inputting the target text into a trained text translation neural network model, and performing search translation on translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information; and according to the alignment information, performing structured mark recovery processing on the target translation text to obtain a target structured translation text. According to the embodiment of the invention, the structured mark of the structured text is removed, so that the text without the structured mark is translated through the neural network model based on the phrase search space, the structured mark of the translated text is recovered, the structured translated text is obtained, and the translation of the structured text through the neural network model is realized.

Description

Structured text translation method and device
Technical Field
The invention relates to the technical field of machine translation, in particular to a method and a device for translating a structured text.
Background
In recent years, the rapid development of neural network machine translation technology enables the quality of machine translation to be remarkably improved. Further, the improvement of the quality of machine translation also makes it begin to be widely used in real life.
Although neural machine translation performs well on plain text, it cannot be readily applied to structured text, because the translation of structured text must satisfy structural constraints. For example, the translation of the source-end content enclosed by a pair of HTML tags must also be enclosed by the same pair of HTML tags at the target end. Existing neural machine translation, however, has no structured-text corpus with which to train a model for structured text translation; moreover, it lacks explicit alignment information and cannot impose structural constraints. As a result, existing neural machine translation has difficulty translating structured text.
Therefore, a method and an apparatus for structured text translation are needed to solve the above problems.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the invention provides a method and a device for translating a structured text.
In a first aspect, an embodiment of the present invention provides a method for translating a structured text, including:
removing the structural mark of the target structural text to be translated to obtain the target text;
inputting the target text into a trained text translation neural network model, and searching and translating translation candidate words of the text according to a phrase search space to obtain a target translation text and alignment information, wherein the trained text translation neural network model is obtained by training a sample text without a structured marker;
and according to the alignment information, performing structured mark recovery processing on the target translation text to obtain a target structured translation text.
In a second aspect, an embodiment of the present invention provides a structured text translation apparatus, including:
the structured mark removing module is used for removing the structured marks of the target structured text to be translated to obtain the target text;
the text translation module is used for inputting the target text into a trained text translation neural network model, searching and translating translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, and the trained text translation neural network model is obtained by training a sample text without a structural mark;
and the structural mark recovery module is used for performing structural mark recovery processing on the target translation text according to the alignment information to obtain a target structural translation text.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the method provided in the first aspect when executing the program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method as provided in the first aspect.
According to the structured text translation method and device provided by the embodiment of the invention, the structured marks of the structured text are removed, so that the text without the structured marks is translated through the neural network model based on the phrase search space, the translated text is recovered with the structured marks to obtain the structured translated text, and the structured text is translated through the neural network model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a structured text translation method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a grid column search method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a structured text translation apparatus according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Machine translation is the process of converting one natural language into another by computer. In recent years, neural machine translation technology has developed rapidly and has remarkably improved the quality of machine translation. Although existing machine translation performs well on plain text, it is difficult to apply to structured text. The embodiment of the invention translates the text without structured marks through a neural network model based on a phrase search space, and then restores the structured marks in the translated text to obtain the structured translated text, thereby realizing the translation of structured text through a neural network model. It should be noted that, in the embodiment of the present invention, a word or phrase of a text on the side to be translated (for example, a target structured text to be translated, a target text, a sample text with structured marks removed, or a sample structured text to be translated) is referred to as a source-end word, and a word or phrase of a text on the translated side (for example, a target translation text, a sample translation text, or a target structured translation text) is referred to as a target-end word.
Fig. 1 is a schematic flow diagram of a structured text translation method according to an embodiment of the present invention, and as shown in fig. 1, an embodiment of the present invention provides a structured text translation method, including:
Step 101, removing the structured marks of the target structured text to be translated to obtain the target text.
In the embodiment of the present invention, the obtained target structured text is first processed to remove the structured marks it carries, yielding a target text from which the structured marks are removed but whose structural constraints are retained.
Step 102, inputting the target text into a trained text translation neural network model, and performing search translation on translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, wherein the trained text translation neural network model is obtained by training a sample text without a structural mark.
In the embodiment of the invention, the target text is translated by the trained text translation neural network model. First, the model searches the phrase search space for translation candidate words of each word or phrase in the target text. The order of the search is not restricted, but a span of the target text that carries a structural constraint must be translated as a whole. After the translation candidate words of each word or phrase in the target text have been found, the trained text translation neural network model scores each translation candidate word by its translation probability given the source-end words in the target text, and selects the target-end word with the highest translation probability. This yields the target translation text together with the alignment information between the target translation text and the target text.
Step 103, performing structured mark recovery processing on the target translation text according to the alignment information to obtain a target structured translation text.
In the embodiment of the invention, according to the alignment information between the target translation text and the target text, the structural mark recovery is carried out on the target translation text, so that the translation of the structural text is completed.
According to the structured text translation method provided by the embodiment of the invention, the structured mark of the structured text is removed, so that the text without the structured mark is translated through the neural network model based on the phrase search space, the structured mark of the translated text is recovered, the structured translated text is obtained, and the translation of the structured text through the neural network model is realized.
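The recovery step can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `restore_tags`, the representation of a tag pair as `(tag, src_start, src_end)` over source token positions, and the alignment as `(source_index, target_index)` pairs are all assumptions made for illustration. The sketch wraps each tag pair around the smallest target span that covers every target token aligned to the tagged source tokens.

```python
from collections import defaultdict

def restore_tags(target_tokens, tag_spans, alignment):
    """Re-insert tag pairs into a translation using word alignment.

    tag_spans: list of (tag, src_start, src_end), src_end exclusive,
               listed outermost-first (hypothetical layout).
    alignment: list of (src_index, tgt_index) pairs.
    """
    src_to_tgt = defaultdict(list)
    for s, t in alignment:
        src_to_tgt[s].append(t)

    opens = defaultdict(list)   # target index -> tags opened before the token
    closes = defaultdict(list)  # target index -> tags closed after the token
    for tag, start, end in tag_spans:
        covered = [t for s in range(start, end) for t in src_to_tgt.get(s, [])]
        if not covered:
            continue  # no aligned target token: this sketch drops the pair
        opens[min(covered)].append(tag)
        closes[max(covered)].insert(0, tag)  # inner tags close first

    pieces = []
    for i, tok in enumerate(target_tokens):
        prefix = "".join(f"<{tag}>" for tag in opens[i])
        suffix = "".join(f"</{tag}>" for tag in closes[i])
        pieces.append(prefix + tok + suffix)
    return " ".join(pieces)

# Hypothetical example: source "the <b>red</b> car", reordered in the target.
result = restore_tags(
    ["la", "voiture", "rouge"],
    [("b", 1, 2)],
    [(0, 0), (1, 2), (2, 1)],
)
print(result)
```

Because the tagged span is translated as a whole, the aligned target tokens are expected to be contiguous; the sketch does not check this and simply uses the minimum and maximum aligned positions.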
On the basis of the above embodiment, the trained text translation neural network model is obtained by training through the following steps:
constructing a training sample set according to the sample text without the structured marks;
and training the pre-trained text translation neural network model according to the training sample set to obtain the trained text translation neural network model.
In the embodiment of the invention, a sample text without structured marks is input into a neural network model for training, and a phrase-based search space (namely a corpus model) is constructed in the neural network model. After the sample text is input into the neural network, the translation candidate words of each word or phrase in the sample text are searched through the phrase search space, giving the translation candidate words corresponding to each source-end word of the sample text; this yields the pre-trained text translation neural network model. The pre-trained model then scores the translation candidate words by their translation probability given the source-end words in the sample text, selects the sample target-end word with the highest translation probability, and obtains the sample translation text, thereby completing the training of the text translation neural network model.
On the basis of the above embodiment, before the constructing the training sample set from the sample text without the structured labels, the method further includes:
acquiring a sample structured text to be translated;
and removing the structured marks of the sample structured text according to the label pair matching information of the sample structured text to obtain a sample text without the structured marks, so as to construct a training sample set.
In the embodiment of the invention, after the sample structured text is obtained, the structured marks are removed according to the tag pair matching information in the text. For example, the tag pair matching information of <a>The Raven</a> is <a> and </a>; therefore, <a> and </a> are removed, and the sample text without structured marks is obtained. It should be noted that the sample text without structured marks still retains the tag constraint: after the translation of the sample text is completed, structured mark recovery processing is performed on the sample translation text according to the tag pair matching information and the alignment information between the sample translation text and the sample text.
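Tag removal with tag pair matching can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method as claimed: the function name `strip_tags`, whitespace tokenization, and the returned `(tag, start, end)` span format are hypothetical, and self-closing tags are not handled. A stack matches each closing tag with its opening tag, and the span of tokens between them is recorded so that the constraint survives after the marks are removed.

```python
import re

# Matches simple opening and closing tags such as <a>, <a href="...">, </a>.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*>")

def strip_tags(structured_text):
    """Return (tokens, spans): plain tokens plus the matched tag pairs.

    spans: list of (tag, start, end) over token positions, end exclusive.
    """
    tokens, spans, stack = [], [], []
    pos = 0
    for m in TAG_RE.finditer(structured_text):
        # Tokenize the plain text that precedes this tag.
        tokens.extend(structured_text[pos:m.start()].split())
        pos = m.end()
        closing, name = m.group(1), m.group(2)
        if not closing:
            stack.append((name, len(tokens)))
        else:
            if not stack or stack[-1][0] != name:
                raise ValueError(f"unmatched closing tag </{name}>")
            _, start = stack.pop()
            spans.append((name, start, len(tokens)))
    tokens.extend(structured_text[pos:].split())
    if stack:
        raise ValueError(f"unclosed tag <{stack[-1][0]}>")
    return tokens, spans

tokens, spans = strip_tags("<a>The Raven</a>")
print(tokens, spans)
```

The recorded spans play the role of the retained constraint: they say which source tokens must end up inside the same tag pair after translation.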
On the basis of the above embodiment, the inputting the target text into a trained text translation neural network model, and performing search translation on the translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information includes:
and inserting corresponding stop words into the positions of the empty words between the source end words of the target text according to the corresponding stop word translation frequency between the source end words of the target text so as to obtain a target translation text and alignment information.
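One way to read "stop word translation frequency" is as a statistic collected from a word-aligned corpus: a target-end word qualifies for insertion if it occurs often and is frequently left unaligned to any source-end word. The sketch below follows that reading. The function name `insertable_stop_words`, the corpus layout, and both thresholds are assumptions for illustration and do not come from the patent text.

```python
from collections import Counter

def insertable_stop_words(aligned_corpus, min_count=2, min_null_ratio=0.5):
    """Select target words that may be inserted without a source word.

    aligned_corpus: iterable of (target_tokens, aligned_positions), where
    aligned_positions is the set of target positions aligned to some
    source word (hypothetical layout).
    """
    total, unaligned = Counter(), Counter()
    for target_tokens, aligned_positions in aligned_corpus:
        for j, word in enumerate(target_tokens):
            total[word] += 1
            if j not in aligned_positions:
                unaligned[word] += 1  # aligned to the empty word
    return {
        w for w, n in total.items()
        if n >= min_count and unaligned[w] / n >= min_null_ratio
    }

# Toy, made-up alignment data.
corpus = [
    (["the", "cat", "of", "the", "house"], {1, 4}),
    (["a", "dog", "of", "rome"], {1, 3}),
    (["the", "dog"], {0, 1}),
]
stop_words = insertable_stop_words(corpus)
print(sorted(stop_words))
```

During decoding, only words in the returned set would be offered as candidates at empty-word positions between source-end words, which keeps the search space small.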
Further, on the basis of the above embodiment, the inputting the target text into a trained text translation neural network model, and performing search translation on the translation candidate words of the target text according to the phrase search space to obtain a target translation text and alignment information further includes:
translating the corresponding source end words in the target text into empty words according to the source end word omission probability of the target text, wherein the formula of the source end word omission probability is as follows:
[Formula image BDA0002043525050000051; not reproduced in this text]
where x_i denotes the i-th source-end word of the target text, <null> denotes translating x_i into an empty word, and O denotes the positions of all omitted source-end words;
obtaining the translation logarithm probability of the target text according to the source end word omission probability and the translation probability of the text translation neural network model, wherein the formula is as follows:
[Formula image BDA0002043525050000052; not reproduced in this text]
where log P(y | x) denotes the translation log-probability given by the text translation neural network model, y denotes the sentence corresponding to the translated target text, x denotes the sentence corresponding to the target text before translation, and λ denotes a hyper-parameter.
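The two formulas are present only as image references in this text, so their exact form cannot be read from it. One form that is consistent with the symbol definitions above is given below as an assumption, not as the patent's own notation: the omission probability is taken as a product over the omitted positions in O, and it enters the final score as a log term weighted by λ. The names P_omit and score are introduced here for illustration.

```latex
% Assumed reconstruction; the original formula images are not available.
P_{\mathrm{omit}}(O \mid x) = \prod_{i \in O} P(\langle\mathrm{null}\rangle \mid x_i)

\mathrm{score}(y, O \mid x) = \log P(y \mid x)
  + \lambda \sum_{i \in O} \log P(\langle\mathrm{null}\rangle \mid x_i)
```

Under this reading, λ trades off the model's translation probability against the cost of omitting source-end words: a larger λ penalizes omissions of words that are unlikely to translate to the empty word more heavily.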
In the embodiment of the present invention, to improve the flexibility of translation, a target-end word may be translated from an empty word between source-end words (i.e., a word is inserted), or a source-end word may be translated into an empty word at the target end (i.e., the word is omitted). For example, suppose the target text obtained by removing the structured marks is "American poet Allan Poe's The Raven". A stop word is inserted between the source-end words "American" and "poet" and translated into a target-end word, while the source-end word "The" is translated into an empty word at the target end, i.e., it is omitted and not translated; the final translation corresponds to "The Raven" by the American poet Allan Poe. For the former, according to the stop word translation frequency, only stop words that are aligned to a source-end empty word with high probability and that occur frequently are inserted. For the latter, unlike conventional neural machine translation decoding, when a source-end word is omitted, its omission probability is also taken into account in the final translation probability, yielding the final translation log-probability.
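How the omission probability might enter the final score can be shown with a small numeric sketch. The combination used here (model log-probability plus λ times the summed log omission probabilities) is an assumed reading of the description, and the function name `translation_log_score` and all probability values are made up for illustration.

```python
import math

def translation_log_score(model_log_prob, null_probs, omitted, lam=1.0):
    """Combine the model's log-probability with source omission probabilities.

    model_log_prob: log P(y | x) from the translation model.
    null_probs: dict mapping source position i to P(<null> | x_i).
    omitted: positions of the omitted source-end words (the set O).
    lam: the hyper-parameter lambda.
    """
    omission_log_prob = sum(math.log(null_probs[i]) for i in omitted)
    return model_log_prob + lam * omission_log_prob

# Made-up numbers: one omitted word at position 4 with P(<null> | x_4) = 0.25.
score = translation_log_score(math.log(0.5), {4: 0.25}, omitted=[4])
print(score)
```

With no omitted words the sum is empty and the score reduces to the model's own log-probability, which matches conventional decoding.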
On the basis of the above embodiment, the inputting the target text into a trained text translation neural network model, and performing search translation on the translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information further includes:
and searching and translating the translation candidate words of the target text according to the phrase search space by a grid column search method to obtain a target translation text and alignment information.
In the embodiment of the present invention, Fig. 2 is a schematic diagram of the grid column search method. As shown in Fig. 2, the search expands from the state in which nothing has been translated. At each expansion, a source-end word that can currently be translated is selected and translated, generating a new translation candidate word. After the translation candidate words are scored, those with higher hypothesis scores are selected for further expansion, until the last grid column is reached. In the embodiment of the invention, the hypotheses are stored using the number of translated target-end words and the number of translated source-end words as indexes. Specifically, only a number of translation candidate words with higher translation probability are retained in each grid column, and only the translation candidate word at the top of the grid column is selected as the final translation result.
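The search described above resembles beam search over a grid of cells. The sketch below is a toy version under stated assumptions: hypotheses are stored in cells indexed by (number of target words produced, number of source words covered), each cell keeps only the best `beam_size` hypotheses, and `step_score` is a hypothetical stand-in for the neural model's probability. The names `grid_column_search`, `candidates`, and `toy_score` are illustrative and not taken from the patent.

```python
import heapq

def grid_column_search(n_src, candidates, step_score, beam_size=3):
    """candidates[i]: target words for source position i (None = omit).
    step_score(output, i, word): log-probability-like score of one step."""
    # Hypothesis: (score, output words, covered source positions, alignment)
    grid = {(0, 0): [(0.0, (), frozenset(), ())]}
    for n_cov in range(n_src):
        for key in [k for k in grid if k[1] == n_cov]:
            for score, output, covered, align in grid[key]:
                for i in range(n_src):
                    if i in covered:
                        continue
                    for word in candidates[i]:
                        s = score + step_score(output, i, word)
                        if word is None:  # source word omitted
                            hyp = (s, output, covered | {i}, align)
                        else:
                            hyp = (s, output + (word,), covered | {i},
                                   align + ((i, len(output)),))
                        cell = (len(hyp[1]), n_cov + 1)
                        grid.setdefault(cell, []).append(hyp)
        # Prune every cell of the newly filled column.
        for key in [k for k in grid if k[1] == n_cov + 1]:
            grid[key] = heapq.nlargest(beam_size, grid[key], key=lambda h: h[0])
    finals = [h for k, hyps in grid.items() if k[1] == n_src for h in hyps]
    best = max(finals, key=lambda h: h[0])
    return list(best[1]), list(best[3]), best[0]

# Toy scorer: a made-up bigram table standing in for the neural model.
LOGP = {("<s>", "la"): -0.1, ("la", "voiture"): -0.2, ("voiture", "rouge"): -0.3}

def toy_score(output, i, word):
    if word is None:
        return -2.0  # flat omission penalty, assumed
    prev = output[-1] if output else "<s>"
    return LOGP.get((prev, word), -5.0)

# Source: "the red car"; "the" may also be omitted.
words, alignment, total = grid_column_search(
    3, [["la", None], ["rouge"], ["voiture"]], toy_score)
print(words, alignment, total)
```

The returned alignment pairs each source position with the target position it produced, which is the information the recovery step needs. Constraints such as translating a tagged span as a whole are not enforced in this sketch.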
Fig. 3 is a schematic structural diagram of a structured text translation apparatus according to an embodiment of the present invention, and as shown in fig. 3, an embodiment of the present invention provides a structured text translation apparatus, which includes a structured flag removing module 301, a text translation module 302, and a structured flag restoring module 303, where the structured flag removing module 301 is configured to remove a structured flag of a target structured text to be translated, so as to obtain the target text; the text translation module 302 is configured to input the target text into a trained text translation neural network model, search and translate a translation candidate word of the target text according to a phrase search space, and obtain a target translation text and alignment information, where the trained text translation neural network model is obtained by training a sample text without a structured label; the structural mark recovery module 303 is configured to perform structural mark recovery processing on the target translation text according to the alignment information, so as to obtain a target structural translation text.
In this embodiment of the present invention, the structured mark removing module 301 processes the obtained target structured text and removes the structured marks it carries, to obtain the target text with the structured marks removed and the structural constraints retained. The text translation module 302 searches the phrase search space for translation candidate words of each word or phrase in the target text. The order of the search is not restricted, but a span of the target text that carries a structural constraint must be translated as a whole. After the translation candidate words of each word or phrase in the target text have been found, the text translation module 302 scores the translation candidate words by their translation probability given the source-end words in the target text, selects the target-end word with the highest translation probability, and obtains the target translation text together with the alignment information between the target translation text and the target text. Finally, the structured mark recovery module 303 restores the structured marks of the target translation text according to the alignment information between the target translation text and the target text, thereby completing the translation of the structured text.
According to the structured text translation device provided by the embodiment of the invention, the structured marks of the structured text are removed, so that the text without the structured marks is translated through the neural network model based on the phrase search space, the structured marks of the translated text are recovered, the structured translated text is obtained, and the translation of the structured text through the neural network model is realized.
The apparatus provided in the embodiment of the present invention is used to execute the above method embodiments; for the specific process and details, reference is made to the above embodiments, which are not repeated here.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 4, the electronic device may include: a processor 401, a communication interface 402, a memory 403 and a communication bus 404, wherein the processor 401, the communication interface 402 and the memory 403 communicate with each other through the communication bus 404. The processor 401 may call logic instructions in the memory 403 to perform the following method: removing the structured marks of the target structured text to be translated to obtain the target text; inputting the target text into a trained text translation neural network model, and searching and translating translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, wherein the trained text translation neural network model is obtained by training on a sample text without structured marks; and according to the alignment information, performing structured mark recovery processing on the target translation text to obtain a target structured translation text.
In addition, the logic instructions in the memory 403 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
An embodiment of the present invention discloses a computer program product, which includes a computer program stored on a non-transitory computer readable storage medium, the computer program including program instructions, when the program instructions are executed by a computer, the computer can execute the methods provided by the above method embodiments, for example, the method includes: removing the structural mark of the target structural text to be translated to obtain the target text; inputting the target text into a trained text translation neural network model, and searching and translating translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, wherein the trained text translation neural network model is obtained by training a sample text without a structured mark; and according to the alignment information, performing structured mark recovery processing on the target translation text to obtain a target structured translation text.
An embodiment of the present invention provides a non-transitory computer-readable storage medium storing server instructions, where the server instructions cause a computer to execute the structured text translation method provided in the foregoing embodiment, and the method includes: removing the structural mark of the target structural text to be translated to obtain the target text; inputting the target text into a trained text translation neural network model, and searching and translating translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, wherein the trained text translation neural network model is obtained by training a sample text without a structured mark; and according to the alignment information, performing structured mark recovery processing on the target translation text to obtain a target structured translation text.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for structured text translation, comprising:
removing the structural mark of the target structural text to be translated to obtain the target text;
inputting the target text into a trained text translation neural network model, and searching and translating translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, wherein the trained text translation neural network model is obtained by training a sample text without a structured mark;
according to the alignment information, performing structured mark recovery processing on the target translation text to obtain a target structured translation text;
inputting the target text into a trained text translation neural network model, and performing search translation on translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, and further comprising:
translating the corresponding source end words in the target text into empty words according to the source end word omission probability of the target text, wherein the formula of the source end word omission probability is as follows:
[Formula image FDA0002548226230000011; not reproduced in this text]
where x_i denotes the i-th source-end word of the target text, <null> denotes translating x_i into an empty word, and O denotes the positions of all omitted source-end words;
obtaining the translation logarithm probability of the target text according to the source end word omission probability and the translation probability of the text translation neural network model, wherein the formula is as follows:
[Formula image FDA0002548226230000012; not reproduced in this text]
where log P(y | x) denotes the translation log-probability given by the text translation neural network model, y denotes the sentence corresponding to the translated target text, x denotes the sentence corresponding to the target text before translation, and λ denotes a hyper-parameter.
2. The method for structured text translation according to claim 1, wherein the trained neural network model for text translation is obtained by training through the following steps:
constructing a training sample set according to the sample text without the structured marks;
and training the pre-trained text translation neural network model according to the training sample set to obtain the trained text translation neural network model.
3. The method of structured text translation according to claim 2, wherein prior to said constructing a training sample set from sample text that is free of structured labels, said method further comprises:
acquiring a sample structured text to be translated;
and removing the structured marks of the sample structured text according to the label pair matching information of the sample structured text to obtain a sample text without the structured marks, so as to construct a training sample set.
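The mark removal step of claim 3 can be illustrated as follows. This is a sketch under stated assumptions, not the patented procedure: it assumes HTML/XML-like tags, whitespace tokenization (a Chinese source text would need a word segmenter instead), and a hypothetical helper name `strip_tags`. Closing tags are paired with the most recent open tag of the same name, and unmatched tags are simply dropped.

```python
import re

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([A-Za-z][\w-]*)[^<>]*>")

def strip_tags(text):
    """Return (words, spans); each span is (tag_name, open_tag, start, end),
    where the matched tag pair covers word indices [start, end)."""
    words, spans, stack = [], [], []
    pos = 0
    for m in TAG_RE.finditer(text):
        words.extend(text[pos:m.start()].split())
        pos = m.end()
        name = m.group(1)
        if m.group(0).startswith("</"):
            # Pair the closing tag with the most recent open tag of the same name.
            for k in range(len(stack) - 1, -1, -1):
                if stack[k][0] == name:
                    open_name, open_tag, start = stack.pop(k)
                    spans.append((open_name, open_tag, start, len(words)))
                    break
        else:
            stack.append((name, m.group(0), len(words)))
    words.extend(text[pos:].split())
    return words, spans

words, spans = strip_tags("Click <b>the red</b> button")
```

The recorded spans are the label pair matching information that a later recovery step would need; the plain word list is what would be used as the training or translation input.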
4. The method for structured text translation according to claim 1, wherein the inputting the target text into a trained text translation neural network model, and performing search translation on the translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information comprises:
and inserting corresponding stop words into the positions of the empty words between the source end words of the target text according to the corresponding stop word translation frequency between the source end words of the target text so as to obtain a target translation text and alignment information.
5. The method for structured text translation according to any one of claims 1 to 4, wherein the step of inputting the target text into a trained text translation neural network model, and performing search translation on the translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information further comprises:
searching and translating the translation candidate words of the target text according to the phrase search space by a grid column search method to obtain a target translation text and alignment information, wherein the grid column search method specifically comprises the following steps:
expanding from an untranslated state, and at each step querying the source end words for a word that the current hypothesis can translate, translating it, and generating a new translation candidate word;
and after scoring the new translation candidate words, selecting the higher-scoring translation candidate words according to the hypothesis scores, and performing the next expansion until the expansion reaches the last grid column.
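The expansion and pruning loop of claim 5 can be sketched roughly as below. This is an illustrative approximation only: it assumes that each grid column groups the hypotheses that have translated the same number of source words, and it scores hypotheses with a fixed, hypothetical candidate table, whereas the claimed method scores them with the neural network model. The function name `column_search` and the `beam_size` parameter are assumptions.

```python
import heapq

def column_search(source_words, candidates, beam_size=2):
    """candidates maps a source word to a list of (target_word, log_prob).
    Returns (score, output_words, alignment) or None if no full translation exists."""
    n = len(source_words)
    # columns[k] holds hypotheses that have translated k source words.
    # A hypothesis is (score, covered_positions, output, alignment).
    columns = [[] for _ in range(n + 1)]
    columns[0].append((0.0, frozenset(), (), ()))
    for k in range(n):
        # Keep only the best-scoring hypotheses of this column before expanding.
        best = heapq.nlargest(beam_size, columns[k], key=lambda h: h[0])
        for score, covered, output, align in best:
            for i, word in enumerate(source_words):
                if i in covered:
                    continue  # This source word is already translated.
                for target, log_prob in candidates.get(word, []):
                    columns[k + 1].append(
                        (score + log_prob, covered | {i}, output + (target,), align + (i,)))
    if not columns[n]:
        return None
    final = max(columns[n], key=lambda h: h[0])
    return final[0], list(final[1 + 1]), list(final[3])

table = {"a": [("A", -0.1), ("X", -2.0)], "b": [("B", -0.2)]}
result = column_search(["a", "b"], table)
```

Each output word keeps the index of the source word it came from, which is the kind of alignment information the mark recovery step relies on.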
6. A structured text translation apparatus, comprising:
the structured mark removing module is used for removing the structured marks of the target structured text to be translated to obtain the target text;
the text translation module is used for inputting the target text into a trained text translation neural network model, searching and translating translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, and the trained text translation neural network model is obtained by training a sample text without a structural mark;
the structural mark recovery module is used for performing structural mark recovery processing on the target translation text according to the alignment information to obtain a target structural translation text;
the text translation module is specifically configured to:
input the target text into a trained text translation neural network model, and perform search translation on translation candidate words of the target text according to a phrase search space to obtain a target translation text and alignment information, which further comprises:
translating the corresponding source end words in the target text into empty words according to the source end word omission probability of the target text, wherein the formula of the source end word omission probability is as follows:
Figure FDA0002548226230000031
wherein x_i represents the i-th word among the source end words of the target text, <null> indicates that x_i is translated into an empty word, and O represents the positions of all the omitted source end words;
obtaining the translation logarithm probability of the target text according to the source end word omission probability and the translation probability of the text translation neural network model, wherein the formula is as follows:
Figure FDA0002548226230000032
wherein log P(y | x) represents the translation probability of the text translation neural network model, y represents the sentence corresponding to the translated target text, x represents the sentence corresponding to the target text before translation, and λ represents a hyper-parameter.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 5 are implemented when the processor executes the program.
8. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
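The structured mark recovery recited in claims 1 and 6 can be illustrated with a short sketch. It assumes that the alignment gives, for each target word, the index of one source word (or None), and that each removed tag pair is stored as (tag_name, open_tag, start, end) over source word indices, with inner pairs listed before the pairs that enclose them. The function name `restore_tags` and this data layout are hypothetical. The sketch wraps the smallest contiguous run of target words that contains every word aligned into the tagged source span.

```python
def restore_tags(target_words, alignment, spans):
    """alignment[j] is the source index aligned to target word j (or None).
    spans holds (tag_name, open_tag, start, end) over source word indices,
    inner tag pairs listed before the pairs that enclose them."""
    opens = {}   # target index -> tags emitted before that word
    closes = {}  # target index -> tags emitted after that word
    for name, open_tag, start, end in spans:
        covered = [j for j, i in enumerate(alignment)
                   if i is not None and start <= i < end]
        if not covered:
            continue  # No aligned target word; this sketch drops the tag pair.
        # Outer tags arrive later, so they go in front of inner opening tags
        # and behind inner closing tags.
        opens.setdefault(min(covered), []).insert(0, open_tag)
        closes.setdefault(max(covered), []).append(f"</{name}>")
    pieces = []
    for j, word in enumerate(target_words):
        pieces.append("".join(opens.get(j, [])) + word + "".join(closes.get(j, [])))
    return " ".join(pieces)

restored = restore_tags(
    ["Cliquez", "le", "bouton", "rouge"],  # hypothetical translation
    [0, 1, 3, 2],                          # target word j -> source word index
    [("b", "<b>", 1, 3)],                  # <b> covered source words 1..2
)
```

Because reordering can scatter the aligned words, the wrapped run may include a word that was not inside the tags in the source ("bouton" in the example), and tag pairs that overlap after reordering could come out improperly nested. A production system would need an explicit policy for those cases.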
CN201910349677.9A 2019-04-28 2019-04-28 Structured text translation method and device Active CN110232193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910349677.9A CN110232193B (en) 2019-04-28 2019-04-28 Structured text translation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910349677.9A CN110232193B (en) 2019-04-28 2019-04-28 Structured text translation method and device

Publications (2)

Publication Number Publication Date
CN110232193A (en) 2019-09-13
CN110232193B (en) 2020-08-28

Family

ID=67860318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910349677.9A Active CN110232193B (en) 2019-04-28 2019-04-28 Structured text translation method and device

Country Status (1)

Country Link
CN (1) CN110232193B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1745379A (en) * 2003-01-28 2006-03-08 法国电信公司 Method and system for supplying an automatic web content translation service
CN101685440A (en) * 2008-09-25 2010-03-31 国际商业机器公司 Method and system for translating improved structural document of application path information
CN103678284A (en) * 2012-08-31 2014-03-26 上海斐讯数据通信技术有限公司 Method and device for translating page characters
CN104881406A (en) * 2015-06-15 2015-09-02 携程计算机技术(上海)有限公司 Web page translation method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101398815B (en) * 2008-06-13 2011-02-16 中国科学院计算技术研究所 Machine translation method
CN103425638A (en) * 2013-08-30 2013-12-04 清华大学 Word alignment method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
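Design and Implementation of a Parsing and Translation Engine for Web Page Files; Zhao Zhihui; China Master's Theses Full-text Database, Information Science and Technology; 20130515 (No. 05); pp. I138-2101 *
Phrase-Based Log-Linear Model Statistical Machine Translation Method and System Implementation; Song Yan; China Master's Theses Full-text Database, Information Science and Technology; 20111215 (No. S2); pp. I138-1953 *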

Also Published As

Publication number Publication date
CN110232193A (en) 2019-09-13

Similar Documents

Publication Publication Date Title
CN109271631B (en) Word segmentation method, device, equipment and storage medium
US20140163951A1 (en) Hybrid adaptation of named entity recognition
CN111753531A (en) Text error correction method and device based on artificial intelligence, computer equipment and storage medium
CN112766000B (en) Machine translation method and system based on pre-training model
CN106372053B (en) Syntactic analysis method and device
CN106156013B (en) A kind of two-part machine translation method that regular collocation type phrase is preferential
CN106383814A (en) Word segmentation method of English social media short text
CN114860942B (en) Text intention classification method, device, equipment and storage medium
CN111813923A (en) Text summarization method, electronic device and storage medium
CN110147558B (en) Method and device for processing translation corpus
CN107491441B (en) Method for dynamically extracting translation template based on forced decoding
CN111831792B (en) Electric power knowledge base construction method and system
CN110232193B (en) Structured text translation method and device
CN107992479A (en) Word rank Chinese Text Chunking method based on transfer method
GuoDong A chunking strategy towards unknown word detection in Chinese word segmentation
CN116129883A (en) Speech recognition method, device, computer equipment and storage medium
CN113011149B (en) Text error correction method and system
CN112686059B (en) Text translation method, device, electronic equipment and storage medium
CN116484842A (en) Statement error correction method and device, electronic equipment and storage medium
CN114462427A (en) Machine translation method and device based on term protection
CN111310457B (en) Word mismatching recognition method and device, electronic equipment and storage medium
CN110888976B (en) Text abstract generation method and device
CN113886521A (en) Text relation automatic labeling method based on similar vocabulary
CN111159339A (en) Text matching processing method and device
CN111814433B (en) Uygur language entity identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant