CN113947094A - Auxiliary translation method - Google Patents

Publication number
CN113947094A
Authority
CN
China
Prior art keywords
translation
translator
sentence
term
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111155095.0A
Other languages
Chinese (zh)
Inventor
李光华
张娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiaguyi Beijing Language Technology Co ltd
Original Assignee
Jiaguyi Beijing Language Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiaguyi Beijing Language Technology Co ltd
Priority to CN202111155095.0A
Publication of CN113947094A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/49 Data-driven translation using very large corpora, e.g. the web

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an auxiliary translation method that integrates translator input, memory library segments/term library and real-time preview, comprising the following steps. First step: parsing an editable document uploaded by a translator and splitting it into independent sentence segments, one sentence per segment. Second step: associating the independent sentence segments with a memory library and a term library, and extracting memory library segments in batches so that they can be fused into the translations recommended to the translator during interactive translation. Third step: displaying the parsed document in an HTML format that can be previewed and edited in a browser, and performing word segmentation on the parsed sentence segments for term library matching. Fourth step: matching the parsed sentence segments and the word segmentation results against the associated memory library and/or term library, and marking the sentences that meet a preset condition. Fifth step: the machine translation engine returns the translation.

Description

Auxiliary translation method
Technical Field
The invention relates to an auxiliary translation method integrating translator input, memory library segment/term library and real-time preview.
Background
Neural machine translation has made automatic translation a highly useful reference for translators. The translation mode currently popular on the market is post-editing (the first type of scheme), in which the translator directly modifies the machine-translated text. Another type of scheme is interactive translation (the second type of scheme), in which the engine re-decodes using the segments the translator has typed and recommends translations in real time based on that input, but without fusing memory library segments or a term library and without an editor preview.
The first scheme has the following drawbacks: the machine translation directly occupies the translation input box, which constrains the translator's thinking and can easily mislead the translator; the translator merely patches the machine translation result; and the translator's modifications cannot be used for incremental learning of the machine model. The second scheme has the drawback that the translator's memory library segments, term library and full-text preview information are not fused: interactive prompts offer only recommendations based on machine translation and cannot fuse, in real time, the memory library segments and terms that correspond to the current sentence segment, so recommendation accuracy is low and the gain in translator efficiency is limited.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the above defects in the prior art by providing an auxiliary translation method that integrates translator input, memory library segments/term library and real-time preview, thereby improving the translator's translation efficiency.
According to the invention, the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview comprises the following steps:
First step: parsing an editable document uploaded by a translator and splitting it into independent sentence segments, one sentence per segment;
Second step: associating the independent sentence segments with a memory library and a term library, and extracting memory library segments in batches so that they can be fused into the translations recommended to the translator during interactive translation;
Third step: displaying the parsed document in an HTML format that can be previewed and edited in a browser, and performing word segmentation on the parsed sentence segments for term library matching;
Fourth step: matching the parsed sentence segments and the word segmentation results against the associated memory library and/or term library, marking the sentences that meet a preset condition, and sending them to a machine translation engine;
Fifth step: the machine translation engine returns the translation.
Preferably, in the fifth step the translation and the original text are presented in an editor for the translator to preview and translate.
Preferably, the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview further includes:
Sixth step: matching against the memory library and the term library sentence by sentence according to the translator's cursor position, and sending the matching results that meet the preset condition to the machine translation engine.
Preferably, the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview further includes:
Seventh step: acquiring the segment typed by the translator together with the corresponding memory library segments and terms, using them to intervene in the machine translation decoding result, and returning the re-translated result to the translator.
Eighth step: after the translator confirms the translation of a sentence, storing the confirmed terms and memory library segments of that sentence in real time for subsequent intervention in the machine translation engine's results.
Ninth step: periodically using the original text and the translations confirmed by the translator for incremental learning of the machine translation model.
Preferably, the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview further includes: the machine translation engine performs incremental learning on the collected original text and translations and is updated accordingly, so that it outputs better translations the next time.
Preferably, the memory library stores previously confirmed translation sentence pairs, and the term library stores previously confirmed translation term pairs.
Preferably, the independent sentence segments have the same language direction and domain as the memory library and the term library.
Preferably, the preset condition is that the degree of match is above a predetermined threshold.
In summary, by building on interactive translation technology and further fusing the translator's memory library segments/term library and full-text preview information, the invention can greatly improve the accuracy of recommended translations and the translator's efficiency without constraining the translator's thinking. It can also make the translation process more enjoyable and strengthen the translator's sense of professional value, which in turn promotes the spread of cross-language knowledge and improves the efficiency with which information circulates in global society.
Drawings
The present invention and its attendant advantages and features will be more readily understood by reference to the following detailed description when considered in conjunction with the accompanying drawings, wherein:
FIG. 1 schematically illustrates a flow chart of the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview according to a preferred embodiment of the present invention.
FIG. 2 is a schematic diagram of the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview according to the preferred embodiment of the present invention.
FIG. 3 schematically illustrates the effect of the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview according to the preferred embodiment of the present invention.
It is to be noted, however, that the appended drawings illustrate rather than limit the invention. The drawings representing structures may not be drawn to scale. In the drawings, the same or similar elements are denoted by the same or similar reference numerals.
Detailed Description
In order that the present disclosure may be more clearly and readily understood, reference will now be made in detail to the present disclosure as illustrated in the accompanying drawings.
As shown in FIG. 1, FIG. 2 and FIG. 3, the auxiliary translation method integrating translator input, memory library segments/term library and real-time preview according to the preferred embodiment of the present invention includes:
First step S1: parsing an editable document uploaded by a translator and splitting it into independent sentence segments, one sentence per segment.
For example, when a translator uploads a document in a format such as docx/doc/xlsx/xls/pptx/ppt/pdf, the document is parsed and split into independent sentence segments, one sentence per segment.
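The splitting in step S1 can be illustrated with a minimal sketch. The patent does not say how sentence boundaries are detected, so the punctuation-based rule below and the function name `split_into_segments` are assumptions; the text is taken to have already been extracted from the uploaded file.

```python
import re

# Assumed boundary rule (not from the patent): split right after CJK
# sentence-final punctuation, or on whitespace that follows ., ! or ?.
_BOUNDARY = re.compile(r"(?<=[。！？])|(?<=[.!?])\s+")


def split_into_segments(text: str) -> list[str]:
    """Split extracted document text into independent sentence segments."""
    parts = _BOUNDARY.split(text)
    # Drop empty pieces produced by a boundary at the very end of the text.
    return [part.strip() for part in parts if part and part.strip()]
```

A rule this simple will also split after abbreviations such as "e.g. "; a production parser would need format-aware extraction and a more careful boundary model.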
Second step S2: associating the independent sentence segments with a memory library and a term library, and extracting memory library segments in batches so that they can be fused into the translations recommended to the translator during interactive translation.
Preferably, the memory library stores previously confirmed translation sentence pairs, and the term library stores previously confirmed translation term pairs. Preferably, the independent sentence segments have the same language direction and domain as the memory library and the term library.
For example, the independent sentence segments are associated with the memory library (previously confirmed translation sentence pairs) and the term library (previously confirmed translation term pairs) that the translator has accumulated in the system and that share the same language direction and domain.
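One way to picture the association in step S2 is a lookup keyed by language direction and domain. This is only a sketch: the names `ResourceKey`, `TranslatorResources` and `select_resources`, and the in-memory dictionary standing in for the system's storage, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResourceKey:
    """Language direction plus domain (hypothetical key layout)."""
    source_lang: str
    target_lang: str
    domain: str


@dataclass
class TranslatorResources:
    # Memory library: confirmed source sentence -> confirmed translation.
    memory: dict[str, str] = field(default_factory=dict)
    # Term library: confirmed source term -> confirmed target term.
    terms: dict[str, str] = field(default_factory=dict)


def select_resources(
    store: dict[ResourceKey, TranslatorResources],
    source_lang: str,
    target_lang: str,
    domain: str,
) -> TranslatorResources:
    """Return the libraries matching the document's direction and domain."""
    key = ResourceKey(source_lang, target_lang, domain)
    # Fall back to empty libraries when nothing has been accumulated yet.
    return store.get(key, TranslatorResources())
```

Because the key includes the direction, a zh-to-en library is never offered for an en-to-zh document, which mirrors the requirement that the libraries share the segments' language direction and domain.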
Third step S3: displaying the parsed document in an HTML format that can be previewed and edited in a browser, and performing word segmentation on the parsed sentence segments for term library matching.
For example, the parsed document is presented in an HTML format that can be previewed and edited in a browser, while the parsed sentence segments are further word-segmented for term library matching.
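The two halves of step S3 can be sketched as follows. The markup (a `seg` class, a `data-seg-id` attribute, `contenteditable` spans) is an assumption about what an editable preview might look like, and forward maximum matching is a stand-in technique chosen here for illustration; the patent names neither.

```python
from html import escape


def render_preview(segments: list[str]) -> str:
    """Render segments as editable spans (hypothetical markup)."""
    spans = [
        f'<span class="seg" data-seg-id="{i}" contenteditable="true">'
        f"{escape(seg)}</span>"
        for i, seg in enumerate(segments)
    ]
    return "<p>" + "".join(spans) + "</p>"


def find_terms(sentence: str, term_library: dict[str, str]) -> list[tuple[str, str]]:
    """Find term library entries using forward maximum matching."""
    if not term_library:
        return []
    max_len = max(len(term) for term in term_library)
    hits: list[tuple[str, str]] = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first so longer terms win over shorter ones.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if candidate in term_library:
                hits.append((candidate, term_library[candidate]))
                i += size
                break
        else:
            i += 1  # no term starts at this position
    return hits
```

Escaping each segment keeps source text containing `<` or `&` from breaking the preview, and the per-segment id gives the editor a handle for writing confirmed translations back into the right place.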
Fourth step S4: matching the parsed sentence segments and the word segmentation results against the associated memory library and/or term library, marking the sentences that meet a preset condition, and sending them to a machine translation engine.
Preferably, the preset condition is that the degree of match is above a predetermined threshold.
For example, the parsed sentence segments and word segmentation results are matched against the associated memory library and/or term library, and sentences whose degree of match is above a predetermined threshold are marked and sent to the machine translation engine.
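The threshold test in step S4 might look like the sketch below. The patent does not define how the degree of match is computed, so `difflib.SequenceMatcher.ratio()` is used here purely as a stand-in similarity measure, and the default threshold of 0.75 and the "at or above" comparison are assumptions.

```python
from difflib import SequenceMatcher
from typing import Optional


def best_memory_match(
    segment: str,
    memory: dict[str, str],
    threshold: float = 0.75,  # hypothetical value; the patent gives none
) -> Optional[tuple[str, str, float]]:
    """Return (source, translation, score) for the best match, or None."""
    best: Optional[tuple[str, str, float]] = None
    for source, translation in memory.items():
        # Stand-in similarity in [0, 1]; 1.0 means the strings are identical.
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best
```

A segment for which this returns a result would be marked and passed to the machine translation engine together with the matched pair; a segment with no result would go to the engine without memory library support.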
Fifth step S5: the machine translation engine returns the translation.
Preferably, the translation and the original text are presented in an editor for the translator to preview and translate.
Preferably, the following sixth step may also be performed:
Sixth step S6: matching against the memory library and the term library sentence by sentence according to the translator's cursor position, and sending the matching results that meet the preset condition to the machine translation engine so that the decoder can use them to intervene in the machine translation output (that is, as the translator types, the machine translation engine recommends a new translation in real time based on the input).
For example, if there is a match above a certain threshold, it is sent to the machine translation engine so that the decoder can intervene in the machine translation output.
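A rough sketch of the cursor-driven lookup in step S6 follows. It assumes the cursor is a character offset into the concatenated segments and that matches have already been scored per segment; the names `Match`, `segment_index_at` and `matches_for_cursor` and the 0.75 threshold are hypothetical, and a real editor would more likely track the active segment by its id.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Match:
    source: str   # memory library sentence or term library entry
    target: str   # its confirmed translation
    score: float  # degree of match in [0, 1]


def segment_index_at(segments: list[str], cursor: int) -> int:
    """Map a character offset in the concatenated text to a segment index."""
    starts: list[int] = []
    offset = 0
    for seg in segments:
        starts.append(offset)
        offset += len(seg)
    if not segments or not 0 <= cursor <= offset:
        raise ValueError("cursor is outside the document")
    return bisect_right(starts, cursor) - 1


def matches_for_cursor(
    segments: list[str],
    cursor: int,
    matches_by_segment: dict[int, list[Match]],
    threshold: float = 0.75,  # hypothetical value
) -> tuple[int, list[Match]]:
    """Select the matches for the sentence under the cursor that pass the threshold."""
    index = segment_index_at(segments, cursor)
    hits = [m for m in matches_by_segment.get(index, []) if m.score >= threshold]
    return index, hits
```

Only the returned matches would be forwarded to the engine, so the decoder is influenced by resources relevant to the sentence the translator is currently working on.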
Preferably, the following seventh to ninth steps may also be performed:
Seventh step S7: acquiring the segment typed by the translator together with the corresponding memory library segments and terms, using them to intervene in the machine translation decoding result, and returning the re-translated result to the translator.
Eighth step S8: after the translator confirms the translation of a sentence, storing the confirmed terms and memory library segments of that sentence in real time for subsequent intervention in the machine translation engine's results.
Ninth step S9: periodically using the original text and the translations confirmed by the translator for incremental learning of the machine translation model.
The interaction is repeated until the translation of the sentence is complete. After the translator clicks the confirm button, the edited sentence segment is treated as the confirmed translation and is stored in the database used for subsequent machine translation model training. The confirmed translation is backfilled into the preview pane in real time so that the translator can judge both the translation and its formatting.
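What happens when the translator clicks confirm can be summarized in a small sketch. The `Session` class and its fields are hypothetical in-memory stand-ins for the memory library, term library, training database and preview pane described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    memory: dict[str, str] = field(default_factory=dict)   # memory library
    terms: dict[str, str] = field(default_factory=dict)    # term library
    training_pairs: list[tuple[str, str]] = field(default_factory=list)
    preview: dict[int, str] = field(default_factory=dict)  # segment id -> shown text

    def confirm(
        self,
        seg_id: int,
        source: str,
        translation: str,
        confirmed_terms: Optional[dict[str, str]] = None,
    ) -> None:
        """Record a confirmed translation everywhere it is needed."""
        # Store the pair at once so later sentences can match against it.
        self.memory[source] = translation
        self.terms.update(confirmed_terms or {})
        # Queue the pair for later incremental learning of the model.
        self.training_pairs.append((source, translation))
        # Backfill the preview so the translator sees the result in context.
        self.preview[seg_id] = translation
```

Doing all four updates in one call keeps the libraries, the training queue and the preview consistent after every confirmation.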
Preferably, the machine translation engine performs incremental learning on the collected original text and translations and is updated accordingly, so that it outputs better translations the next time.
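The periodic update can be sketched as a batch job over the confirmed pairs. How the engine is actually fine-tuned is not described in the patent, so `train_fn` below is a hypothetical callback standing in for that step, and the `min_batch` trigger is likewise an assumption.

```python
from typing import Callable

Pair = tuple[str, str]  # (original text, confirmed translation)


def run_incremental_update(
    pending: list[Pair],
    train_fn: Callable[[list[Pair]], None],
    min_batch: int = 2,  # hypothetical trigger size
) -> int:
    """Hand accumulated pairs to train_fn once enough have been collected.

    Returns the number of pending pairs consumed (0 if the batch was too small).
    """
    if len(pending) < min_batch:
        return 0
    # Drop exact duplicate pairs while keeping the original order.
    batch = list(dict.fromkeys(pending))
    train_fn(batch)
    consumed = len(pending)
    pending.clear()
    return consumed
```

A scheduler could call this function at a fixed interval; when too few pairs have accumulated it does nothing, so short sessions do not trigger an update.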
By building on interactive translation technology and further fusing the translator's memory library/term library and full-text preview information, the method can greatly improve the translator's efficiency without constraining the translator's thinking. It can also make the translation process more enjoyable and strengthen the translator's sense of professional value, which in turn promotes the spread of cross-language knowledge and improves the efficiency with which information circulates in global society.
In addition, it should be noted that, unless otherwise specified, the terms "first", "second", "third" and the like in the specification are used to distinguish the various components, elements, steps and the like, and do not indicate a logical or sequential relationship between them.
It is to be understood that while the present invention has been described in conjunction with its preferred embodiments, it is not limited to those embodiments. Those skilled in the art may, based on this disclosure, make many changes and modifications to the embodiments, or modify them into equivalents, without departing from the scope of the invention. Therefore, any simple modification, equivalent change or adaptation made to the above embodiments according to the technical essence of the present invention remains within the scope of protection of the technical solution of the present invention, provided that it does not depart from the content of that technical solution.

Claims (9)

1. An auxiliary translation method integrating translator input, memory library segments/term library and real-time preview, characterized by comprising the following steps:
first step: parsing an editable document uploaded by a translator and splitting it into independent sentence segments, one sentence per segment;
second step: associating the independent sentence segments with a memory library and a term library, and extracting memory library segments in batches so that they can be fused into the translations recommended to the translator during interactive translation;
third step: displaying the parsed document in an HTML format that can be previewed and edited in a browser, and performing word segmentation on the parsed sentence segments for term library matching;
fourth step: matching the parsed sentence segments and the word segmentation results against the associated memory library and/or term library, marking the sentences that meet a preset condition, and sending them to a machine translation engine;
fifth step: the machine translation engine returns the translation.
2. The auxiliary translation method according to claim 1, wherein in the fifth step the translation and the original text are displayed in an editor for the translator to preview and translate.
3. The auxiliary translation method according to claim 1 or 2, further comprising:
sixth step: matching against the memory library and the term library sentence by sentence according to the translator's cursor position, and sending the matching results that meet the preset condition to the machine translation engine.
4. The auxiliary translation method according to claim 1 or 2, further comprising:
seventh step: acquiring the segment typed by the translator together with the corresponding memory library segments and terms, using them to intervene in the machine translation decoding result, and returning the re-translated result to the translator;
eighth step: after the translator confirms the translation of a sentence, storing the confirmed terms and memory library segments of that sentence in real time for subsequent intervention in the machine translation engine's results;
ninth step: periodically using the original text and the translations confirmed by the translator for incremental learning of the machine translation model.
5. The auxiliary translation method according to claim 1 or 2, further comprising: the machine translation engine performs incremental learning on the collected original text and translations and is updated accordingly.
6. The auxiliary translation method according to claim 1 or 2, wherein the memory library stores previously confirmed translation sentence pairs.
7. The auxiliary translation method according to claim 1 or 2, wherein the term library stores previously confirmed translation term pairs.
8. The auxiliary translation method according to claim 1 or 2, wherein the independent sentence segments have the same language direction and domain as the memory library and the term library.
9. The auxiliary translation method according to claim 1 or 2, wherein the preset condition is that the degree of match is above a predetermined threshold.
CN202111155095.0A 2021-09-29 2021-09-29 Auxiliary translation method Pending CN113947094A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155095.0A CN113947094A (en) 2021-09-29 2021-09-29 Auxiliary translation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111155095.0A CN113947094A (en) 2021-09-29 2021-09-29 Auxiliary translation method

Publications (1)

Publication Number Publication Date
CN113947094A true CN113947094A (en) 2022-01-18

Family

ID=79329896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155095.0A Pending CN113947094A (en) 2021-09-29 2021-09-29 Auxiliary translation method

Country Status (1)

Country Link
CN (1) CN113947094A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114841175A (en) * 2022-04-22 2022-08-02 北京百度网讯科技有限公司 Machine translation method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150120273A1 (en) * 2013-10-28 2015-04-30 Translation Management Systems Ltd. Networked language translation system and method
CN107885737A (en) * 2017-12-27 2018-04-06 传神语联网网络科技股份有限公司 A kind of human-computer interaction interpretation method and system
CN110543644A (en) * 2019-09-04 2019-12-06 语联网(武汉)信息技术有限公司 Machine translation method and device containing term translation and electronic equipment
CN112541365A (en) * 2020-12-21 2021-03-23 语联网(武汉)信息技术有限公司 Machine translation method and device based on term replacement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
大辞科技: "memoQ单机版入门指南", pages 1 - 25, Retrieved from the Internet <URL:https://www.datalsp.com/archives/15920> *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Guanghua

Inventor before: Li Guanghua

Inventor before: Zhang Na