CN111401000B

CN111401000B - Real-time translation previewing method for online auxiliary translation

Info

Publication number: CN111401000B
Application number: CN202010260294.7A
Authority: CN
Inventors: 陈件; 张井; 成延; 刘旻
Original assignee: Shanghai Yizhe Information Technology Co ltd
Current assignee: Shanghai Yizhe Information Technology Co ltd
Priority date: 2020-04-03
Filing date: 2020-04-03
Publication date: 2023-06-20
Anticipated expiration: 2040-04-03
Also published as: CN111401000A

Abstract

The invention discloses a real-time translation preview method for online assisted translation, in the field of computer-aided translation. The method comprises: converting the original document into HTML through a file format converter; parsing the original text and dividing it into sentence Segments; embedding the element id of each sentence Segment into the sub-tags of the converted HTML using a cyclic recursive algorithm, so that Segments and sub-tags correspond one to one; and, at the front end, linking the sentence Segments with the HTML through the HTML DOM nodes so that the translation can be previewed in real time. The algorithm renders the translation produced during assisted translation in a browser in real time for the translator to check and consult, which saves translation time and improves efficiency.

Description

Real-time translation previewing method for online auxiliary translation

Technical Field

The invention relates to the field of computer-aided translation, and in particular to a real-time translation preview method for online assisted translation.

Background

Contemporary computer-aided translation extracts the text from a document, translates it into the specified target language, and then fills the translated text back into the document. Typically, the translator cannot view the original text and the translated text of the document within the editor while translating. The conventional approach is to convert the original document into HTML with a file converter and render that HTML in a browser for the translator to view. However, the translation that the translator produces while editing cannot be viewed in real time.

Disclosure of Invention

The embodiment of the invention provides a real-time translation preview method for online assisted translation. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.

According to a first aspect of an embodiment of the present invention, there is provided a real-time translation preview method for online assisted translation, comprising the following steps:

converting the original document into HTML format;

parsing the original text and dividing it into an array of sentence Segments;

embedding the element id of each sentence Segment into the sub-tags of the HTML file using a cyclic recursive algorithm, so as to form a one-to-one correspondence;

and linking the sentence Segments with the HTML through the HTML DOM nodes, so that the translation can be previewed in real time.

Preferably, the original document format is one of doc, docx, rtf, xls, xlsx, ppt, pptx, pdf, sxw, stw, sxc, or stc.

Preferably, the original document is converted into HTML format using Word's own conversion function or another third-party tool.

Preferably, when the original text is parsed and divided into the array of sentence Segments, each sentence Segment is a word, a phrase, or a sentence.

Preferably, the Segment array is a Segment list that records the text content of each Segment and its corresponding text label.

Preferably, the cyclic recursive algorithm comprises the following steps:

defining an Atom class that has two types, Tag and text;

defining the content of each Segment in the Segment list as the text of an Atom, and the label of the Segment as the Tag of an Atom;

reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;

and mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with an id mapping.

Preferably, the Atom class is a custom class.

Preferably, each Segment is composed of one or more Atoms.

Preferably, each HTML sub-tag is formed from the Tag of an Atom.

Preferably, the sentence Segments are linked with the HTML by embedding the Tag of the Atom into the HTML sub-tag.

The technical scheme provided by the embodiments of the invention may have the following beneficial effects:

The invention provides an algorithm that renders the translation produced during assisted translation in a browser in real time for the translator to check and consult, which saves translation time and improves efficiency. As shown in FIG. 7, while translating the 181st sentence, the translator can see in real time how the translated sentence appears in the translated document.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a schematic diagram of the real-time translation preview method for online assisted translation according to an exemplary embodiment;

FIG. 2 is a logic diagram of the cyclic recursive algorithm according to an exemplary embodiment;

FIG. 3 is an example of an original document according to an exemplary embodiment;

FIG. 4 is a schematic diagram of the sentence Segments obtained by dividing the text by sentence, according to an exemplary embodiment;

FIG. 5 is a diagram illustrating the conversion of an original document into HTML through a file format converter, according to an exemplary embodiment;

FIG. 6 is a schematic diagram showing the transUnitId embedded in a tag, according to an exemplary embodiment;

FIG. 7 is a view of the real-time translation preview according to an exemplary embodiment.

Description of the embodiments

The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. The embodiments represent only possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of others. The scope of embodiments of the invention encompasses the full ambit of the claims, as well as all available equivalents of the claims. Embodiments may be referred to herein, individually or collectively, by the term "invention" merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed. Relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed. The embodiments are described progressively, each focusing on its differences from the others; for parts that are identical or similar, the embodiments may be referred to one another. Because the structures and products disclosed in the embodiments correspond to the methods disclosed in the embodiments, their description is relatively brief, and the relevant details can be found in the description of the method.

It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.

It should be noted that although the steps of the methods of the present disclosure are illustrated in the accompanying drawings in a particular order, this does not require or imply that the steps must be performed in that particular order or that all of the illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.

The invention is further described below with reference to the accompanying drawings and examples:

As shown in FIG. 1, the real-time translation preview method for online assisted translation comprises the following steps:

S1: converting the original document into HTML format;

S2: parsing the original text and dividing it into an array of sentence Segments;

S3: embedding the element id of each sentence Segment into an HTML sub-tag using a cyclic recursive algorithm, so as to form a one-to-one correspondence;

S4: linking the sentence Segments with the HTML through the HTML DOM nodes, so that the translation can be previewed in real time.

Further to the above scheme, the original document may be in Word, Excel, PPT, or PDF format. In the example shown in FIG. 3, the original document is a Word document.

In a specific embodiment, the file format conversion shown in FIG. 5 may use Word's own conversion function or other third-party open-source tools.
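
As an illustration of the third-party route, the sketch below builds a LibreOffice command line for the conversion. The patent does not name a specific tool; the choice of LibreOffice, the assumption that its `soffice` executable is on the PATH, and the function names `build_convert_command` and `convert_to_html` are all assumptions made for this example.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: Path, out_dir: Path) -> list[str]:
    """Build a LibreOffice command line that converts `source` to HTML.

    Assumption: LibreOffice is installed and `soffice` is on the PATH.
    """
    return [
        "soffice", "--headless",      # run without a user interface
        "--convert-to", "html",       # target format
        "--outdir", str(out_dir),     # directory for the converted file
        str(source),
    ]


def convert_to_html(source: Path, out_dir: Path) -> Path:
    """Run the converter and return the path where the HTML is expected."""
    subprocess.run(build_convert_command(source, out_dir), check=True)
    # LibreOffice names the output after the input file's stem.
    return out_dir / (source.stem + ".html")
```

Keeping the command construction separate from the call to `subprocess.run` makes it possible to inspect or log the command without running the converter.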

Further to the above scheme, the text is divided by sentence into sentence Segments, and a Segment may also be a word or a phrase. As shown in FIG. 3, the Word document contains: test. test, sentence 2: fast.
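
The patent does not specify how the text is split into sentences. The following is a minimal sketch of one common approach, splitting at sentence-ending punctuation with a regular expression; the function name `split_sentences` and the punctuation rules are assumptions, and a production segmenter would also need to handle abbreviations and similar cases.

```python
import re

# Split after ., ! or ? when followed by whitespace, or directly after
# the CJK full stop, exclamation mark, or question mark.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+|(?<=[。！？])")


def split_sentences(text: str) -> list[str]:
    """Return the sentences of `text`, with surrounding whitespace removed."""
    parts = _SENTENCE_BOUNDARY.split(text)
    return [part.strip() for part in parts if part.strip()]
```

Requiring whitespace after Latin punctuation keeps numbers such as 3.14 in one piece, at the cost of missing sentences that are run together without a space.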

Further to the above scheme, the Segment array is a Segment list that records the text content of each Segment and its corresponding text label. As shown in FIG. 4, in the code implementation the two sentences are defined as two objects, segment1 and Segment2, where transUnitId is the sentence label and srcAtom is the sentence content.
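
A Segment list of this kind could be sketched as follows. The field names `transUnitId` and `srcAtom` come from the description above; storing `srcAtom` as a plain string, numbering the labels from 1, and the helper `build_segment_list` are simplifying assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    transUnitId: str  # sentence label
    srcAtom: str      # sentence content (simplified here to a plain string)


def build_segment_list(sentences: list[str]) -> list[Segment]:
    """Create one Segment per sentence, labelled "1", "2", ... (assumed scheme)."""
    return [
        Segment(transUnitId=str(index), srcAtom=sentence)
        for index, sentence in enumerate(sentences, start=1)
    ]
```

For example, `build_segment_list(["Hello.", "Goodbye."])` yields two Segment objects whose labels are "1" and "2".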

Further to the above scheme, FIG. 2 shows the logic of the cyclic recursive algorithm in a specific embodiment. The algorithm comprises the following steps:

S31: defining an Atom class that has two types, Tag and text;

S32: defining the content of each Segment in the Segment list as the text of an Atom, and the label of the Segment as the Tag of an Atom;

S33: reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;

S34: mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with an id mapping.
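
Steps S31 to S34 leave many details open, so the sketch below is only one possible reading of them. It uses a plain loop and does not attempt recursion into nested tags. The names `AtomType`, `embed_ids`, and `text_pool`, the choice of a `span` element, and the use of the `id` attribute to carry the transUnitId are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from html import escape


class AtomType(Enum):
    TAG = "tag"
    TEXT = "text"


@dataclass
class Atom:
    type: AtomType
    value: str  # markup for TAG Atoms, plain text for TEXT Atoms


@dataclass
class Segment:
    transUnitId: str
    atoms: list[Atom]


def embed_ids(segments: list[Segment]) -> tuple[list[str], dict[str, str]]:
    """Return HTML sub-tags carrying Segment ids, plus an id-to-text mapping."""
    sub_tags: list[str] = []
    id_map: dict[str, str] = {}
    for segment in segments:
        text_pool: list[str] = []
        for atom in segment.atoms:
            # Only TEXT Atoms enter the text pool; TAG Atoms are skipped
            # in this simplified sketch.
            if atom.type is AtomType.TEXT:
                text_pool.append(atom.value)
        text = "".join(text_pool)
        id_map[segment.transUnitId] = text
        # Assumption: the id is carried in the span's id attribute.
        sub_tags.append(f'<span id="{segment.transUnitId}">{escape(text)}</span>')
    return sub_tags, id_map
```

Each returned sub-tag corresponds to exactly one Segment, which is the one-to-one correspondence that the preview relies on.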

In a specific embodiment, the Atom class is a custom class rather than a built-in class.

Further to the above scheme, each Segment is composed of one or more Atoms.

In a specific embodiment, each HTML sub-tag is formed from the Tag of an Atom.

Further to the above scheme, the linkage between the Segments and the HTML is implemented by embedding the Tag of the Atom into the HTML sub-tag. As shown in FIG. 6, for the translated content of sentence 1 to be displayed in the HTML page in real time, the first span tag under the p tag in the figure must be located. The simplest approach is to embed the transUnitId in that tag.
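
In the patent this lookup happens at the front end, through the DOM nodes of the HTML in the browser. The sketch below only emulates the same idea in Python on a small, well-formed fragment: it finds the span that carries a given transUnitId and replaces its content with the translation. The function name `update_preview` and the use of the `id` attribute are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def update_preview(html: str, trans_unit_id: str, translation: str) -> str:
    """Replace the content of the span whose id equals `trans_unit_id`.

    Assumption: `html` is a well-formed fragment with a single root element.
    """
    root = ET.fromstring(html)
    node = root.find(f".//span[@id='{trans_unit_id}']")
    if node is None:
        raise KeyError(trans_unit_id)
    # Drop any nested markup so that only the translation remains.
    for child in list(node):
        node.remove(child)
    node.text = translation
    return ET.tostring(root, encoding="unicode")
```

Because every Segment maps to exactly one tagged element, an edit to one sentence touches only that element and leaves the rest of the page unchanged.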

The real-time translation preview method for online assisted translation renders the translation produced during assisted translation in the browser in real time for the translator to check and consult, which saves translation time and improves efficiency. As shown in FIG. 7, while translating the 181st sentence, the translator can see in real time how the translated sentence appears in the translated document.

It is to be understood that the invention is not limited to the arrangements and instrumentality shown in the drawings and described above, and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. A real-time translation preview method for online assisted translation, characterized by comprising the following steps:

converting the original document into HTML format;

parsing the original text and dividing it into an array of sentence Segments, wherein each sentence Segment is a word, a phrase, or a sentence, and the Segment array is a Segment list that records the text content of each Segment and its corresponding text label;

linking the sentence Segments with the HTML through the HTML DOM nodes, so that the translation can be previewed in real time;

wherein the cyclic recursive algorithm comprises the following steps:

defining an Atom class that has two types, Tag and text, the Atom class being a custom class;

mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with an id mapping;

wherein each Segment is composed of one or more Atoms, and each HTML sub-tag is formed from the Tag of an Atom;

and wherein the sentence Segments are linked with the HTML by embedding the Tag of the Atom into the HTML sub-tag.

2. The real-time translation preview method for online assisted translation according to claim 1, wherein the original document format is one of doc, docx, rtf, xls, xlsx, ppt, pptx, pdf, sxw, stw, sxc, or stc.

3. The real-time translation preview method for online assisted translation according to claim 2, wherein the original document is converted into HTML format using Word's own conversion function or another third-party tool.