CN111401000B - Real-time translation previewing method for online auxiliary translation - Google Patents
Real-time translation previewing method for online auxiliary translation
- Publication number
- CN111401000B
- Authority
- CN
- China
- Prior art keywords
- translation
- atom
- html
- segment
- tag
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a real-time translation preview method for online assisted translation, relating to the field of computer-aided translation. The method comprises: converting the original text file into HTML with a file format converter; parsing the original text and dividing it by sentence into sentence Segments; embedding the element ids of the sentence Segments into the sub-tags of the converted HTML using a cyclic recursion algorithm, forming a one-to-one correspondence; and, at the front end, linking the sentence Segments to the HTML through the HTML's DOM nodes, so that the translation can be previewed in real time. The algorithm renders the translation produced during assisted translation in a browser in real time for the translator to check and reference, saving considerable translation time and markedly improving efficiency.
Description
Technical Field
The invention relates to the field of computer-aided translation, in particular to a real-time translation preview method for online-aided translation.
Background
Contemporary computer-aided translation extracts the text from a document, translates it into the specified target language, and then fills the translated text back into the document. Typically, a translator cannot view the original text and the translated text of the document being translated in an editor during translation. The conventional approach converts the original document into HTML with a file conversion tool and renders the HTML in a browser for the translator to view. However, the translations that the translator produces while editing cannot be viewed in real time.
Disclosure of Invention
The embodiment of the invention provides a real-time translation preview method for online assisted translation. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
According to a first aspect of an embodiment of the present invention, there is provided a real-time translation preview method for online assisted translation, comprising the following steps:
converting the original text file into an HTML format;
parsing the original text and dividing it by sentence into an array of sentence Segments;
embedding the element id of each sentence Segment into a sub-tag of the HTML-format file by using a cyclic recursion algorithm, forming a one-to-one correspondence;
linking the sentence Segments to the HTML through the DOM nodes of the HTML, so that the translation can be previewed in real time.
Preferably, the original document format is doc, docx, rtf, xls, xlsx, ppt, pptx, pdf, sxw, stw, sxc, or stc.
Preferably, the original document is converted into the HTML format by using Word's own conversion function or another third-party tool.
Preferably, when the original text is parsed and divided into the array of sentence Segments, each Segment is a word, a phrase, or a sentence.
Preferably, the Segment array is a Segment list that records the text content of each Segment and its corresponding text label.
Preferably, the cyclic recursion algorithm comprises the following steps:
defining an Atom class with two types, Tag and text;
defining the Segment content in the Segment list as the text of an Atom, and the label of the Segment as the Tag of an Atom;
reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;
mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with id mapping.
Preferably, the Atom class is a custom class.
Preferably, each Segment is composed of one or more Atoms.
Preferably, each HTML sub-tag is composed of the Tag of an Atom.
Preferably, the sentence Segments are linked to the HTML by embedding the Tag of the Atom into the HTML sub-tag.
The technical solution provided by the embodiments of the invention may have the following beneficial effects:
The invention provides an algorithm that renders the translation produced during assisted translation in a browser in real time for the translator to check and reference, saving considerable translation time and markedly improving efficiency. As shown in FIG. 7, while translating the 181st sentence, the translator can see in real time how the translated sentence appears in the translated document.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of a real-time preview method of a translation of an online assisted translation according to an exemplary embodiment;
FIG. 2 is a logic diagram of the cyclic recursion algorithm shown in accordance with an exemplary embodiment;
FIG. 3 is an exemplary diagram of an original document shown in accordance with an exemplary embodiment;
FIG. 4 is a schematic diagram of sentence segments divided by sentences, shown according to an example embodiment;
FIG. 5 is a diagram illustrating conversion of an original document into HTML through file format conversion, according to an exemplary embodiment;
FIG. 6 is a schematic diagram showing embedding a transUnitId in a tag according to an exemplary embodiment;
FIG. 7 is a live preview effect view of a translation shown in accordance with an exemplary embodiment.
Description of the embodiments
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them. The embodiments represent only possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of others. The scope of embodiments of the invention encompasses the full ambit of the claims, as well as all available equivalents of the claims. Embodiments may be referred to herein, individually or collectively, by the term "invention" merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed. Relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed. The embodiments are described herein progressively, each focusing on its differences from the others; for identical or similar parts, the embodiments may be referred to one another. The structures, products and the like disclosed in the embodiments correspond to the method parts disclosed in the embodiments, so their description is relatively brief; for the relevant details, refer to the description of the method parts.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
It should be noted that although the steps of the methods of the present disclosure are illustrated in the accompanying drawings in a particular order, this does not require or imply that the steps must be performed in that particular order or that all of the illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
The invention is further described below with reference to the accompanying drawings and examples:
As shown in FIG. 1, the real-time translation preview method for online assisted translation comprises the following steps:
S1: converting the original text file into an HTML format;
S2: parsing the original text and dividing it by sentence into an array of sentence Segments;
S3: embedding the element id of each sentence Segment into an HTML sub-tag by using a cyclic recursion algorithm, forming a one-to-one correspondence;
S4: linking the sentence Segments to the HTML through the DOM nodes of the HTML, so that the translation can be previewed in real time.
According to the above scheme, further, the original document may be a Word, Excel, PowerPoint, or PDF file; in the example shown in FIG. 3, the original document is a Word document.
In particular embodiments, the file format conversion shown in FIG. 5 may use Word's own conversion function or other third-party open-source tools.
According to the above scheme, further, the original text is divided by sentence into sentence Segments, each of which may be a word or a phrase. As shown in FIG. 3, the Word document contains two sentences, sentence 1: test. and sentence 2: fast.
According to the above scheme, further, the Segment array is a Segment list that records the text content of each Segment and its corresponding text label. As shown in FIG. 4, in the code implementation the two sentences are defined as two objects, Segment1 and Segment2, where transUnitId is the sentence label and srcAtom is the sentence content.
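The Segment list described above might be modelled roughly as follows. This is a minimal sketch rather than the implementation shown in FIG. 4: the field names `transUnitId` and `srcAtom` come from the description, while the `segment_text` helper, the punctuation-based splitting rule, the plain-string `srcAtom`, and the sample text are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    transUnitId: str  # sentence label, as named in the description
    srcAtom: str      # sentence content, kept as a plain string in this sketch


def segment_text(text: str) -> List[Segment]:
    """Split text into sentence Segments (hypothetical helper).

    Assumption: a sentence ends at '.', '!', '?' or a full-width equivalent.
    """
    sentences = re.findall(r"[^.!?。！？]+[.!?。！？]*", text)
    cleaned = [s.strip() for s in sentences if s.strip()]
    # Number the sentences from 1 so each Segment gets a unique label.
    return [Segment(str(i), s) for i, s in enumerate(cleaned, start=1)]


# Hypothetical two-sentence document, mirroring the two-object example.
for seg in segment_text("This is a test. Fast."):
    print(seg.transUnitId, seg.srcAtom)
```

A production segmenter would need rules for abbreviations, decimal numbers, and language-specific punctuation; the regular expression here only keeps the example short.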
According to the above scheme, further, FIG. 2 shows the logic diagram of the cyclic recursion algorithm in a specific embodiment. The cyclic recursion algorithm specifically includes the following steps:
S31: defining an Atom class with two types, Tag and text;
S32: defining the Segment content in the Segment list as the text of an Atom, and the label of the Segment as the Tag of an Atom;
S33: reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;
S34: mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with id mapping.
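One plausible reading of the four steps above is sketched below. The source does not show the actual code, so everything here is an assumption made for illustration: the `Atom` fields (`kind`, `data`, `children`), the function names `collect_text` and `build_sub_tags`, and the choice of a `span` element carrying the Segment label in its `id` attribute. The loop visits each Atom, text Atoms go into the text pool, and the recursion descends into Atoms nested inside a tag.

```python
from dataclasses import dataclass, field
from html import escape
from typing import Dict, List


@dataclass
class Atom:
    kind: str   # "tag" or "text" (the two Atom types)
    data: str   # tag name for "tag", text content for "text"
    children: List["Atom"] = field(default_factory=list)  # used by "tag" only


def collect_text(atoms: List[Atom], pool: List[str]) -> List[str]:
    """Loop over the Atoms; pool text Atoms, recurse into tag Atoms."""
    for atom in atoms:
        if atom.kind == "text":
            pool.append(atom.data)
        else:
            collect_text(atom.children, pool)
    return pool


def build_sub_tags(segments: Dict[str, List[Atom]]) -> Dict[str, str]:
    """Map each Segment label to an HTML sub-tag that carries the label as id."""
    sub_tags = {}
    for trans_unit_id, atoms in segments.items():
        text = "".join(collect_text(atoms, []))
        sub_tags[trans_unit_id] = (
            f'<span id="{escape(trans_unit_id)}">{escape(text)}</span>'
        )
    return sub_tags


# Hypothetical input: Segment 2 wraps its text in a bold tag.
segments = {
    "1": [Atom("text", "This is a test.")],
    "2": [Atom("tag", "b", [Atom("text", "Fast.")])],
}
for label, tag in build_sub_tags(segments).items():
    print(label, tag)
```

The sketch discards inline formatting tags when it builds the sub-tag; whether the patented method preserves them is not stated in the text.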
In a specific embodiment, the Atom class is a custom class rather than a built-in class.
According to the above scheme, further, each Segment is composed of one or more Atoms.
In a specific embodiment, each HTML sub-tag is formed by the Tag of an Atom.
According to the above scheme, further, the Segments are linked to the HTML by embedding the Tag of the Atom into the HTML sub-tag. As shown in FIG. 6, to display the translated content of sentence 1 in the HTML page in real time, the first span tag under the p tag in the figure must be located. The simplest approach is to embed the transUnitId in that tag.
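The lookup-and-replace step can be imitated outside a browser as follows. In a real front end the element would typically be found with the DOM call `document.getElementById` and its text replaced; this sketch instead uses Python's standard `xml.etree.ElementTree` on a small, well-formed HTML fragment. The function name `preview_translation`, the sample markup, and the use of the `id` attribute to hold the transUnitId are assumptions, not details taken from FIG. 6.

```python
import xml.etree.ElementTree as ET


def preview_translation(html_fragment: str, trans_unit_id: str, translation: str) -> str:
    """Replace the text of the span whose id equals trans_unit_id.

    Assumption: the fragment is well-formed enough to parse as XML.
    """
    root = ET.fromstring(html_fragment)
    for span in root.iter("span"):
        if span.get("id") == trans_unit_id:
            # Drop any nested inline elements, then show the translation.
            for child in list(span):
                span.remove(child)
            span.text = translation
            break
    return ET.tostring(root, encoding="unicode", method="html")


# Hypothetical converted HTML with one span per Segment.
page = '<p><span id="1">This is a test.</span><span id="2">Fast.</span></p>'
print(preview_translation(page, "1", "Ceci est un test."))
```

Because each Segment label maps to exactly one element, updating the preview after every edit only touches a single node instead of re-rendering the whole document.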
The real-time translation preview method for online assisted translation renders the translation produced during assisted translation in the browser in real time for the translator to check and reference, saving considerable translation time and markedly improving efficiency. As shown in FIG. 7, while translating the 181st sentence, the translator can see in real time how the translated sentence appears in the translated document.
It is to be understood that the invention is not limited to the arrangements and instrumentality shown in the drawings and described above, and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (3)
1. A real-time translation preview method for online assisted translation, characterized by comprising the following steps:
converting the original text file into an HTML format;
parsing the original text and dividing it by sentence into an array of sentence Segments, wherein each Segment is a word, a phrase, or a sentence, and the Segment array is a Segment list that records the text content of each Segment and its corresponding text label;
embedding the element id of each sentence Segment into a sub-tag of the HTML-format file by using a cyclic recursion algorithm, forming a one-to-one correspondence;
linking the sentence Segments to the HTML through the DOM nodes of the HTML, so that the translation can be previewed in real time;
wherein the cyclic recursion algorithm comprises the following steps:
defining an Atom class with two types, Tag and text, the Atom class being a custom class;
defining the Segment content in the Segment list as the text of an Atom, and the label of the Segment as the Tag of an Atom;
reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;
mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with id mapping;
wherein each Segment is composed of one or more Atoms, and each HTML sub-tag is composed of the Tag of an Atom;
and wherein the sentence Segments are linked to the HTML by embedding the Tag of the Atom into the HTML sub-tag.
2. The real-time translation preview method for online assisted translation according to claim 1, wherein the original document format is doc, docx, rtf, xls, xlsx, ppt, pptx, pdf, sxw, stw, sxc, or stc.
3. The real-time translation preview method for online assisted translation according to claim 2, wherein the original document is converted into the HTML format by using Word's own conversion function or another third-party tool.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010260294.7A CN111401000B (en) | 2020-04-03 | 2020-04-03 | Real-time translation previewing method for online auxiliary translation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111401000A CN111401000A (en) | 2020-07-10 |
CN111401000B CN111401000B (en) | 2023-06-20 |
Family
ID=71434942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010260294.7A Active CN111401000B (en) | 2020-04-03 | 2020-04-03 | Real-time translation previewing method for online auxiliary translation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401000B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985255A (en) * | 2020-09-01 | 2020-11-24 | 北京中科凡语科技有限公司 | Translation method, translation device, electronic device and storage medium |
CN113705158A (en) * | 2021-09-26 | 2021-11-26 | 上海一者信息科技有限公司 | Method for intelligently restoring original text style in document translation |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102388383A (en) * | 2006-12-08 | 2012-03-21 | 帕特里克·J·霍尔 | Online computer-aided translation |
CN102567384A (en) * | 2010-12-29 | 2012-07-11 | 盛乐信息技术(上海)有限公司 | Webpage multi-language dynamic switching method and system based on webpage browser engine |
CN102929867A (en) * | 2011-11-03 | 2013-02-13 | 微软公司 | Technology used for automatically translating a document |
CN104965866A (en) * | 2015-06-05 | 2015-10-07 | 小米科技有限责任公司 | Method and apparatus for establishing label and style rule binding relation |
CN105069000A (en) * | 2015-08-24 | 2015-11-18 | 中译语通科技(北京)有限公司 | Interactive prediction input method |
CN105468697A (en) * | 2015-11-18 | 2016-04-06 | 成都优译信息技术有限公司 | Automatic positioning method used for translation teaching system |
CN105573969A (en) * | 2006-10-02 | 2016-05-11 | 谷歌公司 | Displaying original text in a user interface with translated text |
CN105760542A (en) * | 2016-03-15 | 2016-07-13 | 腾讯科技(深圳)有限公司 | Display control method, terminal and server |
CN106649271A (en) * | 2016-12-19 | 2017-05-10 | 成都优译信息技术股份有限公司 | Translation-based word document analysis method |
CN107885735A (en) * | 2017-11-21 | 2018-04-06 | 语联网(武汉)信息技术有限公司 | A kind of unrelated document translation method and system of form |
CN109145260A (en) * | 2018-08-24 | 2019-01-04 | 北京科技大学 | A kind of text information extraction method |
CN110263351A (en) * | 2019-06-17 | 2019-09-20 | 深圳前海微众银行股份有限公司 | A kind of multi-language translation method of webpage, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111401000A (en) | 2020-07-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||