CN107122337B - Translation document generation method and device - Google Patents

Translation document generation method and device

Info

Publication number
CN107122337B
Authority
CN
China
Prior art keywords
formatting
translation
user
formatted
contents
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610101441.XA
Other languages
Chinese (zh)
Other versions
CN107122337A (en)
Inventor
李鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610101441.XA
Publication of CN107122337A
Application granted
Publication of CN107122337B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/103Formatting, i.e. changing of presentation of documents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application relates to the technical field of computers, in particular to a translation document generation method and device, which are used for solving the problems of high production cost and low production efficiency and accuracy of a translation document. The method for generating the translation document provided by the embodiment of the application comprises the following steps: displaying N types of formatted contents corresponding to the translation contents to a user according to the formatting type selected by the user and the translation contents needing formatting processing so that the user can edit the contents; and generating a translation file after formatting processing based on the formatted content edited by the user and the formatted information template corresponding to the formatting type under the target template format. Because the formatting treatment of the translated file can be directly carried out through the interface prompt, the use training of a target template format is not required to be specially carried out on a translator, the translation cost is reduced, and the translation efficiency and the translation accuracy can be greatly improved compared with a manual formatting mode.

Description

Translation document generation method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a translation document generation method and apparatus.
Background
With the internationalization of information technology, more and more software has to serve international users. Since software developers are generally not proficient in many languages, after a developer writes the text of a piece of software in one language, a specialized translator translates it into other languages. Here, the language of the software refers to the characters, words and sentences that the software displays to the user; the original text is written in the source language, and the text obtained by translating it into the target language is the translation.
Software text often contains statements that need to be formatted. For example, when the notification statement "you have {number} messages" is translated into English, formatting is needed: the translation has to describe how the singular or plural form of the word "message" depends on the number, so that at run time the software presents an accurate notification according to the actual number of messages. If the user is notified of one message, the singular form "message" is used; if the user is notified of several messages, the plural form "messages" is used.
The ICU Message Format is a widely used parsing and compiling tool that solves the language-format problems encountered in software internationalization through a specific template format. Using the ICU Message Format requires some programming background, which is a barrier for translators who do not have one. When a document is delivered to a translator for translation, the translator therefore has to be trained in the ICU Message Format in order to learn how to format the translated sentences.
Therefore, at present, because the translator needs to be trained to use the parsing and compiling tools, the production cost of the translated documents is high, the production efficiency is low, and the translator is likely to perform wrong formatting treatment due to unfamiliarity with the use of the parsing and compiling tools, so that the production accuracy of the translated documents is low.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a translation document, which are used for solving the problems of high production cost, low production efficiency and low accuracy of the translation document.
The embodiment of the application provides a method for generating a translation document, which comprises the following steps:
determining a formatting type selected by a user and translation contents which need to be formatted aiming at the formatting type;
displaying N types of formatted contents corresponding to the translation contents to the user aiming at the formatting types and the translation contents so that the user can edit the N types of formatted contents; n is a positive integer;
and generating a translation file after formatting processing based on the formatted content edited and processed by the user and the formatted information template corresponding to the formatting type under the format of the target template.
Optionally, determining the format type selected by the user includes:
and determining the formatting type selected by the user through the formatting type button clicked by the user, or determining the formatting type selected by the user through a pull-down prompt box.
Optionally, determining the translation content that needs to be formatted and selected by the user includes:
and determining the translation content needing to be formatted, which is selected by the user on the translation file which is not formatted.
Optionally, for the formatting type and the translated content, displaying N types of formatted content corresponding to the translated content to the user, including:
querying N formatting conditions of the formatting type in the target language in a language data warehouse;
and displaying N types of formatted contents to be corrected corresponding to the translation contents based on the inquired N types of formatting conditions of the formatting type so as to prompt the user to correct the displayed N types of formatted contents to be corrected.
Optionally, after generating the formatted translation document, the method further includes:
and generating and displaying example data corresponding to the formatted translation file so that the user can judge whether the generated translation file is correct.
The embodiment of the present application provides a translation document generation apparatus, including:
the determining module is used for determining the formatting type selected by the user and the translation content which needs to be formatted according to the formatting type;
the display module is used for displaying N types of formatted contents corresponding to the translation contents to the user according to the formatting types and the translation contents so that the user can edit the N types of formatted contents; n is a positive integer;
and the generating module is used for generating a translation file after formatting processing based on the formatting content edited and processed by the user and the formatting information template corresponding to the formatting type under the format of the target template.
According to the method and the device, the formatted content corresponding to the translation content is displayed to the user according to the formatting type and the translation content selected by the user, the user edits the formatted content, and then the formatted translation file is generated based on the formatted content edited and processed by the user and the formatted information template corresponding to the formatting type under the format of the target template. Because the scheme of the application can directly carry out formatting treatment on the translation file through interface prompt, a translator does not need to specially train the use of a target template format, the translation cost is reduced, and compared with a manual formatting mode, the mode of automatically generating the translation file can greatly improve the translation efficiency and the translation accuracy.
Drawings
FIG. 1 is a flowchart of a method for generating a translated document according to an embodiment of the present application;
FIG. 2(a) is a diagram illustrating a translation document before formatting;
FIG. 2(b) is a diagram illustrating a translation document after formatting;
FIG. 3 is a schematic structural diagram of a translation document generation apparatus according to an embodiment of the present application.
Detailed Description
After the developer develops the translation document of the software, the developer generally submits the translation document to a translator of the outsourcing company for translation. In order to solve the problem of formatting the translation document during the internationalization of software, the translation document is generally produced by using an analysis and compilation tool such as an ICU Message Format, and the ICU Message Format meets the formatting requirement of the translation document through a specific template Format. Because the translator needs to be trained specially to master the use of the ICU Message Format, the cost of producing translation documents in the existing manner is high.
The embodiment of the application improves on the ICU Message Format by applying the data in CLDR to its use. Here, CLDR is a widely used language data repository containing language-related information for countries around the world. According to the embodiment of the application, a visual auxiliary operation interface is added to the ICU Message Format, and the language information in CLDR is used to prompt the user during the formatting operation, so that the user only needs to follow the interface prompts and edit the formatted contents that correspond to the translation content under a specific formatting type. The embodiment of the application pre-stores a formatting information template for each formatting type in the ICU Message Format, and can automatically generate the formatted translation document from the formatted contents edited by the user under the specific formatting type and the formatting information template of that type.
The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.
As shown in fig. 1, a flowchart of a method for generating a translation document provided in the embodiment of the present application includes the following steps:
S101: determining the formatting type selected by the user and the translation content that needs to be formatted for that formatting type.
In a specific implementation, after finding a formatting problem (such as a singular/plural problem) in the translation content, the translator may click the formatting type button that corresponds to the problem (here, the singular/plural processing button) among several formatting type buttons; alternatively, a pull-down prompt box may be provided, through which the user selects the formatting type. In addition, the user can select, on the translation file that has not yet been formatted, the translation content that needs to be formatted.
S102: displaying N types of formatted contents corresponding to the translation contents to the user aiming at the formatting types and the translation contents so that the user can edit the N types of formatted contents; n is a positive integer.
In a specific implementation, the language data warehouse (CLDR data) can be queried for the N formatting cases of the formatting type in the target language (for example, the singular/plural type in English has two cases: the singular case, in which the number is 1, and the plural case, in which the number is greater than 1). Based on the N formatting cases returned by the query, N formatted contents to be corrected that correspond to the translation content are displayed (here, one singular-form content and one plural-form content), prompting the user to correct them. The formatted contents shown on the operation interface may be generated by the client according to a preset random generation rule and then corrected by the user; for example, both the singular and the plural form may initially be generated as "messages".
As shown in FIG. 2(a), after the user inputs the translation "you have {count} new messages" and finds that "{count} new messages" needs singular/plural processing, the user may be prompted to click the singular/plural processing button "add a new processing"; the user is then prompted to select, in the input text, the translation content that needs singular/plural processing. After the user selects "{count} new messages" in "you have {count} new messages", the client automatically lists the singular/plural contents that the user needs to process for the selected translation content, according to the singular/plural cases of the target language in the language data warehouse (for example, 2 singular/plural cases exist in English and 5 in Russian). For English there are two formatting cases: one in which the number (count) is 1 (one), and another in which the number is greater than 1 (other).
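The lookup described in S102 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PLURAL_CATEGORIES` table is a hand-written stand-in for a query against CLDR data, the function name `draft_formatted_contents` is hypothetical, and the Japanese and Arabic entries are added here only to show that N varies by language. Every draft simply starts as a copy of the selected text, which is one possible generation rule among many.

```python
# Hand-written stand-in for a CLDR plural-rules lookup (assumption: a real
# tool would read these categories from CLDR data instead of hard-coding them).
PLURAL_CATEGORIES = {
    "en": ["one", "other"],
    "ja": ["other"],
    "ar": ["zero", "one", "two", "few", "many", "other"],
}


def draft_formatted_contents(language: str, selected_text: str) -> dict[str, str]:
    """Return one editable draft per plural category of the target language."""
    categories = PLURAL_CATEGORIES[language]
    # Each draft starts as a copy of the selected text; the translator then
    # corrects it (for example, changing "messages" to "message" for "one").
    return {category: selected_text for category in categories}


drafts = draft_formatted_contents("en", "{count} new messages")
print(drafts)
```

For English this yields two drafts (N = 2), keyed by `one` and `other`, which the interface would show to the translator for correction.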
S103: generating a formatted translation file based on the formatted content edited by the user and the formatting information template corresponding to the formatting type in the target template format.
As shown in FIG. 2(b), based on the singular form "message" used when count is one, the plural form "messages" used when count is other, and the formatting information template corresponding to the singular/plural type in the target template format (e.g. the template format of the ICU Message Format), the formatted translation document is "You have {count, plural, one {{count} new message} other {{count} new messages}}".
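Step S103 can be sketched as a small string-assembly function. The sketch assumes the formatting information template for the singular/plural type has the shape `{variable, plural, category {content} ...}`, which is the ICU plural-argument syntax; the function name `build_plural_message` and its parameters are hypothetical and do not come from the patent.

```python
def build_plural_message(source: str, selected: str, variable: str,
                         edited: dict[str, str]) -> str:
    """Replace `selected` inside `source` with an ICU-style plural argument."""
    if selected not in source:
        raise ValueError("selected text must occur in the source string")
    # Template for the plural formatting type:
    # {variable, plural, category {content} category {content} ...}
    branches = " ".join(f"{cat} {{{text}}}" for cat, text in edited.items())
    plural_argument = f"{{{variable}, plural, {branches}}}"
    return source.replace(selected, plural_argument, 1)


message = build_plural_message(
    source="You have {count} new messages",
    selected="{count} new messages",
    variable="count",
    edited={"one": "{count} new message", "other": "{count} new messages"},
)
print(message)
# You have {count, plural, one {{count} new message} other {{count} new messages}}
```

Because the translator only supplies the edited contents, the braces and keywords of the template are never typed by hand, which is where the accuracy gain described in the text would come from.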
S104: generating and displaying example data corresponding to the formatted translation file so that the user can judge whether the generated translation file is correct.
Here, after generating the formatted translation document, in order to further reduce the error rate of the translation, the client may feed back the execution effect of the formatted translation document to the user, which may help the user to judge the accuracy of the translation document in time.
As shown in FIG. 2(b), the example data fed back to the user is "You have 1 new message", "You have 2 new messages", "You have 3 new messages" and "You have 4 new messages". The execution result is correct, which indicates that there is no problem in the translation document.
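The example-data feedback of S104 can be sketched by rendering the generated message for a few sample counts. The code below is a deliberately small evaluator written for this illustration, not the ICU library: it handles a single plural argument, assumes well-formed braces, and hard-codes the English rule ("one" for 1, "other" otherwise). All function names are hypothetical.

```python
def english_category(n: int) -> str:
    # English cardinal rule for integers: "one" for 1, otherwise "other".
    return "one" if n == 1 else "other"


def split_branches(body: str) -> dict[str, str]:
    """Parse 'one {...} other {...}' into {category: content}, honoring nesting."""
    branches, i = {}, 0
    while i < len(body):
        if body[i].isspace():
            i += 1
            continue
        open_at = body.index("{", i)
        key = body[i:open_at].strip()
        depth, j = 1, open_at + 1
        while depth:  # walk to the brace that closes this branch
            depth += {"{": 1, "}": -1}.get(body[j], 0)
            j += 1
        branches[key] = body[open_at + 1:j - 1]
        i = j
    return branches


def render(message: str, variable: str, n: int) -> str:
    """Pick the branch for n and substitute the variable placeholder."""
    marker = "{" + variable + ", plural,"
    start = message.index(marker)
    depth, end = 1, start + 1
    while depth:  # find the end of the whole plural argument
        depth += {"{": 1, "}": -1}.get(message[end], 0)
        end += 1
    branches = split_branches(message[start + len(marker):end - 1])
    chosen = branches[english_category(n)]
    text = message[:start] + chosen + message[end:]
    return text.replace("{" + variable + "}", str(n))


msg = "You have {count, plural, one {{count} new message} other {{count} new messages}}"
samples = [render(msg, "count", n) for n in (1, 2, 3, 4)]
for line in samples:
    print(line)
# You have 1 new message
# You have 2 new messages
# You have 3 new messages
# You have 4 new messages
```

Showing such rendered samples lets a translator spot a wrong branch (for instance "1 new messages") without reading the template syntax.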
According to the embodiment of the application, the production of the translated documents by the translator is assisted through the visual interface tool, and by adopting the embodiment of the application, the translator does not need to learn the specific Format of the ICU Message Format, and can directly produce the translated documents of the Message Format meeting the software development requirements through interface prompt and interface operation, so that the translation cost is reduced. In addition, the formatted translation text can be automatically generated based on the formatted information template, and compared with a manual formatting mode, the translation accuracy can be greatly improved. Moreover, the embodiment of the application can also feed back the actual file execution effect to the user, so that the user can conveniently confirm the accuracy of the translated file in time.
Based on the same inventive concept, the embodiment of the present application further provides a device for generating a translation document corresponding to the method for generating a translation document, and because the principle of solving the problem of the device is similar to that of the method for generating a translation document in the embodiment of the present application, the implementation of the device can refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 3, a schematic structural diagram of a translation document generation apparatus provided in the embodiment of the present application includes:
a determining module 31, configured to determine a formatting type selected by a user, and a translation content that needs to be formatted for the formatting type;
a display module 32, configured to display, for the formatting type and the translation content, N types of formatting contents corresponding to the translation content to the user, so that the user can edit the N types of formatting contents; n is a positive integer;
and a generating module 33, configured to generate a formatted translation document based on the formatted content edited and processed by the user and the formatted information template corresponding to the formatting type in the target template format.
Optionally, the determining module 31 is specifically configured to:
and determining the formatting type selected by the user through the formatting type button clicked by the user, or determining the formatting type selected by the user through a pull-down prompt box.
Optionally, the determining module 31 is specifically configured to:
and determining the translation content needing to be formatted, which is selected by the user on the translation file which is not formatted.
Optionally, the display module 32 is specifically configured to:
querying N formatting conditions of the formatting type in the target language in a language data warehouse; and displaying N types of formatted contents to be corrected corresponding to the translation contents based on the inquired N types of formatting conditions of the formatting type so as to prompt the user to correct the displayed N types of formatted contents to be corrected.
Optionally, the generating module 33 is further configured to:
and generating and displaying example data corresponding to the formatted translation file so that the user can judge whether the generated translation file is correct.
With the above translation document generation device, the formatting content corresponding to the translation content is displayed to the user for the formatting type and the translation content selected by the user, the user edits the formatting content, and then the translation document after the formatting process is generated based on the formatting content edited by the user and the formatting information template corresponding to the formatting type in the target template format. Because the scheme of the application can directly carry out formatting treatment on the translation file through interface prompt, a translator does not need to specially train the use of a target template format, the translation cost is reduced, and compared with a manual formatting mode, the mode of automatically generating the translation file can greatly improve the translation efficiency and the translation accuracy.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method for generating a translation document, the method comprising:
determining a formatting type selected by a user and translation contents which need to be formatted aiming at the formatting type;
displaying N types of formatted contents corresponding to the translation contents to the user aiming at the formatting types and the translation contents so that the user can edit the N types of formatted contents; n is a positive integer and represents that N formatting conditions exist corresponding to the formatting type;
and generating a translation document after the formatting processing based on the formatting content edited and processed by the user and the formatting information template corresponding to the formatting type under the format of the target template, and feeding back the execution effect of the translation document after the formatting processing.
2. The method of claim 1, wherein determining the user-selected formatting type comprises:
and determining the formatting type selected by the user through the formatting type button clicked by the user, or determining the formatting type selected by the user through a pull-down prompt box.
3. The method of claim 1 or 2, wherein determining the translation selected by the user for formatting comprises:
and determining the translation content needing to be formatted, which is selected by the user on the translation file which is not formatted.
4. The method of claim 1, wherein displaying to the user, for the formatting type and translated content, N formatted content corresponding to the translated content comprises:
querying N formatting conditions of the formatting type in the target language in a language data warehouse;
and displaying N types of formatted contents to be corrected corresponding to the translation contents based on the inquired N types of formatting conditions of the formatting type so as to prompt the user to correct the displayed N types of formatted contents to be corrected.
5. The method of claim 1, wherein after generating the formatted translation, further comprising:
and generating and displaying example data corresponding to the formatted translation file so that the user can judge whether the generated translation file is correct.
6. A translation document generation apparatus, comprising:
the determining module is used for determining the formatting type selected by the user and the translation content which needs to be formatted according to the formatting type;
the display module is used for displaying N types of formatted contents corresponding to the translation contents to the user according to the formatting types and the translation contents so that the user can edit the N types of formatted contents; n is a positive integer and represents that N formatting conditions exist corresponding to the formatting type;
and the generating module is used for generating a translation document after the formatting processing based on the formatting content edited and processed by the user and the formatting information template corresponding to the formatting type under the format of the target template, and feeding back the execution effect of the translation document after the formatting processing.
7. The apparatus of claim 6, wherein the determination module is specifically configured to:
and determining the formatting type selected by the user through the formatting type button clicked by the user, or determining the formatting type selected by the user through a pull-down prompt box.
8. The apparatus of claim 6 or 7, wherein the determining module is specifically configured to:
and determining the translation content needing to be formatted, which is selected by the user on the translation file which is not formatted.
9. The apparatus of claim 6, wherein the display module is specifically configured to:
querying N formatting conditions of the formatting type in the target language in a language data warehouse; and displaying N types of formatted contents to be corrected corresponding to the translation contents based on the inquired N types of formatting conditions of the formatting type so as to prompt the user to correct the displayed N types of formatted contents to be corrected.
10. The apparatus of claim 6, wherein the generation module is further to:
and generating and displaying example data corresponding to the formatted translation file so that the user can judge whether the generated translation file is correct.
CN201610101441.XA 2016-02-24 2016-02-24 Translation document generation method and device Active CN107122337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610101441.XA CN107122337B (en) 2016-02-24 2016-02-24 Translation document generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610101441.XA CN107122337B (en) 2016-02-24 2016-02-24 Translation document generation method and device

Publications (2)

Publication Number Publication Date
CN107122337A CN107122337A (en) 2017-09-01
CN107122337B true CN107122337B (en) 2021-02-02

Family

ID=59717655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610101441.XA Active CN107122337B (en) 2016-02-24 2016-02-24 Translation document generation method and device

Country Status (1)

Country Link
CN (1) CN107122337B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111798190B (en) * 2019-04-03 2024-01-23 阿里巴巴集团控股有限公司 Method and system for processing translation document
CN112287652A (en) * 2020-06-29 2021-01-29 南京易杰智信息科技有限公司 Method, system and device for translating formatted pictures and texts

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7568152B1 (en) * 2000-07-14 2009-07-28 International Business Machines Corporation Text file interface support in an object oriented application
GB2444084A (en) * 2006-11-23 2008-05-28 Sharp Kk Selecting examples in an example based machine translation system
JP2008210022A (en) * 2007-02-23 2008-09-11 Nec Corp System and method for automatically creating conversion template, conversion template composition server, and program
CN102193914A (en) * 2011-05-26 2011-09-21 中国科学院计算技术研究所 Computer aided translation method and system

Also Published As

Publication number Publication date
CN107122337A (en) 2017-09-01

Legal Events

Code Title / Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 1243518; country of ref document: HK)
GR01 Patent grant