CN111967274A - Label conversion processing method and device, electronic equipment and readable storage medium - Google Patents

Label conversion processing method and device, electronic equipment and readable storage medium

Info

Publication number
CN111967274A
Authority
CN
China
Prior art keywords
tag
label
target
name
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010867140.4A
Other languages
Chinese (zh)
Other versions
CN111967274B (en)
Inventor
郭云辉
陈海燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wensihai Huizhike Technology Co ltd
Original Assignee
Wensihai Huizhike Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wensihai Huizhike Technology Co ltd
Priority to CN202010867140.4A
Publication of CN111967274A
Application granted
Publication of CN111967274B
Active (current legal status)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/42: Data-driven translation
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a label conversion processing method, a label conversion processing device, electronic equipment and a readable storage medium, wherein the method comprises the following steps: identifying a target label in a file to be translated, wherein the target label is a label with a logical paired label characteristic; judging the label type of the target label according to the label type characteristics in the logical paired label characteristics, the label type comprising a logic start label or a logic end label; determining a replacement tag name of the target tag according to the tag name of the target tag; generating an update tag based on the tag type and the replacement tag name so that the update tag can be recognized as a formally paired tag; and replacing the target tag with the update tag. Because the target tag is replaced with an update tag that can be recognized as part of a formal pair, the translator can correctly place independent tags that logically belong to a tag pair during translation, which lowers the computer expertise required of the translator and improves the quality and efficiency of the translation operation.

Description

Label conversion processing method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a tag conversion processing method and apparatus, an electronic device, and a readable storage medium.
Background
Files often contain tags, which can formally be divided into paired tags (i.e., a start tag and an end tag) and independent tags. To help the translator identify and handle these tags, and to reduce the technical knowledge required of the translator, a CAT (Computer-Assisted Translation) tool renders the tags with different appearances (different shapes at the left and right ends). For example, a start tag is rendered as the icon shown in Figure BDA0002649963770000011, an end tag is rendered as the icon shown in Figure BDA0002649963770000012, and an independent tag is rendered as "■". The translator can therefore judge the tag category accurately from the tag appearance, which ensures that the start tag and the end tag are correctly paired and wrap the proper text. The translator relies on this function even more when the word order must be adjusted in the translated text because the source language and the target language differ.
However, in some classes of documents there exists a special class of tags that take the form of independent tags but whose function is to turn a certain characteristic or action on or off, or to indicate the start or end position of a certain range. That is, they logically belong to a tag pair. Current CAT tools render such logically paired special independent tags with the same "■" appearance as ordinary independent tags. This easily causes the translator to mistake these special independent tags for ordinary independent tags, which may leave tags that cannot be paired (e.g., a tag with a start function is placed after the tag with the end function, or one member of the pair is missing). To avoid such errors, the translator needs a certain amount of technical knowledge about the encoding of the corresponding file format in order to identify such special tags.
Disclosure of Invention
An object of the embodiments of the present application is to provide a tag conversion processing method, an apparatus, an electronic device, and a readable storage medium, so as to solve the problem that the related art cannot distinguish special independent tags that logically belong to a tag pair, which raises the computer expertise required of the translator and lowers the quality and efficiency of translation operations.
The embodiment of the application provides a label conversion processing method, which comprises the following steps: identifying a target label in a file to be translated, wherein the target label is a label with a logical pair label characteristic; judging the label type of the target label according to the label type characteristic in the logic pair label characteristic; the label type comprises a logic start label or a logic end label; determining a replacement tag name of the target tag according to the tag name of the target tag; generating an updated tag based on the tag type and the replacement tag name such that the updated tag can be recognized as a formally paired tag; replacing the target tag with the update tag.
In the embodiment of the application, the target label in the file to be translated is identified, and its label type is determined according to the label type characteristic in the logical pair label characteristics of the target label. Meanwhile, a corresponding replacement label name is determined according to the label name of the target label, an updated label that can be recognized as a formally paired label is generated according to the label type and the replacement label name, and finally the target label is replaced with the updated label. Because the target label is replaced with an updated label that can be recognized, independent labels that logically belong to a label pair are handled differently from ordinary independent labels during translation, so the translator can tell them apart. This lowers the computer expertise required of the translator and improves the quality and efficiency of the translation operation.
Further, identifying the target tag in the file to be translated comprises: acquiring a tag name and/or tag attribute of a tag in the file to be translated, and identifying whether the tag name and/or tag attribute contains the logic paired tag feature; and if the label name and/or the label attribute contain the logic paired label characteristic, determining that the label is the target label.
In practical applications, unlike an ordinary independent tag, an independent tag with the logical pair tag feature must be able to logically implement the function of a tag pair. Such a tag therefore usually carries a feature marking its logical pairing in its tag name and/or tag attribute, for example "start" or "end", whereas an independent tag without the logical pair tag feature (i.e., an ordinary independent tag) does not carry such a feature. Accordingly, in the implementation process, the target tag can be identified quickly and reliably by checking whether the tag name and/or the tag attribute contains the logical pair tag feature.
Further, determining an alternate tag name for the target tag based on the tag name of the target tag comprises: and determining a preset replacement tag name corresponding to the tag name of the target tag as the replacement tag name according to a preset corresponding relation.
Further, the generating an updated tag according to the tag type and the replacement tag name includes: if the tag type is a logic start tag, acquiring the target tag, and replacing the tag name of the target tag with the replacement tag name to obtain the updated tag; and if the tag type is a logic end tag, acquiring the target tag, replacing the tag name of the target tag with the replacement tag name, adding an end tag identifier before the replacement tag name, and defaulting the tag attribute of the target tag to obtain the updated tag.
It should be understood that, in practice, a CAT tool allows a start tag to omit its start identifier, provided that a corresponding end tag identifier follows it. For this reason, in the embodiment of the present application, the logical start tag may carry no start tag identifier while the logical end tag carries an end tag identifier, which simplifies the tags while still meeting the translation requirements. In addition, during CAT translation the end tag of a tag pair is required to omit its tag attributes. For this reason, in the present application, the tag attributes of the target tag are omitted for a logical end tag, so that the corresponding updated tag can be effectively identified by the CAT tool.
Further, the method further comprises: determining the file type of the file to be translated; and determining a logic pair tag characteristic for identifying the target tag according to the file type.
It should be understood that, in actual implementation, the special independent tags belonging to a tag pair may carry different logical pair tag features in different types of files. For example, for one type of file the logical pair tag features of the independent tags within a pair may be "start" and "end", while for another type of file they may be "open" and "close". Accordingly, in the implementation process, the logical pair tag feature used to identify the target tag is determined according to the file type of the file to be translated, which ensures that the target tag is identified effectively and that the scheme provided by the embodiment of the application is reliable.
Further, the method further comprises: judging whether the target label comprises a matching group distinguishing attribute; and if so, configuring the matched group distinguishing attribute in the updated label according to the matched group distinguishing attribute in the target label.
In an actual application process, there may be multiple pairs (i.e. multiple pairing groups) of independent tags belonging to a tag pair, and in order to distinguish the independent tags of different pairing groups, a pairing group distinguishing attribute is set in the tag, so as to distinguish the independent tags of different pairs belonging to a tag pair. Through the implementation process, the replaced update tag can realize corresponding functions according to the set pairing group after being translated, and logic errors in translation cannot be caused.
Further, the method further comprises: recording the corresponding relation between the target label and the updated label; if a translation completion instruction is received, replacing the updated tag with the target tag in a translation completion file; the translation completion file is a file after the translation of the file to be translated is completed, and a label in the file to be translated is reserved in the translation completion file.
In the implementation process, the updated tag in the translation finished file is replaced by the target tag, so that the consistency of the tag in the translation finished file and the tag in the original file can be ensured, and the translation finished file can be conveniently used for subsequent operation.
An embodiment of the present application further provides a tag conversion processing apparatus, including: the device comprises an identification module, a judgment module, a determination module and a replacement module; the identification module is used for identifying a target label in a file to be translated, wherein the target label is a label with a logical pair label characteristic; the judging module is used for judging the label type of the target label according to the label type characteristic in the logic paired label characteristics; the label type comprises a logic start label or a logic end label; the determining module is further used for determining a replacing label name of the target label according to the label name of the target label; the replacement module is used for generating an updated label according to the label type and the replacement label name so that the updated label can be identified as a formally paired label; replacing the target tag with the update tag.
An embodiment of the present application further provides an electronic device, including: a controller, a memory and a communication bus; the communication bus is used for realizing connection communication between the controller and the memory; the controller is configured to execute one or more programs stored in the memory to implement any of the above-described tag conversion processing methods.
An embodiment of the present application further provides a readable storage medium, which stores one or more programs, where the one or more programs are executable by one or more controllers to implement any of the above-described tag conversion processing methods.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a schematic basic flow chart of a tag conversion processing method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a special independent tag provided in an embodiment of the present application;
Fig. 3 is a schematic flowchart of a specific tag conversion processing method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a tag conversion processing apparatus according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
The first embodiment is as follows:
In order to solve the problem that the related art cannot distinguish special independent tags that logically belong to a tag pair, which raises the computer expertise required of the translator and lowers the quality and efficiency of the translation operation, an embodiment of the present application provides a tag conversion processing method, which is shown in Fig. 1 and includes:
S101: Identify a target tag in the file to be translated.
In the embodiment of the application, the file to be translated can be provided by an engineer according to the actual translation requirement, or read from a related file system according to the actual translation requirement.
In the embodiment of the present application, the target tag refers to a separate tag in the document to be translated that has a form of a separate tag, but logically belongs to a tag pair (i.e., functionally belongs to turning on or off a certain characteristic or action, or representing a start or end position of a certain range) (for convenience of description, the separate tag logically belonging to the tag pair is hereinafter referred to as a special separate tag in the present application).
In an embodiment of the application, the target tag is a tag with a logical pair tag feature. The logical pair tag feature is the part of the tag that lets a special independent tag implement the function of a tag pair and that distinguishes it from an ordinary independent tag, such as the features "start" and "end", "open" and "close", or "1" and "0".
It should be understood that in actual applications, the logical pair-tag characteristics of a particular individual tag may be different in different types of files. For example, for one type of file, the logical pair-wise tag characteristics of the individual tags within the pair of tags may be "start" and "end", while for another type of file, the logical pair-wise tag characteristics of the individual tags within the pair of tags may be "open" and "close".
Therefore, in the embodiment of the application, the corresponding relationship between each type of file and the logical paired tag feature may be recorded in advance, so that the file type of the file to be translated may be determined first, and then the corresponding logical paired tag feature may be determined according to the file type.
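The pre-recorded correspondence between file types and logical pair tag features can be pictured as a simple lookup table. The following is a minimal sketch under stated assumptions: the file-type keys ("xml", "tpl"), the use of the file extension as the file type, and the function name `features_for_file` are all hypothetical and are not taken from the application.

```python
# Hypothetical table: file type -> (start features, end features).
# The keys and feature words are illustrative assumptions only.
PAIR_FEATURES_BY_FILE_TYPE = {
    "xml": ({"start"}, {"end"}),
    "tpl": ({"open"}, {"close"}),
}


def features_for_file(filename: str):
    """Return (start_features, end_features) for a file to be translated.

    Assumption: the file type is taken from the file extension.
    """
    file_type = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    try:
        return PAIR_FEATURES_BY_FILE_TYPE[file_type]
    except KeyError:
        raise ValueError(
            f"no logical pair tag features recorded for file type {file_type!r}"
        )
```

Raising an error for an unknown file type, rather than falling back to a default feature set, is a design choice of this sketch; the application does not say how unrecorded file types are handled.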
In the embodiment of the application, a special independent tag can be recognized from a file to be translated based on the logical paired tag characteristics, and the recognized special independent tag is a target tag.
It should be noted that, referring to Fig. 2, in practical applications a special independent tag of the Processing Instruction element class has the structure <?target data?>, where "target" designates the processor that handles the instruction and "data" is the content of the instruction. For convenience of description, "target" is referred to as the tag name and "data" as the tag attribute in the present application. For example, in <?rb-cbt_start name="student"?> and <?rb-cbt_end?>, "rb-cbt_start" and "rb-cbt_end" are tag names, and name="student" is a tag attribute. In addition, there are special independent tags of the general element class; for example, in the special independent tags <bookmark type="start"/> and <bookmark type="end"/>, "bookmark" is the tag name, and type="start" and type="end" are tag attributes.
It should be understood that a tag may carry several specific attributes among its tag attributes. For example, it may include the two attributes type="start" and id="1" at the same time.
It should be noted that, in practical applications, because a special independent tag must be able to logically implement the function of a tag pair, it usually carries a feature identifying its logical pairing in the tag name and/or the tag attribute, whereas an independent tag without the logical pairing (i.e., an ordinary independent tag) carries no such feature. For example, in the bookmark example above, "start" and "end" in the tag attribute are the features that identify the logical pairing; as another example, for the special independent tags <?rb-cbt_start name="student"?> and <?rb-cbt_end?>, "start" and "end" in the tag names "rb-cbt_start" and "rb-cbt_end" are the features that identify the logical pairing. It should be understood that, in addition to these two cases, both the tag name and the tag attribute may carry features that identify the logical pairing, which is not further illustrated herein.
Therefore, in the embodiment of the application, the tag name and/or the tag attribute of the tag in the file to be translated can be obtained, and whether the tag name and/or the tag attribute contains the logical paired tag feature or not can be identified. If the tag name and/or the tag attribute contain logical paired tag characteristics, the tag can be determined to be the target tag. Therefore, the target label in the file to be translated can be quickly and reliably identified based on the distinguishing characteristics of the special independent label and the common independent label.
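The identification step can be sketched as a scan over the independent tags in the text that checks the tag name and the attribute values for a logical pair tag feature. This is a minimal illustration, assuming regular-expression parsing of processing instructions and empty-element tags; the names `TAG_RE`, `ATTR_RE`, and `find_target_tags` are hypothetical, and a real tool would more likely rely on a proper parser for the file format.

```python
import re

# Matches processing instructions (<?name data?>) and
# empty-element tags (<name data/>). Illustrative patterns only.
TAG_RE = re.compile(
    r"<\?(?P<pi_name>[\w:.-]+)(?P<pi_data>[^?]*)\?>"
    r"|<(?P<el_name>[\w:.-]+)(?P<el_data>[^<>]*)/>"
)
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')


def find_target_tags(text, features=("start", "end")):
    """Return (raw_tag, tag_name, attributes) for each special independent tag."""
    targets = []
    for m in TAG_RE.finditer(text):
        is_pi = m.group("pi_name") is not None
        name = m.group("pi_name") if is_pi else m.group("el_name")
        data = m.group("pi_data") if is_pi else m.group("el_data")
        attrs = dict(ATTR_RE.findall(data))
        # Feature in the tag name, e.g. "rb-cbt_start" -> ["rb", "cbt", "start"].
        name_parts = re.split(r"[_\-:.]", name.lower())
        in_name = any(f in name_parts for f in features)
        # Feature in an attribute value, e.g. type="start".
        in_attrs = any(v.lower() in features for v in attrs.values())
        if in_name or in_attrs:
            targets.append((m.group(0), name, attrs))
    return targets
```

With this sketch, an ordinary independent tag such as `<br/>` is skipped, while `<?rb-cbt_start name="student"?>` and `<bookmark type="start"/>` are both reported as target tags.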
S102: Determine the tag type of the target tag according to the tag type feature in the logical pair tag features.
In the actual application process, the conventional tag pair is composed of a start tag and an end tag, and the special independent tag is similar to the conventional tag pair, and a matching group is also required to be composed of a tag for realizing the start function and a tag for realizing the end function, so that the function of the complete tag pair can be logically realized.
Thus, in actual practice, a tag type feature would be present in a logical pair tag feature of a particular individual tag to identify whether the particular individual tag logically implements the beginning or ending function of the tag pair.
In the embodiment of the present application, the tag type feature for identifying that the special independent tag logically implements the start function of the tag pair is defined as a first type of tag type feature, such as "start", "open", and the like, and the tag type feature for identifying that the special independent tag logically implements the end function of the tag pair is defined as a second type of tag type feature, such as "end", "close", and the like.
When the label type characteristic in the logic paired label characteristic is identified to be the first type of label type characteristic, the label type of the target label can be determined to be the logic start label, and when the label type characteristic in the logic paired label characteristic is identified to be the second type of label type characteristic, the label type of the target label can be determined to be the logic end label.
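The mapping from tag type feature to tag type amounts to a two-way classification. A minimal sketch follows, assuming the two feature sets contain only the example words given above; the names `TagType`, `START_FEATURES`, `END_FEATURES`, and `classify` are hypothetical.

```python
from enum import Enum


class TagType(Enum):
    LOGICAL_START = "logical start tag"
    LOGICAL_END = "logical end tag"


# First-type and second-type tag type features (example words from the text).
START_FEATURES = {"start", "open"}
END_FEATURES = {"end", "close"}


def classify(tag_type_feature: str) -> TagType:
    """Map a tag type feature to a logical start or logical end tag type."""
    feature = tag_type_feature.lower()
    if feature in START_FEATURES:
        return TagType.LOGICAL_START
    if feature in END_FEATURES:
        return TagType.LOGICAL_END
    raise ValueError(f"{tag_type_feature!r} is not a known tag type feature")
```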
In the embodiment of the present application, the logical start tag and the logical end tag correspond to the start tag and the end tag of the conventional tag pair, respectively.
S103: a replacement tag name for the target tag is determined based on the tag name of the target tag.
In a possible implementation manner of the embodiment of the present application, replacement tag names corresponding to various types of tag names may be defined in advance, and a correspondence table between the tag names and the replacement tag names is configured, so that the corresponding replacement tag names may be automatically found according to the tag name of the target tag.
It should be noted that in this possible embodiment, the defined replacement tag name should satisfy the parsing rule of the computer-aided translation tool, so that the computer-aided translation tool can correctly recognize the replacement tag name. Illustratively, for the tag name "rb-cbt_start" of the special independent tag <?rb-cbt_start name="student"?>, the replacement tag name "Q_cbt" may be defined, which has the form of the tag name of a start tag in a conventional tag pair. In the implementation of the scheme, the replacement tag name determined for the tag name "rb-cbt_start" is then "Q_cbt", so that the computer-aided translation tool can accurately recognize the replacement tag.
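A correspondence table of this kind can be sketched as a dictionary lookup. In the sketch below, the entry for "rb-cbt_end" is inferred from the converted example text later in this embodiment, and the check against `NAME_RULE` is an assumption: the actual parsing rule depends on the computer-aided translation tool in use and is not specified in the application.

```python
import re

# Pre-defined correspondence between tag names and replacement tag names.
REPLACEMENT_NAMES = {
    "rb-cbt_start": "Q_cbt",
    "rb-cbt_end": "Q_cbt",
}

# Assumed parsing rule: an XML-like element name. Hypothetical stand-in for
# whatever rule the translation tool actually applies.
NAME_RULE = re.compile(r"[A-Za-z_][\w.-]*")


def replacement_name(tag_name: str) -> str:
    """Look up the replacement tag name and check it against the assumed rule."""
    new_name = REPLACEMENT_NAMES[tag_name]  # KeyError if no entry is configured
    if NAME_RULE.fullmatch(new_name) is None:
        raise ValueError(f"replacement name {new_name!r} violates the naming rule")
    return new_name
```

Mapping both members of a pairing group to the same replacement name is what later allows the updated tags to be recognized as a formal pair.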
In another possible implementation manner of the embodiment of the present application, an engineer may also configure a corresponding tag name conversion rule in advance, so that the tag name of the target tag is automatically converted into the replacement tag name according to the tag name conversion rule.
It should also be noted that in the present embodiment, the tag name conversion rule should be configured so that the converted replacement tag name satisfies the parsing rule of the computer-aided translation tool, so that the computer-aided translation tool can correctly recognize the replacement tag name. Illustratively, the tag name conversion rule may be configured to convert the tag name of the target tag into span, a native tag of HTML (HyperText Markup Language). For example, the tag name <?rh-cbt> may be converted to <span class="cbt">, so that the computer-aided translation tool can accurately recognize the replacement tag <span class="cbt">.
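A rule-based conversion of this kind could look like the sketch below. The concrete rule (drop a two-letter prefix such as "rh-" and any "_start"/"_end" suffix, then use the remainder as the class of a span tag) is an assumption chosen to reproduce the single example above; the application does not specify the rule, and the function name `to_span_start` is hypothetical.

```python
import re


def to_span_start(tag_name: str) -> str:
    """Convert a tag name into an HTML span start tag (illustrative rule only)."""
    core = re.sub(r"^[a-z]{2}-", "", tag_name)   # assumed: strip a prefix like "rh-"
    core = re.sub(r"_(start|end)$", "", core)    # assumed: strip the type feature
    return f'<span class="{core}">'


SPAN_END = "</span>"  # the matching end tag carries no attributes
```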
S104: an update tag is generated based on the tag type and the replacement tag name such that the update tag can be recognized as a formally paired tag.
S105: the target tag is replaced with the update tag.
In the embodiment of the present application, the content of the tag other than the tag name may be inherited or omitted when the updated tag is generated, depending on what the computer-aided translation tool can recognize.
In the embodiment of the application, for a target tag whose tag type is the logical start tag, the computer-aided translation tool allows the start tag identifier to be omitted, so the tag name of the target tag can be directly replaced by the replacement tag name to obtain the updated tag. Of course, a start tag identifier may also be added after the tag name of the target tag is replaced with the replacement tag name, so that the computer-aided translation tool can determine that the target tag is the logical start tag of a pair of special independent tags.
For the target tag with the tag type of the logical end tag, in order to ensure that the computer aided translation tool can accurately identify the two special independent tags forming the logical tag pair, not only the tag name of the target tag needs to be replaced by the replacement tag name, but also the end tag identifier needs to be added before the replacement tag name, so as to ensure that the computer aided translation tool can accurately identify the corresponding logical start tag and the corresponding logical end tag.
Illustratively, for text:
<?rb-cbt_start name="student teacher"?>Quiz for students&teachers<?rb-cbt_end?><?rb-cbt_start name="teacher"?>
Answer for teachers only<?rb-cbt_end?>
In the text, <?rb-cbt_start name="student teacher"?> and <?rb-cbt_end?> are a pair of special independent tags, and <?rb-cbt_start name="teacher"?> and <?rb-cbt_end?> are another pair of special independent tags. The text may be converted to:
<Q_cbt name="student teacher">Quiz for students&teachers</Q_cbt><Q_cbt name="teacher">
Answer for teachers only</Q_cbt>
It can be seen that the tag <?rb-cbt_start name="student teacher"?> is converted to <Q_cbt name="student teacher">, in which the identifier "_start" indicating a start tag is omitted, and the tag <?rb-cbt_end?> is converted to </Q_cbt>, which carries the end tag identifier "/". The computer-aided translation tool can thereby determine that the logical start tag is the preceding tag <Q_cbt name="student teacher"> and that the logical end tag is </Q_cbt>.
Similarly, the computer-aided translation tool can determine the logical start tag <Q_cbt name="teacher"> and the logical end tag </Q_cbt>, so that the text can be translated accurately.
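The conversion of the example text can be reproduced with a short sketch that combines the replacement tag name with the tag type. It is an illustration only: `TABLE`, `PI_RE`, and `convert` are hypothetical names, and the regular expression handles only processing-instruction style tags.

```python
import re

# Processing instruction: <?name data?>
PI_RE = re.compile(r"<\?(?P<name>[\w.-]+)(?P<data>[^?]*)\?>")

# Tag name -> (replacement tag name, tag type); values follow the example.
TABLE = {
    "rb-cbt_start": ("Q_cbt", "start"),
    "rb-cbt_end": ("Q_cbt", "end"),
}


def convert(text: str) -> str:
    """Replace each target tag with its updated, formally paired tag."""
    def to_update_tag(match):
        entry = TABLE.get(match.group("name"))
        if entry is None:
            return match.group(0)  # not a target tag: leave unchanged
        new_name, kind = entry
        if kind == "start":
            # Logical start tag: replace the name, inherit the attributes.
            return f"<{new_name}{match.group('data')}>"
        # Logical end tag: add "/" before the name, omit the attributes.
        return f"</{new_name}>"

    return PI_RE.sub(to_update_tag, text)
```

Applied to the two-line example text above, `convert` yields the converted text shown in this embodiment.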
Note that formally paired tags are tags that have the same tag name and carry an end identifier in the end tag, so that the computer-aided translation tool can recognize them as paired tags, such as <a> and </a>.
It is noted that computer-aided translation tools require the end tag to omit tag attributes. Therefore, for a special independent tag that is a logical end tag, its tag attributes must be omitted during conversion. For example, for the special independent tag <bookmark type="end"/>, the updated tag obtained after conversion can be </bookmark>, in which the tag attribute type="end" is omitted.
It is to be understood that there are often multiple sets of paired special independent tags in one text, such as the above example, with two sets of paired special independent tags.
In practical applications, besides the case in the previous example where multiple groups of paired special independent tags appear in sequence, groups of paired special independent tags may also be nested within each other, so the relationship between the special independent tags of each group needs to be made clear. To clarify which tags belong to which group, a pairing group distinguishing attribute is often set in the tag. For example, in the tag <w:bookmarkStart w:id="0">, w:id="0" is the pairing group distinguishing attribute, which identifies the tag as belonging to the pairing group with an id of 0.
In this embodiment of the application, if the target tag includes the pairing group distinction attribute, the pairing group distinction attribute in the update tag may be configured according to the pairing group distinction attribute in the target tag.
It should be understood that, in the embodiment of the present application, the pairing group distinguishing attribute of the updated tag may be inherited from the original tag, or may be regenerated through a specific conversion or encoding rule. For example, id="0" in the above example may be renumbered as id="1", or the id may take any other recognizable characters.
It is noted that the pairing group distinguishing attribute is itself a tag attribute, so for a special independent tag that is a logical end tag, it has to be omitted when the updated tag is generated. However, to enable the computer-aided translation tool to accurately determine which two special independent tags belong to the same pairing group, in the embodiment of the present application the generated pairing group distinguishing attribute may be added to the tag name as part of the tag name when the updated tag is generated. In this way, the two special independent tags of the same pairing group can be accurately identified while the recognition requirements of the computer-aided translation tool are still met.
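As a minimal sketch of carrying the pairing group in the tag name, the hypothetical helper below builds an updated tag from a replacement tag name, a tag type and an optional group id. The function name and the `name_id` scheme (for example `bookmark_0`) are assumptions made for illustration; the application does not prescribe a particular format.

```python
from typing import Optional


def make_updated_tag(replacement_name: str, tag_type: str,
                     group_id: Optional[str] = None) -> str:
    """Build a formally paired tag; tag_type is "start" or "end"."""
    # Assumed scheme: append the group id to the tag name, e.g. bookmark_0.
    if group_id is None:
        name = replacement_name
    else:
        name = f"{replacement_name}_{group_id}"
    if tag_type == "start":
        return f"<{name}>"
    if tag_type == "end":
        # End tags carry no attributes, so the id survives only in the name.
        return f"</{name}>"
    raise ValueError(f"unknown tag type: {tag_type}")
```

With this scheme, nested groups such as `<bookmark_0><bookmark_1></bookmark_1></bookmark_0>` stay distinguishable even though the end tags have no attributes.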
It should be further noted that the embodiment of the present application does not limit the specific way in which the pairing group distinguishing attribute is converted, as long as the converted attributes of the two special independent tags in the same pairing group remain consistent and can be recognized by the computer-aided translation tool.
In the embodiment of the application, after the target tag is replaced with the updated tag, the correspondence between the target tag and the updated tag may be recorded. Further, upon receiving a translation completion instruction, the updated tag may be replaced back with the target tag in the translation completion file.
It should be understood that the aforementioned translation completion file is the file obtained after translation of the file to be translated is completed, and the tags of the file to be translated are retained in it. In this way, the tags in the translated file are kept consistent with those in the original file, which makes the translated file convenient to use in subsequent operations.
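The record-and-restore round trip can be sketched as follows. This is an illustration only: the `<bookmark type="start"/>` and `<bookmark type="end"/>` tag forms are modeled on the example above, the function names are hypothetical, and the sketch assumes the translated file keeps the updated tags in their original order.

```python
import re

# Hypothetical pattern for special independent tags; a real pattern
# would depend on the file type.
TARGET = re.compile(r'<bookmark type="(start|end)"/>')


def convert(text):
    """Replace target tags with paired tags and record each replacement."""
    records = []  # (updated_tag, target_tag), in document order

    def swap(match):
        updated = "<bookmark>" if match.group(1) == "start" else "</bookmark>"
        records.append((updated, match.group(0)))
        return updated

    return TARGET.sub(swap, text), records


def restore(translated, records):
    """Put the original target tags back, consuming the records in order."""
    out, pos = [], 0
    for updated, target in records:
        idx = translated.index(updated, pos)  # ValueError if a tag was lost
        out.append(translated[pos:idx])
        out.append(target)
        pos = idx + len(updated)
    out.append(translated[pos:])
    return "".join(out)
```

Keeping the records as an ordered list, rather than a dictionary keyed by the updated tag, lets several identical updated tags map back to different original tags.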
The tag conversion processing method provided in the embodiment of the application identifies the target tag in the file to be translated and determines its tag type according to the tag type feature in the logical paired tag feature of the target tag. Meanwhile, a corresponding replacement tag name is determined according to the tag name of the target tag, an updated tag that can be recognized as a formally paired tag is generated according to the tag type and the replacement tag name, and finally the target tag is replaced with the updated tag. Because the updated tag can be recognized as part of a pair, independent tags that logically belong to a tag pair can be handled distinctly during translation, so that the translator can tell them apart. This lowers the computer expertise required of the translator and improves the quality and efficiency of the translation work.
Example two:
In this embodiment, on the basis of the first embodiment, a specific tag conversion process is taken as an example to illustrate the scheme of the present application.
Referring to fig. 3, the tag conversion process includes:
S301: Acquire the file to be translated and determine its file type.
S302: Determine, according to the file type, the logical paired tag feature used to identify the target tag.
S303: Acquire the tag names and tag attributes of the tags in the file to be translated.
S304: Identify whether the tag name and/or tag attributes contain the logical paired tag feature. If yes, go to step S305; otherwise, end.
S305: Determine the replacement tag name corresponding to the tag name of the target tag according to the preset tag name correspondence.
S306: Determine the tag type of the target tag according to the tag type feature in the logical paired tag feature.
It should be understood that there is no ordering constraint between steps S305 and S306: step S305 may be executed before, after, or simultaneously with step S306.
S307: When the tag type is a logical start tag, acquire the target tag and replace its tag name with the replacement tag name to obtain the updated tag.
S308: When the tag type is a logical end tag, acquire the target tag, replace its tag name with the replacement tag name, add an end tag identifier before the replacement tag name, and omit the tag attributes of the target tag to obtain the updated tag.
S309: Judge whether the target tag includes a pairing group distinguishing attribute. If yes, go to step S310; otherwise, go to step S311.
S310: Configure the pairing group distinguishing attribute in the updated tag according to the pairing group distinguishing attribute in the target tag.
S311: Replace the target tag with the updated tag, and record the correspondence between the target tag and the updated tag.
S312: When a translation completion instruction is received, replace the updated tag with the target tag in the translation completion file.
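Steps S301 to S311 can be sketched roughly as below, using processing-instruction tags of the rb-cbt_start / rb-cbt_end kind as input. The `FEATURES` table, the `"robohelp"` file type key, the regular expression and the function name are all assumptions introduced for illustration; only the tag names and the `Q_cbt` replacement name come from the examples in this document. Here the pairing group distinguishing attribute of the start tag is simply inherited.

```python
import re

# Assumed per-file-type configuration: a regex expressing the logical paired
# tag feature (S302) and a preset tag name correspondence (S305).
FEATURES = {
    "robohelp": {
        "pattern": re.compile(
            r'<\?(rb-cbt)_(start|end)((?:\s+[\w:-]+="[^"]*")*)\s*\?>'
        ),
        "names": {"rb-cbt": "Q_cbt"},
    },
}


def convert_tags(text, file_type):
    """Return the converted text and the (target, updated) records."""
    cfg = FEATURES[file_type]  # S301-S302
    records = []

    def swap(match):  # S303-S304: every match is a target tag
        base, kind, attrs = match.group(1), match.group(2), match.group(3)
        new_name = cfg["names"][base]  # S305
        if kind == "start":  # S306-S307: attributes are kept
            updated = f"<{new_name}{attrs}>"
        else:  # S308: end tag identifier added, attributes omitted
            updated = f"</{new_name}>"
        records.append((match.group(0), updated))  # S311
        return updated

    return cfg["pattern"].sub(swap, text), records
```

The returned records are what step S312 would consume to put the original tags back after translation.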
Two text cases converted by the embodiment of the present application are illustrated below:
Case one: (RoboHelp source file: processing instruction tags marking the start or end position of conditional text)
Before conversion:
<?rb-cbt_start name="student teacher"?>Quiz for students&teachers<?rb-cbt_end?><?rb-cbt_start name="teacher"?>
Answer for teachers only<?rb-cbt_end?>
Or:
<?rb-cbt_start name="teacher student"?>Quiz for students&teachers<?rb-cbt_end?><?rb-cbt_start name="teacher"?>
Answer for teachers only<?rb-cbt_end?>
After conversion:
<Q_cbt name="student teacher">Quiz for students&teachers</Q_cbt><Q_cbt name="teacher">
Answer for teachers only</Q_cbt>
(Explanation of the code's effect: when the student condition is supplied externally, only the text controlled by the student condition is displayed, namely "Quiz for students&teachers"; when the teacher condition is supplied externally, the text controlled by the teacher condition is displayed, namely "Quiz for students&teachers" and "Answer for teachers only".)
In the above code, rb-cbt_start and rb-cbt_end are paired with each other according to the name attribute and are used to mark the text range to be controlled. In addition, in the two pieces of code before conversion, name="student teacher" and name="teacher student" both describe the same two objects, "teacher" and "student", and are therefore equivalent in nature, so both pieces of code can be converted into the same converted code.
Note that name="student teacher" and name="teacher" in the preceding code are pairing group distinguishing attributes of the tags and can therefore also be converted into other forms. For example, a binary bit conversion rule may be adopted in which each condition is assigned one bit (student corresponds to 1 and teacher to 2, consistent with the ids in the converted code), and the values are added when several conditions occur together to obtain the combined id: "student teacher" is 3.
The converted code is:
<Q_cbt id="3">Quiz for students&teachers</Q_cbt><Q_cbt id="2">
Answer for teachers only</Q_cbt>
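The binary bit conversion rule can be sketched as follows. The `CONDITION_BITS` table and the function name are illustrative assumptions; the bit values (student is 1, teacher is 2) are the ones implied by the ids 3 and 2 in the converted code. The sketch combines bits with a bitwise OR, which equals the addition described in the text when each condition appears once and also tolerates a repeated condition name.

```python
# Assumed bit assignment: one bit per condition name.
CONDITION_BITS = {"student": 0b01, "teacher": 0b10}


def encode_group_id(name_attr: str) -> int:
    """Map a space-separated name attribute to a combined numeric id."""
    group_id = 0
    for condition in name_attr.split():
        # OR-ing distinct single-bit values is the same as adding them.
        group_id |= CONDITION_BITS[condition]
    return group_id
```

Because the result does not depend on word order, `name="student teacher"` and `name="teacher student"` receive the same id.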
Case two: (tags marking the start or end position of an Open XML document bookmark)
w:bookmarkStart and w:bookmarkEnd are paired with each other according to the w:id attribute and are used to mark the text range controlled by the bookmark.
Before conversion:
<w:bookmarkStart w:id="0" w:name="Test"/><w:r><w:rPr><w:rFonts w:hint="eastAsia"/><w:lang w:val="en-US" w:eastAsia="zh-CN"/></w:rPr><w:t>...</w:t></w:r><w:bookmarkEnd w:id="0"/>
After conversion:
<bookmark id="0"><w:r><w:rPr><w:rFonts w:hint="eastAsia"/><w:lang w:val="en-US" w:eastAsia="zh-CN"/></w:rPr><w:t>...</w:t></w:r></bookmark>
Or:
<bookmark id="1"><w:r><w:rPr><w:rFonts w:hint="eastAsia"/><w:lang w:val="en-US" w:eastAsia="zh-CN"/></w:rPr><w:t>...</w:t></w:r></bookmark>
In the above example, the first converted code leaves the pairing group distinguishing attribute unchanged as id="0", while the second converted code reassigns it as id="1". Both forms can be accurately recognized by computer-aided translation tools and are feasible.
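Both id strategies can be illustrated with a short sketch. The regular expressions, the function name and the `renumber` flag are hypothetical; a production converter would more likely use an XML parser, and the renumbering rule here (ids assigned from 1 in order of first appearance) is just one possible choice.

```python
import re

START = re.compile(r'<w:bookmarkStart\b[^>]*?\bw:id="([^"]*)"[^>]*/>')
END = re.compile(r'<w:bookmarkEnd\b[^>]*?\bw:id="[^"]*"[^>]*/>')


def convert_bookmarks(xml: str, renumber: bool = False) -> str:
    """Turn bookmarkStart/bookmarkEnd into <bookmark id="..."> ... </bookmark>."""
    mapping = {}  # original w:id -> id used in the updated tag

    def new_id(old: str) -> str:
        if old not in mapping:
            # Either inherit the original id or assign a fresh one.
            mapping[old] = str(len(mapping) + 1) if renumber else old
        return mapping[old]

    xml = START.sub(lambda m: f'<bookmark id="{new_id(m.group(1))}">', xml)
    # The end tag omits its attributes, as required for end tags.
    return END.sub("</bookmark>", xml)
```

Calling the function with `renumber=False` corresponds to the first converted form, and `renumber=True` to the second.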
According to the scheme of the embodiment of the application, special independent tags are identified and replaced so that start/end-like tags are handled as paired tags without affecting the original CAT processing logic, which avoids misleading the translator and thereby improves translation quality and efficiency. Meanwhile, because the original CAT logic is unaffected, little modification work is required, and the scheme is easy to implement and popularize.
Example three:
Based on the same inventive concept, the embodiment of the application also provides a tag conversion processing apparatus applied to an electronic device. Referring to fig. 4, fig. 4 shows a tag conversion processing apparatus 100 corresponding to the method of the first embodiment. It should be understood that, for the specific functions of the tag conversion processing apparatus 100, reference may be made to the above description; detailed description is omitted here where appropriate to avoid redundancy. The tag conversion processing apparatus 100 includes at least one software functional module that can be stored in a memory in the form of software or firmware, or be built into an operating system of the tag conversion processing apparatus 100. Specifically:
Referring to fig. 4, the tag conversion processing apparatus 100 includes: an identification module 101, a judgment module 102, a determination module 103 and a replacement module 104. Wherein:
The identification module 101 is configured to identify a target tag in a file to be translated, where the target tag is a tag having a logical paired tag feature;
The judgment module 102 is configured to judge the tag type of the target tag according to the tag type feature in the logical paired tag feature; the tag type includes a logical start tag or a logical end tag;
The determination module 103 is configured to determine a replacement tag name of the target tag according to the tag name of the target tag;
The replacement module 104 is configured to generate an updated tag according to the tag type and the replacement tag name, so that the updated tag can be recognized as a formally paired tag, and to replace the target tag with the updated tag.
In this embodiment of the present application, the identification module 101 is specifically configured to acquire the tag name of a tag in the file to be translated and identify whether the tag name contains the logical paired tag feature; if the tag name contains the logical paired tag feature, the tag is determined to be the target tag.
In this embodiment of the present application, the determination module 103 is specifically configured to determine, according to a preset correspondence, the preset replacement tag name corresponding to the tag name of the target tag as the replacement tag name.
In a possible implementation of the embodiment of the present application, the replacement module 104 is specifically configured to: if the tag type is a logical start tag, acquire the target tag and replace the tag name of the target tag with the replacement tag name to obtain the updated tag; and if the tag type is a logical end tag, acquire the target tag, replace the tag name of the target tag with the replacement tag name, and add an end tag identifier before the replacement tag name to obtain the updated tag.
In this embodiment of the present application, the determination module 103 is further configured to determine the file type of the file to be translated, and to determine, according to the file type, the logical paired tag feature used to identify the target tag.
In this embodiment of the present application, the tag conversion processing apparatus 100 further includes a configuration module, configured to judge whether the target tag includes a pairing group distinguishing attribute and, if so, to configure the pairing group distinguishing attribute in the updated tag according to the pairing group distinguishing attribute in the target tag.
In this embodiment of the application, the tag conversion processing apparatus 100 further includes a recording module, configured to record the correspondence between the target tag and the updated tag. The replacement module 104 is further configured to replace the updated tag with the target tag in the translation completion file if a translation completion instruction is received; the translation completion file is the file obtained after translation of the file to be translated is completed, and the tags of the file to be translated are retained in the translation completion file.
It should be understood that, for the sake of brevity, the contents described in some embodiments are not repeated in this embodiment.
Example four:
This embodiment provides an electronic device which, as shown in fig. 5, includes a controller 501, a memory 502 and a communication bus 503. Wherein:
The communication bus 503 is used for realizing connection and communication between the controller 501 and the memory 502.
The controller 501 is configured to execute one or more programs stored in the memory 502 to implement the tag conversion processing method in the first embodiment or the second embodiment.
It will be appreciated that the configuration shown in fig. 5 is merely illustrative and that the electronic device may also include more or fewer components than shown in fig. 5 or have a different configuration than shown in fig. 5, for example also having components such as a communication port, a display screen, a keyboard, etc.
The present embodiment also provides a readable storage medium, such as a floppy disk, an optical disk, a hard disk, a flash memory, a USB flash drive, an SD (Secure Digital) card, an MMC (Multimedia Card), etc., in which one or more programs for implementing the above steps are stored. The one or more programs can be executed by one or more controllers to implement the tag conversion processing method in the first embodiment or the second embodiment, which will not be described in detail herein.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
In addition, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Furthermore, the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
In this context, a plurality means two or more.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A tag conversion processing method, characterized by comprising the following steps:
identifying a target tag in a file to be translated, wherein the target tag is a tag having a logical paired tag feature;
judging the tag type of the target tag according to the tag type feature in the logical paired tag feature; the tag type comprises a logical start tag or a logical end tag;
determining a replacement tag name of the target tag according to the tag name of the target tag;
generating an updated tag according to the tag type and the replacement tag name, such that the updated tag can be recognized as a formally paired tag;
replacing the target tag with the updated tag.
2. The tag conversion processing method according to claim 1, wherein identifying the target tag in the file to be translated comprises:
acquiring a tag name and/or tag attribute of a tag in the file to be translated, and identifying whether the tag name and/or tag attribute contains the logical paired tag feature;
and if the tag name and/or the tag attribute contains the logical paired tag feature, determining that the tag is the target tag.
3. The tag conversion processing method according to claim 1, wherein determining the replacement tag name of the target tag according to the tag name of the target tag comprises:
determining, according to a preset correspondence, a preset replacement tag name corresponding to the tag name of the target tag as the replacement tag name.
4. The tag conversion processing method according to claim 3, wherein generating an updated tag according to the tag type and the replacement tag name comprises:
if the tag type is a logical start tag, acquiring the target tag, and replacing the tag name of the target tag with the replacement tag name to obtain the updated tag;
and if the tag type is a logical end tag, acquiring the target tag, replacing the tag name of the target tag with the replacement tag name, adding an end tag identifier before the replacement tag name, and omitting the tag attributes of the target tag to obtain the updated tag.
5. The tag conversion processing method according to claim 1, characterized in that the method further comprises:
determining the file type of the file to be translated;
and determining, according to the file type, the logical paired tag feature used to identify the target tag.
6. The tag conversion processing method according to any one of claims 1-5, wherein the method further comprises:
judging whether the target tag includes a pairing group distinguishing attribute;
and if so, configuring the pairing group distinguishing attribute in the updated tag according to the pairing group distinguishing attribute in the target tag.
7. The tag conversion processing method according to any one of claims 1-5, wherein the method further comprises:
recording the correspondence between the target tag and the updated tag;
if a translation completion instruction is received, replacing the updated tag with the target tag in a translation completion file; the translation completion file is the file obtained after translation of the file to be translated is completed, and the tags of the file to be translated are retained in the translation completion file.
8. A tag conversion processing apparatus, characterized by comprising: an identification module, a judgment module, a determination module and a replacement module;
the identification module is used for identifying a target tag in a file to be translated, wherein the target tag is a tag having a logical paired tag feature;
the judgment module is used for judging the tag type of the target tag according to the tag type feature in the logical paired tag feature; the tag type comprises a logical start tag or a logical end tag;
the determination module is used for determining a replacement tag name of the target tag according to the tag name of the target tag;
the replacement module is used for generating an updated tag according to the tag type and the replacement tag name, so that the updated tag can be recognized as a formally paired tag, and for replacing the target tag with the updated tag.
9. An electronic device, comprising: a controller, a memory and a communication bus;
the communication bus is used for realizing connection and communication between the controller and the memory;
the controller is configured to execute one or more programs stored in the memory to implement the tag conversion processing method according to any one of claims 1 to 7.
10. A readable storage medium storing one or more programs, the one or more programs being executable by one or more controllers to implement the tag conversion processing method according to any one of claims 1 to 7.
CN202010867140.4A 2020-08-25 2020-08-25 Label conversion processing method and device, electronic equipment and readable storage medium Active CN111967274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010867140.4A CN111967274B (en) 2020-08-25 2020-08-25 Label conversion processing method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN111967274A true CN111967274A (en) 2020-11-20
CN111967274B CN111967274B (en) 2024-05-31

Family

ID=73390683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010867140.4A Active CN111967274B (en) 2020-08-25 2020-08-25 Label conversion processing method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN111967274B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632988A (en) * 2020-12-29 2021-04-09 文思海辉智科科技有限公司 Sentence segmentation method and device and electronic equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100295665A1 (en) * 2009-05-22 2010-11-25 The Stanley Works Israel Ltd. Object management system and method
US20130151230A1 (en) * 2011-12-12 2013-06-13 Google Inc. Techniques for assisting a human translator in translating a document including at least one tag
CN103745003A (en) * 2014-01-24 2014-04-23 北京京东尚科信息技术有限公司 HTML fragment detection method
CN107045447A (en) * 2016-02-05 2017-08-15 阿里巴巴集团控股有限公司 The tag displaying method and device of a kind of data object
CN109766560A (en) * 2019-01-14 2019-05-17 姚珍强 Interpretation method, system, terminal and storage medium
CN109801008A (en) * 2018-06-15 2019-05-24 意盛(北京)科技有限责任公司 The method and system of authentication
US20190370323A1 (en) * 2018-06-01 2019-12-05 Apple Inc. Text correction
CN110569332A (en) * 2019-09-09 2019-12-13 腾讯科技(深圳)有限公司 Sentence feature extraction processing method and device
CN110969003A (en) * 2018-09-29 2020-04-07 北京国双科技有限公司 Text content generation method and device
CN111144070A (en) * 2019-12-31 2020-05-12 北京迈迪培尔信息技术有限公司 Document parsing translation method and device
CN111143074A (en) * 2019-12-30 2020-05-12 文思海辉智科科技有限公司 Method and device for distributing translation files
CN111291533A (en) * 2020-01-22 2020-06-16 文思海辉智科科技有限公司 Sentence segment to be displayed display method and device, computer equipment and storage medium
CN111460835A (en) * 2020-03-31 2020-07-28 文思海辉智科科技有限公司 Auxiliary translation method and device and electronic equipment

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
YANLING CHEN等: "An Information-Theoretic Approach to the Chipless RFID Tag Identification", IEEE, vol. 7, pages 96984, XP011737259, DOI: 10.1109/ACCESS.2019.2929243 *
姚军: "基于动态标签技术的信息发布系统设计研究", 电子设计工程, vol. 27, no. 15, pages 21 *
王峥嵘: "电子书包在线作业系统的设计研究", 中国优秀硕士学位论文全文数据库信息科技辑, no. 1, pages 138 - 276 *
郭东峰: "数据抽取中数据预处理", 数据库技术, no. 7, pages 224 *
陈晖, 陈意云, 茹祥民: "一种用于Java程序验证编译的标签类型", 软件学报, vol. 16, no. 03, pages 346 *

Also Published As

Publication number Publication date
CN111967274B (en) 2024-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant