CN116933807A - Text translation method, device, equipment and readable storage medium - Google Patents

Publication number
CN116933807A
Authority
CN
China
Prior art keywords
translation
target
task
path
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311181680.7A
Other languages
Chinese (zh)
Other versions
CN116933807B (en
Inventor
王瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Fandian Chuangxiang Technology Co ltd
Original Assignee
Chengdu Fandian Chuangxiang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Fandian Chuangxiang Technology Co ltd
Priority to CN202311181680.7A
Publication of CN116933807A
Application granted
Publication of CN116933807B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The application discloses a text translation method, apparatus, device and readable storage medium in the field of computer technology. For a target translation task, at least one target translation path conforming to the task is determined in a preset translation path library, which fixes the translation path the model follows when executing the task. Translation prompt information is then matched for the target translation task in a preset translation prompt library, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information. The translation result output by the model therefore follows both the translation prompt information and the target translation path, which improves translation accuracy and allows the result to be personalized or stylized according to the translation prompt information.

Description

Text translation method, device, equipment and readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a text translation method, apparatus, device, and readable storage medium.
Background
At present, text can be translated with a dedicated translation model or a more capable artificial intelligence model, but the accuracy of the translation result depends on model-related factors such as model precision and training method, and it is difficult to improve translation quality by other means.
Therefore, how to improve the accuracy of text translation independently of the model itself is a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above, the present application aims to provide a text translation method, apparatus, device and readable storage medium that improve the accuracy of text translation independently of the model. The specific scheme is as follows:
In a first aspect, the present application provides a text translation method, including:
acquiring a target translation task; the target translation task at least comprises: a text to be translated and a target language type;
determining at least one target translation path conforming to the target translation task in a preset translation path library; the target translation path takes the language type of the text as a start node and the target language type as an end node, and between the start node and the end node there is either no intermediate conversion node or at least one pivot language serving as an intermediate conversion node;
matching translation prompt information for the target translation task in a preset translation prompt library;
and inputting the at least one target translation path, the translation prompt information and the target translation task into an artificial intelligence model, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information.
Optionally, the determining, in a preset translation path library, at least one target translation path that meets the target translation task includes:
constructing query information by taking the language type of the text as an initial node and the target language type as a termination node;
and if the start node and the end node of any translation path in the translation path library match the query information, determining that translation path as the target translation path.
Optionally, the method further comprises:
if there are multiple target translation paths, obtaining the plurality of translation results produced by the artificial intelligence model executing the target translation task according to each of the target translation paths and the translation prompt information;
inputting each translation result into the artificial intelligence model so that the artificial intelligence model outputs a translation score for each translation result;
and determining an optimal translation result according to the translation scores.
Optionally, the matching translation prompt information for the target translation task in the preset translation prompt library includes:
determining key information in the text;
and querying the translation prompt library for a target translation example and/or a target keyword matching the key information, to obtain the translation prompt information.
Optionally, the determining the key information in the text includes:
receiving key information entered by a user for the text;
and/or
extracting the key information from the text by using preset rules.
Optionally, the translation prompt library includes:
a context learning module for providing translation example files in multiple language formats;
a knowledge enhancement module for providing a topic keyword file, a domain keyword file, a term keyword file, a special keyword file and a template translation example file;
and a stylization module for providing translation example files in multiple language styles.
Optionally, the method further comprises:
acquiring update information for the translation path library and/or the translation prompt library;
and updating the translation path library and/or the translation prompt library according to the update information.
In a second aspect, the present application provides a text translation apparatus, comprising:
an acquisition module for acquiring a target translation task; the target translation task at least comprises: a text to be translated and a target language type;
a determining module for determining at least one target translation path conforming to the target translation task in a preset translation path library; the target translation path takes the language type of the text as a start node and the target language type as an end node, and between the start node and the end node there is either no intermediate conversion node or at least one pivot language serving as an intermediate conversion node;
a matching module for matching translation prompt information for the target translation task in a preset translation prompt library;
and a translation module for inputting the at least one target translation path, the translation prompt information and the target translation task into an artificial intelligence model, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
and a processor for executing the computer program to implement the previously disclosed text translation method.
In a fourth aspect, the present application provides a readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the previously disclosed text translation method.
According to the above scheme, the application provides a text translation method, which comprises the following steps: acquiring a target translation task, which at least comprises a text to be translated and a target language type; determining at least one target translation path conforming to the target translation task in a preset translation path library, where the target translation path takes the language type of the text as a start node and the target language type as an end node, and between the start node and the end node there is either no intermediate conversion node or at least one pivot language serving as an intermediate conversion node; matching translation prompt information for the target translation task in a preset translation prompt library; and inputting the at least one target translation path, the translation prompt information and the target translation task into an artificial intelligence model, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information.
The beneficial effects of the application are as follows. For a target translation task, at least one target translation path conforming to the task is determined in a preset translation path library, which fixes the translation path the model follows when executing the task. If the language type of the text to be translated is German and the target language type is Chinese, the target translation path may be: German (start node), a pivot language (for example English, as the intermediate conversion node), Chinese (end node), and so on. The pivot language is generally English, which is the most widely used language worldwide and has the most parallel corpora with other languages, so translation to and from it is easier and using it for intermediate conversion can improve translation accuracy. Translation prompt information is then matched for the target translation task in a preset translation prompt library, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information. The translation result output by the model therefore follows both the translation prompt information and the target translation path, which improves translation accuracy without changing the design of the model and allows the result to be personalized or stylized according to the translation prompt information.
Correspondingly, the text translation apparatus, device and readable storage medium provided by the application have the same technical effects.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a text translation method disclosed by the application;
FIG. 2 is a schematic diagram of a Prompt design architecture according to the present disclosure;
FIG. 3 is a schematic diagram of a translation result selection according to the present disclosure;
FIG. 4 is a schematic diagram of a text translation device according to the present disclosure;
FIG. 5 is a schematic diagram of an electronic device according to the present disclosure.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
At present, text can be translated with a dedicated translation model or a more capable artificial intelligence model, but the accuracy of the translation result depends on model-related factors such as model precision and training method, and it is difficult to improve translation quality by other means. Therefore, the application provides a text translation scheme that can improve the accuracy of text translation independently of the model.
Referring to FIG. 1, an embodiment of the application discloses a text translation method, which comprises the following steps:
S101, acquiring a target translation task.
The target translation task at least comprises: a text to be translated and a target language type; it may further include the language type of the text to be translated.
S102, determining at least one target translation path which accords with a target translation task in a preset translation path library.
The target translation path takes the language type of the text as a start node and the target language type as an end node, and between the start node and the end node there is either no intermediate conversion node or at least one pivot language serving as an intermediate conversion node. If the language type of the text to be translated is German and the target language type is Chinese, the target translation path may be: German (start node) → Chinese (end node), or German (start node) → English (intermediate conversion node) → Chinese (end node), and so on. English is the most widely used language worldwide and has the most parallel corpora with other languages, so translation to and from English is easier, and using English for intermediate conversion can improve translation accuracy.
In one embodiment, determining at least one target translation path conforming to the target translation task in a preset translation path library comprises: constructing query information by taking the language type of the text as the start node and the target language type as the end node; and if the start node and the end node of any translation path in the translation path library match the query information, determining that translation path as a target translation path. If the language type of the text to be translated is German and the target language type is Chinese, the query information may be: "the start node is German and the end node is Chinese". The translation path library is then traversed, and every translation path whose start node is German and whose end node is Chinese is taken as a target translation path. A plurality of translation paths are stored in the preset translation path library.
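The lookup described above can be sketched as follows. This is a minimal illustration, assuming each translation path is stored as an ordered tuple of language names; the library contents and the function name `find_target_paths` are hypothetical and not taken from the application.

```python
from typing import List, Tuple

# A translation path is an ordered tuple: (start, *pivot languages, end).
# The entries below are illustrative only.
TranslationPath = Tuple[str, ...]

PATH_LIBRARY: List[TranslationPath] = [
    ("German", "Chinese"),
    ("German", "English", "Chinese"),
    ("French", "English", "Chinese"),
]


def find_target_paths(
    library: List[TranslationPath], source_lang: str, target_lang: str
) -> List[TranslationPath]:
    """Traverse the library and keep paths whose start and end nodes match the query."""
    return [
        path
        for path in library
        if len(path) >= 2 and path[0] == source_lang and path[-1] == target_lang
    ]


paths = find_target_paths(PATH_LIBRARY, "German", "Chinese")
```

Here both the direct path and the English-pivot path are returned, which matches the case where more than one target translation path is found.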
Of course, more than one target translation path may be found. In one embodiment, the method further comprises: if there are multiple target translation paths, obtaining the plurality of translation results produced by the artificial intelligence model executing the target translation task according to each of the target translation paths and the translation prompt information; inputting each translation result into the artificial intelligence model so that the artificial intelligence model outputs a translation score for each translation result; and determining an optimal translation result according to the translation scores. The artificial intelligence model in this embodiment may be a dedicated translation model, or a complex model with multiple functions such as question answering, translation, natural language generation, image classification and keyword extraction, for example ChatGPT (Chat Generative Pre-trained Transformer).
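The selection step can be sketched as below. The scoring function is passed in as a parameter because, in the described method, the scores come from the artificial intelligence model; the stand-in scorer and its numbers here are made-up placeholders, and the function name `pick_best_translation` is hypothetical.

```python
from typing import Callable, Dict, Tuple

Path = Tuple[str, ...]


def pick_best_translation(
    results: Dict[Path, str], score_fn: Callable[[str], float]
) -> Tuple[Path, str, float]:
    """Score every candidate translation and return the highest-scoring one."""
    if not results:
        raise ValueError("no translation results to score")
    scored = [(path, text, score_fn(text)) for path, text in results.items()]
    return max(scored, key=lambda item: item[2])


# Placeholder results and scores; a real system would ask the model for both.
results = {
    ("German", "Chinese"): "translation via direct path",
    ("German", "English", "Chinese"): "translation via English pivot",
}
fake_scores = {
    "translation via direct path": 0.72,
    "translation via English pivot": 0.91,
}

best_path, best_text, best_score = pick_best_translation(
    results, fake_scores.__getitem__
)
```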
S103, matching translation prompt information for the target translation task in a preset translation prompt library.
In one embodiment, matching translation prompt information for the target translation task in a preset translation prompt library includes: determining key information in the text; and querying the translation prompt library for target translation examples and/or target keywords matching the key information, to obtain the translation prompt information. In one embodiment, determining key information in the text includes: receiving key information entered by a user for the text; and/or extracting key information from the text by using preset rules; and/or inputting the text to be translated into the artificial intelligence model, so that the artificial intelligence model outputs the key information in the text.
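As a rough sketch of combining user-supplied key information with rule-based extraction, the example below assumes the "preset rules" are simply a list of terms to search for in the text. That rule format, the term list, and the function name `determine_key_info` are assumptions made for illustration.

```python
from typing import Iterable, List

# Hypothetical preset rule: a fixed list of terms to look for in the text.
PRESET_TERMS = ["resistance", "ancient poetry", "camping"]


def determine_key_info(
    text: str,
    user_terms: Iterable[str] = (),
    preset_terms: Iterable[str] = PRESET_TERMS,
) -> List[str]:
    """Merge user-supplied key information with rule-based matches, without duplicates."""
    lowered = text.lower()
    found = list(user_terms)
    for term in preset_terms:
        if term.lower() in lowered and term not in found:
            found.append(term)
    return found


key_info = determine_key_info(
    "Antibiotic resistance is rising.", user_terms=["medical"]
)
```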
In this embodiment, the translation path library (including a language characteristics module) and the translation prompt library (including context learning, knowledge enhancement, and stylization modules) are knowledge bases designed by professionals on the basis of experience and professional knowledge; the information they contain includes translation examples, keywords, and the like. In one embodiment, the translation prompt library includes: a context learning module for providing translation example files in multiple language formats, such as phrase format, antithetical couplet format, and rhyme format; a knowledge enhancement module for providing a topic keyword file, a domain keyword file, a term keyword file, a special keyword file (including high-order task indicators, external input indicators, and the like) and a template translation example file; and a stylization module for providing translation example files in multiple language styles, such as the style of ancient poetry. Accordingly, querying the translation prompt library for target translation examples and/or target keywords matching the key information includes: querying the translation example files in multiple language formats in the context learning module by file key value, and determining translation examples in the specific language format matching the key information; querying each file in the knowledge enhancement module by file key value, and determining the target keywords and template translation examples matching the key information; and querying the translation example files in multiple language styles in the stylization module by file key value, and determining translation examples in the specific language style matching the key information.
It can be seen that each file in the translation hint library is provided with a corresponding file key value, and the value can be set as a storage path of the corresponding file.
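One way to picture the file key values is an index that maps a key to the storage path of the corresponding file. The sketch below assumes a key of the form (module, label); the key layout, the paths, and the function name `match_hint_files` are all hypothetical.

```python
from typing import Dict, List, Tuple

# (module, label) -> storage path of the file; all entries are illustrative.
HINT_INDEX: Dict[Tuple[str, str], str] = {
    ("context_learning", "couplet"): "hints/context/couplet.txt",
    ("knowledge_enhancement", "medical"): "hints/knowledge/domain_medical.txt",
    ("stylization", "ancient poetry"): "hints/style/ancient_poetry.txt",
}


def match_hint_files(
    key_info: List[str], index: Dict[Tuple[str, str], str] = HINT_INDEX
) -> Dict[str, List[str]]:
    """Group the paths of files whose label matches the key information, per module."""
    wanted = {item.lower() for item in key_info}
    matched: Dict[str, List[str]] = {}
    for (module, label), path in index.items():
        if label.lower() in wanted:
            matched.setdefault(module, []).append(path)
    return matched


files = match_hint_files(["medical", "ancient poetry"])
```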
It should be noted that the translation path library and the translation prompt library support timely updates, so that information in them can be added, deleted, or changed. Thus, in one embodiment, the method further comprises: acquiring update information for the translation path library and/or the translation prompt library; and updating the translation path library and/or the translation prompt library according to the update information. The update information may be entered manually by a user.
S104, inputting the at least one target translation path, the translation prompt information and the target translation task into the artificial intelligence model, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information.
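A minimal sketch of applying update information to the translation path library is shown below. The update record format (`op` plus `path`) and the function name `apply_updates` are assumptions; the application does not specify how update information is encoded.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]


def apply_updates(library: List[Path], updates: List[Dict]) -> List[Path]:
    """Return a new library with the add/delete operations applied in order."""
    result = list(library)
    for update in updates:
        path = tuple(update["path"])
        if update["op"] == "add":
            if path not in result:
                result.append(path)
        elif update["op"] == "delete":
            result = [p for p in result if p != path]
        else:
            raise ValueError(f"unknown update operation: {update['op']}")
    return result


library = [("German", "Chinese")]
library = apply_updates(
    library,
    [
        {"op": "add", "path": ["German", "English", "Chinese"]},
        {"op": "delete", "path": ["German", "Chinese"]},
    ],
)
```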
In a specific implementation, a corresponding Prompt can be designed based on information in the translation path library and the translation prompt library, and the artificial intelligence model is then guided by the Prompt to execute the target translation task. Thus, in one embodiment, inputting the at least one target translation path, the translation prompt information, and the target translation task into the artificial intelligence model, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information, comprises: constructing a Prompt that includes the at least one target translation path, the translation prompt information, and the target translation task, and inputting the Prompt into the artificial intelligence model so that the model executes the target translation task according to the Prompt.
Constructing the Prompt that includes the at least one target translation path, the translation prompt information, and the target translation task comprises: filling the at least one target translation path, the translation prompt information, and the target translation task into a preset Prompt template to obtain the Prompt. For example, the preset Prompt template is: [translation path: + text filling area], [prompt information: + text filling area], [translation task: + text filling area]. If the at least one target translation path is represented by A, the translation prompt information by B, and the target translation task by C, the Prompt obtained by filling is: [translation path: A], [prompt information: B], [translation task: C].
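The template filling described above can be sketched as follows. The rendering of a path as `A -> B -> C` and of several paths separated by semicolons is an assumption made for illustration, as is the function name `build_prompt`.

```python
from typing import Sequence

# Template with three text filling areas, following the bracketed layout above.
PROMPT_TEMPLATE = (
    "[translation path: {paths}], "
    "[prompt information: {hints}], "
    "[translation task: {task}]"
)


def build_prompt(paths: Sequence[Sequence[str]], hints: str, task: str) -> str:
    """Fill the target translation paths, prompt information, and task into the template."""
    rendered_paths = "; ".join(" -> ".join(path) for path in paths)
    return PROMPT_TEMPLATE.format(paths=rendered_paths, hints=hints, task=task)


prompt = build_prompt(
    [("German", "English", "Chinese")],
    "use a formal style",
    "translate 'Guten Morgen' into Chinese",
)
```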
To make the wording of the Prompt easier to understand, the Prompt may instead be generated from the at least one target translation path, the translation prompt information, and the target translation task. In one example, constructing a Prompt that includes the at least one target translation path, the translation prompt information, and the target translation task includes: inputting the at least one target translation path, the translation prompt information, and the target translation task into a natural language model, so that the natural language model generates the Prompt from them; the resulting Prompt conforms to human language habits and is easier for the artificial intelligence model to understand. In one example, the Prompt generated by the natural language model is: Translate the sentence "It's a pleasure to make your acquaintance" into Chinese, and make the language style of the Chinese translation consistent with the following translation example: the sentence "I don't know whether I really love you, but I know I cannot lose you. If the earth is going to be destroyed I want to tell you that you are the only one I want to see" is translated as: "I do not know whether my feeling is true, yet I know I cannot lose you. If heaven and earth should shatter, I wish to see only you before my eyes." The natural language model may be a model trained specifically to generate Prompts, or a complex artificial intelligence model with multiple functions such as question answering, translation, natural language generation and image classification, for example ChatGPT.
After obtaining a target translation task, this embodiment determines at least one target translation path conforming to the target translation task in a preset translation path library, which fixes the translation path the model follows when executing the task. Translation prompt information is then matched for the target translation task in a preset translation prompt library, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information. The translation result output by the model therefore follows both the translation prompt information and the target translation path, which improves translation accuracy without changing the design of the model and allows the result to be personalized or stylized according to the translation prompt information.
As described in the previous embodiment, the target translation path and the translation prompt information can be conveyed through a Prompt. Accordingly, the following modules may be designed for the Prompt: a model role assignment module, a context learning module, a knowledge enhancement module, a stylization module, and a language characteristics module, as shown in FIG. 2. In one example, the artificial intelligence model may be ChatGPT or the like.
The model role assignment module assigns a role to the artificial intelligence model, so that the model acts as a professional translator. The model role assignment module outputs the following prompt information for a translation task:
1. Translate these sentences from [SRC] to [TGT];
2. Answer with no quotes. What do these sentences mean in [TGT];
3. Please provide the [TGT] translation for these sentences.
where SRC stands for the source language and TGT stands for the target language. As these three prompts show, the model role assignment module produces the simplest, most direct prompts, with no extra modifiers.
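Filling the [SRC] and [TGT] placeholders can be sketched with plain string replacement. The three templates are the ones listed above; the function name `fill_role_prompt` is hypothetical.

```python
from typing import List

# The three role-assignment prompt templates listed above.
ROLE_TEMPLATES: List[str] = [
    "Translate these sentences from [SRC] to [TGT]",
    "Answer with no quotes. What do these sentences mean in [TGT]",
    "Please provide the [TGT] translation for these sentences",
]


def fill_role_prompt(template: str, src: str, tgt: str) -> str:
    """Replace the [SRC] and [TGT] placeholders with concrete language names."""
    return template.replace("[SRC]", src).replace("[TGT]", tgt)


prompts = [fill_role_prompt(t, "English", "Chinese") for t in ROLE_TEMPLATES]
```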
For example, given the sentence "Once when I was six years old I saw a magnificent picture in a book, called True Stories from Nature, about the primeval forest", with English as SRC and Chinese as TGT, the translation results output by the model for the three prompts are:
1. When I was six years old, I once saw a magnificent picture in a book named "True Stories from Nature", depicting the primeval forest.
2. When I was six, I saw a magnificent picture in a book named "True Stories from Nature", which was about the primeval forest.
3. I once saw a magnificent picture in a book named "True Stories from Nature" when I was six years old, depicting the primeval forest.
These three translation results show that, even with the simplest prompts, the accuracy and fluency of the translation are acceptable.
During translation, the temperature coefficient of ChatGPT is set to 0, which makes the translation output more stable and helps ensure translation accuracy. A larger temperature coefficient makes the translation output more open-ended and varied.
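As an illustration only, the request sent to a chat model might carry the temperature alongside the messages. The sketch below builds a plain dictionary with `model`, `messages`, and `temperature` keys, a shape common to chat-completion style APIs; the exact request format depends on the service used, and the model name here is a placeholder rather than a real identifier.

```python
from typing import Dict, List


def build_chat_request(
    messages: List[Dict[str, str]], model: str, temperature: float = 0.0
) -> Dict:
    """Assemble a request payload; temperature 0 favours stable, repeatable output."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {"model": model, "messages": messages, "temperature": temperature}


request = build_chat_request(
    [
        {"role": "system", "content": "You are a professional translator"},
        {"role": "user", "content": "Translate these sentences from English to Chinese"},
    ],
    model="example-chat-model",  # placeholder name, not a real model identifier
)
```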
If the translation result of the model is required to conform to a certain form, such as the style of ancient poetry, a context learning module, a knowledge enhancement module, a stylization module and other modules need to be called for customizing the translation result.
The context learning module can specify the language habits of the translation. That is, given a small number of in-context translation examples, ChatGPT learns from the context during translation and outputs a translation result that conforms to a specific language habit. For the text to be translated, "A straight foot is not afraid of a crooked shoe", the translation result obtained by calling only the model role assignment module is a literal, word-for-word rendering: "one straight foot is not afraid of one bent shoe". If the context learning module is called and a small number of Chinese-English translation examples are given, the messages are as follows:
messages = [
    {"role": "system", "content": "You are a professional translator"},
    {"role": "user", "content": "It is never too old to learn"},
    {"role": "assistant", "content": "Live until old, learn until old"},
    {"role": "user", "content": "No cross, no crown"},
    {"role": "assistant", "content": "Without going through wind and rain, how can one see a rainbow"},
    {"role": "user", "content": "A straight foot is not afraid of a crooked shoe"},
]
The translation result output by the model for "A straight foot is not afraid of a crooked shoe" is then the idiomatic rendering: "an upright body does not fear a slanted shadow".
It can be seen that, given a small number of in-context examples, context learning enables the model to output translation results in a specified format and language habit.
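The few-shot message list can be assembled programmatically, as in the sketch below. The function name `build_few_shot_messages` is hypothetical, and the Chinese reference translations are an assumed reconstruction of common idiomatic renderings, not text confirmed by the application.

```python
from typing import Dict, List, Tuple


def build_few_shot_messages(
    system_prompt: str, examples: List[Tuple[str, str]], query: str
) -> List[Dict[str, str]]:
    """Interleave example source/translation pairs as user/assistant turns, then add the query."""
    messages = [{"role": "system", "content": system_prompt}]
    for source_text, reference_translation in examples:
        messages.append({"role": "user", "content": source_text})
        messages.append({"role": "assistant", "content": reference_translation})
    messages.append({"role": "user", "content": query})
    return messages


# Assumed idiomatic Chinese renderings, used purely as illustrative examples.
examples = [
    ("It is never too old to learn", "活到老，学到老"),
    ("No cross, no crown", "不经历风雨，怎么见彩虹"),
]
messages = build_few_shot_messages(
    "You are a professional translator",
    examples,
    "A straight foot is not afraid of a crooked shoe",
)
```

Ending the list with a user turn leaves the final assistant turn for the model to produce, so the examples shape the style of that answer.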
The knowledge enhancement module can significantly improve translation quality. Knowledge can be enhanced through keywords, translation templates, external knowledge, and the like, and the prompt words supplied for knowledge enhancement should preferably be in the same language as the sentence to be translated.
Extracting keywords through a Prompt allows the translated keyword pairs to be kept fixed, which improves translation accuracy. The keywords can be obtained through ChatGPT, or specific keywords can be specified by the user. The keywords are extracted and translated first, and then the whole sentence is translated. For the sentence to be translated: "The weather is good today; we should break free from the constraint of the bed, enjoy a happy mood, go out hiking and cycling, find a quiet place to camp at night, and look up at the starry sky." The extracted keywords are translated, and the resulting translation pairs are: <weather>, <break free>, <bed-beans>, <pleasant mood-joyful mood>, <outing-hike>, <cycling-bike rim>, <camping-camping>, <starry sky-stars>. The translation prompt is then: "Based on the given translation pairs, translate the following sentence into English: the weather is good today; we should break free from the constraint of the bed, enjoy a happy mood, go out hiking and cycling, find a quiet place to camp at night, and look up at the starry sky."
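A keyword-constrained prompt of this kind can be sketched as follows. The `<source-target>` pair notation follows the text above; the sample pairs, the sample sentence, and the function name `build_keyword_prompt` are illustrative assumptions.

```python
from typing import List, Tuple


def build_keyword_prompt(
    pairs: List[Tuple[str, str]], sentence: str, target_lang: str
) -> str:
    """Embed fixed keyword translation pairs in the translation prompt."""
    rendered = ", ".join(f"<{source}-{target}>" for source, target in pairs)
    return (
        f"Based on the given translation pairs {rendered}, "
        f'translate the following sentence into {target_lang}: "{sentence}"'
    )


# Illustrative pairs and sentence only.
keyword_prompt = build_keyword_prompt(
    [("天气", "weather"), ("星空", "starry sky")],
    "今天天气很好",
    "English",
)
```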
Topic information often determines the direction of the whole sentence. Grasping the topic of a sentence makes the translation more controllable and keeps it from drifting off topic. Likewise, a Prompt can ask ChatGPT to mine the topic of a sentence in a small number of words. For example, the topic-mining Prompt is designed as: "Describe one or more topics of the following sentence using a small number of words: Today's weather is good; we should get rid of the constraint of the bed, feel happy, go out for an outing and a ride, find a quiet place to camp at night, and look up at the starry sky." The topics obtained are: weather, play, camping, starry sky.
The translation prompt is then: "Based on the given topics, translate the following sentence into English: Today's weather is good; we should get rid of the constraint of the bed, feel happy, go out for an outing and a ride, find a quiet place to camp at night, and look up at the starry sky."
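The two-step flow above (mine the topics, then translate with them) might look like the following sketch. The model is passed in as a plain callable, so no particular API is assumed; `mine_topics`, `build_topic_prompt`, and `fake_model` are hypothetical names, and the comma-separated reply format is an assumption.

```python
from typing import Callable, List


def mine_topics(sentence: str, ask_model: Callable[[str], str]) -> List[str]:
    """Ask the model for a few topic words and split its comma-separated reply."""
    prompt = (
        "Describe one or more topics of the following sentence "
        f"using a small number of words: {sentence}"
    )
    reply = ask_model(prompt)
    return [t.strip() for t in reply.split(",") if t.strip()]


def build_topic_prompt(topics: List[str], sentence: str,
                       target_language: str = "English") -> str:
    """Prepend the mined topics to the translation request."""
    return (
        f"Topics: {', '.join(topics)}\n"
        f"Based on the given topics, translate the following sentence "
        f"into {target_language}: {sentence}"
    )


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call; returns the topics quoted in the text.
    return "weather, play, camping, starry sky"


topics = mine_topics("Today's weather is good ...", fake_model)
topic_prompt = build_topic_prompt(topics, "Today's weather is good ...")
```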
Given a translation-related template, ChatGPT can refer to it during translation and can further learn from the translation result of the sample; this mode can be regarded as few-shot learning. Translation templates can be designed from two perspectives. (1) Give a manual translation for ChatGPT to reference. For example, the translation prompt is: "Please refer to the Chinese translation sample of the sentence 'Nice to meet you': 'in the life, people feel happy and feel like to meet', and translate it into a similar Chinese expression." The model outputs the translation result: "long distance trekking and fortunate encounters". (2) Generate similar translation pairs through a Prompt, and then perform translation enhancement. The Prompt for generating the translation pair is designed as: "Write an English sentence related to but different from the input English sentence, and translate it into Chinese. English sentence: Nice to meet you." The translation pair obtained is: <It's a pleasure to make your acquaintance-very happy to know you>. The corresponding translation prompt is: "Based on the given translation pair, translate the English 'Hi' into Chinese."
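A template-based (few-shot) prompt of the kind described above can be assembled as follows. This is a sketch under assumptions: `build_few_shot_prompt` and the line layout are illustrative, and the Chinese half of the example pair is a back-rendering of "very happy to know you" rather than text taken from the source.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    text: str,
    source_language: str,
    target_language: str,
) -> str:
    """Place example translation pairs ahead of the sentence to translate."""
    lines: List[str] = []
    for src, tgt in examples:
        lines.append(f"{source_language}: {src}")
        lines.append(f"{target_language}: {tgt}")
    lines.append(
        f"Based on the given translation pairs, translate the following "
        f"{source_language} text into {target_language}."
    )
    lines.append(f"{source_language}: {text}")
    # Leave the final target line open for the model to complete.
    lines.append(f"{target_language}:")
    return "\n".join(lines)


few_shot = build_few_shot_prompt(
    [("It's a pleasure to make your acquaintance.", "很高兴认识你。")],
    "Nice to meet you.",
    "English",
    "Chinese",
)
```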
In natural language processing tasks, domain data is an important link if a model is to perform at its best, since domain data determines the upper limit of performance. Introducing correct domain data during translation improves the translation performance of ChatGPT, whereas wrong domain data can cause significant performance degradation. Terms are often expressed differently in different fields. For example, in the medical field "resistance" is translated as "drug resistance", whereas the general meaning of the word is simply "resistance". Thus, the Prompt can be designed as: "What is the Chinese expression of 'resistance' in the context of the medical field?", thereby obtaining the translation result "drug resistance".
Higher-order task information in the Prompt can further improve the translation performance of ChatGPT, especially in complex tasks. When designing the Prompt, the higher-order task of the translation can be specified. For example, the Prompt is: "Please translate the following Chinese into English. When translating, I want you to replace lower-level vocabulary with higher-level English vocabulary. Chinese: meeting you is a rare fate in life." In this example, the higher-order task is: replace lower-level vocabulary with higher-level vocabulary. Translation result with the higher-order task specified: "Meeting you is a rare and precious fate in my life". Translation result without the higher-order task: "Meeting you is a rare fate in life".
Through external knowledge, certain important words, or words whose translation must follow a manually specified rendering, can be replaced so that a fixed translation is preserved, the translation requirements are met, and the translation better matches subjective requirements.
It is easy to see from the above knowledge enhancement modes that the Prompt can be designed in various specific ways so that the translation result meets personalized requirements. The knowledge enhancement modes can be flexibly combined to improve translation quality.
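One possible way to preserve a fixed translation is to swap the protected terms for placeholders before translation and restore the specified rendering afterwards. The source does not say how the replacement is performed, so the placeholder technique, the `[[TERMn]]` token format, and the function names below are assumptions.

```python
from typing import Dict, Tuple


def protect_terms(text: str, glossary: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Replace glossary terms with placeholders; return text and restore map."""
    restore: Dict[str, str] = {}
    # Longest terms first, so a short term never splits a longer one.
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        if term in text:
            token = f"[[TERM{i}]]"
            text = text.replace(term, token)
            restore[token] = glossary[term]
    return text, restore


def restore_terms(translated: str, restore: Dict[str, str]) -> str:
    """Put the fixed target-language renderings back in place of placeholders."""
    for token, fixed in restore.items():
        translated = translated.replace(token, fixed)
    return translated


# Illustrative glossary entry: "resistance" must be rendered as 耐药性
# ("drug resistance") in a medical text.
protected, restore_map = protect_terms(
    "The patient shows resistance.", {"resistance": "耐药性"}
)
final_text = restore_terms("患者表现出[[TERM0]]。", restore_map)
```

A real model may alter or drop placeholder tokens, so a check that every token survives translation would be needed in practice.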
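Combining several knowledge enhancement modes in one Prompt could be sketched as below. The `KnowledgeHints` container, its field names, and the prompt labels are hypothetical; only the idea of combining keywords, topics, domain, and higher-order task information comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class KnowledgeHints:
    keyword_pairs: List[Tuple[str, str]] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)
    domain: Optional[str] = None
    higher_order_task: Optional[str] = None


def compose_prompt(text: str, target_language: str, hints: KnowledgeHints) -> str:
    """Emit one line per hint that is present, then the translation request."""
    parts: List[str] = []
    if hints.keyword_pairs:
        pairs = ", ".join(f"<{s}-{t}>" for s, t in hints.keyword_pairs)
        parts.append(f"Translation pairs: {pairs}")
    if hints.topics:
        parts.append(f"Topics: {', '.join(hints.topics)}")
    if hints.domain:
        parts.append(f"Domain: {hints.domain}")
    if hints.higher_order_task:
        parts.append(f"Additional requirement: {hints.higher_order_task}")
    parts.append(f"Translate the following text into {target_language}: {text}")
    return "\n".join(parts)


combined = compose_prompt(
    "resistance",
    "Chinese",
    KnowledgeHints(domain="medicine", higher_order_task="use formal vocabulary"),
)
```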
The stylization module can specify the cultural background of the target language that the translation should conform to. Traditional translation tends to be relatively fixed, and a large amount of training data is usually required to obtain translations in a particular style. Thanks to the learning ability of large models, stylized translation can be performed simply by designing the Prompt. Stylized translation mainly uses the following two methods:
1. Specify the desired style directly in the Prompt.
Prompt: "Please express the following English sentence in Chinese in the style of ancient Chinese poetry: I don't know whether I really love you, but I know I cannot lose you. If the earth is going to be destroyed, I want to tell you that you are the only one I want to see." Translation result: "The feeling of me is unknown, but the user cannot feel the bad feeling. If the heavens and earth burst, only the Ru is in front of the eyes of me."
2. Translate first, and then rewrite in the desired style.
Prompt: "Please first translate the following English sentence into Chinese: I don't know whether I really love you, but I know I cannot lose you. If the earth is going to be destroyed, I want to tell you that you are the only one I want to see. Then rewrite it in the style of ancient Chinese poetry." Translation result: "I do not know if I really love you, but I know that I cannot lose you. If the earth is to be destroyed, I want to tell you that you are the only person I want to see." Rewritten result: "What is known is that the heart is not directed, but what is known is that the heart is not able to lose his or her position. If the body bursts, we want to get the only one in our heart Ru Nai."
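The two stylization methods above can be sketched side by side. The model is again an injected callable, and the names `direct_style_prompt`, `translate_then_rewrite`, and `recording_model` are hypothetical, as is the exact prompt wording.

```python
from typing import Callable, List, Tuple


def direct_style_prompt(text: str, target_language: str, style: str) -> str:
    """Method 1: name the desired style directly in a single prompt."""
    return (
        f"Please express the following sentence in {target_language} "
        f"in the style of {style}: {text}"
    )


def translate_then_rewrite(
    text: str,
    target_language: str,
    style: str,
    ask_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Method 2: translate first, then ask for a rewrite in the style."""
    draft = ask_model(
        f"Please translate the following sentence into {target_language}: {text}"
    )
    rewritten = ask_model(
        f"Please rewrite the following {target_language} text "
        f"in the style of {style}: {draft}"
    )
    return draft, rewritten


calls: List[str] = []


def recording_model(prompt: str) -> str:
    # Stand-in model that records each prompt and returns a numbered reply.
    calls.append(prompt)
    return f"reply {len(calls)}"


draft, rewritten = translate_then_rewrite(
    "I know I cannot lose you.", "Chinese", "ancient Chinese poetry", recording_model
)
```

The second method costs two model calls but exposes the plain translation, which can be inspected before the stylistic rewrite is accepted.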
The languages of the world can be divided by resource scale into two categories: high-resource languages and low-resource languages. High-resource languages, such as Chinese and English, are widely spread and used owing to factors such as population, country, and language family. The translation corpus for high-resource languages is therefore sufficient, so a large model performs well on high-resource languages in translation tasks; the opposite holds for low-resource languages. The language properties module can therefore be used to replace a lower-resource language with a higher-resource language.
Translations are divided into 4 classes according to resources:
1. High resource-high resource: both the source language and the target language are high-resource languages.
2. High resource-low resource: the source language is a high-resource language and the target language is a low-resource language.
3. Low resource-high resource: the source language is a low-resource language and the target language is a high-resource language.
4. Low resource-low resource: both the source language and the target language are low-resource languages.
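The four classes listed above follow mechanically from whether each side is a high-resource language. In this sketch the language codes and the contents of `HIGH_RESOURCE` are illustrative assumptions; the text names only Chinese and English as examples of high-resource languages.

```python
from typing import FrozenSet

# Assumption: illustrative set of high-resource language codes.
HIGH_RESOURCE: FrozenSet[str] = frozenset({"zh", "en"})


def resource_class(
    source: str,
    target: str,
    high_resource: FrozenSet[str] = HIGH_RESOURCE,
) -> str:
    """Label a translation direction with one of the four resource classes."""
    src = "high" if source in high_resource else "low"
    tgt = "high" if target in high_resource else "low"
    return f"{src} resource-{tgt} resource"


# "xx" and "yy" are placeholder codes standing for low-resource languages.
labels = [
    resource_class("zh", "en"),
    resource_class("en", "xx"),
    resource_class("xx", "zh"),
    resource_class("xx", "yy"),
]
```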
ChatGPT translates better in high-resource languages, and low-resource languages have been little studied. Experiments find that: (1) The score for translating from a low-resource language into a high-resource language is much higher than for translating from a high-resource language into a low-resource language. This is because low-resource languages can benefit from the strong modeling capability for high-resource languages, which compensates for the lack of parallel data. (2) When a low-resource language is translated into a high-resource language, it may be translated into a pivot language first and then into the high-resource language. The reason for introducing a pivot language is that there may be no parallel corpus between the low-resource and high-resource languages, while each of them has some parallel corpus with the pivot language. The pivot language is typically English, because parallel corpora between English and other languages are the most common worldwide.
Take the following German sentence to be translated: Dies bestätigt nicht nur, dass zumindest einige Dinosaurier Federn hatten, eine Theorie, die bereits weit verbreitet ist, sondern liefert auch Details, die Fossilien im Allgemeinen nicht liefern können, wie etwa Farbe und dreidimensionale Anordnung.
If the German sentence is translated directly into Chinese, the translation result is: "Not only does this confirm the theory that at least some dinosaurs are feathered, which has been widely spread, but it also provides details, such as color and three-dimensional arrangement, that fossils cannot provide."
If a pivot language is introduced, so that the sentence is translated into English first and then from English into Chinese, the translation result is: "Not only does this confirm the theory that at least some dinosaurs are feathered, which is widely accepted, but it also provides details, such as color and three-dimensional arrangement, that fossils generally cannot provide."
From the above example, it can be seen that the translation result obtained with the pivot language is better than the direct translation result; the same sentence can thus be translated along different translation paths. However, if the translation between the source language and the pivot language is poor, the errors may propagate into the translation from the pivot language to the target language, making the final translation worse than a direct translation.
Furthermore, ChatGPT can be used for scoring: ChatGPT scores the translation results obtained along different translation paths for the same sentence, and the one with the highest score is selected as the final result, which reduces errors caused by error propagation and enhances the translation effect. Accordingly, the Prompt may be: "Please translate the above German sentence into 4 English translations, then translate the 4 English translations into 4 Chinese translations respectively, score each Chinese translation, and choose the Chinese translation with the highest score as the final result." As shown in fig. 3, the same text has 4 translation results, and the one with the highest score is selected as the final result.
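Translating along several paths and keeping the best-scored result can be sketched as follows. This version drives the hops and the scoring from code rather than from a single Prompt, which is an assumption; `translate_along_path`, `best_translation`, and the stub callables are hypothetical, and the stub score is arbitrary and serves only to make the example runnable.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Translate = Callable[[str, str, str], str]  # (text, source, target) -> text
Score = Callable[[str], float]


def translate_along_path(text: str, path: Sequence[str], translate: Translate) -> str:
    """Translate hop by hop, e.g. de -> en -> zh for a pivot path."""
    for src, tgt in zip(path, path[1:]):
        text = translate(text, src, tgt)
    return text


def best_translation(
    text: str,
    paths: Sequence[Sequence[str]],
    translate: Translate,
    score: Score,
) -> Tuple[str, List[str], float]:
    """Return (translation, path, score) for the highest-scored path."""
    best: Optional[Tuple[str, List[str], float]] = None
    for path in paths:
        candidate = translate_along_path(text, path, translate)
        value = score(candidate)
        if best is None or value > best[2]:
            best = (candidate, list(path), value)
    if best is None:
        raise ValueError("at least one translation path is required")
    return best


def stub_translate(text: str, src: str, tgt: str) -> str:
    # Stand-in translator: appends the target code to mark each hop.
    return f"{text}>{tgt}"


def stub_score(candidate: str) -> float:
    # Arbitrary stand-in score (number of hops); a real score comes from the model.
    return float(candidate.count(">"))


result = best_translation(
    "Dies", [["de", "zh"], ["de", "en", "zh"]], stub_translate, stub_score
)
```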
Therefore, designing the Prompt as described in this embodiment can greatly enhance the accuracy of translation and effectively improve its effectiveness; meanwhile, the temperature coefficient of ChatGPT is set to 0 so that the translation is more stable.
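Setting the temperature to 0 is a request-level setting. The sketch below only builds a request payload as a plain dictionary; the field names follow a common chat-style request layout and, like the placeholder model name, are assumptions about the deployment rather than something the source specifies.

```python
from typing import Any, Dict


def build_request(prompt: str, model_name: str = "example-model") -> Dict[str, Any]:
    """Package a prompt together with a temperature of 0."""
    return {
        "model": model_name,  # hypothetical placeholder name
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # per the text: 0 makes the translation more stable
    }


request = build_request("Translate the following sentence into English: ...")
```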
A text translation device provided in the embodiments of the present application is described below; the text translation device described below and the other embodiments described herein may be referred to in conjunction with each other.
Referring to fig. 4, an embodiment of the present application discloses a text translation device, including:
an obtaining module 401, configured to obtain a target translation task; the target translation task includes at least: text to be translated and a type of the target language;
a determining module 402, configured to determine at least one target translation path that meets a target translation task in a preset translation path library; the target translation path takes the language type of the text as an initial node, takes the target language type as a termination node, and does not have an intermediate conversion node or has at least one pivot language as the intermediate conversion node between the initial node and the termination node;
the matching module 403 is configured to match the translation hint information for the target translation task in a preset translation hint library;
the translation module 404 is configured to input at least one target translation path, translation hint information, and a target translation task into the artificial intelligence model, so that the artificial intelligence model performs the target translation task according to the at least one target translation path and the translation hint information.
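The hand-off from the determining and matching modules to the translation module can be sketched as one function that assembles a Prompt from the paths, the hints, and the task, and then calls the model. All names here are hypothetical; in particular, carrying `source_language` on the task and the bullet layout of the prompt are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class TranslationTask:
    text: str
    source_language: str  # assumption: language type of the text, known up front
    target_language: str


def build_prompt(
    task: TranslationTask,
    paths: Sequence[Sequence[str]],
    hints: Sequence[str],
) -> str:
    """Combine translation paths, hint information, and the task in one prompt."""
    parts: List[str] = ["Translation paths:"]
    parts += [f"- {' -> '.join(path)}" for path in paths]
    if hints:
        parts.append("Translation hints:")
        parts += [f"- {hint}" for hint in hints]
    parts.append(
        f"Translate the following text into {task.target_language}: {task.text}"
    )
    return "\n".join(parts)


def run_task(
    task: TranslationTask,
    paths: Sequence[Sequence[str]],
    hints: Sequence[str],
    model: Callable[[str], str],
) -> str:
    """Translation module: feed the assembled prompt to the model."""
    return model(build_prompt(task, paths, hints))


task = TranslationTask("Dies ist ein Test.", "de", "zh")
# The identity function stands in for the model, so the prompt is returned.
output = run_task(task, [["de", "zh"], ["de", "en", "zh"]], ["Domain: general"], lambda p: p)
```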
In one embodiment, the determining module is specifically configured to:
constructing query information by taking the language type of the text as an initial node and the target language type as a termination node;
and if the starting node and the ending node of any translation path in the translation path library coincide with the query information, determining that translation path as a target translation path.
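The path lookup performed by the determining module amounts to matching the first and last node of each stored path against the query. In this sketch the path library is a plain list of language-code sequences, which is an assumed representation; `find_target_paths` is a hypothetical name.

```python
from typing import List, Sequence


def find_target_paths(
    library: Sequence[Sequence[str]],
    source_language: str,
    target_language: str,
) -> List[List[str]]:
    """Return every stored path whose start and end nodes match the query."""
    query = (source_language, target_language)
    return [
        list(path)
        for path in library
        # A path needs at least a start and an end node; any middle nodes
        # are pivot languages.
        if len(path) >= 2 and (path[0], path[-1]) == query
    ]


path_library = [["de", "zh"], ["de", "en", "zh"], ["fr", "en"], ["de", "en"]]
matches = find_target_paths(path_library, "de", "zh")
```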
In one embodiment, the device further comprises:
the selection module is used for, if there are multiple target translation paths, acquiring a plurality of translation results obtained by the artificial intelligence model executing the target translation task according to the plurality of target translation paths and the translation hint information; inputting each translation result into the artificial intelligence model so that the artificial intelligence model outputs a translation score for each translation result; and determining an optimal translation result according to the translation scores.
In one embodiment, the matching module is specifically configured to:
determining key information in the text;
and inquiring target translation examples and/or target keywords matched with the key information in the translation prompt library to obtain translation prompt information.
In one embodiment, the matching module is specifically configured to:
receiving key information input by a user for the text;
and/or
extracting key information from the text using preset rules.
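The matching module's two steps (determine key information, then query the hint library) can be sketched as below. Treating the preset rules as regular expressions and the hint library as a dictionary keyed by key information are assumptions; the function names are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Optional


def determine_key_info(
    text: str,
    user_keys: Optional[Iterable[str]] = None,
    rules: Iterable[str] = (),
) -> List[str]:
    """Collect user-supplied keys and rule matches, without duplicates."""
    keys: List[str] = []
    for key in user_keys or []:
        if key not in keys:
            keys.append(key)
    for pattern in rules:
        # Each preset rule is treated as a regular expression (assumption).
        for match in re.finditer(pattern, text):
            if match.group(0) not in keys:
                keys.append(match.group(0))
    return keys


def match_hints(keys: Iterable[str], hint_library: Dict[str, str]) -> List[str]:
    """Look up the hint stored for each piece of key information, if any."""
    return [hint_library[key] for key in keys if key in hint_library]


keys = determine_key_info(
    "Antibiotic resistance is rising.",
    user_keys=["Antibiotic"],
    rules=[r"\bresistance\b"],
)
hints = match_hints(
    keys,
    {"resistance": "In the medical field, translate 'resistance' as 'drug resistance'."},
)
```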
In one embodiment, the translation hint library includes:
the context learning module is used for providing translation example files in multiple language formats;
the knowledge enhancement module is used for providing a theme class keyword file, a domain class keyword file, a term class keyword file, a special keyword file and a template class translation example file;
and the stylization module is used for providing translation example files in multiple language styles.
In one embodiment, the device further comprises:
the library updating module is used for acquiring updating information of the translation path library and/or the translation prompt library; and updating the translation path library and/or the translation prompt library according to the updating information.
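The translation hint library with its three modules, together with an update step, might be represented as follows. The in-memory layout (dictionaries of lists instead of files), the section names, and `apply_update` are assumptions made for illustration; updating the translation path library would follow the same pattern.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def _knowledge_sections() -> Dict[str, List[str]]:
    # One bucket per file type provided by the knowledge enhancement module.
    return {"topic": [], "domain": [], "term": [], "special": [], "template": []}


@dataclass
class TranslationHintLibrary:
    # Context learning module: translation examples per language format.
    context_examples: Dict[str, List[str]] = field(default_factory=dict)
    # Knowledge enhancement module: keyword and template entries.
    knowledge: Dict[str, List[str]] = field(default_factory=_knowledge_sections)
    # Stylization module: translation examples per language style.
    style_examples: Dict[str, List[str]] = field(default_factory=dict)


def apply_update(
    library: TranslationHintLibrary,
    update: Dict[str, Dict[str, List[str]]],
) -> None:
    """Merge update information into the library, skipping duplicates."""
    for section, entries in update.items():
        target = getattr(library, section)  # unknown sections raise AttributeError
        for name, items in entries.items():
            bucket = target.setdefault(name, [])
            for item in items:
                if item not in bucket:
                    bucket.append(item)


hint_lib = TranslationHintLibrary()
update_info = {
    "knowledge": {"domain": ["medicine"]},
    "style_examples": {"ancient poetry": ["example text"]},
}
apply_update(hint_lib, update_info)
apply_update(hint_lib, update_info)  # applying the same update twice adds nothing
```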
The more specific working process of each module and unit in this embodiment may refer to the corresponding content disclosed in the foregoing embodiment, and will not be described herein.
Therefore, this embodiment provides a text translation device that can improve the accuracy of text translation independently of the model itself.
An electronic device provided in the embodiments of the present application is described below; the electronic device described below and the other embodiments described herein may be referred to in conjunction with each other.
Referring to fig. 5, an embodiment of the present application discloses an electronic device, including:
a memory 501 for storing a computer program;
a processor 502 for executing the computer program to implement the method disclosed in any of the embodiments above.
A readable storage medium provided in the embodiments of the present application is described below; the readable storage medium described below and the other embodiments described herein may be referred to in conjunction with each other.
A readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the text translation method disclosed in the previous embodiments. For specific steps of the method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
The references to "first," "second," "third," "fourth," etc. (if present) are used to distinguish similar objects from each other and are not necessarily used to describe a particular order or sequence. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, or apparatus.
It should be noted that the descriptions of "first", "second", etc. in this disclosure are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and is not within the scope of protection claimed in the present application.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of readable storage medium known in the art.
The principles and embodiments of the present application have been described herein with reference to specific examples; the description of the above embodiments is intended only to assist in understanding the method of the present application and its core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. A method of text translation, comprising:
acquiring a target translation task; the target translation task at least comprises: text to be translated and a type of the target language;
determining at least one target translation path conforming to the target translation task in a preset translation path library; the target translation path takes the language type of the text as an initial node, takes the target language type as a termination node, and does not have an intermediate conversion node or has at least one pivot language as an intermediate conversion node between the initial node and the termination node;
matching translation prompt information for the target translation task in a preset translation prompt library;
and inputting the at least one target translation path, the translation prompt information and the target translation task into an artificial intelligence model, so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information.
2. The method of claim 1, wherein determining at least one target translation path in a preset translation path library that meets the target translation task comprises:
constructing query information by taking the language type of the text as an initial node and the target language type as a termination node;
and if the starting node and the ending node of any translation path in the translation path library coincide with the query information, determining that translation path as the target translation path.
3. The method as recited in claim 1, further comprising:
if there are multiple target translation paths, obtaining a plurality of translation results obtained by the artificial intelligence model executing the target translation task according to the multiple target translation paths and the translation prompt information;
inputting each translation result into the artificial intelligence model so that the artificial intelligence model outputs a translation score for each translation result;
and determining an optimal translation result according to the translation score.
4. The method according to claim 1, wherein the matching translation hint information for the target translation task in a preset translation hint library includes:
determining key information in the text;
and inquiring a target translation example and/or a target keyword matched with the key information in the translation prompt library to obtain the translation prompt information.
5. The method of claim 4, wherein the determining key information in the text comprises:
receiving the key information input by a user for the text;
and/or
extracting the key information from the text using preset rules.
6. The method of any one of claims 1 to 5, wherein the library of translation hints comprises:
the context learning module is used for providing translation example files in multiple language formats;
the knowledge enhancement module is used for providing a theme class keyword file, a domain class keyword file, a term class keyword file, a special keyword file and a template class translation example file;
and the stylization module is used for providing translation example files in multiple language styles.
7. The method of any one of claims 1 to 5, wherein inputting the at least one target translation path, the translation hint information, and the target translation task into an artificial intelligence model to cause the artificial intelligence model to perform the target translation task according to the at least one target translation path and the translation hint information comprises:
constructing a Prompt comprising the at least one target translation path, the translation Prompt information and the target translation task;
inputting the Prompt into the artificial intelligence model so that the artificial intelligence model executes the target translation task according to the Prompt.
8. A text translation device, comprising:
the acquisition module is used for acquiring a target translation task; the target translation task at least comprises: text to be translated and a type of the target language;
the determining module is used for determining at least one target translation path which accords with the target translation task in a preset translation path library; the target translation path takes the language type of the text as an initial node, takes the target language type as a termination node, and does not have an intermediate conversion node or has at least one pivot language as an intermediate conversion node between the initial node and the termination node;
the matching module is used for matching the translation prompt information for the target translation task in a preset translation prompt library;
and the translation module is used for inputting the at least one target translation path, the translation prompt information and the target translation task into an artificial intelligence model so that the artificial intelligence model executes the target translation task according to the at least one target translation path and the translation prompt information.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the method of any one of claims 1 to 7.
10. A readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the method of any one of claims 1 to 7.
CN202311181680.7A 2023-09-14 2023-09-14 Text translation method, device, equipment and readable storage medium Active CN116933807B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311181680.7A CN116933807B (en) 2023-09-14 2023-09-14 Text translation method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311181680.7A CN116933807B (en) 2023-09-14 2023-09-14 Text translation method, device, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN116933807A true CN116933807A (en) 2023-10-24
CN116933807B CN116933807B (en) 2023-12-29

Family

ID=88382899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311181680.7A Active CN116933807B (en) 2023-09-14 2023-09-14 Text translation method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN116933807B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117709375A (en) * 2024-02-01 2024-03-15 成都帆点创想科技有限公司 Text translation method and device

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1988005946A1 (en) * 1987-02-05 1988-08-11 Tolin Bruce G Method of using a created international language as an intermediate pathway in translation between two national languages
WO2006090732A1 (en) * 2005-02-24 2006-08-31 Fuji Xerox Co., Ltd. Word translation device, translation method, and translation program
CN1834955A (en) * 2005-03-14 2006-09-20 富士施乐株式会社 Multilingual translation memory, translation method, and translation program
EP3026614A1 (en) * 2014-11-25 2016-06-01 Lionbridge Technologies, Inc. Information technology platform for language translations and task management
CN112163434A (en) * 2020-10-20 2021-01-01 腾讯科技(深圳)有限公司 Text translation method, device, medium and electronic equipment based on artificial intelligence
CN114722842A (en) * 2022-04-24 2022-07-08 西安领向鸟文化传播有限公司 Computer artificial intelligent foreign language translation method and translation system thereof
CN116127045A (en) * 2023-03-03 2023-05-16 北京百度网讯科技有限公司 Training method for generating large language model and man-machine voice interaction method based on model
CN116187353A (en) * 2023-02-14 2023-05-30 中国工商银行股份有限公司 Translation method, translation device, computer equipment and storage medium thereof
US20230222393A1 (en) * 2023-03-22 2023-07-13 Mark Zahm Systems and methodologies for the propagation of modulardynamic ai environments in lower dimensional space throughguided and autonomous learning
CN116542260A (en) * 2023-07-05 2023-08-04 中国民用航空飞行学院 Translation text quality assessment method and system based on natural language big model
CN116611459A (en) * 2023-07-19 2023-08-18 腾讯科技(深圳)有限公司 Translation model training method and device, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A.M. JASMINE HASHANA: "Deep Learning in ChatGPT - A Survey", 《PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON TRENDS IN ELECTRONICS AND INFORMATICS》, pages 1001 - 1005 *
我的AI力量: "ChatGPT的翻译表现以及提示词技巧", pages 1 - 23, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/640379910> *
贵理工涛声涛影: "ChatGPT I提示语词库,一查即用,精准又高效!", pages 1 - 4, Retrieved from the Internet <URL:www.360doc.com/content/12/0121/07/11400177_1086124901.shtml> *

Also Published As

Publication number Publication date
CN116933807B (en) 2023-12-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant