CN114139560B - Translation system based on artificial intelligence

Info

Publication number: CN114139560B
Application number: CN202111465059.4A
Authority: CN
Inventors: 刘建伟
Original assignee: Shandong Shiyu Information Technology Co ltd
Current assignee: Shandong Shiyu Information Technology Co ltd
Priority date: 2021-12-03
Filing date: 2021-12-03
Publication date: 2022-12-09
Anticipated expiration: 2041-12-03
Also published as: CN114139560A

Abstract

The invention relates to a translation system based on artificial intelligence, comprising: a neural network model, a storage module, a first machine translation model, an arrangement module, a second machine translation model, an output module, a user habit acquisition module, a recording module, a cache module and an updating module, wherein the first machine translation model is provided with a receiving unit, a combination unit and a pre-translation unit. The system first obtains a plurality of pre-translation targets through the neural network model with artificial intelligence, including pre-translation targets corresponding to the source sentence and expanded pre-translation targets related to the source sentence; the pre-translation targets are sorted according to the user's selection habits, and a plurality of translation targets chosen on the basis of those habits are provided to the user. Meanwhile, the user's selection habits are collected and the model is trained on them again, so as to infer translation targets that accord with the user's selection habits.

Description

Translation system based on artificial intelligence

Technical Field

The invention relates to the technical field of translation systems, in particular to a translation system based on artificial intelligence.

Background

Artificial intelligence research includes robotics, language recognition, image recognition, natural language processing, expert systems, and the like. In the field of language translation, compared with a traditional dictionary library, artificial intelligence can analyze the tone and semantics of the source sentences to be translated, so the translation result is more accurate.

With continued research and progress in artificial intelligence technology, neural machine translation has become widely used as a new generation of translation technology. Because a neural machine translation model is capable of self-learning and continuous updating, artificial intelligence and neural machine translation models are the main focus of research in the translation field.

In the prior art, for example, publication No. CN111738025A discloses an artificial intelligence based translation method, which includes: acquiring a first language text to be translated, wherein the first language text comprises at least one first language word segment; obtaining, through a first translation model, a word vector corresponding to the at least one first language word segment in the first language text, wherein the first translation model is trained on a plurality of pseudo-parallel corpus pairs, each pseudo-parallel corpus pair comprises a second language corpus and a first language corpus obtained by translating the second language corpus through a second translation model, the second translation model is trained on second parallel corpus pairs, each second parallel corpus pair comprises a target second language corpus and a plurality of reference first language corpora, and each reference first language corpus is a reference translation of the target second language corpus; predicting, by the first translation model and based on the word vector, at least two candidate second language word segments corresponding to each first language word segment, and a first probability that each first language word segment is translated into the corresponding candidate second language word segment; determining, by the first translation model and based on the first probability, a target second language word segment corresponding to each first language word segment from among the candidate second language word segments; and fusing the target second language word segments through the first translation model to obtain the second language text into which the first language text is translated.

Generally speaking, language translation follows a translate-then-revise route, which is relatively mature. In the above method, at least two candidate second language word segments are predicted for each first language word segment, together with a first probability that each first language word segment is translated into the corresponding candidate; the first translation model then determines, based on the first probability, the target second language word segment for each first language word segment from among the candidates, and fuses the target second language word segments. This fusion technique requires additional training and therefore occupies neural network resources.

Disclosure of Invention

In view of the above, the present invention provides an artificial intelligence based translation system to solve the above problems in the prior art.

In order to achieve the purpose, the invention provides the following technical scheme:

an artificial intelligence based translation system comprising:

the neural network model, which is trained on the historical habit elements of the translation targets selected by the user to obtain a constraint condition set;

the storage module is used for storing the constraint condition set;

a first machine translation model having a receiving unit, a combining unit, and a pre-translation unit; the receiving unit receives a source sentence to be translated;

the combination unit is connected with the receiving unit and used for recombining the source sentences based on combination elements formed by keywords, semantics and moods of the source sentences to form a plurality of new combination targets;

the pre-translation unit is connected with the receiving unit and translates the new combined target and the source sentences to obtain a plurality of pre-translation targets;

the arrangement module is used for loading the constraint condition set, arranging and combining the plurality of pre-translation targets according to a first constraint condition, and inputting them into a second machine translation model in the resulting order, wherein the second machine translation model revises the plurality of pre-translation targets in that order to obtain a plurality of target sentences;

the output module is used for outputting the target sentence to be presented to the user;

the user habit acquisition module is used for detecting, based on user operations and operation behaviors, the selection of a target sentence, and labeling the source sentence corresponding to the selected target sentence and one or more of the plurality of new combination targets with a set tag set to form acquired data;

the recording module is used for tracking and recording the acquired data to form a tracking log; inputting the tracking log into a neural network model for training to form a constraint condition subunit;

the cache module is used for storing the constraint condition subunit;

the updating module is used for extracting at least one constraint condition subunit from the cache module based on the updating period set by the control module, comparing the constraint condition subunit with the constraint condition set, and checking whether the constraint condition subunit is a newly added element; if it is a newly added element, the constraint condition set is updated; if it is not a newly added element, the constraint condition subunit is judged to be a repeated element and is deleted directly using the deletion module.

In the above, the neural network model is provided with a recognition unit,

the recognition unit is used for obtaining the arrangement of an optimal set: the tracking log is loaded and scanned to obtain the element sets it contains, and the elements in those sets are counted, grouping identical or similar elements;

after counting, the element sets are compared one by one against the constraint condition set and arranged according to whether they contain one identical element, two identical elements, ..., N identical elements; the set with the fewest identical elements in the comparison result is the optimal set.
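One possible reading of the comparison above can be sketched in Python. This is an illustration under stated assumptions, not the patented implementation: the names `arrange_by_overlap` and `optimal_set` are hypothetical, "identical or similar" is simplified to exact equality, and ties are broken by original order.

```python
def arrange_by_overlap(element_sets, constraint_set):
    """Order element sets by how many elements they share with the
    constraint condition set, fewest shared elements first."""
    known = set(constraint_set)
    # sorted() is stable, so sets with equal overlap keep their input order
    return sorted(element_sets, key=lambda s: len(set(s) & known))


def optimal_set(element_sets, constraint_set):
    """Return the element set with the fewest shared elements, or None if empty."""
    arranged = arrange_by_overlap(element_sets, constraint_set)
    return arranged[0] if arranged else None


# Hypothetical element labels, for illustration only
constraints = {"a", "b", "c"}
log_sets = [{"a", "b"}, {"a", "x"}, {"a", "b", "c"}]
best = optimal_set(log_sets, constraints)
```

Under this reading, the set sharing the fewest elements with the stored constraints is the one that departs most from the habits already recorded.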

In the above, the historical habit elements include a combination of one or more of: selection of the type of translation target, selection of clickable keywords or key elements, selection of related parts of speech, semantic selection, selection of associated sentences, selection of referenced external associated thesauri, browsing dwell time, and number of click-to-view times.

In the above, the constraint condition set is formed by inputting the historical habit elements into the neural network model and training the historical habit elements according to the frequency of appearance of the historical habit elements, and the weight value of each historical habit element is set according to the frequency of appearance of the historical habit element.
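As an illustration only, frequency-based weighting could be sketched as follows. The function name `build_constraint_set`, the element labels, and the normalization of counts to relative frequencies are assumptions made for this sketch; the text does not give a formula, and the neural network training step is replaced here by plain counting.

```python
from collections import Counter


def build_constraint_set(habit_events):
    """Map each historical habit element to a weight equal to its
    relative frequency among the observed events (an assumed formula)."""
    counts = Counter(habit_events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {element: count / total for element, count in counts.items()}


# Hypothetical habit events, for illustration only
events = [
    "semantic_selection",
    "keyword_click",
    "semantic_selection",
    "associated_sentence",
]
constraint_set = build_constraint_set(events)
```

Elements that occur more often receive larger weights, which is the only property the description requires of the weighting.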

In the above, the target sentences correspond to the pre-translation targets one-to-one or one-to-many.

In the above, the user habit acquisition module has at least one monitoring unit, the monitoring unit monitors user operation and operation behavior in the user habit acquisition module, and when it is monitored that the user operation and operation behavior is selection of a target sentence, a set tag set is used to label a source sentence corresponding to the selected target sentence, and one or more of a plurality of new combined targets, so as to form acquired data.

In the above, the arrangement module further includes a priority determination unit, and the priority determination unit determines the arrangement order of the pre-translation targets according to the weight value of each of the historical habit elements in the constraint set by loading the constraint set.
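A minimal sketch of such a priority determination, assuming each pre-translation target carries a list of the habit elements it matches (the `elements` field, the function name and the summed-weight score are all hypothetical choices, not taken from the patent):

```python
def rank_pre_translation_targets(targets, constraint_set):
    """Sort pre-translation targets by the summed weight of the habit
    elements each one matches, highest score first."""
    def score(target):
        return sum(constraint_set.get(e, 0.0) for e in target["elements"])

    # sorted() is stable, so targets with equal scores keep their input order
    return sorted(targets, key=score, reverse=True)


# Hypothetical weights and targets, for illustration only
weights = {"semantic_selection": 0.5, "keyword_click": 0.25}
targets = [
    {"text": "A", "elements": ["keyword_click"]},
    {"text": "B", "elements": ["semantic_selection", "keyword_click"]},
    {"text": "C", "elements": []},
]
ranked = rank_pre_translation_targets(targets, weights)
```

Other scoring rules (for example, taking the maximum weight rather than the sum) would fit the description equally well; the sum is used here only for simplicity.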

In the above, the recording module has at least one tracking unit; the tracking unit tracks the monitoring unit's monitoring of at least one of the user operations and operation behaviors, captures any monitoring behavior that includes the selection of a target sentence, and records it in the tracking log.
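The monitoring and tracking behavior can be pictured with a small data-structure sketch. The class names, the event dictionary layout and the `"select_target"` event type are assumptions introduced for illustration; the patent does not define a record format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AcquiredData:
    """One labeled selection: the source sentence, the chosen target,
    the related combination targets, and the tags applied."""
    source_sentence: str
    selected_target: str
    combination_targets: List[str]
    tags: List[str]


@dataclass
class TrackingLog:
    entries: List[AcquiredData] = field(default_factory=list)

    def on_user_event(self, event: dict, tag_set: List[str]) -> bool:
        """Record the event only when it is a target-sentence selection."""
        if event.get("type") != "select_target":
            return False
        self.entries.append(AcquiredData(
            source_sentence=event["source"],
            selected_target=event["target"],
            combination_targets=list(event.get("combinations", [])),
            # keep only tags that belong to the configured tag set
            tags=[t for t in event.get("tags", []) if t in tag_set],
        ))
        return True


log = TrackingLog()
log.on_user_event({"type": "scroll"}, ["keyword", "semantic"])  # ignored
log.on_user_event(
    {"type": "select_target", "source": "bonjour", "target": "hello",
     "combinations": ["salut"], "tags": ["keyword", "unknown"]},
    ["keyword", "semantic"],
)
```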

In the above, the updating module has at least one comparing unit. When the comparing unit compares the constraint condition subunit with the constraint condition set: if an element contained in the constraint condition subunit has an occurrence frequency greater than a set threshold within an updating period, has no identical or similar element in the constraint condition set, and can cause a habitual deviation in the user's selection of a target sentence, the element is set as a newly added element, given a set weight value, and updated into the constraint condition set; if an element contained in the constraint condition subunit has an occurrence frequency less than half of the set threshold within the updating period, it is deleted directly using the deleting module; if an element contained in the constraint condition subunit has an occurrence frequency greater than half of the set threshold and less than or equal to the set threshold within the updating period, the element is registered in the cache module and is included in the update when the updating module starts the next updating period.
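The three-way threshold rule can be sketched as follows. This is a simplified illustration: `classify_element`, `apply_update` and `new_weight` are hypothetical names, "identical or similar" is reduced to exact membership, and the habitual-deviation test is omitted. The text does not say what happens at exactly half the threshold; this sketch assumes such elements are deferred.

```python
from enum import Enum


class Action(Enum):
    ADD = "add"        # update into the constraint condition set
    DELETE = "delete"  # remove via the deleting module
    DEFER = "defer"    # keep in the cache for the next updating period


def classify_element(element, frequency, threshold, constraint_set):
    """Decide what to do with one element of a constraint condition subunit."""
    if frequency > threshold:
        # Already-known elements are treated as repeated and deleted
        return Action.DELETE if element in constraint_set else Action.ADD
    if frequency < threshold / 2:
        return Action.DELETE
    # threshold/2 <= frequency <= threshold (the lower bound is an assumption)
    return Action.DEFER


def apply_update(candidates, threshold, constraint_set, new_weight=0.1):
    """Return (updated constraint set, elements deferred to the next period)."""
    updated = dict(constraint_set)
    deferred = {}
    for element, frequency in candidates.items():
        action = classify_element(element, frequency, threshold, constraint_set)
        if action is Action.ADD:
            updated[element] = new_weight
        elif action is Action.DEFER:
            deferred[element] = frequency
    return updated, deferred


# Hypothetical frequencies observed in one updating period
updated, deferred = apply_update(
    {"a": 12, "b": 3, "c": 7, "d": 15}, threshold=10, constraint_set={"d": 0.4}
)
```

In this example `a` is added, `b` is dropped, `c` is carried over to the next period, and `d` is dropped as a repeated element.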

In the above, the second machine translation model has at least one revision unit therein, each revision unit having a first revision subunit, a second revision subunit, and a comparison unit;

the revision unit and a pre-translation target have a one-to-one or one-to-many correspondence relationship, and the revision unit extracts sentence segments corresponding to the pre-translation target according to the pre-translation target;

the first revision subunit divides the sentence segment into at least one sub-segment according to the set punctuation marks, revises the contents of the sub-segments one by one, and assembles the revised sub-segments one by one to form a first revision target;

the second revision subunit directly revises the sentence segments to form a second revision target;

the comparison unit receives the first revision target and the second revision target and compares them grammatically; if the first revision target and the second revision target are the same, that result is output directly as the target sentence;

and if the first revision target and the second revision target are different, the one with no grammatical errors or the fewest grammatical errors is output as the target sentence.
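The dual-path revision and comparison can be sketched as below. The `revise` and `count_grammar_errors` callables stand in for the second machine translation model and a grammar checker, neither of which is specified in the text; the toy versions and the tie-breaking rule (prefer the first revision target) are assumptions for illustration.

```python
def split_on_punctuation(text, marks=";"):
    """Split text after each punctuation mark, keeping the mark attached."""
    segments, current = [], ""
    for ch in text:
        current += ch
        if ch in marks:
            segments.append(current)
            current = ""
    if current:
        segments.append(current)
    return segments


def revision_unit(pre_translation, revise, count_grammar_errors, marks=";"):
    """Revise piecewise and as a whole, then keep the better result."""
    # First revision subunit: revise each sub-segment, then reassemble
    first = "".join(revise(s) for s in split_on_punctuation(pre_translation, marks))
    # Second revision subunit: revise the whole text directly
    second = revise(pre_translation)
    if first == second:
        return first
    # min() returns the first of equal candidates, so ties favor `first`
    return min((first, second), key=count_grammar_errors)


# Toy stand-ins, for illustration only
def toy_revise(s):
    return s[:1].upper() + s[1:]


def toy_errors(s):
    # Pretend that any capital letter after the first character is an error
    return sum(1 for c in s[1:] if c.isupper())


result = revision_unit("it rains;we stay", toy_revise, toy_errors)
```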

The method comprises the steps that firstly, a plurality of pre-translation targets are obtained through a neural network model with artificial intelligence, the plurality of pre-translation targets comprise pre-translation targets corresponding to source sentences and pre-translation targets which are related to the source sentences and expanded, the pre-translation targets are sequenced according to habits selected by a user, and a plurality of translation targets selected based on the habits are provided for the user; meanwhile, the selection habits of the user are collected, and the user habits are trained again so as to deduce a translation target according with the selection habits of the user.
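The overall flow summarized above (receive, recombine, pre-translate, arrange, revise, output) can be pictured as a small pipeline. Every callable passed in is a placeholder for a module described in the text, and the function name `translate` and the toy stand-ins are hypothetical.

```python
def translate(source, combine, pre_translate, rank, revise):
    """Run one source sentence through the described stages and
    return the list of target sentences in presentation order."""
    # Combination unit: the source plus its expanded combination targets
    candidates = [source] + list(combine(source))
    # Pre-translation unit: one pre-translation target per candidate
    pre_targets = [pre_translate(c) for c in candidates]
    # Arrangement module: order by the user's habit-based constraints
    ordered = rank(pre_targets)
    # Second machine translation model: revise in that order
    return [revise(t) for t in ordered]


# Toy stand-ins, for illustration only
outputs = translate(
    "hi",
    combine=lambda s: [s + "!"],
    pre_translate=str.upper,
    rank=lambda ts: sorted(ts, key=len),
    revise=str.strip,
)
```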

Drawings

FIG. 1 is a schematic diagram of the framework of the present invention;

FIG. 2 is a schematic diagram of the framework in an embodiment of the invention.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings, FIGS. 1 and 2.

The invention provides an artificial intelligence based translation system, comprising:

the historical habit elements comprise a combination of one or more of: selection of the type of translation target, selection of clickable keywords or key elements, selection of related parts of speech, semantic selection, selection of associated sentences, selection of referenced external associated thesauri, browsing dwell time, and number of click-to-view times;

the constraint condition set is formed by inputting historical habit elements into a neural network model according to the occurrence frequency of the historical habit elements and training the historical habit elements, and the weight value of each historical habit element is set according to the occurrence frequency of each historical habit element;

the storage module is used for storing the constraint condition set;

the pre-translation unit is connected with the receiving unit and translates the new combination targets and the source sentences to obtain a plurality of pre-translation targets, and the target sentences correspond to the pre-translation targets one-to-one or one-to-many;

in the above, the arrangement module further includes a priority determination unit, where the priority determination unit determines the arrangement order of the pre-translation targets according to the weight value of each element of the historical habits in the constraint condition set by loading the constraint condition set;

the revision unit and a pre-translation target have one-to-one correspondence or one-to-many correspondence, and the revision unit extracts sentence segments corresponding to the pre-translation target according to the pre-translation target;

the first revision subunit divides the sentence segment into at least one sub-segment according to the set punctuation marks, revises the contents of the sub-segments one by one, and assembles the revised sub-segments one by one to form a first revision target;

the comparison unit receives the first revision target and the second revision target and compares them grammatically; if the first revision target and the second revision target are the same, that result is output directly as the target sentence;

the recording module is provided with at least one tracking unit; the tracking unit tracks the monitoring unit's monitoring of at least one of the user operations and operation behaviors, captures any monitoring behavior that includes the selection of a target sentence, and records it in the tracking log;

the cache module is used for storing the constraint condition subunit;

the updating module is used for extracting at least one constraint condition subunit from the cache module based on the updating period set by the control module, comparing the constraint condition subunit with the constraint condition set, and checking whether the constraint condition subunit is a newly added element; if so, the constraint condition set is updated; if not, the constraint condition subunit is judged to be a repeated element and is deleted directly by the deletion module.

The neural network model is provided with a recognition unit,

the recognition unit is used for obtaining the arrangement of an optimal set: the tracking log is loaded and scanned to obtain the element sets it contains, and the elements in those sets are counted, grouping identical or similar elements;

In the present application, the neural network model is the carrier of the artificial intelligence and is capable of self-training and self-learning. Specifically, the monitoring unit monitors user operations and operation behaviors in the user habit acquisition module; when the monitored operation is the selection of a target sentence, the tracking unit tracks that monitoring behavior, captures the monitoring behavior containing the selection of the target sentence, and records it in the tracking log. The neural network model is provided with a recognition unit, which is used for obtaining the arrangement of an optimal set: the tracking log is loaded and scanned to obtain the element sets it contains, and the elements in those sets are counted, grouping identical or similar elements. After counting, the element sets are compared one by one against the constraint condition set and arranged according to whether they contain one identical element, two identical elements, ..., N identical elements; the set with the fewest identical elements in the comparison result is the optimal set, and the sets are then arranged according to whether they are the optimal set. That is, if a set shares many elements with the constraint condition set, this indicates that the user habits recorded in the tracking log are the same as the user habits currently recorded in the neural network model; if they differ, this indicates that the user's habits have deviated at some point, in which case the deviation needs to be recorded so that, when the user next uses the system, a translation target matching the user's updated habits is recommended.

The principle is as follows: when a user inputs a source sentence, the receiving unit receives the source sentence to be translated;

the pre-translation unit is connected with the receiving unit and translates the new combination targets and the source sentences to obtain a plurality of pre-translation targets, and the target sentences correspond to the pre-translation targets one-to-one or one-to-many;

the arrangement module loads a constraint condition set and arranges and combines a plurality of pre-translation targets according to a first constraint condition, specifically, the magnitude of the weight value of each historical habit element in the constraint condition set determines the arrangement sequence of the pre-translation targets, and the pre-translation targets are input into a second machine translation model according to the sequence of the arrangement and the combination, at least one revision unit is arranged in the second machine translation model, and each revision unit is provided with a first revision subunit, a second revision subunit and a comparison unit;

In the invention, if the source sentence is a word, when a user inputs the source sentence, the receiving unit receives the source sentence to be translated;

the combination unit is connected with the receiving unit and expands the words with the same or similar semantics in the source sentences based on the semantics of the source sentences to form a plurality of new combination targets;

the arrangement module loads a constraint condition set and arranges and combines a plurality of pre-translation targets according to a first constraint condition, specifically, the magnitude of a weight value of each historical habit element in the constraint condition set determines the arrangement sequence of the pre-translation targets and inputs the arrangement sequence into a second machine translation model according to the sequence of the arrangement and combination, at least one revision unit is arranged in the second machine translation model, and each revision unit is provided with a first revision subunit, a second revision subunit and a comparison unit;

the revision unit and a pre-translation target have a one-to-one or one-to-many correspondence relationship, and the revision unit extracts a target word corresponding to the pre-translation target according to the pre-translation target;

and the first revision subunit revises the target words one by one to form a first revision target and outputs the first revision target.

In the present invention, if the source sentence is a complete sentence, when the user inputs the source sentence, the receiving unit receives the source sentence to be translated;

the combination unit is connected with the receiving unit, and is used for recombining the source sentences based on combination elements formed by keywords, semantics and moods of the source sentences to form a plurality of new combination targets, wherein the new combination targets comprise the keywords, low-frequency vocabularies, similar vocabularies expanded based on the keywords and the low-frequency vocabularies and corresponding application examples.

The pre-translation unit is connected with the receiving unit and translates the new combination targets and the source sentence to obtain a plurality of pre-translation targets, and the target sentences correspond to the pre-translation targets one-to-one or one-to-many;

the first revising subunit is used for directly revising and directly outputting the keywords, the low-frequency vocabulary and similar vocabulary expanded based on the keywords and the low-frequency vocabulary; dividing a source sentence into at least one sub-sentence according to a set punctuation mark (semicolon), revising the corresponding contents of the sub-sentences one by one, and assembling the sub-sentences after revising one by one to form a first revising target;

the second revision subunit directly revises the source sentence to form a second revision target;

the comparison unit receives the first revision target and the second revision target and compares them grammatically; if the first revision target and the second revision target are the same, that result is output directly as the target sentence; and if they are different, the one with no grammatical errors or the fewest grammatical errors is output as the target sentence.

In the invention, if the source sentence is a complete sentence segment, when a user inputs the source sentence segment, the receiving unit receives the source sentence segment to be translated;

the combination unit is connected with the receiving unit and recombines the source sentence segment based on combination elements formed by the keywords, semantics and tone of the source sentence segment to form a plurality of new combination targets, wherein the new combination targets comprise keywords, low-frequency vocabulary, similar vocabulary expanded from the keywords and the low-frequency vocabulary, and corresponding application examples; the combination unit also splits the sentence segment into sub-sentence segments according to whether each part is a complete sentence;

the first revision subunit directly revises and directly outputs the keywords, the low-frequency vocabulary and the similar vocabulary expanded from the keywords and the low-frequency vocabulary; it divides the source sentence segment into at least one sub-sentence segment according to the set punctuation marks (a semicolon or a period), divides each sub-sentence segment into sub-sentences, revises the sub-sentence segments and their sub-sentences one by one, and assembles them one by one after revision to form a first revision target;

the comparison unit receives the first revision target and the second revision target and compares them grammatically; if the first revision target and the second revision target are the same, that result is output directly as the target sentence; and if they are different, the one with no grammatical errors or the fewest grammatical errors is output as the target sentence.
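The two-level splitting used for sentence segments can be sketched as follows. The text names a semicolon and a period as the set punctuation marks but does not say which mark applies at which level; this sketch assumes periods delimit sub-sentence segments and semicolons delimit sub-sentences. The function names are hypothetical.

```python
def split_keep(text, marks):
    """Split after each mark, keep the mark, and trim surrounding spaces."""
    parts, current = [], ""
    for ch in text:
        current += ch
        if ch in marks:
            parts.append(current.strip())
            current = ""
    if current.strip():
        parts.append(current.strip())
    return parts


def split_passage(passage):
    """Return a list of sub-sentence segments, each a list of sub-sentences."""
    return [split_keep(segment, ";") for segment in split_keep(passage, ".")]


def reassemble(nested, revise):
    """Revise every sub-sentence and join the pieces back in order."""
    return " ".join(" ".join(revise(s) for s in segment) for segment in nested)


nested = split_passage("It rains; we stay. They leave.")
rebuilt = reassemble(nested, lambda s: s)  # identity revision for illustration
```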

The method comprises the steps that firstly, a plurality of pre-translation targets are obtained through a neural network model with artificial intelligence, the plurality of pre-translation targets comprise pre-translation targets corresponding to source sentences and pre-translation targets which are related to the source sentences and are expanded, the pre-translation targets are sorted according to habits selected by a user, and a plurality of translation targets selected based on the habits are provided for the user; meanwhile, the selection habits of the user are collected, and the user habits are trained again so as to deduce a translation target according with the selection habits of the user.

The principles and embodiments of the present invention have been described herein using specific examples, which are presented only to assist in understanding the method of the present invention and its core concepts. It should be noted that the foregoing describes only preferred embodiments and is not intended to limit the invention to specific structures; many modifications, adaptations and variations are possible and can be made by one skilled in the art without departing from the principles of the present invention. Such modifications, variations, combinations or adaptations, made within the spirit and scope of the invention as defined by the claims, may be directed to other uses and embodiments.

Claims

1. The translation system based on artificial intelligence is characterized by comprising:

the storage module is used for storing the constraint condition set;

the pre-translation unit is connected with the receiving unit and translates the new combination target and the source sentences to obtain a plurality of pre-translation targets;

the arrangement module is used for loading the constraint condition set, arranging and combining the plurality of pre-translation targets according to a first constraint condition, and inputting them into the second machine translation model in the resulting order, the second machine translation model revising the plurality of pre-translation targets in that order to obtain a plurality of target sentences;

the cache module is used for storing the constraint condition subunit;

the updating module is used for extracting at least one constraint condition subunit from the cache module based on the updating period set by the control module, comparing the constraint condition subunit with the constraint condition set, and checking whether the constraint condition subunit is a newly added element; if so, the constraint condition set is updated; if not, the constraint condition subunit is judged to be a repeated element and is then deleted directly using the deletion module;

the arrangement module is also provided with a priority determining part which determines the arrangement sequence of the pre-translation targets according to the weight value of each historical habit element in the constraint condition set by loading the constraint condition set;

the updating module is provided with at least one comparing unit, when the comparing unit compares the constraint condition subunit with the constraint condition set, if a certain element contained in the constraint condition subunit has an appearance frequency greater than a set threshold value in an updating period, and the element does not have the same or similar element characteristics in the constraint condition set and can cause habitual deviation of selection of a target sentence by a user, the element is set as a newly added element, a set weight value is given to the newly added element, and the newly added element is updated in the constraint condition set; if a certain element contained in the constraint condition subunit has an appearance frequency less than half of a set threshold value in an updating period, directly utilizing a deleting module to delete the element; if a certain element contained in the constraint condition subunit has an occurrence frequency which is more than half of the set threshold value and less than or equal to the set threshold value in the updating period, the certain element is registered in the cache module, and when the updating module in the next period starts updating, the certain element is added into the next updating module for updating.

2. The artificial intelligence based translation system of claim 1, wherein a recognition unit is disposed in the neural network model,

3. The artificial intelligence based translation system according to claim 1 or 2, wherein the historical habit elements comprise a combination of one or more of selection of type of translation target, selection of clickable keywords or key elements, selection of related parts of speech, semantic selection, selection of associated sentences, selection of referring to external associated thesaurus, browsing dwell time and number of click-to-view times.

4. The artificial intelligence based translation system of claim 1, wherein the target sentences correspond one-to-one or one-to-many with pre-translation targets.

5. The artificial intelligence based translation system of claim 1, wherein the user habit acquisition module has at least one monitoring unit, the monitoring unit monitors user operations and operation behaviors in the user habit acquisition module, and when the monitored user operation or operation behavior is the selection of a target sentence, a set tag set is used to label the source sentence corresponding to the selected target sentence and one or more of the plurality of new combination targets, so as to form acquired data.

6. The artificial intelligence based translation system of claim 1, wherein the recording module has at least one tracking unit therein, the tracking unit tracks a monitoring behavior of the monitoring unit on at least one of a user operation and an operation behavior as a target, and captures a monitoring behavior including a selection of a target sentence, and records in a tracking log.

7. The artificial intelligence based translation system of claim 1, wherein the second machine translation model has at least one revision unit therein, each revision unit having a first revision subunit, a second revision subunit, and a collation unit;