CN110321567B - Neural machine translation method, device and equipment based on attention mechanism

Info

Publication number
CN110321567B
Authority
CN
China
Prior art keywords
distance
tensor
alignment
language
calculating
Prior art date
Legal status
Active
Application number
CN201910539986.2A
Other languages
Chinese (zh)
Other versions
CN110321567A (en)
Inventor
朱宪超
Current Assignee
Sichuan Lan Bridge Information Technology Co ltd
Original Assignee
Sichuan Lan Bridge Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Sichuan Lan Bridge Information Technology Co ltd
Priority to CN201910539986.2A
Publication of CN110321567A
Application granted
Publication of CN110321567B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a neural machine translation method, device and equipment based on an attention mechanism. The method comprises: obtaining the source language and the target language during translation, wherein the source language refers to the language information to be translated and the target language refers to the translated language information; calculating the distance tensor of the source language and the target language, wherein the distance tensor refers to a distance weight; and using the distance tensor when calculating the alignment tensor with an alignment function, so that the neural machine translation result meets expectations. The application solves the technical problem of poor translation quality: it can effectively improve the alignment performance of the attention function and improve both the translation quality and its score.

Description

Neural machine translation method, device and equipment based on attention mechanism
Technical Field
The application relates to the field of neural machine translation, in particular to a neural machine translation method, device and equipment based on an attention mechanism.
Background
Neural machine translation is a machine translation method. It is built on an encoding-decoding system: the encoder encodes the source-language sequence and extracts the information in the source language, and the decoder converts that information into another language, the target language, thereby completing the translation.
The inventors found that in neural machine translation, the influence of similar words and the like leads to poor translation quality.
No effective solution has yet been proposed for the problem of poor translation quality in the related art.
Disclosure of Invention
The application mainly aims to provide a neural machine translation method, device and equipment based on an attention mechanism, so as to solve the problem of poor translation quality.
To achieve the above object, according to one aspect of the present application, there is provided a neural machine translation method based on an attention mechanism.
The neural machine translation method based on the attention mechanism according to the present application comprises the following steps: acquiring the source language and the target language during translation, wherein the source language refers to the language information to be translated, and the target language refers to the translated language information; calculating the distance tensor of the source language and the target language, wherein the distance tensor refers to a distance weight; and using the distance tensor in the process of calculating the alignment tensor with the attention mechanism so that the neural machine translation result meets expectations, wherein the alignment tensor is calculated with an alignment function.
Further, using the distance tensor in the process of calculating the alignment tensor with the attention mechanism includes: introducing the distance tensor into the attention mechanism for calculation, and subtracting part of the distance tensor from the alignment tensor output by the attention mechanism.
Further, calculating the distance tensor of the source language and the target language includes: calculating a distance parameter and substituting it into the distance tensor calculation.
Further, calculating the distance parameter and substituting it into the distance tensor calculation includes: taking the source-language word vectors and the target-language word vectors from the attention function's input tensors as the initial quantities of the calculation; calculating the Euclidean distance between the source-language word vectors and the target-language word vectors to obtain a distance tensor; and normalizing the distance tensor to obtain a new distance tensor.
Further, the method is applied to a seq2seq framework model based on the attention mechanism.
To achieve the above object, according to another aspect of the present application, there is provided a neural machine translation device based on an attention mechanism.
The neural machine translation device based on the attention mechanism according to the present application includes: an acquisition module, configured to acquire the source language and the target language during translation, wherein the source language refers to the language information to be translated and the target language refers to the translated language information; a calculation module, configured to calculate the distance tensor of the source language and the target language, wherein the distance tensor refers to a distance weight; and a substitution module, configured to use the distance tensor in the process of calculating the alignment tensor with an attention mechanism so that the neural machine translation result meets expectations, wherein the alignment tensor is calculated with an alignment function.
Further, the substitution module is configured to introduce the distance tensor into the attention mechanism for calculation, and to subtract part of the distance tensor from the alignment tensor output by the attention mechanism.
Further, the calculation module is configured to calculate the distance parameter and substitute it into the distance tensor calculation.
Further, the calculation module is further configured to take the source-language word vectors and the target-language word vectors from the attention function's input tensors as the initial quantities of the calculation; to calculate the Euclidean distance between the source-language word vectors and the target-language word vectors to obtain a distance tensor; and to normalize the distance tensor to obtain a new distance tensor.
To achieve the above object, according to still another aspect of the present application, there is provided a processing apparatus including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the neural machine translation method when executing the program.
According to the neural machine translation method, device and equipment based on the attention mechanism of the present application, the source language and the target language are acquired during translation and the distance tensor of the two is calculated, so that the distance tensor can be used in the process of calculating the alignment tensor with the attention mechanism and the neural machine translation result meets expectations. This achieves the technical effect of improving the alignment of the attention function and thereby solves the technical problem of poor translation quality.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application, are incorporated in and constitute a part of this specification. The drawings and their description are illustrative of the application and are not to be construed as unduly limiting the application. In the drawings:
FIG. 1 is a schematic flow diagram of a neural machine translation method based on an attention mechanism according to an embodiment of the present application;
FIG. 2 is a schematic flow diagram of a neural machine translation method based on an attention mechanism, according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a neural machine translation device based on an attention mechanism according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, the technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the application herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the present application, the terms "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "lateral", "longitudinal" and the like indicate an azimuth or a positional relationship based on that shown in the drawings. These terms are only used to better describe the present application and its embodiments and are not intended to limit the scope of the indicated devices, elements or components to the particular orientations or to configure and operate in the particular orientations.
Also, some of the terms described above may be used to indicate other meanings in addition to orientation or positional relationships, for example, the term "upper" may also be used to indicate some sort of attachment or connection in some cases. The specific meaning of these terms in the present application will be understood by those of ordinary skill in the art according to the specific circumstances.
Furthermore, the terms "mounted," "configured," "provided," "connected," "coupled," and "sleeved" are to be construed broadly. For example, it may be a fixed connection, a removable connection, or a unitary construction; may be a mechanical connection, or an electrical connection; may be directly connected, or indirectly connected through intervening media, or may be in internal communication between two devices, elements, or components. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
As shown in fig. 1, the method includes steps S102 to S106 as follows:
step S102, acquiring a source language and a target language during translation,
the source language refers to language information to be translated, and the target language refers to translated language information.
When translating, the method determines the language information to be translated and can acquire both the untranslated and the translated language information.
Step S104, calculating distance tensors of the source language and the target language,
the distance tensor refers to the distance weight.
Specifically, the tensors Q and K formed by the word vectors of the source language and the target language are taken as the initial quantities of the calculation, and the Euclidean distance between Q and K is calculated to obtain the distance tensor.
By calculating the distance tensor of the source language and the target language, the word-vector distance is introduced, which widens the differences in alignment: similar words receive a higher degree of correspondence, dissimilar words a lower one, and the translation improves.
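As a concrete illustration of this step, the sketch below computes the pairwise Euclidean distances between two sets of word vectors and softmax-normalizes them row-wise. This is a minimal NumPy sketch; the function name, shapes, and the row-wise normalization axis are assumptions, since the patent does not fix them:

import numpy as np

def distance_tensor(Q, K):
    # Q: word vectors of one sentence, shape [m, d] (hypothetical shapes)
    # K: word vectors of the other sentence, shape [n, d]
    # Pairwise Euclidean distances ||q_i - k_j||, shape [m, n]
    dist = np.linalg.norm(Q[:, None, :] - K[None, :, :], axis=-1)
    # Softmax normalization turns raw distances into distance weights
    e = np.exp(dist - dist.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

Under this sketch, pairs of similar words yield near-zero raw distances and therefore small distance weights, while dissimilar pairs yield large ones, which is what allows the later subtraction to widen the alignment gap.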
Step S106, using the distance tensor in the process of calculating the alignment tensor by using the attention mechanism to make the neural machine translation result accord with the expectation,
the alignment tensor is calculated using an alignment function.
When applied to a neural machine translation model based on an attention mechanism, the distance tensor is introduced into the process in which the attention mechanism calculates the alignment tensor. Because the word-vector distance between aligned sentences represents the degree of difference between the two sentences, adding the distance parameter to the alignment-function calculation effectively widens the gap between the alignment probabilities of different words, making the alignment more effective.
It should be noted that using the distance tensor in the attention-based calculation of the alignment tensor is applicable to all neural machine translation models that contain an attention mechanism, without modifying the model framework.
It should be noted that the neural machine translation result meets expectations when the quality and the score of the translation result satisfy the preset translation requirement.
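The patent does not name the metric behind this score. As one hedged possibility, the check could compare a BLEU score, computed here with the sacrebleu package, against a preset requirement; the 30-point threshold below is purely an illustrative assumption:

import sacrebleu

hypotheses = ["the cat sat on the mat"]        # system translations
references = [["the cat sat on the mat"]]      # one stream of references
bleu = sacrebleu.corpus_bleu(hypotheses, references)
meets_expectations = bleu.score >= 30.0        # assumed preset requirement
print(round(bleu.score, 2), meets_expectations)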
From the above description, it can be seen that the following technical effects are achieved:
By acquiring the source language and the target language during translation and calculating the distance tensor of the two, the distance tensor is used in the process of calculating the alignment tensor with the attention mechanism, and the neural machine translation result meets the expected aim; this realizes the technical effect of improving the alignment of the attention function and thus solves the technical problem of poor translation quality.
According to an embodiment of the present application, as a preference in the present embodiment, using the distance tensor in the process of calculating the alignment tensor with the attention mechanism includes: introducing the distance tensor into the attention mechanism for calculation, and subtracting part of the distance tensor from the alignment tensor output by the attention mechanism.
Specifically, during translation the word-vector distance between the input sentences of the source language and the target language is calculated to obtain the distance tensor; the distance tensor is introduced into the attention calculation, and part of the distance tensor is subtracted from the alignment tensor output by the attention mechanism, yielding a more effective output alignment tensor.
According to an embodiment of the present application, as a preference in the present embodiment, calculating the distance tensor of the source language and the target language includes: calculating the distance parameter and substituting it into the distance tensor calculation.
In particular, the distance parameter is calculated first and then substituted into the distance tensor calculation.
Consider the existing alignment process of the attention function: the similarity of the word vectors of the two input sentences is calculated, and a series of further calculations then yields the alignment function. In this process the relative distance is never introduced into the calculation, so distance parameters such as the word-vector distance play no part in the alignment function. For example, when calculating the alignment of "drink" and "drink", the distance between the two word vectors is essentially 0, whereas when calculating the alignment of "drink" and "distance", the distance between the two word vectors is large. Once the distance parameter has been calculated, the word-vector distance can be introduced into the distance tensor calculation, which widens the differences in alignment: similar words receive a higher degree of correspondence, dissimilar words a lower one, and the translation improves.
According to an embodiment of the present application, as a preferred embodiment, as shown in fig. 2, the process of calculating the distance parameter and substituting it into the distance tensor calculation includes:
step S202, taking the source-language word vectors and the target-language word vectors from the attention function's input tensors as the initial quantities of the calculation;
step S204, calculating the Euclidean distance between the source-language word vectors and the target-language word vectors to obtain a distance tensor;
step S206, normalizing the distance tensor to obtain a new distance tensor.
Specifically, one possible calculation process is as follows:
step S1, letting the hidden-layer output vectors be k_i and performing the dot product operation S = QK^T to obtain S_i;
step S2, performing softmax normalization to obtain the alignment weights A_i, with the calculation formula:
A_i = softmax(S_i) = exp(S_i) / ∑_j exp(S_j)
step S3, calculating the Euclidean distance between the target-language word vector z_j and the source-language word vector v_i, and performing softmax normalization on the result to obtain the distance tensor h_i;
step S4, introducing the distance tensor to obtain the improved alignment weight a_i, with the calculation formula a_i = A_i - 0.5 h_i;
step S5, multiplying a_i by V_i and summing to obtain Attention(Q, K, V), with the calculation formula:
Attention(Q, K, V) = ∑_i a_i V_i
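Putting steps S1 to S5 together, the following is a minimal NumPy sketch of the modified attention computation. Only S = QK^T, the softmax normalizations, a_i = A_i - 0.5 h_i, and the final weighted sum come from the patent; the shapes, variable names, and toy data are illustrative assumptions:

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distance_attention(Q, K, V, Z, coef=0.5):
    # Q: queries [m, d]; K: keys [n, d]; V: values [n, d]
    # Z: target-language word vectors [m, d] (assumed pairing with V)
    S = Q @ K.T                    # step S1: S = QK^T
    A = softmax(S)                 # step S2: alignment weights A_i
    # step S3: Euclidean distances between z_j and v_i, softmax-normalized
    h = softmax(np.linalg.norm(Z[:, None, :] - V[None, :, :], axis=-1))
    a = A - coef * h               # step S4: a_i = A_i - 0.5 h_i
    return a @ V                   # step S5: Attention = sum_i a_i V_i

# Toy usage with random embeddings (illustrative only)
rng = np.random.default_rng(0)
m, n, d = 4, 5, 8                  # target length, source length, dimension
Q, Z = rng.normal(size=(m, d)), rng.normal(size=(m, d))
K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d))
print(distance_attention(Q, K, V, Z).shape)   # (4, 8)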
According to an embodiment of the present application, as a preference in this embodiment, the method is applied to a seq2seq framework model based on the attention mechanism. seq2seq stands for Sequence to Sequence; it is a general encoder-decoder framework that can be used in scenarios such as machine translation, text summarization, conversation modeling and image captioning.
Specifically, the basic Seq2Seq model consists of two RNNs, an Encoder and a Decoder. In its most basic form it has three parts: the Encoder, the Decoder, and an intermediate state vector connecting the two. The Encoder learns the input and encodes it into a fixed-size state vector S, which is then passed to the Decoder; the Decoder produces the output by learning from the state vector S.
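For orientation, here is a minimal sketch of such an encoder-decoder in PyTorch; the GRU cells, vocabulary size, and dimensions are illustrative assumptions, not the patent's implementation:

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)

    def forward(self, src):
        _, state = self.rnn(self.emb(src))  # fixed-size state vector S
        return state

class Decoder(nn.Module):
    def __init__(self, vocab, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, tgt, state):
        h, _ = self.rnn(self.emb(tgt), state)
        return self.out(h)                  # logits over the target vocabulary

enc, dec = Encoder(1000, 64), Decoder(1000, 64)
src = torch.randint(0, 1000, (2, 7))        # batch of 2 source sentences
tgt = torch.randint(0, 1000, (2, 5))
print(dec(tgt, enc(src)).shape)             # torch.Size([2, 5, 1000])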
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
There is also provided, in accordance with an embodiment of the present application, an attention-based neural machine translation device for implementing the above method. As shown in fig. 3, the device includes: an acquisition module 10, configured to acquire the source language and the target language during translation, where the source language refers to the language information to be translated and the target language refers to the translated language information; a calculation module 20, configured to calculate the distance tensor of the source language and the target language, where the distance tensor refers to a distance weight; and a substitution module 30, configured to use the distance tensor in the process of calculating the alignment tensor with an attention mechanism so that the neural machine translation result meets expectations, where the alignment tensor is calculated with an alignment function.
The source language in the acquisition module 10 of the embodiment of the present application refers to the language information to be translated, and the target language refers to the translated language information.
When translating, the device determines the language information to be translated and can acquire both the untranslated and the translated language information.
The distance tensor in the calculation module 20 of the embodiment of the present application refers to a distance weight.
Specifically, the tensors Q and K formed by the word vectors of the source language and the target language are taken as the initial quantities of the calculation, and the Euclidean distance between Q and K is calculated to obtain the distance tensor.
By calculating the distance tensor of the source language and the target language, the word-vector distance is introduced, which widens the differences in alignment: similar words receive a higher degree of correspondence, dissimilar words a lower one, and the translation improves.
The alignment tensor in the substitution module 30 of the embodiment of the present application is calculated using an alignment function.
When applied to a neural machine translation model based on an attention mechanism, the distance tensor is introduced into the process in which the attention mechanism calculates the alignment tensor. Because the word-vector distance between aligned sentences represents the degree of difference between the two sentences, adding the distance parameter to the alignment-function calculation effectively widens the gap between the alignment probabilities of different words, making the alignment more effective.
It should be noted that using the distance tensor in the attention-based calculation of the alignment tensor is applicable to all neural machine translation models that contain an attention mechanism, without modifying the model framework.
According to an embodiment of the present application, as a preference in this embodiment, the substitution module 30 is configured to introduce the distance tensor into the attention mechanism for calculation, and to subtract part of the distance tensor from the alignment tensor output by the attention mechanism.
In the embodiment of the application, during translation the word-vector distance between the input sentences of the source language and the target language is calculated to obtain the distance tensor; the distance tensor is introduced into the attention calculation, and part of the distance tensor is subtracted from the alignment tensor output by the attention mechanism, yielding a more effective output alignment tensor.
According to an embodiment of the present application, as a preference in this embodiment, the calculation module 20 is configured to calculate the distance parameter and substitute it into the distance tensor calculation.
In particular, the distance parameter is calculated first and then substituted into the distance tensor calculation.
Consider the existing alignment process of the attention function: the similarity of the word vectors of the two input sentences is calculated, and a series of further calculations then yields the alignment function. In this process the relative distance is never introduced into the calculation, so distance parameters such as the word-vector distance play no part in the alignment function. For example, when calculating the alignment of "drink" and "drink", the distance between the two word vectors is essentially 0, whereas when calculating the alignment of "drink" and "distance", the distance between the two word vectors is large. Once the distance parameter has been calculated, the word-vector distance can be introduced into the distance tensor calculation, which widens the differences in alignment: similar words receive a higher degree of correspondence, dissimilar words a lower one, and the translation improves.
According to an embodiment of the present application, as a preference in this embodiment, the calculation module 20 is further configured to take the source-language word vectors and the target-language word vectors from the attention function's input tensors as the initial quantities of the calculation; to calculate the Euclidean distance between the source-language word vectors and the target-language word vectors to obtain a distance tensor; and to normalize the distance tensor to obtain a new distance tensor.
Specifically, one possible calculation process is as follows:
step S1, letting the hidden-layer output vectors be k_i and performing the dot product operation S = QK^T to obtain S_i;
step S2, performing softmax normalization to obtain the alignment weights A_i, with the calculation formula:
A_i = softmax(S_i) = exp(S_i) / ∑_j exp(S_j)
step S3, calculating the Euclidean distance between the target-language word vector z_j and the source-language word vector v_i, and performing softmax normalization on the result to obtain the distance tensor h_i;
step S4, introducing the distance tensor to obtain the improved alignment weight a_i, with the calculation formula a_i = A_i - 0.5 h_i;
step S5, multiplying a_i by V_i and summing to obtain Attention(Q, K, V), with the calculation formula:
Attention(Q, K, V) = ∑_i a_i V_i
In another embodiment of the present application, there is also provided a processing device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the neural machine translation method when executing the program. The neural machine translation method comprises the following steps:
acquiring the source language and the target language during translation, wherein the source language refers to the language information to be translated, and the target language refers to the translated language information;
calculating the distance tensor of the source language and the target language, wherein the distance tensor refers to a distance weight;
using the distance tensor in the process of calculating the alignment tensor with the attention mechanism so that the neural machine translation result meets expectations, wherein the alignment tensor is calculated with an alignment function.
It will be apparent to those skilled in the art that the modules or steps of the application described above may be implemented with a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented in program code executable by computing devices, so that they can be stored in a storage device and executed by the computing devices; they may also be fabricated individually as integrated circuit modules, or multiple of their modules or steps may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (4)

1. A neural machine translation method based on an attention mechanism, comprising:
acquiring a source language and a target language during translation, wherein the source language refers to the language information to be translated, and the target language refers to the translated language information;
calculating a distance tensor of the source language and the target language, wherein the distance tensor refers to a distance weight;
using the distance tensor in the process of calculating an alignment tensor with an attention mechanism so that the neural machine translation result meets expectations, wherein the alignment tensor is calculated with an alignment function;
wherein using the distance tensor in the process of calculating the alignment tensor with the attention mechanism includes:
introducing the distance tensor into the attention mechanism for calculation, and subtracting part of the distance tensor from the alignment tensor output by the attention mechanism;
wherein calculating the distance tensor of the source language and the target language includes: calculating a distance parameter and substituting it into the distance tensor calculation;
and wherein calculating the distance parameter and substituting it into the distance tensor calculation includes:
taking the source-language word vectors and the target-language word vectors from the attention function's input tensors as the initial quantities of the calculation;
calculating the Euclidean distance between the source-language word vectors and the target-language word vectors to obtain a distance tensor;
and normalizing the distance tensor to obtain a new distance tensor.
2. The neural machine translation method of claim 1, wherein the method uses a seq2seq framework model based on the attention mechanism.
3. A neural machine translation device based on an attention mechanism, comprising:
an acquisition module, configured to acquire a source language and a target language during translation, wherein the source language refers to the language information to be translated, and the target language refers to the translated language information;
a calculation module, configured to calculate a distance tensor of the source language and the target language, wherein the distance tensor refers to a distance weight;
a substitution module, configured to use the distance tensor in the process of calculating an alignment tensor with an attention mechanism so that the neural machine translation result meets expectations, wherein the alignment tensor is calculated with an alignment function;
wherein the substitution module is configured to introduce the distance tensor into the attention mechanism for calculation, and to subtract part of the distance tensor from the alignment tensor output by the attention mechanism;
the calculation module is configured to calculate the distance parameter and substitute it into the distance tensor calculation;
and the calculation module is further configured to:
take the source-language word vectors and the target-language word vectors from the attention function's input tensors as the initial quantities of the calculation;
calculate the Euclidean distance between the source-language word vectors and the target-language word vectors to obtain a distance tensor;
and normalize the distance tensor to obtain a new distance tensor.
4. A processing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the neural machine translation method of any one of claims 1 to 2 when executing the program.
CN201910539986.2A 2019-06-20 2019-06-20 Neural machine translation method, device and equipment based on attention mechanism Active CN110321567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910539986.2A CN110321567B (en) 2019-06-20 2019-06-20 Neural machine translation method, device and equipment based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910539986.2A CN110321567B (en) 2019-06-20 2019-06-20 Neural machine translation method, device and equipment based on attention mechanism

Publications (2)

Publication Number Publication Date
CN110321567A (en) 2019-10-11
CN110321567B (en) 2023-08-11

Family

ID=68119909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910539986.2A Active CN110321567B (en) 2019-06-20 2019-06-20 Neural machine translation method, device and equipment based on attention mechanism

Country Status (1)

Country Link
CN (1) CN110321567B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717342B (en) * 2019-09-27 2023-03-14 电子科技大学 Distance parameter alignment translation method based on transformer
US11875131B2 (en) * 2020-09-16 2024-01-16 International Business Machines Corporation Zero-shot cross-lingual transfer learning
CN112511172B (en) * 2020-11-11 2023-03-24 山东云海国创云计算装备产业创新中心有限公司 Decoding method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6233545B1 (en) * 1997-05-01 2001-05-15 William E. Datig Universal machine translator of arbitrary languages utilizing epistemic moments
CN108647214B (en) * 2018-03-29 2020-06-30 中国科学院自动化研究所 Decoding method based on deep neural network translation model
CN109710951B (en) * 2018-12-27 2023-10-17 北京百度网讯科技有限公司 Auxiliary translation method, device, equipment and storage medium based on translation history

Also Published As

Publication number Publication date
CN110321567A (en) 2019-10-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant