CN111126078B

CN111126078B - Translation method and device

Info

Publication number: CN111126078B
Application number: CN201911316920.3A
Authority: CN
Inventors: 张传强; 张睿卿; 熊皓; 何中军; 吴华; 李芝; 王海峰
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2019-12-19
Filing date: 2019-12-19
Publication date: 2023-04-07
Anticipated expiration: 2039-12-19
Also published as: CN111126078A

Abstract

Embodiments of the present disclosure provide a translation method and a translation device. The method uses a translation model comprising an encoder, a classifier and a decoder, and comprises the following steps: inputting a vector matrix determined from the word segmentation sequence of a first text into the encoder to obtain an intermediate representation output by the encoder; inputting the intermediate representation into the classifier to obtain a classification label output by the classifier; and, in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, inputting the intermediate representation into the decoder to obtain a second text output by the decoder. The method reduces the time consumed by the system while maintaining translation quality, is easy to build on top of an existing translation model, is simple to use, and controls the translation unit by introducing supervised learning.

Description

Translation method and device

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a translation method and apparatus.

Background

Simultaneous interpretation has developed rapidly over the last two years, and the major internet companies have successively launched their own simultaneous interpretation products.

Most existing simultaneous interpretation systems work in pipeline mode: text is first generated by automatic speech recognition (ASR), a sentence-breaking module is then called to break the text into sentences, a translation model is called to translate the broken text, and finally the translation result is displayed. Specifically, such a system either increases the frequency of sentence breaks and thereby reduces their granularity, or adopts a wait-k words model, in which translation starts once the recognition result exceeds K words and one more word is translated for each additional word recognized. When the sentence-breaking model judges that the sentence can be broken, the remaining tail is translated in one pass.
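The wait-k behaviour described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that one target word is produced per source word and that the target word at 0-based position t is emitted once k + t source words are available; the function name `wait_k_schedule` is hypothetical.

```python
def wait_k_schedule(num_source_words, k):
    """For each target position t (0-based), return how many source words
    have been read at the moment that target word is emitted.

    Assumptions (not from the source): one target word per source word,
    and position t is emitted once k + t source words are available.
    """
    schedule = []
    for t in range(num_source_words):
        # Once the sentence ends, no more source words arrive, so the
        # remaining tail is translated with the full sentence available.
        schedule.append(min(k + t, num_source_words))
    return schedule


print(wait_k_schedule(num_source_words=6, k=3))  # [3, 4, 5, 6, 6, 6]
```

The repeated final value shows the tail being translated in one pass after the sentence break.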

However, if the simultaneous interpretation system calls translation more often by frequently inserting punctuation, the probability of incorrect sentence breaks rises and translation quality suffers; if there is semantic dependency between clauses, translation errors increase further; and the latency of clause-level translation remains large. If the system adopts the wait-k words model, the latency problem is greatly alleviated, but an additional sentence-breaking module is still needed; otherwise the translation result lags further and further behind the recognition result. In Chinese-to-English translation, for example, an English sentence is on average about 1.25 times as long as a Chinese sentence with the same meaning (Huang and Zhao, "When to Finish? Optimal Beam Search for Neural Text Generation"), so if there is no "catch-up" at sentence breaks, the English output will always lag behind the Chinese. In addition, the wait-k model requires that a word be decoded at every step; when the recognition result carries insufficient information, the decoded word is likely to be wrong, which degrades the translation.
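The growing lag can be made concrete with a small back-of-the-envelope sketch. It assumes the 1.25 length ratio quoted above and a wait-k policy that has emitted n - k + 1 target words after reading n source words; the function name `target_lag` and this emission count are illustrative assumptions.

```python
def target_lag(num_source_words, k, length_ratio=1.25):
    """Target words still owed after reading num_source_words source words.

    Assumptions: the full translation needs length_ratio target words per
    source word, and wait-k has emitted n - k + 1 target words so far.
    """
    emitted = max(0, num_source_words - k + 1)
    needed = length_ratio * num_source_words
    return needed - emitted


for n in (8, 16, 32):
    print(n, target_lag(n, k=3))
# 8 4.0
# 16 6.0
# 32 10.0
```

Under these assumptions the backlog grows linearly with the amount of text read, which is why some form of catch-up at sentence breaks is needed.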

Disclosure of Invention

The embodiment of the disclosure provides a translation method and a translation device.

In a first aspect, an embodiment of the present disclosure provides a translation method that employs a translation model comprising an encoder, a classifier and a decoder, the method comprising: inputting a vector matrix determined from the word segmentation sequence of a first text into the encoder to obtain an intermediate representation output by the encoder; inputting the intermediate representation into the classifier to obtain a classification label output by the classifier; and, in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, inputting the intermediate representation into the decoder to obtain a second text output by the decoder.

In some embodiments, inputting the intermediate representation into the decoder to obtain the second text output by the decoder, in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, comprises: in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, and the historical translation result containing a translation result corresponding to the participles before the tail in that sequence, inputting the intermediate representation into the decoder, and also inputting into the decoder, as a constraint, the translation result in the historical translation result that corresponds to the participles before the tail, to obtain the second text output by the decoder.

In some embodiments, the training data samples for the translation model are determined based on the following steps: aligning training data of the translation model by using a word alignment tool to obtain alignment information of the training data; and taking the alignment information of the training data as a training data sample of the translation model.

In some embodiments, the word segmentation sequence of the first text is obtained via the following steps: recognizing an input first voice to obtain a first text; and performing word segmentation on the first text to obtain a word segmentation sequence of the first text.

In some embodiments, the method further comprises: generating a second voice based on the translated second text; and playing the second voice.

In a second aspect, an embodiment of the present disclosure provides a translation apparatus that employs a translation model comprising an encoder, a classifier and a decoder, the apparatus comprising: an encoder input unit configured to input a vector matrix determined from the word segmentation sequence of a first text into the encoder to obtain an intermediate representation output by the encoder; a classifier input unit configured to input the intermediate representation into the classifier to obtain a classification label output by the classifier; and a decoder input unit configured to, in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, input the intermediate representation into the decoder to obtain a second text output by the decoder.

In some embodiments, the decoder input unit is further configured to: in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, and the historical translation result containing a translation result corresponding to the participles before the tail in that sequence, input the intermediate representation into the decoder, and also input into the decoder, as a constraint, the translation result in the historical translation result that corresponds to the participles before the tail, to obtain the second text output by the decoder.

In some embodiments, the training data samples of the translation model used in the device are determined based on the following units: the training data alignment unit is configured to align the training data of the translation model by adopting a word alignment tool to obtain alignment information of the training data; and the training data determination unit is configured to use the alignment information of the training data as a training data sample of the translation model.

In some embodiments, the apparatus further comprises: the first voice recognition unit is configured to recognize input first voice to obtain a first text; the first text word segmentation unit is configured to segment words of the first text to obtain a word segmentation sequence of the first text.

In some embodiments, the apparatus further comprises: a second speech generating unit configured to generate second speech based on the translated second text; and a second voice playing unit configured to play the second voice.

In a third aspect, an embodiment of the present disclosure provides an electronic device/terminal/server, including: one or more processors; and a storage means for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the translation method described above.

In a fourth aspect, the embodiments of the present disclosure provide a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method of translation as described in any one of the above.

The translation method and device provided by the embodiments of the present disclosure use a translation model comprising an encoder, a classifier and a decoder. First, a vector matrix determined from the word segmentation sequence of a first text is input into the encoder to obtain the intermediate representation output by the encoder. Next, the intermediate representation is input into the classifier to obtain a classification label, which indicates whether the participle at the tail of the word segmentation sequence of the first text is an independent translation unit. Then, in response to the classification label indicating that the tail participle is an independent translation unit, the intermediate representation is input into the decoder, and the text output by the decoder is taken as the translated second text. In this process, the sentence-breaking module in the pipeline of a conventional system is removed and the translation unit is controlled by the translation model itself, so the time consumed by the system is reduced while translation quality is maintained. The approach is easy to build on top of an existing translation model, is simple to use, and controls the translation unit by introducing supervised learning. In some embodiments, the translation of each tail participle also takes into account the historical translation results of the preceding participles in the word segmentation sequence of the first text, which effectively addresses entity-reference disambiguation and domain-specific translation.

Drawings

Other features, objects, and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which the present disclosure may be applied;

FIG. 2a is a schematic flow chart diagram illustrating one embodiment of a method of translating in accordance with embodiments of the present disclosure;

FIG. 2b illustrates an exemplary block diagram of the translation model of FIG. 2 a;

FIG. 3 is an exemplary application scenario of a method of translation according to an embodiment of the present disclosure;

FIG. 4 is a flow diagram of another embodiment of a method of translating according to an embodiment of the present disclosure;

FIG. 5 is an exemplary block diagram of one embodiment of an apparatus for translation of the present disclosure;

FIG. 6 is a schematic block diagram of a computer system suitable for use with a server embodying embodiments of the present disclosure.

Detailed Description

The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.

It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the method of translation or apparatus of translation of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a translation-type application, a browser application, a shopping-type application, a search-type application, an instant messaging tool, a mailbox client, social platform software, and the like.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices supporting browser applications, including but not limited to tablet computers, laptop portable computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented, for example, as multiple pieces of software or software modules that provide distributed services, or as a single piece of software or software module. No particular limitation is imposed here.

The server 105 may be a server providing various services, such as a background server providing support for browser applications running on the terminal devices 101, 102, 103. The background server can analyze and process received data such as requests and feed the processing result back to the terminal devices.

The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented, for example, as multiple pieces of software or software modules that provide distributed services, or as a single piece of software or software module. No particular limitation is imposed here.

In practice, the method for translating provided by the embodiment of the present disclosure may be executed by the terminal devices 101, 102, 103 and/or the servers 105, 106, and the means for translating may also be disposed in the terminal devices 101, 102, 103 and/or the servers 105, 106.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.

With continued reference to fig. 2a, fig. 2a illustrates a flow 200 of one embodiment of a method of translation according to the present disclosure. The translation method comprises the following steps:

step 201, inputting the vector matrix determined based on the word segmentation sequence of the first text into the encoder to obtain the intermediate representation output by the encoder.

In this embodiment, the execution body of the translation method (e.g., the terminal or the server shown in fig. 1) may use a translation model with a classifier to control the translation unit during translation. The translation model here may be a multitask neural translation model and, specifically, may include an encoder, a classifier and a decoder.

The encoder receives the word segmentation sequence of the first text as input and encodes the information in the sequence into an intermediate representation. The classifier receives the intermediate representation as input and judges whether the participle at the tail of the word segmentation sequence of the first text is an independent translation unit; if so, the intermediate representation is input into the decoder. The decoder decodes the intermediate representation into the target language.

Specifically, the execution body may begin inputting the vector matrix determined from the word segmentation sequence of the first text into the encoder once it detects that the input first text contains K segmented words (K being an integer greater than 1), so as to obtain the intermediate representation output by the encoder.
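The flow described so far (wait for K segmented words, encode, classify, and decode only when the label marks an independent translation unit) can be sketched as follows. This is a minimal illustration, not the patented implementation: `translate_step` and all `toy_*` components are hypothetical stand-ins, and the label values 1 and -1 are taken from the training example in this description.

```python
from typing import Callable, List, Optional, Sequence

Matrix = List[List[float]]


def translate_step(
    tokens: Sequence[str],
    embed: Callable[[Sequence[str]], Matrix],
    encoder: Callable[[Matrix], Matrix],
    classifier: Callable[[Matrix], int],
    decoder: Callable[[Matrix], str],
    k: int = 2,
) -> Optional[str]:
    """Return the second text, or None when nothing should be decoded yet."""
    if len(tokens) < k:
        return None                       # fewer than K segmented words so far
    vector_matrix = embed(tokens)         # vector matrix from the segmentation
    intermediate = encoder(vector_matrix) # intermediate representation
    label = classifier(intermediate)      # 1: independent unit, -1: not yet
    if label == 1:
        return decoder(intermediate)      # decode only on a positive label
    return None


# Toy stand-ins (hypothetical), only to make the sketch runnable.
def toy_embed(tokens):
    return [[float(len(t))] for t in tokens]

def toy_encoder(matrix):
    return matrix

def toy_classifier(intermediate):
    return 1 if len(intermediate) % 2 == 0 else -1

def toy_decoder(intermediate):
    return f"<translation of {len(intermediate)} tokens>"


for prefix in (["w1"], ["w1", "w2"], ["w1", "w2", "w3"]):
    print(prefix, "->", translate_step(prefix, toy_embed, toy_encoder,
                                       toy_classifier, toy_decoder))
```

Passing the components in as callables keeps the gating logic separate from any particular encoder, classifier or decoder.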

The translation model may have the structure shown in fig. 2b. Its upper part is a conventional translation model (which may be an existing translation model or one developed in the future; this application imposes no limitation), for example a Transformer (Vaswani, Shazeer et al., "Attention Is All You Need") or an RNN; its lower part is a classification model attached after the encoder. The model can be optimized by supervised learning, with the classifier and the decoder jointly updating the encoder-side parameters. Specifically, large-scale source-language/target-language translation sentence pairs can be constructed first, and the end-to-end translation model is then trained on these sentence pairs to optimize the model parameters. At test time, the target-language text is typically generated directly from the given source-language text.
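The source does not state the training objective. One common way to let a classifier and a decoder jointly update a shared encoder is to minimise a weighted sum of a translation loss and a classification loss; the sketch below assumes that form. The function `joint_loss` and the weight `cls_weight` are hypothetical.

```python
import math


def joint_loss(token_probs, label_prob, label, cls_weight=1.0):
    """Weighted sum of a translation loss and a classification loss.

    token_probs: probability the decoder assigned to each reference token.
    label_prob:  probability the classifier assigned to label 1.
    label:       gold label, 1 (independent translation unit) or -1.
    cls_weight:  assumed trade-off weight; not specified in the source.
    """
    # Negative log-likelihood of the reference translation.
    translation_loss = -sum(math.log(p) for p in token_probs)
    # Binary cross-entropy for the gold label.
    p_correct = label_prob if label == 1 else 1.0 - label_prob
    classification_loss = -math.log(p_correct)
    return translation_loss + cls_weight * classification_loss


print(joint_loss(token_probs=[0.9, 0.8], label_prob=0.7, label=1))
```

Because both terms depend on the encoder output, gradients from either task would flow into the encoder parameters in an actual neural implementation.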

In some optional implementations of the present embodiment, the training data samples of the translation model are determined based on the following steps: aligning training data of the translation model by using a word alignment tool to obtain alignment information of the training data; and taking the alignment information of the training data as a training data sample of the translation model.

In this implementation, the training data of the translation model may be aligned using a word alignment tool from the prior art or from technology developed in the future (e.g., a TER-based aligner, GIZA++, or METEOR), so as to obtain alignment information of the training data. One specific example is as follows:

source text: i want a wallet of | excellent |.

Target text: I | want | a | magnificent | wallet.

According to the above alignment information, multiple pieces of training data for the new translation model can be generated. As shown in Table 1 below, each piece has two training targets: a classification label (binary, indicating whether the position can be used as an independent translation unit) and a translation target. Note that the last "|" in the input marks the position being classified at that moment, and W words follow it in total (W is the window size, taken to be 3 here).

Table 1:

Input	Classification label	Translation target
I want to	1	I
I want to one	-1	I
I want one pole	1	I want
I want to be one excellent	-1	I want
I want to be one is excellent	1	I want a
I want a wallet of	1	I want a magnificent
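The way such samples could be derived from alignment information can be sketched as follows. The sketch assumes a monotonic alignment given as ordered (source tokens, target text) pairs; the function `build_samples` and the placeholder source tokens `s1` ... `s5` are hypothetical, since the original source tokens are not recoverable here.

```python
def build_samples(aligned_units, window=3):
    """Build (input, label, translation) samples from aligned units.

    aligned_units: ordered list of (source_tokens, target_text) pairs,
                   assumed to be monotonically aligned.
    window:        W, the number of look-ahead words after the position.
    """
    source = [tok for src, _ in aligned_units for tok in src]

    # Map each unit-final position to the translation completed so far.
    boundaries = {}
    consumed, translated = 0, []
    for src, tgt in aligned_units:
        consumed += len(src)
        translated.append(tgt)
        boundaries[consumed] = " ".join(translated)

    samples, last_translation = [], ""
    for pos in range(1, len(source) + 1):
        label = 1 if pos in boundaries else -1
        if label == 1:
            last_translation = boundaries[pos]
        samples.append({
            "prefix": source[:pos],                  # words up to the "|"
            "lookahead": source[pos:pos + window],   # W words after the "|"
            "label": label,
            "translation": last_translation,
        })
    return samples


units = [(["s1"], "I"), (["s2", "s3"], "want"), (["s4", "s5"], "a")]
for s in build_samples(units):
    print(s["prefix"], "|", s["lookahead"], s["label"], repr(s["translation"]))
```

With these placeholder units the labels come out as 1, -1, 1, -1, 1 and the translation targets as "I", "I", "I want", "I want", "I want a", mirroring the first five rows of Table 1.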

Step 202, inputting the intermediate representation into a classifier to obtain a classification label output by the classifier.

In this embodiment, the execution body inputs the intermediate representation into the classifier and obtains the classification label output by the classifier, where the classification label indicates whether the participle at the tail of the word segmentation sequence of the first text is an independent translation unit.

The classifier may be any machine learning model, from the prior art or from technology developed in the future, that performs binary classification on its input; this application imposes no limitation. Examples include a binary classifier implemented with a support vector machine, a decision tree, or a logistic regression model.
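As an illustration of the logistic regression option, the sketch below scores the encoder vector at the tail position and maps the result to the labels 1 and -1. Using only the last vector, the weights, and the 0.5 threshold are assumptions made for this example; the source does not specify how the intermediate representation is fed to the classifier.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify_tail(intermediate, weights, bias=0.0, threshold=0.5):
    """Return 1 if the tail position is judged an independent unit, else -1.

    intermediate: list of encoder vectors, one per participle.
    weights/bias: logistic regression parameters (hypothetical values).
    """
    tail_vector = intermediate[-1]  # assumption: classify from the tail vector
    score = sum(w * x for w, x in zip(weights, tail_vector)) + bias
    return 1 if sigmoid(score) >= threshold else -1


print(classify_tail([[0.1, 0.2], [2.0, 1.0]], weights=[1.0, 1.0]))    # 1
print(classify_tail([[0.1, 0.2], [2.0, 1.0]], weights=[-1.0, -1.0]))  # -1
```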

Step 203, responding to the fact that the classification label indicates that the participle at the tail part in the participle sequence of the first text is an independent translation unit, inputting the intermediate representation into a decoder, and obtaining a second text output by the decoder.

In this embodiment, if the classification label indicates that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, an accurate translation can be obtained from the word segmentation sequence of the first text. Inputting the intermediate representation into the decoder at this point improves the completeness and accuracy of the second text output by the decoder.

In some optional implementations of this embodiment, inputting the intermediate representation into the decoder in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit comprises: in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, and the historical translation result containing a translation result corresponding to the participles before the tail in that sequence, inputting the intermediate representation into the decoder, and also inputting into the decoder, as a constraint, the translation result in the historical translation result that corresponds to the participles before the tail, to obtain the second text output by the decoder.

In this implementation, the historical translation result corresponding to the participles before the tail of the word segmentation sequence of the first text is used to constrain the translation of the word segmentation sequence of the first text, which improves translation accuracy and timeliness.
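One possible reading of "inputting the historical translation result as a constraint" is forced-prefix decoding: the earlier translation is fixed as the start of the output and the decoder only continues from it. The sketch below assumes that interpretation; `decode_with_history`, the `decode_next` callable and the toy decoder are hypothetical.

```python
def decode_with_history(decode_next, history, max_new_tokens=10, eos="</s>"):
    """Continue decoding from a fixed prefix taken from earlier translations.

    decode_next: callable mapping the current output prefix to the next token
                 (stands in for one decoder step; hypothetical interface).
    history:     tokens already translated for the participles before the tail.
    Returns (newly generated tokens, full output).
    """
    output = list(history)           # the historical result is kept verbatim
    for _ in range(max_new_tokens):
        token = decode_next(output)
        if token == eos:
            break
        output.append(token)
    return output[len(history):], output


# Toy decoder step that replays a fixed reference translation.
reference = ["I", "want", "a", "</s>"]

def toy_decode_next(prefix):
    return reference[len(prefix)]


new_tokens, full = decode_with_history(toy_decode_next, history=["I"])
print(new_tokens)  # ['want', 'a']
print(full)        # ['I', 'want', 'a']
```

Keeping the prefix fixed means earlier output is never revised, so only the new suffix needs to be shown or spoken.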

In one specific example, the second text output by the decoder is shown in Table 2 below.

Table 2:

According to the translation method of the embodiments of the present disclosure, the classifier determines whether the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, and the first text is translated only when it is, which improves the completeness and accuracy of the translated second text. Compared with the prior art, the sentence-breaking module in the pipeline of a conventional system is removed, which reduces the time consumed by the system. In some embodiments, when the tail participle is determined to be an independent translation unit, the translation of the word segmentation sequence of the first text takes into account the historical translation results of the participles before the tail, which effectively addresses entity-reference disambiguation and domain-specific translation.

An exemplary application scenario of the method of translation of the present disclosure is described below in conjunction with fig. 3.

As shown in fig. 3, fig. 3 illustrates one exemplary application scenario of a method of translation according to the present disclosure.

As shown in fig. 3, a method 300 of translation is executed in an electronic device 320, and the translation is performed by using a translation model, where the translation model includes an encoder, a classifier, and a decoder, and the method 300 includes:

inputting a vector matrix 302 determined based on a word segmentation sequence 301 of a first text into an encoder 303 to obtain an intermediate representation 304 output by the encoder 303;

inputting the intermediate representation 304 into a classifier 305, resulting in a classification label 306 output by the classifier 305;

in response to the classification label 306 indicating that the participle 307 at the tail of the word segmentation sequence 301 of the first text is an independent translation unit 308, inputting the intermediate representation 304 into a decoder 309 to obtain a second text 310 output by the decoder 309.

It should be understood that the application scenario of the method for translation shown in fig. 3 is only an exemplary description of the method for translation, and does not represent a limitation on the method. For example, the steps shown in fig. 3 above may be implemented in further detail. Other translation steps can be added on the basis of the above-mentioned figure 3.

With further reference to fig. 4, fig. 4 shows a schematic flow chart diagram of another embodiment of a method of translation according to the present disclosure.

As shown in fig. 4, the method 400 for translation according to this embodiment may include the following steps:

step 401, recognizing the input first voice to obtain a first text.

In this embodiment, the executing body (for example, the terminal or the server shown in fig. 1) of the method for translation may adopt a speech recognition technology in the prior art or a technology developed in the future to recognize the input first speech in real time and obtain the first text.

Step 402, performing word segmentation on the first text to obtain a word segmentation sequence of the first text.

In this embodiment, the executing entity may perform word segmentation on the first text according to a preset rule or by using a word segmentation tool in the prior art or in a technology developed in the future, so as to obtain a word segmentation sequence of the first text.

Step 403, inputting the vector matrix determined based on the word segmentation sequence of the first text into the encoder to obtain an intermediate representation output by the encoder.

Step 404, inputting the intermediate representation into a classifier to obtain a classification label output by the classifier.

The classifier may be any machine learning model, from the prior art or from technology developed in the future, that performs binary classification on its input; this application imposes no limitation. Examples include a binary classifier implemented with a support vector machine, a decision tree, or a logistic regression model.

Step 405, in response to the classification label indicating that the participle at the tail in the participle sequence of the first text is an independent translation unit, inputting the intermediate representation into a decoder to obtain a second text output by the decoder.

In this embodiment, if the classification tag indicates that the participle at the tail in the participle sequence of the first text is an independent translation unit, which indicates that an accurate translation can be obtained based on the participle sequence of the first text, the execution main body may input the intermediate representation to the decoder, so as to improve the integrity and accuracy of the second text output by the decoder.

Step 406, generating a second speech based on the translated second text.

In this embodiment, the execution body may convert the translated second text into voice using a text-to-speech tool, an AI speech synthesis tool, or the like.

Step 407, playing the second voice.

In this embodiment, the execution body may play the second voice generated in step 406, thereby completing the simultaneous interpretation.
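Steps 401 to 407 can be strung together as in the following sketch. Every component (`recognize`, `segment`, `translate`, `synthesize`, `play`) is a hypothetical callable supplied by the caller, and the toy versions below exist only to make the example runnable; no real speech or translation library is implied.

```python
def simultaneous_interpretation(audio_chunks, recognize, segment,
                                translate, synthesize, play):
    """Run the flow of steps 401-407 over a stream of audio chunks.

    translate(tokens) is assumed to return None when the tail participle
    is not an independent translation unit. Returns the texts that were
    synthesized and played.
    """
    first_text, spoken = "", []
    for chunk in audio_chunks:
        first_text += recognize(chunk)       # step 401: speech -> first text
        tokens = segment(first_text)         # step 402: word segmentation
        second_text = translate(tokens)      # steps 403-405: model with gate
        if second_text is not None:
            play(synthesize(second_text))    # steps 406-407: speak the result
            spoken.append(second_text)
    return spoken


# Toy stand-ins (hypothetical).
played = []
result = simultaneous_interpretation(
    audio_chunks=["a ", "b ", "c ", "d "],
    recognize=lambda chunk: chunk,                 # pretend audio is text
    segment=str.split,
    translate=lambda toks: " ".join(toks).upper() if len(toks) % 2 == 0 else None,
    synthesize=lambda text: f"<audio:{text}>",
    play=played.append,
)
print(result)  # ['A B', 'A B C D']
print(played)  # ['<audio:A B>', '<audio:A B C D>']
```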

Building on the translation method shown in fig. 2a, the translation method of the embodiment of fig. 4 details how the second voice is obtained from the input first voice and then played, thereby improving the accuracy and efficiency of simultaneous interpretation.

With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, an embodiment of the present disclosure provides an embodiment of an apparatus for translation, where the embodiment of the apparatus corresponds to the method embodiments shown in fig. 2 to fig. 4, and the apparatus may be specifically applied to the terminal device or the server.

As shown in fig. 5, the translation apparatus 500 of this embodiment uses a translation model comprising an encoder, a classifier and a decoder, and may include: an encoder input unit 510 configured to input a vector matrix determined from the word segmentation sequence of the first text into the encoder to obtain an intermediate representation output by the encoder; a classifier input unit 520 configured to input the intermediate representation into the classifier to obtain a classification label output by the classifier; and a decoder input unit 530 configured to, in response to the classification label indicating that the participle at the tail of the word segmentation sequence of the first text is an independent translation unit, input the intermediate representation into the decoder to obtain the second text output by the decoder.

In some embodiments, the training data samples of the translation model used in the apparatus are determined based on the following units (not shown in the figure): the training data alignment unit is configured to align the training data of the translation model by adopting a word alignment tool to obtain alignment information of the training data; and the training data determining unit is configured to use the alignment information of the training data as a training data sample of the translation model.

In some embodiments, the device further comprises (not shown in the figures): the first voice recognition unit is configured to recognize input first voice to obtain a first text; the first text word segmentation unit is configured to segment words of the first text to obtain a word segmentation sequence of the first text.

In some embodiments, the device further comprises (not shown in the figures): a second speech generating unit configured to generate second speech based on the translated second text; a second voice playing unit configured to play the second voice.

It should be understood that the units recited in the apparatus 500 correspond to the steps of the method described with reference to fig. 2 to fig. 4. Thus, the operations and features described above with respect to the method are equally applicable to the apparatus 500 and the units included therein, and are not described again here.

Referring now to fig. 6, a schematic block diagram is shown of an electronic device 600 (e.g., the server or terminal device of fig. 1) suitable for implementing embodiments of the present disclosure. Terminal devices in embodiments of the present disclosure may include, but are not limited to, devices such as notebook computers, desktop computers, and the like. The terminal device/server shown in fig. 6 is only an example, and should not limit the functions or the scope of use of the embodiments of the present disclosure.

As shown in fig. 6, the electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage means 608 including, for example, tape, hard disk, etc.; and communication means 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer means may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure.

It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: input a vector matrix determined based on the word segmentation sequence of the first text into an encoder to obtain an intermediate representation output by the encoder; input the intermediate representation into a classifier to obtain a classification label output by the classifier; and, in response to the classification label indicating that the word segment at the tail of the word segmentation sequence of the first text is an independent translation unit, input the intermediate representation into a decoder to obtain a second text output by the decoder.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, which may be described, for example, as: a processor including an encoder input unit, a classifier input unit, and a decoder input unit. The names of these units do not, in some cases, limit the units themselves. For example, the encoder input unit may also be described as "a unit that inputs a vector matrix determined based on the word segmentation sequence of the first text into the encoder to obtain an intermediate representation output by the encoder".

The foregoing description covers only the preferred embodiments of the disclosure and illustrates the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above-mentioned features, but also encompasses other technical solutions formed by any combination of the above-mentioned features or their equivalents without departing from the spirit of the invention. One example is a technical solution formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure.

Claims

1. A method of translation employing a translation model comprising an encoder, a classifier, and a decoder, the method comprising:

inputting a vector matrix determined based on a word segmentation sequence of a first text into the encoder to obtain an intermediate representation output by the encoder;

inputting the intermediate representation into the classifier to obtain a classification label output by the classifier;

and in response to the classification label indicating that the word segment located at the tail of the word segmentation sequence of the first text is an independent translation unit, and a translation result corresponding to the word segments located before the tail in the word segmentation sequence of the first text existing in historical translation results, inputting the intermediate representation into the decoder, taking the translation result in the historical translation results that corresponds to the word segments located before the tail in the word segmentation sequence of the first text as a constraint, and inputting the constraint into the decoder, to obtain a second text output by the decoder.

2. The method of claim 1, wherein the training data samples of the translation model are determined based on:

aligning training data of the translation model by using a word alignment tool to obtain alignment information of the training data;

and taking the alignment information of the training data as a training data sample of the translation model.

3. The method of claim 1, wherein the word segmentation sequence of the first text is obtained via:

recognizing an input first voice to obtain a first text;

and performing word segmentation on the first text to obtain a word segmentation sequence of the first text.

4. The method of any of claims 1-3, wherein the method further comprises:

generating a second voice based on the translated second text;

and playing the second voice.

5. An apparatus for translation employing a translation model comprising an encoder, a classifier, and a decoder, the apparatus comprising:

an encoder input unit configured to input a vector matrix determined based on a word segmentation sequence of a first text into the encoder, resulting in an intermediate representation output by the encoder;

a classifier input unit configured to input the intermediate representation into the classifier, resulting in a classification label output by the classifier;

a decoder input unit configured to, in response to the classification label indicating that the word segment located at the tail of the word segmentation sequence of the first text is an independent translation unit, and a translation result corresponding to the word segments located before the tail in the word segmentation sequence of the first text existing in historical translation results, input the intermediate representation into the decoder, take the translation result in the historical translation results that corresponds to the word segments located before the tail in the word segmentation sequence of the first text as a constraint, and input the constraint into the decoder, to obtain a second text output by the decoder.

6. The apparatus of claim 5, wherein training data samples of a translation model used in the apparatus are determined based on:

a training data alignment unit configured to align the training data of the translation model by using a word alignment tool to obtain alignment information of the training data;

a training data determination unit configured to use alignment information of the training data as a training data sample of the translation model.

7. The apparatus of claim 5, wherein the apparatus further comprises:

a first voice recognition unit configured to recognize an input first voice to obtain a first text;

a first text word segmentation unit configured to segment the first text to obtain a word segmentation sequence of the first text.

8. The apparatus of any of claims 5-7, wherein the apparatus further comprises:

a second voice generating unit configured to generate a second voice based on the translated second text;

a second voice playing unit configured to play the second voice.

9. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-4.

10. A computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-4.