JP2021026760A - Machine translation apparatus and method

Info

Publication number: JP2021026760A
Application number: JP2020097082A
Authority: JP
Inventors: シーホングオ; Xihong Guo; シンユグオ; xin yu Guo; アンシンリー; Anxin Li; ランチェン; Lan Chen
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2019-07-31
Filing date: 2020-06-03
Publication date: 2021-02-22
Also published as: CN112395888A

Abstract

PROBLEM TO BE SOLVED: To provide a new machine translation method and apparatus capable of sharing the same decoding network across different translation directions. SOLUTION: A machine translation device 800 includes: a pre-processing unit that generates, for an input text in a source language, a plurality of vectors corresponding to the respective source-language words in the input text; an encoding unit that encodes the plurality of vectors to generate a plurality of encoded vectors; and a decoding unit that inputs the plurality of encoded vectors and information indicating a translation direction into a single decoding network and outputs, from the single decoding network, an output text in a target language corresponding to the source-language input text, the output order of the target-language words in the output text matching the translation direction. When the translation direction indicated by the information input into the single decoding network is changed, the parameters of the nodes in the single decoding network are not changed. SELECTED DRAWING: Figure 8

Description

The present invention relates to the field of natural language processing, and more specifically to a machine translation apparatus and method.

Machine translation is the process of using a computer to convert one natural language (the source language) into another natural language (the target language). It is a branch of computational linguistics and one of the ultimate goals of artificial intelligence, and it has significant scientific research value. Machine translation also has significant practical value: with economic globalization and the rapid development of the Internet, machine translation technology has played an increasingly important role in promoting political, economic, and cultural exchange.

With major advances in deep learning research, machine translation based on artificial neural networks (Neural Machine Translation, NMT) has gradually come to prominence. The core of the technology is a deep neural network that has a very large number of nodes (neurons) and can automatically learn translation knowledge from a corpus. After a sentence in one language is vectorized, it is passed through the network layer by layer and converted into a representation that the computer can "understand"; a translation in another language is then generated through multiple layers of complex transfer operations. This realizes a mode of translation that "understands the language and then generates the translation." The greatest advantage of this translation method is that the translation reads fluently, conforms better to grammatical norms, and is easy to understand. Compared with conventional translation techniques, the quality is improved "by leaps and bounds."

Google has provided a new architecture (Transformer) for implementing NMT. The Transformer architecture includes two parts: an encoder and a decoder. The encoder produces a deep semantic representation of the input text, and the decoder generates the output text based on that semantic representation. Both the encoder and the decoder are formed by stacking multiple network layers.

A conventional decoding network can only generate and output a translation word by word in a preset translation direction (for example, left to right or right to left). That is, one decoding network can only be built and trained for a fixed translation direction. If the user wishes to change the translation direction, another decoding network corresponding to that direction must be built and trained. This is clearly unfavorable for saving time and software and hardware costs.

In view of the above circumstances, it is desired to provide a new machine translation method and apparatus capable of sharing the same decoding network across different translation directions.

According to one aspect of the present disclosure, there is provided a machine translation method including: a step of processing an input text in a source language to generate a plurality of vectors corresponding to the respective source-language words in the input text; a step of encoding the plurality of vectors to generate a plurality of encoded vectors; a step of inputting the plurality of encoded vectors and information indicating a translation direction into a single decoding network; and a step of outputting, from the single decoding network, an output text in a target language corresponding to the source-language input text, the output order of the target-language words in the output text matching the translation direction, wherein, when the translation direction indicated by the information input into the single decoding network is changed, the parameters of the nodes in the single decoding network are not changed.
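The interface described in this aspect can be illustrated with a toy sketch. It is not a neural network and is not the disclosed implementation: the class `ToySingleDecoder`, the direction tags `<l2r>` and `<r2l>`, and the use of plain tokens in place of encoded vectors are all assumptions made for illustration. The only point shown is that one set of parameters serves both directions, and that changing the direction input changes the output order without touching the parameters.

```python
from typing import Dict, List

L2R = "<l2r>"  # hypothetical tag: emit target words left to right
R2L = "<r2l>"  # hypothetical tag: emit target words right to left


class ToySingleDecoder:
    """Stand-in for the single decoding network (not a neural model)."""

    def __init__(self, params: Dict[str, str]) -> None:
        # `params` plays the role of the node parameters; decode() never modifies it.
        self.params = params

    def decode(self, encoded: List[str], direction: str) -> List[str]:
        if direction not in (L2R, R2L):
            raise ValueError(f"unknown direction tag: {direction}")
        # The direction input only decides the order in which positions are emitted.
        order = list(range(len(encoded)))
        if direction == R2L:
            order.reverse()
        return [self.params[encoded[i]] for i in order]


decoder = ToySingleDecoder({"ich": "I", "liebe": "love", "dich": "you"})
source = ["ich", "liebe", "dich"]  # tokens stand in for the encoded vectors
snapshot = dict(decoder.params)

left_to_right = decoder.decode(source, L2R)
right_to_left = decoder.decode(source, R2L)
```

In a real system the direction information would more plausibly be an extra input token or embedding fed to a trained decoder; the dictionary lookup here only stands in for that behavior.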

In the method according to an embodiment of the present disclosure, the decoding network is trained by: processing a training input text in the source language to generate a plurality of training vectors corresponding to the respective source-language words in the training input text; encoding the plurality of training vectors to generate a plurality of encoded training vectors; inputting the plurality of encoded training vectors and information indicating a translation direction into the single decoding network; outputting a plurality of training prediction vectors from the single decoding network at a plurality of time steps, respectively, each training prediction vector containing, for each word in the target-language vocabulary, the probability that the word is the word of the target-language output text at one time step, and each time step corresponding to one word of the target-language output text; determining a plurality of correct-answer vectors based on the words of the target-language correct-answer text that should be output at the plurality of time steps for the translation direction, each correct-answer vector containing, for each word in the target-language vocabulary, the probability that the word is the word of the target-language output text at one time step, with the probability for the word that should be output at that time step being the largest; and adjusting the parameters of the nodes in the decoding network based at least on a first loss function that indicates the difference between the training prediction vectors and the corresponding correct-answer vectors.
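As a minimal sketch of how direction-dependent correct-answer vectors and a first loss function might look, the example below uses one-hot correct-answer vectors and an averaged cross-entropy. Both choices are assumptions: the text only requires that the correct word have the largest probability and that the loss indicate the difference between prediction and correct-answer vectors. The function names and the `"l2r"`/`"r2l"` labels are hypothetical.

```python
import math
from typing import List


def correct_vectors(reference: List[str], vocab: List[str], direction: str) -> List[List[float]]:
    """One one-hot vector per time step, ordered according to the translation direction."""
    ordered = reference if direction == "l2r" else list(reversed(reference))
    return [[1.0 if word == target else 0.0 for word in vocab] for target in ordered]


def first_loss(predictions: List[List[float]], targets: List[List[float]]) -> float:
    """Cross-entropy between prediction and correct-answer vectors, averaged over time steps."""
    total = 0.0
    for pred, target in zip(predictions, targets):
        total -= sum(t * math.log(max(p, 1e-12)) for p, t in zip(pred, target))
    return total / len(targets)


vocab = ["I", "love", "you"]
reference = ["I", "love", "you"]
targets_r2l = correct_vectors(reference, vocab, "r2l")

# Predictions that emit "you", "love", "I" (right to left) with probability 0.8 each.
good = [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]
# The same predictions in left-to-right order, which mismatches the requested direction.
bad = list(reversed(good))

loss_good = first_loss(good, targets_r2l)
loss_bad = first_loss(bad, targets_r2l)
```

Because the correct-answer vectors are built from the reference text in the order implied by the direction, the same network can be penalized for emitting a correct sentence in the wrong order, which is what `loss_bad > loss_good` shows.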

In the method according to an embodiment of the present disclosure, the single decoding network includes a hidden layer, and the decoding network is further trained by: inputting into a classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the translation direction, and outputting from the classification network a classification result that indicates the output order of the words in the target-language training output text output from the single decoding network; and further adjusting the parameters of the nodes in the decoding network based on a comparison between that output order and the translation direction.

In the method according to an embodiment of the present disclosure, the classification network is trained by adjusting the parameters of the nodes in the classification network based on a second loss function that indicates the difference between the classification result output from the classification network and the translation direction.
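The classification network and the second loss function might be sketched as follows. This is an illustration under stated assumptions, not the disclosed design: the classifier is a single logistic unit over the mean of the per-time-step hidden vectors, the classification result is the probability that the output order is left to right, and the second loss is binary cross-entropy. All names, weights, and hidden-state values are made up.

```python
import math
from typing import List


def classify_direction(hidden: List[List[float]], weights: List[float], bias: float) -> float:
    """Return the estimated probability that the output order is left to right."""
    dim = len(weights)
    # Mean-pool the intermediate vectors over all time steps (an assumed design choice).
    pooled = [sum(h[d] for h in hidden) / len(hidden) for d in range(dim)]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def second_loss(p_l2r: float, direction_is_l2r: bool) -> float:
    """Binary cross-entropy between the classification result and the requested direction."""
    p = p_l2r if direction_is_l2r else 1.0 - p_l2r
    return -math.log(max(p, 1e-12))


# Hypothetical intermediate vectors from the decoder's hidden layer, one per time step.
hidden_states = [[0.5, -0.2], [0.9, 0.1], [0.4, 0.1]]
p = classify_direction(hidden_states, weights=[2.0, 0.0], bias=0.0)

loss_match = second_loss(p, True)      # requested direction was left to right
loss_mismatch = second_loss(p, False)  # requested direction was right to left
```

The same classification result could also feed the additional adjustment of the decoding network described above, since a mismatch between the classified order and the requested direction yields a larger loss.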

In the method according to an embodiment of the present disclosure, the step of inputting the plurality of encoded vectors and the information indicating the translation direction into the single decoding network further includes: a step of inputting the plurality of encoded vectors and information indicating a first translation direction into the single decoding network; and a step of inputting the plurality of encoded vectors and information indicating a second translation direction into the single decoding network. The step of outputting, from the single decoding network, the target-language output text corresponding to the source-language input text further includes: a step of outputting a first output text in the target language from the single decoding network, the output order of the target-language words in the first output text matching the first translation direction; and a step of outputting a second output text in the target language from the single decoding network, the output order of the target-language words in the second output text matching the second translation direction.

In the method according to an embodiment of the present disclosure, the decoding network includes a hidden layer, and the method further includes: a step of inputting into a classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the first translation direction, each time step corresponding to one word of the first output text in the target language, and outputting from the classification network a first classification result that indicates the output order of the words in the first output text output from the single decoding network; a step of inputting into the classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the second translation direction, each time step corresponding to one word of the second output text in the target language, and outputting from the classification network a second classification result that indicates the output order of the words in the second output text output from the single decoding network; a step of inputting a value based on the occurrence probabilities of the words in the first output text, together with the first classification result, into a scoring network and outputting a first score from the scoring network; and a step of inputting a value based on the occurrence probabilities of the words in the second output text, together with the second classification result, into the scoring network and outputting a second score from the scoring network. The step of outputting, from the single decoding network, the target-language output text corresponding to the source-language input text further includes a step of selecting, as the target-language output text, the output text corresponding to the larger of the first score and the second score.
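The selection step can be sketched as below. The scoring function here is a hypothetical linear stand-in for the scoring network: it adds the average log-probability of the emitted words (one possible "value based on the occurrence probabilities") to the classifier's confidence that the output order matches the requested direction. The class `Candidate`, the weights, and the numbers are assumptions for illustration only.

```python
import math
from typing import List


class Candidate:
    """One decoded output text with the quantities the scorer consumes."""

    def __init__(self, words: List[str], word_probs: List[float], direction_confidence: float) -> None:
        self.words = words
        self.word_probs = word_probs                      # probability of each emitted word
        self.direction_confidence = direction_confidence  # classification result in [0, 1]


def score(candidate: Candidate, w_prob: float = 1.0, w_dir: float = 1.0) -> float:
    """Hypothetical scorer: weighted sum of average log-probability and direction confidence."""
    avg_log_prob = sum(math.log(p) for p in candidate.word_probs) / len(candidate.word_probs)
    return w_prob * avg_log_prob + w_dir * candidate.direction_confidence


def select(first: Candidate, second: Candidate) -> Candidate:
    """Pick the candidate with the larger score; ties go to the first candidate."""
    return first if score(first) >= score(second) else second


first = Candidate(["I", "love", "you"], [0.9, 0.8, 0.9], 0.95)   # first translation direction
second = Candidate(["you", "love", "I"], [0.6, 0.7, 0.5], 0.80)  # second translation direction
chosen = select(first, second)
```

A trained scoring network would replace `score`; the selection logic itself stays a simple comparison of the two scores.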

In the method according to an embodiment of the present disclosure, the scoring network is trained by: processing a training input text in the source language to generate a plurality of training vectors corresponding to the respective source-language words in the training input text; encoding the plurality of training vectors to generate a plurality of encoded training vectors; inputting the plurality of encoded training vectors and information indicating one translation direction into the single decoding network; inputting into the classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the one translation direction, and outputting from the classification network a classification result that indicates the output order of the words in the target-language training output text output from the single decoding network; inputting a value based on the occurrence probabilities of the words in the target-language training output text, together with the classification result, into the scoring network and outputting a training score from the scoring network; calculating the similarity between the target-language training output text output from the decoding network and the target-language correct-answer text; and adjusting the parameters of the nodes in the scoring network based on a third loss function that indicates the difference between the training score and the similarity.
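A minimal sketch of the third loss function follows. The passage does not say how the similarity is computed or what form the loss takes, so both are assumptions here: similarity is clipped unigram precision against the correct-answer text, and the loss is the squared difference between the training score and that similarity. The function names and example sentences are hypothetical.

```python
from collections import Counter
from typing import List


def similarity(output: List[str], reference: List[str]) -> float:
    """Clipped unigram precision in [0, 1]; a simple stand-in similarity measure."""
    if not output:
        return 0.0
    # Counter intersection keeps, per word, the smaller of the two counts.
    overlap = Counter(output) & Counter(reference)
    return sum(overlap.values()) / len(output)


def third_loss(training_score: float, sim: float) -> float:
    """Squared difference between the scoring network's output and the similarity."""
    return (training_score - sim) ** 2


training_output = ["I", "like", "you"]  # hypothetical decoder output
correct_answer = ["I", "love", "you"]   # hypothetical correct-answer text

sim = similarity(training_output, correct_answer)
loss = third_loss(0.5, sim)             # 0.5 is a made-up training score
```

Training against this loss pushes the scoring network to predict translation quality, so that at inference time it can rank candidates without access to a correct-answer text.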

In the method according to an embodiment of the present disclosure, the target-language correct-answer text further includes a target-language output text that is output from another decoding network when the same source-language input text is input, the network scale of the other decoding network being larger than that of the decoding network.

In the method according to an embodiment of the present disclosure, the target-language output text output from the decoding network is used, as the target-language correct-answer text corresponding to the source-language input text, in the training of another decoding network, the network scale of the other decoding network being smaller than that of the decoding network.

According to another aspect of the present disclosure, there is provided a machine translation apparatus including: a pre-processing unit for processing an input text in a source language to generate a plurality of vectors corresponding to the respective source-language words in the input text; an encoding unit for encoding the plurality of vectors to generate a plurality of encoded vectors; and a decoding unit for inputting the plurality of encoded vectors and information indicating a translation direction into a single decoding network and outputting, from the single decoding network, an output text in a target language corresponding to the source-language input text, the output order of the target-language words in the output text matching the translation direction, wherein, when the translation direction indicated by the information input into the single decoding network is changed, the parameters of the nodes in the single decoding network are not changed.

The apparatus according to an embodiment of the present disclosure further includes a training unit for training the decoding network by executing: a process of processing a training input text in the source language to generate a plurality of training vectors corresponding to the respective source-language words in the training input text; a process of encoding the plurality of training vectors to generate a plurality of encoded training vectors; a process of inputting the plurality of encoded training vectors and information indicating a translation direction into the single decoding network; a process of outputting a plurality of training prediction vectors from the single decoding network at a plurality of time steps, respectively, each training prediction vector containing, for each word in the target-language vocabulary, the probability that the word is the word of the target-language output text at one time step, and each time step corresponding to one word of the target-language output text; a process of determining a plurality of correct-answer vectors based on the words of the target-language correct-answer text that should be output at the plurality of time steps for the translation direction, each correct-answer vector containing, for each word in the target-language vocabulary, the probability that the word is the word of the target-language output text at one time step, with the probability for the word that should be output at that time step being the largest; and a process of adjusting the parameters of the nodes in the decoding network based at least on a first loss function that indicates the difference between the training prediction vectors and the corresponding correct-answer vectors.

In the apparatus according to an embodiment of the present disclosure, the single decoding network includes a hidden layer, and the training unit is further configured to train the decoding network by executing: a process of inputting into a classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the translation direction, and outputting from the classification network a classification result that indicates the output order of the words in the target-language training output text output from the single decoding network; and a process of further adjusting the parameters of the nodes in the decoding network based on a comparison between that output order and the translation direction.

In the apparatus according to an embodiment of the present disclosure, the training unit is further configured to train the classification network by adjusting the parameters of the nodes in the classification network based on a second loss function that indicates the difference between the classification result output from the classification network and the translation direction.

In the apparatus according to an embodiment of the present disclosure, the decoding unit is further configured to: input the plurality of encoded vectors and information indicating a first translation direction into the single decoding network and output a first output text in the target language from the single decoding network, the output order of the target-language words in the first output text matching the first translation direction; and input the plurality of encoded vectors and information indicating a second translation direction into the single decoding network and output a second output text in the target language from the single decoding network, the output order of the target-language words in the second output text matching the second translation direction.

In the apparatus according to an embodiment of the present disclosure, the decoding network includes a hidden layer, and the apparatus further includes a selection unit for executing: a process of inputting into a classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the first translation direction, each time step corresponding to one word of the first output text in the target language, and outputting from the classification network a first classification result that indicates the output order of the words in the first output text output from the single decoding network; a process of inputting into the classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the second translation direction, each time step corresponding to one word of the second output text in the target language, and outputting from the classification network a second classification result that indicates the output order of the words in the second output text output from the single decoding network; a process of inputting a value based on the occurrence probabilities of the words in the first output text, together with the first classification result, into a scoring network and outputting a first score from the scoring network; a process of inputting a value based on the occurrence probabilities of the words in the second output text, together with the second classification result, into the scoring network and outputting a second score from the scoring network; and a process of selecting, as the target-language output text, the output text corresponding to the larger of the first score and the second score.

The apparatus according to an embodiment of the present disclosure further includes a training unit for training the scoring network by: processing a training input text in the source language to generate a plurality of training vectors corresponding to the respective source-language words in the training input text; encoding the plurality of training vectors to generate a plurality of encoded training vectors; inputting the plurality of encoded training vectors and information indicating one translation direction into the single decoding network; inputting into the classification network the intermediate vectors output from the hidden layer of the decoding network at each time step for the one translation direction, and outputting from the classification network a classification result that indicates the output order of the words in the target-language training output text output from the single decoding network; inputting a value based on the occurrence probabilities of the words in the target-language training output text, together with the classification result, into the scoring network and outputting a training score from the scoring network; calculating the similarity between the target-language training output text output from the decoding network and the target-language correct-answer text; and adjusting the parameters of the nodes in the scoring network based on a third loss function that indicates the difference between the training score and the similarity.

In the apparatus according to an embodiment of the present disclosure, the target-language correct-answer text further includes a target-language output text that is output from another decoding network when the same source-language input text is input, the network scale of the other decoding network being larger than that of the decoding network.

In the apparatus according to an embodiment of the present disclosure, the target-language output text output from the decoding network is used, as the target-language correct-answer text corresponding to the source-language input text, in the training of another decoding network, the network scale of the other decoding network being smaller than that of the decoding network.

With the machine translation method and machine translation apparatus according to the embodiments of the present disclosure, the same decoding network can produce target-language text output in the first translation direction and also in the second translation direction. Compared with the conventional approach of building and training a separate decoding network for each fixed translation direction, this can significantly reduce time costs and software and hardware costs.

FIG. 1 is a flowchart showing the process of a machine translation method according to one embodiment of the present disclosure.
FIG. 2 shows an example of an encoding network.
FIG. 3 is a schematic diagram of the inputs and outputs of the decoding network in time order.
FIG. 4 is a schematic diagram of the process by which the decoding network determines the word to be output at one time step.
FIG. 5 is a flowchart showing a specific training process for the decoding network.
FIG. 6 is a flowchart showing the process of a machine translation method according to another embodiment of the present disclosure.
FIG. 7 is a flowchart showing a specific training process for the scoring network.
FIG. 8 is a functional block diagram showing the configuration of a machine translation apparatus according to an embodiment of the present disclosure.
FIG. 9 is a schematic diagram of the architecture of an exemplary computing device according to an embodiment of the present disclosure.

Preferred embodiments of the present invention are described below with reference to the drawings. The following description is provided to aid understanding of the exemplary embodiments of the invention as defined by the claims and their equivalents. It includes various specific details to aid understanding, but these are merely exemplary. Accordingly, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. For clarity and conciseness, detailed descriptions of functions and structures well known in the art are omitted.

First, a machine translation method according to an embodiment of the present disclosure is described with reference to FIG. 1. As shown in FIG. 1, the machine translation method includes the following steps.

First, in step S101, the input text in the source language is processed to generate a plurality of vectors, one for each source language word in the input text. Conventional machine learning methods often cannot process text data directly, so the source language input text must first be converted into numeric data before it is input to the subsequent networks. For example, the source language input text may be a single sentence. Word segmentation is performed on the sentence to divide it into a plurality of words, and each word is then converted into a word vector of a specific dimension. This conversion can be realized by, for example, word embedding.
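The preprocessing of step S101 can be sketched as follows. This is a minimal illustration only: the whitespace segmentation, the toy vocabulary, the `<unk>` fallback token, and the function names `build_embeddings` and `preprocess` are assumptions made for the example and do not come from the disclosure, and the pseudo-random vectors merely stand in for learned word embeddings.

```python
import random

def build_embeddings(vocab, dim, seed=0):
    """Give every vocabulary word a fixed pseudo-random vector.

    Stand-in for a learned word-embedding table (assumption for illustration).
    """
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocab}

def preprocess(sentence, embeddings, unk="<unk>"):
    """Segment a sentence into words and map each word to its vector."""
    words = sentence.lower().split()  # whitespace segmentation is an assumption
    return [embeddings.get(w, embeddings[unk]) for w in words]

# Hypothetical toy vocabulary; "green" is deliberately missing.
vocab = ["<unk>", "i", "like", "tea"]
embeddings = build_embeddings(vocab, dim=8)
vectors = preprocess("I like green tea", embeddings)  # one vector per word
```

Languages without whitespace between words would need a real word segmenter in place of `split()`.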

Then, in step S102, the plurality of vectors are encoded to generate a plurality of encoded vectors. The encoding can be performed by, for example, an encoding network. FIG. 2 shows an example of an encoding network. In FIG. 2, x_1, x_2, x_3, ..., x_m denote the vectors obtained by vectorizing the source language input text in step S101, and h_1, h_2, h_3, ..., h_m denote the encoded vectors generated by the encoding network. Naturally, the scale of the encoding network (for example, the number of nodes and the number of network layers) is not limited to the example of FIG. 2.

Next, in step S103, the plurality of encoded vectors and information indicating the translation direction are input to a single decoding network. The translation direction is specified by the user. For example, if the user wants the decoding network to output the translated text in left-to-right order, information indicating the left-to-right translation direction is input to the decoding network. If the user wants the translated text to be output in right-to-left order, information indicating the right-to-left translation direction is input instead. The information indicating the translation direction may be, for example, a multidimensional feature vector (for example, 128 dimensions). The value of each element of this vector is determined randomly and is fixed for a given translation direction.
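One way to picture the direction information of step S103 is a pair of random vectors that are drawn once and then reused unchanged. The sketch below assumes a Gaussian draw, a fixed seed, and the names `make_direction_vectors` and `decoder_inputs`; none of these are specified in the text, which only states that the element values are random and fixed per direction.

```python
import random

DIRECTION_DIM = 128  # example dimensionality mentioned in the text

def make_direction_vectors(dim=DIRECTION_DIM, seed=42):
    """Draw one random vector per translation direction, once."""
    rng = random.Random(seed)  # seed is an assumption, used for reproducibility
    return {tag: [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for tag in ("<L2R>", "<R2L>")}

# Created a single time; the same vector is reused whenever a direction is requested.
DIRECTION_VECTORS = make_direction_vectors()

def decoder_inputs(encoded_vectors, direction):
    """Bundle the encoded vectors with the fixed vector for the chosen direction."""
    return {"encoded": encoded_vectors, "direction": DIRECTION_VECTORS[direction]}

first = decoder_inputs([[0.1, 0.2]], "<L2R>")
second = decoder_inputs([[0.3, 0.4]], "<L2R>")
```

Because the table is built once, two requests for the same direction receive the identical vector, while the two directions receive different vectors.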

Finally, in step S104, the output text in the target language corresponding to the source language input text is output from the single decoding network, and the output order of the target language words in the output text matches the translation direction.

Note that when the translation direction indicated by the information input to the single decoding network is changed, the parameters of the nodes in the single decoding network are not changed.

For example, the parameters of the nodes in the single decoding network when the input information indicates the first translation direction are identical to the parameters of the nodes when the input information indicates the second translation direction. In other words, the same decoding network can output target language text in the first translation direction and can also output target language text in the second translation direction. Compared with the conventional approach of separately constructing and training a decoding network for each translation direction, this significantly reduces time cost and software and hardware cost.

The decoding network outputs the words of the translation (that is, the output text in the target language) one after another at a predetermined processing time interval. If the translation direction indicated by the information input to the decoding network in step S103 is left to right, the decoding network outputs the first target language word from the left at the first time step. At the second time step, which follows the first time step after the predetermined processing time interval, the decoding network outputs the second target language word from the left. This continues until the decoding network outputs the last target language word. Similarly, if the translation direction indicated in step S103 is right to left, the decoding network outputs the first target language word from the right at the first time step and the second target language word from the right at the second time step, and continues in the same way until it outputs the last target language word.

The output of the current time step is fed back as an additional input to the lowest layer of the decoding network when decoding at the next time step. In other words, in addition to taking the encoded vectors from the encoding network and the information indicating the translation direction as inputs, the decoding network can generate its current output based on its preceding outputs.

FIG. 3 is a schematic diagram of the inputs and outputs of the decoding network in time order. Each block in FIG. 3 represents the same decoding network at a different time step. At the first time step t_1, the information <L2R> indicating the translation direction is input to the decoding network in addition to the encoded vectors h_1, h_2, h_3, ..., h_m. Here, <L2R> denotes the left-to-right translation direction. At time step t_1 the decoding network outputs the target language word y_1 corresponding to the translation direction, where y_1 is the first word from the left of the translation. At the second time step t_2, the output y_1 of the first time step t_1 is input to the decoding network in addition to the encoded vectors h_1, h_2, h_3, ..., h_m. At time step t_2 the decoding network outputs the target language word y_2 corresponding to the translation direction, where y_2 is the second word from the left of the translation. The decoding network proceeds in the same way at each subsequent time step until it outputs the last target language word corresponding to the translation direction (that is, the last target language word counted from the left).

Although not shown in FIG. 3, the right-to-left case works in the same way. If the information <R2L>, which denotes the right-to-left translation direction, is input to the decoding network at the first time step t_1, the decoding network outputs at time step t_1 the target language word y_1 corresponding to the translation direction, where y_1 is the first word from the right of the translation. At the second time step t_2, the output y_1 of the first time step t_1 is input to the decoding network in addition to the encoded vectors h_1, h_2, h_3, ..., h_m. At time step t_2 the decoding network outputs the target language word y_2 corresponding to the translation direction, where y_2 is the second word from the right of the translation. The decoding network proceeds in the same way at each subsequent time step until it outputs the last target language word corresponding to the translation direction (that is, the last target language word counted from the right).
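The time-step loop described above can be sketched as a simple feedback loop. Everything below is illustrative: `toy_step` is a hypothetical stand-in for the trained decoding network (it just reads the answer from its input), the end-of-sequence marker `</s>` and the helper `to_reading_order` are assumptions, and no real model is involved. The point is only that one step function serves both directions and that previous outputs are fed back at each step.

```python
EOS = "</s>"  # assumed end-of-sequence marker; not specified in the text

def greedy_decode(step, encoded, direction, max_steps=50):
    """Call the decoder once per time step, feeding back the earlier outputs."""
    outputs = []
    for _ in range(max_steps):
        word = step(encoded, direction, outputs)  # same network for every direction
        if word == EOS:
            break
        outputs.append(word)
    return outputs

def to_reading_order(outputs, direction):
    """Right-to-left output starts with the rightmost word, so reverse it for reading."""
    return outputs if direction == "<L2R>" else outputs[::-1]

def toy_step(encoded, direction, previous_outputs):
    """Hypothetical decoder: 'encoded' already holds the target words in reading order."""
    order = encoded if direction == "<L2R>" else encoded[::-1]
    k = len(previous_outputs)
    return order[k] if k < len(order) else EOS

l2r = greedy_decode(toy_step, ["I", "like", "tea"], "<L2R>")
r2l = greedy_decode(toy_step, ["I", "like", "tea"], "<R2L>")
```

With the left-to-right tag the words come out starting from the leftmost word, and with the right-to-left tag they come out starting from the rightmost word, which matches the y_1, y_2 ordering described for each direction.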

Next, how the decoding network determines the target language word to be output is described in detail. Specifically, the single decoding network outputs a plurality of prediction vectors, one at each of a plurality of time steps. Each prediction vector contains, for every word in the target language vocabulary, the probability that the word is the word of the target language output text at that time step, and each time step corresponds to one word of the target language output text.

FIG. 4 is a schematic diagram of the process by which the decoding network determines the word to be output in one time step. The first time step is used as an example. The target language vocabulary may be set, for example, as a vocabulary containing 10000 words. Given the source language input text, the word corresponding to the input text is selected from the words in the target language vocabulary and output. In this case, the decoding network outputs a prediction vector of dimension 10000 at the first time step.

In FIG. 4, reference numeral 401 denotes the intermediate vector output from the hidden layer of the decoding network, for example a floating-point vector. The intermediate vector is supplied to a fully connected layer, which projects it onto a higher-dimensional vector. If the target language vocabulary contains 10000 words (that is, vocab_size = 10000), this higher-dimensional vector has 10000 dimensions. Finally, a Softmax layer normalizes each element of the higher-dimensional vector to a probability value between 0 and 1. The normalized vector is the prediction vector described above and is denoted by 402 in FIG. 4. In FIG. 4, the largest of all the probabilities is indicated by hatching, which shows that the probability corresponding to the fifth element is the largest. Assuming that the target language vocabulary is an English vocabulary and that the fifth element corresponds to the word "I", the decoding network outputs the word "I" at the first time step.
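The projection and normalization of FIG. 4 can be illustrated with a small pure-Python sketch. The six-word vocabulary, the two-dimensional intermediate vector, and the weight values are invented for the example (the text uses a vocabulary of 10000 words); only the structure, a fully connected layer followed by Softmax and selection of the largest probability, follows the description.

```python
import math

def fully_connected(hidden, weights, bias):
    """Project the intermediate vector to one score per vocabulary word."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]

def softmax(scores):
    """Normalize scores into probabilities between 0 and 1 that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_word(hidden, weights, bias, vocab):
    """Return the most probable word and the full prediction vector."""
    probs = softmax(fully_connected(hidden, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best], probs

# Toy values (assumptions): the fifth vocabulary entry is "I".
vocab = ["the", "a", "you", "we", "I", "tea"]
hidden = [1.0, 0.5]
weights = [[0.1, 0.0], [0.0, 0.1], [0.2, 0.2], [0.3, 0.1], [2.0, 1.0], [0.1, 0.3]]
bias = [0.0] * len(vocab)
word, probs = pick_word(hidden, weights, bias, vocab)
```

With these toy weights the fifth score is the largest, so the selected word is "I", mirroring the example in the text.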

The probability of the word output by the decoding network at each time step can be expressed by the following equation (1).

[Equation (1) is not reproduced in this text.]

In contrast, the probabilities of the words output at each time step by the prior-art decoding networks corresponding to the translation directions L2R and R2L are expressed by the following equations (2) and (3), respectively.

[Equations (2) and (3) are not reproduced in this text.]

As can be seen from equations (1) to (3), in the method according to the present disclosure the parameters of the decoding network are the same θ for the different translation directions L2R and R2L. In the prior-art method, by contrast, the parameters of the decoding network differ between translation directions, namely θ^{L2R} and θ^{R2L}. In other words, the prior art uses a separate decoding network for each translation direction.
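Since equations (1) to (3) themselves do not survive in this text, the following is only a hedged sketch of a form consistent with the surrounding description, not the equations of the disclosure. The notation is assumed: x is the source language input text, y_t is the word output at time step t, y_{<t} are the words output at earlier time steps, and d is the translation direction supplied as input.

```latex
% Assumed form: one parameter set \theta shared by both directions,
% with the direction d given as an input (cf. equation (1)).
P\left(y_t \mid y_{<t},\, x,\, d;\ \theta\right), \qquad d \in \{\mathrm{L2R}, \mathrm{R2L}\}

% Assumed form: prior art, one parameter set per direction (cf. equations (2) and (3)).
P\left(y_t \mid y_{<t},\, x;\ \theta^{\mathrm{L2R}}\right)
\qquad
P\left(y_t \mid y_{<t},\, x;\ \theta^{\mathrm{R2L}}\right)
```

In this sketch y_t is indexed by time step, so in the right-to-left case y_1 is the rightmost word of the translation, as in the description of the time steps above.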

The first time step was used as an example in the description of FIG. 4, but the processing at every other time step is similar. The difference is that each time step generates its own prediction vector, and the prediction vectors output at different time steps generally differ.

The specific process of the machine translation method according to the present disclosure, and a single decoding network that shares the same node parameters across different translation directions, have been described above. The machine translation method described above is a process of performing translation with a trained decoding network. Next, a specific training process of the decoding network is described with reference to FIG. 5.

First, in step S501, the training input text in the source language is processed to generate a plurality of training vectors, one for each source language word in the training input text.

Then, in step S502, the plurality of training vectors are encoded to generate a plurality of encoded training vectors.

Next, in step S503, the plurality of encoded training vectors and information indicating the translation direction are input to the single decoding network.

Then, in step S504, the single decoding network outputs a plurality of training prediction vectors, one at each of a plurality of time steps. Each training prediction vector contains, for every word in the target language vocabulary, the probability that the word is the word of the target language output text at that time step, and each time step corresponds to one word of the target language output text.

Steps S501 to S504 are the same as the steps and processing of the machine translation method described above. One difference is the input. After training of the decoding network is complete, arbitrary source language input text is input, and no correct target language text exists for it. For a decoding network that is still being trained, source language training text from a training sample library is input, and a correct target language text corresponding to that training text exists.

A further difference is that the prediction vector output from a trained decoding network is accurate. In other words, the word corresponding to the element with the highest probability in the prediction vector is the correct word to output. The training prediction vector output from a decoding network that is still being trained is not necessarily accurate. In other words, the word corresponding to the element with the highest probability in the training prediction vector is not necessarily the correct word to output.

Next, in step S505, a plurality of correct answer vectors are determined based on the words of the correct target language text that should be output at the respective time steps for the given translation direction. Each correct answer vector contains, for every word in the target language vocabulary, the probability that the word is the word of the target language output text at that time step, and the probability corresponding to the word that should be output at that time step is the largest.

The first time step is again used as an example. Assuming that the input translation direction is left to right, that the first word from the left of the correct text is "I", and that "I" corresponds to the fifth element of the vector, the correct answer vector at the first time step is a vector whose fifth element has the largest value. For example, the fifth element has the value 1 and all other elements have the value 0.
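The correct answer vectors of step S505 can be sketched as one-hot vectors built from the reference words in the order dictated by the translation direction. The toy vocabulary and the function names are assumptions for illustration; the one-hot values (1 for the word to output, 0 elsewhere) follow the example in the text.

```python
def correct_answer_vector(word, vocab):
    """One-hot vector: 1 at the position of the word to output, 0 elsewhere."""
    vec = [0.0] * len(vocab)
    vec[vocab.index(word)] = 1.0
    return vec

def correct_answer_vectors(reference_words, vocab, direction):
    """One correct answer vector per time step, ordered by the translation direction."""
    ordered = reference_words if direction == "<L2R>" else reference_words[::-1]
    return [correct_answer_vector(w, vocab) for w in ordered]

# Toy vocabulary (assumption) in which "I" is the fifth entry, as in the text's example.
vocab = ["the", "a", "like", "we", "I", "tea"]
reference = ["I", "like", "tea"]
l2r_targets = correct_answer_vectors(reference, vocab, "<L2R>")
r2l_targets = correct_answer_vectors(reference, vocab, "<R2L>")
```

For the left-to-right direction the first vector has its 1 at the fifth element ("I"), while for the right-to-left direction the first vector marks the rightmost reference word instead.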

Finally, in step S506, the parameters of the nodes in the decoding network are adjusted based on at least a first loss function that indicates the difference between each training prediction vector and the corresponding correct answer vector.

For example, in one possible embodiment, the first loss function may indicate the difference between the value of each element of the training prediction vector and the value of the corresponding element of the correct answer vector. In another possible embodiment, the first loss function may indicate the difference between the element with the largest probability value in the correct answer vector and the corresponding element of the training prediction vector. For instance, assuming that the fifth element of the correct answer vector has the largest value, for example 1, the first loss function may indicate the difference between the probability value of the fifth element of the training prediction vector and the probability value of the fifth element of the correct answer vector.
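The two variants of the first loss function can be illustrated as follows. The text only says that the loss "indicates the difference", so the concrete formulas here are assumptions: a sum of squared element-wise differences for the first variant and a plain difference on the correct element for the second. A cross-entropy loss would be another common choice but is not stated in the text.

```python
def elementwise_loss(pred, target):
    """Variant 1 (assumed form): sum of squared differences over all elements."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def correct_element_loss(pred, target):
    """Variant 2 (assumed form): difference on the element that is largest in the target."""
    k = max(range(len(target)), key=target.__getitem__)
    return target[k] - pred[k]

# Toy training prediction vector and correct answer vector (fifth element correct).
pred = [0.05, 0.05, 0.1, 0.1, 0.6, 0.1]
target = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
loss_all = elementwise_loss(pred, target)
loss_correct = correct_element_loss(pred, target)
```

Both quantities shrink toward 0 as the predicted probability of the correct word approaches 1, which is the behavior the parameter adjustment in step S506 relies on.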

As the above description shows, the information indicating the translation direction has the greatest influence on the target language word output at the first time step. At each subsequent time step, the information indicating the translation direction is still supplied to the decoding network as an input, but as time passes, larger weights are given to the more recent preceding outputs, so the weight of the direction information becomes smaller and smaller. In other words, the direction information has less and less influence on the target language words that are output. In an architecture that shares a single decoding network, the outputs of later time steps in particular are insensitive to the translation direction, and the information about the translation direction may be lost. Therefore, in another possible embodiment, a classification network may be added to the training of the single decoding network to assist in training it.

Specifically, as described above, the single decoding network includes a hidden layer, and after step S506 the decoding network is further trained by the following processing.

In step S507, the intermediate vectors output from the hidden layer of the decoding network at the respective time steps for the given translation direction are input to a classification network, and the classification network outputs a classification result indicating the output order of the words in the target language training output text output from the single decoding network.

The intermediate vector output from the hidden layer of the decoding network is the intermediate vector 401 described above with reference to FIG. 4, and the hidden layer is the topmost hidden layer immediately before the fully connected layer. The classification network can be realized by a convolutional neural network (CNN). For example, the classification network can be expressed by the discriminant function of the following equation (4).

[Equation (4) is not reproduced in this text.]

For example, the classification result output from the classification network may be a probability value indicating that the output order of the words in the target language training output text is a specific order. The probability value p_dir can be expressed by the following equation (5).

[Equation (5) is not reproduced in this text.]

Specifically, suppose the specific order is set to left to right. If the probability value output from the classification network is 1, the output order of the words in the target language training output text is left to right. If the probability value is 0, the output order is right to left. If the probability value lies between 0 and 1 and is close to 1, the output order is mostly left to right, with some words out of order. If the probability value lies between 0 and 1 and is close to 0, the output order is mostly right to left, with some words out of order.

Then, in step S508, the parameters of the nodes in the decoding network are further adjusted based on a comparison between the output order and the translation direction. Here, the translation direction is the direction indicated by the information input to the decoding network.

In other words, a decoding network that is insensitive to the translation direction is further corrected by means of the classification network. Three cases can be distinguished, each based on the source language training input text and the translation direction specified by the user. First, if the classification result indicates that the output order of the words in the target language training output text matches the input translation direction, for example if the user-specified direction is left to right and the classification network outputs a probability value of 1 that the output order is left to right, no further adjustment of the decoding network is considered necessary. Second, if the classification result indicates that the output order is the opposite of the input translation direction, for example if the user-specified direction is left to right and the classification network outputs a probability value of 0 that the output order is left to right, a large adjustment of the decoding network is considered necessary. Third, if the classification result indicates that the output order mostly matches the input translation direction, for example if the user-specified direction is left to right and the classification network outputs a probability value of 0.8 that the output order is left to right, fine adjustment of the decoding network is considered necessary.
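The three cases above can be summarized by a single mismatch value between the classifier output and the requested direction. The absolute-difference formula and the function name `direction_mismatch` are assumptions made for illustration; the text only states that the size of the adjustment depends on how well the output order matches the translation direction.

```python
def direction_mismatch(p_l2r, direction):
    """Gap between the classifier's left-to-right probability and the requested direction.

    0 means the output order matches the requested direction (no adjustment),
    1 means it is the opposite (large adjustment), small values call for fine adjustment.
    """
    target = 1.0 if direction == "<L2R>" else 0.0
    return abs(target - p_l2r)

no_adjust = direction_mismatch(1.0, "<L2R>")     # order matches the requested direction
large_adjust = direction_mismatch(0.0, "<L2R>")  # order is the opposite
fine_adjust = direction_mismatch(0.8, "<L2R>")   # order mostly matches
```

The same function covers a right-to-left request by comparing the probability against 0 instead of 1.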

Accordingly, in this embodiment, adding the classification network allows the translation direction information to be learned and exploited, so that the decoding network is gradually trained until the actual output order of its target language words matches the translation direction specified by the user.

Steps S507 and S508 are not required for training the single decoding network according to the present disclosure, and are therefore shown as dashed blocks in FIG. 5.

Like the decoding network, the classification network also needs to be trained. For example, the classification network can be trained by adjusting the parameters of each node in the classification network based on a second loss function indicating the difference between the classification result output from the classification network and the translation direction. Here, the translation direction is the direction indicated by the information input to the decoding network. In this case, the user-specified direction indicated by the information input to the decoding network is used as the approximate correct answer for the classification network. As described above, the classification result output from the classification network is a probability value indicating that the word output order is left to right. For example, if the user-specified direction indicated by the information input to the decoding network is left to right and the classification network outputs a probability value of 0.8 for the training data, the parameters of each node in the classification network are adjusted with 1 as the approximate correct answer. Alternatively, if the user-specified direction is right to left and the classification network outputs a probability value of 0.8 for the training data, the parameters of each node in the classification network are adjusted with 0 as the approximate correct answer.

Alternatively, in another possible embodiment, the classification network can be trained by adjusting the parameters of each node in the classification network based on the difference between the classification result and the actual word output order. In this embodiment, the user-specified direction indicated by the information input to the decoding network is not used as the approximate correct answer for the classification network. During training, the probability that the actual word output order follows a specific translation direction must be computed in advance and used as the correct probability value for the classification result. For example, suppose the user-specified direction is left to right, so that y_1, y_2, y_3, y_4 should be output in sequence at successive time steps, but the word order actually output by the decoding network is y_1, y_3, y_2, y_4. The probability that the output order is left to right is then 0.5. In this case, if the classification network outputs a probability value of 0.8, the parameters of each node in the classification network are adjusted with 0.5 as the correct answer.
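The text does not state how the value 0.5 is obtained. One reading that reproduces it, offered here only as an assumption, is the fraction of time steps at which the actually output word equals the word expected for the indicated direction. The helper name `order_match_probability` is hypothetical.

```python
def order_match_probability(expected, actual):
    """Fraction of time steps whose actual word equals the expected word.

    This is only one possible way to turn an actual output order into a
    target probability; the disclosure does not specify the calculation.
    """
    if not expected or len(expected) != len(actual):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for e, a in zip(expected, actual) if e == a)
    return matches / len(expected)


expected_order = ["y_1", "y_2", "y_3", "y_4"]  # order implied by a left-to-right instruction
actual_order = ["y_1", "y_3", "y_2", "y_4"]    # order actually produced by the decoder
target_probability = order_match_probability(expected_order, actual_order)
print(target_probability)
```

Here two of the four positions (y_1 and y_4) agree, so the target used for the classification network would be 0.5 rather than the user-indicated value of 1.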

The classification network may be trained separately, before the decoding network is trained. Alternatively, the classification network may be trained simultaneously with the decoding network. When they are trained simultaneously, the two networks constrain and adjust each other, and training ends only when both loss functions have converged.

The foregoing described an embodiment in which the translation direction specified by the user is input to the decoding network and the decoding network outputs target-language words in accordance with that direction. However, the present invention is not limited to this. FIG. 6 shows a flowchart of a machine translation method according to another embodiment of the present disclosure. In this other embodiment, the single decoding network may output two target-language output texts, one for each of two translation directions. Based on a predetermined rule, the preferable one of the two output texts is then selected and output as the final translation.

Specifically, as shown in FIG. 6, the machine translation method includes the following steps.

First, in step S601, the source-language input text is processed to generate a plurality of vectors corresponding to the respective source-language words in that input text.

Then, in step S602, the plurality of vectors are encoded to generate a plurality of encoded vectors.

Steps S601 and S602 above are the same as steps S101 and S102 in the figure.

Next, in step S603, the plurality of encoded vectors and information indicating a first translation direction are input to the single decoding network.

Then, in step S604, a first output text in the target language is output from the single decoding network, the output order of the target-language words in the first output text matching the first translation direction.

Next, in step S605, the plurality of encoded vectors and information indicating a second translation direction are input to the single decoding network.

Then, in step S606, a second output text in the target language is output from the single decoding network, the output order of the target-language words in the second output text matching the second translation direction.

For example, the first translation direction may be left to right, and the second translation direction may be right to left. That is, steps S603 to S606 yield translation results for the two translation directions, respectively.

Then, in step S607, the first output text and the second output text are each scored to obtain a first score and a second score.

For example, the output texts can be scored by constructing a scoring network.

Specifically, the intermediate vectors output from the hidden layer of the decoding network at each time step for the first translation direction are input to the classification network, where each time step corresponds to one word of the first output text in the target language. The classification network then outputs a first classification result indicating the output order of the words in the first target-language output text output from the single decoding network. Here, the classification network may be the classification network described above and, as noted, its classification result can be denoted p_dir. Next, a value based on the occurrence probabilities of the words in the first target-language output text, together with the first classification result, is input to the scoring network, and the scoring network outputs the first score. For example, the value based on the occurrence probabilities of the words in the first target-language output text may be the average of the probabilities of all the output target-language words. As described above, if the target-language words output at the respective time steps are denoted y_1, y_2, ..., y_n, the probability of the target-language word y_i output at the current time step is Sc(y_i), which can be calculated by the following equation (6).
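Equation (6) itself does not appear in this text. For reference only, the average of the word probabilities mentioned above can be written as follows, where the symbol \overline{Sc} is introduced here for illustration; whether this coincides with the published equation (6) is an assumption.

```latex
\overline{Sc} = \frac{1}{n} \sum_{i=1}^{n} Sc(y_i)
```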

The scoring network can be expressed by the following equation (7).

Similarly, the intermediate vectors output from the hidden layer of the decoding network at each time step for the second translation direction are input to the classification network, where each time step corresponds to one word of the second output text in the target language. The classification network then outputs a second classification result indicating the output order of the words in the second target-language output text output from the single decoding network. Next, a value based on the occurrence probabilities of the words in the second target-language output text, together with the second classification result, is input to the scoring network, and the scoring network outputs the second score.

Finally, in step S608, the output text corresponding to the larger of the first score and the second score is selected as the target-language output text.
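The flow of steps S603 to S608 can be sketched as follows. The decoder and the scoring network are passed in as callables; `toy_decode` and `toy_score` are hypothetical stand-ins with made-up numbers, not the networks of the disclosure, and the direction labels `"l2r"` and `"r2l"` are likewise illustrative.

```python
def translate_both_directions(encoded_vectors, decode, score):
    """Decode once per direction with the same decoder, score both, keep the best.

    decode(encoded_vectors, direction) -> (words, average_word_probability, p_dir)
    score(average_word_probability, p_dir) -> float
    """
    candidates = []
    for direction in ("l2r", "r2l"):                      # steps S603 to S606
        words, average_probability, p_dir = decode(encoded_vectors, direction)
        candidates.append((score(average_probability, p_dir), words))  # step S607
    best_score, best_words = max(candidates, key=lambda item: item[0])  # step S608
    return best_words, best_score


def toy_decode(encoded_vectors, direction):
    # Made-up outputs; words are shown in reading order for both directions.
    if direction == "l2r":
        return ["machine", "translation"], 0.62, 0.9
    return ["machine", "translation", "device"], 0.71, 0.1


def toy_score(average_probability, p_dir):
    # Placeholder for a trained scoring network.
    return average_probability * max(p_dir, 1.0 - p_dir)


best_words, best_score = translate_both_directions([[0.0]], toy_decode, toy_score)
print(best_words, best_score)
```

Because the same `decode` callable serves both directions, the sketch reflects the point that only the direction input changes between the two passes, not the decoder itself.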

Therefore, in this other embodiment, the more preferable of the output texts for the two translation directions can be selected dynamically and output as the final translation, without requiring any involvement of the user.

In this other embodiment, the decoding network is trained in exactly the same way as in the embodiment described above, and the foregoing description applies in full to the training of the decoding network here. To avoid redundancy, a detailed description is omitted.

Like the decoding network and the classification network described above, the scoring network also needs to be trained before use. A specific training process for the scoring network is described next with reference to FIG. 7. As shown in FIG. 7, the scoring network can be trained by the following steps.

First, in step S701, the source-language training input text is processed to generate a plurality of training vectors corresponding to the respective source-language words in that training input text.

Then, in step S702, the plurality of training vectors are encoded to generate a plurality of encoded training vectors.

Next, in step S703, the plurality of encoded training vectors and information indicating one translation direction are input to the single decoding network.

Then, in step S704, the intermediate vectors output from the hidden layer of the decoding network at each time step for that translation direction are input to the classification network, and the classification network outputs a classification result indicating the output order of the words in the target-language training output text output from the single decoding network.

Then, in step S705, a value based on the occurrence probabilities of the words in the target-language training output text, together with the classification result, is input to the scoring network, and the scoring network outputs a training score. At the start of training, the training score output from the scoring network is not accurate.

Next, in step S706, the similarity between the target-language training output text output from the decoding network and the target-language correct-answer text is calculated. For example, the similarity can be calculated using the BLEU (bilingual evaluation understudy) algorithm, an automatic evaluation method for machine translation. The closer a machine translation is to the result of human translation (that is, the target-language correct-answer text), the higher its translation quality and the higher the resulting BLEU score.

Finally, in step S707, the parameters of each node in the scoring network are adjusted based on a third loss function indicating the difference between the training score and the similarity. That is, the similarity (for example, the BLEU score), which is an accurate measure of the translation quality of the target-language training output text, serves as the calibration reference for the score output from the scoring network. After a predetermined number of training iterations, the scoring network can therefore output scores that accurately evaluate the translation quality of the translations output from the decoding network.
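Steps S705 to S707 can be sketched with a deliberately small stand-in for the scoring network. The sketch assumes a linear scorer over the two inputs and a squared-error form for the third loss function; neither the linear form nor the function names (`score`, `third_loss`, `train_scorer`) come from the disclosure, and the sample values are made up, with the similarity given as a BLEU-like value scaled to the range 0 to 1.

```python
def score(params, average_probability, p_dir):
    """Linear stand-in for the scoring network: two weights and a bias."""
    weight_prob, weight_dir, bias = params
    return weight_prob * average_probability + weight_dir * p_dir + bias


def third_loss(params, samples):
    """Mean squared difference between training scores and similarities."""
    return sum((score(params, a, d) - s) ** 2 for a, d, s in samples) / len(samples)


def train_scorer(samples, learning_rate=0.1, epochs=500):
    """Stochastic gradient descent on the squared error, one sample at a time."""
    params = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for average_probability, p_dir, similarity in samples:
            error = score(params, average_probability, p_dir) - similarity
            params[0] -= learning_rate * error * average_probability
            params[1] -= learning_rate * error * p_dir
            params[2] -= learning_rate * error
    return params


# (average word probability, classification result p_dir, similarity to the reference)
samples = [(0.9, 0.95, 0.42), (0.6, 0.70, 0.25), (0.3, 0.50, 0.10)]
loss_before = third_loss([0.0, 0.0, 0.0], samples)
trained_params = train_scorer(samples)
loss_after = third_loss(trained_params, samples)
print(loss_before, loss_after)
```

In practice the similarity column would come from an actual BLEU computation against the correct-answer text; fixed numbers are used here only to keep the sketch self-contained.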

The foregoing described machine translation methods according to two embodiments of the present disclosure. In one embodiment, the decoding network outputs the corresponding target-language output text (translation) based on the input translation direction; that is, the translation finally output by the decoding network is tied to the translation direction initially indicated by the user. In the other embodiment, two different translation directions are each input to the decoding network, and the translation with the better translation quality is selected from the two resulting translations and output as the final translation. In other words, in this other embodiment the user is considered not to need to specify a translation direction for the decoding network.

In training the decoding network, a human translation of the source-language training input text can be used as the target-language correct-answer text. In addition, the target-language correct-answer text may further include the target-language output text that another decoding network outputs when given the same source-language input text, where the network scale of the other decoding network is larger than that of the decoding network. Here, network scale refers to network parameters such as the number of nodes and the number of network layers. Naturally, the other decoding network is an already trained network.

Here, T denotes the number of training samples.

In some cases, correct-answer human translations are hard to obtain, whereas target-language output text from another decoding network is easier to obtain. Adding the output of a larger decoding network as correct answers therefore transfers knowledge from the larger decoding network to the smaller one and diversifies the knowledge available for learning. For example, the larger decoding network may be one that runs on large equipment, while the smaller decoding network may be one that runs on a small portable device. Transferring the knowledge learned by the larger decoding network directly to the smaller one effectively shortens the time needed to finish training the smaller decoding network and avoids situations in which its loss function fails to converge for a long time.

As described above, the decoding network according to the present disclosure can, as a small-scale decoding network, be trained using the target-language output text of another, larger-scale decoding network as correct answers. Conversely, the target-language output text that the decoding network according to the present disclosure outputs once its training is complete may likewise be used as correct-answer text in training a smaller-scale decoding network.

Specifically, the target-language output text output from the decoding network is used, as the target-language correct-answer text corresponding to the source-language input text, in the training of another decoding network whose network scale is smaller than that of the decoding network.
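The use of a larger network's output as additional correct-answer text can be sketched as a data-preparation step. The function `build_references` and the callable `teacher_translate` are hypothetical names; the larger trained network is represented here by a trivial placeholder function.

```python
def build_references(source_sentences, human_references, teacher_translate):
    """Collect correct-answer texts for training a smaller decoding network.

    human_references: dict mapping a source sentence to its human translation
        (it may lack entries, since human translations can be hard to obtain).
    teacher_translate: callable standing in for the larger, already trained network.
    """
    training_data = []
    for source in source_sentences:
        references = []
        if source in human_references:
            references.append(human_references[source])
        references.append(teacher_translate(source))  # output of the larger network
        training_data.append((source, references))
    return training_data


def toy_teacher(source):
    # Placeholder for the larger network; it only upper-cases the input.
    return source.upper()


training_data = build_references(
    ["kikai honyaku", "fukugoka"],
    {"kikai honyaku": "machine translation"},
    toy_teacher,
)
print(training_data)
```

The second sentence has no human translation, so its only correct-answer text is the placeholder teacher output, which is the situation the passage describes.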

As experiments demonstrate, when the networks are trained on the same training data, the machine translation method according to each embodiment of the present disclosure improves the BLEU score by about 1 to 2.5 points over prior-art approaches while maintaining the same decoding speed.

The specific processes of the machine translation method according to each embodiment of the present disclosure have been described in detail above with reference to the drawings. Next, a machine translation apparatus according to an embodiment of the present disclosure is described with reference to FIG. 8.

As shown in FIG. 8, the machine translation apparatus 800 includes a preprocessing unit 801, an encoding unit 802, and a decoding unit 803.

The preprocessing unit 801 is used to process the source-language input text so as to generate a plurality of vectors corresponding to the respective source-language words in that input text. Conventional machine learning methods often cannot process text data directly, so the source-language input text must first be converted into numerical data before being input to the subsequent networks. For example, the source-language input text may be a single sentence. Word segmentation is performed on the sentence to split it into a plurality of words, and each word is then converted into a word vector of a specific dimension. For example, this conversion can be realized by word embedding.
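As a minimal sketch of this preprocessing, the following splits a sentence into words and looks up a vector for each word. Whitespace splitting stands in for real word segmentation, and the randomly initialised table stands in for trained word embeddings; `build_embeddings`, `preprocess`, and the `<unk>` entry are hypothetical names chosen for illustration.

```python
import random


def build_embeddings(vocabulary, dim, seed=0):
    """Create a toy embedding table with one random vector per vocabulary word."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for word in vocabulary}


def preprocess(sentence, embeddings, unknown="<unk>"):
    """Split the sentence into words and map each word to its vector."""
    words = sentence.lower().split()  # stand-in for language-specific word segmentation
    return [embeddings.get(word, embeddings[unknown]) for word in words]


embeddings = build_embeddings(["<unk>", "machine", "translation"], dim=8)
vectors = preprocess("Machine translation works", embeddings)
print(len(vectors), len(vectors[0]))
```

Words missing from the table fall back to the `<unk>` vector, so the number of vectors always equals the number of words in the sentence.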

The encoding unit 802 is used to encode the plurality of vectors to generate a plurality of encoded vectors. For example, the encoding can be performed by the encoding network described above with reference to FIG. 2.

The decoding unit 803 is used to input the plurality of encoded vectors and information indicating a translation direction to a single decoding network, and to output from the single decoding network the target-language output text corresponding to the source-language input text. The output order of the target-language words in the output text matches the translation direction, and when the translation direction indicated by the information input to the single decoding network is changed, the parameters of each node in the single decoding network are not changed.

Here, as described above, the translation direction is specified by the user. For example, if the user wants the decoding network to output the translated text in left-to-right order, information indicating the left-to-right translation direction is input to the decoding network. Alternatively, if the user wants the decoding network to output the translated text in right-to-left order, information indicating the right-to-left translation direction is input to the decoding network. For example, the information indicating the translation direction may be a multidimensional feature vector (for example, 128 dimensions). The value of each element of the multidimensional feature vector is determined randomly and is fixed for a given translation direction.
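A direction vector of this kind can be sketched as follows. The seeded generator is one simple way, assumed here, to make the randomly chosen values reproducible and therefore fixed per direction; `make_direction_vectors` and the labels `"l2r"` and `"r2l"` are hypothetical.

```python
import random

DIRECTION_DIM = 128  # example dimensionality mentioned in the text


def make_direction_vectors(dim=DIRECTION_DIM, seed=0):
    """Draw one random vector per translation direction, reproducibly."""
    rng = random.Random(seed)
    return {
        direction: [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for direction in ("l2r", "r2l")
    }


direction_vectors = make_direction_vectors()
# The same vector is used every time a given direction is requested.
left_to_right_input = direction_vectors["l2r"]
print(len(left_to_right_input))
```

Switching the requested direction only swaps which of the two vectors is fed to the decoder; nothing about the decoder's own parameters is touched.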

Note that when the translation direction indicated by the information input to the single decoding network is changed, the parameters of each node in the single decoding network are not changed.

The decoding network outputs the words of the translation (that is, the target-language output text) one after another at a predetermined processing time interval. If the translation direction indicated by the information input to the decoding network is left to right, the decoding network outputs the first target-language word from the left at the first time step. At the second time step, which follows the first time step by the predetermined processing time interval, the decoding network outputs the second target-language word from the left, and so on until the last target-language word has been output. Likewise, if the indicated translation direction is right to left, the decoding network outputs the first target-language word from the right at the first time step and the second target-language word from the right at the second time step, and so on until the last target-language word has been output.
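The relation between translation direction and per-time-step output can be illustrated with two small helpers. Both function names and the direction labels are hypothetical; the sketch only models the ordering, not the decoding itself.

```python
def emission_order(target_words, direction):
    """Return the words in the order they are emitted over successive time steps."""
    if direction == "l2r":
        return list(target_words)
    if direction == "r2l":
        return list(reversed(target_words))
    raise ValueError("direction must be 'l2r' or 'r2l'")


def restore_reading_order(emitted_words, direction):
    """Turn the emitted sequence back into normal reading order."""
    return emission_order(emitted_words, direction)  # reversing twice restores the order


sentence = ["machine", "translation", "device"]
emitted = emission_order(sentence, "r2l")
for time_step, word in enumerate(emitted, start=1):
    print(time_step, word)
restored = restore_reading_order(emitted, "r2l")
```

For the right-to-left direction, the first time step yields the rightmost word, and reversing the emitted sequence recovers the sentence in reading order.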

Note that the output of the current time step is fed back as a further input to the lowest layer of the decoding network when decoding at the next time step. In other words, in addition to taking the encoded vectors from the encoding network and the information indicating the translation direction as inputs, the decoding network also generates its current output based on its own preceding outputs.

Also as described above, the decoding network outputs a plurality of prediction vectors, one at each time step. Each prediction vector contains, for each word in the target-language thesaurus (vocabulary), the probability that the word is the word of the target-language output text at that time step, and each time step corresponds to one word of the target-language output text. At each time step, the thesaurus word corresponding to the element with the highest probability in that time step's prediction vector is selected and output as the translation.
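Selecting the highest-probability word at each time step can be sketched as follows. The function name `greedy_select`, the three-word thesaurus, and the probability values are made up for illustration.

```python
def greedy_select(prediction_vectors, thesaurus):
    """Pick, for each time step, the thesaurus word with the highest probability."""
    selected = []
    for probabilities in prediction_vectors:
        if len(probabilities) != len(thesaurus):
            raise ValueError("each prediction vector needs one entry per thesaurus word")
        best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
        selected.append(thesaurus[best_index])
    return selected


thesaurus = ["device", "machine", "translation"]
prediction_vectors = [
    [0.1, 0.7, 0.2],  # time step 1
    [0.2, 0.1, 0.7],  # time step 2
    [0.8, 0.1, 0.1],  # time step 3
]
selected_words = greedy_select(prediction_vectors, thesaurus)
print(selected_words)
```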

The foregoing described the specific configuration of the machine translation apparatus according to the present disclosure, and a single decoding network that can share the same node parameters across different translation directions. The machine translation apparatus described above performs translation using an already trained decoding network. Accordingly, the apparatus may further include a training unit 804 for training the decoding network by executing: a process of processing a source-language training input text to generate a plurality of training vectors corresponding to the respective source-language words in that text; a process of encoding the plurality of training vectors to generate a plurality of encoded training vectors; a process of inputting the plurality of encoded training vectors and information indicating a translation direction to the single decoding network; a process of outputting a plurality of training prediction vectors from the single decoding network at a plurality of time steps, respectively, where each training prediction vector contains, for each word in the target-language thesaurus (vocabulary), the probability that the word is the word of the target-language output text at one time step, and each time step corresponds to one word of the target-language output text; a process of determining a plurality of correct-answer vectors based on the words of the target-language correct-answer text that should be output at the respective time steps for the translation direction, where each correct-answer vector contains, for each word in the target-language thesaurus, the probability that the word is the word of the target-language output text at one time step, with the word that should be output at that time step having the highest probability; and a process of adjusting the parameters of each node in the decoding network based on a first loss function indicating at least the difference between each training prediction vector and the corresponding correct-answer vector.
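The correct-answer vectors and the first loss function can be sketched as follows. The sketch assumes one-hot correct-answer vectors and a cross-entropy loss, which is one common choice and not necessarily the form used in the disclosure; `one_hot`, `first_loss`, and the toy values are hypothetical.

```python
import math


def one_hot(index, size):
    """Correct-answer vector: probability 1 for the expected word, 0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def first_loss(prediction_vectors, reference_words, thesaurus, direction):
    """Average cross-entropy between prediction vectors and correct-answer vectors.

    For the right-to-left direction the reference is reversed, so the word
    expected at the first time step is the rightmost word of the reference.
    """
    expected = list(reference_words) if direction == "l2r" else list(reversed(reference_words))
    index_of = {word: i for i, word in enumerate(thesaurus)}
    total = 0.0
    for probabilities, word in zip(prediction_vectors, expected):
        target = one_hot(index_of[word], len(thesaurus))
        total += -sum(t * math.log(max(p, 1e-12)) for t, p in zip(target, probabilities))
    return total / len(expected)


thesaurus = ["a", "b", "c"]
reference = ["a", "b", "c"]
# Predictions that favour the order c, b, a, i.e. a right-to-left output.
predictions = [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]
loss_r2l = first_loss(predictions, reference, thesaurus, "r2l")
loss_l2r = first_loss(predictions, reference, thesaurus, "l2r")
print(loss_r2l, loss_l2r)
```

The same predictions give a low loss when the indicated direction is right to left and a high loss when it is left to right, which is the signal that drives the parameter adjustment.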

As the above description also shows, the information indicating the translation direction has the greatest influence on the target language word output at the first time step. At each subsequent time step that information is still provided to the decoding network as input, but because greater weight is given over time to the more recent preceding outputs, the weight of the direction information keeps shrinking; in other words, it has less and less influence on the target language words that are output. In an architecture that shares one decoding network, therefore, the outputs of later time steps in particular are not sensitive to the translation direction, and the information about the translation direction may be lost. As another possible embodiment, a classification network that assists in training the decoding network can therefore be added when the single decoding network is trained.

Specifically, the single decoding network includes a hidden layer, and the training unit 804 is further configured to train the decoding network by executing: a process of inputting into a classification network the intermediate vectors that the hidden layer of the decoding network outputs at the respective time steps for the translation direction, and outputting from the classification network a classification result indicating the output order of the words in the target language training output text output from the single decoding network; and a process of further adjusting the parameters of each node in the decoding network based on a comparison between that output order and the translation direction. Here, the translation direction is the direction indicated by the information input to the decoding network.

The intermediate vector output from the hidden layer of the decoding network is the intermediate vector 401 described above with reference to FIG. 4, and the hidden layer is the topmost hidden layer before the fully connected layer. The classification network can be implemented as a convolutional neural network (CNN).

Specifically, suppose the specific order is set to left-to-right. If the probability value output from the classification network is 1, the words of the target language training output text are output in left-to-right order. If the value is 0, the words are output in right-to-left order. If the value lies between 0 and 1 and is close to 1, the words are output in approximately left-to-right order, with some words out of order. If the value lies between 0 and 1 and is close to 0, the words are output in approximately right-to-left order, with some words out of order.

Like the decoding network, the classification network also needs to be trained. For example, the training unit 804 can train the classification network by adjusting the parameters of each node in the classification network based on a second loss function indicating the difference between the classification result output from the classification network and the translation direction. Here, the translation direction is the direction indicated by the information input to the decoding network. In this case, the user-specified direction indicated by the information input to the decoding network is used as an approximate correct answer for the classification network. As described above, the classification result output from the classification network is a probability value indicating that the word output order is left-to-right. For example, if the user-specified direction input to the decoding network is left-to-right and the classification network outputs a probability value of 0.8 for the training data, the parameters of each node in the classification network are adjusted with 1 as the approximate correct answer. Likewise, if the user-specified direction input to the decoding network is right-to-left and the classification network outputs a probability value of 0.8 for the training data, the parameters of each node in the classification network are adjusted with 0 as the approximate correct answer.
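
The two worked cases above can be reproduced with a short sketch. The description does not specify the form of the second loss function; binary cross-entropy is assumed here, and the names `approximate_label` and `second_loss` are hypothetical.

```python
import math


def approximate_label(direction):
    """Approximate correct answer: 1 for left-to-right, 0 for right-to-left."""
    return 1.0 if direction == "l2r" else 0.0


def second_loss(classifier_prob, direction, eps=1e-12):
    """Binary cross-entropy between the classifier output and the approximate label.

    Binary cross-entropy is an assumption; the description only requires a loss
    that indicates the difference between the two.
    """
    target = approximate_label(direction)
    p = min(max(classifier_prob, eps), 1.0 - eps)  # keep log() finite
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# The classifier outputs 0.8 in both cases; only the input direction differs.
loss_when_l2r = second_loss(0.8, "l2r")  # label 1, small loss
loss_when_r2l = second_loss(0.8, "r2l")  # label 0, larger loss
```

Under this assumption the same output of 0.8 is penalized far more when the input direction is right-to-left, which is what drives the parameter adjustment.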

Alternatively, as another possible embodiment, the classification network can be trained by adjusting the parameters of each node in the classification network based on the difference between the classification result and the actual word output order. In this embodiment, the user-specified direction indicated by the information input to the decoding network is not used as the approximate correct answer for the classification network. Instead, the training process must compute in advance, for the actual word output order, the probability value that this order follows the specific translation direction, and use it as the correct probability value for the classification result of the classification network. For example, suppose the user-specified direction indicated by the information input to the decoding network is left-to-right, so that y_1, y_2, y_3, y_4 are to be output in turn at the respective time steps. If the word order actually output by the decoding network is y_1, y_3, y_2, y_4, the probability value that the output order is left-to-right is 0.5. In this case, if the classification network outputs a probability value of 0.8, the parameters of each node in the classification network are adjusted with 0.5 as the correct answer.
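
The description does not state how the value 0.5 is computed. One measure that happens to reproduce it, offered purely as an assumption, is the fraction of time steps at which the emitted word is the one expected in left-to-right order. The function name `left_to_right_fraction` is hypothetical.

```python
def left_to_right_fraction(emitted_positions):
    """Fraction of time steps whose emitted word sits at its left-to-right position.

    emitted_positions[t] is the 1-based reference position of the word that the
    decoding network actually output at time step t + 1. This measure is an
    assumption chosen to match the example, not a formula from the description.
    """
    matches = sum(
        1 for step, position in enumerate(emitted_positions, start=1) if position == step
    )
    return matches / len(emitted_positions)


# y_1, y_3, y_2, y_4: the first and fourth words are in place, the middle two are swapped.
label = left_to_right_fraction([1, 3, 2, 4])
```

This measure is only illustrative: for a fully reversed sequence of odd length the middle word still counts as a match, so a practical implementation would likely need a more careful definition.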

The foregoing has described an example in which a translation direction specified by the user is input to the decoding network and the decoding network outputs target language words according to that translation direction. However, the present invention is not limited to this.

According to another embodiment of the present disclosure, the decoding unit 803 may be further configured to: input the plurality of encoded vectors and information indicating a first translation direction into the single decoding network and output a first output text in the target language from the single decoding network, the output order of the target language words in the first output text matching the first translation direction; and input the plurality of encoded vectors and information indicating a second translation direction into the single decoding network and output a second output text in the target language from the single decoding network, the output order of the target language words in the second output text matching the second translation direction.

Thus, in this other embodiment, the single decoding network may output two target language output texts, one for each of the two translation directions.

The decoding unit 803 may further include a selection unit (not shown) that selects, based on a predetermined rule, the preferable one of the two output texts and outputs it as the final translation.

Specifically, the selection unit is configured to execute the following processes:
a process of inputting into the classification network the intermediate vectors that the hidden layer of the decoding network outputs at the respective time steps for the first translation direction, each time step corresponding to one word of the first output text in the target language, and outputting from the classification network a first classification result indicating the output order of the words in the first output text output from the single decoding network;
a process of inputting into the classification network the intermediate vectors that the hidden layer of the decoding network outputs at the respective time steps for the second translation direction, each time step corresponding to one word of the second output text in the target language, and outputting from the classification network a second classification result indicating the output order of the words in the second output text output from the single decoding network;
a process of inputting a value based on the appearance probabilities of the words in the first output text and the first classification result into a scoring network, and outputting a first score from the scoring network;
a process of inputting a value based on the appearance probabilities of the words in the second output text and the second classification result into the scoring network, and outputting a second score from the scoring network; and
a process of selecting, as the output text in the target language, the output text corresponding to the larger of the first score and the second score.
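
The selection step can be sketched as follows. Everything beyond the comparison of two scores is an assumption: the average log-probability stands in for the "value based on the appearance probabilities", and `toy_score_network` is a hypothetical placeholder for the trained scoring network, which in practice would be learned rather than hand-written.

```python
import math


def probability_value(word_probs):
    """Average log-probability of the emitted words (assumed form of the value)."""
    return sum(math.log(p) for p in word_probs) / len(word_probs)


def toy_score_network(prob_value, classification_result):
    """Hypothetical stand-in for the scoring network.

    Rewards likely words and a classification result that is far from 0.5,
    i.e. an output order that is clearly one direction or the other.
    """
    return prob_value + abs(2.0 * classification_result - 1.0)


def select_translation(first, second, score_network=toy_score_network):
    """Return the candidate text with the larger score (ties go to the first)."""
    first_score = score_network(probability_value(first["word_probs"]), first["classification"])
    second_score = score_network(probability_value(second["word_probs"]), second["classification"])
    return first["text"] if first_score >= second_score else second["text"]


first_candidate = {"text": "output A", "word_probs": [0.9, 0.8], "classification": 0.95}
second_candidate = {"text": "output B", "word_probs": [0.5, 0.4], "classification": 0.3}
chosen = select_translation(first_candidate, second_candidate)
```

Passing the scoring function as a parameter keeps the comparison logic separate from whatever trained network supplies the scores.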

Like the decoding network and the classification network described above, the scoring network also needs to be trained before use.

Specifically, the training unit 804 can train the scoring network by executing the following processes:
a process of generating, for a training input text in the source language, a plurality of training vectors corresponding to the respective source language words in that text;
a process of encoding the plurality of training vectors to generate a plurality of encoded training vectors;
a process of inputting the plurality of encoded training vectors and information indicating one translation direction into the single decoding network;
a process of inputting into the classification network the intermediate vectors that the hidden layer of the decoding network outputs at the respective time steps for the one translation direction, and outputting from the classification network a classification result indicating the output order of the words in the target language training output text output from the single decoding network;
a process of inputting a value based on the appearance probabilities of the words in the target language training output text and the classification result into the scoring network, and outputting a training score from the scoring network;
a process of calculating the similarity between the target language training output text output from the decoding network and the target language correct answer text; and
a process of adjusting the parameters of each node in the scoring network based on a third loss function indicating the difference between the training score and the similarity.
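
The third loss function can be sketched under two assumptions that the description leaves open: the similarity is computed with the word-level matching ratio from Python's standard `difflib` module as a stand-in for whatever translation similarity measure is actually used, and the difference is measured as a squared error. The names `similarity` and `third_loss` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(output_words, reference_words):
    """Word-level similarity in [0, 1]; a stand-in, not the measure of the description."""
    return SequenceMatcher(None, output_words, reference_words).ratio()


def third_loss(training_score, output_words, reference_words):
    """Squared difference between the training score and the similarity (assumed form)."""
    return (training_score - similarity(output_words, reference_words)) ** 2


reference = ["the", "cat", "sat"]
loss_value = third_loss(0.5, ["the", "dog", "sat"], reference)
```

Training against such a target pushes the scoring network to output high scores for candidates that resemble the correct answer text, which is what makes its scores usable for choosing between the two directions at translation time.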

The foregoing has described machine translation apparatuses according to two embodiments of the present disclosure. In one embodiment, the decoding network outputs the target language output text (translation) that corresponds to the input translation direction; that is, the translation finally output by the decoding network is tied to the translation direction initially specified by the user. In the other embodiment, two different translation directions are each input to the decoding network, and of the two translations obtained for those directions, the one with the better translation quality is selected and output as the final translation. In other words, this other embodiment can be regarded as not requiring a translation direction to be specified for the decoding network.

When the decoding network is trained, a human translation of the source language training input text can be used as the correct answer text in the target language. In addition, the correct answer text in the target language may further include the target language output text that another decoding network outputs when the same source language input text is input to it, where the network scale of the other decoding network is larger than that of the decoding network. Here, network scale refers to network parameters such as the number of nodes and the number of network layers. Naturally, the other decoding network is an already trained network.
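
Assembling the correct answer texts from both sources can be sketched as below. The function `build_reference_texts` and the callable `teacher_translate`, which represents the larger, already trained decoding network, are hypothetical; dropping a duplicate teacher output is also an assumption.

```python
def build_reference_texts(source_text, human_translation, teacher_translate):
    """Collect correct answer texts for one source sentence.

    teacher_translate is a hypothetical callable wrapping the larger trained
    network; it maps a source text to a target language output text.
    """
    references = [human_translation]
    teacher_output = teacher_translate(source_text)
    # Keep the teacher output only if it adds a different reference (assumption).
    if teacher_output and teacher_output != human_translation:
        references.append(teacher_output)
    return references


references = build_reference_texts("source sentence", "Hello", lambda text: "Hi there")
```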

Conversely, the target language output text that the decoding network according to the present disclosure outputs once its training is complete may likewise be used as correct answer text in training a smaller decoding network.

Because the machine translation apparatus according to the embodiments of the present disclosure corresponds fully to the machine translation method described above, many details have been omitted from the description of the apparatus. Those skilled in the art will recognize that all details of the machine translation method described above apply equally to the machine translation apparatus.

The method or apparatus according to the embodiments of the present disclosure may also be implemented with the architecture of the calculation device 900 shown in FIG. 9. As shown in FIG. 9, the calculation device 900 may include a bus 910, one or more CPUs 920, a read-only memory (ROM) 930, a random access memory (RAM) 940, a communication port 950 connected to a network, an input/output unit 960, a hard disk 970, and the like. A storage device in the calculation device 900, such as the ROM 930 or the hard disk 970, can store various data or files used in the processing and/or communication of the machine translation method according to the present disclosure, as well as program instructions executed by the CPU. Naturally, the architecture shown in FIG. 9 is only an example, and when different devices are implemented, one or more units of the calculation device shown in FIG. 9 may be omitted according to actual needs.

The embodiments of the present disclosure may also be implemented as a computer-readable storage medium. Computer-readable instructions are stored on the computer-readable storage medium according to the embodiments of the present disclosure. When the computer-readable instructions are executed by a processor, the machine translation method according to the embodiments of the present disclosure described above with reference to the drawings can be performed. The computer-readable storage medium includes, for example, but is not limited to, volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like.

The machine translation method and the machine translation apparatus according to the embodiments of the present disclosure have been described in detail above with reference to FIGS. 1 to 9. With this method and apparatus, the same decoding network can produce target language text output in the first translation direction and also in the second translation direction. Compared with the conventional approach of building and training a separate decoding network for each fixed translation direction, this greatly reduces time cost and software and hardware cost. In training the decoding network, adding a classification network allows the translation direction information to be learned or used, so that the decoding network can be trained gradually until the actual output order of its target language words matches the input translation direction specified by the user. When the user does not need to take part, the better of the two translation results for the different translation directions can be selected dynamically and output as the final translation. Finally, transferring the knowledge learned by another, larger decoding network directly to a smaller decoding network diversifies the knowledge available for learning, effectively shortens the time needed to finish training the smaller decoding network, and avoids the situation in which the loss function of the decoding network fails to converge for a long time.

In this specification, the terms "include", "included", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless further limited, an element defined by the phrase "including ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.

Finally, it should be noted that the series of processes described above includes not only processes executed chronologically in the order described here, but also processes executed in parallel or individually rather than in chronological order.

From the above description of the embodiments, those skilled in the art will understand that the present invention may be implemented by a combination of software and the necessary hardware platform, and may of course be implemented entirely in software. Based on this understanding, all or part of the technical solution of the present invention that contributes over the background art can be embodied in the form of a computer software product. The product can be stored on a storage medium such as a ROM/RAM, a floppy disk, or a disk, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in parts of the embodiments, of the present invention.

The present invention has been described in detail above. This specification uses specific examples to explain the principles and embodiments of the present invention, but the above description of the examples serves only to aid understanding of the method of the present invention and its gist. It is also obvious to those skilled in the art that both the specific embodiments and the scope of application may be modified on the basis of the gist of the present invention. The content of this specification should therefore not be understood as limiting the present invention.

Claims

A machine translation device comprising:
a preprocessing unit that generates, for an input text in a source language, a plurality of vectors corresponding to the respective source language words in the input text;
an encoding unit that encodes the plurality of vectors to generate a plurality of encoded vectors; and
a decoding unit that inputs the plurality of encoded vectors and information indicating a translation direction into a single decoding network and outputs, from the single decoding network, an output text in a target language corresponding to the input text in the source language, the output order of the target language words in the output text matching the translation direction,
wherein the parameters of each node in the single decoding network are not changed when the translation direction indicated by the information input to the single decoding network is changed.

The device according to claim 1, further comprising a training unit that trains the decoding network by executing:
a process of generating, for a training input text in the source language, a plurality of training vectors corresponding to the respective source language words in the training input text;
a process of encoding the plurality of training vectors to generate a plurality of encoded training vectors;
a process of inputting the plurality of encoded training vectors and information indicating a translation direction into the single decoding network;
a process of outputting a plurality of training prediction vectors from the single decoding network at a plurality of time steps, each time step corresponding to one word of the output text in the target language, and each training prediction vector containing, for every word in a target language thesaurus, the probability that the word is the output-text word at that time step;
a process of determining a plurality of correct answer vectors based on the words of a correct answer text in the target language that should be output at the plurality of time steps for the translation direction, each correct answer vector containing, for every word in the target language thesaurus, the probability that the word is the output-text word at that time step, the probability of the word that should be output at that time step being the largest; and
a process of adjusting the parameters of each node in the decoding network based on a first loss function indicating at least the difference between each training prediction vector and the corresponding correct answer vector.

The device according to claim 2, wherein the single decoding network includes a hidden layer, and the training unit is further configured to train the decoding network by executing:
a process of inputting into a classification network the intermediate vectors output from the hidden layer of the decoding network at the respective time steps for the translation direction, and outputting from the classification network a classification result indicating the output order of the words in the training output text in the target language output from the single decoding network; and
a process of further adjusting the parameters of each node in the decoding network based on a comparison between the output order and the translation direction.

The device according to claim 3, wherein the training unit is further configured to train the classification network by
adjusting the parameters of each node in the classification network based on a second loss function indicating the difference between the classification result output from the classification network and the translation direction.

The device according to claim 1, wherein the decoding unit is further configured to:
input the plurality of encoded vectors and information indicating a first translation direction into the single decoding network and output a first output text in the target language from the single decoding network, the output order of the target language words in the first output text matching the first translation direction; and
input the plurality of encoded vectors and information indicating a second translation direction into the single decoding network and output a second output text in the target language from the single decoding network, the output order of the target language words in the second output text matching the second translation direction.

The device according to claim 5, wherein the decoding network includes a hidden layer, and the device further comprises a selection unit that executes:
a process of inputting into a classification network the intermediate vectors output from the hidden layer of the decoding network at the respective time steps for the first translation direction, each time step corresponding to one word of the first output text in the target language, and outputting from the classification network a first classification result indicating the output order of the words in the first output text output from the single decoding network;
a process of inputting into the classification network the intermediate vectors output from the hidden layer of the decoding network at the respective time steps for the second translation direction, each time step corresponding to one word of the second output text in the target language, and outputting from the classification network a second classification result indicating the output order of the words in the second output text output from the single decoding network;
a process of inputting a value based on the appearance probabilities of the words in the first output text and the first classification result into a scoring network, and outputting a first score from the scoring network;
a process of inputting a value based on the appearance probabilities of the words in the second output text and the second classification result into the scoring network, and outputting a second score from the scoring network; and
a process of selecting, as the output text in the target language, the output text corresponding to the larger of the first score and the second score.

The device according to claim 6, further comprising a training unit configured to train the score network by:
a process of processing a training input text of the source language so as to generate a plurality of training vectors corresponding to the respective source language words in the training input text, and encoding the plurality of training vectors to generate a plurality of encoded training vectors;
a process of inputting the plurality of encoded training vectors and information indicating one translation direction into the single decoding network;
a process of inputting, into the classification network, the intermediate vector output from the hidden layer of the decoding network at each time step corresponding to the one translation direction, and outputting from the classification network a classification result indicating the output order of each word in the training output text of the target language output from the single decoding network;
a process of inputting a value based on the appearance probability of each word in the training output text of the target language and the classification result into the score network and outputting a training score from the score network;
a process of calculating the similarity between the training output text of the target language output from the decoding network and the correct answer text of the target language; and
a process of adjusting the parameters of each node in the score network based on a third loss function indicating the difference between the training score and the similarity.

The device according to claim 1, wherein the correct answer text of the target language further includes the output text of the target language that is output from another decoding network when the same input text of the source language is input to it, and
the network scale of the other decoding network is larger than the network scale of the decoding network.

The device according to claim 1, wherein the output text of the target language output from the decoding network is used in the training process of another decoding network as the correct answer text of the target language corresponding to the input text of the source language, and
the network scale of the other decoding network is smaller than the network scale of the decoding network.

A machine translation method comprising:
a step of processing an input text of a source language so as to generate a plurality of vectors corresponding to the respective source language words in the input text of the source language;
a step of encoding the plurality of vectors to generate a plurality of encoded vectors;
a step of inputting the plurality of encoded vectors and information indicating a translation direction into a single decoding network; and
a step of outputting, from the single decoding network, an output text of a target language corresponding to the input text of the source language, the output order of the target language words included in the output text matching the translation direction,
wherein the parameters of each node in the single decoding network are not changed when the translation direction indicated by the information input to the single decoding network is changed.
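
As a non-limiting illustration of the method above, the following Python sketch shows only the interface it describes: one decoder, one fixed set of parameters, and a direction indicator supplied as an input. Everything named here is hypothetical and not taken from the specification: the class `SingleDecoder`, the tags `L2R` and `R2L`, and the assumption that the two translation directions are left-to-right and right-to-left emission of the target words. The lookup table is a toy stand-in for a trained decoding network.

```python
from dataclasses import dataclass, field

L2R = "<l2r>"  # hypothetical tag: emit target words left to right
R2L = "<r2l>"  # hypothetical tag: emit target words right to left


@dataclass
class SingleDecoder:
    """Toy stand-in for the single decoding network (not a neural network)."""

    # The only "parameters": a fixed lookup from an encoded vector to a word.
    params: dict = field(default_factory=dict)

    def decode(self, encoded_vectors, direction):
        """Return target words in the order dictated by `direction`."""
        if direction not in (L2R, R2L):
            raise ValueError(f"unknown direction: {direction!r}")
        words = [self.params[tuple(v)] for v in encoded_vectors]
        # The direction is an input only; self.params is never modified.
        return words if direction == L2R else words[::-1]


decoder = SingleDecoder(params={(1.0,): "machine", (2.0,): "translation"})
encoded = [[1.0], [2.0]]  # pretend output of the encoding step
forward = decoder.decode(encoded, L2R)
backward = decoder.decode(encoded, R2L)
```

Switching from `L2R` to `R2L` changes only the order of the emitted words; `decoder.params` is the same object with the same contents before and after both calls.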

The method according to claim 10, wherein the decoding network is trained by:
a process of processing a training input text of the source language so as to generate a plurality of training vectors corresponding to the respective source language words in the training input text, and encoding the plurality of training vectors to generate a plurality of encoded training vectors;
a process of inputting the plurality of encoded training vectors and the information indicating the translation direction into the single decoding network;
a process of outputting a plurality of training prediction vectors from the single decoding network at a plurality of time steps, each time step corresponding to a word in the output text of the target language, and each training prediction vector including, for each word in the target language thesaurus, the probability that the word is the word of the output text of the target language at that time step;
a process of determining a plurality of correct answer vectors based on the words of the correct answer text of the target language that are to be output at the plurality of time steps in accordance with the translation direction, each correct answer vector including, for each word in the target language thesaurus, the probability that the word is the word of the output text of the target language at that time step, with the highest probability assigned to the word that is to be output at that time step; and
a process of adjusting the parameters of each node in the decoding network based on at least a first loss function indicating the difference between each training prediction vector and the corresponding correct answer vector.
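
As a non-limiting illustration, the first loss function can be sketched as follows. The claim only requires that the loss indicate the difference between each training prediction vector and its correct answer vector; the use of cross-entropy, of one-hot correct answer vectors, and of the direction labels `"l2r"` and `"r2l"` are assumptions made for this sketch, as are the function names.

```python
import math


def correct_answer_vectors(reference_words, vocab, direction):
    """Build one one-hot vector per time step, ordered by the direction.

    Assumption: "r2l" means the reference words are emitted in reverse.
    """
    if direction not in ("l2r", "r2l"):
        raise ValueError(f"unknown direction: {direction!r}")
    ordered = reference_words if direction == "l2r" else reference_words[::-1]
    return [[1.0 if w == target else 0.0 for w in vocab] for target in ordered]


def first_loss(prediction_vectors, answer_vectors, eps=1e-12):
    """Mean cross-entropy over time steps (an assumed choice of loss)."""
    if not answer_vectors or len(prediction_vectors) != len(answer_vectors):
        raise ValueError("need one prediction vector per correct answer vector")
    total = 0.0
    for pred, gold in zip(prediction_vectors, answer_vectors):
        total -= sum(g * math.log(p + eps) for p, g in zip(pred, gold))
    return total / len(answer_vectors)


vocab = ["machine", "translation"]  # stand-in for the target language thesaurus
gold_r2l = correct_answer_vectors(["machine", "translation"], vocab, "r2l")
uniform = [[0.5, 0.5], [0.5, 0.5]]  # an untrained decoder's predictions
loss = first_loss(uniform, gold_r2l)
```

With the right-to-left direction the correct answer vectors are built from the reversed reference, so the same reference text yields a different target sequence for each direction while the decoder parameters being adjusted stay shared.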

The method according to claim 11, wherein the single decoding network includes a hidden layer, and the decoding network is further trained by:
a process of inputting, into a classification network, the intermediate vector output from the hidden layer of the decoding network at each time step corresponding to the translation direction, and outputting from the classification network a classification result indicating the output order of each word in the training output text of the target language output from the single decoding network; and
a process of further adjusting the parameters of each node in the decoding network based on a comparison of the output order with the translation direction.

The method according to claim 12, wherein the classification network is trained by a process of adjusting the parameters of each node in the classification network based on a second loss function indicating the difference between the classification result output from the classification network and the translation direction.
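
As a non-limiting illustration, the second loss function might be computed as below. The sketch assumes that the classification network emits, at each time step, a probability that the word was produced in left-to-right order, and that the loss is a binary cross-entropy against the true direction; neither assumption, nor the name `second_loss`, comes from the claim.

```python
import math


def second_loss(p_l2r_per_step, direction, eps=1e-12):
    """Mean binary cross-entropy between classifier outputs and the direction.

    p_l2r_per_step: assumed classifier output per time step, P(order is l2r).
    direction: the translation direction actually fed to the decoder.
    """
    if direction not in ("l2r", "r2l"):
        raise ValueError(f"unknown direction: {direction!r}")
    if not p_l2r_per_step:
        raise ValueError("need at least one time step")
    target = 1.0 if direction == "l2r" else 0.0
    total = 0.0
    for p in p_l2r_per_step:
        total -= target * math.log(p + eps)
        total -= (1.0 - target) * math.log(1.0 - p + eps)
    return total / len(p_l2r_per_step)


confident = second_loss([0.9, 0.9], "l2r")  # classifier agrees with direction
mistaken = second_loss([0.9, 0.9], "r2l")   # classifier disagrees
```

The loss is small when the classifier's per-step outputs agree with the direction given to the decoder and large when they do not.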

The method according to claim 10, wherein the step of inputting the plurality of encoded vectors and the information indicating the translation direction into the single decoding network further includes:
a step of inputting the plurality of encoded vectors and information indicating a first translation direction into the single decoding network; and
a step of inputting the plurality of encoded vectors and information indicating a second translation direction into the single decoding network, and
wherein the step of outputting the output text of the target language corresponding to the input text of the source language from the single decoding network further includes:
a step of outputting a first output text of the target language from the single decoding network, the output order of the target language words included in the first output text matching the first translation direction; and
a step of outputting a second output text of the target language from the single decoding network, the output order of the target language words included in the second output text matching the second translation direction.

The method according to claim 14, wherein the decoding network includes a hidden layer, the method further including:
a step of inputting, into a classification network, the intermediate vector output from the hidden layer of the decoding network at each time step corresponding to the first translation direction, each time step corresponding to a word in the first output text of the target language, and outputting from the classification network a first classification result indicating the output order of each word in the first output text of the target language output from the single decoding network;
a step of inputting, into the classification network, the intermediate vector output from the hidden layer of the decoding network at each time step corresponding to the second translation direction, each time step corresponding to a word in the second output text of the target language, and outputting from the classification network a second classification result indicating the output order of each word in the second output text of the target language output from the single decoding network;
a step of inputting a value based on the appearance probability of each word in the first output text of the target language and the first classification result into a score network and outputting a first score from the score network; and
a step of inputting a value based on the appearance probability of each word in the second output text of the target language and the second classification result into the score network and outputting a second score from the score network, and
wherein the step of outputting the output text of the target language corresponding to the input text of the source language from the single decoding network
further includes selecting, as the output text of the target language, the output text corresponding to the larger of the first score and the second score.
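
As a non-limiting illustration, the selection between the two candidate output texts can be sketched as follows. In the claim the score network is a trained network; the fixed linear combination in `score_candidate` is only a placeholder for it, and the weights, the per-step direction labels, and all names are assumptions made for this sketch.

```python
import math


def score_candidate(word_probs, order_labels, direction,
                    w_prob=1.0, w_order=1.0):
    """Placeholder for the score network (a fixed formula, not a trained net).

    word_probs: appearance probability of each emitted word (each > 0).
    order_labels: assumed per-step classifier decision, "l2r" or "r2l".
    """
    if not word_probs or not order_labels:
        raise ValueError("need at least one word and one classifier label")
    avg_logp = sum(math.log(p) for p in word_probs) / len(word_probs)
    agreement = sum(lbl == direction for lbl in order_labels) / len(order_labels)
    return w_prob * avg_logp + w_order * agreement


def select_output(candidates):
    """Return the text whose score is larger (first one wins a tie)."""
    return max(candidates, key=lambda c: c["score"])["text"]


first = {"text": "first output text",
         "score": score_candidate([0.9, 0.8], ["l2r", "l2r"], "l2r")}
second = {"text": "second output text",
          "score": score_candidate([0.6, 0.5], ["r2l", "l2r"], "r2l")}
chosen = select_output([first, second])
```

Here the first candidate has both higher word probabilities and a classification result fully consistent with its direction, so it is selected.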

The method according to claim 15, wherein the score network is trained by:
a process of processing a training input text of the source language so as to generate a plurality of training vectors corresponding to the respective source language words in the training input text, and encoding the plurality of training vectors to generate a plurality of encoded training vectors;
a process of inputting the plurality of encoded training vectors and information indicating one translation direction into the single decoding network;
a process of inputting, into the classification network, the intermediate vector output from the hidden layer of the decoding network at each time step corresponding to the one translation direction, and outputting from the classification network a classification result indicating the output order of each word in the training output text of the target language output from the single decoding network;
a process of inputting a value based on the appearance probability of each word in the training output text of the target language and the classification result into the score network and outputting a training score from the score network;
a process of calculating the similarity between the training output text of the target language output from the decoding network and the correct answer text of the target language; and
a process of adjusting the parameters of each node in the score network based on a third loss function indicating the difference between the training score and the similarity.
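
As a non-limiting illustration, the third loss function can be sketched as follows. The claim does not fix how the similarity or the difference is measured; this sketch assumes a unigram F1 overlap as the similarity and a squared error as the loss, and the function names are hypothetical.

```python
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Assumed similarity measure: F1 over whitespace-separated words."""
    hyp, ref = hypothesis.split(), reference.split()
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def third_loss(training_score, similarity):
    """Assumed form of the loss: squared difference of score and similarity."""
    return (training_score - similarity) ** 2


similarity = unigram_f1("the cat", "the cat sat")
loss = third_loss(0.5, similarity)
```

Training against this target pushes the score network to output a high score for a training output text that is close to the correct answer text and a low score otherwise.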

The method according to claim 10, wherein the correct answer text of the target language further includes the output text of the target language that is output from another decoding network when the same input text of the source language is input to it, and
the network scale of the other decoding network is larger than the network scale of the decoding network.

The method according to claim 10, wherein the output text of the target language output from the decoding network is used in the training process of another decoding network as the correct answer text of the target language corresponding to the input text of the source language, and
the network scale of the other decoding network is smaller than the network scale of the decoding network.