JP6633999B2 - Encoder learning device, conversion device, method, and program

Info

Publication number: JP6633999B2
Application number: JP2016212964A
Authority: JP
Inventors: 鈴木 潤 (Jun Suzuki)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2016-10-31
Filing date: 2016-10-31
Publication date: 2020-01-22
Anticipated expiration: 2036-10-31
Also published as: JP2018073163A

Description

The present invention relates to an encoder learning device, a conversion device, a method, and a program, and in particular to an encoder learning device, a conversion device, a method, and a program for solving problems of transforming discrete structures.

The background art is described using problems from the field of natural language processing. Processing natural language by computer can be regarded as processing sets of discrete symbols, such as the surface-level characters and words that appear in text. For example, consider an automatic translation system that takes a sentence in one language as input and outputs a sentence in another language. This system treats the input and output sentences as word sequences (character strings). The computer in the system can therefore be regarded as converting one discrete structure (symbol structure) into another discrete structure. Likewise, systems whose input and output are language, such as document summarization systems, dialogue systems, and sentence composition systems, can be said to be built on processing that converts one discrete structure into another, just like the translation system above. In other natural language processing systems as well, the objects handled are discrete structures such as words, sentences, and documents. Although the definition of what conversion is performed from input to output differs, the processing framework is the same, and the task reduces to a problem of converting a discrete structure into a discrete structure. FIG. 1 shows various examples of conversion problems in natural language processing.

In recent years, neural-network-based string-to-string conversion methods have attracted attention. For example, Non-Patent Document 1 and Non-Patent Document 2 address the discrete-structure-to-discrete-structure conversion problem using a recurrent neural network framework: a discrete structure is encoded into a real-valued vector, and a discrete structure is decoded from that vector.

For example, Non-Patent Document 1 discloses an encoder and a decoder configured as shown in FIG. 2.

The functions and operators used in the description are shown in FIG. 3. As shown in FIG. 2, the sigmoid function is denoted σ_1, the tanh function σ_2, the softmax function σ_3, and the relu function σ_4. Each function takes a vector as input and returns a vector of the same size (number of dimensions) as the input. Each function performs a predetermined calculation for each element of the input vector and stores the result at the same element index (position) as in the input. The same applies when the vector is replaced with a matrix.
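The four functions named above can be sketched in plain Python as follows. This is a minimal illustration, not the patent's own definition (FIG. 3 is not reproduced in this text); the function names and the list-of-floats representation are assumptions. Note that softmax, unlike the other three, also normalizes by the sum over all elements, although it still returns one value per input position.

```python
import math

def sigmoid(v):
    """sigma_1: element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def tanh(v):
    """sigma_2: element-wise hyperbolic tangent."""
    return [math.tanh(x) for x in v]

def softmax(v):
    """sigma_3: exponentiate each element and normalize so the outputs sum to 1."""
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def relu(v):
    """sigma_4: element-wise max(0, x)."""
    return [max(0.0, x) for x in v]
```

Each function returns a list of the same length as its input, matching the "same size, same position" property stated above.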

The decoder described in Non-Patent Document 1 predicts the words of a sequence by predicting words one at a time from the output of the decoding unit, according to the following equation.

In the example of FIG. 2, the predicted word is output, and the first word is "It".

Next, the principle of constructing the encoder is described. The encoder as a whole is formed by connecting encoding units. Various internal configurations of an encoding unit are possible; two examples are given here, one built from a recurrent neural network (RNN) and one built from a long short-term memory (LSTM).

Equations (1) and (2) below give the calculations for an encoding unit built from an RNN and for one built from an LSTM, respectively.

The information input to each encoding unit is the usual input-label vector x and the intermediate (code) state z of the connected encoding unit.
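Equations (1) and (2) are not reproduced in this text, so the sketch below uses the textbook RNN and LSTM cell formulations as a stand-in. All names (`rnn_unit`, `lstm_unit`, `W`, `U`, `b`, the gate keys) are hypothetical, and treating the LSTM state z as a pair (h, c) is an assumption. It shows how a unit takes an input vector x and the state of the connected unit and produces the next state.

```python
import math

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(xs) for xs in zip(*vectors)]

def rnn_unit(x, z_in, W, U, b):
    """Textbook RNN step (assumed form): z_out = tanh(W x + U z_in + b)."""
    return [math.tanh(a) for a in add(matvec(W, x), matvec(U, z_in), b)]

def lstm_unit(x, h_in, c_in, params):
    """Textbook LSTM step; params maps gate name -> (W, U, b)."""
    def gate(name, act):
        W, U, b = params[name]
        return [act(a) for a in add(matvec(W, x), matvec(U, h_in), b)]
    logistic = lambda a: 1.0 / (1.0 + math.exp(-a))
    i = gate("i", logistic)   # input gate
    f = gate("f", logistic)   # forget gate
    o = gate("o", logistic)   # output gate
    g = gate("g", math.tanh)  # candidate cell value
    c_out = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_in, i, g)]
    h_out = [ok * math.tanh(ck) for ok, ck in zip(o, c_out)]
    return h_out, c_out

def encode(xs, z0, W, U, b):
    """Connect RNN units in a chain: each unit receives the previous unit's state."""
    states, z = [], z0
    for x in xs:
        z = rnn_unit(x, z, W, U, b)
        states.append(z)
    return states
```

Chaining the units with `encode` mirrors the statement that the encoder is formed by connecting encoding units, one per input element.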

Non-Patent Document 2 discloses an encoder and a decoder configured as shown in FIG. 4. In FIG. 4, to generate the first word, the word generation unit of the decoder takes as input c_j, which is calculated from the hidden-layer vectors obtained in the encoding units of the encoder, and the first hidden-layer vector h_j^t (where t = 1) of the decoding unit.

Here, c_j is calculated according to equation (3) below.

... (3)
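Equation (3) itself is not reproduced in this text. As a rough illustration of how a vector such as c_j can be computed from the encoder's hidden-layer vectors, the sketch below forms an attention-style weighted sum. The dot-product scoring is an assumption chosen for brevity (it is not taken from the patent or from Non-Patent Document 2), and the names `context_vector`, `encoder_states`, and `decoder_state` are hypothetical.

```python
import math

def context_vector(encoder_states, decoder_state):
    """Weighted sum of encoder hidden vectors; weights are a softmax over scores."""
    # Assumed scoring: dot product between each encoder state and the decoder state.
    scores = [sum(h * d for h, d in zip(h_i, decoder_state)) for h_i in encoder_states]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(encoder_states[0])
    c = [sum(w * h_i[k] for w, h_i in zip(weights, encoder_states)) for k in range(dim)]
    return c, weights
```

When all scores are equal, the weights are uniform and the result is the plain average of the encoder states.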

Next, the training of the encoder and decoder is described. Training can be formulated as a minimization problem that searches for the parameters minimizing the loss function Ψ below.

The parameters to be learned are the set of all parameters in the neural network. The function Ψ corresponds to the negative cross entropy between the correct word y_{n,t} and the current system's prediction o_{n,t}. Here n denotes the sentence number, whose upper limit is the number of training examples in the correct-answer data, and t denotes the word number within the sentence.
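The formula for Ψ is not reproduced in this text. The sketch below assumes the usual form Ψ = −Σ_n Σ_t y_{n,t} · log o_{n,t}, with y_{n,t} a one-hot vector over the vocabulary and o_{n,t} a predicted probability vector. The function name `loss` and the small constant `eps` are illustrative assumptions.

```python
import math

def loss(y, o, eps=1e-12):
    """Sum over sentences n and word positions t of -y[n][t] . log(o[n][t])."""
    total = 0.0
    for y_n, o_n in zip(y, o):            # n: sentence number
        for y_nt, o_nt in zip(y_n, o_n):  # t: word number within the sentence
            # eps guards against log(0); it is not part of the assumed formula
            total -= sum(yk * math.log(ok + eps) for yk, ok in zip(y_nt, o_nt))
    return total
```

A prediction that puts all probability on the correct word gives a loss near zero, and the loss grows as probability moves to other words.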

Non-Patent Document 1: Sutskever, Ilya; Vinyals, Oriol; Le, Quoc V. Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems 27, pp. 3104-3112, 2014.
Non-Patent Document 2: Bahdanau, Dzmitry; Cho, Kyunghyun; Bengio, Yoshua. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015.

Methods based on the conventional techniques above (neural networks and the like) can generally be said to perform far better than earlier methods, but at present they still fall well short of text written by humans.

Normally, in methods using a neural-network-based sequence encoder and sequence decoder, the whole process from receiving an input sentence to generating an output sentence is represented by a single network.

Because the model is then a single network, it is easy to understand. On the other hand, because the complex main problem of converting text to text is represented by one network model, learning a model that converts accurately is a very difficult machine learning problem.

The present invention has been made in view of the above circumstances, and an object thereof is to provide an encoder learning device, method, and program capable of learning parameters for converting discrete structures accurately.

Another object is to provide a conversion device, method, and program that convert discrete structures accurately.

To achieve the above object, an encoder learning device according to a first invention includes: an auxiliary problem generation unit that, based on correct-answer data for a main problem of converting an input discrete structure, generates correct-answer data for an auxiliary problem predetermined for the main problem; and a learning unit that learns the parameters of encoding units, decoding units, and a predictor based on an encoder, a decoder, the correct-answer data for the main problem, and the correct-answer data for the auxiliary problem. The encoder connects encoding units, each of which corresponds to an element of the input discrete structure and encodes that element, and includes the predictor, which predicts a solution to the auxiliary problem. The decoder connects decoding units that decode into elements of a discrete structure, and takes as input the code output by the encoder and the solution to the auxiliary problem.

In the encoder learning device according to the first invention, the learning unit may include: an encoder construction unit that constructs the encoder based on the input discrete structure; an encoder calculation unit that, based on the constructed encoder and on the initial or updated values of the parameters, inputs each element of the input discrete structure to the corresponding encoding unit, performs the calculations in sequence, outputs the code of the input discrete structure, and calculates the predictor to predict the solution to the auxiliary problem; a decoder calculation unit that inputs the output code and the solution to the auxiliary problem to the decoder, performs the calculations in sequence, and outputs a discrete structure; an objective function calculation unit that calculates the value of an objective function expressed using the discrete structure output by the decoder calculation unit and the correct-answer data for the main problem; a parameter update unit that updates the parameters based on the calculated value of the objective function; and an end determination unit that repeats the calculation by the encoder calculation unit, the calculation by the decoder calculation unit, the calculation by the objective function calculation unit, and the update by the parameter update unit until a predetermined iteration end condition is satisfied.

A conversion device according to a second invention solves a main problem of converting an input discrete structure and includes: an encoder construction unit that, based on the input discrete structure, constructs an encoder that connects encoding units, each corresponding to an element of the discrete structure and encoding that element, and that includes a predictor corresponding to an auxiliary problem predetermined for the main problem; an encoder calculation unit that, based on the constructed encoder, inputs each element of the input discrete structure to the corresponding encoding unit, performs the calculations in sequence, outputs the code of the input discrete structure, and calculates the predictor to predict a solution to the auxiliary problem; and a decoder calculation unit that inputs the output code and the solution to the auxiliary problem to a decoder that connects decoding units that decode into elements of a discrete structure, performs the calculations in sequence, and outputs a discrete structure.

In the conversion device according to the second invention, the auxiliary problem may be at least one of: a problem of predicting the set of elements contained in the converted discrete structure; a problem of predicting the number of elements contained in the converted discrete structure; and a problem of predicting the set of elements contained in both the input discrete structure and the converted discrete structure.

An encoder learning method according to a third invention includes: a step in which an auxiliary problem generation unit, based on correct-answer data for a main problem of converting an input discrete structure, generates correct-answer data for an auxiliary problem predetermined for the main problem; and a step in which a learning unit learns the parameters of encoding units, decoding units, and a predictor based on an encoder, a decoder, the correct-answer data for the main problem, and the correct-answer data for the auxiliary problem. The encoder connects encoding units, each of which corresponds to an element of the input discrete structure and encodes that element, and includes the predictor, which predicts a solution to the auxiliary problem. The decoder connects decoding units that decode into elements of a discrete structure, and takes as input the code output by the encoder and the solution to the auxiliary problem.

A conversion method according to a fourth invention is a conversion method in a conversion device that solves a main problem of converting an input discrete structure, and includes: a step in which an encoder construction unit, based on the input discrete structure, constructs an encoder that connects encoding units, each corresponding to an element of the discrete structure and encoding that element, and that includes a predictor corresponding to an auxiliary problem predetermined for the main problem; a step in which an encoder calculation unit, based on the constructed encoder, inputs each element of the input discrete structure to the corresponding encoding unit, performs the calculations in sequence, outputs the code of the input discrete structure, and calculates the predictor to predict a solution to the auxiliary problem; and a step in which a decoder calculation unit inputs the output code and the solution to the auxiliary problem to a decoder that connects decoding units that decode into elements of a discrete structure, performs the calculations in sequence, and outputs a discrete structure.

A program according to a fifth invention causes a computer to function as each unit of the encoder learning device according to the first invention.

A program according to a sixth invention causes a computer to function as each unit of the conversion device according to the second invention.

According to the encoder learning device, method, and program of the present invention, correct-answer data for an auxiliary problem predetermined for the main problem is generated from the correct-answer data for the main problem of converting a discrete structure, and the parameters of the encoding units, the decoding units, and the predictor are learned based on the encoder, the decoder, the correct-answer data for the main problem, and the correct-answer data for the auxiliary problem. The encoder connects encoding units, each corresponding to an element of the input discrete structure and encoding that element, and includes the predictor that predicts a solution to the auxiliary problem; the decoder connects decoding units that decode into elements of a discrete structure and takes as input the code output by the encoder and the solution to the auxiliary problem. This has the effect that parameters for converting discrete structures accurately can be learned.

According to the conversion device, method, and program of the present invention, an encoder is constructed based on the input discrete structure; the encoder connects encoding units, each corresponding to an element of the discrete structure and encoding that element, and includes a predictor corresponding to an auxiliary problem predetermined for the main problem. Based on the constructed encoder, each element of the input discrete structure is input to the corresponding encoding unit, the calculations are performed in sequence, the code of the input discrete structure is output, and the predictor is calculated to predict a solution to the auxiliary problem. The output code and the solution to the auxiliary problem are input to the decoder, the calculations are performed in sequence, and a discrete structure is output. This has the effect that discrete structures can be converted accurately.

A diagram showing various examples of conversion problems in natural language processing.
A diagram showing the configuration of the encoder and decoder disclosed in Non-Patent Document 1.
A diagram showing the functions and operators used in the description.
A diagram showing the configuration of the encoder and decoder disclosed in Non-Patent Document 2.
A diagram showing a technique for automatically creating a summary of a given text.
A block diagram showing the configuration of the encoder learning device according to the embodiment of the present invention.
A diagram showing an example of the problem of predicting the set of words contained in the summary.
A diagram showing an example of the problem of predicting the number of words contained in the summary.
A diagram showing an example of the problem of predicting the set of words contained in both the input text and the converted summary.
A diagram showing an example of the configuration of an encoder including the predictor for auxiliary problem 1.
A diagram showing an example of the configuration of an encoder including the predictor for auxiliary problem 2.
A diagram showing an example of a prediction result for auxiliary problem 2.
A diagram showing an example of the configuration of an encoder including the predictor for auxiliary problem 3.
A block diagram showing the configuration of the conversion device according to the embodiment of the present invention.
A diagram showing an example of the output of the encoder and decoder for auxiliary problems 1 to 3.
A diagram showing an example of the output of the encoder and decoder for auxiliary problem 1.
A diagram showing an example of the output of the encoder and decoder for auxiliary problems 1 and 3.
A flowchart showing the encoder learning processing routine in the encoder learning device according to the embodiment of the present invention.
A flowchart showing the conversion processing routine in the conversion device according to the embodiment of the present invention.
A diagram showing an example of the configuration of the encoder and decoder for auxiliary problem 1 when the discrete-structure task is translation.
A diagram showing an example of the configuration of the encoder and decoder for auxiliary problems 1 and 2 when the discrete-structure task is translation.
A diagram showing an example of the configuration of the encoder and decoder for auxiliary problems 1 to 3 when the discrete-structure task is translation.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Overview of the Embodiment of the Present Invention>

First, an overview of the embodiment of the present invention is given.

The embodiment is described as an encoder learning device and a conversion device for solving the main problem of converting input text by summarizing it. As shown in FIG. 5, technology that automatically creates a summary of a given text is an extremely effective means of grasping the gist of the text in a short time.

<Configuration of the Encoder Learning Device According to the Embodiment of the Present Invention>

Next, the configuration of the encoder learning device according to the embodiment of the present invention is described. As shown in FIG. 6, the encoder learning device 100 according to the embodiment can be implemented as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the encoder learning processing routine described later. Functionally, as shown in FIG. 6, the encoder learning device 100 includes an input unit 10, a calculation unit 20, and a parameter DB 44.

The input unit 10 receives correct-answer data for the main problem of converting input text into a summary.

The calculation unit 20 includes an auxiliary problem generation unit 22 and a learning unit 24.

Based on the correct-answer data for the main problem of converting text into a summary received by the input unit 10, the auxiliary problem generation unit 22 generates correct-answer data for auxiliary problems predetermined for the main problem.

The auxiliary problems handled in this embodiment are described here.

An auxiliary problem must be simpler than the main problem, with lower computational cost, and must be related to solving the main problem. It must also be possible to construct the training correct-answer data for the auxiliary problem from the correct-answer data for the main problem.

The auxiliary problem generation unit 22 generates correct-answer data for the following auxiliary problems 1 to 3 based on the correct-answer data for the main problem.

Auxiliary problem 1, shown in FIG. 7, is the problem of predicting the set of words contained in the summary. It is simpler than the main problem because word order does not have to be considered. The training correct-answer data for auxiliary problem 1 is the set of words obtained by discarding word order from the training correct-answer data for the main problem.

Auxiliary problem 2, shown in FIG. 8, is the problem of predicting the number of words contained in the summary. The answer is an integer of 1 or more, and the problem is simpler because the words themselves do not have to be predicted correctly. The training correct-answer data for auxiliary problem 2 is the number of output words in the training correct-answer data for the main problem.

Auxiliary problem 3, shown in FIG. 9, is the problem of predicting the set of words contained in both the input text and the converted summary. It assigns a positive or negative label to each word of the text, and is simpler because paraphrased words do not have to be predicted correctly. The training correct-answer data for auxiliary problem 3 is the set of words obtained as the intersection of the words of the text and the words of the summary in the training correct-answer data for the main problem.
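The three kinds of auxiliary correct-answer data described above can all be derived from one (text, summary) pair of the main problem. The following is a minimal sketch under the assumption that both are already tokenized into word lists; the function name `make_auxiliary_targets` and the example sentences are hypothetical.

```python
def make_auxiliary_targets(source_words, summary_words):
    """Derive correct-answer data for auxiliary problems 1 to 3 from one training pair."""
    # Auxiliary problem 1: the set of words in the summary (word order discarded).
    summary_word_set = set(summary_words)
    # Auxiliary problem 2: the number of words in the summary.
    summary_length = len(summary_words)
    # Auxiliary problem 3: words appearing in both the input text and the summary,
    # plus a positive/negative label for each word of the input text.
    shared_words = summary_word_set & set(source_words)
    source_labels = [w in summary_word_set for w in source_words]
    return summary_word_set, summary_length, shared_words, source_labels

source = "the cat sat on the mat today".split()
summary = "cat rested on mat".split()
word_set, length, shared, labels = make_auxiliary_targets(source, summary)
```

In this example the paraphrased word "rested" belongs to the target of auxiliary problem 1 but not to that of auxiliary problem 3, which only covers words copied from the input.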

Through the processing of the units described below, the learning unit 24 learns the parameters of the encoding units, the decoding units, and the predictors, based on an encoder, a decoder, the correct-answer data for the main problem, and the correct-answer data for auxiliary problems 1 to 3. The encoder connects encoding units, each of which corresponds to a word of the input text and encodes that word, and includes predictors that predict the solutions to the auxiliary problems. The decoder connects decoding units that decode into words of the summary, and takes as input the code output by the encoder and the solutions to the auxiliary problems.

Each encoding unit outputs z_out according to equation (1) or (2) above.

The encoder outputs the vector c_j according to equation (3) above.

The predictors that predict the solutions to the auxiliary problems are described next.

The encoder including the predictor for auxiliary problem 1 is configured as shown in FIG. 10. The predictor for auxiliary problem 1 outputs s^ws as the solution to auxiliary problem 1 according to equation (4) below.

... (4)

The encoder including the predictor for auxiliary problem 2 is configured as shown in FIG. 11. The predictor for auxiliary problem 2 outputs s^wl as the solution to auxiliary problem 2 according to equation (5) below.

... (5)

The encoder including the predictor for auxiliary problem 3 is configured as shown in FIG. 12. The predictor for auxiliary problem 3 outputs s^ow as the solution to auxiliary problem 3 according to equation (6) below.

... (6)

Next, like an encoding unit, each decoding unit of the decoder outputs z_out according to equation (1) or (2) above. The decoder then predicts a word according to equation (7) below, based on the output of the decoding unit and the encoder output c_j.

... (7)

As equation (7) shows, words are predicted taking the solutions to the auxiliary problems into account.

For example, in the computation of the decoder's word generation unit, the main-problem solution o_j is multiplied element by element with the vector ~s, which is calculated from the auxiliary-problem solutions s^ws and s^ow, to obtain ~o_j. In addition, the computation of the word generation unit is carried out with the number of words specified by the auxiliary-problem solution ~s^wl.
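The element-wise multiplication described above can be sketched as follows. Equation (7) is not reproduced in this text, so several points here are assumptions: ~s is taken as an already computed vector over the vocabulary (how it is derived from s^ws and s^ow is not shown), the word with the largest gated score is chosen, and "specifying the number of words" is read as stopping after the predicted length. All function and variable names are hypothetical.

```python
def generate_word(o_j, s_tilde, vocab):
    """Gate the main-problem scores o_j element by element with ~s, then pick a word."""
    o_tilde = [o * s for o, s in zip(o_j, s_tilde)]  # ~o_j = o_j (element-wise) ~s
    best = max(range(len(o_tilde)), key=lambda k: o_tilde[k])
    return vocab[best], o_tilde

def decode(step_outputs, s_tilde, vocab, predicted_length):
    """Generate at most predicted_length words (assumed reading of the length constraint)."""
    words = []
    for o_j in step_outputs[:predicted_length]:
        word, _ = generate_word(o_j, s_tilde, vocab)
        words.append(word)
    return words

vocab = ["it", "is", "rainy"]
s_tilde = [0.1, 0.9, 0.9]  # illustrative values: "it" is judged unlikely to appear
word, _ = generate_word([0.5, 0.3, 0.2], s_tilde, vocab)
```

Without the gate the highest-scoring word would be "it"; with it, the low auxiliary score for "it" moves the choice to "is", which illustrates how an auxiliary prediction can steer the main output.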

The learning unit 24 includes an encoder construction unit 30, an encoder calculation unit 32, a decoder calculation unit 34, a prediction acquisition unit 36, an objective function calculation unit 38, a parameter update unit 40, and an end determination unit 42.

The encoder construction unit 30 constructs an encoder based on the input text.

Based on the encoder constructed by the encoder construction unit 30 and on the initial or updated values of the parameters, the encoder calculation unit 32 inputs each word of the input text to its corresponding encoding unit, computes the units in sequence, and outputs the code of the input text. It also evaluates the predictors to predict the solutions of auxiliary problems 1 to 3.

The decoder calculation unit 34 inputs the code and the solutions of auxiliary problems 1 to 3 output by the encoder calculation unit 32 to the decoder, computes the decoding units in sequence, and outputs the summary sentence obtained by solving the main problem together with auxiliary problems 1 to 3.

The prediction acquisition unit 36 acquires, as the prediction result, the summary sentence that the decoder calculation unit 34 computed by solving the main problem and auxiliary problems 1 to 3.

The objective function calculation unit 38 calculates the value of an objective function expressed in terms of the summary sentence acquired by the prediction acquisition unit 36 and the correct data for the main problem.

The objective functions used to train the predictors for auxiliary problems 1 to 3 are described next.

The objective function for auxiliary problem 1 can be formulated as a minimization problem that searches for the s^ws minimizing the loss function Ψ of equation (8) below.

... (8)

The function Ψ corresponds to the negative cross entropy between the correct answer c^ws and the current system's prediction s^ws. The value of s^ws is determined by the parameters.

The objective function for auxiliary problem 2 can be formulated as a minimization problem that searches for the s^wl minimizing the loss function Ψ of equation (9) below.

... (9)

The function Ψ corresponds to the negative cross entropy between the correct answer c^wl and the current system's prediction s^wl. The value of s^wl is determined by the parameters. As shown in FIG. 13, the loss takes its minimum value of 0 whenever the prediction falls within the range [c^wl - 0.5, c^wl + 0.5].
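The zero-loss band around the correct length can be made concrete with a short sketch. Equation (9) is not reproduced in this text, so the piecewise-linear form below is an assumption chosen only to match the stated property that the loss is 0 inside [c^wl - 0.5, c^wl + 0.5]; the function name `length_loss` and the `margin` parameter are hypothetical.

```python
def length_loss(s_wl, c_wl, margin=0.5):
    """Loss on a predicted length s_wl against the correct length c_wl.

    Zero whenever the prediction lies within `margin` of the correct value,
    growing linearly outside that band (the linear growth is an assumption).
    """
    return max(0.0, abs(s_wl - c_wl) - margin)


inside = length_loss(10.3, 10)   # prediction within the band
edge = length_loss(9.5, 10)      # prediction exactly on the band edge
outside = length_loss(12, 10)    # prediction two words too long
```

A band of half a word on either side means that any prediction that rounds to the correct integer length is not penalized.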

The objective function for auxiliary problem 3 can be formulated as a minimization problem that searches for the s^ow minimizing the loss function Ψ of equation (10) below.

... (10)

The function Ψ corresponds to the negative cross entropy between the correct answer c^ow and the current system's prediction s^ow. The value of s^ow is determined by the parameters.

As shown in equation (11) below, the objective functions of the main problem and of auxiliary problems 1 to 3 are all combined into a single objective, and they are learned simultaneously.

... (11)

The learning procedure is as follows. (1) With the current parameters, obtain prediction results for the input correct data of the main problem and auxiliary problems 1 to 3 (the processing from the encoder calculation unit 32 through the prediction acquisition unit 36). (2) Using the prediction result (the output summary sentence) and the correct data of the main problem and auxiliary problems 1 to 3, calculate the loss function of the main problem and of each auxiliary problem. (3) Calculate the gradient from the loss values. (4) Obtain the gradient for each individual parameter according to the chain rule.

The parameter update unit 40 updates the parameters based on the value of the objective function calculated by the objective function calculation unit 38. Specifically, the parameters are updated according to the values obtained in step (4) of the learning procedure of the objective function calculation unit 38.

The end determination unit 42 repeats the calculation by the encoder calculation unit 32, the calculation by the decoder calculation unit 34, the acquisition by the prediction acquisition unit 36, the calculation by the objective function calculation unit 38, and the update by the parameter update unit 40 until a predetermined iteration end condition is satisfied. It then stores the parameters as finally updated by the parameter update unit 40 in the parameter DB 44.
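The loop formed by the objective function calculation, the parameter update, and the end determination can be sketched on a toy problem. This is a minimal sketch under stated assumptions, not the patent's procedure: equation (11) is not reproduced in this text, so the combined objective is assumed to be an unweighted sum; central finite differences stand in for the chain-rule gradients; the toy quadratic losses, the learning rate, and the stopping rule are all hypothetical.

```python
def joint_loss(theta, losses):
    """Combine the main and auxiliary losses (unweighted sum is an assumption)."""
    return sum(loss(theta) for loss in losses)


def numerical_grad(f, theta, eps=1e-6):
    """Central finite differences, used here in place of chain-rule gradients."""
    grad = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
        down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def train(theta, losses, lr=0.1, max_iters=500, tol=1e-10):
    """Repeat loss evaluation and parameter update until the end condition holds."""
    def objective(t):
        return joint_loss(t, losses)

    for _ in range(max_iters):
        grad = numerical_grad(objective, theta)
        new_theta = [t - lr * g for t, g in zip(theta, grad)]
        # Hypothetical end condition: the objective has stopped changing.
        converged = abs(objective(theta) - objective(new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


def main_loss(theta):
    """Toy stand-in for the main-problem loss."""
    return (theta[0] - 1.0) ** 2


def aux_loss(theta):
    """Toy stand-in for one auxiliary-problem loss."""
    return (theta[1] + 2.0) ** 2


learned = train([0.0, 0.0], [main_loss, aux_loss])
```

Because all losses are summed before the gradient is taken, a single update step moves the shared parameters with respect to the main problem and the auxiliary problems at once, which is the sense in which they are learned simultaneously.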

<Configuration of the Conversion Device According to the Embodiment of the Present Invention>

Next, the configuration of the conversion device according to the embodiment of the present invention will be described. As shown in FIG. 14, the conversion device 200 according to the embodiment of the present invention can be implemented as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the conversion processing routine described later. Functionally, as shown in FIG. 14, the conversion device 200 includes an input unit 210, a calculation unit 220, and an output unit 250.

The input unit 210 receives text having a discrete structure that is to be summarized.

The calculation unit 220 includes an encoder construction unit 230, an encoder calculation unit 232, a decoder calculation unit 234, and a parameter DB 244.

Based on the text received by the input unit 210, the encoder construction unit 230 constructs an encoder in which encoding units, one for each word of the text, are connected, and which includes predictors corresponding to auxiliary problems 1 to 3 predetermined for the main problem. The encoder may be constructed in the same way as by the encoder construction unit 30 of the encoder learning device 100.

The parameter DB 244 stores the parameters learned by the encoder learning device 100.

Based on the encoder constructed by the encoder construction unit 230, the encoder calculation unit 232 inputs each word of the input text to its corresponding encoding unit, computes the units in sequence, and outputs the code of the input text. Using the parameters stored in the parameter DB 244, it also evaluates the predictors according to equations (4) to (6) above to predict the solutions of auxiliary problems 1 to 3.

The decoder calculation unit 234 inputs the code and the solutions of the auxiliary problems output by the encoder calculation unit 232 to the decoder, computes the decoding units in sequence, and outputs to the output unit 250 the summary sentence obtained by solving the main problem and auxiliary problems 1 to 3. The specific processing up to the output of each word is the same as that described for the learning unit 24 of the encoder learning device 100.

FIG. 15 shows an example of the computation of auxiliary problems 1 to 3 by the predictors and of the summary sentence output by the decoder. In the computation of the decoder's word generation unit, the number of words is fixed by the solution ~s^wl of auxiliary problem 2, and ~o_j is obtained by multiplying the main-problem solution o_j, element by element, by ~s. Here ~s is obtained by multiplying the solution s^ws of auxiliary problem 1 by the solution s^ow of auxiliary problem 3. When the number of words is fixed using the solution ~s^wl of auxiliary problem 2, the computation ends once that many word generation units have been evaluated; for example, if the specified number of words is 10, ten word generation units are computed. When the solution s^ow of auxiliary problem 3 is used and a word is in the output-side word list, the probability of the predicted word set is corrected for the words in that list. When a word is not in the output-side word list (an unknown word) and the output is judged to be an unknown word (UNK), it is replaced with the input word that has the highest attention probability a_{i,j}, provided that input word is predicted to be the same word.
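The unknown-word replacement and the length limit described above can be sketched as follows. This is an illustrative reading, not the patent's implementation: the condition that the input word is "predicted to be the same word" is modeled here as membership in a hypothetical set `overlap_words` derived from the auxiliary problem 3 prediction, `attention[j][i]` is assumed to hold the weight a_{i,j} of input word i at output position j, and rounding the predicted length to an integer is an assumption.

```python
UNK = "<UNK>"


def replace_unknowns(output_words, attention, input_words, overlap_words):
    """Replace each UNK with the most-attended input word when that word is allowed.

    attention[j][i] is the attention weight of input word i at output position j.
    overlap_words is a hypothetical set standing in for the auxiliary problem 3
    prediction of which input words also appear in the output.
    """
    result = []
    for j, word in enumerate(output_words):
        if word == UNK:
            i_best = max(range(len(input_words)), key=lambda i: attention[j][i])
            candidate = input_words[i_best]
            if candidate in overlap_words:
                word = candidate
        result.append(word)
    return result


def truncate_to_length(words, predicted_length):
    """Stop generation after the number of words given by the length prediction."""
    return words[: max(0, round(predicted_length))]


input_words = ["tokyo", "stocks", "rose"]
output_words = [UNK, "rose", UNK]
attention = [
    [0.8, 0.1, 0.1],  # output position 0 attends mostly to "tokyo"
    [0.1, 0.2, 0.7],  # output position 1 attends mostly to "rose"
    [0.2, 0.7, 0.1],  # output position 2 attends mostly to "stocks"
]

fixed = replace_unknowns(output_words, attention, input_words, {"tokyo", "rose"})
summary = truncate_to_length(fixed, 2.2)
```

The first UNK is replaced by "tokyo", while the last one is left as it is because its most-attended input word is not in the predicted overlap set.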

FIG. 16 shows the case where only the solution of auxiliary problem 1 is considered, and FIG. 17 shows the case where the solutions of auxiliary problems 1 and 3 are considered.

<Operation of the Encoder Learning Device According to the Embodiment of the Present Invention>

Next, the operation of the encoder learning device 100 according to the embodiment of the present invention will be described. When the input unit 10 receives correct data for the main problem of converting input text into a summary sentence, the encoder learning device 100 executes the encoder learning processing routine shown in FIG. 18.

First, in step S100, correct data for auxiliary problems 1 to 3, which are predetermined for the main problem, is generated from the correct data received by the input unit 10 for the main problem of converting text into a summary sentence.

Next, in step S102, an encoder is constructed from the input text by connecting encoding units, one for each word of the input text, each of which encodes its word.

In step S104, based on the encoder constructed in step S102 and on the parameters, each word of the input text is input to its corresponding encoding unit, the units are computed in sequence, and the code of the input text is output. The predictors are also evaluated to predict the solutions of the auxiliary problems.

In step S106, the code and the solutions of the auxiliary problems output in step S104 are input to the decoder, the decoding units are computed in sequence according to equation (7) above, and the summary sentence obtained by solving the main problem and the auxiliary problems is output.

In step S108, the summary sentence computed in step S106 by solving the main problem and the auxiliary problems is acquired as the prediction result.

In step S110, the value of the objective function, expressed in terms of the summary sentence acquired in step S108 and the correct data for the main problem, is calculated according to equation (11) above.

In step S112, the parameters are updated based on the value of the objective function calculated in step S110.

In step S114, it is determined whether the iteration end condition is satisfied. If it is, the parameters updated in step S112 are stored in the parameter DB 44 and the processing ends; if it is not, the processing returns to step S104 and repeats. When multiple items of correct data for the main problem are received, steps S100 to S114 may be repeated for each item.
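Step S100 of the routine above, which derives auxiliary correct data from the main-problem correct data, can be illustrated with a short sketch. The definitions used here are assumptions that follow the list of auxiliary problems in the claims: c^ws is taken to be the set of words in the correct summary, c^wl the number of words in it, and c^ow the words that occur in both the input and the summary (if the intended set is instead the union, `&` becomes `|`). The function name `make_auxiliary_targets` and the example sentences are hypothetical.

```python
def make_auxiliary_targets(input_words, summary_words):
    """Derive auxiliary correct data from one main-problem training pair.

    Returns (c_ws, c_wl, c_ow) under the assumed definitions:
      c_ws: set of words appearing in the correct summary
      c_wl: number of words in the correct summary
      c_ow: words appearing in both the input and the correct summary
    """
    c_ws = set(summary_words)
    c_wl = len(summary_words)
    c_ow = set(input_words) & set(summary_words)
    return c_ws, c_wl, c_ow


input_words = ["tokyo", "stocks", "rose", "sharply", "on", "monday"]
summary_words = ["tokyo", "shares", "rose"]

c_ws, c_wl, c_ow = make_auxiliary_targets(input_words, summary_words)
```

No extra annotation is needed for these targets, since each one is computed directly from a pair that is already available for the main problem.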

As described above, the encoder learning device according to the embodiment of the present invention generates correct data for the predetermined auxiliary problems from the correct data for the main problem of summarizing input text. It then learns the parameters of the encoding units, the decoding units, and the predictors based on: an encoder in which encoding units, one for each word of the input text, are connected and which includes predictors that predict the solutions of the auxiliary problems; a decoder in which decoding units that decode into words are connected and which takes as input the code output by the encoder and the solutions of the auxiliary problems; the correct data for the main problem; and the correct data for the auxiliary problems. In this way, parameters for converting text into a summary sentence with high accuracy can be learned.

<Operation of the Conversion Device According to the Embodiment of the Present Invention>

Next, the operation of the conversion device 200 according to the embodiment of the present invention will be described. When the input unit 210 receives text having a discrete structure that is to be summarized, the conversion device 200 executes the conversion processing routine shown in FIG. 19.

First, in step S200, based on the text received by the input unit 210, an encoder is constructed in which encoding units, one for each word of the text, are connected, and which includes predictors corresponding to the auxiliary problems predetermined for the main problem.

Next, in step S202, based on the encoder constructed in step S200, each word of the input text is input to its corresponding encoding unit, the units are computed in sequence according to equation (7) above, and the code of the input text is output. Using the parameters stored in the parameter DB 244, the predictors are also evaluated to predict the solutions of the auxiliary problems.

In step S204, the code and the solutions of the auxiliary problems output in step S202 are input to the decoder, the decoding units are computed in sequence, the summary sentence obtained by solving the main problem and the auxiliary problems is output to the output unit 250, and the processing ends.

As described above, the conversion device according to the embodiment of the present invention constructs, from the input text, an encoder in which encoding units, one for each word of the input text, are connected, and which includes predictors corresponding to the auxiliary problems predetermined for the main problem. Based on the constructed encoder, it inputs each word of the input text to its corresponding encoding unit, computes the units in sequence, outputs the code of the input text, and evaluates the predictors to predict the solutions of the auxiliary problems. It then inputs the output code and the solutions of the auxiliary problems to the decoder, computes the decoding units in sequence, and outputs a summary sentence. In this way, text can be converted into a summary sentence with high accuracy.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, the encoder learning device 100 of the embodiment described above generates correct data for auxiliary problems 1 to 3 and learns the parameters used in the encoder and the decoder, but the invention is not limited to this. Correct data may be generated independently for each of auxiliary problems 1 to 3 and used to learn the parameters used in the encoder and the decoder. Correct data for auxiliary problems other than auxiliary problems 1 to 3 may also be generated and used to learn the parameters.

The conversion device 200 of the embodiment described above was explained using the example of predicting the solutions of auxiliary problems 1 to 3, but the invention is not limited to this, and solutions of auxiliary problems other than auxiliary problems 1 to 3 may be predicted.

The embodiment described above takes text as input and outputs a summary sentence, but the invention is not limited to this: the method according to the embodiment can be applied in any field in which the problem is one of converting a discrete structure. For example, it can be applied to the main problem of translating text into another language. In that case, the encoder and the decoder can be configured as shown in FIGS. 20 to 22. FIG. 20 shows an encoder and a decoder that consider the solution of auxiliary problem 1, FIG. 21 shows an encoder and a decoder that consider the solutions of auxiliary problems 1 and 2, and FIG. 22 shows an encoder and a decoder that consider the solutions of auxiliary problems 1 to 3.

Beyond language processing as described above, the present invention can be applied in the same way to other targets that have a discrete structure, such as graphs.

10, 210 Input unit
20, 220 Calculation unit
22 Auxiliary problem generation unit
24 Learning unit
30, 230 Encoder construction unit
32, 232 Encoder calculation unit
34, 234 Decoder calculation unit
36 Prediction acquisition unit
38 Objective function calculation unit
40 Parameter update unit
42 End determination unit
100 Encoder learning device
200 Conversion device
250 Output unit

Claims

An encoder learning device including:
an auxiliary problem generation unit that generates, based on correct data for a main problem of converting an input discrete structure, correct data for an auxiliary problem predetermined for the main problem; and
a learning unit that learns parameters of the encoding units, the decoding units, and the predictor based on: an encoder in which encoding units, each corresponding to an element of the input discrete structure and encoding that element, are connected, the encoder including a predictor that predicts a solution of the auxiliary problem; a decoder in which decoding units that decode into elements of a discrete structure are connected, the decoder taking as input the code output by the encoder and the solution of the auxiliary problem; the correct data for the main problem; and the correct data for the auxiliary problem.

The encoder learning device according to claim 1, wherein the learning unit includes:
an encoder construction unit that constructs the encoder based on the input discrete structure;
an encoder calculation unit that, based on the constructed encoder and the initial values of the parameters or the updated parameters, inputs each element of the input discrete structure to the corresponding encoding unit, computes the units in sequence, outputs the code of the input discrete structure, and evaluates the predictor to predict the solution of the auxiliary problem;
a decoder calculation unit that inputs the output code and the solution of the auxiliary problem to the decoder, computes in sequence, and outputs a discrete structure;
an objective function calculation unit that calculates the value of an objective function expressed in terms of the discrete structure output by the decoder calculation unit and the correct data for the main problem;
a parameter update unit that updates the parameters based on the calculated value of the objective function; and
an end determination unit that repeats the calculation by the encoder calculation unit, the calculation by the decoder calculation unit, the calculation by the objective function calculation unit, and the update by the parameter update unit until a predetermined iteration end condition is satisfied.

A conversion device for solving a main problem of converting an input discrete structure, the conversion device including:
an encoder construction unit that constructs, based on the input discrete structure, an encoder in which encoding units, each corresponding to an element of the discrete structure and encoding that element, are connected, the encoder including a predictor corresponding to an auxiliary problem predetermined for the main problem;
an encoder calculation unit that, based on the constructed encoder, inputs each element of the input discrete structure to the corresponding encoding unit, computes the units in sequence, outputs the code of the input discrete structure, and evaluates the predictor to predict the solution of the auxiliary problem; and
a decoder calculation unit that inputs the output code and the solution of the auxiliary problem to a decoder in which decoding units that decode into elements of a discrete structure are connected, computes in sequence, and outputs a discrete structure.

The conversion device according to claim 3, wherein the auxiliary problem is at least one of: a problem of predicting the set of elements included in the converted discrete structure; a problem of predicting the number of elements included in the converted discrete structure; a problem of predicting the set of elements included in any of the input discrete structure and the converted discrete structure; and another auxiliary problem that can be learned using correct data generated from the correct data for the main problem.

An encoder learning method including:
a step in which an auxiliary problem generation unit generates, based on correct data for a main problem of converting an input discrete structure, correct data for an auxiliary problem predetermined for the main problem; and
a step in which a learning unit learns parameters of the encoding units and the predictor based on: an encoder in which encoding units, each corresponding to an element of the input discrete structure and encoding that element, are connected, the encoder including a predictor that predicts a solution of the auxiliary problem; a decoder in which decoding units that decode into elements of a discrete structure are connected, the decoder taking as input the code output by the encoder and the solution of the auxiliary problem; the correct data for the main problem; and the correct data for the auxiliary problem.

A conversion method in a conversion device for solving a main problem of converting an input discrete structure, the conversion method including:
a step in which an encoder construction unit constructs, based on the input discrete structure, an encoder in which encoding units, each corresponding to an element of the discrete structure and encoding that element, are connected, the encoder including a predictor corresponding to an auxiliary problem predetermined for the main problem;
a step in which an encoder calculation unit, based on the constructed encoder, inputs each element of the input discrete structure to the corresponding encoding unit, computes the units in sequence, outputs the code of the input discrete structure, and evaluates the predictor to predict the solution of the auxiliary problem; and
a step in which a decoder calculation unit inputs the output code and the solution of the auxiliary problem to a decoder in which decoding units that decode into elements of a discrete structure are connected, computes in sequence, and outputs a discrete structure.

A program for causing a computer to function as each unit of the encoder learning device according to claim 1.

A program for causing a computer to function as each unit of the conversion device according to claim 3.