JP2018185771A

JP2018185771A - Sentence pair classification apparatus, sentence pair classification learning apparatus, method, and program

Info

Publication number: JP2018185771A
Application number: JP2017088955A
Authority: JP
Inventors: Kyosuke Nishida; Kugatsu Sadamitsu; Yoshihiro Matsuo; Hisako Asano
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-04-27
Filing date: 2017-04-27
Publication date: 2018-11-22
Anticipated expiration: 2037-04-27
Also published as: JP6738769B2

Abstract

PROBLEM TO BE SOLVED: To obtain a class relating to the relationship of a sentence pair in consideration of the interpretation of words. SOLUTION: In each layer of a neural net, a convolution unit 238 outputs, for each sentence of the sentence pair and as the output of layer i, a feature matrix F_j^(i+1) obtained by performing convolution processing on a feature matrix F_j output from a word vectorization unit 32, a feature matrix F_j^(i)' output by a sentence pair attention unit 32, and a feature matrix F_j^(i)'' output by an interpretation-extended sentence pair attention unit 36. A classification unit 240 classifies each sentence pair included in a set of sentence pairs into a class relating to the relationship of the sentence pair on the basis of the feature matrices F_1 and F_2 output for the sentence pair by the last layer B of the neural net. SELECTED DRAWING: Figure 6

Description

The present invention relates to a sentence pair classification apparatus, a sentence pair classification learning apparatus, a method, and a program, and in particular to a sentence pair classification apparatus, a sentence pair classification learning apparatus, a method, and a program for classifying sentence pairs of two or more sentences into classes.

If artificial intelligence could accurately classify the relationship class of a sentence pair, for example by determining whether a sentence answers a question (answer sentence selection), whether two sentences have the same meaning (paraphrase identification), or whether sentence 2 can be inferred from sentence 1 (entailment recognition), the capability could be applied to a wide range of services such as information retrieval, question answering, and intelligent agent dialogue.

Techniques such as that of Non-Patent Document 1 have been proposed as conventional methods for sentence pair classification.

Conventional methods such as that of Non-Patent Document 1 compute a matrix of vector similarities between the words contained in each sentence, and classify the relationship class of the two sentences based on this similarity matrix. The word vectors can be learned from a large document corpus by a method such as word2vec, described in Non-Patent Document 2.
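The word-by-word similarity matrix described above can be illustrated with a minimal sketch. The choice of cosine similarity, the function names, and the toy two-dimensional vectors are assumptions made for illustration; this section does not state which similarity function the conventional method uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_matrix(sentence1, sentence2, vectors):
    """Rows index the words of sentence1, columns index the words of sentence2."""
    return [[cosine(vectors[w1], vectors[w2]) for w2 in sentence2]
            for w1 in sentence1]

# Toy vectors invented for illustration only (not learned from any corpus).
toy_vectors = {
    "ticket": [1.0, 0.0],
    "admission_ticket": [0.9, 0.1],
    "where": [0.0, 1.0],
}
sim = similarity_matrix(["ticket", "where"],
                        ["admission_ticket", "where"],
                        toy_vectors)
```

With these toy values, the entry for "ticket" and "admission_ticket" is close to 1, which is the property the conventional method relies on when two sentences use near-synonymous words.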

Non-Patent Document 1: Wenpeng Yin, Hinrich Schütze, Bing Xiang, Bowen Zhou: ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs. Transactions of the Association for Computational Linguistics, Volume 4: 259-272 (2016)
Non-Patent Document 2: Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.

By handling words as vectors, the conventional methods can make correct judgments in tasks such as determining that the two sentences "Where can I buy baseball tickets?" and "Where is the sales counter for baseball admission tickets?" have the same meaning (are paraphrases), because the word vectors of "ticket" and "admission ticket" are highly similar.

However, when judging whether "Until when can a cooling-off be exercised?" and "How long is the period during which a contract can be cancelled unconditionally?" are paraphrases, the similarity between the word "cooling-off" and each of the words "no", "condition", "contract", and "cancel" is low, which reduces the accuracy of the paraphrase judgment. The same problem arises in the answer sentence selection and entailment recognition tasks.

The present invention has been made to solve the above problem, and an object thereof is to provide a sentence pair classification apparatus, method, and program capable of obtaining a class relating to the relationship of a sentence pair in consideration of word interpretations (dictionary-style definitions of words).

Another object is to provide a sentence pair classification learning apparatus, method, and program capable of learning the parameters for obtaining a class relating to the relationship of a sentence pair in consideration of word interpretations.

To achieve the above object, a sentence pair classification apparatus according to a first invention includes: a word division unit that divides each sentence of a sentence pair into a sequence of words; a word vectorization unit that, for each sentence of the sentence pair, outputs a feature matrix relating to the words of the sentence, obtained by vectorizing each of the divided words based on a word vector storage unit that stores a vector for each word; a sentence pair attention unit that, in each layer of a neural net, outputs for each sentence of the sentence pair a feature matrix relating to matching between the feature matrices relating to the words of the respective sentences of the sentence pair, or between the feature matrices output by the preceding layer for the respective sentences of the sentence pair; an interpretation-extended sentence pair attention unit that, in each layer of the neural net, for chunks formed by concatenating as many words as the number of words corresponding to the layer, outputs for each sentence of the sentence pair a feature matrix relating to matching between a feature matrix relating to the words contained in the interpretation sentence of a chunk contained in one sentence of the sentence pair, the interpretation sentence being obtained by searching an interpretation sentence storage unit that stores interpretation sentences for chunks, and a feature matrix relating to the words of the other sentence of the sentence pair; a convolution unit that, in each layer of the neural net, outputs for each sentence of the sentence pair, as the output of the layer, a feature matrix obtained by performing convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the interpretation-extended sentence pair attention unit; and a class classification unit that classifies the sentence pair into a class relating to the relationship of the sentence pair, based on the feature matrices output for the sentence pair by the last layer of the neural net.

In the sentence pair classification apparatus according to the first invention, the word vectorization unit, the sentence pair attention unit, the interpretation-extended sentence pair attention unit, and the convolution unit may obtain the feature matrices using parameter matrices learned in advance.

A sentence pair classification learning apparatus according to a second invention includes: a word division unit that, for each sentence pair included in a sentence pair set consisting of sentence pairs each assigned a correct-answer label indicating a class relating to the relationship of the sentence pair, divides each sentence of the sentence pair into a sequence of words; a word vectorization unit that, for each sentence of each sentence pair included in the sentence pair set, outputs a feature matrix relating to the words of the sentence, obtained by vectorizing each of the divided words based on a word vector storage unit that stores a vector for each word; a sentence pair attention unit that, in each layer of a neural net, outputs for each sentence of the sentence pair a feature matrix relating to matching between the feature matrices relating to the words of the respective sentences of the sentence pair, or between the feature matrices output by the preceding layer for the respective sentences of the sentence pair; an interpretation-extended sentence pair attention unit that, in each layer of the neural net, for chunks formed by concatenating as many words as the number of words corresponding to the layer, outputs for each sentence of the sentence pair a feature matrix relating to matching between a feature matrix relating to the words contained in the interpretation sentence of a chunk contained in one sentence of the sentence pair, the interpretation sentence being obtained by searching an interpretation sentence storage unit that stores interpretation sentences for chunks, and a feature matrix relating to the words of the other sentence of the sentence pair; a convolution unit that, in each layer of the neural net, outputs for each sentence of the sentence pair, as the output of the layer, a feature matrix obtained by performing convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the interpretation-extended sentence pair attention unit; a class classification unit that, for each sentence pair included in the sentence pair set, classifies the sentence pair into a class relating to the relationship of the sentence pair based on the feature matrices output for the sentence pair by the last layer of the neural net, and calculates a loss for the classification result based on the classification result and the correct-answer label; and a learning unit that, based on the losses calculated for the respective sentence pairs included in the sentence pair set, learns the parameter matrices used to obtain the feature matrices in the sentence pair attention unit, the interpretation-extended sentence pair attention unit, and the convolution unit.

A sentence pair classification method according to a third invention includes the steps of: a word division unit dividing each sentence of a sentence pair into a sequence of words; a word vectorization unit outputting, for each sentence of the sentence pair, a feature matrix relating to the words of the sentence, obtained by vectorizing each of the divided words based on a word vector storage unit that stores a vector for each word; a sentence pair attention unit outputting, in each layer of a neural net and for each sentence of the sentence pair, a feature matrix relating to matching between the feature matrices relating to the words of the respective sentences of the sentence pair, or between the feature matrices output by the preceding layer for the respective sentences of the sentence pair; an interpretation-extended sentence pair attention unit outputting, in each layer of the neural net and for each sentence of the sentence pair, for chunks formed by concatenating as many words as the number of words corresponding to the layer, a feature matrix relating to matching between a feature matrix relating to the words contained in the interpretation sentence of a chunk contained in one sentence of the sentence pair, the interpretation sentence being obtained by searching an interpretation sentence storage unit that stores interpretation sentences for chunks, and a feature matrix relating to the words of the other sentence of the sentence pair; a convolution unit outputting, in each layer of the neural net and for each sentence of the sentence pair, as the output of the layer, a feature matrix obtained by performing convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the interpretation-extended sentence pair attention unit; and a class classification unit classifying the sentence pair into a class relating to the relationship of the sentence pair, based on the feature matrices output for the sentence pair by the last layer of the neural net.

In the sentence pair classification method according to the third invention, the word vectorization unit, the sentence pair attention unit, the interpretation-extended sentence pair attention unit, and the convolution unit may obtain the feature matrices using parameter matrices learned in advance.

A sentence pair classification learning method according to a fourth invention includes the steps of: a word division unit dividing, for each sentence pair included in a sentence pair set consisting of sentence pairs each assigned a correct-answer label indicating a class relating to the relationship of the sentence pair, each sentence of the sentence pair into a sequence of words; a word vectorization unit outputting, for each sentence of each sentence pair included in the sentence pair set, a feature matrix relating to the words of the sentence, obtained by vectorizing each of the divided words based on a word vector storage unit that stores a vector for each word; a sentence pair attention unit outputting, in each layer of a neural net and for each sentence of the sentence pair, a feature matrix relating to matching between the feature matrices relating to the words of the respective sentences of the sentence pair, or between the feature matrices output by the preceding layer for the respective sentences of the sentence pair; an interpretation-extended sentence pair attention unit outputting, in each layer of the neural net and for each sentence of the sentence pair, for chunks formed by concatenating as many words as the number of words corresponding to the layer, a feature matrix relating to matching between a feature matrix relating to the words contained in the interpretation sentence of a chunk contained in one sentence of the sentence pair, the interpretation sentence being obtained by searching an interpretation sentence storage unit that stores interpretation sentences for chunks, and a feature matrix relating to the words of the other sentence of the sentence pair; a convolution unit outputting, in each layer of the neural net and for each sentence of the sentence pair, as the output of the layer, a feature matrix obtained by performing convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the interpretation-extended sentence pair attention unit; a class classification unit classifying, for each sentence pair included in the sentence pair set, the sentence pair into a class relating to the relationship of the sentence pair based on the feature matrices output for the sentence pair by the last layer of the neural net, and calculating a loss for the classification result based on the classification result and the correct-answer label; and a learning unit learning, based on the losses calculated for the respective sentence pairs included in the sentence pair set, the parameter matrices used to obtain the feature matrices in the sentence pair attention unit, the interpretation-extended sentence pair attention unit, and the convolution unit.

A program according to a fifth invention is a program for causing a computer to function as each unit of the sentence pair classification apparatus according to the first invention or the sentence pair classification learning apparatus according to the second invention.

According to the sentence pair classification apparatus, method, and program of the present invention, in each layer of the neural net, a feature matrix obtained by performing convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the interpretation-extended sentence pair attention unit is output for each sentence of the sentence pair as the output of the layer, and the class classification unit classifies each sentence pair included in the sentence pair set into a class relating to the relationship of the sentence pair based on the feature matrices output for the sentence pair by the last layer of the neural net. This yields the effect that a class relating to the relationship of a sentence pair can be obtained in consideration of word interpretations.

According to the sentence pair classification learning apparatus, method, and program of the present invention, in each layer of the neural net, a feature matrix obtained by performing convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the interpretation-extended sentence pair attention unit is output for each sentence of the sentence pair as the output of the layer; the class classification unit classifies each sentence pair included in the sentence pair set into a class relating to the relationship of the sentence pair based on the feature matrices output for the sentence pair by the last layer of the neural net, and calculates a loss for the classification result based on the classification result and the correct-answer label; and the learning unit learns the parameter matrices used to obtain the feature matrices, based on the losses calculated for the respective sentence pairs included in the sentence pair set. This yields the effect that the parameters for obtaining a class relating to the relationship of a sentence pair in consideration of word interpretations can be learned.

FIG. 1 is a block diagram showing the configuration of the sentence pair classification learning apparatus according to the embodiment of the present invention.
FIG. 2 is a diagram showing an example of the word vector storage unit.
FIG. 3 is a diagram showing an example of the interpretation sentence storage unit.
FIGS. 4 and 5 are flowcharts showing the sentence pair classification learning processing routine in the sentence pair classification learning apparatus according to the embodiment of the present invention.
FIG. 6 is a block diagram showing the configuration of the sentence pair classification apparatus according to the embodiment of the present invention.
FIGS. 7 and 8 are flowcharts showing the sentence pair classification processing routine in the sentence pair classification apparatus according to the embodiment of the present invention.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

<Overview of the Embodiment of the Present Invention>

First, an overview of the embodiment of the present invention will be given.

In view of the above problem of the prior art, the embodiment of the present invention uses the interpretation sentence (a dictionary-style definition) of each word that appears in an input sentence. For example, a word similarity matrix is computed between the interpretation sentence of "cooling-off", namely "a legal system under which, for certain contracts only, an application can be withdrawn or the contract cancelled unconditionally and without explanation within a certain period", and the sentence "How long is the period during which a contract can be cancelled unconditionally?", and this matrix is used for sentence pair classification. This makes it possible to classify sentence pairs with high accuracy without depending strongly on the accuracy of the word vectors.

<Configuration of the Sentence Pair Classification Learning Apparatus According to the Embodiment of the Present Invention>

Next, the configuration of the sentence pair classification learning apparatus according to the embodiment of the present invention will be described. As shown in FIG. 1, the sentence pair classification learning apparatus 100 according to the embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the sentence pair classification learning processing routine described later. Functionally, as shown in FIG. 1, the sentence pair classification learning apparatus 100 includes an input unit 10 and a computation unit 20.

The input unit 10 receives a sentence pair set consisting of sentence pairs each assigned a correct-answer label indicating a class relating to the relationship of the sentence pair.

The computation unit 20 includes a word vector storage unit 22, an interpretation sentence storage unit 24, a parameter matrix storage unit 26, a word division unit 30, a word vectorization unit 32, a sentence pair attention unit 34, an interpretation-extended sentence pair attention unit 36, a convolution unit 38, a class classification unit 40, and a learning unit 42. The processing of each unit is described in detail in the description of the operation.

As shown in FIG. 2, the word vector storage unit 22 stores pairs of a word x and a word vector e, where e has E^(1) dimensions.

The interpretation sentence storage unit 24 stores chunk strings, each formed by concatenating at least one word, together with the interpretation sentence corresponding to each chunk string. For example, as shown in FIG. 3, the interpretation sentence corresponding to the chunk string "cooling-off" is stored.
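The two storage units can be pictured as simple key-value maps. The following is a minimal sketch under that assumption; the variable names, the value chosen for E^(1), the toy vectors, and the use of in-memory dictionaries are all hypothetical, and the interpretation sentence is the "cooling-off" example from the overview.

```python
E1 = 4  # assumed value of the word vector dimension E^(1)

# Word vector storage unit 22: word x -> word vector e (toy values).
word_vectors = {
    "contract": [0.2, 0.1, 0.0, 0.7],
    "cancel": [0.3, 0.0, 0.1, 0.6],
}

# Interpretation sentence storage unit 24: chunk string -> interpretation sentence.
interpretation_sentences = {
    "cooling-off": (
        "a legal system under which, for certain contracts only, an application "
        "can be withdrawn or the contract cancelled unconditionally and without "
        "explanation within a certain period"
    ),
}

def lookup_interpretation(chunk):
    """Return the stored interpretation sentence for a chunk, or None if absent."""
    return interpretation_sentences.get(chunk)

# Every stored vector is expected to have exactly E^(1) dimensions.
all_dims_ok = all(len(e) == E1 for e in word_vectors.values())
```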

The parameter matrix storage unit 26 stores the parameter matrices W_a^(i) used by the sentence pair attention unit 34, W_b^(i) used by the interpretation-extended sentence pair attention unit 36, W_c^(i) and b_c^(i) used by the convolution unit 38, and W_d and b_d used by the class classification unit 40 (i = 1, ..., B).

For each sentence pair included in the sentence pair set received by the input unit 10, the word division unit 30 divides each sentence of the sentence pair into a sequence of words.

For each sentence of each sentence pair included in the sentence pair set, the word vectorization unit 32 outputs a feature matrix F_j relating to the words of the sentence, obtained by vectorizing each of the divided words based on the word vector storage unit 22, which stores a vector for each word.
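As a rough illustration of this step, the sketch below stacks one vector per word into a matrix. The row-per-word layout, the function name `vectorize`, and the zero vector used for words missing from the store are assumptions; this section does not specify the layout of F_j or how unknown words are handled.

```python
def vectorize(words, word_vectors, dim):
    """Return a matrix with one row per word.

    Words missing from the store map to a zero vector (an assumption made
    only so that the sketch is total).
    """
    zero = [0.0] * dim
    return [list(word_vectors.get(w, zero)) for w in words]

# Toy 2-dimensional store, invented for illustration.
toy_store = {"contract": [0.2, 0.7], "cancel": [0.3, 0.6]}
F_1 = vectorize(["cancel", "contract", "now"], toy_store, dim=2)
```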

In each layer (i = 1, ..., B) of the neural net, the sentence pair attention unit 34 uses the parameter matrix W_a^(i) to obtain a feature matrix F_j^(i)' relating to matching between the feature matrices F_j relating to the words of the respective sentences of the sentence pair, or between the feature matrices F_j^(i) output by the preceding layer for the respective sentences of the sentence pair, and outputs it for each sentence of the sentence pair.

In each layer of the neural net, the interpretation-extended sentence pair attention unit 36 uses the parameter matrix W_b^(i) and, for chunks formed by concatenating i words, where i is the number of words corresponding to the layer, obtains a feature matrix F_j^(i)'' relating to matching between a feature matrix G_k relating to the words contained in the interpretation sentence of a chunk contained in one sentence of the sentence pair, the interpretation sentence being obtained by searching the interpretation sentence storage unit 24, which stores interpretation sentences for chunks, and a feature matrix F_h relating to the words of the other sentence of the sentence pair, and outputs it for each sentence of the sentence pair.
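The chunk lookup that feeds this unit can be sketched as follows: in layer i, every run of i consecutive words is tried as a key in the interpretation sentence store. The function names, the space-joined chunk keys, and the whitespace tokenization of the interpretation sentence are assumptions for an English toy example (Japanese chunks would be concatenated without spaces and split by the word division unit); the matching computation itself is not shown.

```python
def chunks_of_length(words, i):
    """All contiguous chunks of exactly i words, as (start_index, chunk_string)."""
    return [(s, " ".join(words[s:s + i])) for s in range(len(words) - i + 1)]

def find_interpretations(words, i, store):
    """Look up each i-word chunk and return (start, chunk, interpretation words)."""
    found = []
    for start, chunk in chunks_of_length(words, i):
        if chunk in store:
            found.append((start, chunk, store[chunk].split()))
    return found

# Toy store and sentence, invented for illustration.
toy_store = {"cooling off": "period to cancel a contract unconditionally"}
sentence = ["how", "long", "is", "cooling", "off", "allowed"]
layer1_hits = find_interpretations(sentence, 1, toy_store)
layer2_hits = find_interpretations(sentence, 2, toy_store)
```

In this toy case the two-word chunk is found only in layer 2, which mirrors the description that each layer handles chunks of the word count assigned to it.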

In each layer of the neural net, the convolution unit 38 uses the parameter matrices W_c^(i) and b_c^(i) to perform convolution processing on the feature matrix F_j output by the word vectorization unit 32, the feature matrix F_j^(i)' output by the sentence pair attention unit 34, and the feature matrix F_j^(i)'' output by the interpretation-extended sentence pair attention unit 36, and outputs the resulting feature matrix F_j^(i+1) for each sentence of the sentence pair as the output of layer i.

For each sentence pair included in the sentence pair set, the class classification unit 40 classifies the sentence pair into a class relating to the relationship of the sentence pair, based on the feature matrices F_1 and F_2 output for the sentence pair by the last layer B of the neural net and on the parameter matrices W_d and b_d, and calculates a loss L for the classification result based on the classification result and the correct-answer label.
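One common way to realize such a classifier is a linear layer followed by softmax, with cross-entropy as the loss L. The sketch below assumes that form, and also assumes that F_1 and F_2 have already been reduced to a single flat feature vector; neither the pooling step nor the exact loss function is specified in this section, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(features, W_d, b_d):
    """features: flat list; W_d: one weight row per class; b_d: one bias per class."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(W_d, b_d)]
    return softmax(scores)

def cross_entropy(probs, gold_class):
    """Loss for one sentence pair given the index of its correct-answer class."""
    return -math.log(probs[gold_class])

# Toy two-class example with invented parameter values.
probs = classify([1.0, 2.0], W_d=[[1.0, 0.0], [0.0, 1.0]], b_d=[0.0, 0.0])
loss = cross_entropy(probs, gold_class=1)
```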

学習部４２は、文ペア集合に含まれる文ペアの各々に対して算出された、分類結果に関する損失に基づいて、文ペアアテンション部３４、語釈拡張文ペアアテンション部３６、及び畳み込み部３８において特徴行列を求めるためのパラメータ行列Ｗ_ａ ⁽ⁱ⁾、Ｗ_ｂ ⁽ⁱ⁾、Ｗ_ｃ ⁽ⁱ⁾、ｂ_ｃ ⁽ⁱ⁾、Ｗ_ｄ、ｂ_ｄを学習する。 The learning unit 42 is characterized in the sentence pair attention unit 34, the expanded sentence pair attention unit 36, and the convolution unit 38 based on the loss related to the classification result calculated for each sentence pair included in the sentence pair set. The parameter matrices W _a ⁽ⁱ⁾ , W _b ⁽ⁱ⁾ , W _c ⁽ⁱ⁾ , b _c ⁽ⁱ⁾ , W _d , b _d for obtaining the matrix are learned.

<Operation of the Sentence Pair Classification Learning Apparatus According to the Embodiment of the Present Invention>

Next, the operation of the sentence pair classification learning apparatus 100 according to the embodiment of the present invention is described. When the input unit 10 receives a sentence pair set consisting of sentence pairs, each given a correct label indicating a class concerning the relationship of the sentence pair, the apparatus initializes the parameter matrices W_a^(i), W_b^(i), W_c^(i), b_c^(i), W_d, and b_d, stores them in the parameter matrix storage unit 26, and executes the sentence pair classification learning routine shown in FIGS. 4 and 5.

In step S100, the epoch counter n is initialized to n = 1.

In step S102, the training data (a sentence pair set consisting of sentence pairs with correct labels (classes)) is divided into mini-batches, each containing M randomly chosen sentence pairs. In this embodiment, the maximum number M of sentence pairs in a mini-batch is M_max = 50.
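For illustration only, the mini-batch division of step S102 might be sketched as follows. The function name `split_into_minibatches`, the tuple layout of a labelled pair, and the `seed` argument are assumptions introduced here and do not appear in the embodiment; only M_max = 50 is taken from the text.

```python
import random

M_MAX = 50  # maximum mini-batch size stated in the embodiment


def split_into_minibatches(pairs, m_max=M_MAX, seed=None):
    """Shuffle labelled sentence pairs and cut them into mini-batches of at most m_max."""
    rng = random.Random(seed)
    shuffled = list(pairs)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [shuffled[k:k + m_max] for k in range(0, len(shuffled), m_max)]


# Hypothetical toy data: (sentence 1, sentence 2, class label)
data = [(f"s1-{k}", f"s2-{k}", k % 2) for k in range(120)]
batches = split_into_minibatches(data, seed=0)
```

With 120 pairs, this yields two mini-batches of 50 pairs and a final one of 20, so M varies up to M_max.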

In step S104, a mini-batch is selected.

In step S106, the index m of the sentence pair within the mini-batch is set to m = 1.

In step S108, for each sentence of the m-th sentence pair (sentence j; j = 1 or 2), the word division unit 30 sets the variable i, which is the index of the block representing a layer of the neural network, to i = 1.

In step S110, the word division unit 30 divides sentence j of the sentence pair into a sequence of words. For example, the sentence 「投資信託ではクーリングオフはいつまでできる」 ("Until when can cooling-off be exercised for an investment trust?") is divided into the sequence 「投資信託」「では」「クーリング」「オフ」「は」「いつ」「まで」「できる」. If the number of words exceeds T, the word division unit 30 outputs only the first T tokens. If it is less than T, the special word "PAD" is appended to the end of the sequence before output. In this embodiment, T = 100.
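The truncation and padding rule of step S110 can be sketched as below, assuming the sentence has already been segmented into words (the segmenter itself is not shown). The helper name `fit_to_length` is hypothetical.

```python
T = 100      # fixed sequence length used in the embodiment
PAD = "PAD"  # special padding word


def fit_to_length(words, t=T, pad=PAD):
    """Keep the first t words, or append PAD words until the length is t."""
    return list(words[:t]) + [pad] * max(0, t - len(words))


# Word sequence from the example in the text (segmentation assumed to be given).
words = ["投資信託", "では", "クーリング", "オフ", "は", "いつ", "まで", "できる"]
fixed = fit_to_length(words)
```

Here `fixed` holds the eight original words followed by 92 "PAD" words, so every sentence reaches the later stages with the same length T.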

In step S112, the word vectorization unit 32 looks up each word in the word sequence (x_1, x_2, .., x_r) output by the word division unit 30 in the word vector storage unit 22 and, for each sentence j of the sentence pair, converts the sequence into the feature matrix of equation (1), which relates to each word of sentence j.

... (1)

The size of the matrix F_j^(1) is E^(1) × T. In this embodiment, E^(1) = 100.

For words not contained in the word vector storage unit 22 and for the special word "PAD", the word vector is the E^(1)-dimensional zero vector.
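As a rough sketch of step S112, the lookup with a zero-vector fallback might look like the following. The dictionary `word_vectors` is a hypothetical stand-in for the word vector storage unit 22, the matrix is stored as a list of E rows, and the toy vectors are invented purely for the example.

```python
E1 = 100  # word-vector dimensionality E^(1) in the embodiment


def vectorize(words, word_vectors, dim=E1, pad="PAD"):
    """Return an E x T feature matrix (list of E rows) for a padded word sequence.

    Unknown words and the PAD word are mapped to the zero vector.
    """
    zero = [0.0] * dim
    columns = [zero if w == pad else word_vectors.get(w, zero) for w in words]
    # columns holds T vectors of length E; transpose to E rows of length T
    return [[col[e] for col in columns] for e in range(dim)]


# Hypothetical 3-dimensional toy vectors (the embodiment uses 100 dimensions).
toy = {"クーリング": [1.0, 0.0, 2.0], "オフ": [0.5, 0.5, 0.5]}
F = vectorize(["クーリング", "オフ", "未知語", "PAD"], toy, dim=3)
```

Column n of `F` is the vector of the n-th word, which matches the E^(1) × T layout stated above.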

Next, in step S200, the number of blocks B representing the layers of the neural network (i = 1, .., B) is set. In this embodiment, B = 2.

In step S300, the sentence pair attention unit 34 creates an attention matrix A whose elements A_{n,m} are given by equation (2), from the feature matrices F_j of sentence 1 and sentence 2 obtained in step S112 or from the feature matrices F_j^(i) of sentence 1 and sentence 2 obtained by the convolution processing of the previous layer in step S502, described later.

... (2)

Here, match is a function that receives vectors x and y and outputs a scalar matching score for a word (or chunk); it is defined as 1 / (1 + |x − y|). [:, n] is the operation that extracts the n-th column vector over all rows, and [:, m] likewise extracts the m-th column vector. Cosine similarity or the like may also be used as the match function. The size of the attention matrix A is T × T.
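The construction of the attention matrix can be illustrated as follows. Equation (2) itself is not reproduced in this text, so the sketch assumes A_{n,m} = match(F_1[:, n], F_2[:, m]) and takes |x − y| to be the Euclidean norm; both are assumptions, as are the helper names.

```python
import math


def match(x, y):
    """Matching score 1 / (1 + |x - y|), using the Euclidean norm (an assumption)."""
    return 1.0 / (1.0 + math.dist(x, y))


def column(matrix, n):
    """Equivalent of matrix[:, n] for a matrix stored as a list of rows."""
    return [row[n] for row in matrix]


def attention_matrix(f1, f2):
    """A[n][m] = match(f1[:, n], f2[:, m]) for two E x T feature matrices."""
    t1, t2 = len(f1[0]), len(f2[0])
    return [[match(column(f1, n), column(f2, m)) for m in range(t2)]
            for n in range(t1)]
```

Identical word vectors give a score of 1, and the score decays toward 0 as the vectors move apart.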

Next, in step S302, the attention matrix A is converted by equation (3) into the feature matrices F_1^(i)' and F_2^(i)' concerning the matching between sentence 1 and sentence 2.

... (3)

Here, W_a^(i) is a parameter matrix of size E^(i) × T, and A^t denotes the transpose of the matrix A. In this embodiment, E^(2) = 100 and E^(3) = 100.
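Equation (3) is not reproduced in this text. One form that is consistent with the stated shapes (W_a^(i) of size E^(i) × T, A of size T × T) and with the mention of the transpose is F_1' = W_a · A^t and F_2' = W_a · A. The sketch below implements that assumed form only; it should not be read as the equation of the embodiment.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a_ik * b_kj for a_ik, b_kj in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


def attention_features(w_a, attention):
    """Assumed form: F1' = W_a . A^t and F2' = W_a . A, each of size E x T."""
    return matmul(w_a, transpose(attention)), matmul(w_a, attention)
```

Since rows of A index the words of sentence 1 and columns index the words of sentence 2, multiplying by A^t yields columns aligned with sentence 1, and multiplying by A yields columns aligned with sentence 2.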

In step S400, for each sentence j of the sentence pair, the definition-expanded sentence pair attention unit 36 creates, from the value of the layer index i and the word sequence (x_1, x_2, .., x_r) of sentence j, a sequence of chunks each formed by concatenating i words. When i = 1, the chunk sequence is identical to the word sequence. When i = 2, the chunk sequence is ((x_1, x_2, .., x_r) PAD). The length of the chunk sequence is always T; i − 1 "PAD" words are appended to the end of the chunk sequence. When B is 3 or more, the chunk sequence for i = 3, for example, is ((x_1, x_2, .., x_r) PAD, PAD).
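One possible reading of step S400 is that chunk k consists of the i consecutive words starting at position k, with i − 1 "PAD" words appended so that the chunk sequence keeps the length of the word sequence. The sketch below follows that reading, which is an assumption; the function name `chunk_sequence` is hypothetical.

```python
def chunk_sequence(words, i, pad="PAD"):
    """Return len(words) chunks, each a tuple of i consecutive words.

    i - 1 PAD words are appended first so that the chunk sequence has the
    same length as the word sequence.
    """
    padded = list(words) + [pad] * (i - 1)
    return [tuple(padded[k:k + i]) for k in range(len(words))]


chunks = chunk_sequence(["クーリング", "オフ", "は"], 2)
chunk_string = "".join(chunks[0])  # string used to look up a definition sentence
```

Under this reading, the first chunk for i = 2 joins into "クーリングオフ", the chunk string used in the lookup example of step S404.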

In step S402, the definition-expanded sentence pair attention unit 36 selects an element k (k = 1, 2, ...) of the chunk sequence.

In step S404, for each sentence j of the sentence pair, the definition-expanded sentence pair attention unit 36 searches the definition sentence storage unit 24 using the character string obtained by concatenating the word strings contained in the element k selected in step S402 (the chunk string; for example, "クーリングオフ" when (x_1, x_2) = "クーリング", "オフ"). If a definition sentence corresponding to the chunk string is stored, the unit acquires the feature matrix G_k of equation (4), which relates to each word of the definition sentence. The acquisition method is the same as the processing of step S112.

... (4)

In step S406, for each sentence j of the sentence pair, with h = 3 − j, the definition-expanded sentence pair attention unit 36 creates the attention matrix A of equation (5) from sentence h and the feature matrix G_k of the definition sentence corresponding to chunk element k.

... (5)

In step S408, for each sentence j of the sentence pair, the definition-expanded sentence pair attention unit 36 converts the attention matrix A created in step S406 into the definition-related feature matrix F_j^(i)'' by equation (6).

... (6)

Here, wall_pooling takes, for each row, the average of the non-zero values along the column direction (the maximum along the column direction may be used instead). W_b^(i) is a parameter matrix of size E^(i) × T.
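The wall_pooling operation described above can be sketched as follows. Returning 0.0 for a row with no non-zero entries is an assumption made here, since the text does not state that case; how the pooled values enter equation (6) is not shown.

```python
def wall_pooling(matrix):
    """For each row, average the non-zero entries over all columns (0.0 if none)."""
    pooled = []
    for row in matrix:
        nonzero = [v for v in row if v != 0]
        pooled.append(sum(nonzero) / len(nonzero) if nonzero else 0.0)
    return pooled


pooled = wall_pooling([[1, 0, 3], [0, 0, 0], [2, 2, 2]])
```

Ignoring zeros keeps padded positions, which are zero vectors, from diluting the average.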

In step S410, for each sentence j of the sentence pair, if no entry corresponding to the chunk string of element k of the chunk sequence exists in the definition sentence storage unit 24, the definition-expanded sentence pair attention unit 36 assigns a zero vector to the corresponding part of the feature matrix F_j^(i)''.

... (7)

In step S412, the definition-expanded sentence pair attention unit 36 determines whether processing has finished for all elements k. If so, the process proceeds to step S500; otherwise, it returns to step S402, selects the next element k, and repeats the processing.

In step S500, the convolution unit 38 performs convolution processing for each sentence of the sentence pair (sentence j; j = 1 or 2). The feature matrices F_j^(i), F_j^(i)', and F_j^(i)'' are each E^(i) × T matrices. They are combined into a third-order tensor F of size 3 × E^(i) × T, and a convolution with filter size 3 × 2, padding width 0 in the row direction and 1 in the column direction, stride 1, E^(i) input channels, and E^(i+1) output channels is applied, outputting H according to equation (8).

... (8)

Here, H is a tensor of size 1 × T × E^(i). σ is the sigmoid function, and * denotes convolution. W_c^(i) and b_c^(i) are parameter matrices.

In step S502, for each sentence j of the sentence pair, the convolution unit 38 converts H, obtained by the convolution processing of step S500, into the feature matrix F_j^(i+1) of layer i + 1 by the following pooling processing.

... (9)

Here, w2_pooling takes, for each row, the average of the non-zero values along the column direction within a window of size 2 (the maximum along the column direction may be used instead).
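A minimal sketch of w2_pooling is given below. It assumes the window slides with stride 1 and that a window with no non-zero entries yields 0.0; neither detail is stated in the text.

```python
def w2_pooling(matrix, window=2):
    """Slide a window over each row and average the non-zero entries inside it."""
    pooled = []
    for row in matrix:
        out = []
        for start in range(len(row) - window + 1):
            nonzero = [v for v in row[start:start + window] if v != 0]
            out.append(sum(nonzero) / len(nonzero) if nonzero else 0.0)
        pooled.append(out)
    return pooled


pooled_rows = w2_pooling([[1, 3, 0, 0, 4]])
```

Under these assumptions a row of width n is reduced to width n − 1.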

In the final block of the layers (i = B), wall_pooling is used instead of w2_pooling. The size of F output by the final layer is 1 × E^(B+1).

In step S504, it is determined whether i = B. If i = B, the process proceeds to step S600. Otherwise, the process proceeds to step S506, sets i = i + 1, and returns to step S300 to repeat the processing.

In step S600, the class classification unit 40 converts F_1 and F_2 output by the final block of the neural network into vectors, concatenates them into a vector v, and uses v as input to classify the sentence relationship of the sentence pair into one of C classes.

... (10)

Here, W_d is a matrix of size C × 2E^(B+1), b_d is a C-dimensional vector, and softmax is the softmax function. y is a C-dimensional vector.
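Equation (10) is not reproduced in this text. Given the stated shapes of W_d, b_d, and y, a natural reading is y = softmax(W_d v + b_d), and the sketch below implements that assumed form with hypothetical helper names.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(v, w_d, b_d):
    """Assumed form y = softmax(W_d v + b_d); W_d is C x len(v), b_d has length C."""
    scores = [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(w_d, b_d)]
    return softmax(scores)
```

The elements of y are positive and sum to 1, so y can be read as a distribution over the C classes.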

In step S602, the class classification unit 40 calculates the loss for the output y. With the index of the correct class denoted t ∈ {1, ..., C} and the output for class t denoted y_t, the loss L for the correct class t is calculated by equation (11).

... (11)
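Equation (11) is not reproduced in this text. A common choice for a softmax output is the negative log-likelihood L = −log(y_t); the sketch below uses that form purely as an assumption and may differ from the loss of the embodiment.

```python
import math


def loss_for_correct_class(y, t):
    """Assumed negative log-likelihood: L = -log(y_t), with t in 1..C (1-based)."""
    return -math.log(y[t - 1])


example_loss = loss_for_correct_class([0.25, 0.75], 2)
```

With this form the loss approaches 0 as the output for the correct class approaches 1.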

In step S700, the learning unit 42 determines whether m = M. If m = M, the process proceeds to step S704. Otherwise, m = m + 1 is set in step S702, and the process returns to step S108 to repeat the processing.

In step S704, the learning unit 42 sums, over the mini-batch, the losses L for the correct class t calculated for the sentence pairs in steps S108 to S700, and optimizes by stochastic gradient descent the parameter matrices (i = 1, ..., B) W_a^(i) of the sentence pair attention unit 34, W_b^(i) of the definition-expanded sentence pair attention unit 36, W_c^(i) and b_c^(i) of the convolution unit 38, and W_d and b_d of the class classification unit 40. The optimization method is not limited to stochastic gradient descent; other optimization methods may be used.

In step S706, it is determined whether processing has finished for all mini-batches. If so, the process proceeds to step S708; otherwise, it returns to step S104, selects the next mini-batch, and repeats the processing.

In step S708, it is determined whether n = N (N = 100). If n = N, the processing ends. Otherwise, n = n + 1 is set in step S710, and the process returns to step S102 to repeat the processing.
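The control flow of steps S100 to S710 can be summarized by the sketch below. The callables `loss_fn` (per-pair loss, steps S108 to S602) and `update_fn` (one optimizer step, S704) are hypothetical placeholders; the network and the gradient computation are not shown.

```python
import random

N_EPOCHS = 100  # N in step S708


def train(pairs, loss_fn, update_fn, n_epochs=N_EPOCHS, m_max=50, seed=0):
    """Run the epoch / mini-batch loops; the model itself lives in the callables."""
    rng = random.Random(seed)
    for _ in range(n_epochs):                       # S100, S708, S710
        shuffled = list(pairs)
        rng.shuffle(shuffled)                       # S102: re-split every epoch
        for k in range(0, len(shuffled), m_max):    # S104, S706
            batch = shuffled[k:k + m_max]
            total = sum(loss_fn(p) for p in batch)  # S106 to S702: summed pair losses
            update_fn(total)                        # S704: one parameter update
```

Because step S710 returns to step S102, the mini-batches are drawn afresh in every epoch, which the sketch reflects by reshuffling inside the epoch loop.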

As described above, in the sentence pair classification learning apparatus according to the embodiment of the present invention, in each layer of the neural network, convolution processing is performed on the feature matrix F_j output by the word vectorization unit 32, the feature matrix F_j^(i)' output by the sentence pair attention unit 34, and the feature matrix F_j^(i)'' output by the definition-expanded sentence pair attention unit 36, and the resulting feature matrix F_j^(i+1) is output, as the output of layer i, for each sentence of the sentence pair. The class classification unit 40 classifies each sentence pair in the sentence pair set into a class concerning the relationship of the sentence pair, based on the feature matrices F_1 and F_2 output for the pair by the last layer B of the neural network, and calculates the loss L for the classification result from the classification result and the correct label. The learning unit 42 learns the parameter matrices used to obtain the feature matrices, based on the classification losses calculated for the sentence pairs in the set. In this way, parameters can be learned for obtaining a class concerning the relationship of a sentence pair while taking word definitions into account.

<Configuration of the Sentence Pair Classification Apparatus According to the Embodiment of the Present Invention>

Next, the configuration of the sentence pair classification apparatus according to the embodiment of the present invention is described. As shown in FIG. 6, the sentence pair classification apparatus 200 can be implemented by a computer including a CPU, a RAM, and a ROM storing various data and a program for executing the sentence pair classification routine described later. Functionally, as shown in FIG. 6, the sentence pair classification apparatus 200 includes an input unit 210, a calculation unit 220, and an output unit 250.

The input unit 210 receives, as test data, a sentence pair for which a class is to be determined.

The calculation unit 220 includes a word vector storage unit 222, a definition sentence storage unit 224, a parameter matrix storage unit 226, a word division unit 230, a word vectorization unit 232, a sentence pair attention unit 234, a definition-expanded sentence pair attention unit 236, a convolution unit 238, and a class classification unit 240. The processing of each unit is described in detail in the description of the operation.

The word vector storage unit 222 stores the same content as the word vector storage unit 22 of FIG. 2.

The definition sentence storage unit 224 stores the same content as the definition sentence storage unit 24 of FIG. 3. The parameter matrix storage unit 226 stores the parameter matrices (i = 1, ..., B) learned by the sentence pair classification learning apparatus 100: W_a^(i) used by the sentence pair attention unit 234, W_b^(i) used by the definition-expanded sentence pair attention unit 236, W_c^(i) and b_c^(i) used by the convolution unit 238, and W_d and b_d used by the class classification unit 240.

For each sentence pair received by the input unit 210, the word division unit 230 divides each sentence of the pair into a sequence of words.

For each sentence of the sentence pair, the word vectorization unit 232 vectorizes each of the divided words based on the word vector storage unit 222, which stores the vector of each word, and outputs the resulting feature matrix F_j relating to each word of the sentence.

In each layer of the neural network (i = 1, ..., B), the sentence pair attention unit 234 uses the parameter matrix W_a^(i) to obtain a feature matrix F_j^(i)' concerning the matching between the feature matrices F_j relating to the words of the sentences of the pair, or between the feature matrices F_j^(i) output for the sentences of the pair by the preceding layer, and outputs it for each sentence of the pair.

In each layer of the neural network, the definition-expanded sentence pair attention unit 236 uses the parameter matrix W_b^(i) to obtain a feature matrix F_j^(i)'' and outputs it for each sentence of the sentence pair. For a chunk formed by concatenating i words, where i is the number of words corresponding to layer i, the unit searches the definition sentence storage unit 224, which stores definition sentences for chunks. F_j^(i)'' is a feature matrix concerning the matching between the feature matrix G_k, which relates to each word in the definition sentence of a chunk contained in one sentence of the pair, and the feature matrix F_h, which relates to each word of the other sentence of the pair.

In each layer of the neural network, the convolution unit 238 uses the parameter matrices W_c^(i) and b_c^(i) to perform convolution processing on the feature matrix F_j output by the word vectorization unit 232, the feature matrix F_j^(i)' output by the sentence pair attention unit 234, and the feature matrix F_j^(i)'' output by the definition-expanded sentence pair attention unit 236. The resulting feature matrix F_j^(i+1) is output, as the output of layer i, for each sentence of the sentence pair.

For each sentence pair in the sentence pair set, the class classification unit 240 classifies the pair into a class based on the feature matrices F_1 and F_2 output for the sentences of the pair by the last layer B of the neural network and on the parameter matrices W_d and b_d, and outputs the classification result to the output unit 250.

<Operation of the Sentence Pair Classification Apparatus According to the Embodiment of the Present Invention>

Next, the operation of the sentence pair classification apparatus 200 according to the embodiment of the present invention is described. When the input unit 210 receives a sentence pair as test data, the sentence pair classification apparatus 200 executes the sentence pair classification routine shown in FIGS. 7 and 8. When a plurality of sentence pairs are used as test data, steps S800 to S802 may be performed for each sentence pair.

In step S800, the test data is divided into mini-batches, each containing one sentence pair.

Next, for the sentence pair obtained in step S800, the same processing as steps S108 to S600 shown in FIGS. 4 and 5 is performed to obtain the C-dimensional vector y, whose elements correspond to the classes.

In step S802, the class corresponding to the t-th element, which has the largest value among the elements of the C-dimensional vector y obtained in step S600, is output to the output unit 250 as the classification result of the sentence pair.
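The selection in step S802 amounts to an argmax over y. A minimal sketch, with the hypothetical name `predict_class` and 1-based class indices as in the text:

```python
def predict_class(y):
    """Return the 1-based index t of the largest element of y."""
    return max(range(len(y)), key=y.__getitem__) + 1


predicted = predict_class([0.1, 0.7, 0.2])
```

If several elements share the largest value, this sketch returns the first of them; the text does not specify tie handling.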

As described above, in the sentence pair classification apparatus according to the embodiment of the present invention, in each layer of the neural network, convolution processing is performed on the feature matrix F_j output by the word vectorization unit 232, the feature matrix F_j^(i)' output by the sentence pair attention unit 234, and the feature matrix F_j^(i)'' output by the definition-expanded sentence pair attention unit 236, and the resulting feature matrix F_j^(i+1) is output, as the output of layer i, for each sentence of the sentence pair. The class classification unit 240 classifies each sentence pair in the sentence pair set into a class concerning the relationship of the sentence pair, based on the feature matrices F_1 and F_2 output for the pair by the last layer B of the neural network. In this way, a class concerning the relationship of a sentence pair can be obtained while taking word definitions into account.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

DESCRIPTION OF SYMBOLS
10 Input unit
20 Calculation unit
22 Word vector storage unit
24 Definition sentence storage unit
26 Parameter matrix storage unit
30 Word division unit
32 Word vectorization unit
34 Sentence pair attention unit
36 Definition-expanded sentence pair attention unit
38 Convolution unit
40 Class classification unit
42 Learning unit
100 Sentence pair classification learning apparatus
200 Sentence pair classification apparatus
210 Input unit
220 Calculation unit
222 Word vector storage unit
224 Definition sentence storage unit
226 Parameter matrix storage unit
230 Word division unit
232 Word vectorization unit
234 Sentence pair attention unit
236 Definition-expanded sentence pair attention unit
240 Class classification unit
250 Output unit

Claims

A sentence pair classification apparatus comprising:
a word division unit that divides each sentence of a sentence pair into a sequence of words;
a word vectorization unit that, for each sentence of the sentence pair, vectorizes each of the divided words based on a word vector storage unit storing a vector of each word, and outputs the resulting feature matrix relating to each word of the sentence;
a sentence pair attention unit that, in each layer of a neural network, outputs, for each sentence of the sentence pair, a feature matrix concerning matching between the feature matrices relating to the words of the sentences of the sentence pair, or between the feature matrices output for the sentences of the sentence pair by the preceding layer;
a definition-expanded sentence pair attention unit that, in each layer of the neural network, for a chunk formed by concatenating the number of words corresponding to the layer, outputs, for each sentence of the sentence pair, a feature matrix concerning matching between a feature matrix relating to each word in a definition sentence of a chunk contained in one sentence of the sentence pair, obtained by searching a definition sentence storage unit storing definition sentences for chunks, and a feature matrix relating to each word of the other sentence of the sentence pair;
a convolution unit that, in each layer of the neural network, performs convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the definition-expanded sentence pair attention unit, and outputs the resulting feature matrix, as the output of the layer, for each sentence of the sentence pair; and
a class classification unit that classifies the sentence pair into a class concerning the relationship of the sentence pair, based on the feature matrices output for the sentences of the sentence pair by the last layer of the neural network.

The sentence pair classification apparatus according to claim 1, wherein the word vectorization unit, the sentence pair attention unit, the definition-expanded sentence pair attention unit, and the convolution unit obtain the feature matrices using parameter matrices learned in advance.

A sentence pair classification learning apparatus comprising:
a word division unit that, for each sentence pair in a sentence pair set consisting of sentence pairs each given a correct label indicating a class concerning the relationship of the sentence pair, divides each sentence of the sentence pair into a sequence of words;
a word vectorization unit that, for each sentence of each sentence pair in the sentence pair set, vectorizes each of the divided words based on a word vector storage unit storing a vector of each word, and outputs the resulting feature matrix relating to each word of the sentence;
a sentence pair attention unit that, in each layer of a neural network, outputs, for each sentence of the sentence pair, a feature matrix concerning matching between the feature matrices relating to the words of the sentences of the sentence pair, or between the feature matrices output for the sentences of the sentence pair by the preceding layer;
a definition-expanded sentence pair attention unit that, in each layer of the neural network, for a chunk formed by concatenating the number of words corresponding to the layer, outputs, for each sentence of the sentence pair, a feature matrix concerning matching between a feature matrix relating to each word in a definition sentence of a chunk contained in one sentence of the sentence pair, obtained by searching a definition sentence storage unit storing definition sentences for chunks, and a feature matrix relating to each word of the other sentence of the sentence pair;
a convolution unit that, in each layer of the neural network, performs convolution processing on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the definition-expanded sentence pair attention unit, and outputs the resulting feature matrix, as the output of the layer, for each sentence of the sentence pair;
a class classification unit that, for each sentence pair in the sentence pair set, classifies the sentence pair into a class concerning the relationship of the sentence pair, based on the feature matrices output for the sentences of the sentence pair by the last layer of the neural network, and calculates a loss for the classification result based on the classification result and the correct label; and
a learning unit that learns parameter matrices used to obtain the feature matrices in the sentence pair attention unit, the definition-expanded sentence pair attention unit, and the convolution unit, based on the losses for the classification results calculated for the sentence pairs in the sentence pair set.

A word dividing unit dividing each sentence of a sentence pair into a sequence of words;
Each of the sentences obtained by vectorizing a word vector for each of the sentences of the sentence pair based on a word vector storage unit that stores a vector of each word for each of the sentences of the sentence pair Outputting a feature matrix for the words of
a step in which a sentence pair attention unit outputs, in each layer of the neural network, for each of the sentences of the sentence pair, a feature matrix for matching the feature matrices for the words of each of the sentences of the sentence pair, or the feature matrices for each of the sentences of the sentence pair output by the previous layer;
a step in which a word interpretation expanded sentence pair attention unit, in each layer of the neural network, for each chunk that is included in one sentence of the sentence pair and is formed by concatenating the number of words corresponding to the layer, outputs, for each of the sentences of the sentence pair, a feature matrix for matching the feature matrix for each of the words included in the interpretation sentence for the chunk, obtained by searching a word interpretation sentence storage unit that stores interpretation sentences for chunks, against the feature matrix for each of the words of the other sentence of the sentence pair;
a step in which a convolution unit, in each layer of the neural network, outputs, as the output of the layer, a feature matrix obtained by performing convolution processing, for each of the sentences of the sentence pair, on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the word interpretation expanded sentence pair attention unit; and
a step in which a class classification unit classifies the sentence pair into a class relating to the relationship of the sentence pair based on the feature matrix for each of the sentences of the sentence pair output by the last layer of the neural network.
A sentence pair classification method including the above steps.
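The chunk lookup and matching steps recited above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the claimed implementation: the store contents, the function names, and the similarity measure (1 / (1 + Euclidean distance)) are all hypothetical, and the claim does not fix any of them.

```python
import math

# Hypothetical word interpretation sentence store:
# chunk (tuple of words) -> interpretation sentence (list of words).
INTERPRETATION_STORE = {
    ("big", "apple"): ["new", "york", "city"],
}


def chunks(words, n):
    """All chunks of n consecutive words, where n corresponds to the layer."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def lookup_interpretations(words, n, store):
    """Map the start index of each chunk found in the store to its interpretation sentence."""
    found = {}
    for i, chunk in enumerate(chunks(words, n)):
        if chunk in store:
            found[i] = store[chunk]
    return found


def match_score(u, v):
    """Assumed similarity between two word feature vectors."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + dist)


def attention_matrix(feats_a, feats_b):
    """Entry [i][j] matches word i of one word sequence against word j of the other."""
    return [[match_score(u, v) for v in feats_b] for u in feats_a]
```

In this sketch, `lookup_interpretations(["the", "big", "apple"], 2, INTERPRETATION_STORE)` finds the two-word chunk starting at index 1. The feature vectors of the words of its interpretation sentence would then be passed to `attention_matrix` together with the feature vectors of the words of the other sentence.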

5. The sentence pair classification method according to claim 4, wherein the word vectorization unit, the sentence pair attention unit, the word interpretation expanded sentence pair attention unit, and the convolution unit obtain the feature matrices using parameter matrices learned in advance.

a step in which a word dividing unit divides, for each of the sentence pairs included in a sentence pair set consisting of sentence pairs each given a correct answer label indicating a class relating to the relationship of the sentence pair, each of the sentences of the sentence pair into a sequence of words;
a step in which a word vectorization unit outputs, for each of the sentences of the sentence pairs included in the sentence pair set, a feature matrix for the words of the sentence obtained by vectorizing each of the divided words based on a word vector storage unit that stores a vector of each word;
a step in which a sentence pair attention unit outputs, in each layer of the neural network, for each of the sentences of the sentence pair, a feature matrix for matching the feature matrices for the words of each of the sentences of the sentence pair, or the feature matrices for each of the sentences of the sentence pair output by the previous layer;
a step in which a word interpretation expanded sentence pair attention unit, in each layer of the neural network, for each chunk that is included in one sentence of the sentence pair and is formed by concatenating the number of words corresponding to the layer, outputs, for each of the sentences of the sentence pair, a feature matrix for matching the feature matrix for each of the words included in the interpretation sentence for the chunk, obtained by searching a word interpretation sentence storage unit that stores interpretation sentences for chunks, against the feature matrix for each of the words of the other sentence of the sentence pair;
a step in which a convolution unit, in each layer of the neural network, outputs, as the output of the layer, a feature matrix obtained by performing convolution processing, for each of the sentences of the sentence pair, on the feature matrix output by the word vectorization unit, the feature matrix output by the sentence pair attention unit, and the feature matrix output by the word interpretation expanded sentence pair attention unit;
a step in which a class classification unit, for each of the sentence pairs included in the sentence pair set, classifies the sentence pair into a class relating to the relationship of the sentence pair based on the feature matrix for each of the sentences of the sentence pair output by the last layer of the neural network, and calculates a loss relating to the classification result based on the classification result and the correct answer label; and
a step in which a learning unit learns, based on the loss relating to the classification result calculated for each of the sentence pairs included in the sentence pair set, the parameter matrices used to obtain the feature matrices in the sentence pair attention unit, the word interpretation expanded sentence pair attention unit, and the convolution unit.
A sentence pair classification learning method including the above steps.
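The convolution step recited above combines three feature matrices per sentence. The following Python sketch shows one possible reading, assuming that the three matrices are concatenated word by word and that a zero-padded (wide) convolution with a tanh activation is applied. The padding scheme, the activation, and all names are assumptions introduced for illustration only.

```python
import math


def stack_channels(f_word, f_att, f_interp):
    """Concatenate, per word position, the three feature vectors for a sentence.

    Each argument is a list of per-word feature vectors of equal length
    (word vectorization, sentence pair attention, and word interpretation
    expanded attention outputs, respectively).
    """
    return [a + b + c for a, b, c in zip(f_word, f_att, f_interp)]


def wide_convolution(rows, filters, width):
    """Slide each filter over windows of `width` word positions.

    `filters[k][j][m]` is the hypothetical learned weight of filter k for
    window offset j and feature dimension m. Zero padding of width - 1 on
    both sides yields len(rows) + width - 1 output positions.
    """
    dim = len(rows[0])
    pad = [[0.0] * dim for _ in range(width - 1)]
    padded = pad + rows + pad
    out = []
    for t in range(len(padded) - width + 1):
        window = padded[t:t + width]
        out.append([
            math.tanh(sum(filt[j][m] * window[j][m]
                          for j in range(width) for m in range(dim)))
            for filt in filters
        ])
    return out
```

In a learning method of the kind claimed, the entries of `filters` would be among the parameter matrices updated from the loss. Here they are simply passed in as fixed values.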

A program for causing a computer to function as each unit of the sentence pair classification apparatus according to claim 1 or claim 2, or of the sentence pair classification learning apparatus according to claim 3.