JP6830602B2

JP6830602B2 - Phrase structure learning device, phrase structure analysis device, method, and program

Info

Publication number: JP6830602B2
Application number: JP2017218449A
Authority: JP
Inventors: 英剛上垣外; 努平尾; 克彦林; 学奥村; 大也高村
Original assignee: Nippon Telegraph and Telephone Corp; Tokyo Institute of Technology NUC
Current assignee: Nippon Telegraph and Telephone Corp; Tokyo Institute of Technology NUC
Priority date: 2017-11-13
Filing date: 2017-11-13
Publication date: 2021-02-17
Anticipated expiration: 2037-11-13
Also published as: JP2019091172A

Description

The present invention relates to a phrase structure learning device, a phrase structure analysis device, a method, and a program, and more particularly to a phrase structure learning device, a phrase structure analysis device, a method, and a program for analyzing the phrase structure of a sentence.

Phrase structure analysis is a technique in which a computer analyzes an input sentence and outputs its phrase structure tree. FIG. 1 shows an example of a phrase structure tree. A phrase structure tree consists of phrases and the hierarchical structure among those phrases.

Each phrase consists of a phrase structure label and the set of words that make up the phrase. That set is the set of words contained in the leaf nodes below the phrase structure label in the phrase structure tree.

Non-Patent Document 1 is one example of a sequence-based phrase structure analysis method using a neural network. It does not assume an explicit tree structure; instead, it analyzes and outputs the phrase structure tree as a sequence of phrase structure labels. In the prior art, including Non-Patent Document 1, the phrase structure tree shown in FIG. 1 is expressed as a sequence such as the one shown in FIG. 2. In a phrase structure tree expressed as a sequence, the words and parts of speech of all leaf nodes are replaced with one and the same label, shown as XX in FIG. 2. (A part of speech is the phrase structure label closest to a word; in FIG. 1 these are WP, VBZ, and JJ.) All labels other than XX are phrase structure labels. A phrase structure tree expressed as such a sequence is hereinafter called a "normalized phrase structure tree". FIG. 3 shows the configuration of a phrase structure analysis method based on the prior art. A sequence-based phrase structure analysis method requires no hand-written rules or transition rules and can output a phrase structure tree in linear time.
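The normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the nested-tuple tree format, the function name `linearize`, the example sentence, and the `(S ... )S` bracket notation are all hypothetical choices made for the example, since the exact notation of FIG. 2 is not reproduced in this text.

```python
def linearize(tree):
    """Return the normalized label sequence for a tree given as nested tuples.

    Assumed format: a leaf is (pos_tag, word) with the word as a string;
    any other node is (label, child, child, ...).
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # The word and its part-of-speech tag collapse into the single label XX.
        return ["XX"]
    seq = ["(" + label]
    for child in children:
        seq.extend(linearize(child))
    seq.append(")" + label)
    return seq


# Hypothetical example tree (not the tree of FIG. 1).
tree = ("S",
        ("NP", ("NNP", "John")),
        ("VP", ("VBZ", "has"), ("NP", ("DT", "a"), ("NN", "dog"))),
        (".", "."))
print(" ".join(linearize(tree)))
# prints: (S (NP XX )NP (VP XX (NP XX XX )NP )VP XX )S
```

Each node is visited once, so the sequence is produced in time linear in the size of the tree.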

[Non-Patent Document 1]: Vinyals, O., Kaiser, L., Koo, T., Petrov, S., Sutskever, I., and Hinton, G. (2015). "Grammar as a foreign language." In Advances in Neural Information Processing Systems (pp. 2773-2781).

In Non-Patent Document 1, the relationship between the input word sequence and the output phrase structure labels is captured by a distribution computed by the attention mechanism, which is included in the phrase structure analyzer and outputs, for each phrase structure label, a weight for each word of the input sentence. FIG. 4 shows an example of the distribution of weights output by the attention mechanism. Each cell represents the correspondence between an input word and an output phrase structure label, and a black cell indicates that the attention mechanism assigned a high probability to that correspondence. During analysis, the attention mechanism judges which parts of the input string are important, and the analyzer uses that judgment to output the analysis result. The distribution output by the attention mechanism is learned without supervision, with no explicit phrase structure labels given for it. Consequently, the attention mechanism does not necessarily learn the correct correspondence between the input string and the output string, and an incorrect correspondence output by the attention mechanism may cause errors in the output phrase structure tree.

The present invention has been made to solve the above problem, and an object thereof is to provide a phrase structure learning device, method, and program capable of generating learning data with which the attention mechanism is trained to output correspondences accurately.

Another object of the present invention is to provide a phrase structure analysis device, method, and program capable of performing phrase structure analysis accurately.

To achieve the above objects, a phrase structure learning device according to the present invention is a device that trains a phrase structure analyzer that outputs a phrase structure label sequence for an input sentence, and includes a learning data generation unit. Based on the input sentence and a phrase structure label sequence consisting of the phrase structure labels of the nodes of the phrase structure tree representing the input sentence, the learning data generation unit generates learning data consisting of correspondences between the words and the phrase structure labels. This learning data is used to train the attention mechanism that is included in the phrase structure analyzer and outputs, for a phrase structure label, a weight for each word of the input sentence.

A phrase structure analysis device according to the present invention includes a phrase structure analysis unit that takes an input sentence as input and outputs a phrase structure label sequence for the input sentence by using a phrase structure analyzer trained in advance, the phrase structure analyzer including an attention mechanism that outputs, for a phrase structure label, a weight for each word of the input sentence. The attention mechanism is trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated based on a learning input sentence and a phrase structure label sequence consisting of the phrase structure labels of the nodes of the phrase structure tree representing the learning input sentence.

A program according to the present invention is a program for causing a computer to function as each unit of the phrase structure learning device or the phrase structure analysis device.

According to the phrase structure learning device, method, and program of the present invention, learning data consisting of correspondences between words and phrase structure labels is generated based on an input sentence and a phrase structure label sequence consisting of the phrase structure labels of the nodes of the phrase structure tree representing the input sentence. This makes it possible to generate learning data with which the attention mechanism, which is included in the phrase structure analyzer and outputs a weight for each word of the input sentence for a phrase structure label, is trained to output correspondences accurately.

According to the phrase structure analysis device, method, and program of the present invention, phrase structure analysis is performed using a phrase structure analyzer that includes an attention mechanism. Because the attention mechanism has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated based on a learning input sentence and a phrase structure label sequence consisting of the phrase structure labels of the nodes of the phrase structure tree representing the learning input sentence, phrase structure analysis can be performed accurately.

FIG. 1 is a diagram showing an example of a phrase structure tree output by phrase structure analysis.
FIG. 2 is a diagram showing an example of the sequence of phrase structure labels of a phrase structure tree.
FIG. 3 is a diagram showing an example of the configuration of a conventional phrase structure analysis method.
FIG. 4 is a diagram showing an example of the distribution of weights output by the attention mechanism 3.
FIG. 5 is a diagram showing an example of the neural network constituting the phrase structure analyzer 240 in the learning phase.
FIG. 6 is a diagram showing an example of the neural network constituting the phrase structure analyzer 40 in the execution phase.
FIG. 7 is a diagram showing the input and output used in the embodiment of the present invention.
FIG. 8 is a block diagram showing the configuration of the phrase structure analysis device 100 according to the embodiment of the present invention.
FIG. 9 is a diagram showing the configuration of the neural network constituting the phrase structure analyzer 40.
FIG. 10 is a diagram showing an example of the details of the encoding unit 1 of the neural network.
FIG. 11 is a diagram showing an example of the details of the decoding unit 2 of the neural network.
FIG. 12 is a diagram showing an example of the details of the attention mechanism 3 of the neural network.
FIG. 13 is a flowchart showing the phrase structure analysis processing routine in the phrase structure analysis device 100 according to the embodiment of the present invention.
FIG. 14 is a diagram showing an example of the details of the processing routine in the neural network of the phrase structure analyzer 40.
FIG. 15 is a block diagram showing the configuration of the phrase structure learning device 200 according to the embodiment of the present invention.
FIG. 16 is a diagram showing an outline of the learning in the embodiment of the present invention.
FIG. 17 is a diagram showing an example of a learning input sentence X and a correct sequence L.
FIG. 18 is a diagram showing an example of the correspondence between the words constituting a learning input sentence and the phrase structure labels.
FIG. 19 is a diagram showing an example of the correspondence between the words constituting a learning input sentence and the phrase structure labels.
FIG. 20 is a diagram showing an example of the learning data obtained as the processing result of the learning data generation unit 230.
FIG. 21 is a flowchart showing the phrase structure learning processing routine in the phrase structure learning device 200 according to the embodiment of the present invention.
FIG. 22 is a flowchart showing the learning data generation processing routine.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

<Overview of the Embodiment of the Present Invention>

First, an overview of the embodiment of the present invention will be given.

In the embodiment of the present invention, in a sequence-based phrase structure analyzer using a neural network such as that of Non-Patent Document 1, the attention mechanism is trained by explicitly giving it the relationship between the words constituting the input sentence and the phrase structure labels to be output.

The learning data for training the attention mechanism is created from the learning data for phrase structure analysis by applying a predetermined rule. Specifically, for each phrase structure label constituting a phrase structure tree in the learning data, the phrase structure label is associated, according to the predetermined rule, with the set of words contained in the leaf nodes below the node of that label. A correspondence between a phrase structure label to be output and the words contained in the phrase headed by that label is defined as a correct correspondence and is used as learning data for training the attention mechanism.
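The association between labels and covered words can be sketched as follows. This is a hedged illustration, not the predetermined rule itself, which is not specified in this passage: the nested-tuple tree format, the function name `align`, and the choices to give the opening and closing label of a phrase the same word span and to tie each `XX` to its own word are assumptions made for the example.

```python
def align(tree, start=0, out=None):
    """Collect (label_token, covered_word_positions) pairs in output order.

    Assumed tree format: a leaf is (pos_tag, word); any other node is
    (label, child, child, ...). Word positions are 0-based.
    Returns (pairs, next_word_position).
    """
    if out is None:
        out = []
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        out.append(("XX", [start]))      # assumption: XX aligns to its own word
        return out, start + 1
    opening = len(out)
    out.append(None)                     # placeholder until the span is known
    pos = start
    for child in children:
        _, pos = align(child, pos, out)
    span = list(range(start, pos))       # words in the leaves below this label
    out[opening] = ("(" + label, span)
    out.append((")" + label, span))
    return out, pos


# Hypothetical two-word example.
pairs, n_words = align(("S", ("NP", ("NNP", "John")), ("VP", ("VBZ", "sleeps"))))
for token, span in pairs:
    print(token, span)
```

Each pair could then be turned into a target distribution over input positions for the attention mechanism, for instance by selecting one position from the span; how that selection is made depends on the rule, which this sketch does not fix.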

Training the attention mechanism with such learning data prevents it from outputting a distribution that contains incorrect correspondences and thereby improves the accuracy of phrase structure analysis. Because the attention mechanism is an internal process of the phrase structure analyzer, separately creating learning data for it would be costly; with this method, the learning data for the attention mechanism can be created from the learning data of the phrase structure analyzer.

The following description of the embodiment is divided into a learning phase and an execution phase. In the learning phase, the parameters of the neural network constituting the phrase structure analyzer 240 shown in FIG. 5 are determined based on the learning data. In the execution phase, the input is processed by the neural network constituting the phrase structure analyzer 40 shown in FIG. 6, as defined in the learning phase, and the output is determined by the learned parameters.

FIG. 7 is used below as an example of the input and output in the present embodiment. The position of each word in the input sentence X is denoted by $i \in \{1, \ldots, n\}$; in this example, $n = 6$. The position of each phrase structure label in the output sequence Y is denoted by $t \in \{1, \ldots, m\}$; in this example, $m = 15$.

<Configuration of the Phrase Structure Analysis Device According to the Embodiment of the Present Invention>

Next, the configuration of the phrase structure analysis device according to the embodiment of the present invention will be described. The phrase structure analysis device handles the execution phase.

The phrase structure analyzer 40 in the embodiment of the present invention takes the sentence $x$ to be processed as input and outputs a phrase structure label $y_t$. It then takes the output $y_t$ as input and outputs $y_{t+1}$, repeating this sequential processing until the output $y_t$ is the end-of-sentence symbol </s>.

As shown in FIG. 8, the phrase structure analysis device 100 according to the embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the phrase structure analysis processing routine described later. Functionally, the phrase structure analysis device 100 includes an input unit 10 and a calculation unit 20, as shown in FIG. 8.

The input unit 10 accepts an input sentence. The input sentence is given as a segmented sentence with a beginning-of-sentence symbol and an end-of-sentence symbol added.

The calculation unit 20 includes a phrase structure analysis unit 30 and a phrase structure analyzer 40.

The phrase structure analyzer 40 is a phrase structure analyzer trained in advance that outputs, in order from the beginning, the phrase structure label sequence for an input sentence consisting of words, and it is configured as a neural network. The phrase structure analyzer 40 includes an attention mechanism 3 that outputs, for a phrase structure label, a weight for each word of the input sentence. The training of the phrase structure analyzer 40 is described later in connection with the phrase structure learning device.

The configuration of the neural network of the phrase structure analyzer 40 is as follows. As shown in FIG. 9, the neural network constituting the phrase structure analyzer 40 includes an encoding unit 1, a decoding unit 2, an attention mechanism 3, and an output unit 4.

The encoding unit 1 converts the input sentence into hidden states.

The decoding unit 2 converts the previously output label $y_{t-1}$ into a hidden state.

The attention mechanism 3 combines the hidden states produced by the encoding unit 1 and the decoding unit 2 and converts them into a weight for each word of the input sentence.

The output unit 4 weights the hidden states of the encoding unit 1 by the per-word weights output by the attention mechanism 3, concatenates the result with the hidden state of the decoding unit 2, and then determines the label to be output.

For one sentence, the encoding unit 1 needs to run only once at the start, whereas the other units repeat their processing $m-1$ times.

Each unit of the neural network constituting the phrase structure analyzer 40 is described in detail below.

FIG. 10 shows the details of the encoding unit 1. The encoding unit 1 receives one sentence as input and converts the sequence of words in the sentence, $x = \{x_1, \ldots, x_i, \ldots, x_n\}$, word by word, mapping each word $x_i$ to a real-valued vector $h_i$ of a predetermined number of dimensions.

Specifically, a forward recurrent neural network first scans the input word sequence forward from $i = 1$ to $n$ and converts the $i$-th input word into a hidden state vector, written here as $\overrightarrow{h}_i$. Similarly, a backward recurrent neural network scans the input word sequence backward from $i = n$ to $1$ and converts the $i$-th input word into a hidden state vector, written here as $\overleftarrow{h}_i$. Finally, $\overrightarrow{h}_i$ and $\overleftarrow{h}_i$ are concatenated to give $h_i$. As a result, $h_i$ depends on the conversion results of the preceding and following words, $h_1, \ldots, h_{i-1}$ and $h_{i+1}, \ldots, h_n$.
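A minimal sketch of such a bidirectional encoder is shown below. It is not the encoder of the embodiment: the plain tanh recurrence (rather than, say, an LSTM cell), the zero initial states, and all names (`rnn_step`, `encode`, `W_f`, `U_f`, `W_b`, `U_b`) are assumptions chosen to keep the example short and dependency-free.

```python
import math


def rnn_step(W, U, x, h):
    """One simple recurrent step: tanh(W x + U h), matrices as lists of rows."""
    return [math.tanh(sum(w * xj for w, xj in zip(W_row, x)) +
                      sum(u * hj for u, hj in zip(U_row, h)))
            for W_row, U_row in zip(W, U)]


def encode(xs, W_f, U_f, W_b, U_b):
    """Return h_1..h_n, each the concatenation of a forward and a backward state."""
    d = len(U_f)
    fwd, h = [], [0.0] * d
    for x in xs:                       # forward scan, i = 1 .. n
        h = rnn_step(W_f, U_f, x, h)
        fwd.append(h)
    bwd, h = [], [0.0] * d
    for x in reversed(xs):             # backward scan, i = n .. 1
        h = rnn_step(W_b, U_b, x, h)
        bwd.append(h)
    bwd.reverse()                      # realign so bwd[i] belongs to word i
    return [f + b for f, b in zip(fwd, bwd)]


# Toy input: three 2-dimensional word vectors, hidden size 1 per direction.
W_f, U_f = [[0.5, -0.5]], [[0.1]]
W_b, U_b = [[0.2, 0.3]], [[0.1]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
H = encode(xs, W_f, U_f, W_b, U_b)
print(len(H), len(H[0]))  # prints: 3 2
```

Because the backward half of each vector is computed from the later words, changing the last word changes even the first encoded vector, which is the dependence on both preceding and following words described above.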

When each word $x_i$ is converted into the forward vector $\overrightarrow{h}_i$ or the backward vector $\overleftarrow{h}_i$, a codebook consisting of (word, word vector) pairs created in advance is used. A word vector represents the features of its paired word as coordinates in a space of a predetermined number of dimensions and is also called a distributed word representation. In the present embodiment, letting W denote all input words (including <s> and </s>), vectors satisfying the following conditions are used; word vectors created in advance by, for example, the method described in Non-Patent Document 2 may be used instead.

Condition 1: The vector has W dimensions in total, with each dimension corresponding to one word.
Condition 2: The vector is a one-hot vector in which the element of the dimension corresponding to the paired word is 1 and the elements of all other dimensions are 0.
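A codebook satisfying the two conditions above can be sketched as follows. The function name `build_codebook` and the toy vocabulary are hypothetical and serve only to illustrate the one-hot construction.

```python
def build_codebook(vocab):
    """Map each item to a one-hot vector with one dimension per item."""
    size = len(vocab)
    return {item: [1 if j == i else 0 for j in range(size)]
            for i, item in enumerate(vocab)}


# Toy vocabulary including the sentence boundary symbols.
words = ["<s>", "John", "sleeps", "</s>"]
codebook = build_codebook(words)
print(codebook["sleeps"])  # prints: [0, 0, 1, 0]
```

Multiplying such a one-hot vector by a weight matrix simply selects one column of that matrix, so the learned weights play the role of a trainable word representation.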

[Non-Patent Document 2]: Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.

In the present embodiment, the value of each element of a word vector is weighted by parameters of the neural network. These parameters are updated through training by the phrase structure learning device described later.

FIG. 11 shows the details of the decoding unit 2. The decoding unit 2 takes as input the phrase structure label $y_{t-1}$ previously output by the output unit 4 and $\mathrm{enc}^{t-1}$, and uses a forward recurrent neural network to convert this input, together with its previous hidden state (written here as $d_{t-1}$), into a real-valued hidden state vector of a predetermined number of dimensions (written here as $d_t$), which it outputs. Because the phrase structure labels are input sequentially, the result of the conversion depends on the set of hidden state vectors obtained for the earlier phrase structure labels, $d_1, \ldots, d_{t-1}$, and on $\mathrm{enc}^{t-1}$, in which the attention mechanism 3 has weighted the hidden state vectors $h_1, \ldots, h_n$ of the encoding unit 1. The details of $\mathrm{enc}^t$ are described later.

When the phrase structure label $y_{t-1}$ is converted into a vector, a codebook consisting of (phrase structure label, phrase structure vector) pairs created in advance is used. Letting V be the total number of phrase structure labels, a vector satisfying the following conditions is used as the phrase structure vector.

Condition 1: The vector has V dimensions in total, with each dimension corresponding to one phrase structure label.
Condition 2: The vector is a one-hot vector in which the element of the dimension corresponding to the paired phrase structure label is 1 and the elements of all other dimensions are 0.

The decoding unit 2 starts processing at $t = 2$. As its inputs at that point, the phrase structure label $y_1$ for $t = 1$ is set to <s>, $\mathrm{enc}^1$ is set to the initial value described later, and the initial state of the decoding unit 2 is set to a hidden state of the encoding unit 1.

FIG. 12 shows the details of the attention mechanism 3. The attention mechanism 3 receives as input the hidden state into which the decoding unit 2 converted the phrase structure label (written here as $d_t$) and the hidden states $h_1, \ldots, h_n$ into which the encoding unit 1 converted the words. Using a neural network, it computes weights $\alpha_i^t$, normalized so that they sum to 1 over all correspondences, and outputs them as the distribution $\alpha^t$. The initial value $\alpha^1$ of the attention mechanism 3 (at $t = 1$) is a vector in which $\alpha_1^1 = 1$ and all other elements are 0.
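The normalization of the weights can be sketched as follows. The scoring function is not given in this passage, so the dot product between the decoder state and each encoder state is purely an assumption, as are the names `attention` and `initial_attention`; only the softmax normalization to a sum of 1 and the initial value follow the description.

```python
import math


def attention(d_t, H):
    """Return weights alpha_i^t over the encoder states H = [h_1, ..., h_n]."""
    # Assumed score: dot product of the decoder state with each encoder state.
    scores = [sum(a * b for a, b in zip(d_t, h)) for h in H]
    top = max(scores)                    # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def initial_attention(n):
    """alpha^1: weight 1 on the first input position, 0 elsewhere."""
    return [1.0] + [0.0] * (n - 1)


alpha = attention([1.0, 0.0], [[2.0, 0.0], [0.0, 5.0], [1.0, 1.0]])
print(alpha.index(max(alpha)))  # prints: 0
```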

The output unit 4 receives the distribution output by the attention mechanism 3, the hidden states $h_1, \ldots, h_n$ into which the encoding unit 1 converted the words, and the hidden state into which the decoding unit 2 converted the phrase structure label (written here as $d_t$), and outputs the output probability of each phrase structure label. First, following the distribution $\alpha^t$ output by the attention mechanism 3, the output unit 4 computes the weighted sum of the conversion results of the encoding unit 1 by equation (1) below.

$$\mathrm{enc}^t = \sum_{i=1}^{n} \alpha_i^t h_i \quad (1)$$

The vector obtained by concatenating the sum of equation (1) with the conversion result of the phrase structure label by the decoding unit 2, $[\mathrm{enc}^t ; d_t]$, is input to a softmax layer to determine the output probability of each phrase structure label. With V phrase structure labels, the output probability $P(y_t \mid x_1, \ldots, x_n, y_1, \ldots, y_{t-1})$ of the phrase structure label $y_t$ is computed by equation (2) below using the weight matrix $W_v$ and the bias term $b$.

$$P(y_t \mid x_1, \ldots, x_n, y_1, \ldots, y_{t-1}) = \mathrm{softmax}\left(W_v [\mathrm{enc}^t ; d_t] + b\right)_{y_t} \quad (2)$$

The weight matrix $W_v$ and the bias term $b$ are parameters of the neural network.
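The two computations of the output unit, the attention-weighted sum of the encoder states followed by a softmax over the concatenated vector, can be sketched as follows. The function names `context` and `label_probs`, the symbol `d_t` for the decoder hidden state, and the toy numbers are assumptions made for the example.

```python
import math


def context(alpha, H):
    """Attention-weighted sum of the encoder states (the role of equation (1))."""
    return [sum(a * h[k] for a, h in zip(alpha, H)) for k in range(len(H[0]))]


def label_probs(alpha, H, d_t, W_v, b):
    """Softmax over W_v z + b, where z concatenates the context with d_t."""
    z = context(alpha, H) + d_t          # list concatenation
    logits = [sum(w * zj for w, zj in zip(row, z)) + bv
              for row, bv in zip(W_v, b)]
    top = max(logits)                    # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: two encoder states, a 1-dimensional decoder state, two labels.
alpha = [0.5, 0.5]
H = [[1.0, 0.0], [0.0, 1.0]]
probs = label_probs(alpha, H, [1.0], [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]], [0.0, 0.0])
print(probs.index(max(probs)))  # prints: 1
```

Taking the index of the largest probability corresponds to choosing the label with the highest output probability at position t.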

The $y_t$ with the highest output probability $P$ is taken as the $t$-th phrase structure label output by the phrase structure analyzer 40.

Through the above processing, the phrase structure labels $y_t$ are output sequentially, and the processing ends when the output $y_t$ is the end-of-sentence symbol </s> (in the present embodiment, when $t = 14$).

As initial values for the output unit, $y_1$ is set to <s>, and $\alpha^1$ uses the initial value of the attention mechanism 3.

This concludes the description of each unit of the neural network constituting the phrase structure analyzer 40.

The phrase structure analysis unit 30 uses the phrase structure analyzer 40, takes the input sentence accepted by the input unit 10 as input, and outputs the phrase structure label sequence for the input sentence. Here, the attention mechanism 3 included in the phrase structure analyzer 40 has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated based on a learning input sentence and a normalized phrase structure label sequence consisting of the phrase structure labels of the nodes of the phrase structure tree representing the learning input sentence. A normalized phrase structure label sequence is, as described for FIG. 2 above, the sequence of phrase structure labels of the phrase structure tree after the words and parts of speech of the leaf nodes have been replaced.

＜本発明の実施の形態に係る句構造解析装置の作用＞ <Operation of the phrase structure analyzer according to the embodiment of the present invention>

次に、本発明の実施の形態に係る句構造解析装置１００の作用について説明する。句構造解析装置１００は、図１３に示す句構造解析処理ルーチンを実行する。 Next, the operation of the phrase structure analysis device 100 according to the embodiment of the present invention will be described. The phrase structure analysis device 100 executes the phrase structure analysis processing routine shown in FIG.

まず、ステップＳ１００では、入力部１０において入力文を受け付ける。 First, in step S100, the input unit 10 receives an input sentence.

ステップＳ１０２では、句構造解析器４０を用いて、入力部１０で受け付けた入力文を入力とし、入力文に対する句構造ラベル列を出力する。 In step S102, the phrase structure analyzer 40 is used to input the input sentence received by the input unit 10, and output the phrase structure label string for the input sentence.

次に、句構造解析器４０のニューラルネットワークにおける処理ルーチンの詳細について図１４を参照して説明する。 Next, the details of the processing routine in the neural network of the phrase structure analyzer 40 will be described with reference to FIG.

In step S1000, t is set to 1.

In step S1002, the encoding unit 1 receives the input sentence and converts it into hidden states.

In step S1004, initial values are set for each part of the neural network.

In step S1006, t is set to 2.

In step S1008, the decoding unit 2 converts the initial label, or the previously output label y_{t-1}, into a hidden state.

In step S1010, the attention mechanism 3 combines the hidden states produced by the encoding unit 1 and the decoding unit 2 and converts them into a weight for each word of the input sentence.

In step S1012, the output unit 4 weights the hidden states of the encoding unit 1 by the per-word weights output by the attention mechanism 3, combines the result with the hidden state of the decoding unit 2, and determines the label to output.

In step S1014, it is determined whether the output of step S1012 is the end-of-sentence symbol </s>. If it is, the process ends. If it is not, t is counted up in step S1016 and the process is repeated. Counting up t is written as t = t + 1.
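The loop of steps S1008 to S1016 can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that are not stated in the description: the attention score is taken to be a dot product followed by a softmax, the decoder and the output unit are passed in as hypothetical callables (`decoder_step`, `choose_label`), the encoder hidden states are assumed to be already computed (step S1002), and `max_steps` is an added safety bound.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def greedy_decode(enc_states: List[Vector],
                  decoder_step: Callable[[str], Vector],
                  choose_label: Callable[[Vector, Vector], str],
                  max_steps: int = 100) -> List[str]:
    """Emit labels one at a time until </s> is produced (or max_steps is hit)."""
    labels: List[str] = []
    prev = "<s>"  # initial label (assumed)
    dim = len(enc_states[0])
    for _ in range(max_steps):
        # S1008: previous label -> decoder hidden state
        dec_state = decoder_step(prev)
        # S1010: one weight per input word (dot-product scoring is an assumption)
        weights = softmax([dot(h, dec_state) for h in enc_states])
        # S1012: weighted sum of encoder states, then pick the label
        context = [sum(w * h[k] for w, h in zip(weights, enc_states))
                   for k in range(dim)]
        label = choose_label(context, dec_state)
        labels.append(label)
        # S1014: stop on the end-of-sentence symbol
        if label == "</s>":
            break
        prev = label  # S1016: move on to the next position
    return labels
```

In a real system `decoder_step` and `choose_label` would be the trained decoding unit 2 and output unit 4; here they are placeholders so that the control flow can be read in isolation.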

As described above, the phrase structure analysis device 100 according to the embodiment of the present invention performs phrase structure analysis using the phrase structure analyzer 40 including the attention mechanism 3. Because the attention mechanism 3 has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated from a learning input sentence and a normalized phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing that sentence, phrase structure analysis can be performed with high accuracy.

<Configuration of the phrase structure learning device according to the embodiment of the present invention>

Next, the configuration of the phrase structure learning device according to the embodiment of the present invention will be described. The phrase structure learning device handles the learning phase. In the present embodiment, the encoding unit 1, the decoding unit 2, the attention mechanism 3, and the output unit 4 of the neural network are trained simultaneously, but they may also be trained separately.

As shown in FIG. 15, the phrase structure learning device 200 according to the embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the phrase structure learning processing routine described later. Functionally, the phrase structure learning device 200 includes an input unit 210 and a calculation unit 220, as shown in FIG. 15.

FIG. 16 shows an overview of learning in the present embodiment. The processing in FIG. 16 is divided into the learning data generation process of the learning data generation unit and the learning process of the neural network.

The input unit 210 receives a learning input sentence and a normalized phrase structure tree. The input to the learning phase is assumed to be a collection of multiple pairs, each consisting of a sentence to be analyzed and the phrase structure label sequence that is the correct analysis result for it. In the present embodiment, the pair of the learning input sentence X and the correct-answer sequence L (normalized phrase structure tree) shown in FIG. 17 is used as an example of one set of learning data. Normalization here means, as described at the beginning of this document, replacing the words and parts of speech of all leaf nodes with one and the same label such as XX. A normalized phrase structure tree is a phrase structure tree that is expressed as a sequence and has been normalized in this way.
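To make the normalization concrete, the following Python sketch linearizes a small tree into a normalized label sequence. The tree encoding (nested tuples with `(POS, word)` preterminals), the bracket spelling `(S` / `)S`, and the example sentence are assumptions made for illustration; they are not taken from FIG. 17.

```python
from typing import List, Tuple


def linearize(node: Tuple) -> List[str]:
    """Flatten a tree into bracket labels, collapsing each POS tag and word to XX."""
    label, *children = node
    # A preterminal such as ("NN", "dog") holds exactly one word string.
    if len(children) == 1 and isinstance(children[0], str):
        return ["XX"]
    out = ["(" + label]
    for child in children:
        out.extend(linearize(child))
    out.append(")" + label)
    return out


def normalize(tree: Tuple) -> List[str]:
    """Add the start and end symbols around the linearized tree."""
    return ["<s>"] + linearize(tree) + ["</s>"]


# Hypothetical example tree, not the one in the figures.
tree = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBZ", "barks")))
print(normalize(tree))
```

Each word contributes exactly one XX, so the number of XX labels in the sequence equals the number of words in the sentence.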

Let i denote the position of each word in the learning input sentence X, with i = 1, …, n; in the present embodiment, n = 6. Let t denote the position of each phrase structure label in the correct-answer sequence L, with t = 1, …, m; in the present embodiment, m = 15.

The calculation unit 220 includes a learning data generation unit 230, a learning unit 232, and a phrase structure analyzer 240. The phrase structure analyzer 240 is the same as the phrase structure analyzer 40 of the phrase structure analysis device 100; the analyzer trained by the phrase structure learning device 200 is referred to as the phrase structure analyzer 240.

Based on a learning input sentence and a normalized phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing that sentence, the learning data generation unit 230 generates learning data consisting of correspondences between words and phrase structure labels. This learning data is used to train the attention mechanism 3, which is included in the phrase structure analyzer and outputs a weight of each word of the learning input sentence with respect to a phrase structure label.

The method by which the learning data generation unit 230 creates the correct α_i^t is described in detail below.

The learning data generation unit 230 outputs the correct association α based on the learning input sentence X and the normalized phrase structure tree L. α_i^t is a variable that takes the value 1 when a correspondence exists between the word x_i and the phrase structure label l_t, and 0 otherwise.

To obtain the correct α_i^t, the learning data generation unit 230 associates the words making up the learning input sentence with the phrase structure labels making up the normalized phrase structure tree. Examples of the association are shown in FIGS. 18 and 19.

In the correct-answer sequence L, the non-terminal symbol XX, which represents the presence of a word and its part of speech, is associated with the corresponding word in the learning input sentence X. The symbols <s> and </s> marking the beginning and end of L are associated with <s> and </s> in the input sentence X, respectively. For every other phrase structure label in L, the associated word is selected from the set of words of the learning input sentence contained in the phrase structure tree rooted at that label when a tree structure is assumed, that is, from the words at the leaf nodes below the node of that label. One of the following methods is used for the selection.

1: Select the word that is the head of the phrase structure tree rooted at the phrase structure label.
2: Select the leftmost word in the set of words contained in the phrase structure tree rooted at the phrase structure label.
3: Select the rightmost word in the set of words contained in the phrase structure tree rooted at the phrase structure label.
4: If the phrase structure label contains "(", select the leftmost word in the set of words making up the phrase rooted at that label. If the phrase structure label contains ")", select the rightmost word in the set of words contained in the phrase structure tree rooted at that label.

Method 1 requires that information on which word is the head of each phrase structure tree be attached to the learning data in advance, either manually using head rules or the like, or through a separate analysis. Method 4 is a combination of methods 2 and 3.
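The four selection methods can be summarized in a short Python sketch. The function name `select_word`, its arguments, and the representation of a phrase as a list of word positions are hypothetical choices for illustration. For method 1 the head position must be supplied by the caller, since the description leaves head identification to head rules or a separate analysis.

```python
from typing import List, Optional


def select_word(label: str, span: List[int], method: int,
                head: Optional[int] = None) -> int:
    """Return the position of the word to associate with a bracket label.

    span: positions of the words under the node of this label.
    head: position of the head word (needed only for method 1).
    """
    if method == 1:
        if head is None:
            raise ValueError("method 1 needs a head position supplied from outside")
        return head
    if method == 2:
        return min(span)  # leftmost word
    if method == 3:
        return max(span)  # rightmost word
    if method == 4:
        # opening bracket -> leftmost, closing bracket -> rightmost
        return min(span) if "(" in label else max(span)
    raise ValueError("method must be 1, 2, 3 or 4")
```

For a phrase covering positions 1 and 2, method 4 returns 1 for the label `(NP` and 2 for the label `)NP`, so the opening and closing labels of the same phrase point at its two edges.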

In this way, the learning data generation unit 230 generates learning data by associating, with each phrase structure label in the normalized phrase structure label string, a word of the learning input sentence that is a leaf node below the node of that label.

In the present embodiment, method 4 above is used; the result is shown in FIG. 20.

Based on the learning data generated by the learning data generation unit 230, the learning unit 232 trains the phrase structure analyzer 240 including the attention mechanism 3, using the correct phrase structure label l_t as the correct-answer data for the output y_t of the output unit. Any method commonly used for training neural networks may be used. In the present embodiment, the learnable parameters are optimized by stochastic gradient descent.

The attention mechanism 3 takes as input the hidden state vectors corresponding to the words of the learning input sentence and the hidden state vector for the phrase structure label output immediately before, and is trained so that the estimated α_i^t it outputs becomes equal to the correct α_i^t created by the learning data generation unit 230. Any method commonly used for training neural networks may be used. In the present embodiment, the learnable parameters are optimized by stochastic gradient descent.
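The description only states that the estimated α_i^t should become equal to the correct α_i^t and does not name a loss function. One common choice, assumed here purely for illustration, is a cross-entropy between the correct 0/1 association and the attention weights. The sketch below uses hypothetical names (`attention_loss`, `gold`, `pred`), with rows indexed by label position t and columns by word position i.

```python
import math
from typing import List


def attention_loss(gold: List[List[int]], pred: List[List[float]],
                   eps: float = 1e-12) -> float:
    """Cross-entropy between the correct alignment and the attention weights.

    gold[t][i] is 1 if label t is associated with word i, else 0.
    pred[t][i] is the attention weight of word i when label t is output.
    This loss is an assumption; the source does not specify one.
    """
    loss = 0.0
    for gold_row, pred_row in zip(gold, pred):
        for g, p in zip(gold_row, pred_row):
            if g:
                # eps guards against log(0)
                loss -= math.log(p + eps)
    return loss
```

The loss is near zero when all attention mass sits on the associated word and grows as mass moves elsewhere, so minimizing it with stochastic gradient descent pushes the estimate toward the correct association.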

<Operation of the phrase structure learning device according to the embodiment of the present invention>

Next, the operation of the phrase structure learning device 200 according to the embodiment of the present invention will be described. The phrase structure learning device 200 executes the phrase structure learning processing routine shown in FIG. 21.

First, in step S200, a learning input sentence and a normalized phrase structure tree are received.

In step S202, learning data consisting of correspondences between words and phrase structure labels is generated from the learning input sentence and a normalized phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing that sentence. This learning data is used to train the attention mechanism 3, which is included in the phrase structure analyzer 240 and outputs a weight of each word of the learning input sentence with respect to a phrase structure label.

In step S204, based on the learning data generated by the learning data generation unit 230, the phrase structure analyzer 240 including the attention mechanism 3 is trained using the correct phrase structure label l_t as the correct-answer data for the output y_t of the output unit, and the process ends. The processing routine in the neural network of the phrase structure analyzer 240 is the same as that in FIG. 14, so its description is omitted.

Next, the details of the learning data generation processing routine in step S202 will be described with reference to FIG. 22.

In step S2000, t is set to 1.

In step S2002, i is set to 1.

In step S2004, it is determined whether the phrase structure label l_t of the normalized phrase structure tree is <s> (or </s>). If so, the process proceeds to step S2006; otherwise, it proceeds to step S2008.

In step S2006, it is determined whether the word x_i of the learning input sentence is <s> (or </s>). If so, the process proceeds to step S2018; otherwise, it proceeds to step S2020.

In step S2008, it is determined whether l_t of the normalized phrase structure tree is XX. If so, the process proceeds to step S2010; otherwise, it proceeds to step S2012.

In step S2010, it is determined whether the word x_i of the learning input sentence corresponds to the XX at l_t. If so, the process proceeds to step S2018; otherwise, it proceeds to step S2020.

In step S2012, it is determined whether l_t contains "(". If it does, the process proceeds to step S2014; if it does not, the process proceeds to step S2016.

In step S2014, it is determined whether the word x_i of the learning input sentence is contained in the phrase structure tree rooted at l_t and is the leftmost such word, consistent with method 4 above. If so, the process proceeds to step S2018; otherwise, it proceeds to step S2020.

In step S2016, it is determined whether the word x_i of the learning input sentence is contained in the phrase structure tree rooted at l_t and is the rightmost such word, consistent with method 4 above. If so, the process proceeds to step S2018; otherwise, it proceeds to step S2020.

In step S2018, α_i^t is set to 1.

In step S2020, α_i^t is set to 0.

In step S2022, i is counted up. Counting up i is written as i = i + 1.

In step S2024, it is determined whether i > n. If so, the process proceeds to step S2026; otherwise, it returns to step S2004.

In step S2026, it is determined whether t < m. If so, t is counted up (t = t + 1) and the process returns to step S2002; otherwise, the process ends.
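The routine above fills an m × n table of 0/1 values. A compact Python sketch of the same result under method 4 is given below. It is not a transcription of FIG. 22: instead of testing every (t, i) pair against a tree, it makes one pass over the label sequence and relies on two assumptions, namely that the bracket sequence is well formed and that every phrase covers at least one word. Under those assumptions, the leftmost word of an opening bracket is the next unread word, and the rightmost word of a closing bracket is the last word read. The function name, 0-based indexing, and example data are hypothetical.

```python
from typing import List


def build_alignment(words: List[str], labels: List[str]) -> List[List[int]]:
    """Return alpha with alpha[t][i] = 1 if label t is associated with word i."""
    # Positions of real words, skipping the sentence boundary symbols.
    content = [i for i, w in enumerate(words) if w not in ("<s>", "</s>")]
    target: List[int] = []
    consumed = 0  # number of XX labels seen so far
    for label in labels:
        if label in ("<s>", "</s>"):
            # boundary symbols map to the same symbol in the sentence
            target.append(words.index(label))
        elif label == "XX":
            target.append(content[consumed])
            consumed += 1
        elif "(" in label:
            # opening bracket: leftmost word of the phrase = next unread word
            target.append(content[consumed])
        else:
            # closing bracket: rightmost word of the phrase = last word read
            target.append(content[consumed - 1])
    # Set 1 for the associated word and 0 for all others (cf. S2018 / S2020).
    return [[1 if i == pos else 0 for i in range(len(words))]
            for pos in target]


# Hypothetical example, not the data of FIG. 17.
words = ["<s>", "the", "dog", "barks", "</s>"]
labels = ["<s>", "(S", "(NP", "XX", "XX", ")NP", "(VP", "XX", ")VP", ")S", "</s>"]
alpha = build_alignment(words, labels)
```

Each row of `alpha` contains exactly one 1, reflecting that every label is associated with a single word under method 4.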

As described above, the phrase structure learning device 200 according to the embodiment of the present invention generates learning data consisting of correspondences between words and phrase structure labels from an input sentence and a normalized phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing that sentence. This learning data is used to train the attention mechanism 3, which is included in the phrase structure analyzer 240 and outputs a weight of each word of the input sentence with respect to a phrase structure label. Using this learning data for the attention mechanism 3, a phrase structure analyzer 240 that analyzes phrase structure with high accuracy can be trained.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, the embodiment above describes the case in which the phrase structure learning device 200 generates the learning data with the learning data generation unit 230 and trains the phrase structure analyzer 240 including the attention mechanism 3 with the learning unit 232. The invention is not limited to this; the generation of learning data by the learning data generation unit 230 and the training of the phrase structure analyzer 240 by the learning unit 232 may each be realized by a separate device.

1 Encoding unit
2 Decoding unit
3 Attention mechanism
4 Output unit
10, 210 Input unit
20, 220 Calculation unit
30 Phrase structure analysis unit
40, 240 Phrase structure analyzer
100 Phrase structure analysis device
200 Phrase structure learning device
230 Learning data generation unit
232 Learning unit

Claims

A phrase structure learning device that trains a phrase structure analyzer that outputs a phrase structure label string for an input sentence, the device including:
a learning data generation unit that, based on the input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the input sentence, generates learning data consisting of correspondences between the words and the phrase structure labels, the learning data being for training an attention mechanism that is included in the phrase structure analyzer and outputs the weight of each word of the input sentence with respect to a phrase structure label.

The phrase structure learning device according to claim 1, wherein the learning data generation unit associates, with each of the phrase structure labels included in the phrase structure label string, a word of the input sentence that is a leaf node below the node of that phrase structure label.

The phrase structure learning device according to claim 1 or 2, wherein
the phrase structure analyzer outputs the phrase structure labels in order from the beginning,
the attention mechanism takes as input each of the hidden state vectors corresponding to the words of the input sentence and the hidden state vector for the phrase structure label output immediately before, and outputs the weight of each word of the input sentence with respect to the phrase structure label, and
the device further includes a learning unit that trains the attention mechanism based on the generated learning data.

A phrase structure analysis device including:
a phrase structure analysis unit that, using a pre-trained phrase structure analyzer that outputs a phrase structure label string for an input sentence and includes an attention mechanism that outputs the weight of each word of the input sentence with respect to a phrase structure label, takes the input sentence as input and outputs the phrase structure label string for the input sentence,
wherein the attention mechanism has been trained in advance based on learning data consisting of correspondences between words and phrase structure labels, the learning data being generated based on a learning input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the learning input sentence.

A phrase structure learning method in a phrase structure learning device that trains a phrase structure analyzer that outputs a phrase structure label string for an input sentence, the method including:
a step of generating, based on the input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the input sentence, learning data consisting of correspondences between the words and the phrase structure labels, the learning data being for training an attention mechanism that is included in the phrase structure analyzer and outputs the weight of each word of the input sentence with respect to a phrase structure label.

A phrase structure analysis method including:
a step of, using a pre-trained phrase structure analyzer that outputs a phrase structure label string for an input sentence and includes an attention mechanism that outputs the weight of each word of the input sentence with respect to a phrase structure label, taking the input sentence as input and outputting the phrase structure label string for the input sentence,
wherein the attention mechanism has been trained in advance based on learning data consisting of correspondences between words and phrase structure labels, the learning data being generated based on a learning input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the learning input sentence.

A program for causing a computer to function as the phrase structure learning device according to any one of claims 1 to 3.

A program for causing a computer to function as the phrase structure analysis device according to claim 4.