JP2019091172A

JP2019091172A - Phrase structure learning device, phrase structure analysis device and method, and program

Info

Publication number: JP2019091172A
Application number: JP2017218449A
Authority: JP
Inventors: Hidetaka Kamigaito; Tsutomu Hirao; Katsuhiko Hayashi; Manabu Okumura; Daiya Takamura
Original assignee: Nippon Telegraph and Telephone Corp; Tokyo Institute of Technology NUC
Current assignee: Nippon Telegraph and Telephone Corp; Tokyo Institute of Technology NUC
Priority date: 2017-11-13
Filing date: 2017-11-13
Publication date: 2019-06-13
Anticipated expiration: 2037-11-13
Also published as: JP6830602B2

Abstract

PROBLEM TO BE SOLVED: To provide a phrase structure learning device, a phrase structure analysis device and method, and a program that generate learning data for training an attention mechanism to output correspondences accurately. SOLUTION: A phrase structure analyzer includes an attention mechanism that outputs, for the phrase structure label of each node of the phrase structure tree representing an input sentence, a weight for each word of the input sentence. So that this attention mechanism can be trained to output correspondences accurately, a phrase structure learning device 200 generates learning data for training the attention mechanism, consisting of correspondences between words and phrase structure labels, on the basis of the input sentence and a phrase structure label string composed of the phrase structure labels. SELECTED DRAWING: Figure 15

Description

The present invention relates to a phrase structure learning device, method, and program, and more particularly to a phrase structure learning device, a phrase structure analysis device, a method, and a program for analyzing the phrase structure of a sentence.

Phrase structure analysis is a technique in which a computer analyzes an input sentence and outputs its phrase structure tree. FIG. 1 shows an example of a phrase structure tree. A phrase structure tree consists of phrases and the hierarchical structure among those phrases.

Each phrase consists of a phrase structure label and the set of words that make up the phrase. That word set is the set of words contained in the leaf nodes below the phrase structure label in the phrase structure tree.

Non-Patent Document 1 is an example of a sequence-based phrase structure analysis method using a neural network. Non-Patent Document 1 does not assume an explicit tree structure; it analyzes and outputs the phrase structure tree as a sequence of phrase structure labels. In the prior art, including Non-Patent Document 1, the phrase structure tree shown in FIG. 1 is expressed as a sequence such as the one shown in FIG. 2. In a phrase structure tree expressed as a sequence, the words and parts of speech of all leaf nodes (a part of speech being the phrase structure label closest to a word; WP, VBZ, and JJ in FIG. 1) are all replaced with one identical label, shown as XX in FIG. 2. All labels other than XX are phrase structure labels. A phrase structure tree expressed as such a sequence is hereinafter called a "normalized phrase structure tree". FIG. 3 shows the configuration of a phrase structure analysis method based on the prior art. A sequence-based phrase structure analysis method requires no hand-written rules or transition rules and can output a phrase structure tree in linear time.
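As an illustration of the normalization described above, the following Python sketch linearizes a small tree into a label sequence. The example tree, the nested-tuple encoding, the function name `linearize`, and the bracket style `(S ... )S` are assumptions made for this example (the bracket style follows the convention of Non-Patent Document 1); the exact token format of FIG. 2 is not reproduced here.

```python
def linearize(tree):
    """Return the normalized label sequence of a constituency tree.

    A phrase node is (label, [children]); a leaf is (pos_tag, word).
    """
    label, children = tree
    if isinstance(children, str):
        # Leaf: the part-of-speech tag and its word collapse to the single label XX.
        return ["XX"]
    sequence = ["(" + label]
    for child in children:
        sequence.extend(linearize(child))
    sequence.append(")" + label)
    return sequence


# Hypothetical tree using the part-of-speech tags WP, VBZ and JJ.
example_tree = ("S", [
    ("NP", [("WP", "What")]),
    ("VP", [("VBZ", "is"), ("ADJP", [("JJ", "new")])]),
])
normalized = linearize(example_tree)
print(" ".join(normalized))
```

The sequence length grows linearly with the number of nodes, which is consistent with the linear-time property stated above.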

[Non-Patent Document 1]: Vinyals, O., Kaiser, L., Koo, T., Petrov, S., Sutskever, I., and Hinton, G. (2015). "Grammar as a foreign language." In Advances in Neural Information Processing Systems (pp. 2773-2781).

In Non-Patent Document 1, the relationship between the input word string and the output phrase structure labels is captured by a distribution computed by an attention mechanism, which is included in the phrase structure analyzer and outputs a weight for each word of the input sentence with respect to a phrase structure label. FIG. 4 shows an example of the weight distributions output by the attention mechanism. Each cell represents a correspondence between an input word and an output phrase structure label, and a black cell indicates that the attention mechanism assigned a high probability to that correspondence. During analysis, the attention mechanism judges which strings in the input string are important, and the analyzer uses that judgment to output the analysis result. The distribution output by the attention mechanism is learned without supervision, with no explicit phrase structure labels given for it. The attention mechanism therefore does not necessarily learn the correspondence between the input string and the output string correctly. As a result, incorrect correspondences output by the attention mechanism may cause errors in the output phrase structure tree.

The present invention has been made to solve the above problem, and an object thereof is to provide a phrase structure learning device, method, and program capable of generating learning data with which the attention mechanism learns to output correspondences accurately.

Another object of the present invention, likewise made to solve the above problem, is to provide a phrase structure analysis device, method, and program capable of performing phrase structure analysis accurately.

To achieve the above objects, a phrase structure learning device according to the present invention is a phrase structure learning device that trains a phrase structure analyzer that outputs a phrase structure label string for an input sentence. The device includes a learning data generation unit that, on the basis of the input sentence and a phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing the input sentence, generates learning data consisting of correspondences between the words and the phrase structure labels. The learning data is used to train an attention mechanism that is included in the phrase structure analyzer and outputs a weight for each word of the input sentence with respect to a phrase structure label.

A phrase structure analysis device according to the present invention includes a phrase structure analysis unit that takes an input sentence as input and outputs a phrase structure label string for the input sentence by using a pre-trained phrase structure analyzer. The phrase structure analyzer outputs a phrase structure label string for an input sentence and includes an attention mechanism that outputs a weight for each word of the input sentence with respect to a phrase structure label. The attention mechanism is characterized in that it has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated on the basis of a learning input sentence and a phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing the learning input sentence.

A program according to the present invention is a program for causing a computer to function as each unit of the phrase structure learning device or the phrase structure analysis device.

According to the phrase structure learning device, method, and program of the present invention, learning data consisting of correspondences between words and phrase structure labels is generated on the basis of an input sentence and a phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing the input sentence. This learning data is used to train the attention mechanism that is included in the phrase structure analyzer and outputs a weight for each word of the input sentence with respect to a phrase structure label. The effect is that learning data can be generated with which the attention mechanism learns to output correspondences accurately.

According to the phrase structure analysis device, method, and program of the present invention, phrase structure analysis is performed using a phrase structure analyzer that includes an attention mechanism. Because that attention mechanism has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated on the basis of a learning input sentence and a phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing the learning input sentence, phrase structure analysis can be performed accurately.

FIG. 1 is a diagram showing an example of a phrase structure tree output by phrase structure analysis.
FIG. 2 is a diagram showing an example of the sequence of phrase structure labels of a phrase structure tree.
FIG. 3 is a diagram showing an example of the configuration of a conventional phrase structure analysis method.
FIG. 4 is a diagram showing an example of the weight distributions output by the attention mechanism 3.
FIG. 5 is a diagram showing an example of the neural network constituting the phrase structure analyzer 240 in the learning phase.
FIG. 6 is a diagram showing an example of the neural network constituting the phrase structure analyzer 40 in the execution phase.
FIG. 7 is a diagram showing the input and output used in the embodiment of the present invention.
FIG. 8 is a block diagram showing the configuration of the phrase structure analysis device 100 according to the embodiment of the present invention.
FIG. 9 is a diagram showing the configuration of the neural network constituting the phrase structure analyzer 40.
FIG. 10 is a diagram showing an example of the details of the encoding unit 1 of the neural network.
FIG. 11 is a diagram showing an example of the details of the decoding unit 2 of the neural network.
FIG. 12 is a diagram showing an example of the details of the attention mechanism 3 of the neural network.
FIG. 13 is a flowchart showing the phrase structure analysis processing routine in the phrase structure analysis device 100 according to the embodiment of the present invention.
FIG. 14 is a diagram showing an example of the details of the processing routine in the neural network of the phrase structure analyzer 40.
FIG. 15 is a block diagram showing the configuration of the phrase structure learning device 200 according to the embodiment of the present invention.
FIG. 16 is a diagram showing an outline of the learning in the embodiment of the present invention.
FIG. 17 is a diagram showing an example of a learning input sentence X and a correct sequence L.
FIG. 18 is a diagram showing an example of the correspondence between the words of a learning input sentence and phrase structure labels.
FIG. 19 is a diagram showing an example of the correspondence between the words of a learning input sentence and phrase structure labels.
FIG. 20 is a diagram showing an example of the learning data obtained as the processing result of the learning data generation unit 230.
FIG. 21 is a flowchart showing the phrase structure learning processing routine in the phrase structure learning device 200 according to the embodiment of the present invention.
FIG. 22 is a flowchart showing the learning data generation processing routine.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

<Overview of Embodiment of the Present Invention>

First, an overview of the embodiment of the present invention will be given.

In the embodiment of the present invention, in a sequence-based phrase structure analyzer using a neural network, such as that of Non-Patent Document 1, the attention mechanism is trained by explicitly giving it the relationship between the words that make up the input sentence and the phrase structure labels to be output.

The learning data for training the attention mechanism is created from the learning data for phrase structure analysis by applying a predetermined rule. Specifically, for each phrase structure label in the phrase structure tree of the learning data, the phrase structure label is associated, according to the predetermined rule, with the set of words contained in the leaf nodes below the node of that label. A correspondence between a phrase structure label to be output and a word contained in the phrase headed by that label is defined as a correct correspondence, and such correspondences serve as the learning data for training the attention mechanism.
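The association between each phrase structure label and the words below it can be sketched as follows. This is a minimal illustration, not the predetermined rule itself, which the passage above does not spell out: the nested-tuple tree encoding, the example tree, and the function name `label_spans` are assumptions, and the sketch simply lists every word index found under each label.

```python
def label_spans(tree, start=0):
    """Return (next_word_index, records) for a constituency tree.

    A phrase node is (label, [children]); a leaf is (pos_tag, word).
    Each record is (label, [indices of the words under that label]),
    listed in the order the labels open (pre-order).
    """
    label, children = tree
    if isinstance(children, str):
        return start + 1, []              # a leaf consumes one word, yields no record
    position, below = start, []
    for child in children:
        position, records = label_spans(child, position)
        below.extend(records)
    return position, [(label, list(range(start, position)))] + below


# Hypothetical tree over the words "What is new" (indices 0, 1, 2).
example_tree = ("S", [
    ("NP", [("WP", "What")]),
    ("VP", [("VBZ", "is"), ("ADJP", [("JJ", "new")])]),
])
word_count, spans = label_spans(example_tree)
for phrase_label, indices in spans:
    print(phrase_label, indices)
```

A supervision signal for the attention mechanism could then be derived from these index sets, for example by selecting one or more indices per label according to whatever rule is adopted.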

Training the attention mechanism with such learning data prevents it from outputting distributions that contain incorrect correspondences and thereby improves the accuracy of phrase structure analysis. In addition, the attention mechanism is an internal process of the phrase structure analyzer, and creating learning data for it separately would be costly; with the present method, the learning data for the attention mechanism can be created from the learning data of the phrase structure analyzer.

The following description of the embodiment is divided into a learning phase and an execution phase. In the learning phase, the parameters of the neural network constituting the phrase structure analyzer 240 shown in FIG. 5 are determined on the basis of the learning data. In the execution phase, input is processed by the neural network constituting the phrase structure analyzer 40 shown in FIG. 6, which was defined in the learning phase, and the output is determined by the learned parameters.

FIG. 7 is used below as an example of the input and output in the present embodiment. Let i denote the position of each word in the input sentence X, with i = {1, ..., n}; in this example n = 6. Let t denote the position of each phrase structure label in the output sequence Y, with t = {1, ..., m}; in this example m = 15.

<Configuration of Phrase Structure Analysis Device According to Embodiment of the Present Invention>

Next, the configuration of the phrase structure analysis device according to the embodiment of the present invention will be described. The phrase structure analysis device handles the execution phase.

The phrase structure analyzer 40 in the embodiment of the present invention takes a sentence x to be processed as input and outputs a phrase structure label $y_t$. It then takes the output $y_t$ as input and outputs $y_{t+1}$, repeating this step-by-step processing until the output $y_t$ is the end-of-sentence symbol </s>, at which point processing ends.

As shown in FIG. 8, the phrase structure analysis device 100 according to the embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the phrase structure analysis processing routine described later. Functionally, the phrase structure analysis device 100 includes an input unit 10 and an operation unit 20, as shown in FIG. 8.

The input unit 10 receives an input sentence. The input sentence is supplied as a sentence that has been segmented, with a beginning-of-sentence symbol and an end-of-sentence symbol attached.

The operation unit 20 includes a phrase structure analysis unit 30 and a phrase structure analyzer 40.

The phrase structure analyzer 40 is a pre-trained phrase structure analyzer, constituted by a neural network, that outputs the phrase structure label string for an input sentence of words one label at a time from the beginning. It includes an attention mechanism 3 that outputs a weight for each word of the input sentence with respect to a phrase structure label. Training of the phrase structure analyzer 40 is described later in connection with the phrase structure learning device.

The configuration of the neural network of the phrase structure analyzer 40 is as follows. As shown in FIG. 9, the neural network constituting the phrase structure analyzer 40 includes an encoding unit 1, a decoding unit 2, an attention mechanism 3, and an output unit 4.

The encoding unit 1 converts the input sentence into hidden states.

The decoding unit 2 converts the previously output label $y_{t-1}$ into a hidden state.

The attention mechanism 3 combines the hidden states produced by the encoding unit 1 and the decoding unit 2, performs weighting, and converts them into a weight for each word of the input sentence.

The output unit 4 weights the hidden states of the encoding unit 1 by the per-word weights output by the attention mechanism 3, combines the result with the hidden state of the decoding unit 2, and determines the label to output.

For each sentence, the encoding unit 1 needs to run only once at the start, whereas the other units repeat their processing m-1 times.

Each unit of the neural network constituting the phrase structure analyzer 40 is described in detail below.

FIG. 10 shows the details of the encoding unit 1. The encoding unit 1 receives one sentence as input and converts the sequence of words in that sentence, $x = \{x_1, \dots, x_i, \dots, x_n\}$, word by word, turning each word $x_i$ into a real-valued vector $h_i$ of a predetermined dimension.

Specifically, a forward recurrent neural network that scans the input word sequence from i = 1 to n first converts the i-th input word into a hidden state vector $\overrightarrow{h}_i$. Similarly, a backward recurrent neural network that scans the input word sequence from i = n to 1 converts the i-th input word into a hidden state vector $\overleftarrow{h}_i$. Finally, $\overrightarrow{h}_i$ and $\overleftarrow{h}_i$ are concatenated to give $h_i$. As a result of this processing, $h_i$ depends on the conversion results of the preceding and following words, $h_1, \dots, h_{i-1}$ and $h_{i+1}, \dots, h_n$.
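The forward and backward scans can be sketched in plain Python as below. This is only an illustration under stated assumptions: a simple tanh recurrence stands in for whatever recurrent unit the embodiment uses, and the weight values, the hidden size of 2, and the names `rnn_step` and `encode` are invented for the example.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h):
    """One step of a plain tanh RNN: h' = tanh(W_in x + W_rec h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


def encode(xs, fwd, bwd, hidden_size):
    """Return h_i = [forward state ; backward state] for every input vector."""
    h, forward = [0.0] * hidden_size, []
    for x in xs:                          # scan i = 1 .. n
        h = rnn_step(*fwd, x, h)
        forward.append(h)
    h, backward = [0.0] * hidden_size, []
    for x in reversed(xs):                # scan i = n .. 1
        h = rnn_step(*bwd, x, h)
        backward.append(h)
    backward.reverse()                    # realign with word positions
    return [f + b for f, b in zip(forward, backward)]   # list concatenation


# Illustrative (input weights, recurrent weights) for each direction.
fwd = ([[0.5, -0.3, 0.2], [0.1, 0.4, -0.6]], [[0.3, 0.1], [-0.2, 0.5]])
bwd = ([[-0.4, 0.2, 0.3], [0.6, -0.1, 0.2]], [[0.1, -0.3], [0.4, 0.2]])
sentence = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # three one-hot word vectors
states = encode(sentence, fwd, bwd, hidden_size=2)
print(len(states), len(states[0]))
```

In this sketch the first half of each `states[i]` depends only on words up to position i and the second half only on words from position i onward, which is why the concatenated vector reflects both directions.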

When each word $x_i$ is converted into the forward hidden state vector $\overrightarrow{h}_i$ or the backward hidden state vector $\overleftarrow{h}_i$, a codebook of (word, word vector) pairs created in advance is used. A word vector expresses the features of its paired word as coordinates in a space of a predetermined dimension, and is also called a distributed word representation. In the present embodiment, with W denoting the number of all input words (including <s> and </s>), vectors satisfying the following conditions are used; however, vectors created in advance by, for example, the method described in Non-Patent Document 2 may be used instead.

Condition 1: The vector has W dimensions in total, with each dimension corresponding to one word.
Condition 2: The vector is a one-hot vector in which the element of the dimension corresponding to the paired word is 1 and the elements of all other dimensions are 0.
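A codebook satisfying the two conditions can be sketched as a dictionary from each symbol to a one-hot list. The vocabulary shown and the function name `build_codebook` are assumptions made for this example.

```python
def build_codebook(vocabulary):
    """Map each symbol to a one-hot list whose length equals the vocabulary size."""
    size = len(vocabulary)
    return {
        symbol: [1 if j == i else 0 for j in range(size)]
        for i, symbol in enumerate(vocabulary)
    }


# Hypothetical vocabulary, including the sentence boundary symbols.
codebook = build_codebook(["<s>", "what", "is", "new", "</s>"])
print(codebook["is"])
```

The same construction applies to any finite symbol set, so it could equally serve for a table of phrase structure labels.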

[Non-Patent Document 2]: Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.

In the present embodiment, the value of each element of the word vector is weighted by parameters of the neural network. These parameters are updated through training by the phrase structure learning device described later.

FIG. 11 shows the details of the decoding unit 2. The decoding unit 2 takes as input the phrase structure label $y_{t-1}$ previously output by the output unit 4 and $enc^{t-1}$, and uses a forward recurrent neural network to convert this input information, together with its previous hidden state vector, into a real-valued hidden state vector of a predetermined dimension, which it outputs. Because phrase structure labels are input one after another, the result of the conversion depends on the set of hidden state vectors obtained by converting the earlier phrase structure labels and on $enc^{t-1}$, which is the set of hidden state vectors $h_1, \dots, h_n$ of the encoding unit 1 weighted by the attention mechanism 3. Details of $enc^{t}$ are given later.

When the phrase structure label $y_{t-1}$ is converted into a vector, a codebook of (phrase structure label, phrase structure vector) pairs created in advance is used. With V denoting the total number of phrase structure labels, vectors satisfying the following conditions are used as the phrase structure vectors.

Condition 1: The vector has V dimensions in total, with each dimension corresponding to one phrase structure label.
Condition 2: The vector is a one-hot vector in which the element of the dimension corresponding to the paired phrase structure label is 1 and the elements of all other dimensions are 0.

The decoding unit 2 starts processing at t = 2. As its input at that point, the phrase structure label $y_1$ for t = 1 is set to <s>, $enc^{1}$ is set to the initial value described later, and the initial state of the decoding unit 2 is set to a hidden state of the encoding unit 1.

FIG. 12 shows the details of the attention mechanism 3. The attention mechanism 3 receives as input the hidden state into which the decoding unit 2 converted the phrase structure label and the hidden states $h_1, \dots, h_n$ into which the encoding unit 1 converted the words. Using a neural network, it computes weights $\alpha_i^t$, normalized so that they sum to 1 over the correspondences, and outputs them as the distribution $\alpha^t$. The initial value $\alpha^1$ (t = 1) of the attention mechanism 3 is a vector with $\alpha_1^1 = 1$ and all other elements 0.
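A minimal sketch of producing normalized attention weights follows. The text states only that a neural network computes the weights, so the dot-product scoring used here is an assumption chosen for brevity, as are the function names and the toy vectors; the softmax normalization and the initial value with the first element equal to 1 follow the description above.

```python
import math


def softmax(scores):
    """Normalize scores into positive weights that sum to 1."""
    peak = max(scores)                    # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(decoder_state, encoder_states):
    """Score each encoder state against the decoder state, then normalize.

    Dot-product scoring is an assumption; it requires equal vector lengths.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    return softmax(scores)


def initial_attention(n):
    """Initial distribution: weight 1 on the first position, 0 elsewhere."""
    return [1.0] + [0.0] * (n - 1)


alpha = attention([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(alpha, initial_attention(3))
```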

The output unit 4 receives the distribution output by the attention mechanism 3, the hidden states $h_1, \dots, h_n$ into which the encoding unit 1 converted the words, and the hidden state into which the decoding unit 2 converted the phrase structure label, and outputs the output probability of each phrase structure label. First, the output unit 4 computes the sum of the conversion results of the encoding unit 1, weighted according to the distribution $\alpha^t$ output by the attention mechanism 3, by equation (1) below.

$$enc^{t} = \sum_{i=1}^{n} \alpha_{i}^{t} h_{i} \qquad (1)$$

The vector obtained by concatenating the sum from equation (1) with the hidden state into which the decoding unit 2 converted the phrase structure label is input to a softmax layer to determine the output probability of each phrase structure label. When the number of phrase structure labels is V, the output probability $P(y_t \mid x_1, \dots, x_n, y_1, \dots, y_{t-1})$ of the phrase structure label $y_t$ is computed with a weight matrix $W_v$ and a bias term $b$ by equation (2) below.

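$$P(y_t \mid x_1, \dots, x_n, y_1, \dots, y_{t-1}) = \frac{\exp\left((W_v s_t + b)_{y_t}\right)}{\sum_{v=1}^{V} \exp\left((W_v s_t + b)_{v}\right)} \qquad (2)$$

Here $s_t$ denotes the concatenated vector input to the softmax layer. The weight matrix $W_v$ and the bias term $b$ are parameters of the neural network.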

The $y_t$ with the highest output probability P is taken as the t-th phrase structure label output by the phrase structure analyzer 40.
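The output step can be sketched as follows, assuming that equation (1) is the attention-weighted sum of the encoder hidden states and that equation (2) is a softmax over an affine transform of the concatenated vector, as the surrounding text describes. The function names, the toy weight matrix, and the vector sizes are illustrative assumptions.

```python
import math


def context_vector(alpha, encoder_states):
    """Weighted sum of encoder hidden states under the attention distribution."""
    dim = len(encoder_states[0])
    return [sum(a * h[k] for a, h in zip(alpha, encoder_states))
            for k in range(dim)]


def label_probabilities(enc_t, decoder_state, w_v, b):
    """Softmax over W_v [enc_t ; decoder_state] + b, one probability per label."""
    joined = enc_t + decoder_state        # list concatenation
    logits = [sum(w * x for w, x in zip(row, joined)) + bias
              for row, bias in zip(w_v, b)]
    peak = max(logits)                    # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


alpha = [0.5, 0.5]
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
enc_t = context_vector(alpha, encoder_states)
probs = label_probabilities(enc_t, [1.0],
                            w_v=[[1, 0, 0], [0, 1, 0], [0, 0, 2]],
                            b=[0, 0, 0])
best = max(range(len(probs)), key=probs.__getitem__)   # index of the chosen label
print(enc_t, best)
```

Choosing the index with the highest probability corresponds to the greedy selection of the t-th label described above.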

Through the above processing, phrase structure labels $y_t$ are output one after another, and processing ends when the output $y_t$ is the end-of-sentence symbol </s> (in the present embodiment, at t = 14).

As the initial values of the output unit, the initial output label is <s>, and $\alpha^1$ takes the initial value of the attention mechanism 3.

This completes the description of each unit of the neural network constituting the phrase structure analyzer 40.

The phrase structure analysis unit 30 uses the phrase structure analyzer 40 to take the input sentence received by the input unit 10 as input and output the phrase structure label string for that sentence. The attention mechanism 3 included in the phrase structure analyzer has been trained in advance on learning data consisting of correspondences between words and phrase structure labels. That learning data is generated on the basis of a learning input sentence and a normalized phrase structure label string composed of the phrase structure labels of the nodes of the phrase structure tree representing the learning input sentence. As described for FIG. 2 above, a normalized phrase structure label string is the sequence of phrase structure labels of a phrase structure tree after the words and parts of speech at the leaf nodes have been replaced.

<Operation of Phrase Structure Analysis Device According to Embodiment of the Present Invention>

Next, the operation of the phrase structure analysis device 100 according to the embodiment of the present invention will be described. The phrase structure analysis device 100 executes the phrase structure analysis processing routine shown in FIG. 13.

まず、ステップＳ１００では、入力部１０において入力文を受け付ける。 First, in step S100, the input unit 10 receives an input sentence.

ステップＳ１０２では、句構造解析器４０を用いて、入力部１０で受け付けた入力文を入力とし、入力文に対する句構造ラベル列を出力する。 In step S102, using the phrase structure analyzer 40, the input sentence accepted by the input unit 10 is input, and a phrase structure label string for the input sentence is output.

Next, the processing routine in the neural network of the phrase structure analyzer 40 will be described in detail with reference to FIG. 14.

In step S1000, t is set to 1.

In step S1002, the encoding unit 1 receives the input sentence and converts it into hidden states.

In step S1004, initial values are set for each part of the neural network.

In step S1006, t is set to 2.

In step S1008, the decoding unit 2 converts the initial label, or the previously output label y_{t-1}, into a hidden state.

In step S1010, the attention mechanism 3 combines the hidden states produced by the encoding unit 1 and the decoding unit 2, applies weighting, and converts the result into a weight for each word of the input sentence.

In step S1012, the output unit 4 weights the hidden states of the encoding unit 1 by the per-word weights output by the attention mechanism 3, combines the result with the hidden state of the decoding unit 2, and then determines the label to be output.

In step S1014, it is determined whether the output of step S1012 is the end-of-sentence symbol </s>. If it is, the process ends; if it is not, t is incremented in step S1016 (written t = t + 1) and the process is repeated.
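The loop of steps S1008 to S1016 can be sketched as follows. This is a minimal illustration, not the routine of FIG. 14 itself: the dot-product scoring inside `attend`, the `max_steps` guard, and the callables `decoder_step` and `choose_label` are all assumptions introduced here, since the text does not specify how the hidden states are combined or how the label is chosen.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(encoder_states, decoder_state):
    """S1010: one weight per input word, plus the weighted encoder summary.

    Dot-product scoring is an assumption made for this sketch.
    """
    scores = [sum(h * d for h, d in zip(state, decoder_state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[k] for w, state in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


def decode(encoder_states, decoder_step, choose_label, max_steps=50):
    """Emit labels one at a time until </s> is produced.

    decoder_step(previous_label) -> decoder hidden state (hypothetical)
    choose_label(features)       -> next label (hypothetical)
    max_steps is a safety guard that the source text does not mention.
    """
    labels = ["<s>"]  # initial label
    while len(labels) < max_steps:
        decoder_state = decoder_step(labels[-1])                   # S1008
        weights, context = attend(encoder_states, decoder_state)  # S1010
        label = choose_label(context + decoder_state)              # S1012
        labels.append(label)
        if label == "</s>":                                        # S1014
            break
    return labels
```

Passing the encoder states in once and recomputing only the decoder state per step mirrors the description, where the input sentence is encoded in step S1002 before the label loop begins.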

As described above, the phrase structure analysis device 100 according to the embodiment of the present invention performs phrase structure analysis using the phrase structure analyzer 40, which includes the attention mechanism 3. Because the attention mechanism 3 has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, generated from a learning input sentence and a normalized phrase structure label string made up of the phrase structure labels of the nodes of the phrase structure tree representing that sentence, phrase structure analysis can be performed with high accuracy.

<Configuration of the Phrase Structure Learning Device According to the Embodiment of the Present Invention>

Next, the configuration of the phrase structure learning device according to the embodiment of the present invention will be described. The phrase structure learning device handles the learning phase. In the present embodiment, the encoding unit 1, the decoding unit 2, the attention mechanism 3, and the output unit 4 of the neural network are trained simultaneously, but they may also be trained separately.

As shown in FIG. 15, the phrase structure learning device 200 according to the embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the phrase structure learning processing routine described later. Functionally, as shown in FIG. 15, the phrase structure learning device 200 includes an input unit 210 and an arithmetic unit 220.

FIG. 16 shows an overview of learning in the present embodiment. The processing in FIG. 16 is divided into the learning data generation processing of the learning data generation unit and the learning processing of the neural network.

The input unit 210 receives a learning input sentence and a normalized phrase structure tree. The input to the learning phase is assumed to be a collection of multiple pairs, each consisting of a sentence to be analyzed and the phrase structure label sequence that is the correct analysis result for that sentence. In the present embodiment, the pair of the learning input sentence X and the correct sequence L (the normalized phrase structure tree) shown in FIG. 17 is used as an example of one such pair. Normalization here means, as described at the beginning of this document, replacing the words and parts of speech of all leaf nodes with a single common label such as XX. The normalized phrase structure tree refers to a phrase structure tree that is both normalized and expressed as a sequence.

Let i denote the position of each word in the learning input sentence X, with i = {1, …, n}; in the present embodiment, n = 6. Let t denote the position of each phrase structure label in the correct sequence L, with t = {1, …, m}; in the present embodiment, m = 15.

The arithmetic unit 220 includes a learning data generation unit 230, a learning unit 232, and a phrase structure analyzer 240. The phrase structure analyzer 240 is the same as the phrase structure analyzer 40 of the phrase structure analysis device 100; the analyzer that is trained by the phrase structure learning device 200 is referred to as the phrase structure analyzer 240.

The learning data generation unit 230 generates learning data consisting of correspondences between words and phrase structure labels. The learning data is generated from the learning input sentence and the normalized phrase structure label string made up of the phrase structure labels of the nodes of the phrase structure tree representing that sentence. It is used to train the attention mechanism 3, which is included in the phrase structure analyzer and outputs, for a phrase structure label, a weight for each word of the learning input sentence.

The method by which the learning data generation unit 230 creates the correct α_i^t is described in detail below.

The learning data generation unit 230 outputs the correct correspondence α based on the learning input sentence X and the normalized phrase structure tree L. α_i^t is a variable that takes the value 1 when a correspondence exists between the word x_i and the phrase structure label l_t, and 0 when it does not.

To obtain the correct α_i^t, the learning data generation unit 230 associates the words that make up the learning input sentence with the phrase structure labels that make up the normalized phrase structure tree. Examples of this association are shown in FIG. 18 and FIG. 19.

In the correct sequence L, each non-terminal symbol XX, which represents the presence of a word and its part of speech, is associated with the corresponding word in the learning input sentence X. The symbols <s> and </s>, which mark the beginning and end of the correct sequence L, are associated with <s> and </s> in the input sentence X, respectively. For every phrase structure label in the correct sequence L other than XX, <s>, and </s>, the corresponding word is selected from the set of words in the learning input sentence that are contained in the phrase structure tree rooted at that label when a tree structure is assumed. That is, it is selected from the set of words of the learning input sentence that are leaf nodes below the node of that phrase structure label. One of the following methods is used for the selection.

1: Select the word that is the head of the phrase structure tree rooted at the phrase structure label.
2: From the set of words contained in the phrase structure tree rooted at the phrase structure label, select the leftmost word.
3: From the set of words contained in the phrase structure tree rooted at the phrase structure label, select the rightmost word.
4: If the phrase structure label contains "(", select the leftmost word from the set of words that make up the phrase rooted at that label. If the phrase structure label contains ")", select the rightmost word from the set of words contained in the phrase structure tree rooted at that label.

Method 1 requires information about which word is the head of each phrase structure tree. This information must either be added to the learning data in advance by hand, for example from head rules, or be obtained by a separate analysis. Method 4 is a combination of methods 2 and 3.

In this way, the learning data generation unit 230 generates the learning data by associating, with each phrase structure label included in the normalized phrase structure label string, a word of the learning input sentence that is a leaf node below the node of that phrase structure label.

In the present embodiment, method 4 above is adopted, and the processing result is shown in FIG. 20.
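The four selection methods can be compared with a small helper. This is only a sketch: the function name `select_word`, the representation of the words under a label as a list of positions, and the `head` argument standing in for externally supplied head information are assumptions made for illustration.

```python
def select_word(label, leaf_positions, method=4, head=None):
    """Pick the word position to align with one bracket label.

    label          -- a phrase structure label such as "(NP" or ")NP"
    leaf_positions -- positions of the words under that label's node
    method         -- 1 to 4, following the numbered methods in the text
    head           -- head word position, needed only for method 1
                      (hypothetical: must come from head rules or a
                      separate analysis)
    """
    if method == 1:
        if head is None:
            raise ValueError("method 1 needs externally supplied head information")
        return head
    if method == 2:
        return min(leaf_positions)  # leftmost word
    if method == 3:
        return max(leaf_positions)  # rightmost word
    if method == 4:
        if "(" in label:
            return min(leaf_positions)  # opening bracket: leftmost word
        if ")" in label:
            return max(leaf_positions)  # closing bracket: rightmost word
        raise ValueError("method 4 applies only to bracket labels")
    raise ValueError("method must be 1, 2, 3, or 4")
```

For a noun phrase covering word positions 2 and 3, `select_word("(NP", [2, 3])` gives 2 and `select_word(")NP", [2, 3])` gives 3, so under method 4 the opening and closing labels of the same phrase point at its two edges.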

Based on the learning data generated by the learning data generation unit 230, the learning unit 232 trains the phrase structure analyzer 240, including the attention mechanism 3, using the correct phrase structure label l_t as the correct data for the output y_t of the output unit. Any learning method commonly used for neural networks may be used. In the present embodiment, the learnable parameters are optimized by stochastic gradient descent.

The attention mechanism 3 takes as input the hidden state vector corresponding to each word of the learning input sentence and the hidden state vector for the phrase structure label output immediately before. It is trained so that the estimated α_i^t that it outputs becomes equal to the correct α_i^t created by the learning data generation unit 230. Any learning method commonly used for neural networks may be used. In the present embodiment, the learnable parameters are optimized by stochastic gradient descent.
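The text states only that the estimated α_i^t should become equal to the correct α_i^t; it does not name a loss function. One common way to express such an objective is a cross-entropy between the two, sketched below. The choice of cross-entropy, the function name `attention_loss`, and the `eps` smoothing term are assumptions, not something the description specifies.

```python
import math


def attention_loss(gold, predicted, eps=1e-12):
    """Cross-entropy between correct alignments and attention weights.

    gold[t][i]      -- correct alpha for label t and word i (0 or 1)
    predicted[t][i] -- attention weight output for label t and word i
    eps             -- small constant to avoid log(0) (assumption)
    """
    loss = 0.0
    for gold_row, predicted_row in zip(gold, predicted):
        for g, p in zip(gold_row, predicted_row):
            loss -= g * math.log(p + eps)
    return loss
```

The loss is close to zero when all attention mass for each label sits on the aligned word, and grows as the mass spreads to other words, which is the behavior a gradient-based optimizer such as stochastic gradient descent would then reduce.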

<Operation of the Phrase Structure Learning Device According to the Embodiment of the Present Invention>

Next, the operation of the phrase structure learning device 200 according to the embodiment of the present invention will be described. The phrase structure learning device 200 executes the phrase structure learning processing routine shown in FIG. 21.

First, in step S200, a learning input sentence and a normalized phrase structure tree are received.

In step S202, learning data consisting of correspondences between words and phrase structure labels is generated from the learning input sentence and the normalized phrase structure label string made up of the phrase structure labels of the nodes of the phrase structure tree representing that sentence. The learning data is used to train the attention mechanism 3, which is included in the phrase structure analyzer 240 and outputs, for a phrase structure label, a weight for each word of the learning input sentence.

In step S204, based on the learning data generated by the learning data generation unit 230, the phrase structure analyzer 240 including the attention mechanism 3 is trained using the correct phrase structure label l_t as the correct data for the output y_t of the output unit, and the process ends. The processing routine in the neural network of the phrase structure analyzer 240 is the same as that of FIG. 14, so its description is omitted.

Next, the learning data generation processing routine in step S202 will be described in detail with reference to FIG. 22.

In step S2000, t is set to 1.

In step S2002, i is set to 1.

In step S2004, it is determined whether the phrase structure label l_t of the normalized phrase structure tree is <s> (or </s>). If the condition is satisfied, the process proceeds to step S2006; if not, it proceeds to step S2008.

In step S2006, it is determined whether the word x_i of the learning input sentence is <s> (or </s>). If the condition is satisfied, the process proceeds to step S2018; if not, it proceeds to step S2020.

In step S2008, it is determined whether l_t of the normalized phrase structure tree is XX. If the condition is satisfied, the process proceeds to step S2010; if not, it proceeds to step S2012.

In step S2010, it is determined whether the word x_i of the learning input sentence corresponds to the XX of l_t. If the condition is satisfied, the process proceeds to step S2018; if not, it proceeds to step S2020.

In step S2012, it is determined whether l_t contains "(". If it does, the process proceeds to step S2014; if it does not, the process proceeds to step S2016.

In step S2014, it is determined whether the word x_i of the learning input sentence is contained in the phrase structure tree rooted at l_t and is the leftmost such word, in accordance with method 4 above. If the condition is satisfied, the process proceeds to step S2018; if not, it proceeds to step S2020.

In step S2016, it is determined whether the word x_i of the learning input sentence is contained in the phrase structure tree rooted at l_t and is the rightmost such word, in accordance with method 4 above. If the condition is satisfied, the process proceeds to step S2018; if not, it proceeds to step S2020.

In step S2018, α_i^t is set to 1.

In step S2020, α_i^t is set to 0.

In step S2022, i is incremented (written i = i + 1).

In step S2024, it is determined whether i > n. If the condition is satisfied, the process proceeds to step S2026; if not, it returns to step S2004.

In step S2026, it is determined whether t < m. If the condition is satisfied, t is incremented and the process returns to step S2002 so that the next phrase structure label is processed; if not, the process ends.
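The routine above fills in α_i^t one (t, i) pair at a time. An equivalent result can be obtained more directly, as in the sketch below, which follows method 4 as stated in the list of selection methods. Several things here are assumptions: the label spellings such as "(NP" and ")NP", the example sentence, the function name `build_alignment`, and the requirement that the word list begins with <s> and ends with </s>. Positions are 0-based in the code, whereas the description counts from 1. The sketch also relies on every phrase containing at least one XX, so that the leftmost word under an opening label is the next unread word and the rightmost word under a closing label is the word read most recently.

```python
def build_alignment(words, labels):
    """Return alpha[t][i]: 1 if label t is aligned with word i, else 0.

    words  -- input sentence including "<s>" first and "</s>" last
    labels -- normalized label sequence, e.g. "(NP", "XX", ")NP"
    """
    n, m = len(words), len(labels)
    alpha = [[0] * n for _ in range(m)]
    next_word = 1  # position of the next unread word; position 0 is <s>
    for t, label in enumerate(labels):
        if label == "<s>":
            i = 0
        elif label == "</s>":
            i = n - 1
        elif label == "XX":
            i = next_word       # the k-th XX matches the k-th word
            next_word += 1
        elif "(" in label:
            i = next_word       # leftmost word under this label
        elif ")" in label:
            i = next_word - 1   # rightmost word under this label
        else:
            raise ValueError("unexpected label: " + label)
        alpha[t][i] = 1
    return alpha


# Illustrative input only; it is not the example of FIG. 17.
words = ["<s>", "the", "dog", "barks", "</s>"]
labels = ["<s>", "(S", "(NP", "XX", "XX", ")NP",
          "(VP", "XX", ")VP", ")S", "</s>"]
alpha = build_alignment(words, labels)
```

Each row of `alpha` contains exactly one 1, which matches the description of α_i^t as marking the single word chosen for each phrase structure label.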

As described above, the phrase structure learning device 200 according to the embodiment of the present invention generates learning data consisting of correspondences between words and phrase structure labels, based on an input sentence and a normalized phrase structure label string made up of the phrase structure labels of the nodes of the phrase structure tree representing that sentence. The learning data is used to train the attention mechanism 3, which is included in the phrase structure analyzer 240 and outputs, for a phrase structure label, a weight for each word of the input sentence. Using this learning data for the attention mechanism 3, the phrase structure analyzer 240 can be trained to analyze phrase structure with high accuracy.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, the embodiment above describes the case in which the phrase structure learning device 200 generates the learning data with the learning data generation unit 230 and trains the phrase structure analyzer 240, including the attention mechanism 3, with the learning unit 232. The invention is not limited to this arrangement: the generation of the learning data by the learning data generation unit 230 and the training of the phrase structure analyzer 240 by the learning unit 232 may each be implemented on a separate device.

Reference Signs List
1 Encoding unit
2 Decoding unit
3 Attention mechanism
4 Output unit
10, 210 Input unit
20, 220 Arithmetic unit
30 Phrase structure analysis unit
40, 240 Phrase structure analyzer
100 Phrase structure analysis device
200 Phrase structure learning device
230 Learning data generation unit
232 Learning unit

Claims

A phrase structure learning device for training a phrase structure analyzer that outputs a phrase structure label string for an input sentence, the device comprising:
a learning data generation unit that, based on the input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the input sentence, generates learning data consisting of correspondences between the words and the phrase structure labels, the learning data being for training an attention mechanism that is included in the phrase structure analyzer and outputs a weight of each word of the input sentence for a phrase structure label.

The phrase structure learning device according to claim 1, wherein the learning data generation unit generates the learning data by associating, with each of the phrase structure labels included in the phrase structure label string, a word of the input sentence that is a leaf node below the node of that phrase structure label.

The phrase structure learning device according to claim 1, wherein the phrase structure analyzer outputs the phrase structure labels in order from the beginning,
the attention mechanism receives as inputs the hidden state vector corresponding to each word of the input sentence and the hidden state vector for the phrase structure label output immediately before, and outputs the weight of each word of the input sentence for the phrase structure label, and
the device further comprises a learning unit that trains the attention mechanism based on the generated learning data.

A phrase structure analysis device comprising:
a phrase structure analysis unit that takes an input sentence as input and outputs a phrase structure label string for the input sentence, using a pre-trained phrase structure analyzer that outputs the phrase structure label string for the input sentence and that includes an attention mechanism that outputs a weight of each word of the input sentence for a phrase structure label,
wherein the attention mechanism has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, the learning data being generated based on a learning input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the learning input sentence.

A phrase structure learning method in a phrase structure learning device for training a phrase structure analyzer that outputs a phrase structure label string for an input sentence, the method comprising:
a step in which a learning data generation unit, based on the input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the input sentence, generates learning data consisting of correspondences between the words and the phrase structure labels, the learning data being for training an attention mechanism that is included in the phrase structure analyzer and outputs a weight of each word of the input sentence for a phrase structure label.

A phrase structure analysis method comprising:
a step in which a phrase structure analysis unit takes an input sentence as input and outputs a phrase structure label string for the input sentence, using a pre-trained phrase structure analyzer that outputs the phrase structure label string for the input sentence and that includes an attention mechanism that outputs a weight of each word of the input sentence for a phrase structure label,
wherein the attention mechanism has been trained in advance on learning data consisting of correspondences between words and phrase structure labels, the learning data being generated based on a learning input sentence and a phrase structure label string consisting of the phrase structure labels of the nodes of a phrase structure tree representing the learning input sentence.

A program for causing a computer to function as each unit of the phrase structure learning device according to any one of claims 1 to 3 or of the phrase structure analysis device according to claim 4.