JP2016218806A

JP2016218806A - Empty category estimation device, empty category estimation model learning device, method, and program

Info

Publication number: JP2016218806A
Application number: JP2015103963A
Authority: JP
Inventors: ジュンオウ; Jung Oh; 克仁須藤; Katsuto Sudo; 昌明永田; Masaaki Nagata
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-05-21
Filing date: 2015-05-21
Publication date: 2016-12-22
Anticipated expiration: 2035-05-21
Also published as: JP6381136B2

Abstract

PROBLEM TO BE SOLVED: To estimate the positions of empty categories in an input text with high accuracy. SOLUTION: For each candidate empty category position, a feature extraction unit 230 extracts distributed representations of words as the feature vector of that candidate, based on the dependency structure tree of the input text. An estimation unit 238 estimates the empty category positions and empty category labels based on a pre-trained model, which includes a mapping from feature vectors to a low-dimensional space and a mapping from each empty category label to the same low-dimensional space, and on the extracted feature vector of each candidate position. SELECTED DRAWING: Figure 3

Description

The present invention relates to an empty category estimation device, an empty category estimation model learning device, a method, and a program, and more particularly to an empty category estimation device, an empty category estimation model learning device, a method, and a program for estimating the position and type of empty categories in input text.

Empty category detection is the task of detecting empty categories in part of a given text. Previous work has formulated empty category detection mainly as a classification problem or as a subproblem of full syntactic parsing.

Non-Patent Document 1 uses a dependency tree to indicate the positions that an EC can take, and also uses the dependency tree to extract features of those positions. A classification model is then trained from annotated data.

Non-Patent Document 2 proposes a joint image-label annotation method that scales to a large number of classes. It maps both images and labels into a hidden space and determines the label of an image according to the distance between the distributed representations of the image and of the label.

Non-Patent Document 1: Xue Nianwen, and Yaqin Yang. "Dependency-based empty category detection via phrase structure trees." In HLT-NAACL, pp. 1051-1060. 2013.
Non-Patent Document 2: Weston Jason, Samy Bengio, and Nicolas Usunier. "Wsabie: Scaling up to large vocabulary image annotation." IJCAI. Vol. 11. 2011.

An object of the present invention is to provide an empty category estimation device, method, and program capable of accurately estimating the position and type of empty categories in input text.

A further object is to provide an empty category estimation model learning device, method, and program capable of learning a model for accurately estimating the position and type of empty categories in text.

To achieve the above objects, the empty category estimation device according to a first invention is a device for estimating, from an input text, empty categories, which are nominal phrases arising from omission or movement. The device includes: a feature extraction unit that, based on the dependency structure tree of the input text, extracts, for each candidate empty category position, distributed representations of words and the like as the features of that candidate; and an estimation unit that estimates the empty category positions and empty category labels based on (i) a pre-trained model including a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, and (ii) the features of each candidate position extracted by the feature extraction unit.

The empty category estimation method according to a second invention is a method performed by an empty category estimation device that includes a feature extraction unit and an estimation unit and that estimates, from an input text, empty categories, which are nominal phrases arising from omission or movement. In this method, the feature extraction unit extracts, for each candidate empty category position, distributed representations of words and the like as the features of that candidate, based on the dependency structure tree of the input text. The estimation unit then estimates the empty category positions and empty category labels based on (i) a pre-trained model including a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, and (ii) the features of each candidate position extracted by the feature extraction unit.

According to the first and second inventions, the feature extraction unit extracts, for each candidate empty category position, distributed representations of words and the like as the features of that candidate, based on the dependency structure tree of the input text. The estimation unit then estimates the empty category positions and empty category labels based on the pre-trained model, which includes the mapping from the features to the low-dimensional space and the mapping from each empty category label to the low-dimensional space, and on the features of each candidate position extracted by the feature extraction unit.

In this way, distributed representations of words and the like are extracted as the features of each candidate empty category position based on the dependency structure tree of the input text, and the empty category positions and labels are estimated based on a model that includes a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space. As a result, the position and type of empty categories in the input text can be estimated accurately.

The empty category estimation model learning device according to a third invention includes a feature extraction unit and a learning unit. For each of a plurality of texts annotated with the positions and labels of empty categories, which are nominal phrases arising from omission or movement, the feature extraction unit extracts, for each candidate empty category position, distributed representations of words and the like as the features of that candidate, based on the dependency structure tree of the text. The learning unit learns a model that includes a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, based on the features of each candidate position extracted for each text by the feature extraction unit and on the empty category positions and labels assigned to each text.

The empty category estimation model learning method according to a fourth invention is a method performed by an empty category estimation model learning device that includes a feature extraction unit and a learning unit. For each of a plurality of texts annotated with the positions and labels of empty categories, which are nominal phrases arising from omission or movement, the feature extraction unit extracts, for each candidate empty category position, distributed representations of words and the like as the features of that candidate, based on the dependency structure tree of the text. The learning unit then learns a model that includes a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, based on the features of each candidate position extracted for each text by the feature extraction unit and on the empty category positions and labels assigned to each text.

According to the third and fourth inventions, for each of a plurality of texts annotated with the positions and labels of empty categories, which are nominal phrases arising from omission or movement, the feature extraction unit extracts, for each candidate empty category position, distributed representations of words and the like as the features of that candidate, based on the dependency structure tree of the text. The learning unit then learns a model that includes a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, based on the features of each candidate position extracted for each text and on the empty category positions and labels assigned to each text.

In this way, distributed representations of words and the like are extracted as the features of each candidate empty category position based on the dependency structure tree of the text, and a model including the mapping from the features to the low-dimensional space and the mapping from each empty category label to the low-dimensional space is learned from the empty category positions and labels assigned to each text. As a result, a model for accurately estimating the position and type of empty categories in text can be learned.

The program of the present invention is a program for causing a computer to function as each unit of the above empty category estimation device and empty category estimation model learning device.

As described above, according to the empty category estimation device, method, and program of the present invention, distributed representations of words and the like are extracted as the features of each candidate empty category position based on the dependency structure tree of the input text, and the empty category positions and labels are estimated based on a model that includes a mapping from the features to a low-dimensional space and a mapping from each empty category label to the low-dimensional space. The position and type of empty categories in the input text can therefore be estimated accurately.

Likewise, according to the empty category estimation model learning device, method, and program of the present invention, distributed representations of words and the like are extracted as the features of each candidate empty category position based on the dependency structure tree of the text, and a model including the mapping from the features to the low-dimensional space and the mapping from each empty category label to the low-dimensional space is learned from the empty category positions and labels assigned to each of a plurality of texts. A model for accurately estimating the position and type of empty categories in text can therefore be learned.

A diagram for explaining the positions of empty categories.
(a) A diagram showing an example of a dependency structure tree with dependency types, (b) a diagram showing the path from the root to the empty category OP, and (c) a diagram showing the sequence of dependency types for each word on the path from the root to the empty category OP.
A block diagram showing the functional configuration of the empty category estimation model learning device according to the embodiment of the present invention.
A block diagram showing the functional configuration of the empty category estimation device according to the embodiment of the present invention.
A flowchart showing the empty category estimation model learning processing routine in the empty category estimation model learning device according to the embodiment of the present invention.
A flowchart showing the empty category estimation processing routine in the empty category estimation device according to the embodiment of the present invention.
A diagram showing the distribution of empty category labels in the test data.
A diagram showing the experimental results.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Overview of the Embodiment of the Present Invention>
The object of the present embodiment is to improve the quality of empty category (EC) detection by using joint context-label embedding. An empty category is a nominal word that does not appear explicitly in the text; it usually arises from omission or movement. In the present embodiment, distributed representations of words are used as the features of a candidate empty category position in order to determine whether an empty category is present and what its label is.

Empty category detection detects nominal phrases in a text that arise from omission or movement. The present embodiment formulates this as a classification problem. Each type of EC is defined as a class, and a position that holds no EC is called "NONE". The task is then to collect all positions that an EC can take and to classify these positions into the predefined classes.

In the present embodiment, a model is used to classify EC positions. This involves two main subproblems: representing EC positions as features, and classifying these positions into the predefined classes.

The principle of the empty category estimation device according to the present embodiment is described below.

<EC position representation>
<Description of EC positions>
Following the method of Non-Patent Document 1, we collect all candidate EC positions using a dependency structure tree that represents the dependency relations among the words in the text. Each candidate EC position can be represented by a word pair, "<head word, following word>". The following word is the word that follows the position in the sentence. The head word is the word to which the EC would attach in the dependency structure tree if an EC were assumed to be at that position. FIG. 1A shows an example of representing EC positions in the dependency structure tree for the Chinese sentence "吃了". Since the head is "吃" and "了" follows, the candidate EC position Position-1 is represented as "<吃, 了>", and the candidate EC position Position-2 is represented as "<吃, 。>".

<Feature extraction for EC positions>
We then extract features for each candidate EC position defined as above. The feature vector is built by concatenating the distributed representations of words in the text that are expected to be useful for EC detection. In the present embodiment, the feature vector for a candidate EC position consists of (1) the distributed representation of the head word (excluding the dummy root node), (2) the distributed representation of the following word in the text, (3) the distributed representations of the "nephews", that is, the children of the following word, and (4) the distributed representation of each word on the path in the dependency structure tree. These are called feature templates.

(1) Distributed representation of the head word (excluding the dummy root node): if words are represented by d-dimensional vectors, d dimensions are needed to represent this feature. The distributed representation of the head word is placed at the corresponding position in the feature vector.

(2) Distributed representation of the following word in the text: this feature is extracted in the same way as the head word.

(3) Distributed representations of the "nephews", that is, the children of the following word: among the words that are children of the following word, the two leftmost words are selected, and their distributed representations are used.

(4) Distributed representation of each word on the path in the dependency structure tree: based on the dependency structure tree of the text, the distributed representations of all words on the path from the root node to the candidate EC position (excluding the candidate EC position itself) are collected together with their sequences of dependency types. FIG. 1B(a) shows an example of a dependency structure tree with dependency types. FIG. 1B(b) shows the path from the root to the empty category OP. FIG. 1B(c) shows the sequence of dependency types for each word on the path from the root to the empty category OP. If there are m kinds of such dependency-type sequences and words are represented by d-dimensional vectors, md dimensions are needed to represent this feature. The distributed representation of each word on the path is placed at the position in the feature vector that corresponds to its dependency-type sequence.

In the present embodiment, each word in the above feature vector is replaced by its distributed representation obtained from a pre-trained dictionary. In the next step, the extracted feature vector is used to determine the label (EC type) of each candidate EC position.
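The four feature templates described above can be sketched as a simple concatenation. The following Python fragment is a minimal illustration, not the patented implementation: the function names, the zero-vector fallback for missing words, and the encoding of the path feature as (sequence index, word) pairs are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vector = List[float]


def lookup(word: Optional[str], embeddings: Dict[str, Vector], d: int) -> Vector:
    """Return the d-dimensional vector for `word`.

    Assumption: a missing word (e.g. the dummy root or an absent nephew)
    is represented by a zero vector.
    """
    if word is None or word not in embeddings:
        return [0.0] * d
    return list(embeddings[word])


def build_feature_vector(
    head: Optional[str],
    following: Optional[str],
    nephews: Sequence[str],
    path: Sequence[Tuple[int, str]],
    embeddings: Dict[str, Vector],
    d: int,
    m: int,
) -> Vector:
    """Concatenate templates (1)-(4) into one vector of length (4 + m) * d."""
    features: Vector = []
    features += lookup(head, embeddings, d)       # (1) head word
    features += lookup(following, embeddings, d)  # (2) following word
    # (3) the two leftmost children of the following word, zero-padded
    two_nephews: List[Optional[str]] = (list(nephews) + [None, None])[:2]
    for word in two_nephews:
        features += lookup(word, embeddings, d)
    # (4) one d-dimensional slot per dependency-type sequence (m slots);
    # each path word fills the slot indexed by its (hypothetical) sequence id
    path_block = [0.0] * (m * d)
    for seq_id, word in path:
        path_block[seq_id * d:(seq_id + 1) * d] = lookup(word, embeddings, d)
    return features + path_block
```

Under these assumptions the feature dimension is n = (4 + m)d: d each for the head word and the following word, 2d for the two nephews, and md for the path.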

<EC detection using the joint annotation method>
The EC detection method in the present embodiment consists of two mappings, MAP_A and MAP_B. MAP_A is the mapping f_A(X) from the n-dimensional feature vector X of a candidate EC position to a low-dimensional (k-dimensional) vector space.

$$\mathrm{MAP}_A : \mathbb{R}^n \to \mathbb{R}^k, \quad k \ll n$$
$$f_A(X) = W_A X \tag{1}$$

Here, MAP_A is a linear transformation and W_A is a k × n matrix.

MAP_B is a mapping from labels to the low-dimensional (k-dimensional) vector space.

$$\mathrm{MAP}_B : \{\mathrm{Label}_1, \mathrm{Label}_2, \ldots\} \to \mathbb{R}^k$$
$$f_B(\mathrm{Label}_i) = W_B^i \tag{2}$$

Here, MAP_B is also a linear transformation. W_B^i is a k-dimensional vector and serves as the distributed representation of label_i in the low-dimensional (k-dimensional) space.

The two mappings are learned jointly from the training data. At test time, for every candidate EC position to be classified, the corresponding feature vector X is extracted and mapped to the low-dimensional space using f_A(X) = W_A X.

Then, for each label_i, g_i(X) is obtained as follows.

$$g_i(X) = \left(f_A(X)\right)^{\mathsf{T}} W_B^i \tag{3}$$

For each possible label_i, g_i(X) is a score representing how plausible label_i is, and the label estimated for a candidate EC position is the label_i that maximizes g_i(X).
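Equations (1) to (3) and the argmax rule can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the representation of W_B as a list of k-dimensional vectors (one per label) are assumptions, and the label names in the usage note are taken from labels mentioned elsewhere in this description.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def project(W_A: Matrix, x: Sequence[float]) -> List[float]:
    """Equation (1): f_A(X) = W_A X, with W_A given as k rows of length n."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_A]


def score_labels(W_A: Matrix, W_B: Matrix, x: Sequence[float]) -> List[float]:
    """Equation (3): g_i(X) = f_A(X)^T W_B^i for every label i.

    W_B[i] is the k-dimensional vector of label i (equation (2)).
    """
    z = project(W_A, x)
    return [sum(a * b for a, b in zip(z, w_i)) for w_i in W_B]


def predict_label(
    W_A: Matrix, W_B: Matrix, labels: Sequence[str], x: Sequence[float]
) -> Tuple[str, float]:
    """Return the label with the highest score g_i(X), and that score."""
    scores = score_labels(W_A, W_B, x)
    best = max(range(len(labels)), key=scores.__getitem__)
    return labels[best], scores[best]
```

For example, with k = 2, n = 3, `W_A = [[1, 0, 0], [0, 1, 0]]`, `W_B = [[1, 0], [0, 1], [-1, -1]]`, labels `["NONE", "pro", "OP"]`, and `x = [0.5, 2.0, 9.0]`, the projected vector is `[0.5, 2.0]`, the scores are `[0.5, 2.0, -2.5]`, and the predicted label is `"pro"`.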

To learn W_A and W_B^i, which are used in the two mappings MAP_A and MAP_B, the present embodiment follows the method of Non-Patent Document 2 and uses stochastic gradient descent to minimize the weighted pairwise loss shown in equation (4) below.

$$\sum_X \sum_{i \neq c} L\bigl(\mathrm{rank}_c(X)\bigr)\, \max\bigl(0,\; g_i(X) - g_c(X)\bigr) \tag{4}$$

Here, c is the correct label for the feature vector X, and rank_c(X) is the rank of the correct label c among all possible labels for X. L is a function that reflects the attitude toward errors. A constant function L = C means that the complete ranking list is optimized. In the present embodiment, $L(\alpha) = \sum_{i=1}^{\alpha} 1/i$ is used, which optimizes the top of the ranking list. The learning rate and the other parameters of the stochastic gradient descent algorithm may be tuned in advance on a development set.
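The per-example term of equation (4) can be computed as in the following Python sketch. It assumes that rank_c(X) is a 1-based rank, that is, one plus the number of labels that score strictly higher than the correct label; the description does not spell out this convention, so it is an assumption of the example, as are the function names.

```python
from typing import Sequence


def harmonic_weight(alpha: int) -> float:
    """L(alpha) = sum_{i=1}^{alpha} 1/i, the weight used in this embodiment."""
    return sum(1.0 / i for i in range(1, alpha + 1))


def rank_of_correct(scores: Sequence[float], c: int) -> int:
    """Assumed convention: 1 + number of labels scoring strictly above label c."""
    return 1 + sum(1 for i, s in enumerate(scores) if i != c and s > scores[c])


def weighted_pairwise_loss(scores: Sequence[float], c: int) -> float:
    """Inner sum of equation (4) for one feature vector X.

    scores[i] is g_i(X); c is the index of the correct label.
    """
    weight = harmonic_weight(rank_of_correct(scores, c))
    return sum(
        weight * max(0.0, s - scores[c])
        for i, s in enumerate(scores)
        if i != c
    )
```

With scores `[0.5, 2.0, -2.5]` and correct label index 0, one label outranks the correct one, so the rank is 2, the weight is 1 + 1/2 = 1.5, and the loss is 1.5 × (2.0 − 0.5) = 2.25. When the correct label already has the highest score, every hinge term is zero and the loss is 0.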

In the present embodiment, a neural network model including the two mappings MAP_A and MAP_B is learned using the method of Non-Patent Document 2. In another embodiment, a single neural network model that performs multi-class classification may be learned directly. The advantage of using two mappings to map EC positions and labels into one low-dimensional vector space is that empty categories can still be estimated accurately when the number of label types (classes) becomes large. For example, the label pro, which denotes an omitted pronoun, may be subdivided according to person (first/second/third), gender (male/female), number (singular/plural), and so on, or according to the syntactic role (subject, direct object, indirect object, and so on) expressed as a dependency type in the dependency structure tree.

<Configuration of the Empty Category Estimation Model Learning Device According to the Embodiment of the Present Invention>
Next, the configuration of the empty category estimation model learning device according to the embodiment of the present invention is described. As shown in FIG. 2, the empty category estimation model learning device 100 can be implemented as a computer that includes a CPU, a RAM, and a ROM storing various data and a program for executing the empty category estimation model learning processing routine described later. Functionally, as shown in FIG. 2, the empty category estimation model learning device 100 includes an input unit 10, a calculation unit 20, and an output unit 90.

The input unit 10 receives a plurality of dependency structure trees representing training texts, each annotated in advance with the correct EC positions and EC labels. It stores the dependency structure trees in the dependency structure tree 22, and stores the correct EC positions and EC labels assigned to each dependency structure tree in the EC label correct answer data 38.

The calculation unit 20 includes the dependency structure tree 22, a feature template creation unit 24, a feature template 26, a word distributed representation 28, a feature extraction unit 30, an EC position feature vector 32, an initialized model 34, an initialized EC label distributed representation 36, the EC label correct answer data 38, a learning unit 40, a model 52, and an EC label distributed representation 54.

The dependency structure tree 22 stores the plurality of dependency structure trees, representing the plurality of training texts, received by the input unit 10.

The feature template creation unit 24 creates feature templates for each of the dependency structure trees and stores them in the feature template 26.

The word distributed representation 28 stores the pre-trained distributed representation of each word.

For the plurality of dependency structure trees, the feature extraction unit 30 extracts the feature vector of each candidate EC position based on the feature templates created by the feature template creation unit 24, and stores the feature vectors in the EC position feature vector 32.

The initialized model 34 stores, as the initialized model, the initial values of the matrix W_A used in the mapping MAP_A. Randomly set values may be used as the initial values.

The initialized EC label distributed representation 36 stores, as the initialized model, the initial values of the matrix W_B^i for each EC label label_i used in the mapping MAP_B. Randomly set values may be used as the initial values.

Based on the correct answer data received by the input unit 10, the EC label correct answer data 38 stores, for each candidate EC position in the plurality of dependency structure trees, either the type of its EC label or the fact that it has no EC label.

The learning unit 40 learns the matrices W_A and W_B^i used in the two mappings MAP_A and MAP_B, based on the EC position feature vector 32, the initialized model 34, the initialized EC label distributed representation 36, and the EC label correct answer data 38, and stores them in the model 52 and the EC label distributed representation 54.

The learning unit 40 includes an updated model 42, an EC label distributed representation 44, an EC label prediction unit 46, a convergence determination unit 48, and a model update unit 50.

更新モデル４２には、初期化モデル３４と同じ行列W_A、又はモデル更新部５０によって更新された行列W_Aが記憶されている。 The updated model 42, the same matrix as the initial model 34 W _A, or the model updating unit 50 is the matrix W _A updated by is stored.

ＥＣラベル分散表現４４には、初期化ＥＣラベル分散表現３６と同じ各ＥＣラベルlabel_iに対する行列Wⁱ _B、又はモデル更新部５０によって更新された各ＥＣラベルlabel_iに対する行列Wⁱ _Bが記憶されている。 The EC label distributed representation 44, the matrix for the same respective EC label label _i and initialize EC label distributed representation 36 W ⁱ _B, or matrix for each EC label label _i updated by the model updating unit 50 W ⁱ _B is stored ing.

Based on the EC position feature vector 32, the update model 42, and the EC label distributed representation 44, the EC label prediction unit 46 calculates, for each of the plurality of dependency structure trees, a score for the feature vector X of each EC position candidate and each EC label label_i according to equation (3) above, and predicts the EC position and EC label that give the maximum score.
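
The prediction step can be illustrated with a minimal sketch. It assumes that the score is a dot product between a linear projection W_A·X and a per-label vector standing in for Wⁱ_B; the exact form of equation (3) and of MAP_A is defined earlier in the description and may differ. The names `project`, `score`, and `predict`, and the dictionary layout, are hypothetical.

```python
def project(W_A, x):
    """Map feature vector x into the low-dimensional space (assumed linear: W_A x)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W_A]


def score(W_A, label_vec, x):
    """Assumed score: dot product of the projected features and one label embedding."""
    return sum(a * b for a, b in zip(project(W_A, x), label_vec))


def predict(W_A, W_B, candidates):
    """Return the (position, label) pair with the maximum score.

    candidates: dict mapping an EC position candidate to its feature vector X
    W_B:        dict mapping an EC label to its embedding
    """
    best = max(
        ((score(W_A, W_B[label], x), pos, label)
         for pos, x in candidates.items()
         for label in W_B),
        key=lambda item: item[0],
    )
    return best[1], best[2]
```

With an identity W_A, two labels `pro` and `PRO`, and two candidates, `predict` returns the candidate and label whose projected features align best with the label embedding.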

For each of the plurality of dependency structure trees, the convergence determination unit 48 compares the EC label correct answer data 38 with the EC position and EC label predicted by the EC label prediction unit 46, and determines whether learning has converged. When the predicted EC position and EC label match the EC label correct answer data 38 for every dependency structure tree, the unit determines that learning has converged, stores the current matrix W_A in the model 52, and stores the current matrix Wⁱ_B for each EC label label_i in the EC label distributed representation 54.

Based on the EC position feature vector 32, the update model 42, the EC label distributed representation 44, the EC label correct answer data 38, and the EC position and EC label predicted by the EC label prediction unit 46, the model update unit 50 updates the matrices W_A and Wⁱ_B used in the two mappings MAP_A and MAP_B so as to minimize the weighted pairwise loss shown in equation (4) above. It stores the updated matrix W_A in the update model 42 and the updated matrix Wⁱ_B for each EC label label_i in the EC label distributed representation 44.
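
As a rough illustration of such an update, the sketch below takes one gradient step on a generic margin-based pairwise loss, weight × max(0, margin − s(X, gold) + s(X, wrong)), with a linear projection. This is an assumption chosen for illustration and is not equation (4) itself; the `weight`, `margin`, and `lr` parameters and all function names are hypothetical.

```python
def project(W_A, x):
    """Assumed linear mapping of x into the low-dimensional space."""
    return [sum(w * v for w, v in zip(row, x)) for row in W_A]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def pairwise_update(W_A, W_B, x, gold, wrong, lr=0.1, margin=1.0, weight=1.0):
    """One SGD step on weight * max(0, margin - s(x, gold) + s(x, wrong)).

    Updates W_A and W_B in place and returns the loss measured before the step.
    """
    z = project(W_A, x)
    loss = weight * max(0.0, margin - dot(z, W_B[gold]) + dot(z, W_B[wrong]))
    if loss > 0.0:
        # Gradient w.r.t. W_A is outer(b_wrong - b_gold, x); use the old embeddings.
        diff = [n - p for p, n in zip(W_B[gold], W_B[wrong])]
        # Pull the gold label embedding toward z, push the wrong one away.
        W_B[gold] = [b + lr * weight * zi for b, zi in zip(W_B[gold], z)]
        W_B[wrong] = [b - lr * weight * zi for b, zi in zip(W_B[wrong], z)]
        for r, d in enumerate(diff):
            W_A[r] = [w - lr * weight * d * xi for w, xi in zip(W_A[r], x)]
    return loss
```

Calling `pairwise_update` repeatedly on the same pair lowers the loss until the gold label outscores the wrong label by at least the margin.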

The output unit 90 outputs the matrix W_A stored in the model 52 and the matrix Wⁱ_B for each EC label label_i stored in the EC label distributed representation 54.

<Configuration of the empty category estimation device according to the embodiment of the present invention>
Next, the configuration of the empty category estimation device according to the embodiment of the present invention will be described. As shown in FIG. 3, the empty category estimation device 200 according to the embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the empty category estimation processing routine described later. Functionally, as shown in FIG. 3, the empty category estimation device 200 includes an input unit 210, a calculation unit 220, and an output unit 290.

The input unit 210 receives a dependency structure tree representing the text to be estimated and a feature template, stores the dependency structure tree in the dependency structure tree 222, and stores the feature template in the feature template 226.

The calculation unit 220 includes a dependency structure tree 222, a feature template 226, a word distributed representation 228, a feature extraction unit 230, an EC position feature vector 232, a model 234, an EC label distributed representation 236, an estimation unit 238, and an estimated EC label 240.

The dependency structure tree 222 stores the dependency structure tree representing the text, as received by the input unit 210.

The feature template 226 stores the feature template received by the input unit 210. The feature template received by the input unit 210 is created in the same manner as by the feature template creation unit 24.

Like the word distributed representation 28, the word distributed representation 228 stores a distributed representation of each word, learned in advance.

For the dependency structure tree, the feature extraction unit 230 extracts a feature vector for each EC position candidate based on the feature template 226, in the same manner as the feature extraction unit 30, and stores the feature vectors in the EC position feature vector 232.
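
A minimal sketch of how such a feature vector might be assembled is given below. It assumes that the feature template selects only the head word and the following word of a candidate (two of the features named in the claims) and that their embeddings are simply concatenated, with unknown words mapped to a zero vector. The function name, the concatenation, and the zero-vector fallback are assumptions, not details taken from the description.

```python
def extract_features(head_word, next_word, embeddings, dim):
    """Build a feature vector X for one EC position candidate.

    embeddings: dict mapping a word to its pre-trained distributed representation
    dim:        embedding dimension, used for the zero vector of unknown words
    """
    zero = [0.0] * dim
    head_vec = embeddings.get(head_word, zero)
    next_vec = embeddings.get(next_word, zero)
    # Concatenate the two embeddings into a single feature vector.
    return list(head_vec) + list(next_vec)
```

Further template slots, such as the words on the path from the root node to the candidate, could be appended in the same way.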

The model 234 stores the matrix W_A used in the mapping MAP_A, identical to the model 52 learned by the empty category estimation model learning device 100.

The EC label distributed representation 236 stores the matrix Wⁱ_B for each EC label label_i used in the mapping MAP_B, identical to the EC label distributed representation 54 learned by the empty category estimation model learning device 100.

Based on the EC position feature vector 232, the model 234, and the EC label distributed representation 236, the estimation unit 238 calculates, for the dependency structure tree, a score for the feature vector X of each EC position candidate and each EC label label_i according to equation (3) above. It takes the EC position and EC label that give the maximum score as the estimation result and stores them in the estimated EC label 240.

The output unit 290 outputs the EC position and EC label stored in the estimated EC label 240.

<Operation of the empty category estimation model learning device according to the embodiment of the present invention>
Next, the operation of the empty category estimation model learning device 100 according to the embodiment of the present invention will be described. When the input unit 10 receives a plurality of dependency structure trees representing training text, to which correct answer data for the EC positions and EC labels has been assigned in advance, the device stores the dependency structure trees in the dependency structure tree 22 and the correct answer data in the EC label correct answer data 38. The empty category estimation model learning device 100 then executes the empty category estimation model learning processing routine shown in FIG. 4.

First, in step S100, the plurality of dependency structure trees stored in the dependency structure tree 22 are read.

Next, in step S102, a feature template is created. In step S104, the plurality of dependency structure trees stored in the dependency structure tree 22, the distributed representation of each word stored in the word distributed representation 28, and the correct answer data for the EC positions and EC labels stored in the EC label correct answer data 38 are read.

Then, in step S106, for each of the plurality of dependency structure trees, a feature vector is created for each EC position candidate based on the feature template created in step S102, and stored in the EC position feature vector 32.

In step S108, the initial value of the matrix W_A used in the mapping MAP_A is set randomly and stored in the initialization model 34 and the update model 42. Likewise, the initial value of the matrix Wⁱ_B for each EC label label_i used in the mapping MAP_B is set randomly and stored in the initialized EC label distributed representation 36 and the EC label distributed representation 44.

Then, in step S110, the EC position and EC label are predicted for each of the plurality of dependency structure trees based on the EC position feature vector 32, the update model 42, and the EC label distributed representation 44.

In the next step S112, the EC positions and EC labels predicted in step S110 for each of the plurality of dependency structure trees are compared with the EC label correct answer data 38 to determine whether learning has converged. If the predictions do not match the EC label correct answer data 38, it is determined that learning has not converged, and the process proceeds to step S114. If they match, it is determined that learning has converged, and the process proceeds to step S116.

In step S114, based on the EC position feature vector 32, the update model 42, the EC label distributed representation 44, the EC label correct answer data 38, and the EC positions and EC labels predicted in step S110, the matrices W_A and Wⁱ_B used in the two mappings MAP_A and MAP_B are updated so as to minimize the weighted pairwise loss shown in equation (4) above. The updated matrix W_A is stored in the update model 42, the updated matrix Wⁱ_B for each EC label label_i is stored in the EC label distributed representation 44, and the process returns to step S110.

In step S116, the current matrix W_A is stored in the model 52, the current matrix Wⁱ_B for each EC label label_i is stored in the EC label distributed representation 54, and the empty category estimation model learning processing routine ends.
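
The control flow of steps S110 to S116 can be sketched as the loop below. Several simplifications are assumptions made for illustration: each training example is a single candidate with one gold label, the score is a dot product after a linear projection, the update is a perceptron-style step rather than the weighted pairwise loss of equation (4), and `max_passes` is a safeguard that the routine in FIG. 4 does not mention. The random initialization of step S108 is left to the caller.

```python
def project(W_A, x):
    """Assumed linear mapping of x into the low-dimensional space."""
    return [sum(w * v for w, v in zip(row, x)) for row in W_A]


def predict_label(W_A, W_B, x):
    """Label whose embedding has the largest dot product with the projected x."""
    z = project(W_A, x)
    return max(W_B, key=lambda lab: sum(a * b for a, b in zip(z, W_B[lab])))


def train(examples, W_A, W_B, lr=0.1, max_passes=100):
    """Repeat predict / check / update until every prediction matches its gold label.

    Returns (W_A, W_B, number of prediction passes), or None as the third item
    if max_passes is reached without convergence.
    """
    for n_pass in range(1, max_passes + 1):
        # S110: predict a label for every example with the current model.
        predictions = [predict_label(W_A, W_B, x) for x, _ in examples]
        # S112: converged only if all predictions equal the correct answer data.
        wrong = [(x, gold, pred)
                 for (x, gold), pred in zip(examples, predictions) if pred != gold]
        if not wrong:
            return W_A, W_B, n_pass  # S116: keep the current matrices
        # S114: update both mappings on the mispredicted examples.
        for x, gold, pred in wrong:
            z = project(W_A, x)
            d = [g - p for g, p in zip(W_B[gold], W_B[pred])]
            W_B[gold] = [b + lr * zi for b, zi in zip(W_B[gold], z)]
            W_B[pred] = [b - lr * zi for b, zi in zip(W_B[pred], z)]
            for r, dr in enumerate(d):
                W_A[r] = [w + lr * dr * xi for w, xi in zip(W_A[r], x)]
    return W_A, W_B, None
```

Because the stopping rule is an exact match on the training data, a pass limit or a held-out check is a practical addition when the data is not perfectly separable.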

<Operation of the empty category estimation device according to the embodiment of the present invention>
Next, the operation of the empty category estimation device 200 according to the embodiment of the present invention will be described. When the input unit 210 receives a dependency structure tree representing the text to be estimated and a feature template, the device stores the dependency structure tree in the dependency structure tree 222 and the feature template in the feature template 226. The empty category estimation device 200 then executes the empty category estimation processing routine shown in FIG. 5.

First, in step S200, the dependency structure tree stored in the dependency structure tree 222, the distributed representation of each word stored in the word distributed representation 228, and the feature template stored in the feature template 226 are read.

Next, in step S202, a feature vector is created for each EC position candidate based on the feature template read in step S200, and stored in the EC position feature vector 232.

Then, in step S204, the EC position and EC label are predicted for the dependency structure tree based on the EC position feature vector 232, the model 234, and the EC label distributed representation 236, and the empty category estimation processing routine ends.

<Example>
<Experimental data>
The method described in this embodiment is applicable to any language for which an annotated corpus is available. Our experiments used part of the Chinese Penn Treebank V7.0. The data set is divided into three parts: training data, development data, and test data. Following previous work, files 1-40 and 901-931 serve as test data and files 41-80 as development data. The training data consists of files {81-325, 400-454, 500-554, 590-596, 600-885, 900}. FIG. 6 shows the distribution of EC labels in the test data. Because the embodiment used in this experiment does not handle the case where two ECs share the same head word and following word, the total number of ECs in the test data is slightly smaller than in Non-Patent Document 1 (such cases can be handled by extending the EC labels to take the dependency type into account). The development data is used to tune parameters, and the final results are reported on the test data. The CTB trees were converted into dependency structure trees for feature extraction, with the ECs preserved.
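
The file-based split can be expressed as a small lookup, shown here as a sketch. It reads the fifth training range as 600-885, which is an assumption about the intended value; the function and constant names are hypothetical, and mapping a file number to an actual corpus file path is left out.

```python
# Inclusive file-number ranges for each part of the split.
TEST_RANGES = [(1, 40), (901, 931)]
DEV_RANGES = [(41, 80)]
TRAIN_RANGES = [(81, 325), (400, 454), (500, 554), (590, 596), (600, 885), (900, 900)]


def in_ranges(file_id, ranges):
    """True if file_id falls inside any inclusive (low, high) range."""
    return any(low <= file_id <= high for low, high in ranges)


def split_of(file_id):
    """Return 'test', 'dev', 'train', or None for a file outside every range."""
    if in_ranges(file_id, TEST_RANGES):
        return "test"
    if in_ranges(file_id, DEV_RANGES):
        return "dev"
    if in_ranges(file_id, TRAIN_RANGES):
        return "train"
    return None
```

Files that fall in none of the ranges, such as 350, are simply left unused.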

<Experimental settings>
In the experiment, the parameters were set to learning rate = 10⁻¹, word vector dimension = 80, and hidden layer dimension = 500.

<Experimental results>
FIG. 7 shows the experimental results: the number of correct answers (correct), precision (p), recall (r), and F1 score (F1). The empty category labels are those defined in the Chinese Penn Treebank, used as is. PRO (big PRO) is an obligatory anaphor that appears in control constructions and the like; pro (small pro) is a dropped pronoun; T is a trace of movement such as relativization or topicalization; OP is an empty relative pronoun; RNR is right node raising; and * is a trace produced by passive or raising constructions. We compare the results of the method of this embodiment with the previous state-of-the-art method (Xue, Non-Patent Document 1). To the best of our knowledge, the method presented here achieves the best performance reported so far on CTB. The results show that the method of this embodiment estimates EC labels more accurately than the previous state-of-the-art method.
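
For reference, the three reported measures follow from three counts under the standard definitions: the number of correctly predicted ECs, the number of predicted ECs, and the number of ECs in the correct answer data. The sketch below uses a hypothetical function name and made-up counts, not the values of FIG. 7.

```python
def prf1(n_correct, n_predicted, n_gold):
    """Precision, recall, and F1 from raw counts; a zero denominator yields 0.0."""
    p = n_correct / n_predicted if n_predicted else 0.0
    r = n_correct / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1


# Illustrative counts only: 8 correct out of 10 predicted, 16 in the gold data.
p, r, f1 = prf1(8, 10, 16)
```

Here p is 0.8 and r is 0.5, and F1 is their harmonic mean.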

As described above, the empty category estimation device according to the embodiment of the present invention extracts distributed representations of words as the feature vector of each EC position candidate based on the dependency structure tree of the input text, and estimates the empty category position and empty category label based on a model that includes a mapping from the feature vector to a low-dimensional space and a mapping from each empty category label to the low-dimensional space. This makes it possible to estimate the position and type of empty categories in the input text with high accuracy.

Likewise, the empty category estimation model learning device according to the embodiment of the present invention extracts distributed representations of words as the feature vector of each EC position candidate based on the dependency structure tree of each text, and learns a model that includes a mapping from the feature vector to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, based on the empty category positions and empty category labels assigned to each of the plurality of texts. This makes it possible to learn a model for accurately estimating the position of empty categories in text.

The experiments also show that the method described in this embodiment detects empty categories with higher precision and higher recall than the conventional method. A neural network model that includes distributed representations of the features and the two learned mappings estimates the position and label of empty categories and can capture their semantics and long-distance dependencies.

The present invention is not limited to the embodiment described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, the empty category estimation device may accept the text to be estimated as input and perform dependency parsing on that text to create the dependency structure tree.
Although estimation of empty categories for Chinese text has been described as an example, the invention is not limited to this. Empty categories may also be estimated for text in languages other than Chinese, such as Japanese.

In this specification, the embodiment has been described with the program installed in advance. However, the program can also be provided stored on a computer-readable recording medium, or provided via a network.

10, 210 Input unit
20, 220 Calculation unit
24 Feature template creation unit
30, 230 Feature extraction unit
40 Learning unit
46 Label prediction unit
48 Convergence determination unit
50 Model update unit
90, 290 Output unit
100 Empty category estimation model learning device
200 Empty category estimation device
238 Estimation unit

Claims

An empty category estimation device that estimates, from input text, an empty category, which is a noun phrase arising from omission or movement, the device including:
a feature extraction unit that, based on the dependency structure tree of the input text, extracts for each candidate empty category position a distributed representation of words as a feature of that candidate; and
an estimation unit that estimates the empty category position and the empty category label based on a pre-learned model, which includes a mapping from the feature to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, and on the feature of each candidate empty category position extracted by the feature extraction unit.

The empty category estimation device according to claim 1, wherein the feature extraction unit extracts, as the feature of the candidate empty category position, at least one of: a distributed representation of the head word of the candidate empty category position; a distributed representation of the following word that follows the candidate empty category position; a distributed representation of a word represented by a child node of the node corresponding to the following word in the dependency structure tree; and a distributed representation of each word on the path from the root node to the candidate empty category position in the dependency structure tree.

The empty category estimation device according to claim 1 or 2, wherein the estimation unit takes, as the estimation result of the empty category position and the empty category label, the combination of candidate empty category position and empty category label that maximizes a score calculated based on the feature of each candidate empty category position extracted by the feature extraction unit, the mapping from the feature to the low-dimensional space, and the mapping from the empty category label to the low-dimensional space.

An empty category estimation model learning device including:
a feature extraction unit that, for each of a plurality of texts to which the position and empty category label of an empty category, which is a noun phrase arising from omission or movement, have been assigned, extracts for each candidate empty category position a distributed representation of words as a feature of that candidate, based on the dependency structure tree of the text; and
a learning unit that learns a model including a mapping from the feature to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, based on the feature of each candidate empty category position extracted for each of the plurality of texts by the feature extraction unit and on the empty category position and empty category label assigned to each of the plurality of texts.

The empty category estimation model learning device according to claim 4, wherein the feature extraction unit extracts, as the feature of the candidate empty category position, at least one of: a distributed representation of the head word of the candidate empty category position; a distributed representation of the following word that follows the candidate empty category position; a distributed representation of a word represented by a child node of the node corresponding to the following word in the dependency structure tree; and a distributed representation of each word on the path from the root node to the candidate empty category position in the dependency structure tree.

An empty category estimation method in an empty category estimation device that includes a feature extraction unit and an estimation unit and that estimates, from input text, an empty category, which is a noun phrase arising from omission or movement, wherein:
the feature extraction unit, based on the dependency structure tree of the input text, extracts for each candidate empty category position a distributed representation of words as a feature of that candidate; and
the estimation unit estimates the empty category position and the empty category label based on a pre-learned model, which includes a mapping from the feature to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, and on the feature of each candidate empty category position extracted by the feature extraction unit.

An empty category estimation model learning method in an empty category estimation model learning device that includes a feature extraction unit and a learning unit, wherein:
the feature extraction unit, for each of a plurality of texts to which the position and empty category label of an empty category, which is a noun phrase arising from omission or movement, have been assigned, extracts for each candidate empty category position a distributed representation of words as a feature of that candidate, based on the dependency structure tree of the text; and
the learning unit learns a model including a mapping from the feature to a low-dimensional space and a mapping from each empty category label to the low-dimensional space, based on the feature of each candidate empty category position extracted for each of the plurality of texts by the feature extraction unit and on the empty category position and empty category label assigned to each of the plurality of texts.

A program for causing a computer to function as each unit of the empty category estimation device according to any one of claims 1 to 3 or the empty category estimation model learning device according to claim 4 or 5.