JP5911931B2 - Predicate term structure extraction device, method, program, and computer-readable recording medium - Google Patents

Publication number
JP5911931B2
JP5911931B2 JP2014183214A JP2014183214A JP5911931B2 JP 5911931 B2 JP5911931 B2 JP 5911931B2 JP 2014183214 A JP2014183214 A JP 2014183214A JP 2014183214 A JP2014183214 A JP 2014183214A JP 5911931 B2 JP5911931 B2 JP 5911931B2
Authority
JP
Japan
Prior art keywords
sentence
question
answer
term structure
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2014183214A
Other languages
Japanese (ja)
Other versions
JP2016057810A (en)
Inventor
小林 のぞみ (Nozomi Kobayashi)
平野 徹 (Toru Hirano)
東中 竜一郎 (Ryuichiro Higashinaka)
牧野 俊朗 (Toshiro Makino)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2014183214A
Publication of JP2016057810A
Application granted
Publication of JP5911931B2

Landscapes

  • Machine Translation (AREA)

Description

The present invention relates to a predicate-argument structure (述語項構造) extraction device, method, program, and computer-readable recording medium, and in particular to a device, method, program, and computer-readable recording medium that extract a predicate-argument structure from a pair of a question sentence and a response sentence.

Text mining systems can aggregate text in the form of a predicate and its argument expressing "what is in what state" (e.g., [インターネット, つながらない], that is, [Internet, cannot connect]), based on dependency parsing (see, for example, Non-Patent Document 1). Most current text mining systems extract predicates and arguments by dependency parsing, but predicate-argument structure analysis is another technique for extracting predicates and their arguments from text (see, for example, Non-Patent Document 2).

Non-Patent Document 1: 脇森浩志 (Hiroshi Wakimori), "ビッグデータに対するテキストマイニングとその適用例" (Text mining for big data and its applications), UNISYS TECHNOLOGY REVIEW, No. 115, 2013.
Non-Patent Document 2: 今村賢治 (Kenji Imamura), 東中竜一郎 (Ryuichiro Higashinaka), 泉朋子 (Tomoko Izumi), "ゼロ代名詞照応付き述語項構造解析の対話への適応" (Adapting predicate-argument structure analysis with zero-pronoun anaphora resolution to dialogue), Proceedings of the 20th Annual Meeting of the Association for Natural Language Processing (言語処理学会), 2014.

However, dialogue text contains many exchanges in which a question and an answer form a pair, such as 「インターネットにつながりますか?」 ("Can you connect to the Internet?") followed by 「いいえ」 ("No"). In this case the answer consists only of 「いいえ」, so the predicate and argument [インターネット, つながらない] ([Internet, cannot connect]) cannot be obtained.

As described in Non-Patent Document 2, conventional predicate-argument structure analysis fills in arguments that are omitted for a given predicate; it does not handle cases in which the predicate itself is omitted. Consequently, when the predicate is omitted on the answer side, as in 「ルータのランプは何色に光っていますか?」 ("What color is the router's lamp lit?") answered by 「赤です」 ("It is red"), the predicate-argument structure cannot be obtained correctly.

If the predicate-argument structure of an answer cannot be obtained correctly, then, for example, an attempt to aggregate call center data by the predicate-argument information of users' utterances will not yield correct results.

The present invention was made to solve the above problem. Its object is to provide a predicate-argument structure extraction device, method, program, and computer-readable recording medium that can accurately extract a predicate-argument structure from a pair of a question sentence and a response sentence.

To achieve the above object, a predicate-argument structure extraction device according to the present invention extracts, from an input pair of a morphologically analyzed question sentence and a morphologically analyzed response sentence, a predicate-argument structure that is a combination of a predicate and a case element corresponding to the predicate. The device comprises: a question expression identification unit that looks up the morphologically analyzed question sentence in a dictionary storing the surface forms of interrogatives to identify the interrogative contained in the question sentence, and that identifies the range of the question expression contained in the question sentence based on the part-of-speech information of the question sentence, predetermined question expression extraction rules, and the identified interrogative; an answer type determination unit that determines the answer type of the response sentence based on the part-of-speech information of the morphologically analyzed response sentence and an expression list storing a plurality of affirmative expressions and a plurality of negative expressions; an answer expression extraction unit that extracts, based on the part-of-speech information of the response sentence, a named entity or noun contained in the response sentence as an answer expression; an answer sentence generation unit that generates an answer sentence to the question sentence by rewriting the question sentence according to a rewriting method predetermined for the answer type, based on the question sentence, the response sentence, the interrogative and question expression identified by the question expression identification unit, the answer type determined by the answer type determination unit, and the answer expression extracted by the answer expression extraction unit; and a predicate-argument structure extraction unit that extracts the predicate-argument structure from the answer sentence generated by the answer sentence generation unit according to predetermined predicate-argument structure extraction rules.

The predicate-argument structure extraction device according to the present invention may further include a predicate-argument structure analysis unit that, when the predicate-argument structure extraction unit extracts no predicate-argument structure, performs predicate-argument structure analysis on the generated answer sentence to generate the predicate-argument structure.

In the predicate-argument structure extraction device according to the present invention, the answer type determination unit may determine whether the answer type of the response sentence is the affirmative/negative type, the predicate-word (用言) type, or the noun type. When the determined answer type is the affirmative/negative type or the noun type, the answer sentence generation unit generates an answer sentence to the question sentence based on the rewriting method predetermined for that answer type. When the determined answer type is the predicate-word type, the answer sentence generation unit generates no answer sentence and instead outputs the question sentence and the response sentence to the predicate-argument structure analysis unit. In that case, the predicate-argument structure analysis unit performs predicate-argument structure analysis on each of the question sentence and the response sentence to generate a predicate-argument structure for each. Then, for predicate-argument structures of the question sentence and the response sentence that share the same predicate, it may output a predicate-argument structure that combines the predicate with those case elements of the question sentence's structure that are not contained in the response sentence's structure.
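The handling of the predicate-word type described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout (`predicate`, `args`), the function name `complement_arguments`, the case label 「ガ」, and the example structures are all hypothetical, and the output of a real predicate-argument structure analyzer is assumed to be already available. Negation of the response is not modeled.

```python
def complement_arguments(question_pas, response_pas):
    """Combine a shared predicate with case elements found only in the question.

    Each structure is a dict {"predicate": str, "args": {case: filler}}.
    This layout is an assumption made for the sketch, not the patent's format.
    """
    outputs = []
    for q in question_pas:
        for r in response_pas:
            if q["predicate"] != r["predicate"]:
                continue  # only structures with the same predicate are compared
            # Case elements of the question that the response does not contain.
            missing = {case: filler for case, filler in q["args"].items()
                       if case not in r["args"]}
            if missing:
                outputs.append({"predicate": q["predicate"], "args": missing})
    return outputs


# Hypothetical analyzer output for a question about a lamp and a response
# that repeats the verb 「光る」 but omits its subject.
question_pas = [{"predicate": "光る", "args": {"ガ": "ランプ"}}]
response_pas = [{"predicate": "光る", "args": {}}]
print(complement_arguments(question_pas, response_pas))
```

In this sketch the omitted subject of the response is recovered from the question, which is the situation the paragraph above addresses.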

A predicate-argument structure extraction method according to the present invention extracts, from an input pair of a morphologically analyzed question sentence and a morphologically analyzed response sentence, a predicate-argument structure that is a combination of a predicate and a case element corresponding to the predicate. The method comprises: a step in which a question expression identification unit looks up the morphologically analyzed question sentence in a dictionary storing the surface forms of interrogatives to identify the interrogative contained in the question sentence, and identifies the range of the question expression contained in the question sentence based on the part-of-speech information of the question sentence, predetermined question expression extraction rules, and the identified interrogative; a step in which an answer type determination unit determines the answer type of the response sentence based on the part-of-speech information of the morphologically analyzed response sentence and an expression list storing a plurality of affirmative expressions and a plurality of negative expressions; a step in which an answer expression extraction unit extracts, based on the part-of-speech information of the response sentence, a named entity or noun contained in the response sentence as an answer expression; a step in which an answer sentence generation unit generates an answer sentence to the question sentence by rewriting the question sentence according to a rewriting method predetermined for the answer type, based on the question sentence, the response sentence, the interrogative and question expression identified by the question expression identification unit, the answer type determined by the answer type determination unit, and the answer expression extracted by the answer expression extraction unit; and a step in which a predicate-argument structure extraction unit extracts the predicate-argument structure from the answer sentence generated by the answer sentence generation unit according to predetermined predicate-argument structure extraction rules.

The predicate-argument structure extraction method according to the present invention may further include a step in which a predicate-argument structure analysis unit, when the predicate-argument structure extraction unit extracts no predicate-argument structure, performs predicate-argument structure analysis on the generated answer sentence to generate the predicate-argument structure.

In the predicate-argument structure extraction method according to the present invention, the step performed by the answer type determination unit may determine whether the answer type of the response sentence is the affirmative/negative type, the predicate-word (用言) type, or the noun type. In the step in which the answer sentence generation unit generates the answer sentence, when the determined answer type is the affirmative/negative type or the noun type, an answer sentence to the question sentence is generated based on the rewriting method predetermined for that answer type; when the determined answer type is the predicate-word type, no answer sentence is generated and the question sentence and the response sentence are output to the predicate-argument structure analysis unit. In the step in which the predicate-argument structure analysis unit generates the predicate-argument structure, when the answer type determined by the answer type determination unit is the predicate-word type, predicate-argument structure analysis is performed on each of the output question sentence and response sentence to generate a predicate-argument structure for each. Then, for predicate-argument structures of the question sentence and the response sentence that share the same predicate, a predicate-argument structure may be output that combines the predicate with those case elements of the question sentence's structure that are not contained in the response sentence's structure.

A program according to the present invention causes a computer to function as each unit of the predicate-argument structure extraction device described above.

A recording medium according to the present invention is a computer-readable recording medium on which is recorded a program that causes a computer to function as each unit of the predicate-argument structure extraction device described above.

The predicate-argument structure extraction device, method, program, and computer-readable recording medium of the present invention identify the interrogative and question expression of the question sentence, determine the answer type of the response sentence, extract the answer expression of the response sentence, generate an answer sentence based on the question sentence, the response sentence, the interrogative and question expression, the answer type, and the answer expression, and extract a predicate-argument structure from the answer sentence. This yields the effect that a predicate-argument structure can be extracted accurately from a pair of a question sentence and a response sentence.

Block diagram showing the configuration of the predicate-argument structure extraction device according to the embodiment of the present invention.
Explanatory diagram showing an example of the interrogative dictionary.
Explanatory diagram showing an example of the affirmative/negative expression list.
Explanatory diagram showing an example of the predicate-argument structure extraction rules.
Flowchart showing the predicate-argument structure extraction processing routine in the predicate-argument structure extraction device according to the embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Configuration of the Predicate-Argument Structure Extraction Device According to the Embodiment of the Present Invention>

The configuration of the predicate-argument structure extraction device according to the embodiment of the present invention is described next. As shown in FIG. 1, the predicate-argument structure extraction device 100 according to the embodiment can be implemented as a computer that includes a CPU, a RAM, and a ROM storing various data and a program for executing the predicate-argument structure extraction processing routine described later. Functionally, as shown in FIG. 1, the predicate-argument structure extraction device 100 includes an input unit 10, a computation unit 20, and an output unit 50.

The input unit 10 accepts a pair consisting of a question sentence obtained from dialogue text and a response sentence to that question sentence.

The computation unit 20 includes a text analysis unit 30, a question expression identification unit 32, an answer type determination unit 34, an answer expression extraction unit 36, an interrogative dictionary 38, an affirmative/negative expression list 40, an answer sentence generation unit 42, a predicate-argument structure extraction unit 44, predicate-argument structure extraction rules 46, and a predicate-argument structure analysis unit 48.

The text analysis unit 30 takes as input the pair of question sentence and response sentence accepted by the input unit 10 and analyzes each sentence using known techniques, namely morphological analysis and named entity extraction.

Table 1 below shows an example of morphological analysis and named entity extraction for the input question sentence 「ランプは何色に光っていますか?」 ("What color is the lamp lit?"), and Table 2 shows the same for the input response sentence 「赤色です」 ("It is red"). In this example no named entity is present, so named entity extraction outputs nothing.

The question expression identification unit 32 looks up the morphologically analyzed question sentence produced by the text analysis unit 30 in the interrogative dictionary 38, which stores combinations of an interrogative's surface form and its interrogative type, and identifies the interrogative contained in the question sentence together with its interrogative type. It then identifies the range of the question expression contained in the question sentence based on the part-of-speech information of the question sentence, predetermined question expression extraction rules, and the identified interrogative and interrogative type.

Specifically, the question expression identification unit 32 first identifies the interrogative by looking up the morphologically analyzed question sentence in the interrogative dictionary 38. As shown in FIG. 2, the interrogative dictionary 38 stores combinations of an interrogative's "surface form" (出現表記) and its "interrogative type". Interrogative types include, for example, "WHO" for asking about a person, "NUMBER" for asking about a quantity, "WHERE" for asking about a place, "HOW" for asking about a reason or means, and "WHICH" for a choice question. In addition, interrogatives that make clear what kind of answer is sought, such as 「何県」 ("which prefecture", seeking the name of a prefecture) and 「何色」 ("what color", seeking the name of a color), are given the interrogative type "SPECIFIC". If nothing in the question sentence matches a surface form in the interrogative dictionary 38, the subsequent range identification is skipped and the question sentence is treated as having no question expression. In the example shown in Table 1, the surface form 「何色」 in the interrogative dictionary matches the analyzed question sentence, and the interrogative type is identified as "SPECIFIC".
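The dictionary lookup described above can be sketched as follows. This is a minimal sketch under stated assumptions: the entries of `INTERROGATIVE_DICT` other than 「何色」, 「何県」, and 「どちら」 are invented for illustration (FIG. 2 is not reproduced here), the segmentation of the example question is hypothetical, and the longest-match strategy is one plausible way to perform the lookup rather than the method the patent prescribes.

```python
# Hypothetical excerpt of the interrogative dictionary:
# surface form -> interrogative type.
INTERROGATIVE_DICT = {
    "何色": "SPECIFIC",
    "何県": "SPECIFIC",
    "誰": "WHO",        # assumed entry
    "どこ": "WHERE",    # assumed entry
    "どちら": "WHICH",
}


def find_interrogative(surfaces):
    """Return (start, end, surface, type) of the first match, or None.

    `surfaces` is the list of morpheme surface forms of the question.
    Adjacent morphemes are joined so that an entry such as 「何色」 is found
    even if the analyzer splits it into 「何」 + 「色」.
    """
    n = len(surfaces)
    for start in range(n):
        # Try the longest span first so longer entries win over shorter ones.
        for end in range(n, start, -1):
            candidate = "".join(surfaces[start:end])
            if candidate in INTERROGATIVE_DICT:
                return start, end, candidate, INTERROGATIVE_DICT[candidate]
    return None  # no interrogative: the question has no question expression


# Assumed segmentation of 「ランプは何色に光っていますか?」
question = ["ランプ", "は", "何", "色", "に", "光っ", "て", "い", "ます", "か", "?"]
print(find_interrogative(question))
```

Returning the morpheme span along with the type lets a later step apply range-identification rules relative to the interrogative's position.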

Next, the question expression identification unit 32 identifies the range of the question expression. It applies the question expression extraction rules to the morphologically analyzed question sentence and the identified interrogative to determine the range of the question expression within the question sentence. The rules are defined in terms of features such as the identified interrogative type, the part of speech of the interrogative, and the part of speech of the morpheme immediately following the interrogative.

For example, Rules 1 to 4 below can be applied as the question expression extraction rules (in the examples, the interrogative is enclosed in double quotation marks). These rules are only examples; the invention is not limited to them, and other rules can of course be applied.

Rule 1: if the interrogative type is "SPECIFIC", the interrogative itself is extracted as the question expression regardless of the following morpheme. For example, the question expression 「何色」 ("what color") is identified from 「“何色”のランプですか」 ("What color is the lamp?").

Rule 2: if the interrogative type is "NUMBER", "TIME", or "WHAT" and the part of speech of the interrogative is "Number", then, when the part of speech of the immediately following morpheme is 「助数詞」 (counter), the interrogative and that morpheme are combined into a candidate. If the surface form of the morpheme immediately following the candidate is not 「の」, the candidate is extracted as the question expression. For example, the question expression 「何センチ」 ("how many centimeters") is identified from 「箱は“何”センチですか」 ("How many centimeters is the box?").

Rule 3: if the part of speech of the interrogative is 「連体詞」 (adnominal), then, when the part of speech of the immediately following morpheme is any of 「名詞接尾辞:名詞」 (noun suffix: noun), 「名詞」 (noun), 「名詞:動作」 (noun: action), or 「冠名詞」 (noun prefix), the interrogative and that morpheme are combined into a candidate. If the surface form of the morpheme immediately following the candidate is not 「の」, the candidate is extracted as the question expression. For example, the question expression 「どんな色」 ("what kind of color") is identified from 「“どんな”色に光っていますか」 ("What kind of color is it lit?").

Rule 4: if the part of speech of the interrogative is 「連用詞」 (adverbial), then, when the surface form of the immediately following morpheme is 「くらい」, the interrogative and that morpheme are combined; if what follows is any of 「名詞接尾辞:名詞」 (noun suffix: noun), 「名詞」 (noun), 「名詞:動作」 (noun: action), or 「冠名詞」 (noun prefix), it is also combined, and the result is taken as a candidate. If the morpheme immediately following the candidate is 「判定詞:終止」 (copula: terminal form), 「句点」 (full stop), or a case particle other than 「の」, the candidate is extracted as the question expression. For example, the question expression 「何回くらい」 ("about how many times") is identified from 「エラーは“何回”くらいありますか」 ("About how many times did the error occur?").

In the example of Table 1, Rule 1 applies: the question expression is identified as 「何色」 and the interrogative type as "SPECIFIC".
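Rules 1 and 2 above can be sketched as follows; Rules 3 and 4 would follow the same pattern and are omitted. This is an illustrative sketch, not the patent's implementation: the function name, the `(surface, pos)` morpheme representation, and the part-of-speech tags assigned to the example morphemes are assumptions, since Table 1 is not reproduced here.

```python
def identify_question_expression(morphemes, start, end, q_type):
    """Return the question expression, or None if neither rule fires.

    morphemes: list of (surface, pos) pairs for the question sentence.
    start, end: the interrogative occupies morphemes[start:end].
    q_type: interrogative type obtained from the dictionary lookup.
    """
    interrogative = "".join(surface for surface, _ in morphemes[start:end])
    q_pos = morphemes[start][1]

    # Rule 1: a SPECIFIC interrogative is itself the question expression.
    if q_type == "SPECIFIC":
        return interrogative

    # Rule 2: a numeric interrogative followed by a counter (助数詞) forms
    # a candidate, which is accepted unless the next morpheme is 「の」.
    if q_type in {"NUMBER", "TIME", "WHAT"} and q_pos == "Number":
        if end < len(morphemes) and morphemes[end][1] == "助数詞":
            after = morphemes[end + 1][0] if end + 1 < len(morphemes) else None
            if after != "の":
                return interrogative + morphemes[end][0]
    return None


# Assumed analyses of the two example questions from the rules.
lamp = [("何色", "名詞"), ("の", "格助詞"), ("ランプ", "名詞"),
        ("です", "判定詞:終止"), ("か", "終助詞")]
box = [("箱", "名詞"), ("は", "連用助詞"), ("何", "Number"),
       ("センチ", "助数詞"), ("です", "判定詞:終止"), ("か", "終助詞")]

print(identify_question_expression(lamp, 0, 1, "SPECIFIC"))
print(identify_question_expression(box, 2, 3, "WHAT"))
```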

The answer type determination unit 34 determines the answer type of the response sentence based on the part-of-speech information of the morphologically analyzed response sentence produced by the text analysis unit 30 and on the affirmative/negative expression list 40, which stores a plurality of affirmative expressions and a plurality of negative expressions.

In this embodiment, the answer type determination unit 34 determines whether the response sentence is of the affirmative/negative type, the predicate-word (用言) type, or the noun type.

The affirmative/negative type covers response sentences that answer with an expression of affirmation or negation, such as 「はい」 ("Yes") or 「いいえ」 ("No").

The predicate-word type covers response sentences whose predicate is an inflecting word (用言: verb, adjective, or adjectival noun), such as 「光っていません」 ("It is not lit") or 「赤い」 ("It is red").

The noun type covers response sentences that answer with a noun, a numerical value, or the like, such as 「赤色です」 ("It is red") or 「3回です」 ("Three times").

Specifically, the answer type determination unit 34 first consults the affirmative/negative expression list 40 prepared in advance, as shown in FIG. 3; if the response sentence contains an expression that matches the list, the response is judged to be the affirmative/negative type. If no expression matches, the response is judged to be the predicate-word type when an inflecting word appears at the end of the response sentence, and the noun type otherwise. Each expression in the affirmative/negative expression list 40 is also labeled as either affirmative or negative, and this label is used in the processing of the answer sentence generation unit 42 described later.

In the example of Table 2, the response sentence contains no inflecting word, so it is judged to be the noun type.
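The three-way decision above can be sketched as follows. The contents of `POLARITY_LIST`, the part-of-speech tags, and the type labels are assumptions made for illustration (FIG. 3 and Table 2 are not reproduced here). The sketch also interprets "appears at the end of the response sentence" as "is the last content word, ignoring trailing suffixes and particles", which is one possible reading.

```python
# Hypothetical excerpt of the affirmative/negative expression list.
POLARITY_LIST = {
    "はい": "affirmative",
    "ええ": "affirmative",
    "いいえ": "negative",
    "いえ": "negative",
}

# Assumed part-of-speech tags for inflecting words (用言).
YOGEN_POS = {"動詞", "形容詞", "形容動詞"}


def answer_type(morphemes):
    """Return (type, polarity) for a response given as (surface, pos) pairs."""
    # 1. Affirmative/negative type: any match against the expression list.
    for surface, _ in morphemes:
        if surface in POLARITY_LIST:
            return "affirmative/negative", POLARITY_LIST[surface]
    # 2. Predicate-word type: the last content word is an inflecting word.
    for _, pos in reversed(morphemes):
        if pos in YOGEN_POS:
            return "predicate-word", None
        if pos == "名詞":
            break  # the last content word is a noun
    # 3. Otherwise the response is treated as the noun type.
    return "noun", None


print(answer_type([("いいえ", "感動詞")]))
print(answer_type([("光っ", "動詞"), ("て", "動詞接尾辞"), ("い", "動詞接尾辞"),
                   ("ませ", "動詞接尾辞"), ("ん", "動詞接尾辞")]))
print(answer_type([("赤色", "名詞"), ("です", "判定詞:終止")]))
```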

The answer expression extraction unit 36 extracts a named entity or noun contained in the response sentence as the answer expression, based on the part-of-speech information of the morphologically analyzed response sentence produced by the text analysis unit 30. Because the expression that constitutes the answer is basically a named entity or a noun, it suffices to extract the named entity or run of consecutive nouns that appears last in the response sentence as the answer expression. If there is no such expression, nothing is output. For example, in the response sentence 「赤色です」 ("It is red"), the noun 「赤色」 ("red") is the answer expression.
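Taking the last named entity or noun run can be sketched as follows. The `(surface, pos, ne_label)` triple, the tag names, and the second example sentence are assumptions for illustration; the sketch treats any morpheme that is a noun or carries a named entity label as part of the run.

```python
def extract_answer_expression(morphemes):
    """Return the last run of noun / named-entity morphemes, or None.

    morphemes: list of (surface, pos, ne_label); ne_label is None when the
    morpheme is not part of a named entity (assumed representation).
    """
    end = None
    for i in range(len(morphemes) - 1, -1, -1):
        _, pos, ne_label = morphemes[i]
        is_target = pos == "名詞" or pos.startswith("名詞:") or ne_label is not None
        if is_target and end is None:
            end = i + 1                      # right edge of the last run
        elif not is_target and end is not None:
            # Left edge reached: join the run morphemes[i + 1:end].
            return "".join(s for s, _, _ in morphemes[i + 1:end])
    if end is not None:                      # the run starts the sentence
        return "".join(s for s, _, _ in morphemes[:end])
    return None                              # no noun or named entity


# 「赤色です」 with assumed tags.
print(extract_answer_expression([("赤色", "名詞", None),
                                 ("です", "判定詞:終止", None)]))
```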

The answer sentence generation unit 42 generates an answer sentence to the question sentence based on the morphologically analyzed question sentence and response sentence, the interrogative and question expression identified by the question expression identification unit 32, the answer type determined by the answer type determination unit 34, and the answer expression extracted by the answer expression extraction unit 36.

Specifically, the answer sentence generation unit 42 generates the answer sentence using the rewriting method predetermined for the determined answer type.

When the answer type is the affirmative/negative type, the answer sentence generation unit 42 rewrites the question sentence into a declarative sentence and uses it as the answer sentence. Rewriting into a declarative sentence can be done by deleting the question mark 「?」 and any sentence-final particle that marks a question (for example, 「か」). For example, when the input is the question sentence 「赤ランプが点灯していますか?」 ("Is the red lamp lit?") and the response sentence 「はい」 ("Yes"), which is of the affirmative/negative type with the label affirmative, rewriting yields the answer sentence 「赤ランプが点灯しています」 ("The red lamp is lit"). When the label is negative, the end of the sentence is additionally rewritten into the negative form. This can be done using the surface forms and parts of speech of the morphemes. For instance, in 「赤ランプが点灯しています」, if the part of speech of the morpheme 「ます」 is 「動詞接尾辞:終止」 (verb suffix: terminal form), the rewriting can be done by a rule such as "if the surface form of the final morpheme is 「ます」 and its part of speech is 「動詞接尾辞:終止」, replace it with 「ません」".
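The rewriting for the affirmative/negative type can be sketched as follows. The function names, the segmentation and tags of the example question, and the restriction of question-marking particles to 「か」 are assumptions; only the single negation rule quoted above (「ます」 to 「ません」) is implemented, so other sentence endings pass through unchanged.

```python
QUESTION_MARKS = {"?", "?"}
QUESTION_PARTICLES = {"か"}  # sentence-final particles that mark a question


def to_declarative(morphemes):
    """Drop trailing question marks and question-marking final particles."""
    out = list(morphemes)
    while out and (out[-1][0] in QUESTION_MARKS or
                   (out[-1][1] == "終助詞" and out[-1][0] in QUESTION_PARTICLES)):
        out.pop()
    return out


def negate(morphemes):
    """Apply the example rule: final 「ます」 (verb suffix: terminal) -> 「ません」."""
    out = list(morphemes)
    if out and out[-1] == ("ます", "動詞接尾辞:終止"):
        out[-1] = ("ません", "動詞接尾辞:終止")
    return out


def rewrite_polar(question, polarity):
    """Generate the answer sentence for an affirmative/negative response."""
    declarative = to_declarative(question)
    if polarity == "negative":
        declarative = negate(declarative)
    return "".join(surface for surface, _ in declarative)


# Assumed analysis of 「赤ランプが点灯していますか?」
question = [("赤", "名詞"), ("ランプ", "名詞"), ("が", "格助詞"),
            ("点灯", "名詞:動作"), ("し", "動詞"), ("て", "動詞接尾辞"),
            ("い", "動詞"), ("ます", "動詞接尾辞:終止"),
            ("か", "終助詞"), ("?", "記号")]
print(rewrite_polar(question, "affirmative"))
print(rewrite_polar(question, "negative"))
```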

When the answer type is the predicate type, the answer sentence generation unit 42 does not generate an answer sentence and instead outputs the morphologically analyzed question sentence and the morphologically analyzed response sentence, unchanged, to the predicate term structure analysis unit 48.

When the answer type is the noun type and both a question expression and an answer expression exist, the answer sentence generation unit 42 replaces the question expression in the question sentence with the answer expression and converts the result into a declarative sentence, which becomes the answer sentence. For example, when the interrogative type is "SPECIFIC", the question sentence is 「ランプは何色に光っていますか?」 ("What color is the lamp glowing?"), and the response sentence is 「赤色です」 ("It is red"), the answer type is the noun type, the question expression is 「何色」 ("what color"), and the answer expression is 「赤色」 ("red"). The question mark "?" and the question-marking sentence-final particle are therefore deleted, and the question expression 「何色」 is rewritten to the answer expression 「赤色」, generating the answer sentence 「ランプは赤色に光っています」 ("The lamp is glowing red"). The same processing is performed for every interrogative type other than "WHICH".

When the interrogative type is "WHICH", the part of the question sentence preceding the question expression is discarded. For example, when the input is 「AとBのどちらが光っていますか?」 ("Which of A and B is glowing?") and 「Aです」 ("It is A"), the question expression is 「どちら」 ("which"), the interrogative type is "WHICH", and the answer expression is 「A」. The question mark "?" and the question-marking sentence-final particle are deleted, 「AとBの」 ("of A and B") is discarded, and the question expression 「どちら」 is rewritten to the answer expression 「A」, generating the answer sentence 「Aが光っています」 ("A is glowing"). In the noun-type case, nothing is output if either the question expression or the answer expression is missing.
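The noun-type rewrite, including the "WHICH" special case, can be sketched as follows. This is a simplified illustration, not the patented implementation: it works on morpheme surface forms only, takes the question expression as a hypothetical (start, end) index span, and strips a trailing 「か」 by surface form rather than by part of speech.

```python
from typing import List, Optional, Tuple


def strip_question_ending(morphemes: List[str]) -> List[str]:
    """Remove a trailing question mark and question particle (by surface form)."""
    out = list(morphemes)
    while out and out[-1] in {"?", "？", "か"}:
        out.pop()
    return out


def noun_type_answer(question: List[str],
                     q_span: Optional[Tuple[int, int]],
                     answer_expr: Optional[str],
                     wh_type: str) -> Optional[str]:
    """Replace the question expression question[start:end] with the answer expression."""
    if q_span is None or not answer_expr:
        return None  # nothing is output when either expression is missing
    start, end = q_span
    body = strip_question_ending(question)
    # For WHICH, everything before the question expression is discarded.
    head = [] if wh_type == "WHICH" else body[:start]
    return "".join(head + [answer_expr] + body[end:])


# Hypothetical segmentations of the two example questions in the text.
q1 = ["ランプ", "は", "何色", "に", "光っ", "て", "い", "ます", "か", "?"]
q2 = ["A", "と", "B", "の", "どちら", "が", "光っ", "て", "い", "ます", "か", "?"]
print(noun_type_answer(q1, (2, 3), "赤色", "SPECIFIC"))  # ランプは赤色に光っています
print(noun_type_answer(q2, (4, 5), "A", "WHICH"))        # Aが光っています
```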

The predicate term structure extraction unit 44 applies predetermined predicate term structure extraction rules 46 to the answer sentence generated by the answer sentence generation unit 42, extracts a predicate term structure from the portion matched by a rule, and outputs it to the output unit 50.

FIG. 4 shows the predicate term structure extraction rules 46 and examples of their application. Each rule is a regular expression written over morpheme-delimited text. Inserting morpheme delimiters prevents a rule from erroneously matching only part of a morpheme. In the example of FIG. 4, the tab character is the delimiter. The predicate term structure extraction rules may be applied to the response sentence as well as to the answer sentence. When no rule matches, nothing is output. Preparing frequently occurring patterns in advance as predicate term structure extraction rules in this way makes it possible to improve processing speed.
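As an illustration of a regular-expression rule over tab-delimited morphemes, the following Python sketch defines one hypothetical rule. The actual rules of FIG. 4 are not reproduced here; the pattern, the group names, and the output keys are assumptions made for the example, and the predicate is returned in its surface form rather than its terminal form.

```python
import re
from typing import Dict, List, Optional

# One hypothetical rule: "<X> が <V> て い ます" yields predicate V with ga-case X.
# (?:^|\t) and (?=\t|$) keep the match aligned to morpheme boundaries.
RULES = [
    re.compile(r"(?:^|\t)(?P<ga>[^\t]+)\tが\t(?P<pred>[^\t]+)\tて\tい\tます(?=\t|$)"),
]


def extract_by_rules(morphemes: List[str]) -> Optional[Dict[str, str]]:
    """Return a predicate term structure from the first matching rule, else None."""
    line = "\t".join(morphemes)  # tab is the morpheme delimiter
    for rule in RULES:
        match = rule.search(line)
        if match:
            return {"predicate": match.group("pred"), "ga": match.group("ga")}
    return None  # no rule matched: nothing is output


print(extract_by_rules(["A", "が", "光っ", "て", "い", "ます"]))
print(extract_by_rules(["赤色", "です"]))
```

Because every literal in the pattern is bounded by tabs, the particle 「が」 in the rule cannot match the first character of a longer morpheme, which is the failure mode the delimiter is meant to prevent.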

When the preceding predicate term structure extraction unit 44 has output nothing, the predicate term structure analysis unit 48 performs predicate term structure analysis, which is a known technique, on the answer sentence generated by the answer sentence generation unit 42 to generate and output a predicate term structure.

In addition, when the answer sentence generation unit 42 has output the question sentence and the response sentence without generating an answer sentence (that is, when the answer type is the predicate type), the predicate term structure analysis unit 48 performs predicate term structure analysis on each of the output question sentence and response sentence to generate a predicate term structure for each. When the two structures have the same predicate, it outputs a predicate term structure that combines the predicate with the case elements that belong to the question sentence's structure but are not included in the response sentence's structure. Specifically, when predicate term structures exist for both the question sentence and the response sentence and the two predicates are the same, the terms present in the question sentence but absent from the response sentence are copied to generate and output the predicate term structure of the answer sentence. Interrogatives such as 「何」 and 「なん」 ("what") are excluded from copying. Whether the predicates are the same can be judged, for example, by whether their terminal forms are identical. When the predicates differ, the predicate term structure of the response sentence is output as is. When no predicate term structure exists for the response sentence, nothing is output.

For example, when the input is the question sentence 「ランプが光っていますか?」 ("Is the lamp glowing?") and the response sentence 「緑に光ってます」 ("It is glowing green"), the predicate term structure of the question sentence is [predicate: 光る (glow), ga-case: ランプ (lamp)] and that of the response sentence is [predicate: 光る, ni-case: 緑 (green)], so the predicates are the same. Because the ga-case is omitted from the response sentence's structure, it is copied from the question sentence's structure, and the predicate term structure of the answer sentence, [predicate: 光る, ga-case: ランプ, ni-case: 緑], is generated and output to the output unit 50.
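The case-element copying for predicate-type responses can be sketched in Python as below. The dictionary layout and the function name are assumptions made for illustration; the predicates are assumed to be already normalized to their terminal forms. The text does not say what happens when only the question sentence lacks a structure, so returning the response structure unchanged in that case is a choice of this sketch.

```python
from typing import Dict, Optional

PAS = Dict[str, object]  # {"predicate": str, "args": {case: filler}}

INTERROGATIVES = {"何", "なん"}  # examples given in the text; never copied


def merge_pas(question_pas: Optional[PAS], response_pas: Optional[PAS]) -> Optional[PAS]:
    """Fill case elements omitted in the response from the question's structure."""
    if response_pas is None:
        return None  # no structure for the response: nothing is output
    if question_pas is None or question_pas["predicate"] != response_pas["predicate"]:
        return response_pas  # predicates differ (or no question structure): output as is
    args = dict(response_pas["args"])
    for case, filler in question_pas["args"].items():
        if case not in args and filler not in INTERROGATIVES:
            args[case] = filler
    return {"predicate": response_pas["predicate"], "args": args}


question_pas = {"predicate": "光る", "args": {"ガ": "ランプ"}}
response_pas = {"predicate": "光る", "args": {"ニ": "緑"}}
print(merge_pas(question_pas, response_pas))
```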

<Operation of the predicate term structure extraction device according to the embodiment of the present invention>

Next, the operation of the predicate term structure extraction device 100 according to the embodiment of the present invention will be described. When the input unit 10 receives a pair of a question sentence and a response sentence, the predicate term structure extraction device 100 executes the predicate processing routine shown in FIG. 5.

First, in step S100, the pair of the question sentence and the response sentence received by the input unit 10 is acquired.

Next, in step S102, text analysis is performed on each of the question sentence and the response sentence acquired in step S100 by morphological analysis and named entity extraction, which are known techniques.

In step S104, the morphologically analyzed question sentence obtained in step S102 is looked up in a dictionary that stores combinations of an interrogative's surface form and its interrogative type, thereby identifying the interrogative contained in the question sentence and its interrogative type. In addition, the range of the question expression contained in the question sentence is identified based on the part-of-speech information of the question sentence, predetermined question expression extraction rules, and the identified interrogative.

In step S106, the answer type of the response sentence is determined based on the part-of-speech information of the morphologically analyzed response sentence obtained in step S102 and on the affirmative/negative expression list 40, which stores a plurality of affirmative expressions and a plurality of negative expressions.
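The decision made in step S106 can be pictured with the following Python sketch. The concrete decision rules are not given in this passage, so everything here is an assumption for illustration: the list contents, the part-of-speech tags, and the heuristic that a response containing a verb or adjective is the predicate type while anything else is the noun type.

```python
from typing import List, Optional, Tuple

Morpheme = Tuple[str, str]  # (surface form, part-of-speech tag)

# Hypothetical contents of the affirmative/negative expression list.
AFFIRMATIVE = {"はい", "ええ"}
NEGATIVE = {"いいえ", "いえ"}
PREDICATE_POS = {"動詞", "形容詞"}  # assumed tags for inflecting words


def answer_type(response: List[Morpheme]) -> Tuple[str, Optional[str]]:
    """Return (answer type, polarity); polarity is None unless affirmative/negative."""
    surfaces = [surface for surface, pos in response if pos != "記号"]
    if surfaces and surfaces[0] in AFFIRMATIVE:
        return ("affirmative_negative", "affirmative")
    if surfaces and surfaces[0] in NEGATIVE:
        return ("affirmative_negative", "negative")
    if any(pos in PREDICATE_POS for _, pos in response):
        return ("predicate", None)
    return ("noun", None)


print(answer_type([("はい", "感動詞")]))
print(answer_type([("緑", "名詞"), ("に", "格助詞"), ("光っ", "動詞"),
                   ("て", "接続助詞"), ("ます", "動詞接尾辞:終止")]))
print(answer_type([("赤色", "名詞"), ("です", "判定詞")]))
```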

In step S108, a named entity or a noun contained in the response sentence is extracted as the answer expression, based on the part-of-speech information of the morphologically analyzed response sentence obtained in step S102.

In step S110, it is determined whether the answer type determined in step S106 is the affirmative/negative type, the noun type, or the predicate type. If it is the affirmative/negative type or the noun type, the process proceeds to step S112; if it is the predicate type, the process proceeds to step S120.

In step S112, an answer sentence for the question sentence is generated based on the morphologically analyzed question sentence and response sentence obtained in step S102, the interrogative and question expression identified in step S104, the answer type determined in step S106, and the answer expression extracted in step S108.

In step S114, the predetermined predicate term structure extraction rules 46 are applied to the answer sentence generated in step S112, and the predicate term structure of the answer sentence is extracted from the portion matched by a rule.

In step S116, it is determined whether a predicate term structure was extracted in step S114. If one was extracted, the process proceeds to step S124; if not, the process proceeds to step S118.

In step S118, predicate term structure analysis is performed on the answer sentence generated in step S112 to generate the predicate term structure of the answer sentence.

In step S120, no answer sentence is generated, and the morphologically analyzed question sentence and the morphologically analyzed response sentence are output.

In step S122, predicate term structure analysis is performed on each of the question sentence and the response sentence output in step S120 to generate a predicate term structure for each. When the two structures have the same predicate, a predicate term structure that combines the predicate with the case elements that belong to the question sentence's structure but are not included in the response sentence's structure is generated as the predicate term structure of the answer sentence.

In step S124, the predicate term structure of the answer sentence extracted in step S114 or generated in step S122 is output, and the process ends.
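The branching of steps S110 through S124 can be summarized in a short Python sketch. The individual processing stages are passed in as callables because their internals are outside the scope of this illustration; all function and parameter names are hypothetical, and returning the structure generated in step S118 as the final result is this sketch's reading of the flow.

```python
from typing import Any, Callable, Optional


def process_pair(question: Any, response: Any, answer_type: str, *,
                 generate_answer: Callable[[Any, Any], Optional[Any]],
                 extract_by_rules: Callable[[Any], Optional[Any]],
                 analyze: Callable[[Any], Optional[Any]],
                 merge: Callable[[Optional[Any], Optional[Any]], Optional[Any]]) -> Optional[Any]:
    """Route a question/response pair through the stages according to the answer type."""
    if answer_type in ("affirmative_negative", "noun"):    # S110
        answer = generate_answer(question, response)       # S112
        if answer is None:
            return None                                    # no answer sentence, no output
        structure = extract_by_rules(answer)               # S114
        if structure is None:                              # S116
            structure = analyze(answer)                    # S118
        return structure                                   # S124
    # Predicate type: analyze both sentences and merge (S120, S122).
    return merge(analyze(question), analyze(response))     # S124


# Usage with stub stages that only record which path was taken.
result = process_pair(
    "q", "r", "noun",
    generate_answer=lambda q, r: "answer",
    extract_by_rules=lambda a: None,
    analyze=lambda s: {"source": "analyzer", "text": s},
    merge=lambda q_pas, r_pas: {"source": "merge"},
)
print(result)
```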

As described above, the predicate term structure extraction device according to the embodiment of the present invention identifies the interrogative and question expression of the question sentence, determines the answer type of the response sentence, extracts the answer expression of the response sentence, generates an answer sentence based on the question sentence, the response sentence, the interrogative and question expression, the answer type, and the answer expression, and extracts a predicate term structure from the answer sentence. A predicate term structure can thereby be extracted accurately from a pair of a question sentence and a response sentence.

The present invention is not limited to the above embodiment, and various modifications and applications are possible without departing from the gist of the invention.

For example, in the embodiment described above, the answer type of the response sentence is determined based on matching against the affirmative/negative expression list, the expression at the end of the response sentence, and the like, but the determination is not limited to this. Data in which response sentences are tagged in advance with one of three types (affirmative/negative type, predicate type, and noun type) may be prepared, a model that distinguishes the three types may be built using a known supervised learning framework, and the answer type may be determined by applying the model to an input response sentence.

DESCRIPTION OF SYMBOLS
10 Input unit
20 Computation unit
30 Text analysis unit
32 Question expression identification unit
34 Answer type determination unit
36 Answer expression extraction unit
38 Interrogative dictionary
40 Affirmative/negative expression list
42 Answer sentence generation unit
44 Predicate term structure extraction unit
46 Predicate term structure extraction rules
48 Predicate term structure analysis unit
50 Output unit
100 Predicate term structure extraction device

Claims (8)

A predicate term structure extraction device that extracts, from an input pair of a morphologically analyzed question sentence and a morphologically analyzed response sentence, a predicate term structure that is a combination of a predicate and case elements corresponding to the predicate, the device comprising:
a question expression identification unit that looks up the morphologically analyzed question sentence in a dictionary storing surface forms of interrogatives to identify an interrogative contained in the question sentence, and that identifies the range of a question expression contained in the question sentence based on part-of-speech information of the question sentence, predetermined question expression extraction rules, and the identified interrogative;
an answer type determination unit that determines an answer type of the response sentence based on part-of-speech information of the morphologically analyzed response sentence and an expression list storing a plurality of affirmative expressions and a plurality of negative expressions;
an answer expression extraction unit that extracts, based on the part-of-speech information of the response sentence, a named entity or a noun contained in the response sentence as an answer expression;
an answer sentence generation unit that generates an answer sentence for the question sentence by rewriting the question sentence in accordance with a rewriting method predetermined for the answer type, based on the question sentence, the response sentence, the interrogative and question expression identified by the question expression identification unit, the answer type determined by the answer type determination unit, and the answer expression extracted by the answer expression extraction unit; and
a predicate term structure extraction unit that extracts the predicate term structure from the answer sentence generated by the answer sentence generation unit in accordance with predetermined predicate term structure extraction rules.
The predicate term structure extraction device according to claim 1, further comprising a predicate term structure analysis unit that, when no predicate term structure has been extracted by the predicate term structure extraction unit, performs predicate term structure analysis on the generated answer sentence to generate the predicate term structure.

The predicate term structure extraction device according to claim 2, wherein:
the answer type determination unit determines whether the answer type of the response sentence is an affirmative/negative type, a predicate type, or a noun type;
the answer sentence generation unit, when the determined answer type is the affirmative/negative type or the noun type, generates the answer sentence for the question sentence based on the rewriting method predetermined for the answer type, and, when the determined answer type is the predicate type, outputs the question sentence and the response sentence to the predicate term structure analysis unit without generating the answer sentence; and
the predicate term structure analysis unit, when the answer type determined by the answer type determination unit is the predicate type, performs predicate term structure analysis on each of the output question sentence and response sentence to generate a predicate term structure for each of the question sentence and the response sentence, and, for predicate term structures of the question sentence and the response sentence that have the same predicate, outputs a predicate term structure that is a combination of the predicate and the case elements that belong to the predicate term structure of the question sentence and are not included in the predicate term structure of the response sentence.
A predicate term structure extraction method for extracting, from an input pair of a morphologically analyzed question sentence and a morphologically analyzed response sentence, a predicate term structure that is a combination of a predicate and case elements corresponding to the predicate, the method comprising:
a step in which a question expression identification unit looks up the morphologically analyzed question sentence in a dictionary storing surface forms of interrogatives to identify an interrogative contained in the question sentence, and identifies the range of a question expression contained in the question sentence based on part-of-speech information of the question sentence, predetermined question expression extraction rules, and the identified interrogative;
a step in which an answer type determination unit determines an answer type of the response sentence based on part-of-speech information of the morphologically analyzed response sentence and an expression list storing a plurality of affirmative expressions and a plurality of negative expressions;
a step in which an answer expression extraction unit extracts, based on the part-of-speech information of the response sentence, a named entity or a noun contained in the response sentence as an answer expression;
a step in which an answer sentence generation unit generates an answer sentence for the question sentence by rewriting the question sentence in accordance with a rewriting method predetermined for the answer type, based on the question sentence, the response sentence, the interrogative and question expression identified by the question expression identification unit, the answer type determined by the answer type determination unit, and the answer expression extracted by the answer expression extraction unit; and
a step in which a predicate term structure extraction unit extracts the predicate term structure from the answer sentence generated by the answer sentence generation unit in accordance with predetermined predicate term structure extraction rules.
The predicate term structure extraction method according to claim 4, further comprising a step in which a predicate term structure analysis unit, when no predicate term structure has been extracted by the predicate term structure extraction unit, performs predicate term structure analysis on the generated answer sentence to generate the predicate term structure.

The predicate term structure extraction method according to claim 5, wherein:
the step of determining by the answer type determination unit determines whether the answer type of the response sentence is an affirmative/negative type, a predicate type, or a noun type;
the step in which the answer sentence generation unit generates the answer sentence, when the determined answer type is the affirmative/negative type or the noun type, generates the answer sentence for the question sentence based on the rewriting method predetermined for the answer type, and, when the determined answer type is the predicate type, outputs the question sentence and the response sentence to the predicate term structure analysis unit without generating the answer sentence; and
the step in which the predicate term structure analysis unit generates the predicate term structure, when the answer type determined by the answer type determination unit is the predicate type, performs predicate term structure analysis on each of the output question sentence and response sentence to generate a predicate term structure for each of the question sentence and the response sentence, and, for predicate term structures of the question sentence and the response sentence that have the same predicate, outputs a predicate term structure that is a combination of the predicate and the case elements that belong to the predicate term structure of the question sentence and are not included in the predicate term structure of the response sentence.
A program for causing a computer to execute each unit of the predicate term structure extraction device according to any one of claims 1 to 3.

A computer-readable recording medium on which is recorded a program for causing a computer to execute each unit of the predicate term structure extraction device according to any one of claims 1 to 3.
JP2014183214A 2014-09-09 2014-09-09 Predicate term structure extraction device, method, program, and computer-readable recording medium Active JP5911931B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2014183214A JP5911931B2 (en) 2014-09-09 2014-09-09 Predicate term structure extraction device, method, program, and computer-readable recording medium

Publications (2)

Publication Number Publication Date
JP2016057810A JP2016057810A (en) 2016-04-21
JP5911931B2 true JP5911931B2 (en) 2016-04-27

Family

ID=55758607

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2014183214A Active JP5911931B2 (en) 2014-09-09 2014-09-09 Predicate term structure extraction device, method, program, and computer-readable recording medium

Country Status (1)

Country Link
JP (1) JP5911931B2 (en)

Legal Events

Date Code Title Description
TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20160301

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20160330

R150 Certificate of patent or registration of utility model

Ref document number: 5911931

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150