JP6871809B2

JP6871809B2 - Information processing equipment, information processing methods, and programs

Info

Publication number: JP6871809B2
Application number: JP2017114483A
Authority: JP
Inventors: 真中辻; 伊東　久; 久伊東; 毘蘆呉; 翔太相樂; 明久藤田
Original assignee: エヌ・ティ・ティレゾナント株式会社
Priority date: 2017-06-09
Filing date: 2017-06-09
Publication date: 2021-05-12
Anticipated expiration: 2037-06-09
Also published as: JP2018206307A

Description

The present invention relates to an information processing device, an information processing method, and a program.

In recent years, techniques have become known that use machine learning (for example, deep learning methods) to output an answer to an input question (see, for example, Non-Patent Document 1). An information processing device using such conventional techniques selects and outputs the answer estimated to be appropriate from among known answers prepared in advance, such as answers accumulated in the past.

Tan M, Xiang B, Zhou B, “LSTM-BASED DEEP LEARNING MODELS FOR NONFACTOID ANSWER SELECTION”, 1511.04108v1, 12 Nov 2015

However, in the conventional information processing device described above, when the question is, for example, a non-factoid question that asks for an answer based on an explanation of a reason or an event, the answer becomes a long and complex text. Because the device outputs a known answer prepared in advance, it is difficult for it to generate a new answer. As a result, the conventional device sometimes outputs an unnatural answer whose wording feels out of place for the question.

The present invention has been made to solve the above problem, and its object is to provide an information processing device, an information processing method, and a program capable of generating, in response to a question, an answer with natural wording that feels less out of place.

In order to solve the above problem, one aspect of the present invention is an information processing device comprising: a question acquisition unit that acquires an input question sentence that has been input; and an answer generation unit that generates an answer sentence to the input question sentence acquired by the question acquisition unit, based on a learning result obtained by machine learning from learning information having a plurality of sets each consisting of a question sentence and known partial answer sentences, the known partial answer sentences corresponding respectively to a plurality of sub-items into which an answer sentence is divided according to a predetermined line of argument. The learning result is learned by being optimized with a loss function calculated by combining (a) an encoder-decoder model that encodes the question sentence one word at a time, in word order, to generate a context vector, and learns by decoding the known partial answer sentence for each of the plurality of sub-items based on the generated context vector, and (b) a sentence unit learning model that takes, as input information, set information including the question sentence and the partial answer sentence for each of the plurality of sub-items decoded based on the encoder-decoder model, and learns based on a combination, over the plurality of sub-items, of a question intermediate vector and answer intermediate vectors. The question intermediate vector is generated by learning the word sequence bidirectionally, based on a question feature vector group generated from the feature vectors into which the question sentence is converted word by word, following the word order in both the forward and reverse directions of the time series. The answer intermediate vectors correspond respectively to the plurality of sub-items, and each is generated by learning the word sequence bidirectionally, based on an answer feature vector group generated from the feature vectors into which the partial answer sentence is converted word by word, following the bidirectional word order.

In one aspect of the present invention, in the above information processing device, the sentence unit learning model takes, as the partial answer sentence, an answer sentence generated based on an intermediate learning result of the encoder-decoder model and the context vector, and learns using set information including that partial answer sentence and the question sentence as the input information.

In one aspect of the present invention, in the above information processing device, the encoder-decoder model learns by decoding the known partial answer sentence based on topic information associated with each word in the known partial answer sentence.

In one aspect of the present invention, in the above information processing device, the known answer sentences include a correct answer sentence and an incorrect answer sentence for the question sentence. The answer generation unit generates the answer sentence based on the learning result obtained by machine learning from the learning information having a plurality of sets each consisting of the question sentence and the correct answer sentence and incorrect answer sentence corresponding to each of the plurality of sub-items. The encoder-decoder model learns by decoding the correct answer sentence and the incorrect answer sentence for each of the plurality of sub-items based on the context vector. The sentence unit learning model learns based on a combination, over the plurality of sub-items, of the question intermediate vector, correct answer intermediate vectors, and incorrect answer intermediate vectors. The correct answer intermediate vectors correspond respectively to the plurality of sub-items, and each is generated by learning the word sequence bidirectionally, based on a correct answer feature vector group generated from the feature vectors into which the correct answer sentence is converted word by word, following the bidirectional word order. The incorrect answer intermediate vectors likewise correspond respectively to the plurality of sub-items, and each is generated by learning the word sequence bidirectionally, based on an incorrect answer feature vector group generated from the feature vectors into which the incorrect answer sentence is converted word by word, following the bidirectional word order.

In one aspect of the present invention, in the above information processing device, the learning result is learned such that, based on the correct answer intermediate vector and the incorrect answer intermediate vector corresponding to a first sub-item among the plurality of sub-items, the correct answer feature vector group and the incorrect answer feature vector group corresponding to a second sub-item different from the first sub-item are updated.

In one aspect of the present invention, the above information processing device includes a learning processing unit that performs machine learning based on the learning information and generates the learning result.

One aspect of the present invention is an information processing method including: a question acquisition step in which a question acquisition unit acquires an input question sentence that has been input; and an answer generation step in which an answer generation unit generates an answer sentence to the input question sentence acquired in the question acquisition step, based on a learning result obtained by machine learning from learning information having a plurality of sets each consisting of a question sentence and known partial answer sentences, the known partial answer sentences corresponding respectively to a plurality of sub-items into which an answer sentence is divided according to a predetermined line of argument. The learning result is learned by being optimized with a loss function calculated by combining (a) an encoder-decoder model that encodes the question sentence one word at a time, in word order, to generate a context vector, and learns by decoding the known partial answer sentence for each of the plurality of sub-items based on the generated context vector, and (b) a sentence unit learning model that takes, as input information, set information including the question sentence and the partial answer sentence for each of the plurality of sub-items decoded based on the encoder-decoder model, and learns based on a combination, over the plurality of sub-items, of a question intermediate vector and answer intermediate vectors. The question intermediate vector is generated by learning the word sequence bidirectionally, based on a question feature vector group generated from the feature vectors into which the question sentence is converted word by word, following the word order in both the forward and reverse directions of the time series. The answer intermediate vectors correspond respectively to the plurality of sub-items, and each is generated by learning the word sequence bidirectionally, based on an answer feature vector group generated from the feature vectors into which the partial answer sentence is converted word by word, following the bidirectional word order.

One aspect of the present invention is a program for causing a computer to execute: a question acquisition step of acquiring an input question sentence that has been input; and an answer generation step of generating an answer sentence to the input question sentence acquired in the question acquisition step, based on a learning result obtained by machine learning from learning information having a plurality of sets each consisting of a question sentence and known partial answer sentences, the known partial answer sentences corresponding respectively to a plurality of sub-items into which an answer sentence is divided according to a predetermined line of argument. The learning result is learned by being optimized with a loss function calculated by combining (a) an encoder-decoder model that encodes the question sentence one word at a time, in word order, to generate a context vector, and learns by decoding the known partial answer sentence for each of the plurality of sub-items based on the generated context vector, and (b) a sentence unit learning model that takes, as input information, set information including the question sentence and the partial answer sentence for each of the plurality of sub-items decoded based on the encoder-decoder model, and learns based on a combination, over the plurality of sub-items, of a question intermediate vector and answer intermediate vectors. The question intermediate vector is generated by learning the word sequence bidirectionally, based on a question feature vector group generated from the feature vectors into which the question sentence is converted word by word, following the word order in both the forward and reverse directions of the time series. The answer intermediate vectors correspond respectively to the plurality of sub-items, and each is generated by learning the word sequence bidirectionally, based on an answer feature vector group generated from the feature vectors into which the partial answer sentence is converted word by word, following the bidirectional word order.

According to the present invention, an answer with natural wording that feels less out of place can be generated in response to a question.

FIG. 1 is a functional block diagram showing an example of the information processing system according to the present embodiment.
FIG. 2 is a diagram illustrating an example of the learning model in the present embodiment.
FIG. 3 is a diagram illustrating an example of the decoder model in the present embodiment.
FIG. 4 is a diagram illustrating an example of the learning processing unit and the learning process in the present embodiment.
FIG. 5 is a diagram illustrating an example of the encoder-decoder model in the present embodiment.
FIG. 6 is a flowchart showing an example of the learning process in the present embodiment.
FIG. 7 is a flowchart showing an example of the process by which the information processing device in the present embodiment generates an answer sentence from a question sentence.
FIG. 8 is a diagram showing a comparison between the answer generation method in the present embodiment and the prior art.
FIG. 9 is a diagram illustrating an example of an answer sentence generated by the information processing device in the present embodiment.

An information processing device according to an embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a schematic block diagram showing an example of the information processing system 100 according to the present embodiment.
As shown in FIG. 1, the information processing system 100 includes an information processing device 1 and a terminal device 2. The information processing device 1 and the terminal device 2 are connected via a network NW1.
The information processing system 100 provides an information service such as a Q&A service, in which posted questions and answers are displayed on the terminal device 2 connected to the information processing device 1 so that users can share information.

The terminal device 2 is a client terminal that a user operates to use the information service provided by the information processing system 100. To simplify the description, FIG. 1 shows a single terminal device 2 connected to the information processing device 1, but a plurality of terminal devices 2 may be connected to the information processing device 1.

The information processing device 1 is, for example, a server device that provides an information service such as a Q&A service. The information processing device 1, for example, registers a question sentence received from a user via the terminal device 2 in the Q&A service so that it can be viewed, and likewise registers answer sentences received from other users via the terminal device 2. The information processing device 1 also uses machine learning to generate an answer sentence to a registered question sentence, and registers that answer sentence in the Q&A service so that it can be viewed. The information processing device 1 includes an NW (network) communication unit 11, a storage unit 12, and a control unit 13.

The NW communication unit 11 connects to the network NW1 using, for example, the Internet, and communicates various kinds of information via the network NW1. For example, the NW communication unit 11 connects, via the network NW1, to a terminal device 2 that has issued a connection request, and exchanges various kinds of information with it.

The storage unit 12 stores information used in the various processes executed by the information processing device 1. The storage unit 12 includes, for example, a service storage unit 121 and a learning result storage unit 122.
The service storage unit 121 stores posted information, such as question sentences and answer sentences that users have posted to the Q&A service from the terminal device 2.
The learning result storage unit 122 stores the learning result obtained through machine learning by the learning processing unit 132, which is described later. Details of the learning result are also described later.

The control unit 13 is, for example, a processor including a CPU (Central Processing Unit), and performs overall control of the information processing device 1. The control unit 13 executes various processes, such as the process of providing an information service like the Q&A service described above, the process of generating the learning result stored in the learning result storage unit 122, and the process of generating an answer sentence to a question sentence that the information processing device 1 has acquired via the NW communication unit 11. The control unit 13 includes a service providing unit 131, a learning processing unit 132, a question acquisition unit 133, and an answer generation unit 134.

The service providing unit 131 executes processing related to the information service provided by the information processing device 1. For example, the service providing unit 131 stores question sentences and answer sentences received from the terminal device 2 via the NW communication unit 11 in the service storage unit 121 as posted information. For a terminal device 2 that requests to browse the Q&A service, the service providing unit 131 outputs the posted information stored in the service storage unit 121 to that terminal device 2 via the NW communication unit 11 and causes the terminal device 2 to display it. The service providing unit 131 also stores, in the service storage unit 121, the answer sentence generated by the answer generation unit 134, which is described later. In the Q&A service, the service providing unit 131 provides information to users separately for each category (classification), such as "romance", "family", and "cooking".

The learning processing unit 132 executes machine learning and generates a learning result based on learning information having a plurality of sets each consisting of a question sentence and partial answer sentences, where the partial answer sentences correspond respectively to a plurality of sub-items into which a known answer sentence is divided according to a predetermined line of argument (scenario). For example, when the scenario of the answer sentence is defined in the order "conclusion", "supplement", the sub-items are "conclusion" and "supplement"; when the scenario is defined in the order "small talk", "example", "conclusion", the sub-items are "small talk", "example", and "conclusion".
The following description uses "conclusion" and "supplement" as an example of the plurality of sub-items.
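As a minimal sketch of the learning information just described, one set can be pictured as a question sentence paired with one known partial answer sentence per sub-item. The class name `TrainingExample`, the constant `SCENARIO`, and the sample texts are hypothetical and are not taken from the embodiment; only the sub-item names "conclusion" and "supplement" come from the description.

```python
from dataclasses import dataclass
from typing import Dict, List

# Ordered sub-items of the scenario used as the running example in the text.
SCENARIO: List[str] = ["conclusion", "supplement"]


@dataclass
class TrainingExample:
    """One set of learning information (hypothetical layout)."""
    question: str
    # Maps a sub-item name to the known partial answer sentence for it.
    partial_answers: Dict[str, str]

    def validate(self, scenario: List[str]) -> None:
        """Raise ValueError if any sub-item of the scenario has no partial answer."""
        missing = [item for item in scenario if item not in self.partial_answers]
        if missing:
            raise ValueError(f"missing sub-items: {missing}")


# Placeholder texts, invented purely for illustration.
example = TrainingExample(
    question="placeholder question sentence",
    partial_answers={
        "conclusion": "placeholder conclusion sentence",
        "supplement": "placeholder supplement sentence",
    },
)
example.validate(SCENARIO)
```

A three-part scenario such as "small talk", "example", "conclusion" would only change the contents of `SCENARIO` and the keys of `partial_answers`.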

The learning processing unit 132 also generates a learning result by training a learning model that uses deep learning technology, taking as input information, for example, the posted information of the Q&A service (known question sentences and known answer sentences) stored in the service storage unit 121. The known answer sentences include a correct answer sentence and an incorrect answer sentence for the question sentence. As the input information of the learning process (the learning information), the learning processing unit 132 uses a set consisting of a question sentence, its correct answer sentence, and an incorrect answer sentence arbitrarily extracted from the answer sentences other than the correct one. The learning processing unit 132 stores the generated learning result in the learning result storage unit 122.
The learning model of the present embodiment, the configuration of the learning processing unit 132, and the details of the learning process are described later.
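The construction of a (question, correct answer, incorrect answer) set can be sketched as follows. This is only an illustration under assumptions: the function name `sample_triples` is hypothetical, and drawing the incorrect answer uniformly at random from the correct answers of other questions is one possible reading of "arbitrarily extracted", not a procedure stated in the description.

```python
import random
from typing import Dict, List, Tuple


def sample_triples(
    correct: Dict[str, str], rng: random.Random
) -> List[Tuple[str, str, str]]:
    """Build (question, correct answer, incorrect answer) triples.

    `correct` maps each question sentence to its correct answer sentence.
    The incorrect answer is drawn at random from the answers belonging to
    other questions (an assumed sampling rule).
    """
    triples = []
    for question, answer in correct.items():
        candidates = [
            a for q, a in correct.items() if q != question and a != answer
        ]
        if not candidates:
            continue  # no usable incorrect answer for this question
        triples.append((question, answer, rng.choice(candidates)))
    return triples
```

Passing an explicitly seeded `random.Random` keeps the sampled sets reproducible between training runs.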

The question acquisition unit 133 acquires the input question sentence that has been input to the information processing device 1. For example, the question acquisition unit 133 acquires the input question sentence from among the question sentences stored in the service storage unit 121.

The answer generation unit 134 generates an answer sentence to the input question sentence acquired by the question acquisition unit 133, based on the learning result learned by the learning processing unit 132 described above. That is, based on the learning result stored in the learning result storage unit 122, the answer generation unit 134 generates an answer sentence produced by combining the plurality of sub-items. The answer generation unit 134 also supplies the generated answer sentence to the service providing unit 131, which stores it in the service storage unit 121 as a posted answer to the input question sentence.
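The final combining step can be sketched as a join of the partial answer sentences in scenario order. The function name `assemble_answer` and the space separator are assumptions made for illustration; the description states only that the sub-items are combined.

```python
from typing import Dict, Sequence


def assemble_answer(
    partials: Dict[str, str],
    scenario: Sequence[str] = ("conclusion", "supplement"),
    sep: str = " ",
) -> str:
    """Join the partial answer sentences in the order given by the scenario."""
    return sep.join(partials[item] for item in scenario)
```

For example, `assemble_answer({"supplement": "B.", "conclusion": "A."})` returns `"A. B."`, because the order follows the scenario rather than the dictionary.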

Next, the learning model in the present embodiment is described with reference to FIGS. 2 and 3.
FIG. 2 is a diagram illustrating an example of the learning model M1 in the present embodiment.
As shown in FIG. 2, the learning model M1 in the present embodiment is a model that uses deep learning technology and combines an encoder-decoder model M11 with a sentence unit learning model M12.

The encoder-decoder model M11 encodes the question sentence one word at a time, in word order, to generate a context vector 10, and learns by decoding the known partial answer sentence for each of the plurality of sub-items based on the generated context vector 10. The encoder-decoder model M11 is, for example, a neural encoder-decoder model: it first encodes the question sentence one word (token) at a time, and then decodes the answer sentence one word (token) at a time. It is a model that learns by decoding so as to generate the known answer sentence for each of the plurality of sub-items. The encoder-decoder model M11 generates words (WORD) sequentially, conditioned on the input sentence X, which is the question sentence. The target sentence is Y = {y(1), ..., y(T)}. In the encoder-decoder model M11, the loss function L is expressed by equation (1).
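Equation (1) is not reproduced in this text. As a hedged illustration only, a common loss for a neural encoder-decoder is the negative log-likelihood of the target sentence, L = -Σ_t log p(y(t) | y(1), ..., y(t-1), X); whether equation (1) has exactly this form is an assumption. The function name `sequence_nll` and the dictionary representation of the per-step word distributions are hypothetical.

```python
import math
from typing import Dict, List


def sequence_nll(step_probs: List[Dict[str, float]], target: List[str]) -> float:
    """Negative log-likelihood of a target sentence Y = {y(1), ..., y(T)}.

    step_probs[t] is the decoder's distribution over words at step t,
    already conditioned on the question X and the earlier target words.
    This is an assumed form of the loss, not the patent's equation (1).
    """
    if len(step_probs) != len(target):
        raise ValueError("one distribution is required per target word")
    return -sum(math.log(p[y]) for p, y in zip(step_probs, target))
```

The loss falls as the decoder assigns higher probability to each word of the known partial answer sentence.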

The encoder-decoder model M11 includes an encoder model M111 and a decoder model M112.
To capture the overall meaning of the question, the encoder model M111 uses a biLSTM (bidirectional Long Short-Term Memory), which encodes in both directions. The encoder model M111 generates the context vector 10 by max pooling, a process that extracts and accumulates the maximum value of each element of the biLSTM output.
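The max pooling step can be sketched on its own, taking the per-word biLSTM outputs as given. The function name `max_pool_over_time` and the list-of-lists representation are assumptions for illustration; the biLSTM itself is not implemented here.

```python
from typing import List


def max_pool_over_time(outputs: List[List[float]]) -> List[float]:
    """Element-wise maximum over time steps.

    outputs[t][d] is dimension d of the biLSTM output for the t-th word.
    The result keeps, for each dimension, the largest value seen at any word,
    giving one fixed-length vector regardless of sentence length.
    """
    if not outputs:
        raise ValueError("at least one time step is required")
    return [max(column) for column in zip(*outputs)]
```

For the two-word input `[[0.1, -0.5, 0.3], [0.4, -0.2, 0.0]]` the pooled vector is `[0.4, -0.2, 0.3]`.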

The decoder model M112 is a model that learns to output the partial answer sentences (a conclusion answer sentence ac and a supplementary answer sentence as) based on the context vector 10 generated by the encoder model M111, and it predicts each word using an LSTM (Long Short-Term Memory). Here, the decoder model M112 decodes both the conclusion answer sentence ac and the supplementary answer sentence as based on the context vector 10.
For example, based on the context vector 10 obtained when the encoder model M111 encodes the question sentence q "long-distance", "romance", "wa", "love", "o", ..., the decoder model M112 learns to decode the known partial answer sentence (conclusion answer sentence ac) "long-distance", "romance", "wa", "true", "love", .... Likewise, based on the context vector 10, the decoder model M112 learns to decode the known partial answer sentence (supplementary answer sentence as) "long-distance", "romance", "wa", "your", "love", ....

As shown in FIG. 3, the decoder model M112 is a model trained with a C-LSTM (Contextual-Long Short-Term Memory), which learns by decoding the known partial answer sentence based on topic information associated with each word in the partial answer sentence.

FIG. 3 is a diagram illustrating an example of the decoder model in the present embodiment.
In the example shown in FIG. 3, the topic information corresponding to the word "情報" (information) is "IT" (information technology), and the topic information corresponding to the word "社会" (society) is "public". The topic information corresponding to the words "とは" (to wa), "、" (comma), and "が" (ga) is "connection", and the topic information corresponding to the word "資源" (resource) is "IT". Topic information is thus, for example, information indicating the field to which a word relates.
The decoder model M112 learns to generate an answer sentence (partial answer sentence) based on the word sequence of the answer sentence (partial answer sentence) as shown in FIG. 3 and the topic information corresponding to each word.
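The pairing of each word with its topic information can be sketched as a simple lookup. The word-to-topic entries below are the ones named for FIG. 3; the function name `attach_topics`, the `"unknown"` fallback label, and the token order in the usage example are assumptions, and how the C-LSTM consumes these pairs is not shown.

```python
from typing import Dict, List, Tuple

# Word -> topic pairs taken from the FIG. 3 example in the description.
TOPIC_OF: Dict[str, str] = {
    "情報": "IT",
    "社会": "public",
    "とは": "connection",
    "、": "connection",
    "が": "connection",
    "資源": "IT",
}


def attach_topics(
    words: List[str],
    topic_of: Dict[str, str] = TOPIC_OF,
    default: str = "unknown",  # hypothetical label for words without a topic
) -> List[Tuple[str, str]]:
    """Pair each word of a partial answer sentence with its topic information."""
    return [(word, topic_of.get(word, default)) for word in words]


# Illustrative token order only; it is not the sentence shown in FIG. 3.
tagged = attach_topics(["情報", "社会", "とは"])
```

Here `tagged` is `[("情報", "IT"), ("社会", "public"), ("とは", "connection")]`, the kind of word-plus-topic sequence a topic-aware decoder would be trained on.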

Returning to FIG. 2, the sentence unit learning model M12 is a sentence-by-sentence model that learns sentence by sentence, taking as input information the output of the encoder-decoder model M11 described above (the partial answer sentences) and the question sentence q. For the question sentence q and for each of the partial answer sentences of the plurality of sub-items, the sentence unit learning model M12 uses a biLSTM and max pooling, and learns by being optimized with a loss function Lw that combines the encoder-decoder model M11 and the decoder model M112.

The sentence unit learning model M12 may treat an answer sentence generated from the intermediate learning result of the encoder-decoder model M11 and the context vector 10 as a partial answer sentence, and may learn from set information containing that partial answer sentence and the question sentence as input information. The input information of the sentence unit learning model M12 is, for example, information (feature vectors) obtained by vectorizing the set information containing the partial answer sentence and the question sentence.
The sentence unit learning model M12 may also use an attention mechanism in which, based on the output corresponding to a first sub-item among the plurality of sub-items, the biLSTM corresponding to a second sub-item different from the first sub-item is updated during learning.

Next, the learning processing unit 132 and the learning process in the present embodiment are described with reference to FIGS. 4 and 5.
FIG. 4 is a diagram illustrating an example of the learning processing unit 132 and the learning process in the present embodiment. The example in FIG. 4 covers the case where the known answer sentences that form part of the learning input information include both a correct answer sentence and an incorrect answer sentence for the question sentence.

As shown in FIG. 4, the learning processing unit 132 includes the encoder-decoder model M11, QA-LSTM (Question Answering Long Short-Term Memory) units (20-1, 20-2), and a loss function generation unit 30.

The encoder-decoder model M11 here is the encoder-decoder model M11 of FIG. 2 adapted to handle correct and incorrect answer sentences, for example the model shown in FIG. 5. It learns from the question sentence q, the correct "conclusion" sentence ac+, the incorrect "conclusion" sentence ac-, the correct "supplement" sentence as+, and the incorrect "supplement" sentence as-, all of which are contained in a set of learning information.

FIG. 5 is a diagram illustrating an example of the encoder-decoder model M11 in the present embodiment.
As shown in FIG. 5, the encoder-decoder model M11 includes the encoder model M111 and the decoder model M112, and learns from the set information consisting of the question sentence q, the correct "conclusion" sentence ac+, the incorrect "conclusion" sentence ac-, the correct "supplement" sentence as+, and the incorrect "supplement" sentence as-.

As in FIG. 2, the encoder model M111 encodes the question sentence q one word at a time, following the word order, to generate the context vector 10.
Based on the context vector 10, the decoder model M112 learns by decoding so as to generate each of the correct "conclusion" sentence ac+, the incorrect "conclusion" sentence ac-, the correct "supplement" sentence as+, and the incorrect "supplement" sentence as-.
In this way, the encoder-decoder model M11 learns by decoding, based on the context vector 10, the correct and incorrect answer sentences for each of the plurality of sub-items. The correct and incorrect "conclusion" sentences and the correct and incorrect "supplement" sentences generated by the encoder-decoder model M11 are denoted ac2+, ac2-, as2+, and as2-, respectively. These four sentences are output as feature vectors.

Returning to FIG. 4, the learning processing unit 132 converts the question sentence q and the sentences output by the encoder-decoder model M11 (the correct "conclusion" sentence ac2+, the incorrect "conclusion" sentence ac2-, the correct "supplement" sentence as2+, and the incorrect "supplement" sentence as2-) into their respective feature vector groups. That is, the learning processing unit 132 converts the question sentence q into the feature vector sequence W_q, which is a set of feature vectors. Likewise, it converts ac2+ into the feature vector sequence W_ac+ and ac2- into the feature vector sequence W_ac-, and it converts as2+ into the feature vector sequence W_as+ and as2- into the feature vector sequence W_as-.

The QA-LSTM units (20-1, 20-2) are biLSTMs, that is, neural networks that learn in both directions. The QA-LSTM unit 20-1 is the biLSTM for the "conclusion", and the QA-LSTM unit 20-2 is the biLSTM for the "supplement". In the present embodiment the two units have the same configuration; when referring to an arbitrary QA-LSTM unit of the learning processing unit 132, or when no distinction is needed, the term QA-LSTM unit 20 is used.

The QA-LSTM unit 20 includes a question embedding vector generation unit 21, a correct answer embedding vector generation unit 22, and an incorrect answer embedding vector generation unit 23.
The question embedding vector generation unit 21 learns the word order bidirectionally and generates the question embedding vector O_q, based on a bidirectional vector sequence 24 (question feature vector group) built from the per-word feature vectors of the question sentence in both the forward and the reverse direction of the word order. For example, the unit generates the bidirectional vector sequence 24 from the feature vector sequence W_q of the question sentence, and then generates the question embedding vector O_q (question intermediate vector) by max pooling, which extracts and collects the maximum value of each element across the bidirectional vector sequence 24.

The correct answer embedding vector generation unit 22 learns the word order bidirectionally and generates the correct answer embedding vector O_a+, based on a bidirectional vector sequence 25 (correct answer feature vector group) built from the per-word feature vectors of the correct answer sentence in both the forward and the reverse direction of the word order. For example, the unit generates the bidirectional vector sequence 25 from the feature vector sequence W_a+ of the correct answer sentence, and then generates the correct answer embedding vector O_a+ (correct answer intermediate vector) by max pooling, which extracts and collects the maximum value of each element across the bidirectional vector sequence 25.

The incorrect answer embedding vector generation unit 23 learns the word order bidirectionally and generates the incorrect answer embedding vector O_a-, based on a bidirectional vector sequence 26 (incorrect answer feature vector group) built from the per-word feature vectors of the incorrect answer sentence in both the forward and the reverse direction of the word order. For example, the unit generates the bidirectional vector sequence 26 from the feature vector sequence W_a- of the incorrect answer sentence, and then generates the incorrect answer embedding vector O_a- (incorrect answer intermediate vector) by max pooling, which extracts and collects the maximum value of each element across the bidirectional vector sequence 26.
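The max pooling step used by all three embedding vector generation units can be sketched as an element-wise maximum over the bidirectional vector sequence. This is a minimal pure-Python illustration; the function name `max_pool` and the toy values are assumptions and do not come from the embodiment.

```python
from typing import List


def max_pool(vectors: List[List[float]]) -> List[float]:
    """Return the element-wise maximum over a non-empty sequence of vectors.

    Each inner list plays the role of one bidirectional vector h(t); the
    result plays the role of an embedding vector such as O_q.
    """
    if not vectors:
        raise ValueError("max pooling needs at least one vector")
    # zip(*vectors) walks the sequence column by column (element by element).
    return [max(column) for column in zip(*vectors)]


# Toy bidirectional vector sequence with three time steps.
h = [
    [0.1, -0.5, 0.3],
    [0.4, 0.2, -0.1],
    [-0.2, 0.0, 0.9],
]
o_q = max_pool(h)  # [0.4, 0.2, 0.9]
```

Because the maximum is taken per element, the result has the same dimension as each h(t) regardless of sentence length.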

The LSTM underlying the QA-LSTM unit 20 is disclosed in Non-Patent Document 1. In a basic LSTM, given the time-series input X = {x(1), x(2), ..., x(N)}, where x(t) is the feature vector of the t-th word, each bidirectional vector h(t), which is an internal vector of the bidirectional vector sequences (24, 25, 26), is updated at each time step t by the following equation (2).

Here, the basic LSTM architecture has three gates (input i_t, forget f_t, and output o_t) and a cell memory vector c_t. σ() is the sigmoid function. W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c, b_i, b_f, b_o, and b_c are the network parameters to be learned.
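The image of equation (2) is not reproduced in this text. Assuming it follows the basic LSTM of Non-Patent Document 1 with the gates and parameters named above, it would take the following form. The symbol \tilde{c}_t for the candidate cell state and \odot for the element-wise product are notational assumptions.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i\,x(t) + U_i\,h(t-1) + b_i\right) \\
f_t &= \sigma\left(W_f\,x(t) + U_f\,h(t-1) + b_f\right) \\
o_t &= \sigma\left(W_o\,x(t) + U_o\,h(t-1) + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c\,x(t) + U_c\,h(t-1) + b_c\right) \\
c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1} \\
h(t) &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

In a biLSTM the same recurrence is run once in word order and once in reverse, and the two h(t) are concatenated per time step.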

The QA-LSTM unit 20-1 is the biLSTM for the "conclusion". Based on the feature vector sequences (W_q, W_ac+, W_ac-), it generates the question embedding vector O_qc, the correct answer embedding vector O_ac+, and the incorrect answer embedding vector O_ac-. The QA-LSTM unit 20-1 includes a question embedding vector generation unit 21-1, a correct answer embedding vector generation unit 22-1, and an incorrect answer embedding vector generation unit 23-1. The question embedding vector generation unit 21-1 generates the bidirectional vector sequence 24-1 from the feature vector sequence W_q and generates the question embedding vector O_qc by max pooling. The correct answer embedding vector generation unit 22-1 generates the bidirectional vector sequence 25-1 from the feature vector sequence W_ac+ and generates the correct answer embedding vector O_ac+ by max pooling. The incorrect answer embedding vector generation unit 23-1 generates the bidirectional vector sequence 26-1 from the feature vector sequence W_ac- and generates the incorrect answer embedding vector O_ac- by max pooling.

The QA-LSTM unit 20-2 is the biLSTM for the "supplement". Based on the feature vector sequences (W_q, W_as+, W_as-), it generates the question embedding vector O_qs, the correct answer embedding vector O_as+, and the incorrect answer embedding vector O_as-. The QA-LSTM unit 20-2 includes a question embedding vector generation unit 21-2, a correct answer embedding vector generation unit 22-2, and an incorrect answer embedding vector generation unit 23-2. The question embedding vector generation unit 21-2 generates the bidirectional vector sequence 24-2 from the feature vector sequence W_q and generates the question embedding vector O_qs by max pooling. The correct answer embedding vector generation unit 22-2 generates the bidirectional vector sequence 25-2 from the feature vector sequence W_as+ and generates the correct answer embedding vector O_as+ by max pooling. The incorrect answer embedding vector generation unit 23-2 generates the bidirectional vector sequence 26-2 from the feature vector sequence W_as- and generates the incorrect answer embedding vector O_as- by max pooling.

When learning, the QA-LSTM unit 20-2 also uses an attention mechanism to update the bidirectional vector sequence 25-2 and the bidirectional vector sequence 26-2. For example, based on the correct answer embedding vector O_ac+ and the incorrect answer embedding vector O_ac- that the QA-LSTM unit 20-1 generated for the "conclusion", the QA-LSTM unit 20-2 updates the bidirectional vector sequence 25-2 (correct answer feature vector group) and the bidirectional vector sequence 26-2 (incorrect answer feature vector group) for the "supplement". Specifically, the QA-LSTM unit 20-2 updates the bidirectional vector h_s(t), which is an internal vector of the bidirectional vector sequences 25-2 and 26-2, by the following equation (3).

Here, t is the time step, and W_sm, W_cm, and w_mb are attention parameters. ~h_s(t) denotes the updated bidirectional vector. In this text, a prefixed ~ represents a symbol placed directly above the character that follows it.
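The image of equation (3) is not reproduced in this text. The sketch below assumes that it mirrors the attention of Non-Patent Document 1 with the parameters renamed: a score is computed from tanh(W_sm h_s(t) + W_cm O_c), projected by w_mb, normalized by softmax over t, and used to scale h_s(t). That form, the function names, and the toy values are assumptions, not the patent's stated equation.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def attend(h_s: List[Vector], o_c: Vector,
           W_sm: Matrix, W_cm: Matrix, w_mb: Vector) -> List[Vector]:
    """Scale each supplement vector h_s(t) by a softmax weight conditioned
    on the conclusion embedding o_c (assumed form of equation (3))."""
    proj_c = matvec(W_cm, o_c)
    scores = []
    for h in h_s:
        m = [math.tanh(a + b) for a, b in zip(matvec(W_sm, h), proj_c)]
        scores.append(sum(w * x for w, x in zip(w_mb, m)))
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [[wt * x for x in h] for wt, h in zip(weights, h_s)]


identity = [[1.0, 0.0], [0.0, 1.0]]
updated = attend([[1.0, 2.0], [1.0, 2.0]], [0.0, 0.0], identity, identity, [1.0, 1.0])
# Two identical time steps receive equal weight 0.5 each.
```

Max pooling the returned vectors would then yield O_as+ or O_as-, depending on which sequence was passed in.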

The loss function Ls, which uses cosine similarity, is based on the question embedding vector O_qc, the correct answer embedding vector O_ac+, and the incorrect answer embedding vector O_ac- generated by the QA-LSTM unit 20-1, and on the question embedding vector O_qs, the correct answer embedding vector O_as+, and the incorrect answer embedding vector O_as- generated by the QA-LSTM unit 20-2. It is expressed, for example, by the following equation (4). That is, the loss function Ls corresponding to the sentence unit learning model M12 described above is expressed by equation (4). In the sentence unit learning model M12, the loss function Ls is calculated based on the combinations of "conclusion", "supplement", "correct answer", and "incorrect answer".

Here, [y, z] denotes the concatenation of vector y and vector z. O_q is [O_qc, O_qs]. M is a constant, and k (0 < k < 1) is a parameter that controls the margin.
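The image of equation (4) is not reproduced in this text, so its exact terms, including how k scales the margin across the conclusion/supplement combinations, cannot be recovered here. As a rough illustration only, the sketch below shows the generic cosine-similarity margin loss of the kind used in Non-Patent Document 1, applied to concatenated vectors. The function names, the single margin, and the toy vectors are assumptions.

```python
import math
from typing import List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def margin_loss(o_q: Vector, o_pos: Vector, o_neg: Vector, margin: float) -> float:
    """Generic hinge loss: zero once the correct answer beats the incorrect
    one by at least `margin` in cosine similarity (not the patent's eq. (4))."""
    return max(0.0, margin - cosine(o_q, o_pos) + cosine(o_q, o_neg))


# [y, z] in the text is vector concatenation; list + list does the same here.
o_qc, o_qs = [1.0, 0.0], [1.0, 0.0]
o_q = o_qc + o_qs                      # O_q = [O_qc, O_qs]
correct = [1.0, 0.0] + [1.0, 0.0]      # [O_ac+, O_as+]
incorrect = [0.0, 1.0] + [0.0, 1.0]    # [O_ac-, O_as-]
loss = margin_loss(o_q, correct, incorrect, margin=0.2)
```

With these toy vectors the correct pair is aligned with the question and the incorrect pair is orthogonal to it, so the loss is zero; swapping the two answers gives a loss close to 1.2.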

The loss function generation unit 30 calculates the loss function Lw by combining the encoder-decoder model M11 and the sentence unit learning model M12. That is, the loss function generation unit 30 combines the loss function L of equation (1) and the loss function Ls of equation (4) described above, and calculates the loss function Lw by the following equation (5).

Here, α is a parameter that controls the weighting between the two loss functions.
The loss function Lw is set so that the cosine value for each combination of question and answer being learned is maximized.

The learning processing unit 132 generates a learning result by optimizing with the loss function Lw calculated using the configuration described above, and stores the generated learning result in the learning result storage unit 122. The learning result includes the parameter set of equation (2) for the "conclusion", {W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c, b_i, b_f, b_o, b_c}_c, the parameter set for the "supplement", {W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c, b_i, b_f, b_o, b_c}_s, and the attention parameters {W_sm, W_cm, w_mb}.

In the example above, the scenario of the answer sentence consists of the two sub-items "conclusion" and "supplement", but it may consist of two or more sub-items. In that case, the learning result of the learning processing unit 132 is learned by optimization with the loss function Lw, which is calculated from combinations, across the plurality of sub-items, of the question embedding vector O_q generated from the question sentence, the correct answer embedding vector O_a+ corresponding to each sub-item, and the incorrect answer embedding vector (incorrect answer intermediate vector) O_a- corresponding to each sub-item.

Next, the operation of the information processing device 1 according to the present embodiment is described with reference to the drawings.
<Learning process>
First, the learning process of the learning processing unit 132 in the information processing device 1 is described with reference to FIG. 6.

FIG. 6 is a flowchart showing an example of the learning process in the present embodiment.
As shown in FIG. 6, the learning processing unit 132 first assigns "1" to the variable n (step S101). The variable n counts the number of learning iterations. This example describes the case where learning is repeated NN times.

Next, the learning processing unit 132 acquires a set of a question sentence and answer sentences as input information (step S102). For example, from the set information of multiple existing question sentences and existing answer sentences, the learning processing unit 132 acquires, as shown in FIG. 4, set information consisting of the question sentence q, the conclusion answer sentence ac, and the supplement answer sentence as (the correct "conclusion" sentence ac+, the incorrect "conclusion" sentence ac-, the correct "supplement" sentence as+, and the incorrect "supplement" sentence as-).

Next, the learning processing unit 132 encodes the question sentence q (step S103). The learning processing unit 132 encodes the acquired question sentence q with the encoder model M111 of the encoder-decoder model M11 described above. That is, it encodes the acquired question sentence q by biLSTM and max pooling to generate the context vector 10. The context vector 10 here is a question vector.

Next, the learning processing unit 132 decodes the conclusion answer sentence ac and the supplement answer sentence as (step S104). To learn the conclusion answer sentence ac and the supplement answer sentence as, the learning processing unit 132 uses, for example, two type identification vectors (sub-item vectors) prepared in advance. In the decoder model M112 of the encoder-decoder model M11, the learning processing unit 132 uses a type identification vector as input information together with the context vector 10 generated by the encoder model M111, and decodes so as to produce the answer sentence of the target type (sub-item), that is, the conclusion answer sentence ac or the supplement answer sentence as.

Here, the output sequence (word sequence) of the decoder model M112 is built one word (token) at a time. The type identification vector is analogous in that it serves as a context vector in the C-LSTM model. As a result, the learning processing unit 132 can generate the target sequence (answer sentence) with a single encoder-decoder network, according to the identification information of the input answer type ("conclusion" or "supplement").
As shown in FIG. 5, the learning processing unit 132 decodes a correct answer sentence and an incorrect answer sentence for each of the "conclusion" and the "supplement". In step S104, the learning processing unit 132 learns the "conclusion" and "supplement" answer sentences one character (one word) at a time.

Next, the learning processing unit 132 generates the conclusion answer sentence ac2 and the supplement answer sentence as2 (step S105). In the encoder-decoder learning process described above, which learned from one set of the question sentence q, the conclusion answer sentence ac, and the supplement answer sentence as, the learning processing unit 132 generates ac2 and as2 by simply feeding each predicted output word back in as the input for predicting the next output. This process is performed once the learning in step S104 has progressed to EOS (end of sequence; the last character of the conclusion answer sentence ac or the supplement answer sentence as), and it provisionally generates ac2 and as2 from the parameters updated up to that point (the intermediate learning result). That is, the learning processing unit 132 generates the conclusion answer sentence ac2 and the supplement answer sentence as2 based on the intermediate learning result of the encoder-decoder model M11 and the context vector 10. Specifically, as shown in FIG. 4, the learning processing unit 132 generates the correct "conclusion" sentence ac2+, the incorrect "conclusion" sentence ac2-, the correct "supplement" sentence as2+, and the incorrect "supplement" sentence as2-.
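Feeding each predicted word back in as the next input until EOS, as in step S105, can be sketched as a simple greedy loop. The names `generate`, `toy_next_word`, the `"<eos>"` marker, and the `max_len` guard are hypothetical; a real decoder would predict from the context vector 10, the type identification vector, and the learned parameters rather than from a canned list.

```python
from typing import Callable, List

EOS = "<eos>"  # assumed end-of-sequence marker


def generate(next_word: Callable[[List[str]], str], max_len: int = 50) -> List[str]:
    """Build a sentence one word at a time, feeding the words produced so
    far back into the predictor until it emits EOS or max_len is reached."""
    words: List[str] = []
    while len(words) < max_len:
        word = next_word(words)
        if word == EOS:
            break
        words.append(word)
    return words


# Toy predictor standing in for the decoder: replays a fixed sentence.
CANNED = ["information", "is", "a", "resource", EOS]


def toy_next_word(prefix: List[str]) -> str:
    return CANNED[min(len(prefix), len(CANNED) - 1)]


sentence = generate(toy_next_word)
# sentence == ["information", "is", "a", "resource"]
```

The `max_len` guard is a design choice for the sketch so that a predictor that never emits EOS cannot loop forever.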

Next, the learning processing unit 132 inputs the question sentence q, the conclusion answer sentence ac2, and the supplement answer sentence as2 into the sentence unit learning model M12 and generates the loss function Lw (step S106). The learning processing unit 132 generates the loss function Lw by, for example, the following processes (a) to (f).
(a) First, as shown in FIG. 4, the learning processing unit 132 converts the set information consisting of the question sentence q, the correct "conclusion" sentence ac2+, the incorrect "conclusion" sentence ac2-, the correct "supplement" sentence as2+, and the incorrect "supplement" sentence as2- into the feature vector sequences (W_q, W_ac+, W_ac-, W_as+, W_as-).

(b) Next, the QA-LSTM unit 20-1 of the learning processing unit 132 generates the question embedding vector O_qc, the correct answer embedding vector O_ac+, and the incorrect answer embedding vector O_ac- based on the feature vector sequences (W_q, W_ac+, W_ac-). The QA-LSTM unit 20-1 max-pools the t-th bidirectional vectors (h_qc(t), h_ac+(t), h_ac-(t)) of the bidirectional vector sequences (24-1, 25-1, 26-1), respectively, to generate O_qc, O_ac+, and O_ac-.

(c) Next, the QA-LSTM unit 20-2 of the learning processing unit 132 first generates the question embedding vector O_qs. The QA-LSTM unit 20-2 max-pools the t-th bidirectional vectors h_qs(t) of the bidirectional vector sequence 24-2 to generate the question embedding vector O_qs.

(d) Next, the QA-LSTM unit 20-2 updates the t-th bidirectional vectors (h_as+(t), h_as-(t)) of the bidirectional vector sequences (25-2, 26-2) using the correct answer embedding vector O_ac+ and the incorrect answer embedding vector O_ac-. That is, the QA-LSTM unit 20-2 updates each bidirectional vector (h_as+(t), h_as-(t)) using equation (3) described above and generates the updated vectors (~h_as+(t), ~h_as-(t)).

(e) Next, the QA-LSTM unit 20-2 max-pools the updated vectors (~h_as+(t), ~h_as-(t)), respectively, to generate the correct answer embedding vector O_as+ and the incorrect answer embedding vector O_as-.
(f) Then, the loss function generation unit 30 of the learning processing unit 132 generates the loss function Lw based on the generated embedding vectors (O_qc, O_ac+, O_ac-, O_qs, O_as+, O_as-) and equation (5).

Next, the learning processing unit 132 optimizes with the generated loss function Lw (step S107). The learning processing unit 132 optimizes each parameter using the calculated loss function Lw. For example, it optimizes the parameter set for the "conclusion", {W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c, b_i, b_f, b_o, b_c}_c, the parameter set for the "supplement", {W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c, b_i, b_f, b_o, b_c}_s, and the attention parameters {W_sm, W_cm, w_mb}. The learning processing unit 132 stores the optimized parameters as the learning result in the learning result storage unit 122.

Next, the learning processing unit 132 determines whether the input information has been exhausted (step S108). The learning processing unit 132 determines whether there is a next set of input information, that is, a next set consisting of the question sentence q, the conclusion answer sentence ac, and the supplement answer sentence as (the correct "conclusion" sentence ac+, the incorrect "conclusion" sentence ac-, the correct "supplement" sentence as+, and the incorrect "supplement" sentence as-). If the input information has been exhausted (step S108: YES), the learning processing unit 132 advances the process to step S109. If the input information has not been exhausted, that is, there is further input information (step S108: NO), the learning processing unit 132 returns the process to step S102.

In step S109, the learning processing unit 132 determines whether the variable n is equal to or greater than the number of repetitions NN (n ≥ NN), that is, whether the processing from step S102 to step S108 has been repeated NN times. If the variable n is equal to or greater than NN (step S109: YES), the learning processing unit 132 ends the learning process. If the variable n is less than NN (step S109: NO), the learning processing unit 132 advances the processing to step S110.

In step S110, the learning processing unit 132 assigns "n + 1" to the variable n, that is, increments n by 1, and returns the processing to step S102.
In this way, the learning processing unit 132 repeats the processing from step S102 to step S108 NN times and stores the resulting learning result in the learning result storage unit 122.
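The control flow of steps S102 to S110 can be sketched as follows. This is only an illustration of the loop structure, not the implementation of the embodiment: `train`, `compute_loss`, and `update` are hypothetical names, the loss and the optimizer are passed in as callables because this section does not define them, and the initial value n = 1 is an assumption (the initialization step is described elsewhere).

```python
from typing import Callable, Iterable, Tuple

# One training example: (q, ac_pos, ac_neg, as_pos, as_neg).
Example = Tuple[str, str, str, str, str]


def train(
    examples: Iterable[Example],
    compute_loss: Callable[[dict, Example], float],
    update: Callable[[dict, float], dict],
    params: dict,
    nn: int,
) -> dict:
    """Run NN passes over the examples, updating params after each example."""
    examples = list(examples)
    n = 1  # assumed initial value of the repetition counter
    while True:
        # S102 to S108: process examples until the input information is exhausted.
        for example in examples:
            loss = compute_loss(params, example)  # loss Lw for this example
            params = update(params, loss)         # S107: optimize the parameters
        if n >= nn:   # S109: YES, so learning ends
            return params
        n += 1        # S110: n <- n + 1, then back to S102
```

With n starting at 1, the check in S109 stops the loop after exactly NN passes, which matches the statement that steps S102 to S108 are repeated NN times.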

<Answer sentence generation process>
Next, the process by which the information processing device 1 of the present embodiment generates an answer sentence from a question sentence is described with reference to the drawings.
FIG. 7 is a flowchart showing an example of that process.

As shown in FIG. 7, the information processing device 1 first acquires a question sentence from the service storage unit 121 (step S201). Specifically, the question acquisition unit 133 of the information processing device 1 acquires an input question sentence from among the question sentences stored in the service storage unit 121.

Next, the answer generation unit 134 of the information processing device 1 generates an answer sentence based on the input question sentence and the learning result stored in the learning result storage unit 122 (step S202). This learning result has been learned by the learning processing unit 132 using a combination of the encoder-decoder model M11 described above, which learns in units of characters (words), and the sentence-unit learning model M12, a sentence-by-sentence model that learns in units of sentences. In other words, the answer generation unit 134 generates the answer sentence from the input question sentence based on a learning result obtained by, for example, the learning process shown in FIG. 6.

Rather than simply selecting an existing answer sentence, the answer generation unit 134 generates a new answer sentence by appropriately combining the partial answer sentences of partial items that reflect the logical flow of the text. Moreover, because the encoder-decoder model M11, which learns in units of characters (words), is part of the combination, the answer generation unit 134 does not simply pick the "conclusion" and "supplement" partial answer sentences from existing answer sentences; it selects each partial answer sentence word by word to generate a new answer sentence.

Next, the answer generation unit 134 stores the answer sentence in the service storage unit 121 (step S203). Specifically, the answer generation unit 134 supplies the generated answer sentence to the service providing unit 131 and has it stored in the service storage unit 121 as a posted answer to the input question sentence. The answer sentence generated by the answer generation unit 134 can then be viewed, as the answer to the question sentence, from a terminal device 2 connected to the information processing device 1 via the network NW1. After step S203, the information processing device 1 ends the answer sentence generation process.

Next, the evaluation results for answer sentences generated by the information processing device 1 of the present embodiment are described with reference to FIG. 8.

FIG. 8 is a diagram comparing the method of the present embodiment with the prior art.
In FIG. 8, "Seq2seq" shows, for comparison, the evaluation result obtained when only the loss function Ls of equation (4), from the sentence-unit learning model M12 (a sentence-by-sentence model), is used. "C-LSTM" shows the evaluation result obtained when the loss function L of equation (1), from the C-LSTM model described above, is used.

"Method of the present embodiment" shows the evaluation result obtained when the loss function Lw of equation (5), as used by the learning processing unit 132 of the present embodiment, is used.
The metrics "ROUGE-1" (unigram), "ROUGE-2" (bigram), and "ROUGE-L" (longest common subsequence) belong to the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) family of evaluation metrics. Each takes a value from 0 to 1.0, and a larger value means a better evaluation. For details of ROUGE, see, for example, Chin-Yew Lin, "ROUGE: A Package for Automatic Evaluation of Summaries", in Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pages 74-81, 2004.
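To make the three metrics concrete, the sketch below computes recall-only variants of ROUGE-1, ROUGE-2, and ROUGE-L for one candidate against one reference. It is a simplified illustration, not the ROUGE package used for FIG. 8: the function names are hypothetical, tokens are assumed to be given as lists, and precision, F-measure, and multi-reference handling are omitted.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: List[str], reference: List[str], n: int) -> float:
    """Fraction of reference n-grams that also occur in the candidate."""
    ref = ngrams(reference, n)
    cand = ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match by the candidate count so repeats are not over-credited.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate: List[str], reference: List[str]) -> float:
    """LCS length divided by the reference length."""
    return lcs_length(candidate, reference) / len(reference) if reference else 0.0
```

For Japanese text, the token lists could hold characters or morphemes; that choice is outside the scope of this sketch.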

The learning information used for the evaluation consists of 5,000 pairs of question sentences and answer sentences accumulated in 16 categories, including "love advice", "travel", and "medical care", of the Q&A service "Oshiete goo" (教えてｇｏｏ). The partial items are the two items "conclusion" and "supplement".

As test data for the evaluation, two sets of 100 tuples of a question sentence q, a conclusion answer sentence ac, and a supplementary answer sentence as were prepared, and the evaluation results are reported as averages. In this evaluation, the character embedding size was set to 300, α in equation (5) to 0.5, and the number of repetitions NN to 500.

As shown in FIG. 8, the "method of the present embodiment" scores 0.4546 on ROUGE-1, 0.2030 on ROUGE-2, and 0.3311 on ROUGE-L. On all three metrics, these values are higher than those of the conventional "Seq2seq" and "C-LSTM".

FIG. 9 is a diagram illustrating an example of an answer sentence generated by the information processing device 1 of the present embodiment.
In FIG. 9, "answer sentence generated by C-LSTM" shows, for comparison, the answer sentence obtained with the C-LSTM model described above.
As shown in FIG. 9, the "answer sentence generated by C-LSTM" is an unnatural reply to the question sentence, such as "Being unable to be considerate of the other person means, I think, that that person is there. I think it would be better to wait for the size."

In contrast, the answer sentence generated by the information processing device 1 of the present embodiment is "I think you should pluck up your courage and tell them how you feel. You do seem a little indecisive.", a natural-sounding answer with reduced incongruity.

As described above, the information processing device 1 according to the present embodiment includes the question acquisition unit 133 and the answer generation unit 134. The question acquisition unit 133 acquires an input question sentence. The answer generation unit 134 generates an answer sentence to the input question sentence acquired by the question acquisition unit 133, based on a learning result obtained by machine learning from learning information that has a plurality of sets of a question sentence and known partial answer sentences, each corresponding to one of a plurality of partial items into which an answer sentence is divided according to a predetermined logical flow of the text. The learning result is learned by optimization with the loss function Lw, which is calculated by combining the encoder-decoder model M11 and the sentence-unit learning model M12. The encoder-decoder model M11 encodes the question sentence one word at a time, in word order, to generate the context vector 10, and learns by decoding the known partial answer sentence for each of the partial items based on the generated context vector 10.

The sentence-unit learning model M12 takes as input information a set containing the question sentence and the partial answer sentence for each of the partial items decoded by the encoder-decoder model M11. The sentence-unit learning model M12 learns based on combinations, across the partial items, of the question embedding vector O_q (question intermediate vector) and the answer intermediate vectors corresponding to the respective partial items (the correct-answer embedding vectors (O_ac+, O_as+) and the incorrect-answer embedding vectors (O_ac-, O_as-)). Here, the question embedding vector O_q is generated by learning the word sequence bidirectionally, based on the bidirectional vector sequence 24 (question feature vector group), which is generated from the per-word feature vectors of the question sentence in the input information, following the word order in both the forward and reverse directions of the time series. Likewise, the answer intermediate vectors (the correct-answer embedding vectors (O_ac+, O_as+) and the incorrect-answer embedding vectors (O_ac-, O_as-)) are generated by learning the word sequence bidirectionally, based on the answer feature vector groups (the bidirectional vector sequence 25 and the bidirectional vector sequence 26), which are generated from the per-word feature vectors of the partial answer sentences, following the word order in both directions.

Because the information processing device 1 according to the present embodiment generates answer sentences based on a learning result obtained by jointly training the combination of the encoder-decoder model M11 and the sentence-unit learning model M12, it can generate answer sentences in units of characters (words), and it can create a new answer sentence by joining the answer sentences of the partial items, selected so that the connection between the answers of the partial items is optimized. The information processing device 1 can therefore generate, in response to a question, an answer with natural wording and reduced incongruity. That is, it can generate answers that have a natural order of words and sentences and that are more complex than, for example, the short replies of conventional chatbots. The information processing device 1 thus generates text character by character and is applicable to answering long, complex, and diverse non-factoid questions.

In the present embodiment, the sentence-unit learning model M12 uses as partial answer sentences the answer sentences generated based on the intermediate learning result of the encoder-decoder model M11 and the context vector 10 (for example, the correct "conclusion" sentence ac2+, the incorrect "conclusion" sentence ac2-, the correct "supplement" sentence as2+, and the incorrect "supplement" sentence as2-). The sentence-unit learning model M12 learns with a set containing these partial answer sentences and the question sentence as input information.
As a result, the information processing device 1 learns from answer sentences that have been regenerated from the intermediate learning result of the encoder-decoder model M11 and therefore read more naturally, so it can generate answers with natural wording and still less incongruity.

In the present embodiment, the encoder-decoder model M11 learns by decoding the known partial answer sentences based on topic information associated with each word in those sentences. That is, the encoder-decoder model M11 is, for example, a model trained with C-LSTM.
Because the topic information reduces the selection of weakly related characters (words), the information processing device 1 can generate, at the character (word) level, answers with natural wording and still less incongruity.

In the present embodiment, the known answer sentences include correct answer sentences and incorrect answer sentences for the question sentence. The answer generation unit 134 generates the answer sentence based on a learning result obtained by machine learning from learning information that has a plurality of sets of a question sentence and the correct and incorrect answer sentences corresponding to each of the partial items. The encoder-decoder model M11 learns by decoding the correct and incorrect answer sentences for each of the partial items based on the context vector 10. The sentence-unit learning model M12 learns based on combinations, across the partial items, of the question embedding vector O_q (question intermediate vector), the correct-answer embedding vectors (O_ac+, O_as+) (correct-answer intermediate vectors) corresponding to the respective partial items, and the incorrect-answer embedding vectors (O_ac-, O_as-) (incorrect-answer intermediate vectors) corresponding to the respective partial items. Here, the correct-answer embedding vectors (O_ac+, O_as+) are generated by learning the word sequence bidirectionally, based on the bidirectional vector sequence 25 (correct-answer feature vector group), which is generated from the per-word feature vectors of the correct answer sentences, following the word order in both directions. Similarly, the incorrect-answer embedding vectors (O_ac-, O_as-) are generated by learning the word sequence bidirectionally, based on the bidirectional vector sequence 26 (incorrect-answer feature vector group), which is generated from the per-word feature vectors of the incorrect answer sentences, following the word order in both directions.

Because the information processing device 1 generates answer sentences based on a learning result learned from both correct and incorrect answer sentences, it can generate answers with natural wording and still less incongruity.
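One common way to learn from a correct and an incorrect answer together is a margin ranking loss over cosine similarities, as in the QA-LSTM paper by Tan et al. cited as Non-Patent Document 1. The sketch below shows that loss for a single partial item. It is a stand-in under stated assumptions, not equation (4) of the embodiment, which is not reproduced in this section: the function names are hypothetical, the margin of 0.2 is an arbitrary placeholder, and the embedding vectors are plain Python lists.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def margin_ranking_loss(
    o_q: Sequence[float],
    o_pos: Sequence[float],
    o_neg: Sequence[float],
    margin: float = 0.2,  # placeholder value, not taken from the patent
) -> float:
    """Zero once the correct answer beats the incorrect one by at least `margin`."""
    return max(0.0, margin - cosine(o_q, o_pos) + cosine(o_q, o_neg))
```

The loss falls to zero when the question embedding is closer to the correct-answer embedding than to the incorrect-answer embedding by the margin, so only poorly separated pairs contribute to the update.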

In the present embodiment, the learning result is learned by updating the bidirectional vector sequence 25 (correct-answer feature vector group) and the bidirectional vector sequence 26 (incorrect-answer feature vector group) corresponding to a second partial item (for example, the "supplement") that differs from a first partial item (for example, the "conclusion"), based on the correct-answer embedding vectors (O_ac+, O_as+) and the incorrect-answer embedding vectors (O_ac-, O_as-) corresponding to the first partial item. That is, the learning processing unit 132 learns by updating each bidirectional vector (h(t)) of the bidirectional vector sequence 25-2 and the bidirectional vector sequence 26-2 through the attention mechanism of equation (3) described above.
As a result, the information processing device 1 can perform learning that optimizes the relationship between partial items (for example, how they connect), and can therefore combine the partial items to generate a more natural answer sentence.
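The update of the "supplement" bidirectional vectors by the "conclusion" embedding can be illustrated with the attention form used in the QA-LSTM paper cited as Non-Patent Document 1. The sketch is an assumption in several respects: equation (3) is not reproduced in this section, so the exact formula may differ; the roles given to the attention parameters W_sm, W_cm, and w_mb are inferred from their names; and `attend` and `matvec` are hypothetical helpers written in plain Python.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def attend(
    h_s: List[Vector],  # bidirectional vectors h(t) of the "supplement"
    o_c: Vector,        # embedding vector of the "conclusion"
    w_sm: Matrix,
    w_cm: Matrix,
    w_mb: Vector,
) -> List[Vector]:
    """Scale each h(t) by a softmax weight measuring its relevance to o_c."""
    if not h_s:
        return []
    proj_c = matvec(w_cm, o_c)
    scores = []
    for h in h_s:
        # m(t) = tanh(W_sm h(t) + W_cm o_c); score(t) = w_mb . m(t)
        m_t = [math.tanh(a + b) for a, b in zip(matvec(w_sm, h), proj_c)]
        scores.append(sum(w * x for w, x in zip(w_mb, m_t)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [[(e / z) * x for x in h] for e, h in zip(exps, h_s)]
```

Positions of the supplement that score higher against the conclusion keep more of their magnitude, which is one way the connection between the two partial items can influence the supplement embedding.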

The information processing device 1 according to the present embodiment also includes the learning processing unit 132, which performs machine learning based on the learning information and generates the learning result.
The information processing device 1 can therefore train on its own and generate the learning result. It can also, for example, retrain to improve its answers to questions.

In the present embodiment, the loss function Lw is calculated based on, for example, equation (5) above. Because the loss function Lw simultaneously optimizes the character-level (word-level) learning and the combination of the partial items, the information processing device 1 can generate partial items that are suitable for composing an answer sentence.

The learning processing unit 132 may retrain the learning result based on a predetermined condition (for example, periodically, or when an evaluation value such as ROUGE falls to or below a predetermined value).
This allows the information processing device 1 to improve its answers to questions as circumstances change over time.
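The two example conditions for retraining can be expressed as a small predicate. This is a minimal sketch: `should_retrain`, its parameters, and the default floor of 0.3 are hypothetical, and the embodiment does not specify the interval, the threshold, or how the evaluation value is obtained.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_retrain(
    now: datetime,
    last_trained: datetime,
    interval: timedelta,
    rouge_score: Optional[float] = None,
    rouge_floor: float = 0.3,  # placeholder threshold, not from the patent
) -> bool:
    """Return True when either retraining condition holds."""
    # Periodic condition: the retraining interval has elapsed.
    if now - last_trained >= interval:
        return True
    # Quality condition: the latest evaluation value fell to or below the floor.
    return rouge_score is not None and rouge_score <= rouge_floor
```

A scheduler could call this predicate after each evaluation run and start the learning process whenever it returns True.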

The information processing method according to the present embodiment includes a question acquisition step and an answer generation step. In the question acquisition step, the question acquisition unit 133 acquires an input question sentence. In the answer generation step, the answer generation unit 134 generates an answer sentence to the input question sentence acquired in the question acquisition step, based on the learning result described above, which is obtained by machine learning from learning information that has a plurality of sets of a question sentence and the correct and incorrect answer sentences corresponding to each of a plurality of partial items into which an answer sentence is divided according to a predetermined logical flow of the text.
The information processing method therefore provides the same effects as the information processing device 1 described above and can generate, in response to a question, an answer with natural wording and reduced incongruity.

The present invention is not limited to the embodiment described above and may be modified without departing from the spirit of the invention.
For example, although the embodiment describes an example in which the information processing device 1 includes the learning processing unit 132, the invention is not limited to this; the information processing device 1 need not include the learning processing unit 132 as long as it can obtain the learning result.

Although an example has been described in which the information processing device 1 includes the service storage unit 121 and the learning result storage unit 122, either or both of these may instead be provided in, for example, an external server device. Some of the functional units of the control unit 13 may likewise be provided in an external server device.
Although the embodiment describes an example in which the information processing device 1 is a single server device, it may be composed of a plurality of devices.

Although the embodiment describes an example in which the answer sentence is composed of the two partial items "conclusion" and "supplement", the invention is not limited to this and may handle three or more partial items.
Although the embodiment describes an example in which the sentence-unit learning model M12 applies both a technique that provides a QA-LSTM unit 20 for each partial item and a technique based on the attention mechanism, the invention is not limited to this. For example, the sentence-unit learning model M12 may omit some of these techniques.

Although the embodiment describes an example in which the loss function Ls of the sentence-unit learning model M12 is calculated by equation (4) above, the invention is not limited to this, and Ls may instead be calculated by the following equation (6).

Although the learning process of the embodiment describes an example in which the answer sentences regenerated from the intermediate learning result of the encoder-decoder model M11 and the context vector 10 (for example, the conclusion answer sentence ac2 and the supplementary answer sentence as2) are used as the input information of the sentence-unit learning model M12, the invention is not limited to this. The learning processing unit 132 may instead use, as the input information of the sentence-unit learning model M12, the answer sentences that were used for the decoder training before regeneration.

Each component of the information processing device 1 described above contains a computer system. A program for realizing the functions of each component of the information processing device 1 may be recorded on a computer-readable recording medium, and the processing of each component may be performed by having a computer system read and execute the program recorded on that medium. Here, "having a computer system read and execute the program recorded on the recording medium" includes installing the program on the computer system. The "computer system" here includes an OS and hardware such as peripheral devices.
The "computer system" may also include a plurality of computer devices connected via a network that includes communication lines such as the Internet, a WAN, a LAN, or a dedicated line. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The recording medium storing the program may thus be a non-transitory recording medium such as a CD-ROM.

The recording medium also includes an internally or externally provided recording medium that a distribution server can access in order to distribute the program. The program may be divided into a plurality of parts that are downloaded at different times and then combined by the components of the information processing device 1, and the divided parts may be distributed by different distribution servers. The "computer-readable recording medium" further includes media that hold the program for a certain period of time, such as volatile memory (RAM) inside a computer system acting as a server or client when the program is transmitted via a network. The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

Some or all of the functions described above may be realized as an integrated circuit such as an LSI (Large Scale Integration). The functions may each be implemented as a separate processor, or some or all of them may be integrated into a single processor. The circuit integration technique is not limited to LSI and may use a dedicated circuit or a general-purpose processor. If advances in semiconductor technology give rise to an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

1 Information processing device
2 Terminal device
10 Context vector
11 NW communication unit
12 Storage unit
13 Control unit
20, 20-1, 20-2 QA-LSTM unit
21, 21-1, 21-2 Question embedding vector generation unit
22, 22-1, 22-2 Correct-answer embedding vector generation unit
23, 23-1, 23-2 Incorrect-answer embedding vector generation unit
24, 24-1, 24-2, 25, 25-1, 25-2, 26, 26-1, 26-2 Bidirectional vector sequence
30 Loss function generation unit
100 Information processing system
121 Service storage unit
122 Learning result storage unit
131 Service providing unit
132 Learning processing unit
133 Question acquisition unit
134 Answer generation unit
M1 Learning model
M11 Encoder-decoder model
M12 Sentence-unit learning model
M111 Encoder model
M112 Decoder model
NW1 Network

Claims

An information processing device comprising:
a question acquisition unit that acquires an input question sentence; and
an answer generation unit that generates an answer sentence to the input question sentence acquired by the question acquisition unit, based on a learning result obtained by machine learning from learning information having a plurality of pairs of a question sentence and known partial answer sentences corresponding to each of a plurality of partial items into which an answer sentence is divided by a predetermined sentence path,
wherein the learning result is optimized and learned by a loss function calculated by combining:
an encoder-decoder model that sequentially encodes the question sentence word by word, in the order of the words, to generate a context vector and, based on the generated context vector, decodes and learns the known partial answer sentence for each of the plurality of partial items; and
a sentence-unit learning model that takes, as input information, set information including the question sentence and the partial answer sentence for each of the plurality of partial items decoded by the encoder-decoder model, and that learns based on a combination, over the plurality of partial items, of a question intermediate vector and an answer intermediate vector corresponding to each of the plurality of partial items, the question intermediate vector being generated by learning the word sequence in both directions from a question feature vector group generated by arranging feature vectors, obtained by converting the question sentence word by word, in both the forward and reverse order of the words, and the answer intermediate vector being generated by learning the word sequence in both directions from an answer feature vector group generated by arranging feature vectors, obtained by converting the partial answer sentence word by word, in both directions of the word order.

The information processing device according to claim 1, wherein the sentence-unit learning model takes, as the partial answer sentence, an answer sentence generated based on an intermediate learning result of the encoder-decoder model and the context vector, and learns using set information including that partial answer sentence and the question sentence as the input information.

The information processing device according to claim 1 or 2, wherein the encoder-decoder model decodes and learns the known partial answer sentence based on topic information related to each word in the known partial answer sentence.

The information processing device according to any one of claims 1 to 3, wherein
the known answer sentence includes a correct answer sentence and an incorrect answer sentence for the question sentence,
the answer generation unit generates the answer sentence based on the learning result obtained by machine learning from the learning information having a plurality of pairs of the question sentence and the correct answer sentence and incorrect answer sentence corresponding to each of the plurality of partial items,
the encoder-decoder model decodes and learns, based on the context vector, the correct answer sentence and the incorrect answer sentence for each of the plurality of partial items, and
the sentence-unit learning model learns based on a combination, over the plurality of partial items, of the question intermediate vector, a correct answer intermediate vector corresponding to each of the plurality of partial items, and an incorrect answer intermediate vector corresponding to each of the plurality of partial items, the correct answer intermediate vector being generated by learning the word sequence in both directions from a correct answer feature vector group generated by arranging feature vectors, obtained by converting the correct answer sentence word by word, in both directions of the word order, and the incorrect answer intermediate vector being generated by learning the word sequence in both directions from an incorrect answer feature vector group generated by arranging feature vectors, obtained by converting the incorrect answer sentence word by word, in both directions of the word order.

The information processing device according to claim 4, wherein the learning result is learned such that the correct answer feature vector group and the incorrect answer feature vector group corresponding to a second partial item, different from a first partial item among the plurality of partial items, are updated based on the correct answer intermediate vector and the incorrect answer intermediate vector corresponding to the first partial item.

The information processing device according to any one of claims 1 to 5, further comprising a learning processing unit that performs machine learning based on the learning information and generates the learning result.

An information processing method comprising:
a question acquisition step in which a question acquisition unit acquires an input question sentence; and
an answer generation step in which an answer generation unit generates an answer sentence to the input question sentence acquired in the question acquisition step, based on a learning result obtained by machine learning from learning information having a plurality of pairs of a question sentence and known partial answer sentences corresponding to each of a plurality of partial items into which an answer sentence is divided by a predetermined sentence path,
wherein the learning result is optimized and learned by a loss function calculated by combining:
an encoder-decoder model that sequentially encodes the question sentence word by word, in the order of the words, to generate a context vector and, based on the generated context vector, decodes and learns the known partial answer sentence for each of the plurality of partial items; and
a sentence-unit learning model that takes, as input information, set information including the question sentence and the partial answer sentence for each of the plurality of partial items decoded by the encoder-decoder model, and that learns based on a combination, over the plurality of partial items, of a question intermediate vector and an answer intermediate vector corresponding to each of the plurality of partial items, the question intermediate vector being generated by learning the word sequence in both directions from a question feature vector group generated by arranging feature vectors, obtained by converting the question sentence word by word, in both the forward and reverse order of the words, and the answer intermediate vector being generated by learning the word sequence in both directions from an answer feature vector group generated by arranging feature vectors, obtained by converting the partial answer sentence word by word, in both directions of the word order.

A program for causing a computer to execute:
a question acquisition step of acquiring an input question sentence; and
an answer generation step of generating an answer sentence to the input question sentence acquired in the question acquisition step, based on a learning result obtained by machine learning from learning information having a plurality of pairs of a question sentence and known partial answer sentences corresponding to each of a plurality of partial items into which an answer sentence is divided by a predetermined sentence path,
wherein the learning result is optimized and learned by a loss function calculated by combining:
an encoder-decoder model that sequentially encodes the question sentence word by word, in the order of the words, to generate a context vector and, based on the generated context vector, decodes and learns the known partial answer sentence for each of the plurality of partial items; and
a sentence-unit learning model that takes, as input information, set information including the question sentence and the partial answer sentence for each of the plurality of partial items decoded by the encoder-decoder model, and that learns based on a combination, over the plurality of partial items, of a question intermediate vector and an answer intermediate vector corresponding to each of the plurality of partial items, the question intermediate vector being generated by learning the word sequence in both directions from a question feature vector group generated by arranging feature vectors, obtained by converting the question sentence word by word, in both the forward and reverse order of the words, and the answer intermediate vector being generated by learning the word sequence in both directions from an answer feature vector group generated by arranging feature vectors, obtained by converting the partial answer sentence word by word, in both directions of the word order.
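
For illustration only, the combined loss function recited in the claims above can be sketched as follows. The claims do not fix the functional form of the loss, so everything specific here is an assumption: the decoder term is taken to be a negative log-likelihood of the known partial answer sentences, and the sentence-unit term is taken to be a cosine-similarity hinge loss over correct and incorrect answer intermediate vectors, in the style of the QA-LSTM model of Non-Patent Document 1. The function names, the `margin` and `weight` parameters, and the simple weighted sum are hypothetical and are not taken from the specification.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ranking_loss(q_vec, correct_vecs, incorrect_vecs, margin=0.2):
    """Assumed sentence-unit term: hinge loss summed over the partial items.

    correct_vecs[i] and incorrect_vecs[i] stand for the correct and incorrect
    answer intermediate vectors of partial item i; q_vec stands for the
    question intermediate vector.
    """
    total = 0.0
    for pos, neg in zip(correct_vecs, incorrect_vecs):
        total += max(0.0, margin - cosine(q_vec, pos) + cosine(q_vec, neg))
    return total


def decoder_loss(token_probs_per_item):
    """Assumed encoder-decoder term: negative log-likelihood.

    token_probs_per_item[i][t] is the probability the decoder assigned to the
    t-th word of the known partial answer sentence for partial item i.
    """
    return -sum(math.log(p) for probs in token_probs_per_item for p in probs)


def combined_loss(token_probs_per_item, q_vec, correct_vecs, incorrect_vecs,
                  weight=1.0, margin=0.2):
    """Hypothetical combination: weighted sum of the two terms."""
    return decoder_loss(token_probs_per_item) + weight * ranking_loss(
        q_vec, correct_vecs, incorrect_vecs, margin)


if __name__ == "__main__":
    # Toy values for two partial items; the vectors are made up for illustration.
    question = [1.0, 0.0]
    correct = [[1.0, 0.0], [1.0, 0.0]]
    incorrect = [[0.0, 1.0], [1.0, 0.0]]
    print(combined_loss([[0.9, 0.8], [0.7]], question, correct, incorrect))
```

In this sketch a partial item contributes to the ranking term only when its incorrect answer vector is nearly as close to the question vector as its correct answer vector is, so the decoder is penalized both for unlikely words and for partial answers that do not fit the question.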