JP7099254B2

JP7099254B2 - Learning methods, learning programs and learning devices

Info

Publication number: JP7099254B2
Application number: JP2018206012A
Authority: JP
Inventors: 拓哉牧野
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-10-31
Filing date: 2018-10-31
Publication date: 2022-07-12
Anticipated expiration: 2038-10-31
Also published as: JP2020071737A

Description

The present invention relates to a learning method, a learning program, and a learning device.

Machine learning such as neural networks is sometimes used for automatic summarization, which generates summaries from documents such as newspapers, websites, and electronic bulletin boards. For example, summaries are generated with a model in which an RNN (Recurrent Neural Network) encoder that vectorizes an input sentence is connected to an RNN decoder that repeatedly predicts the words of the summary by referring to the vector of the input sentence.

One method of training such a model calculates the loss used for updating the model parameters for each word of the reference summary, which is the correct summary corresponding to the input sentence of a learning sample. For example, during model learning, the RNN decoder takes as input the vector of the input sentence, the correct word of the previous time step, and the number of characters remaining until the RNN decoder outputs the end-of-sentence symbol called EOS, and repeatedly calculates a probability distribution over words at each time step until it outputs EOS. "EOS" here is an abbreviation of "End Of Sentence". The loss is calculated by comparing the probability distribution over words calculated at each time step with the correct word at that time step. For example, the probability distribution calculated at the first time step is compared with the first word of the word sequence in the reference summary, and the probability distribution calculated at the second time step is compared with the second word of the reference summary.
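The per-word comparison described above can be sketched as follows. This is a minimal illustration, assuming the comparison is a negative log-likelihood (cross-entropy) of the correct word at each time step; the description only says the distribution is "compared" with the correct word, so the loss form, the function name `per_word_loss`, and the toy distributions are all hypothetical.

```python
import math

def per_word_loss(step_distributions, reference_words):
    """Sum the negative log-probability of the reference word at each time step."""
    loss = 0.0
    for dist, word in zip(step_distributions, reference_words):
        # A tiny floor avoids log(0) when the reference word has no probability mass.
        loss += -math.log(max(dist.get(word, 0.0), 1e-12))
    return loss

# Hypothetical decoder outputs for a two-word reference summary followed by EOS.
reference = ["AI", "answers", "<EOS>"]
distributions = [
    {"AI": 0.7, "answers": 0.2, "<EOS>": 0.1},    # time step 1
    {"AI": 0.1, "answers": 0.8, "<EOS>": 0.1},    # time step 2
    {"AI": 0.05, "answers": 0.05, "<EOS>": 0.9},  # time step 3
]
loss = per_word_loss(distributions, reference)
```

The loss approaches zero only when every time step puts nearly all of its probability on the reference word at that same position.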

When the above model learning is performed, the limit on the number of words in the summary is satisfied to some extent. However, if the summary output by the RNN decoder and the correct reference summary differ in word order, the evaluation incurs a loss even when the two have the same meaning.

For this reason, an index called ROUGE is sometimes used to evaluate automatically generated summaries. "ROUGE" here refers to an index representing the degree of overlap of word N-grams between the correct reference summary and the summary output by a summary generation system in which the model is incorporated. A technique called MRT (Minimum Risk Training), which tunes the parameters of the RNN encoder and RNN decoder model based on ROUGE, has also been proposed.
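As a rough illustration of an N-gram overlap index, the sketch below computes a simplified recall-style ROUGE-N. ROUGE has several variants and this description does not say which one is used, so the formula, the function names `ngrams` and `rouge_n`, and the English example sentences are assumptions.

```python
from collections import Counter

def ngrams(words, n):
    """Return the list of n-grams (as tuples) in a word sequence."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def rouge_n(system_words, reference_words, n=1):
    """Matched n-grams divided by the number of n-grams in the reference."""
    ref_counts = Counter(ngrams(reference_words, n))
    sys_counts = Counter(ngrams(system_words, n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each match by the number of times the n-gram occurs in the system summary.
    overlap = sum(min(count, sys_counts[gram]) for gram, count in ref_counts.items())
    return overlap / total

reference = "the call center adopts AI".split()
reordered = "AI adopts the call center".split()
unigram_score = rouge_n(reordered, reference, n=1)
bigram_score = rouge_n(reordered, reference, n=2)
```

With unigrams, the reordered sentence scores as high as an exact copy of the reference, which is the insensitivity to word order that the description goes on to discuss.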

Japanese Unexamined Patent Publication No. 2016-62181
Japanese Unexamined Patent Publication No. 2013-167985
Japanese Unexamined Patent Publication No. 2015-170224
Japanese Unexamined Patent Publication No. 2014-123219

Ayana, Shiqi Shen, Yu Zhao, Zhiyuan Liu, Maosong Sun, "Neural Headline Generation with Sentence-wise Optimization", submitted on 7 Apr 2016

However, with the above technique, every summary whose word order differs from the correct reference summary is rated highly, so a model that generates summaries with low readability may be learned.

That is, in the above MRT, a high ROUGE value is calculated for a summary whose word order differs from the correct reference summary as long as its degree of word overlap is high. Summaries with high ROUGE values may include summaries that contain ungrammatical expressions because their word order has been changed relative to the correct reference summary. Updating the model parameters based on summaries with such ungrammatical expressions is one reason why a model that generates summaries with low readability may be learned.

In one aspect, an object of the present invention is to provide a learning method, a learning program, and a learning device that can suppress the learning of a model that generates summaries with low readability.

In one aspect, in a learning method for machine learning of a model that generates a summary from an input sentence, a computer executes a process of: acquiring an input sentence and a correct summary; generating a pseudo-sentence in which an ungrammatical expression is artificially reproduced by reordering the words contained in the correct summary; and updating the parameters of the model based on the generation probability of the pseudo-sentence, that is, the probability that the model generates the pseudo-sentence from the input sentence, and the generation probability of the correct summary, that is, the probability that the model generates the correct summary from the input sentence.
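The pseudo-sentence generation step can be sketched as below. The passage above only states that the word order of the correct summary is changed; swapping two randomly chosen positions is an assumed strategy for illustration, and `make_pseudo_sentence` and the example words are hypothetical. A random swap does not guarantee an ungrammatical result.

```python
import random

def make_pseudo_sentence(reference_words, rng):
    """Return a copy of the reference summary with two distinct positions swapped."""
    words = list(reference_words)
    if len(words) < 2:
        return words  # nothing to reorder
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return words

rng = random.Random(0)  # fixed seed so the sketch is reproducible
reference = ["the", "call", "center", "adopts", "AI"]
pseudo = make_pseudo_sentence(reference, rng)
```

The pseudo-sentence contains exactly the words of the reference, so a pure word-overlap score cannot tell the two apart; only the order differs.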

It is possible to suppress the learning of a model that generates summaries with low readability.

FIG. 1 is a block diagram showing the functional configuration of the learning device according to the first embodiment.
FIG. 2 is a diagram showing an example of a use case of the article summarization tool.
FIG. 3 is a diagram showing an example of an input sentence.
FIG. 4A is a diagram showing an example of a reference summary.
FIG. 4B is a diagram showing an example of a system summary.
FIG. 4C is a diagram showing an example of a system summary.
FIG. 5 is a diagram showing an example of the processing content of MRT.
FIG. 6 is a diagram showing an example of generation probabilities and ROUGE values.
FIG. 7A is a diagram showing an example of a reference summary.
FIG. 7B is a diagram showing an example of a system summary.
FIG. 7C is a diagram showing an example of a system summary.
FIG. 7D is a diagram showing an example of a system summary.
FIG. 8 is a diagram showing an example of a method of updating the model parameters.
FIG. 9 is a diagram showing an example of the first model learning.
FIG. 10 is a diagram showing an example of the first model learning.
FIG. 11 is a diagram showing an example of the first model learning.
FIG. 12 is a diagram showing an example of input to and output from the model in the first system.
FIG. 13 is a diagram showing an example of a method of calculating the degree of overlap.
FIG. 14 is a diagram showing an example of a method of calculating the degree of overlap with an error.
FIG. 15 is a diagram showing an example of a method of calculating the degree of overlap with an error.
FIG. 16 is a diagram showing an example of input to and output from the model in the second system.
FIG. 17 is a flowchart showing the procedure of the learning process according to the first embodiment.
FIG. 18 is a flowchart showing the procedure of the first loss calculation process according to the first embodiment.
FIG. 19 is a flowchart showing the procedure of the second loss calculation process according to the first embodiment.
FIG. 20 is a diagram showing a hardware configuration example of a computer that executes the learning program according to the first and second embodiments.

The learning method, learning program, and learning device according to the present application are described below with reference to the attached drawings. The embodiments do not limit the disclosed technique, and the embodiments can be combined as appropriate as long as their processing contents do not contradict each other.

FIG. 1 is a block diagram showing the functional configuration of the learning device according to the first embodiment. The learning device 1 shown in FIG. 1 provides a learning service that trains a model which accepts various articles, such as those of newspapers, electronic bulletin boards, and websites, as input sentences and generates their summaries.

As one embodiment, the learning device 1 can be implemented by installing, on any computer, a learning program that realizes the above learning service as packaged software or online software. Executing the learning program causes the computer to function as the learning device 1. The computer here may be any information processing device: desktop or notebook personal computers and workstations, as well as mobile communication terminals such as smartphones and mobile phones, tablet terminals, and wearable terminals. The learning device 1 may also be implemented as a server device that treats a terminal device used by a user as a client and provides the learning service to that client. In this case, the learning device 1 accepts a model learning request whose input is either learning data containing a plurality of learning samples or identification information with which the learning data can be retrieved via a network or a storage medium. The learning device 1 is then implemented as a server device providing a learning service that outputs the result of executing model learning on the learning data received in the request. In this case, the learning device 1 may be implemented on-premises as a server that provides the learning service, or as a cloud that provides the learning service through outsourcing.

[Example of a use case of the trained model]
A model trained by the above learning service can be implemented as an article summarization tool that accepts the original text of an article, such as a newspaper article, an electronic bulletin board item, or a website article, as an input sentence and generates its summary.

As just one aspect, the article summarization tool can be incorporated as a function of an application whose users are media operators that run various media such as newspapers, electronic bulletin boards, and websites.

In this case, the application may be implemented as stand-alone software executed on a terminal device used by a person associated with the media operator, such as an editor. Alternatively, among the functions provided by the application, front-end functions such as entering the original text and displaying the summary may be provided on the terminal devices of reporters, editors, and the like, while back-end functions such as generating the summary are provided as a Web service.

FIG. 2 is a diagram showing an example of a use case of the article summarization tool. FIG. 2 shows an example of the transitions of an article summary screen 20 displayed on a terminal device used by a person associated with the media operator.

The upper part of FIG. 2 shows the article summary screen 20 in its initial state, in which no input has been set for any item. For example, the article summary screen 20 includes GUI (Graphical User Interface) components such as an original text input area 21, a summary display area 22, a pull-down menu 23, a summary button 24, and a clear button 25. The original text input area 21 is an area for entering the original text of an article or the like. The summary display area 22 is an area for displaying the summary corresponding to the original text entered in the original text input area 21. The pull-down menu 23 is an example of a GUI component for specifying the maximum number of characters of the summary. The summary button 24 is an example of a GUI component that accepts execution of a command to generate the summary corresponding to the original text entered in the original text input area 21. The clear button 25 is an example of a GUI component that clears the original text entered in the original text input area 21.

As shown in FIG. 2, the original text input area 21 of the article summary screen 20 can accept text input via an input device such as a keyboard (not shown). Besides accepting text via an input device, the original text input area 21 can import text from a document file created by an application such as word processing software.

When original text is entered in the original text input area 21 in this way, the article summary screen 20 transitions from the state shown in the upper part of FIG. 2 to the state shown in the middle part of FIG. 2 (step S1). For example, once original text has been entered in the original text input area 21, execution of the command to generate a summary can be accepted through an operation on the summary button 24. The text entered in the original text input area 21 can also be cleared through an operation on the clear button 25. In addition, the pull-down menu 23 can accept the maximum number of characters desired by the person associated with the media operator, chosen from among a plurality of maximum character counts. Here, as an example of generating a breaking-news item for an electronic bulletin board as a summary of the original text of a newspaper or news article, 80 characters is specified, corresponding to an example of the maximum number of characters that can be displayed on an electronic bulletin board. This is only an example; when a headline is generated from a newspaper or website article, a maximum number of characters suited to the headline can be selected.

When the summary button 24 is operated with original text entered in the original text input area 21, the article summary screen 20 transitions from the state shown in the middle part of FIG. 2 to the state shown in the lower part of FIG. 2 (step S2). In this case, the original text entered in the original text input area 21 is fed to the trained model as the input sentence, and its summary is generated. The summary may be generated on the terminal device of the person associated with the media operator or on a back-end server device. As a result, as shown in the lower part of FIG. 2, the summary generated by the trained model is displayed in the summary display area 22 of the article summary screen 20.

The text of the summary displayed in the summary display area 22 of the article summary screen 20 can be edited via an input device or the like (not shown).

Providing such an article summarization tool makes it possible to reduce the article summarization work performed by reporters, editors, and the like. Among the processes of delivering news to media, such as "selection of articles to deliver", "transmission to the media editing system", "article summarization", "headline creation", and "proofreading", article summarization is the most labor-intensive. For example, when an article is summarized manually, important information must be selected from the whole article and the text must be restructured. Automating or semi-automating article summarization is therefore of high technical significance.

Although a use case in which the article summarization tool is used by people associated with a media operator has been given here purely as an example, the tool may also be used by viewers who receive articles from the media operator. For example, the article summarization tool can be used as a function with which a smart speaker or the like reads out a summary instead of the full text of an article.

[One aspect of the problem in RNN model learning]
As explained in the background section above, when the loss used for updating the model parameters is calculated for each word of the correct reference summary corresponding to the input sentence of a learning sample, a summary whose word order differs from the reference summary but whose meaning is similar may be undervalued.

An example of such a failure in model learning is described with reference to FIG. 3 and FIGS. 4A to 4C. FIG. 3 is a diagram showing an example of an input sentence. FIG. 4A is a diagram showing an example of a reference summary. FIGS. 4B and 4C are diagrams showing examples of system summaries. In the following, the correct summary included in a learning sample may be referred to as the "reference summary", and a summary that the model generates from the input sentence may be referred to as a "system summary".

Here, as an example, consider a case in which the pair of the input sentence 30 shown in FIG. 3 and the reference summary 40 shown in FIG. 4A is input as a learning sample during model learning. When a model in which an RNN (Recurrent Neural Network) encoder and an RNN decoder are connected generates the system summary 40B shown in FIG. 4B or the system summary 40C shown in FIG. 4C from the input sentence 30, the evaluation proceeds as follows.

Between the reference summary 40 shown in FIG. 4A and the system summary 40B shown in FIG. 4B, the words match at every position from the beginning to the end. As an example, FIGS. 4A and 4B show in bold the fifth word from the beginning of the reference summary 40 and of the system summary 40B. For example, when the fifth word of the system summary 40B is predicted, as shown in FIG. 4B, the word "AI" has the highest probability in the probability distribution over the words of the input sentence 30 output by the RNN decoder. The fifth word of the reference summary 40 is also "AI", as shown in FIG. 4A. When, for every word in the reference summary 40, the word of the system summary 40B at the corresponding position matches in this way, the loss is "0".

On the other hand, the reference summary 40 shown in FIG. 4A and the system summary 40C shown in FIG. 4C have the same meaning, but the order of the first eight words differs between them. As an example, FIGS. 4A and 4C show in bold the fifth word from the beginning of the reference summary 40 and of the system summary 40C. For example, when the fifth word of the system summary 40C is predicted, as shown in FIG. 4C, the word "call center" has the highest probability in the probability distribution over the words of the input sentence 30 output by the RNN decoder. The fifth word of the reference summary 40, however, is "AI", as shown in FIG. 4A. When the arrangement of words differs because the word order has been changed between the reference summary 40 and the system summary 40C, a loss arises even though the system summary 40C has the same meaning as the reference summary 40.

As a result, the system summary 40B and the system summary 40C are evaluated differently, even though they have the same meaning. From the standpoint of summarization, the evaluation is appropriate only if both receive the same rating, so the system summary 40C is undervalued compared with the system summary 40B.
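The contrast between the two system summaries can be reproduced with a toy position-by-position comparison. The sentences below are hypothetical English stand-ins, not the text of FIGS. 4A to 4C, and counting mismatched positions is a simplification of the per-word loss assumed only for this sketch.

```python
def positionwise_mismatches(system_words, reference_words):
    """Count positions at which the system word differs from the reference word."""
    length = max(len(system_words), len(reference_words))
    # Pad the shorter sequence so that missing words count as mismatches.
    padded_sys = system_words + [None] * (length - len(system_words))
    padded_ref = reference_words + [None] * (length - len(reference_words))
    return sum(1 for s, r in zip(padded_sys, padded_ref) if s != r)

reference = ["call", "center", "inquiries", "answered", "by", "AI"]
same_order = list(reference)
reordered = ["by", "AI", "call", "center", "inquiries", "answered"]

mismatches_same = positionwise_mismatches(same_order, reference)
mismatches_reordered = positionwise_mismatches(reordered, reference)
```

The reordered sentence uses exactly the same words as the reference, yet every position counts as an error, which mirrors the undervaluation of a same-meaning summary described above.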

[Current MRT]
To keep system summaries whose word order differs from the reference summary from being undervalued during model learning, a technique called MRT (Minimum Risk Training) has been proposed. For example, MRT tunes the parameters of the RNN encoder and RNN decoder model based on ROUGE, which represents the degree of overlap of word N-grams between the correct reference summary and the system summary.

FIG. 5 is a diagram showing an example of the processing content of MRT. As shown in FIG. 5, a pair of an input sentence x and a correct reference summary y is used as a learning sample for model learning of the RNN encoder and RNN decoder. Of the two, the input sentence x is input to the model.

When the input sentence x is input in this way, a plurality of system summaries y'_1 to y'_3 are sampled according to the probability distribution over words that the RNN decoder of the model with parameters θ outputs at each time step from the beginning to EOS (End of Sentence).

For example, at each time step from the beginning to EOS, a probability is calculated for each word registered in the model's dictionary, that is, each word that appears in the input sentences across the entire learning data containing the plurality of learning samples. The above sampling can be realized by drawing a word at each time step according to the probability distribution over words obtained for that time step. For convenience of explanation, an example in which three system summaries y'_1 to y'_3 are sampled is given here, but any number of system summaries y' may be sampled.
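The sampling described above can be sketched as drawing one word per time step from that step's distribution. A real decoder would recompute each distribution from the words already drawn; the fixed per-step distributions, the function name `sample_summary`, and the vocabulary are simplifying assumptions for illustration.

```python
import random

def sample_summary(step_distributions, rng, eos="<EOS>"):
    """Draw one word per time step from its distribution, stopping at EOS."""
    summary = []
    for dist in step_distributions:
        words = list(dist.keys())
        weights = list(dist.values())
        word = rng.choices(words, weights=weights, k=1)[0]
        if word == eos:
            break
        summary.append(word)
    return summary

rng = random.Random(0)  # fixed seed so the sketch is reproducible
distributions = [
    {"AI": 0.5, "call": 0.3, "center": 0.2},
    {"answers": 0.6, "call": 0.2, "<EOS>": 0.2},
    {"center": 0.5, "<EOS>": 0.5},
    {"<EOS>": 1.0},
]
# Three sampled candidates, mirroring y'_1 to y'_3 in the text.
samples = [sample_summary(distributions, rng) for _ in range(3)]
```

Because words are drawn rather than chosen greedily, repeated calls can yield different candidates for the same input, which is what lets MRT compare several system summaries.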

Then, for each of the system summaries y'_1 to y'_3, MRT calculates the generation probability with which that system summary y' is generated from the input sentence x, and the ROUGE value representing the degree of overlap of word n-grams between the reference summary y and that system summary y'. MRT then calculates the loss L_MRT(θ) from the generation probabilities and ROUGE values of the system summaries y'_1 to y'_3 according to equation (1).

In equation (1), "P(y'|x; θ)" denotes the probability that the system summary y' is generated from the input sentence x when the model parameters are θ. "D" denotes the learning data, which is the set of learning samples each containing an input sentence x and a reference summary y. "S" denotes the set of system summaries generated from the input sentence x when the model parameters are θ. "Δ(y', y)" denotes the degree of word overlap calculated between the system summary y' and the reference summary y; here, as an example, a negative gain is calculated as the ROUGE value by using a function such as ROUGE.
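Equation (1) itself is not reproduced in this text. A reconstruction consistent with the symbol definitions above and with the worked values given for FIG. 6 in this description is sketched below. The exact form in the original publication, for example any normalization over S, is an assumption.

```latex
L_{\mathrm{MRT}}(\theta) = \sum_{(x,\,y) \in D} \; \sum_{y' \in S} P(y' \mid x; \theta)\, \Delta(y', y) \tag{1}
```

For a single learning sample with three sampled summaries, this reduces to the sum of three products of a generation probability and a negative ROUGE gain.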

MRT then updates the model parameters θ based on the loss L_MRT(θ). For example, MRT obtains the gradient ∂L_MRT(θ)/∂θ_i by partially differentiating L_MRT(θ) with respect to θ_i, and updates the model parameter θ_i, that is, computes θ_i ← θ + (∂L_MRT(θ)/∂θ_i).

Updating the model parameters θ_i based on the loss L_MRT(θ) in this way realizes model learning that raises the generation probability of system summaries with high ROUGE values and lowers the generation probability of system summaries with low ROUGE values.
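The effect of such an update can be illustrated with a toy model whose only parameters are three logits, one per candidate summary. The step below subtracts a scaled gradient, the usual gradient-descent convention for minimizing a loss. The softmax parameterization, the learning rate, the three probabilities, and the negative ROUGE gains are all assumptions for illustration, not details of the model in this description.

```python
import math

def softmax(logits):
    """Convert logits to probabilities, shifted by the maximum for stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_risk(probs, deltas):
    """Sum of generation probability times negative ROUGE gain."""
    return sum(p * d for p, d in zip(probs, deltas))

def descent_step(logits, deltas, learning_rate=1.0):
    """One gradient-descent step on the expected risk with respect to the logits."""
    probs = softmax(logits)
    risk = expected_risk(probs, deltas)
    # For a softmax parameterization, d(risk)/d(logit_j) = p_j * (delta_j - risk).
    grads = [p * (d - risk) for p, d in zip(probs, deltas)]
    return [z - learning_rate * g for z, g in zip(logits, grads)]

deltas = [-0.3, -0.1, -0.6]  # illustrative negative ROUGE gains
logits = [math.log(p) for p in (0.2, 0.6, 0.2)]  # illustrative starting probabilities
before = softmax(logits)
after = softmax(descent_step(logits, deltas))
```

After the step, probability moves toward the third candidate, which has the largest ROUGE gain, and away from the second, which has the smallest.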

The change in the loss L_MRT(θ) before and after a parameter update using the ROUGE values is described with reference to FIG. 6. FIG. 6 is a diagram showing an example of generation probabilities and ROUGE values. The upper table of FIG. 6 shows, for the t-th round of model learning, the generation probability with which the model with parameters θ_t generates each system summary y' from the input sentence x, and the ROUGE value between the reference summary and that system summary y'. In the tables of FIG. 6, the lightly hatched cells are calculated by the expression for the generation probability of the system summary y' in equation (1), and the darkly hatched cells are calculated by the ROUGE function in equation (1).

例えば、パラメータθ_ｔを持つモデルが入力文ｘから生成するシステム要約ｙ′_１～ｙ′_３の生成確率およびＲＯＵＧＥ値が図６の上段の表に示す値であるとしたとき、Ｌ_ＭＲＴ（θ_ｔ）は、次のように算出することができる。すなわち、損失Ｌ_ＭＲＴ（θ_ｔ）は、システム要約ｙ′_１の生成確率及びのＲＯＵＧＥ値と、システム要約ｙ′_２の生成確率及びのＲＯＵＧＥ値と、システム要約ｙ′_３の生成確率及びのＲＯＵＧＥ値との総和から求めることができる。つまり、損失Ｌ_ＭＲＴ（θ_ｔ）は、０．２×（－０．３）＋０．６×（－０．１）＋０．２×（－０．６）の計算により、－０．２４と算出される。 For example, suppose the generation probabilities and ROUGE values of the system summaries y′_1 to y′_3 that the model with the parameter θ_t generates from the input sentence x are the values shown in the upper table of FIG. 6. Then L_MRT(θ_t) can be calculated as follows. The loss L_MRT(θ_t) is obtained by summing, over the system summaries y′_1, y′_2 and y′_3, the product of each summary's generation probability and its ROUGE value. That is, the loss L_MRT(θ_t) is calculated as 0.2 × (−0.3) + 0.6 × (−0.1) + 0.2 × (−0.6) = −0.24.

このような損失Ｌ_ＭＲＴ（θ_ｔ）に基づいてパラメータがθ_ｔからθ_ｔ＋１へ更新されたモデルが入力文ｘから生成するシステム要約ｙ′_１～ｙ′_３の生成確率およびＲＯＵＧＥ値が図６の下段の表の通りであるとする。 Suppose that, for the model whose parameters have been updated from θ_t to θ_{t+1} based on this loss L_MRT(θ_t), the generation probabilities and ROUGE values of the system summaries y′_1 to y′_3 generated from the input sentence x are as shown in the lower table of FIG. 6.

その一方で、図６に示す下段の表には、ｔ＋１ラウンド目のモデル学習においてパラメータθ_ｔ＋１を持つモデルが入力文ｘからシステム要約ｙ′を生成する生成確率と、参照要約およびシステム要約ｙ′の間のＲＯＵＧＥ値とが示されている。この場合にも、損失Ｌ_ＭＲＴ（θ_ｔ＋１）は、システム要約ｙ′_１の生成確率及びのＲＯＵＧＥ値と、システム要約ｙ′_２の生成確率及びのＲＯＵＧＥ値と、システム要約ｙ′_３の生成確率及びのＲＯＵＧＥ値との総和から求めることができる。つまり、損失Ｌ_ＭＲＴ（θ_ｔ＋１）は、０．３×（－０．３）＋０．１×（－０．１）＋０．６×（－０．６）の計算により、－０．４６と算出される。 Meanwhile, the lower table of FIG. 6 shows the generation probability with which the model having the parameter θ_{t+1} in the (t+1)-th round of model learning generates a system summary y′ from the input sentence x, and the ROUGE value between the reference summary and the system summary y′. In this case as well, the loss L_MRT(θ_{t+1}) is obtained by summing, over the system summaries y′_1, y′_2 and y′_3, the product of each summary's generation probability and its ROUGE value. That is, the loss L_MRT(θ_{t+1}) is calculated as 0.3 × (−0.3) + 0.1 × (−0.1) + 0.6 × (−0.6) = −0.46.

このようにモデルのパラメータがθ_ｔからθ_ｔ＋１へ更新されることにより、ｔラウンド目の損失Ｌ_ＭＲＴ（θ_ｔ）よりもｔ＋１ラウンド目の損失Ｌ_ＭＲＴ（θ_ｔ＋１）を減少させるモデル学習が実現されていることがわかる。 It can be seen that updating the model parameters from θ_t to θ_{t+1} in this way realizes model learning in which the loss L_MRT(θ_{t+1}) in round t+1 (−0.46) is lower than the loss L_MRT(θ_t) in round t (−0.24).
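The arithmetic for the two rounds can be reproduced with a short sketch. The function name `mrt_loss` and the list layout are illustrative assumptions; the numbers are the generation probabilities and signed ROUGE terms quoted in the text for FIG. 6.

```python
def mrt_loss(candidates):
    """Sum of (generation probability x signed ROUGE term) over sampled summaries.

    Each candidate is a (probability, signed_rouge_term) pair, with the
    signed term written exactly as it appears in the worked calculation.
    """
    return sum(prob * term for prob, term in candidates)


# Values quoted in the text for the upper (round t) and lower (round t+1) tables.
round_t = [(0.2, -0.3), (0.6, -0.1), (0.2, -0.6)]
round_t_plus_1 = [(0.3, -0.3), (0.1, -0.1), (0.6, -0.6)]

loss_t = mrt_loss(round_t)
loss_t_plus_1 = mrt_loss(round_t_plus_1)

# Rounded to two decimals to hide floating-point noise.
print(round(loss_t, 2), round(loss_t_plus_1, 2))
```

The second value is smaller than the first, which is the decrease in loss described above.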

［現状のＭＲＴの課題の一側面］ [One aspect of the problems with current MRT]
しかしながら、上記の背景技術の欄で説明した通り、ＭＲＴのように、語順の違いを不問とし、単語の重複度によりモデルのパラメータを更新する場合、正解の参照要約と語順が異なる全てのシステム要約のＲＯＵＧＥ値が高評価を受ける。それ故、正解の参照要約との間で語順が異なるシステム要約の中に非文法的な表現が含まれる場合でも、システム要約の損失を過小評価してモデルのパラメータが学習される。この結果、可読性が低いシステム要約を生成するモデルが学習されてしまうことがある。 However, as explained in the background art section above, when differences in word order are ignored and the model parameters are updated based on the degree of word overlap, as in MRT, every system summary whose word order differs from that of the correct reference summary still receives a high ROUGE value. Therefore, even when a system summary whose word order differs from the correct reference summary contains non-grammatical expressions, the model parameters are learned with the loss of that system summary underestimated. As a result, a model that generates system summaries with low readability may be learned.

このようなモデル学習の失敗事例を図７Ａ～図７Ｄを用いて説明する。図７Ａは、参照要約の一例を示す図である。図７Ｂ～図７Ｄは、システム要約の一例を示す図である。ここでは、一例として、モデルの学習の際に、図３に示す入力文３０及び図７Ａに示す参照要約７０のペアが学習サンプルとして入力される事例を例に挙げる。このとき、ＲＮＮエンコーダ及びＲＮＮデコーダが接続されたモデルによって入力文３０から図７Ｂ～図７Ｄに示すＲＯＵＧＥ値が同一であるシステム要約７０Ｂ～７０Ｄが生成される場合、次のような評価が行われる。 Such a failure case of model learning will be described with reference to FIGS. 7A to 7D. FIG. 7A is a diagram showing an example of a reference summary. FIGS. 7B to 7D are diagrams showing examples of system summaries. Here, as an example, consider a case in which the pair of the input sentence 30 shown in FIG. 3 and the reference summary 70 shown in FIG. 7A is input as a training sample when training the model. If the model in which the RNN encoder and the RNN decoder are connected generates, from the input sentence 30, the system summaries 70B to 70D shown in FIGS. 7B to 7D, all of which have the same ROUGE value, the following evaluation is performed.

すなわち、図７Ａに示す参照要約７０及び図７Ｂに示すシステム要約７０Ｂの間では、語順が一致し、かつ単語の集合も一致する。このように参照要約７０及びシステム要約７０Ｂの間で単語の集合が一致するので、損失は「０」となる。また、図７Ａに示す参照要約７０及び図７Ｃに示すシステム要約７０Ｃの間では、語順は異なるが、単語の集合が一致する。このように参照要約７０及びシステム要約７０Ｃの間で単語の集合が一致するので、損失は「０」となる。また、図７Ａに示す参照要約７０及び図７Ｄに示すシステム要約７０Ｄの間でも、語順は異なるが、単語の集合が一致する。このように参照要約７０及びシステム要約７０Ｄの間で単語の集合が一致するので、損失は「０」となる。このように、ＲＯＵＧＥ値が同一であるシステム要約７０Ｂ～システム要約７０Ｄの間では、同一の評価がなされることになる。 That is, the word order and the set of words match between the reference summary 70 shown in FIG. 7A and the system summary 70B shown in FIG. 7B. Since the set of words matches between the reference summary 70 and the system summary 70B in this way, the loss is "0". Further, the word order is different between the reference summary 70 shown in FIG. 7A and the system summary 70C shown in FIG. 7C, but the set of words is the same. Since the set of words matches between the reference summary 70 and the system summary 70C in this way, the loss is "0". Further, the word order is different between the reference summary 70 shown in FIG. 7A and the system summary 70D shown in FIG. 7D, but the set of words is the same. Since the set of words matches between the reference summary 70 and the system summary 70D in this way, the loss is "0". In this way, the same evaluation is made between the system summary 70B and the system summary 70D having the same ROUGE value.
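The order-insensitivity described here can be illustrated with a minimal sketch, assuming a simplified unigram overlap (ROUGE-1 recall) in place of the full ROUGE computation. The English tokens are hypothetical stand-ins and are not the sentences of FIGS. 7A to 7D.

```python
from collections import Counter


def rouge1_recall(reference, candidate):
    """Fraction of reference words that also occur in the candidate.

    Only word counts are compared, so word order has no influence.
    """
    ref_counts = Counter(reference)
    cand_counts = Counter(candidate)
    overlap = sum((ref_counts & cand_counts).values())
    return overlap / sum(ref_counts.values())


# Hypothetical token lists: same words, three different orders.
reference = ["operators", "answer", "questions", "by", "chat"]
same_order = list(reference)
reordered = ["by", "chat", "operators", "answer", "questions"]
scrambled = ["chat", "by", "questions", "operators", "answer"]

scores = [rouge1_recall(reference, c) for c in (same_order, reordered, scrambled)]
print(scores)
```

All three candidates receive the same score because the metric only sees the bag of words, which is why a scrambled and unreadable candidate is not distinguished from a fluent one.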

しかしながら、システム要約７０Ｄには、システム要約７０Ｂやシステム要約７０Ｃでは見られない非文法的な表現が含まれる。例えば、システム要約７０Ｂやシステム要約７０Ｃに示された「・・・チャットで・・・」のように、「チャット」には格助詞の「で」が用いられるのが正しい用法である。それにもかかわらず、システム要約７０Ｄに示された「・・・チャットが・・・」では、「チャット」に格助詞の「が」が用いられており、文法的に誤りがある。さらに、文法的な誤りが一因となって、システム要約７０Ｄでは、「チャットが」の修飾部が「自動応答する」の被修飾部を修飾する誤った係り受けとなっている。 However, system summary 70D contains a non-grammatical expression that is not found in system summary 70B or system summary 70C. For example, the correct usage is for "chat" to take the case particle "de", as in "... in chat ..." shown in system summaries 70B and 70C. Nevertheless, in "... chat is ..." shown in system summary 70D, "chat" takes the case particle "ga", which is grammatically incorrect. Furthermore, partly because of this grammatical error, system summary 70D contains an erroneous dependency in which the modifier "chat ga" modifies the modified phrase "automatically respond".

このように、現状のＭＲＴでは、ＲＯＵＧＥ値が同一のレベルであれば、非文法的な表現が含まれないシステム要約７０Ｂやシステム要約７０Ｃと、非文法的な表現や誤った係り受けが含まれるシステム要約７０Ｄとの間で同一の評価がなされることになる。すなわち、モデル学習時にシステム要約の中に非文法的な表現を含むシステム要約７０Ｄが含まれる場合、システム要約７０ＤのＲＯＵＧＥ値の負の利得がシステム要約７０Ｂやシステム要約７０ＣのＲＯＵＧＥ値の負の利得と同程度に作用する。このように、非文法的な表現を含むシステム要約７０ＤのＲＯＵＧＥ値の負の利得が過剰に作用する損失に基づいてモデルが更新される結果、可読性が低い要約文を生成するモデルが学習されてしまう場合がある。 Thus, in current MRT, as long as the ROUGE values are at the same level, system summaries 70B and 70C, which contain no non-grammatical expressions, receive the same evaluation as system summary 70D, which contains a non-grammatical expression and an erroneous dependency. That is, when the system summaries used in model learning include system summary 70D, the negative gain from the ROUGE value of system summary 70D acts to the same degree as the negative gain from the ROUGE values of system summaries 70B and 70C. Because the model is updated based on a loss in which the negative gain from the ROUGE value of system summary 70D acts excessively, a model that generates summaries with low readability may be learned.

［課題解決のアプローチの一側面］ [One aspect of the problem-solving approach]
そこで、本実施例に係る学習装置１は、正解の参照要約に含まれる単語の語順を入れ替えて非文法的な表現が擬似的に再現された擬似文を生成し、モデルが擬似文を生成する確率よりもモデルが参照要約を生成する確率が高くなるようにモデルのパラメータを更新する。 Therefore, the learning device 1 according to the present embodiment generates pseudo-sentences in which non-grammatical expressions are simulated by rearranging the order of the words included in the correct reference summary, and updates the model parameters so that the probability of the model generating the reference summary becomes higher than the probability of the model generating a pseudo-sentence.

図８は、モデルのパラメータの更新方法の一例を示す図である。図８に示すように、ＲＮＮエンコーダ及びＲＮＮデコーダのモデル学習には、図５に示されたＭＲＴと同様、入力文ｘおよび正解の参照要約ｙのペアが学習サンプルとして用いられる。 FIG. 8 is a diagram showing an example of how to update the parameters of the model. As shown in FIG. 8, in the model learning of the RNN encoder and the RNN decoder, a pair of the input sentence x and the reference summary y of the correct answer is used as a learning sample as in the MRT shown in FIG.

これら入力文ｘおよび正解の参照要約ｙのうち入力文ｘがモデルへ入力される。このように入力文ｘが入力された場合、学習装置１は、パラメータθを持つモデルのＲＮＮデコーダが先頭からＥＯＳまでの各時刻に出力する単語の確率分布に従って複数のシステム要約ｙ′_１～ｙ′_３をサンプリングする。 Of the input sentence x and the correct reference summary y, the input sentence x is input to the model. When the input sentence x is input in this way, the learning device 1 samples a plurality of system summaries y′_1 to y′_3 according to the word probability distributions that the RNN decoder of the model with the parameter θ outputs at each time step from the beginning up to the EOS.
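The sampling step can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_distribution` is a hypothetical stand-in for the RNN decoder, the vocabulary is invented, and the generation probability of a sample is taken to be the product of the per-step word probabilities, which is the usual convention but is not spelled out in the text.

```python
import random

EOS = "<EOS>"


def sample_summary(step_distribution, max_len, rng):
    """Draw one word per time step until EOS is drawn or max_len is reached."""
    words = []
    prob = 1.0
    for _ in range(max_len):
        dist = step_distribution(words)  # maps each word to its probability
        vocab = list(dist)
        word = rng.choices(vocab, weights=[dist[w] for w in vocab], k=1)[0]
        prob *= dist[word]  # assumed: sequence probability is the product of steps
        words.append(word)
        if word == EOS:
            break
    return words, prob


def toy_distribution(prefix):
    """Hypothetical decoder: fixed distribution, forced EOS after three words."""
    if len(prefix) >= 3:
        return {EOS: 1.0}
    return {"chat": 0.4, "answers": 0.3, "questions": 0.2, EOS: 0.1}


rng = random.Random(0)  # seeded so that the run is repeatable
samples = [sample_summary(toy_distribution, max_len=5, rng=rng) for _ in range(3)]
for words, prob in samples:
    print(words, prob)
```

Three samples are drawn here only to mirror y′_1 to y′_3 in the example; the count is a free choice.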

そして、学習装置１は、システム要約ｙ′_１～ｙ′_３ごとに、入力文ｘから当該システム要約ｙ′が生成される生成確率と、参照要約ｙおよび当該システム要約ｙ′の間の単語のｎ－ｇｒａｍの重複度を表すＲＯＵＧＥ値とを算出する。その上で、学習装置１は、システム要約ｙ′_１～ｙ′_３の生成確率およびＲＯＵＧＥ値から上記の式（１）に従って損失Ｌ_ＭＲＴ（θ）を算出する。 Then, for each of the system summaries y′_1 to y′_3, the learning device 1 calculates the generation probability with which that system summary y′ is generated from the input sentence x, and the ROUGE value representing the degree of word n-gram overlap between the reference summary y and that system summary y′. The learning device 1 then calculates the loss L_MRT(θ) from the generation probabilities and ROUGE values of the system summaries y′_1 to y′_3 according to the above equation (1).

このように、本実施例においても、システム要約ｙ′の生成確率およびＲＯＵＧＥ値から損失Ｌ_ＭＲＴ（θ）が算出されるまでの過程は上記のＭＲＴと共通するが、損失Ｌ_ＭＲＴ（θ）そのものを損失として用いる訳ではない。 Thus, in the present embodiment as well, the process up to calculating the loss L_MRT(θ) from the generation probabilities and ROUGE values of the system summaries y′ is the same as in the MRT described above, but the loss L_MRT(θ) itself is not used as the loss.

すなわち、本実施例では、上記のＭＲＴから改良された損失を定義する点が異なる。例えば、本実施例では、システム要約ｙ′の生成確率およびＲＯＵＧＥ値に基づく損失Ｌ_ＭＲＴ（θ）の項と共に非文法的な表現を含む擬似文ｚにペナルティを与える損失Ｌ_{ｏｒｄｅｒ}（θ）の項が加えられた損失Ｌ（θ）を下記の式（２）の通りに定義する。なお、下記の式（２）における「α」は、重み付けの係数であり、例えば、０～１の任意の値を採用できる。 That is, the present embodiment differs in that it defines a loss improved from that of the MRT described above. For example, in the present embodiment, a loss L(θ) is defined as in equation (2) below, in which a term for the loss L_order(θ), which penalizes pseudo-sentences z containing non-grammatical expressions, is added to the term for the loss L_MRT(θ), which is based on the generation probabilities and ROUGE values of the system summaries y′. Note that "α" in equation (2) below is a weighting coefficient, for which any value from 0 to 1 can be adopted, for example.

ここで、上記の損失Ｌ_{ｏｒｄｅｒ}（θ）は、下記の式（３）により算出される。下記の式（３）における「Ｄ」は、入力文ｘおよび参照要約ｙを含む学習サンプルの集合である学習データを指す。また、下記の式（３）における「Ｓ′（ｙ）」は、正解の参照要約ｙから生成される擬似文ｚの集合を指す。また、下記の式（３）における「ｐ（ｚ｜ｘ；θ）」は、モデルのパラメータをθとしたとき、入力文ｘから擬似文ｚが生成される確率を指す。また、下記の式（３）における「ｐ（ｙ｜ｘ；θ）」は、入力文ｘから正解の参照要約ｙが生成される確率を指す。 Here, the above loss L_order(θ) is calculated by equation (3) below. "D" in equation (3) below refers to the training data, which is the set of training samples each including an input sentence x and a reference summary y. "S′(y)" in equation (3) below refers to the set of pseudo-sentences z generated from the correct reference summary y. "p(z | x; θ)" in equation (3) below refers to the probability that the pseudo-sentence z is generated from the input sentence x when the model parameters are θ. "p(y | x; θ)" in equation (3) below refers to the probability that the correct reference summary y is generated from the input sentence x.
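Equations (2) and (3) are referenced here, but their images do not survive in this text. The following is a hedged reconstruction and not the patent's own typesetting. The form of equation (3) is inferred from the symbol definitions above and from the max-based worked calculation given for L_order(θ). The form of equation (2) is only an assumption: the text says that an L_order(θ) term is added to the L_MRT(θ) term with a weighting coefficient α, and a plain weighted sum is one reading of that.

```latex
% Equation (3), reconstructed from the symbol definitions and the worked example
L_{order}(\theta) = \sum_{(x,\,y) \in D} \; \sum_{z \in S'(y)}
    \max\bigl(0,\; p(z \mid x; \theta) - p(y \mid x; \theta)\bigr)

% Equation (2), ASSUMED additive form with weighting coefficient \alpha
L(\theta) = L_{MRT}(\theta) + \alpha \, L_{order}(\theta)
```

Under this reading, a pseudo-sentence contributes to the penalty only while the model assigns it a higher probability than the reference summary.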

例えば、学習装置１は、正解の参照要約ｙから当該参照要約ｙに含まれる単語の語順を入れ替えることにより非文法的な表現が擬似的に再現された擬似文ｚ_１～ｚ_３の集合Ｓ′（ｙ）を生成する。このとき、正解の参照要約ｙに含まれる単語の語数を変えずに、単語の語順を入れ替えて擬似文ｚのサンプリングを行うことで、参照要約ｙとの間で計算されるＲＯＵＧＥ値が「１」となる擬似文ｚを生成することができる。なお、ここでは、説明の便宜上、３つの擬似文ｚ_１～ｚ_３がサンプリングされる例を挙げたが、任意の個数の擬似文ｚがサンプリングされることとしてかまわない。 For example, the learning device 1 generates, from the correct reference summary y, a set S′(y) of pseudo-sentences z_1 to z_3 in which non-grammatical expressions are simulated by rearranging the order of the words included in the reference summary y. By sampling the pseudo-sentences z through rearranging the word order without changing the number of words in the correct reference summary y, pseudo-sentences z whose ROUGE value calculated against the reference summary y is "1" can be generated. For convenience of explanation, an example in which three pseudo-sentences z_1 to z_3 are sampled is given here, but any number of pseudo-sentences z may be sampled.
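One simple way to produce such pseudo-sentences is random shuffling, sketched below. The function name, the retry limit, and the English tokens are illustrative assumptions; the text does not say which reordering procedure is actually used, and a random shuffle is not guaranteed to produce an ungrammatical sentence every time.

```python
import random


def generate_pseudo_sentences(reference, count, rng, max_tries=1000):
    """Return up to `count` distinct reorderings of `reference`.

    The word multiset is left unchanged, so any bag-of-words overlap score
    against the reference stays at its maximum.
    """
    seen = {tuple(reference)}  # exclude the reference order itself
    pseudo = []
    tries = 0
    while len(pseudo) < count and tries < max_tries:
        tries += 1
        candidate = list(reference)
        rng.shuffle(candidate)
        key = tuple(candidate)
        if key not in seen:
            seen.add(key)
            pseudo.append(candidate)
    return pseudo


reference = ["operators", "answer", "questions", "by", "chat"]  # hypothetical tokens
pseudo_sentences = generate_pseudo_sentences(reference, count=3, rng=random.Random(0))
for z in pseudo_sentences:
    print(z)
```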

さらに、学習装置１は、参照要約ｙが入力文ｘから生成される生成確率ｐ（ｙ｜ｘ；θ）を算出すると共に、擬似文ｚごとに当該擬似文ｚが入力文ｘから生成される生成確率ｐ（ｚ｜ｘ；θ）を算出する。例えば、図８の例で言えば、参照要約ｙの生成確率ｐ（ｙ｜ｘ；θ）は、「０．２」と算出される。また、擬似文ｚ_１の生成確率ｐ（ｚ_１｜ｘ；θ）は、「０．３」と算出される。さらに、擬似文ｚ_２の生成確率ｐ（ｚ_２｜ｘ；θ）は、「０．４」と算出される。また、擬似文ｚ_３の生成確率ｐ（ｚ_３｜ｘ；θ）は、「０．１」と算出される。 Further, the learning device 1 calculates a generation probability p (y | x; θ) in which the reference summary y is generated from the input sentence x, and the pseudo sentence z is generated from the input sentence x for each pseudo sentence z. The generation probability p (z | x; θ) is calculated. For example, in the example of FIG. 8, the generation probability p (y | x; θ) of the reference summary y is calculated as “0.2”. Further, the generation probability p (z ₁ | x; θ) of the pseudo sentence z ₁ is calculated as “0.3”. Further, the generation probability p (z ₂ | x; θ) of the pseudo sentence z ₂ is calculated as “0.4”. Further, the generation probability p (z ₃ | x; θ) of the pseudo sentence z ₃ is calculated as “0.1”.

このような生成確率の算出結果の下、損失Ｌ_{ｏｒｄｅｒ}（θ）の計算例について説明する。例えば、Σに定義された集合Ｓ′（ｙ）のうち擬似文ｚ_１の場合、擬似文ｚ_１の生成確率（ｐ（ｚ_１｜ｘ；θ）＝０．３）と参照要約ｙの生成確率（ｐ（ｙ｜ｘ；θ）＝０．２）とが比較される。この場合、擬似文ｚ_１の生成確率が参照要約ｙの生成確率よりも大きい。このため、上記の式（３）において、擬似文ｚ_１の生成確率および参照要約ｙの生成確率の差、すなわちｐ（ｚ_１｜ｘ；θ）－ｐ（ｙ｜ｘ；θ）＝０．１は正となる。この結果、ｍａｘ関数によってｐ（ｚ_１｜ｘ；θ）－ｐ（ｙ｜ｘ；θ）＝０．１が選択される。 Given these generation probabilities, an example calculation of the loss L_order(θ) will be described. For example, for the pseudo-sentence z_1 in the set S′(y) over which the summation Σ is taken, the generation probability of the pseudo-sentence z_1 (p(z_1 | x; θ) = 0.3) is compared with the generation probability of the reference summary y (p(y | x; θ) = 0.2). In this case, the generation probability of the pseudo-sentence z_1 is larger than that of the reference summary y. Therefore, in the above equation (3), the difference between the two generation probabilities, that is, p(z_1 | x; θ) − p(y | x; θ) = 0.1, is positive. As a result, the max function selects p(z_1 | x; θ) − p(y | x; θ) = 0.1.

また、擬似文ｚ_２の場合、擬似文ｚ_２の生成確率（ｐ（ｚ_２｜ｘ；θ）＝０．４）と参照要約ｙの生成確率（ｐ（ｙ｜ｘ；θ）＝０．２）とが比較される。この場合、擬似文ｚ_２の生成確率が参照要約ｙの生成確率よりも大きい。このため、上記の式（３）において、擬似文ｚ_２の生成確率および参照要約ｙの生成確率の差、すなわちｐ（ｚ_２｜ｘ；θ）－ｐ（ｙ｜ｘ；θ）＝０．２は正となる。この場合にも、ｍａｘ関数によってｐ（ｚ_２｜ｘ；θ）－ｐ（ｙ｜ｘ；θ）＝０．２が選択される。 Further, in the case of the pseudo sentence z ₂ , the generation probability of the pseudo sentence z ₂ (p (z ₂ | x; θ) = 0.4) and the generation probability of the reference summary y (p (y | x; θ) = 0. 2) is compared. In this case, the generation probability of the pseudo sentence z ₂ is larger than the generation probability of the reference summary y. Therefore, in the above equation (3), the difference between the generation probability of the pseudo sentence z ₂ and the generation probability of the reference summary y, that is, p (z ₂ | x; θ) -p (y | x; θ) = 0. 2 is positive. Also in this case, p (z ₂ | x; θ) −p (y | x; θ) = 0.2 is selected by the max function.

また、擬似文ｚ_３の場合、擬似文ｚ_３の生成確率（ｐ（ｚ_３｜ｘ；θ）＝０．１）と参照要約ｙの生成確率（ｐ（ｙ｜ｘ；θ）＝０．２）とが比較される。この場合、擬似文ｚ_３の生成確率が参照要約ｙの生成確率よりも小さい。このため、上記の式（３）において、擬似文ｚ_３の生成確率および参照要約ｙの生成確率の差、すなわちｐ（ｚ_３｜ｘ；θ）－ｐ（ｙ｜ｘ；θ）＝－０．１は負となる。この結果、ｍａｘ関数によって０が選択される。 Further, in the case of the pseudo sentence z ₃ , the generation probability of the pseudo sentence z ₃ (p (z ₃ | x; θ) = 0.1) and the generation probability of the reference summary y (p (y | x; θ) = 0. 2) is compared. In this case, the generation probability of the pseudo sentence z ₃ is smaller than the generation probability of the reference summary y. Therefore, in the above equation (3), the difference between the generation probability of the pseudo sentence z ₃ and the generation probability of the reference summary y, that is, p (z ₃ | x; θ) -p (y | x; θ) = −0. .1 is negative. As a result, 0 is selected by the max function.

これら擬似文ｚ_１～ｚ_３の要素ごとに算出された損失が合計されることにより、損失Ｌ_{ｏｒｄｅｒ}（θ）は、０．３（＝０．１＋０．２＋０）と算出することができる。 By summing the losses calculated for each of the pseudo-sentences z_1 to z_3, the loss L_order(θ) can be calculated as 0.3 (= 0.1 + 0.2 + 0).
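The worked calculation above can be checked with a few lines. The function name `order_loss` is an illustrative assumption; the probabilities are the ones quoted for FIG. 8, and the max-with-zero form follows the description of equation (3).

```python
def order_loss(p_reference, p_pseudo_sentences):
    """Sum of max(0, p(z|x) - p(y|x)) over the pseudo-sentences z.

    A pseudo-sentence is penalized only while it is more probable than
    the reference summary.
    """
    return sum(max(0.0, p_z - p_reference) for p_z in p_pseudo_sentences)


p_y = 0.2              # generation probability of the reference summary y
p_z = [0.3, 0.4, 0.1]  # generation probabilities of z_1, z_2, z_3

loss_order = order_loss(p_y, p_z)
print(round(loss_order, 2))  # rounded to hide floating-point noise
```

Only z_1 and z_2 contribute, since z_3 is already less probable than the reference summary.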

このように、本実施例では、損失Ｌ_ＭＲＴ（θ）に加えて損失Ｌ_{ｏｒｄｅｒ}（θ）に基づいて損失Ｌ（θ）を算出することで、次のようなモデル学習を実現できる。例えば、損失Ｌ_ＭＲＴ（θ）の項によってＲＯＵＧＥ値を向上させつつ、損失Ｌ_{ｏｒｄｅｒ}（θ）の項によって擬似文ｚの生成確率よりも参照要約ｙの生成確率が上回るようにモデルのパラメータを更新することができる。 Thus, in the present embodiment, calculating the loss L(θ) based on the loss L_order(θ) in addition to the loss L_MRT(θ) makes the following model learning possible. For example, the model parameters can be updated so that the L_MRT(θ) term improves the ROUGE value while the L_order(θ) term makes the generation probability of the reference summary y exceed the generation probability of the pseudo-sentences z.

このため、参照要約と単語の重複度は高く、かつ参照要約と語順が異なるシステム要約の生成確率を上げる作用を与えつつ、参照要約と単語の重複度が高い要約文の中でも非文法的な表現を含む擬似文の生成にペナルティを課す反作用を与えることができる。それ故、参照要約と単語の重複度が高い要約文の中でも非文法的な表現が含まれないシステム要約の生成確率を上げるパラメータの更新を実現できる。 This raises the generation probability of system summaries that have a high degree of word overlap with the reference summary but a different word order, while also providing a counteracting effect that penalizes the generation of pseudo-sentences, that is, summaries with high word overlap that contain non-grammatical expressions. Therefore, parameter updates can be realized that raise the generation probability of those system summaries that, among the summaries with a high degree of word overlap with the reference summary, contain no non-grammatical expressions.

したがって、本実施例に係る学習装置１によれば、可読性が低い要約文を生成するモデルが学習されるのを抑制することができる。 Therefore, according to the learning device 1 according to the present embodiment, it is possible to suppress the learning of a model that generates a summary sentence having low readability.

［学習装置１の機能的構成］ [Functional configuration of learning device 1]
次に、本実施例に係る学習装置１の機能的構成の一例について説明する。図１に示すように、学習装置１は、学習データ記憶部２と、第１のモデル記憶部３と、第１の学習部５と、第２のモデル記憶部８と、第２の学習部１０とを有する。なお、学習装置１は、図１に示した機能部以外にも既知のコンピュータが有する各種の機能部、例えば各種の入力デバイスや音声出力デバイスなどの機能部を有することとしてもかまわない。 Next, an example of the functional configuration of the learning device 1 according to the present embodiment will be described. As shown in FIG. 1, the learning device 1 includes a learning data storage unit 2, a first model storage unit 3, a first learning unit 5, a second model storage unit 8, and a second learning unit 10. In addition to the functional units shown in FIG. 1, the learning device 1 may include various functional units of a known computer, such as various input devices and audio output devices.

図１に示す第１の学習部５および第２の学習部１０などの機能部は、あくまで例示として、下記のハードウェアプロセッサにより仮想的に実現される。このようなプロセッサの例として、ＤＬＵ（Deep Learning Unit）やＧＰＧＰＵ（General-Purpose computing on Graphics Processing Units）の他、ＧＰＵクラスタなどが挙げられる。この他、ＣＰＵ（Central Processing Unit）やＭＰＵ（Micro Processing Unit）などであってもかまわない。例えば、プロセッサがＲＡＭ（Random Access Memory）等のメモリ上に上記学習プログラムをプロセスとして展開することにより、上記の機能部が仮想的に実現される。ここでは、プロセッサの一例として、ＤＬＵやＧＰＧＰＵ、ＧＰＵクラスタ、ＣＰＵ、ＭＰＵを例示したが、汎用型および特化型を問わず、任意のプロセッサにより上記の機能部が実現されることとしてもかまわない。この他、上記の機能部は、ＡＳＩＣ（Application Specific Integrated Circuit）やＦＰＧＡ（Field Programmable Gate Array）などのハードワイヤードロジックによって実現されることを妨げない。 The functional units shown in FIG. 1, such as the first learning unit 5 and the second learning unit 10, are, purely as an example, virtually realized by a hardware processor such as the following. Examples of such processors include a DLU (Deep Learning Unit), GPGPU (General-Purpose computing on Graphics Processing Units), and a GPU cluster. A CPU (Central Processing Unit) or an MPU (Micro Processing Unit) may also be used. For example, the above functional units are virtually realized when the processor loads the above learning program as a process into a memory such as a RAM (Random Access Memory). Although a DLU, GPGPU, GPU cluster, CPU, and MPU are given here as examples of processors, the above functional units may be realized by any processor, whether general-purpose or specialized. The above functional units may also be realized by hard-wired logic such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

また、図１に示す学習データ記憶部２、第１のモデル記憶部３及び第２のモデル記憶部８などの機能部には、ＨＤＤ（Hard Disk Drive）、光ディスクやＳＳＤ（Solid State Drive）などの記憶装置を採用できる。なお、記憶装置は、必ずしも補助記憶装置でなくともよく、各種の半導体メモリ素子、例えばＲＡＭ、ＥＰＰＲＯＭやフラッシュメモリなども採用できる。 Storage devices such as an HDD (Hard Disk Drive), an optical disc, or an SSD (Solid State Drive) can be adopted for functional units shown in FIG. 1 such as the learning data storage unit 2, the first model storage unit 3, and the second model storage unit 8. The storage device does not necessarily have to be an auxiliary storage device; various semiconductor memory elements such as RAM, EEPROM, and flash memory can also be adopted.

ここで、図１には、第２の学習部１０におけるモデルの学習速度を向上させる側面から、第１の学習部５にモデルのパラメータを学習する前処理を実行させてから前処理後のパラメータを用いて第２の学習部１０に上記のモデル学習を実行させる場合を例示する。これはあくまで一例であり、必ずしも第１の学習部５による前処理が行われずともかまわない。例えば、第１の学習部５による前処理をスキップし、第２の学習部１０に初期のパラメータを用いて上記のモデル学習を実行させることとしてもかまわない。以下では、第１の学習部５により実行される前処理となるモデル学習のことを「第１のモデル学習」と記載し、第２の学習部１０により実行される上記のモデル学習のことを「第２のモデル学習」と記載する場合がある。 Here, FIG. 1 illustrates a case in which, with a view to improving the learning speed of the model in the second learning unit 10, the first learning unit 5 first executes preprocessing that learns the model parameters, and the second learning unit 10 then executes the above model learning using the preprocessed parameters. This is only an example, and the preprocessing by the first learning unit 5 does not necessarily have to be performed. For example, the preprocessing by the first learning unit 5 may be skipped, and the second learning unit 10 may execute the above model learning using the initial parameters. In the following, the model learning executed by the first learning unit 5 as preprocessing may be referred to as "first model learning", and the above model learning executed by the second learning unit 10 may be referred to as "second model learning".

学習データ記憶部２は、学習データを記憶する記憶部である。ここで、学習データには、一例として、Ｄ個の学習サンプル、いわゆる学習事例が含まれる。さらに、学習サンプルには、入力文ｘおよび参照要約ｙのペアが含まれる。なお、図１には、あくまで一例として、第１の学習部５及び第２の学習部１０に同一の学習データが用いられる場合を例示するが、第１の学習部５及び第２の学習部１０の間で異なる学習データがモデル学習に用いられることとしてもかまわない。 The learning data storage unit 2 is a storage unit that stores learning data. Here, as an example, the learning data includes D training samples, so-called training cases. Each training sample includes a pair of an input sentence x and a reference summary y. Note that FIG. 1 illustrates, purely as an example, a case in which the same learning data is used by the first learning unit 5 and the second learning unit 10, but different learning data may be used for model learning by the first learning unit 5 and the second learning unit 10.

第１のモデル記憶部３及び第２のモデル記憶部８は、いずれもモデルに関する情報を記憶する記憶部である。 The first model storage unit 3 and the second model storage unit 8 are both storage units that store information about the model.

一実施形態として、第１のモデル記憶部３及び第２のモデル記憶部８には、次のような情報が記憶される。例えば、ＲＮＮエンコーダ及びＲＮＮデコーダが接続されたニューラルネットワークを形成する入力層、隠れ層及び出力層の各層のニューロンやシナプスなどのモデルの層構造を始め、各層の重みやバイアスなどのモデルのパラメータを含むモデル情報が記憶される。ここで、第１の学習部５によりモデル学習が実行される前の段階では、第１のモデル記憶部３には、モデルのパラメータとして、乱数により初期設定されたパラメータが記憶される。また、第１の学習部５によりモデル学習が実行された後の段階では、第１のモデル記憶部３には、第１の学習部５により学習されたモデルのパラメータが保存される。また、第２の学習部１０によりモデル学習が実行された後の段階では、第２のモデル記憶部８には、第２の学習部１０により学習されたモデルのパラメータが保存される。 As one embodiment, the following information is stored in the first model storage unit 3 and the second model storage unit 8. For example, model information is stored that includes the layer structure of the model, such as the neurons and synapses of each of the input, hidden, and output layers forming the neural network in which the RNN encoder and the RNN decoder are connected, as well as model parameters such as the weights and biases of each layer. Before model learning is executed by the first learning unit 5, the first model storage unit 3 stores parameters initialized with random numbers as the model parameters. After model learning has been executed by the first learning unit 5, the first model storage unit 3 stores the model parameters learned by the first learning unit 5. After model learning has been executed by the second learning unit 10, the second model storage unit 8 stores the model parameters learned by the second learning unit 10.

第１の学習部５は、上記の前処理となる第１のモデル学習を実行する処理部である。ここでは、第１のモデル学習の一例として、対数尤度の最適化と呼ばれるモデル学習が実行される場合を例示する。 The first learning unit 5 is a processing unit that executes the first model learning that is the above-mentioned preprocessing. Here, as an example of the first model learning, a case where model learning called log-likelihood optimization is executed will be illustrated.

第１の学習部５は、図１に示すように、入力制御部５Ｉと、モデル実行部６と、更新部７とを有する。 As shown in FIG. 1, the first learning unit 5 has an input control unit 5I, a model execution unit 6, and an update unit 7.

入力制御部５Ｉは、モデルに対する入力を制御する処理部である。 The input control unit 5I is a processing unit that controls the input to the model.

一実施形態として、入力制御部５Ｉは、学習データに含まれる学習サンプルごとに、ＲＮＮエンコーダおよびＲＮＮデコーダが接続されたニューラルネットワークのモデルに対するデータの入力制御を行う。 As one embodiment, the input control unit 5I controls the input of data to the model of the neural network to which the RNN encoder and the RNN decoder are connected for each training sample included in the training data.

具体的には、入力制御部５Ｉは、学習サンプルをカウントするループカウンタｄの値を初期化する。続いて、入力制御部５Ｉは、学習データ記憶部２に記憶されたＤ個の学習サンプルのうちループカウンタｄに対応する学習サンプルを取得する。その後、入力制御部５Ｉは、ループカウンタｄをインクリメントし、ループカウンタｄの値が学習サンプルの総数Ｄと等しくなるまで、学習データ記憶部２から学習サンプルを取得する処理を繰り返し実行する。なお、ここでは、学習装置１内部のストレージに保存された学習データを取得する例を挙げたが、ネットワークを介して接続される外部のコンピュータ、例えばファイルサーバの他、リムーバブルメディア等から学習データが取得されることとしてもかまわない。 Specifically, the input control unit 5I initializes the value of the loop counter d that counts the training sample. Subsequently, the input control unit 5I acquires a learning sample corresponding to the loop counter d among the D learning samples stored in the learning data storage unit 2. After that, the input control unit 5I increments the loop counter d, and repeatedly executes the process of acquiring the training sample from the training data storage unit 2 until the value of the loop counter d becomes equal to the total number D of the training samples. Here, an example of acquiring the learning data stored in the storage inside the learning device 1 has been given, but the learning data can be obtained from an external computer connected via a network, for example, a file server, a removable medium, or the like. It does not matter if it is acquired.
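The counter-driven acquisition loop can be sketched as below. This is a minimal illustration: the generator name `iterate_samples` and the in-memory list standing in for the learning data storage unit 2 are assumptions made for the example.

```python
def iterate_samples(training_data):
    """Yield (d, input_sentence, reference_summary) for each stored sample.

    Mirrors the described control flow: initialize the loop counter d,
    fetch the sample for d, increment d, and stop once d equals D.
    """
    d = 0                      # loop counter, initialized before the first fetch
    total = len(training_data)  # D, the total number of training samples
    while d < total:
        input_sentence, reference_summary = training_data[d]
        yield d, input_sentence, reference_summary
        d += 1


# Hypothetical stand-in for the contents of the learning data storage unit.
training_data = [("input sentence 1", "summary 1"), ("input sentence 2", "summary 2")]
fetched = list(iterate_samples(training_data))
print(fetched)
```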

このように学習サンプルが取得される度に、入力制御部５Ｉは、当該学習サンプルに含まれる入力文ｘをＲＮＮエンコーダ６Ａへ入力する。これによって、入力文ｘの単語列がベクトル化されたベクトル、いわゆる中間表現がＲＮＮエンコーダ６ＡからＲＮＮデコーダ６Ｂへ出力される。これと同時または前後して、入力制御部５Ｉは、ＲＮＮデコーダ６Ｂに文末記号と呼ばれるＥＯＳを出力させるまでの残り文字数を保持するレジスタの値を所定の上限文字数、例えばユーザ入力やユーザ設定などの値に初期化する。これ以降のＲＮＮデコーダ６Ｂへの入力、ＲＮＮデータからの出力、それを用いたモデルのパラメータの更新についてはその詳細を後述する。 Each time a training sample is acquired in this way, the input control unit 5I inputs the input sentence x included in that training sample to the RNN encoder 6A. As a result, a vector obtained by vectorizing the word string of the input sentence x, the so-called intermediate representation, is output from the RNN encoder 6A to the RNN decoder 6B. At the same time as this, or before or after it, the input control unit 5I initializes the register that holds the number of characters remaining until the RNN decoder 6B is made to output the EOS, called the end-of-sentence symbol, to a predetermined upper limit on the number of characters, for example a value given by user input or a user setting. The subsequent input to the RNN decoder 6B, the output from the RNN decoder 6B, and the update of the model parameters using that output are described in detail later.

モデル実行部６は、ＲＮＮエンコーダ６ＡおよびＲＮＮデコーダ６Ｂが接続されたニューラルネットワークのモデルを実行する処理部である。 The model execution unit 6 is a processing unit that executes a model of a neural network to which the RNN encoder 6A and the RNN decoder 6B are connected.

１つの側面として、モデル実行部６は、第１のモデル記憶部３に記憶されたモデル情報にしたがって、入力制御部５Ｉにより入力された学習サンプルの入力文の単語数Ｍに対応するＭ個のＬＳＴＭ（Long Short-Term Memory）をワークエリア上に展開する。これによって、Ｍ個のＬＳＴＭをＲＮＮエンコーダ６Ａとして機能させる。このＲＮＮエンコーダ６Ａでは、入力制御部５Ｉによる入力制御にしたがって、学習サンプルの入力文の先頭の単語から順に、入力文の先頭からｍ番目の単語が当該ｍ番目の単語に対応するＬＳＴＭへ入力されると共に、ｍ－１番目の単語に対応するＬＳＴＭの出力がｍ番目の単語に対応するＬＳＴＭへ入力される。このような入力を先頭の単語に対応するＬＳＴＭから末尾であるＭ番目の単語に対応するＬＳＴＭまで繰り返すことにより、学習サンプルの入力文のベクトル、いわゆる中間表現が得られる。このようにＲＮＮエンコーダ６Ａにより生成された入力文の中間表現がＲＮＮデコーダ６Ｂへ入力される。 As one aspect, the model execution unit 6 has M units corresponding to the number of words M of the input sentence of the learning sample input by the input control unit 5I according to the model information stored in the first model storage unit 3. Deploy LSTM (Long Short-Term Memory) on the work area. As a result, M LSTMs are made to function as RNN encoders 6A. In this RNN encoder 6A, according to the input control by the input control unit 5I, the mth word from the beginning of the input sentence is input to the LSTM corresponding to the mth word in order from the first word of the input sentence of the learning sample. At the same time, the output of the LSTM corresponding to the m-1st word is input to the LSTM corresponding to the mth word. By repeating such input from the LSTM corresponding to the first word to the LSTM corresponding to the Mth word at the end, a vector of the input sentence of the learning sample, a so-called intermediate representation, can be obtained. The intermediate representation of the input sentence generated by the RNN encoder 6A in this way is input to the RNN decoder 6B.
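The chained structure of the encoder, where the cell for word m receives both word m and the output of the cell for word m−1, can be sketched as follows. This is a deliberate simplification: `toy_cell` is a hypothetical single tanh recurrence with fixed weights, standing in for an LSTM (no gates, no cell state), and the word vectors are invented.

```python
import math


def toy_cell(word_vec, prev_state):
    """Hypothetical stand-in for one LSTM: mixes the word with the previous output."""
    return [math.tanh(0.5 * w + 0.5 * h) for w, h in zip(word_vec, prev_state)]


def encode(word_vectors, state_size):
    """Feed words in order, m = 1..M, passing each output on to the next cell."""
    state = [0.0] * state_size
    for word_vec in word_vectors:
        state = toy_cell(word_vec, state)
    return state  # final state plays the role of the intermediate representation


# Two hypothetical 2-dimensional word vectors, encoded in both orders.
a, b = [1.0, 0.0], [0.0, 1.0]
forward = encode([a, b], state_size=2)
backward = encode([b, a], state_size=2)
print(forward, backward)
```

Because each step depends on the previous output, the two orders give different final vectors, unlike a bag-of-words score.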

更なる側面として、モデル実行部６は、第１のモデル記憶部３に記憶されたモデル情報にしたがって、入力制御部５Ｉにより入力された正解の参照要約の単語数Ｎに対応するＮ個のＬＳＴＭをワークエリア上に展開する。これによって、Ｎ個のＬＳＴＭをＲＮＮデコーダ６Ｂとして機能させる。これらＲＮＮデコーダ６Ｂには、入力制御部５Ｉの入力制御にしたがって、ＲＮＮエンコーダ６Ａから学習サンプルの入力文の中間表現が入力されると共に、Ｎ個のＬＳＴＭごとに入力制御部５ＩからＥＯＳのタグを出力させるまでの残り文字数が入力される。これらの入力にしたがってＮ個のＬＳＴＭを動作させることにより、ＲＮＮデコーダ６Ｂは、Ｎ個のＬＳＭＴごとに単語の確率分布を出力する。ここで言う「単語の確率分布」とは、学習サンプル全体で入力文に出現する単語ごとに算出された確率の分布を指す。 As a further aspect, the model execution unit 6 deploys, on the work area and in accordance with the model information stored in the first model storage unit 3, N LSTMs corresponding to the number of words N in the correct reference summary input by the input control unit 5I. The N LSTMs are thereby made to function as the RNN decoder 6B. Under the input control of the input control unit 5I, the RNN decoder 6B receives the intermediate representation of the input sentence of the training sample from the RNN encoder 6A, and each of the N LSTMs receives from the input control unit 5I the number of characters remaining until the EOS tag is to be output. By operating the N LSTMs in accordance with these inputs, the RNN decoder 6B outputs a word probability distribution for each of the N LSTMs. The "word probability distribution" referred to here is the distribution of probabilities calculated for each word that appears in the input sentences across all of the training samples.

The update unit 7 is a processing unit that updates the parameters of the model.

In one embodiment, when a word probability distribution is output from the n-th LSTM of the RNN decoder 6B, the update unit 7 generates the word with the highest probability in that distribution as the n-th word from the beginning of the system summary. Once the n-th word of the system summary has been generated, the update unit 7 calculates a loss from the n-th word of the correct reference summary and the n-th word generated as the system summary. In this way, a loss is calculated for each of the N LSTMs of the RNN decoder 6B. The update unit 7 then performs log-likelihood optimization based on the loss of each LSTM to calculate the parameters with which to update the model of the RNN encoder 6A and the RNN decoder 6B, and updates the model parameters stored in the first model storage unit 3 to the parameters obtained by that optimization. This parameter update can be repeated over all learning samples, and can also be repeated over the learning data D for a predetermined number of epochs.

The processing performed by the input control unit 5I, the model execution unit 6, and the update unit 7 is described with reference to FIGS. 9 to 11, which show an example of the first model learning. FIGS. 9 to 11 illustrate the case in which the input control unit 5I acquires, as a learning sample, the pair consisting of the input sentence 30 shown in FIG. 3 and the reference summary 70 shown in FIG. 7A.

As shown in FIG. 9, the model execution unit 6 vectorizes the word string contained in the input sentence 30 acquired by the input control unit 5I. Specifically, the model execution unit 6 deploys, in the work area it uses, M LSTMs 6a-1 to 6a-M corresponding to the number of words M in the input sentence 30, so that the M LSTMs 6a-1 to 6a-M function as the RNN encoder 6A. The input control unit 5I then inputs the words of the input sentence 30, in order from the first word, to the LSTM 6a corresponding to each word's position, together with the output of the preceding LSTM 6a. Repeating this from the LSTM 6a-1, corresponding to the first word "当社" ("our company"), through the LSTM 6a-M, corresponding to the last word "。", yields the vector of the input sentence 30. The vector of the input sentence 30 generated by the RNN encoder 6A in this way is input to the RNN decoder 6B.
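
The chained input described above, in which unit m receives word m together with the output of unit m-1, can be sketched as a simple fold over the word vectors. This is only an illustration: `toy_cell` is a hypothetical stand-in that mixes its two inputs with `tanh`, not an actual LSTM, and the vector representation of words is assumed.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def toy_cell(word_vec: Vector, prev_state: Vector) -> Vector:
    """Hypothetical stand-in for one LSTM unit: mixes the word with the previous state."""
    return [math.tanh(0.5 * w + 0.5 * h) for w, h in zip(word_vec, prev_state)]


def encode(word_vecs: Sequence[Vector],
           cell: Callable[[Vector, Vector], Vector] = toy_cell) -> Vector:
    """Feed word m and the output of unit m-1 into unit m, for m = 1..M."""
    if not word_vecs:
        raise ValueError("input sentence must contain at least one word")
    state = [0.0] * len(word_vecs[0])  # assumed initial state before the first word
    for vec in word_vecs:
        state = cell(vec, state)
    return state  # output of the M-th unit, i.e. the intermediate representation


sentence = [[1.0, 0.0], [0.0, 1.0]]
representation = encode(sentence)
```

Because each unit depends on the previous unit's output, reversing the word order changes the resulting vector, which is the property that lets the final state summarize the whole sentence.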

The model execution unit 6 then takes as input the vector of the input sentence 30, the correct word from one time step earlier, the number of characters remaining until the RNN decoder 6B outputs the end-of-sentence symbol EOS, and so on, and repeatedly calculates a word probability distribution at each time step until EOS is output.

For example, at the first time step, where the word probability distribution to be compared with the first word of the reference summary 70 is calculated, the operation shown in FIG. 9 is performed. As shown in FIG. 9, the input control unit 5I inputs to the LSTM 6b-1 deployed in the work area used by the model execution unit 6 the output of the LSTM 6a-M and the beginning-of-sentence symbol BOS (Begin Of Sentence), and also inputs the character count "37" of the reference summary 70 as the number of remaining characters. The LSTM 6b-1 thereby outputs the word probability distribution at the first time step (t = 1). The update unit 7 then calculates a loss from the word probability distribution at the first time step and the correct word at that step, "コールセンター" ("call center"). The closer the probability of the correct word "コールセンター" is to 1 and the closer the probabilities of the other words are to 0, the smaller the calculated loss.
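
The behaviour of the per-step loss can be illustrated with a negative log-probability of the correct word. The text does not state the exact form of the loss, so the cross-entropy form below is an assumption, as are the function name `step_loss` and the two example distributions.

```python
import math
from typing import Dict


def step_loss(word_probs: Dict[str, float], gold_word: str, eps: float = 1e-12) -> float:
    """Negative log-probability of the correct word at one decoding step (assumed loss form)."""
    return -math.log(max(word_probs.get(gold_word, 0.0), eps))


# Hypothetical distributions at t = 1; the correct word is "コールセンター".
confident = {"コールセンター": 0.90, "の": 0.05, "問い合わせ": 0.05}
unsure = {"コールセンター": 0.10, "の": 0.45, "問い合わせ": 0.45}

loss_confident = step_loss(confident, "コールセンター")
loss_unsure = step_loss(unsure, "コールセンター")
```

Under this assumed form, `loss_confident` is smaller than `loss_unsure`, matching the statement that the loss shrinks as the probability of the correct word approaches 1.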

Next, at the second time step, where the word probability distribution to be compared with the second word of the reference summary 70 is calculated, the operation shown in FIG. 10 is performed. As shown in FIG. 10, the input control unit 5I inputs to the LSTM 6b-2 the output of the LSTM 6b-1 and the correct word from one time step earlier, "コールセンター" ("call center"), and also inputs, as the number of remaining characters at the second time step, the value "30" obtained by subtracting the number of characters in the correct word at the first time step from the number of remaining characters at the first time step. The LSTM 6b-2 thereby outputs the word probability distribution at the second time step (t = 2). The update unit 7 then calculates a loss from the word probability distribution at the second time step and the correct word at that step, the particle "の". The closer the probability of the correct word "の" is to 1 and the closer the probabilities of the other words are to 0, the smaller the calculated loss.

Further, at the third time step, where the word probability distribution to be compared with the third word of the reference summary 70 is calculated, the operation shown in FIG. 11 is performed. As shown in FIG. 11, the input control unit 5I inputs to the LSTM 6b-3 the output of the LSTM 6b-2 and the correct word from one time step earlier, "の", and also inputs, as the number of remaining characters at the third time step, the value "29" obtained by subtracting the number of characters in the correct word at the second time step from the number of remaining characters at the second time step. The LSTM 6b-3 thereby outputs the word probability distribution at the third time step (t = 3). The update unit 7 then calculates a loss from the word probability distribution at the third time step and the correct word at that step, "問い合わせ" ("inquiry"). The closer the probability of the correct word "問い合わせ" is to 1 and the closer the probabilities of the other words are to 0, the smaller the calculated loss.
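
The remaining-character values 37, 30 and 29 follow from subtracting the length of each correct word in turn. A minimal sketch, assuming that characters are counted as Unicode code points (the function name is illustrative):

```python
from typing import List, Sequence


def remaining_char_schedule(reference_words: Sequence[str], budget: int) -> List[int]:
    """Remaining-character count fed to the decoder at each step t = 1, 2, ..."""
    schedule = []
    remaining = budget
    for word in reference_words:
        schedule.append(remaining)
        remaining -= len(word)  # characters consumed by the correct word at this step
    return schedule


# First three words of the reference summary in the example, with a 37-character budget.
print(remaining_char_schedule(["コールセンター", "の", "問い合わせ"], 37))  # [37, 30, 29]
```

"コールセンター" has seven characters and "の" has one, which gives 37 - 7 = 30 and 30 - 1 = 29.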

By repeating this processing until the end-of-sentence symbol EOS is output from the LSTM 6b, the update unit 7 calculates a loss for each word of the reference summary 70. The per-word loss calculation is then carried out for the reference summaries of all learning samples in the learning data. Once the per-word losses have been calculated for all learning samples, the update unit 7 executes, as the first model learning, the "log-likelihood optimization" that maximizes the objective function L_t shown in equation (4) below with respect to the parameter θ. The probability p(y|x; θ) in equation (4) is obtained, as shown in equation (5) below, as the product over all time steps of the probability assigned to the correct word at each step. In equation (4), "D" denotes the set of learning samples, each consisting of an input sentence x and a reference summary y. In "y_<t" in equation (5), "t" denotes the position of a word in the reference summary: the first word of the reference summary is written y_1, the second y_2, and so on, so that y_<t denotes the words from y_1 through y_(t-1).
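
Equations (4) and (5) are not reproduced in this text. One form consistent with the description above is given here as an assumption rather than as the published formulas: a log-likelihood summed over the learning samples in D, with the sequence probability factorized over word positions.

```latex
\begin{align}
L_t(\theta) &= \sum_{(x,\,y) \in D} \log p(y \mid x; \theta) \tag{4} \\
p(y \mid x; \theta) &= \prod_{t=1}^{|y|} p(y_t \mid y_{<t},\, x; \theta) \tag{5}
\end{align}
```

Here |y| denotes the number of words in the reference summary y, and y_{<t} stands for the words y_1, ..., y_{t-1} that precede position t.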

The update unit 7 then updates the model parameters stored in the first model storage unit 3 to the parameters θ obtained by the log-likelihood optimization. This update of θ can be repeated a predetermined number of times over the learning data D. The model parameters stored in the first model storage unit 3 in this way are then used by the second learning unit 10.

Returning to FIG. 1, the second learning unit 10 is a processing unit that executes the second model learning described above. As shown in FIG. 1, the second learning unit 10 includes an input control unit 10I, a model execution unit 11, a summary generation unit 12, a first probability calculation unit 13, an overlap degree calculation unit 14, a first loss calculation unit 15, a pseudo sentence generation unit 16, a second probability calculation unit 17, a second loss calculation unit 18, and an update unit 19.

The input control unit 10I is a processing unit that controls input to the model.

In one embodiment, the input control unit 10I controls, for each learning sample included in the learning data, the input of data to the neural network model in which the RNN encoder 11A and the RNN decoder 11B are connected.

Specifically, the input control unit 10I initializes the value of a loop counter d that counts the learning samples. The input control unit 10I then acquires, from among the D learning samples stored in the learning data storage unit 2, the learning sample corresponding to the loop counter d. The input control unit 10I then increments the loop counter d, and repeats the process of acquiring a learning sample from the learning data storage unit 2 until the value of d equals the total number D of learning samples. Although this example acquires learning data stored in storage inside the learning device 1, the learning data may instead be acquired from an external computer connected via a network, such as a file server, or from removable media or the like.

Each time a learning sample is acquired in this way, the input control unit 10I inputs the input sentence x contained in that learning sample to the RNN encoder 11A. A vector obtained by vectorizing the word string of the input sentence x, the so-called intermediate representation, is thereby output from the RNN encoder 11A to the RNN decoder 11B. At the same time as this, or before or after it, the input control unit 10I initializes the value of a register that holds the number of characters remaining until the RNN decoder 11B is made to output the end-of-sentence symbol EOS to a predetermined upper limit number of characters, for example a value supplied by user input or a user setting. The subsequent input to the RNN decoder 11B, the output from the RNN decoder 11B, and the updating of the model parameters using that output are described in detail later.

The model execution unit 11 is a processing unit that executes the neural network model in which the RNN encoder 11A and the RNN decoder 11B are connected.

In one aspect, the model execution unit 11 deploys on a work area, in accordance with the model information stored in the first model storage unit 3, M LSTMs corresponding to the number of words M in the input sentence of the learning sample supplied by the input control unit 10I. The M LSTMs thereby function as the RNN encoder 11A. In the RNN encoder 11A, under the input control of the input control unit 10I, the words of the input sentence are processed in order from the first word: the m-th word from the beginning of the input sentence is input to the LSTM corresponding to that m-th word, together with the output of the LSTM corresponding to the (m-1)-th word. Repeating this from the LSTM for the first word through the LSTM for the M-th (last) word yields a vector of the input sentence, the so-called intermediate representation. The intermediate representation generated by the RNN encoder 11A in this way is input to the RNN decoder 11B.

In a further aspect, the model execution unit 11 deploys on the work area, in accordance with the model information stored in the first model storage unit 3, K LSTMs, one for each time step until the end-of-sentence symbol EOS is output. The K LSTMs thereby function as the RNN decoder 11B. Under the input control of the input control unit 10I, the RNN decoder 11B receives the intermediate representation of the input sentence of the learning sample from the RNN encoder 11A, and each of the K LSTMs receives from the input control unit 10I the number of characters remaining until the EOS tag is to be output. By operating the K LSTMs on these inputs, the RNN decoder 11B outputs a word probability distribution for each of the K LSTMs.

Apart from the input control unit 10I and the model execution unit 11, the functional units of the second learning unit 10 can be classified, from the standpoint of calculating the loss L(θ) that the update unit 19 uses to update the model parameters, into a first system that calculates the above loss L_MRT(θ) as a first loss and a second system that calculates the above loss L_order(θ) as a second loss.

Of these, the first system includes the summary generation unit 12, which generates system summaries; the first probability calculation unit 13, which calculates the generation probability of each system summary; the overlap degree calculation unit 14, which calculates the degree of overlap between a system summary and the reference summary; and the first loss calculation unit 15, which calculates the first loss described above.

The processing in the first system of the second model learning is described below with reference to FIG. 12, which shows an example of the inputs to and outputs from the model in the first system. FIG. 12 illustrates the case in which the input control unit 10I acquires, as a learning sample, the pair consisting of the input sentence 30 shown in FIG. 3 and the reference summary 70 shown in FIG. 7A.

As shown in FIG. 12, the model execution unit 11, like the model execution unit 6 described above, vectorizes the word string contained in the input sentence 30 acquired by the input control unit 10I. Specifically, the model execution unit 11 deploys, in the work area it uses, M LSTMs 11a-1 to 11a-M corresponding to the number of words M in the input sentence 30, so that the M LSTMs 11a-1 to 11a-M function as the RNN encoder 11A. The input control unit 10I then inputs the words of the input sentence 30, in order from the first word, to the LSTM 11a corresponding to each word's position, together with the output of the preceding LSTM 11a. Repeating this from the LSTM 11a-1, corresponding to the first word "当社" ("our company"), through the LSTM 11a-M, corresponding to the last word "。", yields the vector of the input sentence 30. The vector of the input sentence 30 generated by the RNN encoder 11A in this way is input to the RNN decoder 11B.

The model execution unit 11 then takes as input the vector of the input sentence 30, the word predicted one time step earlier, the number of characters remaining until the RNN decoder 11B outputs EOS, and so on, and repeatedly calculates a word probability distribution at each time step until EOS is output.

Unlike the first model learning, in the second model learning the input control unit 10I inputs to the RNN decoder 11B at each time step not the correct word from one time step earlier but the word generated one time step earlier. Furthermore, in the second model learning the loss of the system summary with respect to the reference summary is not calculated at each time step, as it is for the RNN decoder 6B in the first model learning. Instead, as shown in FIG. 12, the loss of the system summary is calculated after the system summary has been generated by having the LSTM 11b for each time step repeatedly output a word based on the word probability distribution until the EOS tag is output.

For example, at the first time step, where the first word of the system summary is predicted, the input control unit 10I inputs to the LSTM 11b-1 deployed in the work area used by the model execution unit 11 the output of the LSTM 11a-M and the beginning-of-sentence symbol BOS, together with the character count "37" of the reference summary 70 as the number of remaining characters. Although the character count of the reference summary is used here as an example of the upper limit number of characters, the limit may be set shorter or longer than the character count of the reference summary. The LSTM 11b-1 thereby outputs the word probability distribution at the first time step (t = 1). Based on this distribution, the summary generation unit 12 extracts the first word of the system summary. For example, the summary generation unit 12 can draw a word at random according to the word probability distribution and extract the word that is drawn. Alternatively, the summary generation unit 12 can randomly sample one word from among the words whose probabilities rank within a predetermined number from the top, for example the top five. In the example shown in FIG. 12, the processing from the second time step onward is described for the case, given purely as an example, in which "コールセンター" ("call center") is randomly sampled as the first word of the system summary.
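
Both ways of picking a word, drawing according to the probability distribution and sampling at random from the top-ranked words, can be sketched as follows. The function name `sample_word` is illustrative, and choosing uniformly among the top-k words is an assumption, since the text only says the word is sampled at random from that group.

```python
import random
from typing import Dict, Optional


def sample_word(word_probs: Dict[str, float], rng: random.Random,
                top_k: Optional[int] = None) -> str:
    """Draw one word from a word probability distribution.

    With top_k=None the draw follows the distribution itself; with top_k set,
    one of the k most probable words is chosen uniformly (an assumption).
    """
    if top_k is not None:
        ranked = sorted(word_probs, key=word_probs.get, reverse=True)
        return rng.choice(ranked[:top_k])
    words = list(word_probs)
    weights = [word_probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only to make the illustration repeatable
dist = {"コールセンター": 0.6, "当社": 0.3, "の": 0.1}  # hypothetical distribution at t = 1
first_word = sample_word(dist, rng)
first_word_top2 = sample_word(dist, rng, top_k=2)
```

Sampling rather than always taking the most probable word is what allows several different system summaries to be generated from the same input sentence.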

Next, at the second time step, where the second word of the system summary is predicted, the input control unit 10I inputs to the LSTM 11b-2 the output of the LSTM 11b-1 and the prediction result from one time step earlier, "コールセンター" ("call center"), together with the value "30", obtained by subtracting the number of characters in the word predicted at the first time step from the number of remaining characters at the first time step, as the number of remaining characters at the second time step. The LSTM 11b-2 thereby outputs the word probability distribution at the second time step (t = 2). The summary generation unit 12 then draws a word at random according to this distribution and samples the word that is drawn.

Thereafter, the summary generation unit 12 samples a word of the system summary at each time step until EOS is output by the LSTM 11b-K. By generating system summaries through such sampling, the summary generation unit 12 can generate a predetermined number, for example S, of system summaries y′ for one input sentence. When S system summaries have been generated in this way, the first probability calculation unit 13 calculates, for each of the S system summaries y′, the generation probability p(y′|x, θ) with which the system summary y′ is generated from the input sentence x, based on the probabilities of the words generated at each time step of that system summary.
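
The sampling loop and the generation probability can be sketched together. The decoder is replaced here by a hypothetical `toy_step` function, the generation probability is accumulated as a sum of log-probabilities (equivalent to a product of the per-step probabilities), and including the probability of EOS itself in that product is an assumption.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

EOS = "EOS"
StepFn = Callable[[str, int], Dict[str, float]]  # (previous word, remaining chars) -> distribution


def sample_summary(step: StepFn, budget: int, rng: random.Random,
                   max_steps: int = 50) -> Tuple[List[str], float]:
    """Sample words until EOS; return the words and the log of the generation probability."""
    words: List[str] = []
    log_prob = 0.0
    prev, remaining = "BOS", budget
    for _ in range(max_steps):  # safety cap in case EOS is never drawn
        dist = step(prev, remaining)
        vocab = list(dist)
        word = rng.choices(vocab, weights=[dist[w] for w in vocab], k=1)[0]
        log_prob += math.log(dist[word])
        if word == EOS:
            break
        words.append(word)
        prev = word             # the sampled word, not the reference word, is fed back
        remaining -= len(word)  # remaining-character count for the next step
    return words, log_prob


def toy_step(prev: str, remaining: int) -> Dict[str, float]:
    """Hypothetical decoder step that favours EOS once the character budget is used up."""
    if remaining <= 0:
        return {EOS: 0.9, "の": 0.1}
    return {"コールセンター": 0.5, "の": 0.3, EOS: 0.2}


rng = random.Random(1)
S = 3
system_summaries = [sample_summary(toy_step, 37, rng) for _ in range(S)]
```

Each entry of `system_summaries` pairs one sampled word sequence with its log generation probability, which is the quantity the first probability calculation unit 13 is described as computing for each of the S summaries.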

In the second model learning, the first loss L_MRT(θ) is calculated according to equation (1) above. That is, the first loss L_MRT(θ) is calculated based on the generation probability of the system summary calculated by the first probability calculation unit 13 and on the degree of word overlap between the system summary and the reference summary calculated by the overlap degree calculation unit 14, described below.

As shown in FIG. 12, the degree of overlap Δ(y′, y) used to calculate the first loss is not necessarily calculated from all of the words in the system summary. That is, for each of the S system summaries generated by the summary generation unit 12, the overlap degree calculation unit 14 calculates the degree of word overlap with the reference summary using only the portion of the system summary that lies within the upper limit number of characters, for example the character count of the reference summary. The words in the portion of the system summary that exceeds the upper limit number of characters, namely the hatched portion in FIG. 12, can thereby be excluded from the overlap calculation.

For example, as shown in equation (6) below, the overlap degree calculation unit 14 can calculate the degree of n-gram overlap with a ROUGE function that includes a trim function, which cuts out from the beginning of the system summary's character string the words corresponding to the n bytes that correspond to the upper limit number of characters.

FIG. 13 shows an example of a method for calculating the degree of overlap, in which the degree of overlap Δ(y′, y) is calculated according to equation (6) above. As shown in FIG. 13, the system summary y′ contains the first word y′_1, the second word y′_2, ..., the (k-1)-th word y′_(k-1), the k-th word y′_k, ..., and the last word y′_|y′|. The reference summary y contains the first word y_1, the second word y_2, ..., and the last word y_|y|. In this case, trim(y′, byte(y)) cuts out of the system summary y′ the words that fit within the number of bytes of the reference summary y, namely the first word y′_1, the second word y′_2, ..., and the (k-1)-th word y′_(k-1). ROUGE(trim(y′, byte(y)), y) then calculates the degree of word overlap between trim(y′, byte(y)), that is, the portion of the system summary y′ from its first word y′_1 through its (k-1)-th word y′_(k-1), and the reference summary y. Calculating the degree of overlap Δ(y′, y) according to equation (6) in this way excludes from the calculation the words from the k-th word to the end of a system summary y′ that exceeds the upper limit number of characters, namely the words y′_k through y′_|y′|. As a result, the system summary y′ is kept from being overrated merely because the words y′_k through y′_|y′| beyond the upper limit happen to include words that also appear in the reference summary y.
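
The trimmed overlap ROUGE(trim(y′, byte(y)), y) can be sketched as follows. A clipped unigram recall stands in for ROUGE, UTF-8 is assumed for byte counting, and a word that would cross the byte limit is dropped whole. All three choices are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def byte_len(words: Sequence[str]) -> int:
    """Length of a word sequence in bytes (UTF-8 is an assumed encoding)."""
    return sum(len(w.encode("utf-8")) for w in words)


def trim(system: Sequence[str], n_bytes: int) -> List[str]:
    """Keep the leading words of the system summary that fit within n_bytes."""
    kept: List[str] = []
    used = 0
    for word in system:
        size = len(word.encode("utf-8"))
        if used + size > n_bytes:
            break  # this word and everything after it is excluded
        kept.append(word)
        used += size
    return kept


def unigram_recall(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """ROUGE-1-style recall: clipped unigram matches divided by reference length."""
    if not reference:
        return 0.0
    matches = Counter(candidate) & Counter(reference)
    return sum(matches.values()) / len(reference)


def overlap(system: Sequence[str], reference: Sequence[str]) -> float:
    """Overlap of the trimmed system summary with the reference summary."""
    return unigram_recall(trim(system, byte_len(reference)), reference)


reference = ["a", "b", "c"]
system = ["a", "x", "y", "b", "c"]  # the matching words "b" and "c" lie beyond the limit
trimmed_score = overlap(system, reference)
untrimmed_score = unigram_recall(system, reference)
```

In this toy case the untrimmed score counts all three reference words as matched, while the trimmed score credits only "a", which illustrates how trimming keeps an over-length summary from being overrated.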

In addition to restricting the overlap calculation to the words within the upper limit number of characters of the system summary, the overlap degree calculation unit 14 can, as shown in equation (7) below, calculate the length by which the system summary falls short of the upper limit number of characters, or the length by which it exceeds the upper limit, as an error that is applied to the degree of overlap as a penalty. The "C" in equation (7) below is a hyperparameter set by the developer or user of the learning program.

FIG. 14 shows an example of a method for calculating the degree of overlap with an error term, in which the error-adjusted degree of overlap Δ(y′, y) is calculated according to equation (7) above. In the example of FIG. 14, as in the example of FIG. 13, ROUGE(trim(y′, byte(y)), y) calculates the degree of word overlap between trim(y′, byte(y)), that is, the portion of the system summary y′ from its first word y′_1 through its (k-1)-th word y′_(k-1), and the reference summary y. In addition, according to equation (7), the absolute value of the difference in length between the system summary and the reference summary, for example |byte(y′) - byte(y)|, is applied to the degree of overlap as an error. In the example of FIG. 14, the system summary is longer than the reference summary, so the length in excess of the upper limit number of characters, byte(y′) - byte(y), is added to the degree of overlap to give the error-adjusted degree of overlap Δ(y′, y). In this way, the error |byte(y′) - byte(y)| is applied to the degree of overlap calculated by ROUGE according to equation (7) to obtain the error-adjusted degree of overlap Δ(y′, y). The loss thereby increases both for system summaries that fall short of the upper limit number of characters and for those that exceed it, which makes it possible to realize model learning that favors system summaries whose character count matches the upper limit.

The overlap calculation unit 14 does not necessarily have to compute the error for system summaries that fall short of the character limit. For example, according to equation (8) below, the overlap calculation unit 14 can compute the error only when the system summary exceeds the character limit, taking the length of the excess as the error.

FIG. 15 is a diagram showing an example of a method for calculating the degree of overlap with error. FIG. 15 shows an example in which the degree of overlap with error Δ(y′, y) is calculated according to equation (8) above. In the example of FIG. 15, as in the example of FIG. 13, ROUGE(trim(y′, byte(y)), y) gives the degree of word overlap between the reference summary y and trim(y′, byte(y)), that is, the system summary y′ cut out from its first word y′_1 through its (k−1)-th word y′_{k−1}. When the system summary exceeds the character limit, max(0, byte(y′) − byte(y)) selects the excess length byte(y′) − byte(y), which is added to the degree of overlap to obtain Δ(y′, y). When the system summary falls short of the character limit, max(0, byte(y′) − byte(y)) selects 0, so no error is attached and the degree of overlap is used as Δ(y′, y) unchanged. This raises the loss of system summaries that exceed the character limit without raising the loss of those that fall short of it. As a result, model learning can be achieved that favors system summaries within the character limit.
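The trimming step and the two error terms described for equations (7) and (8) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, length is measured in characters (an assumption chosen to match the worked example of FIG. 16; the text does not say whether byte(·) counts bytes or characters), and the hyperparameter C is left out because its role is visible only in equation (7), which is not reproduced in this section.

```python
def length(words):
    """Length of a summary, assumed here to be its total character count."""
    return sum(len(w) for w in words)


def trim(words, limit):
    """Keep the leading words whose cumulative length stays within `limit`."""
    kept, used = [], 0
    for w in words:
        used += len(w)
        if used > limit:
            break  # this word would cross the limit, so it is cut off
        kept.append(w)
    return kept


def error_abs(system, reference):
    """Error in the style of equation (7): shortfall or excess, as an absolute value."""
    return abs(length(system) - length(reference))


def error_over(system, reference):
    """Error in the style of equation (8): only the excess over the limit counts."""
    return max(0, length(system) - length(reference))
```

With these definitions, a system summary shorter than the reference receives a positive `error_abs` but a zero `error_over`, which mirrors the difference between FIG. 14 and FIG. 15.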

After the degree of overlap with error Δ(y′, y) has been calculated, the first loss calculation unit 15 calculates a first loss for each of a predetermined number of system summaries generated by the summary generation unit 12, for example S system summaries. The first loss is calculated from the generation probability that the system summary is generated from the input sentence and from the degree of overlap with error Δ(y′, y) calculated by the overlap calculation unit 14. The first loss calculation unit 15 then sums the first losses calculated for the S system summaries to obtain the sum of the first losses over the set S(x, θ) of system summaries y′.

Returning to the description of FIG. 1, the second system includes a pseudo-sentence generation unit 16 that generates pseudo-sentences, a second probability calculation unit 17 that calculates the generation probability of the reference summary and the generation probability of each pseudo-sentence, and a second loss calculation unit 18 that calculates the second loss described above.

For example, the pseudo-sentence generation unit 16 generates a set S′(y) of pseudo-sentences z in which ungrammatical expressions are simulated by reordering the words contained in the correct reference summary y. By sampling each pseudo-sentence z through a change of word order alone, without changing the number of words contained in the reference summary y, the pseudo-sentence generation unit 16 can generate pseudo-sentences z whose ROUGE value computed against the reference summary y is 1.
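A minimal sketch of this sampling step is shown below. The function names are hypothetical, and the overlap measure is a unigram recall in the style of ROUGE-1, which is an assumption: the text only says that the ROUGE value is 1, and a measure based on longer n-grams would generally not stay at 1 after reordering.

```python
import random
from collections import Counter


def make_pseudo_sentences(reference, n_samples, seed=0):
    """Sample pseudo-sentences by permuting the word order of the reference."""
    rng = random.Random(seed)
    pseudo = []
    for _ in range(n_samples):
        z = list(reference)  # same words, same word count
        rng.shuffle(z)       # only the order changes
        pseudo.append(z)
    return pseudo


def unigram_recall(candidate, reference):
    """Fraction of reference words matched in the candidate (ROUGE-1 style)."""
    cand, ref = Counter(candidate), Counter(reference)
    matched = sum(min(cand[w], ref[w]) for w in ref)
    return matched / sum(ref.values())
```

Note that a random permutation can reproduce the original order by chance; this sketch does not filter such samples out.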

Here, when the second loss is calculated, the configuration of the RNN encoder 11A, the input to the RNN encoder 11A, and the output from the RNN encoder 11A are the same as when the first loss is calculated. The configuration of the RNN decoder 11B, the input to the RNN decoder 11B, and the output from the RNN decoder 11B, on the other hand, differ from the case where the first loss is calculated.

For example, when the generation probability of a pseudo-sentence z used for calculating the second loss is computed, the model execution unit 11 expands J LSTMs in a work area according to the model information stored in the first model storage unit 3, where J is the number of words in the pseudo-sentence z supplied by the input control unit 10I. These J LSTMs function as the RNN decoder 11B. Under the input control of the input control unit 10I, the RNN decoder 11B receives the intermediate representation of the input sentence x of the training sample from the RNN encoder 11A, and each of the J LSTMs receives from the input control unit 10I the number of characters remaining until the EOS tag is to be output. Each of the J LSTMs of the RNN decoder 11B also receives, under the same input control, the word of the pseudo-sentence z from one time step earlier. By operating the J LSTMs on these inputs, the RNN decoder 11B outputs, for each of the J LSTMs, the probability of the word of the pseudo-sentence z at the corresponding time step. Based on the word probabilities output by the LSTMs of the RNN decoder 11B at each time step, the second probability calculation unit 17 calculates the generation probability p(z|x; θ) that the pseudo-sentence z is generated from the input sentence x.

The processing performed in the second system of the second model learning is described below with reference to FIG. 16. FIG. 16 is a diagram showing an example of inputs to and outputs from the model in the second system. FIG. 16 shows an example in which the input control unit 10I inputs the input sentence 30 shown in FIG. 3 to the RNN encoder 11A and inputs the word at each time step of a pseudo-sentence z, which is the same sentence as the system summary shown in FIG. 7D, to the RNN decoder 11B. Because the configuration of the RNN encoder 11A, the input to the RNN encoder 11A, and the output from the RNN encoder 11A are the same as in the example shown in FIG. 12, the description starts from the RNN decoder 11B.

As shown in FIG. 16, the model execution unit 11 takes as input the vector of the input sentence 30, the word of the pseudo-sentence z at each time step, the number of characters remaining until the RNN decoder 11B outputs EOS, and so on, and repeatedly computes the probability distribution over words at each time step until EOS is output.

Here, when the generation probability of the pseudo-sentence z is calculated, the input differs from the case where a system summary is generated: the input control unit 10I feeds the LSTM 11b at each time step of the RNN decoder 11B not the word generated one time step earlier, but the word of the pseudo-sentence z at the preceding time step.

For example, at the first time step, the input control unit 10I inputs to the LSTM 11b-1, expanded in the work area used by the model execution unit 11, the output of the LSTM 11a-M and the beginning-of-sentence symbol "BOS", together with "37", the number of characters of the reference summary 70, as the remaining character count. The number of characters of the reference summary is used here as an example of the character limit, but the limit may also be set shorter or longer than the reference summary. The LSTM 11b-1 then outputs the probability distribution over words at the first time step (t = 1). At this point, the second probability calculation unit 17 stores in a work area (not shown) the probability in that distribution that corresponds to "AI", the first word of the pseudo-sentence z.

Next, at the second time step, the input control unit 10I inputs to the LSTM 11b-2 the output of the LSTM 11b-1 and "AI", the word of the pseudo-sentence z from one time step earlier, together with "35" as the remaining character count for the second time step. This count is obtained by subtracting the number of characters of the word "AI" from the remaining character count at the first time step. The LSTM 11b-2 then outputs the probability distribution over words at the second time step (t = 2). At this point, the second probability calculation unit 17 stores in the work area (not shown) the probability in that distribution that corresponds to "の", the second word of the pseudo-sentence z.

After such input to the RNN decoder 11B has been repeated up to the (J−2)-th time step, at the (J−1)-th time step the input control unit 10I inputs to the LSTM 11b-J-1 the output of the LSTM 11b-J-2 and "販売" ("sales"), the word of the pseudo-sentence z from one time step earlier, together with "5" as the remaining character count for the (J−1)-th time step. This count is obtained by subtracting the number of characters of the word "販売" from the remaining character count at the (J−2)-th time step. The LSTM 11b-J-1 then outputs the probability distribution over words at the (J−1)-th time step (t = J−1). At this point, the second probability calculation unit 17 stores in the work area (not shown) the probability in that distribution that corresponds to "問い合わせ" ("inquiry"), the (J−1)-th word of the pseudo-sentence z.

Finally, at the J-th time step, the input control unit 10I inputs to the LSTM 11b-J the output of the LSTM 11b-J-1 and "問い合わせ" ("inquiry"), the word of the pseudo-sentence z from one time step earlier, together with "0" as the remaining character count for the J-th time step. This count is obtained by subtracting the number of characters of the word "問い合わせ" from the remaining character count at the (J−1)-th time step. The LSTM 11b-J then outputs the probability distribution over words at the J-th time step (t = J). At this point, the second probability calculation unit 17 stores in the work area (not shown) the probability in that distribution that corresponds to "EOS", the J-th word of the pseudo-sentence z.

Based on the word probabilities of the pseudo-sentence z at each time step stored in the work area in this way, the second probability calculation unit 17 calculates the generation probability p(z|x; θ) that the pseudo-sentence z is generated from the input sentence x. The generation probability can thus be obtained for each pseudo-sentence z.
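The bookkeeping in the walkthrough above can be sketched in two small functions: one that produces the remaining character count fed to the decoder at each time step, and one that turns the stored per-step word probabilities into a sequence probability. The function names are hypothetical, and multiplying the per-step probabilities (the usual chain-rule factorization) is an assumption; the text only states that p(z|x; θ) is calculated based on those probabilities.

```python
import math


def remaining_counts(words, limit):
    """Remaining character count fed at each step, ending with the EOS step."""
    counts, left = [], limit
    for w in words:
        counts.append(left)  # count fed at the step that should emit w
        left -= len(w)       # subtract the characters of the word just consumed
    counts.append(left)      # count fed at the step that should emit EOS
    return counts


def sequence_probability(step_probs):
    """Product of strictly positive per-step probabilities, via a log-sum."""
    return math.exp(sum(math.log(p) for p in step_probs))
```

For a sentence whose words exactly fill the limit, the last value returned by `remaining_counts` is 0, matching the "0" fed at the J-th time step in the example.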

The generation probability of the reference summary y used for calculating the second loss can be calculated in the same way as the generation probability of a pseudo-sentence z. That is, the model execution unit 11 expands I LSTMs in the work area according to the model information stored in the first model storage unit 3, where I is the number of words in the reference summary y supplied by the input control unit 10I. These I LSTMs function as the RNN decoder 11B. Under the input control of the input control unit 10I, the RNN decoder 11B receives the intermediate representation of the input sentence x of the training sample from the RNN encoder 11A, and each of the I LSTMs receives from the input control unit 10I the number of characters remaining until the EOS tag is to be output. Each of the I LSTMs of the RNN decoder 11B also receives, under the same input control, the word of the reference summary y from one time step earlier. By operating the I LSTMs on these inputs, the RNN decoder 11B outputs, for each of the I LSTMs, the probability of the word of the reference summary y at the corresponding time step. Based on the word probabilities output by the LSTMs of the RNN decoder 11B at each time step, the second probability calculation unit 17 calculates the generation probability p(y|x; θ) that the reference summary y is generated from the input sentence x.

After the generation probability of the pseudo-sentence z has been calculated in this way, the second loss calculation unit 18 compares the generation probability of the pseudo-sentence z with the generation probability of the reference summary y. If the generation probability of the pseudo-sentence z is larger than that of the reference summary y, the second loss calculation unit 18 calculates the difference between the two, p(z|x; θ) − p(y|x; θ), as the second loss. If the generation probability of the pseudo-sentence z is not larger than that of the reference summary y, the second loss calculation unit 18 takes a predetermined set value, for example a value of zero or more, as the second loss. The second loss calculation unit 18 then sums the second losses calculated for the individual pseudo-sentences z to obtain the sum of the second losses over the set S′(y) of S′ pseudo-sentences z.
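The comparison described above amounts to a hinge-style penalty on pseudo-sentences that the model prefers over the reference summary. A minimal sketch follows; the function names and the `floor` parameter (the "predetermined set value", defaulting to 0 here) are illustrative assumptions.

```python
def second_loss(p_pseudo, p_reference, floor=0.0):
    """Second loss for a single pseudo-sentence z."""
    if p_pseudo > p_reference:
        # The model prefers the ungrammatical pseudo-sentence: penalize the gap.
        return p_pseudo - p_reference
    # Otherwise use the predetermined set value (assumed to be 0 by default).
    return floor


def second_loss_sum(p_pseudos, p_reference, floor=0.0):
    """Sum of the second losses over all pseudo-sentences in S'(y)."""
    return sum(second_loss(p, p_reference, floor) for p in p_pseudos)
```

With `floor=0.0`, only pseudo-sentences whose generation probability exceeds that of the reference summary contribute to the sum.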

As described above, the process of calculating the sum of the first losses over the S system summaries and the sum of the second losses over the S′ pseudo-sentences z is repeated for every training sample included in the training data. Once the sum of the first losses and the sum of the second losses have been calculated for all training samples, the update unit 19 updates the model parameters to the parameters θ that minimize the objective function L(θ) shown in equation (2) above. The updated model parameters are stored in the second model storage unit 8. This update of the parameters θ can be repeated a predetermined number of times over the training data D. The model information stored in the second model storage unit 8 can then be provided as a model for generating summaries.

[Processing flow]
FIG. 17 is a flowchart showing the procedure of the learning process according to the first embodiment. The flowchart in FIG. 17 diagrams the procedure of the second model learning executed by the second learning unit 10. As an example only, FIG. 17 shows the flowchart for the case where the degree of overlap with error is calculated according to equation (8) above. To speed up model learning in the second learning unit 10, for example, the first model learning by the first learning unit 5 can be executed as preprocessing, and the learning process shown in FIG. 17 can then be started from the model parameters learned by the first learning unit 5.

As shown in FIG. 17, the processing of steps S101 to S103 is executed for each of the D training samples included in the training data. That is, the input control unit 10I acquires one of the training samples included in the training data stored in the training data storage unit 2 (step S101).

The training sample acquired in step S101 is input to the first system, and the first loss calculation process is executed (step S102).

(1) First loss calculation process
FIG. 18 is a flowchart showing the procedure of the first loss calculation process according to the first embodiment. This process corresponds to step S102 above. As shown in FIG. 18, the summary generation unit 12 generates S system summaries y′ for the input sentence x of the training sample acquired in step S101 by sampling a word at each time step from the probability distribution over words output by the RNN decoder (step S301). The first probability calculation unit 13 then calculates the generation probabilities of the S system summaries y′ generated in step S301 (step S302).

After that, the processing of steps S303 to S306 below is executed for each of the S system summaries y′ generated in step S301. That is, according to trim(y′, byte(y)) in equation (8) above, the overlap calculation unit 14 cuts out from the system summary y′ the words that fit within the character limit, for example the number of bytes of the reference summary y (step S303).

The overlap calculation unit 14 then calculates, according to ROUGE(trim(y′, byte(y)), y) in equation (8) above, the degree of word overlap between trim(y′, byte(y)) cut out in step S303 and the reference summary y (step S304).

The overlap calculation unit 14 also calculates, according to max(0, byte(y′) − byte(y)) in equation (8) above, the length byte(y′) − byte(y) by which the system summary y′ exceeds the character limit as the error (step S305). When the system summary falls short of the character limit, max(0, byte(y′) − byte(y)) selects 0, so the error attached to the degree of overlap is calculated as 0.

The degree of overlap with error Δ(y′, y) is derived by attaching the error calculated in step S305 to the degree of overlap calculated in step S304.

After that, the first loss calculation unit 15 calculates the first loss from the generation probability of the system summary y′ calculated in step S302 and the degree of overlap with error Δ(y′, y) (step S306).

Once the first loss has been calculated for each of the S system summaries y′ generated in step S301, the first loss calculation unit 15 sums the first losses calculated for the S system summaries to obtain the sum of the first losses for the set S(x, θ) of system summaries y′ (step S307), and the processing of step S102 shown in FIG. 17 ends.
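The loop over steps S303 to S307 can be sketched as below. This is only a control-flow illustration: `first_loss_sum` and its parameters are hypothetical names, every component is passed in as a callable, and the way the generation probability and Δ(y′, y) are combined into the first loss (`loss_fn`) is deliberately left to the caller because this section does not give that formula. The error is simply added to the overlap, as the text describes, and no hyperparameter scaling is applied.

```python
def first_loss_sum(systems, probs, reference, limit,
                   trim_fn, overlap_fn, error_fn, loss_fn):
    """Sum of first losses over S sampled system summaries (steps S303 to S307)."""
    total = 0.0
    for y_sys, p in zip(systems, probs):
        trimmed = trim_fn(y_sys, limit)           # step S303: cut to the limit
        overlap = overlap_fn(trimmed, reference)  # step S304: word overlap
        error = error_fn(y_sys, limit)            # step S305: excess length
        delta = overlap + error                   # overlap with error attached
        total += loss_fn(p, delta)                # step S306: per-summary loss
    return total                                  # step S307: sum over the set


# Toy usage with stand-in components; lengths are counted in words here and
# loss_fn is an assumed product, chosen only to make the example concrete.
example_total = first_loss_sum(
    systems=[["a", "b", "c"], ["a"]],
    probs=[0.5, 0.25],
    reference=["a", "b"],
    limit=2,
    trim_fn=lambda y, n: y[:n],
    overlap_fn=lambda t, r: len(set(t) & set(r)) / len(r),
    error_fn=lambda y, n: max(0, len(y) - n),
    loss_fn=lambda p, d: p * d,
)
```

Passing the components as callables keeps the sketch independent of any particular ROUGE implementation or loss formula.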

Returning to the description of FIG. 17, the training sample acquired in step S101 is input to the second system, and the second loss calculation process is executed (step S103).

(2) Second loss calculation process
FIG. 19 is a flowchart showing the procedure of the second loss calculation process according to the first embodiment. This process corresponds to step S103 above. As shown in FIG. 19, the pseudo-sentence generation unit 16 generates a set S′(y) of pseudo-sentences z in which ungrammatical expressions are simulated by reordering the words contained in the correct reference summary y (step S501).

After that, the processing of steps S502 to S505 below is executed for each of the S′ pseudo-sentences z generated in step S501. That is, the second probability calculation unit 17 calculates the generation probability p(z|x; θ) that the pseudo-sentence z is generated from the input sentence x (step S502). The second loss calculation unit 18 then compares the generation probability of the pseudo-sentence z calculated in step S502 with the generation probability of the reference summary y (step S503).

If the generation probability of the pseudo-sentence z is larger than the generation probability of the reference summary y (Yes in step S503), the second loss calculation unit 18 calculates, according to equation (3) above, the difference between the generation probability of the pseudo-sentence z and that of the reference summary y, p(z|x; θ) − p(y|x; θ), as the second loss (step S504).

If the generation probability of the pseudo-sentence z is not larger than the generation probability of the reference summary y (No in step S503), the second loss calculation unit 18 takes, according to equation (3) above, a predetermined set value, for example a value of zero or more, as the second loss (step S505).

Once the second loss has been calculated for each of the S′ pseudo-sentences z generated in step S501, the second loss calculation unit 18 sums the second losses calculated for the S′ pseudo-sentences to obtain the sum of the second losses for the set S′(y) of pseudo-sentences z (step S506), and the processing of step S103 shown in FIG. 17 ends.

After that, once the sum of the first losses for the set S(x, θ) of system summaries y′ and the sum of the second losses for the set S′(y) of pseudo-sentences z have been calculated for all training samples included in the training data, the update unit 19 updates the model parameters stored in the second model storage unit 8 to the parameters θ that minimize the objective function L(θ) shown in equation (2) above (step S104), and the process ends.

[One aspect of the effects]
As described above, the learning device 1 according to this embodiment generates pseudo-sentences in which ungrammatical expressions are simulated by reordering the words contained in the correct reference summary, and updates the model parameters so that the probability that the model generates the reference summary becomes higher than the probability that it generates a pseudo-sentence. This raises the generation probability of system summaries that overlap heavily with the reference summary in words but differ from it in word order, while counteracting that effect with a penalty on generating pseudo-sentences, that is, summaries that overlap heavily with the reference summary yet contain ungrammatical expressions. The parameters can therefore be updated so as to raise the generation probability of system summaries that have high word overlap with the reference summary and contain no ungrammatical expressions. The learning device 1 according to this embodiment can thus keep a model that generates summaries with low readability from being learned.

Embodiments of the disclosed device have been described so far, but the present invention may be implemented in various forms other than the embodiments described above. Other embodiments included in the present invention are therefore described below.

[Distribution and integration]
The components of each illustrated device do not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the first learning unit 5 or the second learning unit 10 may be connected via a network as a device external to the learning device 1. The first learning unit 5 and the second learning unit 10 may also be held by separate devices that are connected via a network and cooperate to realize the functions of the learning device 1. Likewise, all or part of the training data storage unit 2, the first model storage unit 3, or the second model storage unit 8 may be held by separate devices that are connected via a network and cooperate to realize the functions of the learning device 1.

[Learning program]
The various processes described in the above embodiments can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. An example of a computer that executes a learning program having the same functions as the above embodiments is therefore described below with reference to FIG. 20.

FIG. 20 is a diagram showing an example of the hardware configuration of a computer that executes the learning program according to the first and second embodiments. As shown in FIG. 20, the computer 100 includes an operation unit 110a, a speaker 110b, a camera 110c, a display 120, and a communication unit 130. The computer 100 further includes a CPU 150, a ROM 160, an HDD 170, and a RAM 180. These units 110 to 180 are connected via a bus 140.

As shown in FIG. 20, the HDD 170 stores a learning program 170a that provides the same functions as the second learning unit 10 described in the first embodiment. Like the components of the second learning unit 10 shown in FIG. 1, the learning program 170a may be integrated or separated. That is, the HDD 170 does not necessarily have to store all of the data described in the first embodiment; it suffices that the data used for processing is stored in the HDD 170.

In such an environment, the CPU 150 reads the learning program 170a from the HDD 170 and loads it into the RAM 180. As a result, the learning program 170a functions as a learning process 180a, as shown in FIG. 20. The learning process 180a loads various data read from the HDD 170 into the area of the RAM 180 allocated to the learning process 180a, and executes various processes using the loaded data. Examples of the processes executed by the learning process 180a include the processes shown in FIGS. 17 to 19. Note that not all of the processing units described in the first embodiment need to operate on the CPU 150; it suffices that the processing units corresponding to the processes to be executed are virtually realized.

The learning program 170a does not necessarily have to be stored in the HDD 170 or the ROM 160 from the beginning. For example, the learning program 170a may be stored on a "portable physical medium" inserted into the computer 100, such as a flexible disk (so-called FD), a CD-ROM, a DVD disk, a magneto-optical disk, or an IC card, and the computer 100 may acquire the learning program 170a from such a portable physical medium and execute it. Alternatively, the learning program 170a may be stored in another computer or a server device connected to the computer 100 via a public line, the Internet, a LAN, a WAN, or the like, and the computer 100 may acquire the learning program 170a from there and execute it.

The following appendices are further disclosed with respect to the embodiments, including the examples described above.

(Appendix 1) A learning method for performing machine learning of a model that generates a summary sentence from an input sentence, the learning method being characterized in that a computer executes processing of:
acquiring an input sentence and a correct summary sentence;
generating a pseudo sentence in which an ungrammatical expression is simulated by rearranging the word order of the words included in the correct summary sentence; and
updating parameters of the model based on a generation probability of the pseudo sentence, which is the probability that the model generates the pseudo sentence from the input sentence, and a generation probability of the correct summary sentence, which is the probability that the model generates the correct summary sentence from the input sentence.

(Appendix 2) The learning method according to Appendix 1, wherein the updating process updates the parameters of the model so that the generation probability of the correct summary sentence becomes higher than the generation probability of the pseudo sentence.

(Appendix 3) The learning method according to Appendix 2, wherein, when the generation probability of the pseudo sentence is higher than the generation probability of the correct summary sentence, the updating process adds the difference between the generation probability of the pseudo sentence and the generation probability of the correct summary sentence to a loss and updates the parameters of the model, and, when the generation probability of the pseudo sentence is not higher than the generation probability of the correct summary sentence, the updating process updates the parameters of the model without adding that difference to the loss.

(Appendix 4) The learning method according to Appendix 1, wherein the computer further executes processing of:
calculating, for each of a plurality of summary sentences generated by inputting the input sentence into the model, a generation probability of the summary sentence, which is the probability that the model generates the summary sentence from the input sentence; and
calculating, for each of the plurality of summary sentences, a degree of word overlap between the summary sentence and the correct summary sentence,
and wherein the updating process updates the parameters of the model based on the generation probability calculated for each of the plurality of summary sentences, the degree of word overlap calculated for each of the plurality of summary sentences, the generation probability of the pseudo sentence, and the generation probability of the correct summary sentence.

(Appendix 5) The learning method according to Appendix 1, wherein the generating process generates the pseudo sentence by rearranging the word order of the words included in the correct summary sentence without changing the number of words.

(Appendix 6) A learning program for causing machine learning of a model that generates a summary sentence from an input sentence to be executed, the learning program being characterized by causing a computer to execute processing of:
acquiring an input sentence and a correct summary sentence;
generating a pseudo sentence in which an ungrammatical expression is simulated by rearranging the word order of the words included in the correct summary sentence; and
updating parameters of the model based on a generation probability of the pseudo sentence, which is the probability that the model generates the pseudo sentence from the input sentence, and a generation probability of the correct summary sentence, which is the probability that the model generates the correct summary sentence from the input sentence.

(Appendix 7) The learning program according to Appendix 6, wherein the updating process updates the parameters of the model so that the generation probability of the correct summary sentence becomes higher than the generation probability of the pseudo sentence.

(Appendix 8) The learning program according to Appendix 7, wherein, when the generation probability of the pseudo sentence is higher than the generation probability of the correct summary sentence, the updating process adds the difference between the generation probability of the pseudo sentence and the generation probability of the correct summary sentence to a loss and updates the parameters of the model, and, when the generation probability of the pseudo sentence is not higher than the generation probability of the correct summary sentence, the updating process updates the parameters of the model without adding that difference to the loss.

(Appendix 9) The learning program according to Appendix 6, wherein the learning program further causes the computer to execute processing of:
calculating, for each of a plurality of summary sentences generated by inputting the input sentence into the model, a generation probability of the summary sentence, which is the probability that the model generates the summary sentence from the input sentence; and
calculating, for each of the plurality of summary sentences, a degree of word overlap between the summary sentence and the correct summary sentence,
and wherein the updating process updates the parameters of the model based on the generation probability calculated for each of the plurality of summary sentences, the degree of word overlap calculated for each of the plurality of summary sentences, the generation probability of the pseudo sentence, and the generation probability of the correct summary sentence.

(Appendix 10) The learning program according to Appendix 6, wherein the generating process generates the pseudo sentence by rearranging the word order of the words included in the correct summary sentence without changing the number of words.

(Appendix 11) A learning device that performs machine learning of a model that generates a summary sentence from an input sentence, the learning device being characterized by comprising:
an acquisition unit that acquires an input sentence and a correct summary sentence;
a pseudo sentence generation unit that generates a pseudo sentence in which an ungrammatical expression is simulated by rearranging the word order of the words included in the correct summary sentence; and
an update unit that updates parameters of the model based on a generation probability of the pseudo sentence, which is the probability that the model generates the pseudo sentence from the input sentence, and a generation probability of the correct summary sentence, which is the probability that the model generates the correct summary sentence from the input sentence.

(Appendix 12) The learning device according to Appendix 11, wherein the update unit updates the parameters of the model so that the generation probability of the correct summary sentence becomes higher than the generation probability of the pseudo sentence.

(Appendix 13) The learning device according to Appendix 12, wherein, when the generation probability of the pseudo sentence is higher than the generation probability of the correct summary sentence, the update unit adds the difference between the generation probability of the pseudo sentence and the generation probability of the correct summary sentence to a loss and updates the parameters of the model, and, when the generation probability of the pseudo sentence is not higher than the generation probability of the correct summary sentence, the update unit updates the parameters of the model without adding that difference to the loss.

(Appendix 14) The learning device according to Appendix 11, further comprising:
a probability calculation unit that calculates, for each of a plurality of summary sentences generated by inputting the input sentence into the model, a generation probability of the summary sentence, which is the probability that the model generates the summary sentence from the input sentence; and
an overlap degree calculation unit that calculates, for each of the plurality of summary sentences, a degree of word overlap between the summary sentence and the correct summary sentence,
wherein the update unit updates the parameters of the model based on the generation probability calculated for each of the plurality of summary sentences, the degree of word overlap calculated for each of the plurality of summary sentences, the generation probability of the pseudo sentence, and the generation probability of the correct summary sentence.

(Appendix 15) The learning device according to Appendix 11, wherein the pseudo sentence generation unit generates the pseudo sentence by rearranging the word order of the words included in the correct summary sentence without changing the number of words.
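By way of a non-limiting illustration of how the quantities named in Appendices 3 and 4 might be combined into a single loss, the following Python sketch computes a degree of word overlap for each sampled summary, weights it by the normalized generation probability, and adds the pseudo-sentence penalty. The clipped unigram overlap measure, the normalization over the sampled summaries, the unweighted sum of the two terms, and all function names are assumptions made for this sketch; the appendices state only which quantities the update is based on.

```python
from collections import Counter


def word_overlap(candidate, reference):
    """Degree of word overlap: clipped unigram matches divided by the
    reference length (an assumed, ROUGE-1-recall-like measure)."""
    if not reference:
        return 0.0
    cand, ref = Counter(candidate), Counter(reference)
    matched = sum(min(cand[w], n) for w, n in ref.items())
    return matched / len(reference)


def expected_overlap_loss(candidates, probs, reference):
    """Negative expected overlap under the generation probabilities,
    normalized over the sampled summaries (assumption of this sketch)."""
    total = sum(probs)
    if total == 0.0:
        return 0.0
    return -sum(
        (p / total) * word_overlap(c, reference)
        for c, p in zip(candidates, probs)
    )


def combined_loss(candidates, probs, reference, p_pseudo, p_reference):
    """Overlap-based loss plus a penalty that is non-zero only when the
    pseudo sentence is more probable than the correct summary sentence."""
    penalty = max(0.0, p_pseudo - p_reference)
    return expected_overlap_loss(candidates, probs, reference) + penalty
```

With this arrangement, lowering the loss raises the probability of sampled summaries that share many words with the correct summary sentence, while the penalty term discourages reordered pseudo sentences that would otherwise score equally well on word overlap.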

1 Learning device
2 Learning data storage unit
3 First model storage unit
5 First learning unit
5I Input control unit
6 Model execution unit
7 Update unit
8 Second model storage unit
10 Second learning unit
10I Input control unit
11 Model execution unit
12 Summary generation unit
13 First probability calculation unit
14 Overlap degree calculation unit
15 First loss calculation unit
16 Pseudo sentence generation unit
17 Second probability calculation unit
18 Second loss calculation unit
19 Update unit

Claims

A learning method for performing machine learning of a model that generates a summary sentence from an input sentence, the learning method being characterized in that a computer executes processing of:
acquiring an input sentence and a correct summary sentence;
generating a pseudo sentence in which an ungrammatical expression is simulated by rearranging the word order of the words included in the correct summary sentence; and
updating parameters of the model based on a generation probability of the pseudo sentence, which is the probability that the model generates the pseudo sentence from the input sentence, and a generation probability of the correct summary sentence, which is the probability that the model generates the correct summary sentence from the input sentence.

The learning method according to claim 1, wherein the updating process updates the parameters of the model so that the generation probability of the correct summary sentence becomes higher than the generation probability of the pseudo sentence.

The learning method according to claim 2, wherein, when the generation probability of the pseudo sentence is higher than the generation probability of the correct summary sentence, the updating process adds the difference between the generation probability of the pseudo sentence and the generation probability of the correct summary sentence to a loss and updates the parameters of the model, and, when the generation probability of the pseudo sentence is not higher than the generation probability of the correct summary sentence, the updating process updates the parameters of the model without adding that difference to the loss.

The learning method according to any one of claims 1 to 3, wherein the computer further executes processing of:
calculating, for each of a plurality of summary sentences generated by inputting the input sentence into the model, a generation probability of the summary sentence, which is the probability that the model generates the summary sentence from the input sentence; and
calculating, for each of the plurality of summary sentences, a degree of word overlap between the summary sentence and the correct summary sentence,
and wherein the updating process updates the parameters of the model based on the generation probability calculated for each of the plurality of summary sentences, the degree of word overlap calculated for each of the plurality of summary sentences, the generation probability of the pseudo sentence, and the generation probability of the correct summary sentence.

The learning method according to any one of claims 1 to 4, wherein the generating process generates the pseudo sentence by rearranging the word order of the words included in the correct summary sentence without changing the number of words.

A learning program for causing machine learning of a model that generates a summary sentence from an input sentence to be executed, the learning program being characterized by causing a computer to execute processing of:
acquiring an input sentence and a correct summary sentence;
generating a pseudo sentence in which an ungrammatical expression is simulated by rearranging the word order of the words included in the correct summary sentence; and
updating parameters of the model based on a generation probability of the pseudo sentence, which is the probability that the model generates the pseudo sentence from the input sentence, and a generation probability of the correct summary sentence, which is the probability that the model generates the correct summary sentence from the input sentence.

A learning device that performs machine learning of a model that generates a summary sentence from an input sentence, the learning device being characterized by comprising:
an acquisition unit that acquires an input sentence and a correct summary sentence;
a pseudo sentence generation unit that generates a pseudo sentence in which an ungrammatical expression is simulated by rearranging the word order of the words included in the correct summary sentence; and
an update unit that updates parameters of the model based on a generation probability of the pseudo sentence, which is the probability that the model generates the pseudo sentence from the input sentence, and a generation probability of the correct summary sentence, which is the probability that the model generates the correct summary sentence from the input sentence.