JP6337183B1 - Text extraction device, comment posting device, comment posting support device, playback terminal, and context vector calculation device

Info

Publication number: JP6337183B1
Application number: JP2017122042A
Authority: JP
Inventors: 優理小田桐; 一浩服部
Original assignee: Dwango Co Ltd
Current assignee: Dwango Co Ltd
Priority date: 2017-06-22
Filing date: 2017-06-22
Publication date: 2018-06-06
Anticipated expiration: 2037-06-22
Also published as: JP2019008440A

Abstract

Problem: To extract a text group that suggests a context similar to that of a target text group. Solution: A text extraction device 100 includes a context vector calculation unit 102, a similarity calculation unit 104, and a search unit 107. The context vector calculation unit 102 calculates a first context vector of a target text group including a plurality of texts, using a machine-learned neural network. The similarity calculation unit 104 calculates the similarity between the first context vector and each of the second context vectors of a plurality of candidate text groups, each of which includes one or more texts. Based on the similarity, the search unit 107 searches for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains similar text groups. Selected drawing: FIG. 1

Description

The present invention relates to a technique for extracting text suited to the context of content.

Conventionally, some moving image distribution services allow users to post comments on moving images. The comments are displayed superimposed on the moving image, which strengthens the sense of unity among users viewing the same moving image and helps liven up the mood. The number of comments posted on a moving image can also be used as one index for evaluating how lively (popular) that moving image is.

A moving image that has only just been distributed may have few comments and little excitement, so its playback count may be slow to grow. Users find it easy to post comments on a moving image that already has many comments, but tend to hesitate to post on one that has received few comments so far. In addition, a user typically posts a comment by entering text directly with an input device such as a keyboard (including a software keyboard), a mouse, a touch screen, or a remote controller. Users may therefore be reluctant to post comments unless they are in a situation where such an input device can be operated comfortably.

If formulaic comments such as "www" (laughter) or "乙" ("otsu", roughly "good work") were generated and posted automatically, the number of comments on a moving image could easily be increased. However, what is wanted is not such bland comments but comments that fit the context of the moving image at each moment and invite the empathy of users.

Patent Document 1 discloses a technique for generating a vector from a regular comment on target moving image data and extracting other regular comments that are similar with respect to that vector ([0042] to [0043]). It also discloses extracting regular comments on moving image data different from the target moving image data ([0050]).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2015-220610 (JP2015-220610A)

An object of the present invention is to extract a text group that suggests a context similar to that of a target text group.

According to a first aspect of the present invention, a text extraction device includes a context vector calculation unit, a similarity calculation unit, and a search unit. The context vector calculation unit calculates a first context vector of a target text group including a plurality of texts, using a machine-learned neural network. The similarity calculation unit calculates the similarity between the first context vector and each of the second context vectors of a plurality of candidate text groups, each of which includes one or more texts. Based on the similarity, the search unit searches for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains similar text groups.

According to a second aspect of the present invention, a comment posting device includes a context vector calculation unit, a similarity calculation unit, a search unit, and a transmission unit. The context vector calculation unit calculates a first context vector of a target text group including a plurality of texts associated with content, using a machine-learned neural network. The similarity calculation unit calculates the similarity between the first context vector and each of the second context vectors of a plurality of candidate text groups, each of which includes one or more texts. Based on the similarity, the search unit searches for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains similar text groups. The transmission unit transmits at least one of the texts included in the similar text groups so that it is posted as a comment associated with the content.

According to a third aspect of the present invention, a comment posting support device includes a context vector calculation unit, a similarity calculation unit, a search unit, and a transmission unit. The context vector calculation unit calculates a first context vector of a target text group including a plurality of texts associated with content, using a machine-learned neural network. The similarity calculation unit calculates the similarity between the first context vector and each of the second context vectors of a plurality of candidate text groups, each of which includes one or more texts. Based on the similarity, the search unit searches for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains similar text groups. The transmission unit transmits at least one of the texts included in the similar text groups, as a postable comment, to a playback terminal that plays back the content.

According to a fourth aspect of the present invention, a playback terminal includes a reception unit, a playback unit, a context vector calculation unit, a similarity calculation unit, a search unit, and a transmission unit. The reception unit receives content. The playback unit plays back the content. The context vector calculation unit calculates a first context vector of a target text group including a plurality of texts associated with the content, using a machine-learned neural network. The similarity calculation unit calculates the similarity between the first context vector and each of the second context vectors of a plurality of candidate text groups, each of which includes one or more texts. Based on the similarity, the search unit searches for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains similar text groups. The transmission unit transmits at least one of the texts included in the similar text groups so that it is posted as a comment associated with the content.

According to a fifth aspect of the present invention, a playback terminal includes a reception unit, a playback unit, a context vector calculation unit, a similarity calculation unit, a search unit, an output unit, and a transmission unit. The reception unit receives content. The playback unit plays back the content. The context vector calculation unit calculates a first context vector of a target text group including a plurality of texts associated with the content, using a machine-learned neural network. The similarity calculation unit calculates the similarity between the first context vector and each of the second context vectors of a plurality of candidate text groups, each of which includes one or more texts. Based on the similarity, the search unit searches for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains similar text groups. The output unit outputs at least one of the texts included in the similar text groups, as a postable comment, together with the content being played back. The transmission unit transmits the postable comment selected by the user so that it is posted as a comment associated with the content.

According to a sixth aspect of the present invention, a context vector calculation device includes a context vector calculation unit. The context vector calculation unit calculates a context vector of a target text group including a plurality of texts, using a machine-learned neural network. The neural network is configured with the result of machine learning performed using a plurality of items of learning data. Each item of learning data includes a pair of text groups as input data, and teacher data indicating whether the pair of text groups suggests similar contexts or dissimilar contexts.
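The learning data described in the sixth aspect can be pictured as labeled pairs of text groups. The following is a minimal sketch of that data shape only; the class name `TrainingExample`, its field names, the helper `split_by_label`, and the sample texts are all hypothetical and do not come from the patent, and no training procedure is implied.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrainingExample:
    """One item of learning data: a pair of text groups plus teacher data."""
    group_a: List[str]  # first text group of the input pair
    group_b: List[str]  # second text group of the input pair
    similar: bool       # teacher data: True if the pair suggests similar contexts


def split_by_label(
    examples: List[TrainingExample],
) -> Tuple[List[TrainingExample], List[TrainingExample]]:
    """Separate the examples into (similar, dissimilar) lists."""
    positives = [e for e in examples if e.similar]
    negatives = [e for e in examples if not e.similar]
    return positives, negatives


# Made-up sample data, for illustration only.
examples = [
    TrainingExample(["nice goal", "what a shot"], ["great save", "so close"], True),
    TrainingExample(["nice goal", "what a shot"], ["cute cat", "so fluffy"], False),
]
positives, negatives = split_by_label(examples)
```

A boolean label is the simplest reading of "similar or dissimilar"; a real data set might carry more information per pair.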

According to the present invention, a text group that suggests a context similar to that of a target text group can be extracted.

FIG. 1 is a block diagram illustrating a text extraction device according to an embodiment.
FIG. 2 is a diagram showing an example of a display screen on a user's playback terminal.
FIG. 3 is a diagram illustrating a target text group extracted from the display screen of FIG. 2.
FIG. 4 is an explanatory diagram of the context vector of the target text group calculated by the context vector calculation unit of FIG. 1.
FIG. 5 is a diagram illustrating similar text extracted based on the target text group of FIG. 2.
FIG. 6 is a diagram illustrating a similar text group extracted based on the target text group of FIG. 2.
FIG. 7 is a flowchart illustrating the operation of the text extraction device of FIG. 1.
FIG. 8 is a block diagram illustrating a comment posting device including the text extraction device of FIG. 1.
FIG. 9 is a block diagram illustrating a playback terminal including the text extraction device of FIG. 1.
FIG. 10 is a block diagram illustrating a content distribution system including the comment posting device of FIG. 8 and the playback terminal of FIG. 9.

Embodiments are described below with reference to the drawings. Elements that are the same as or similar to elements already described are given the same or similar reference numerals, and redundant description is basically omitted.

(Embodiment)
A text extraction device according to an embodiment of the present invention uses a machine-learned NN (neural network) to calculate a context vector that quantifies the context suggested by an input text group including a plurality of texts (hereinafter referred to as the target text group).

In this embodiment, a text may mean any unit of information made up of characters, such as a word, a phrase, a sentence, or a passage.

The target text group can be extracted, for example, from a plurality of comments displayed in synchronization with the playback of content, typically a moving image, distributed by a content distribution service.

Here, being displayed in synchronization with the playback of content means that a comment is displayed at a temporal position (time) determined by the playback position of the content. The text extraction device then searches a plurality of candidate text groups for those whose context vectors are similar to the context vector of the target text group (hereinafter referred to as similar text groups).

In this way, the text extraction device can extract, from the plurality of candidate text groups, similar text groups that suggest a context similar to that of the target text group. The extracted similar text groups may be used, for example, for automatic posting as new comments on the content from which the target text group was extracted, or as postable comments that support comment posting by a user viewing that content.

Note that the same processing may be applied, in addition to or instead of comments, to text related to metadata attached to content, such as a title, description, tags, and category. However, whereas the title, description, tags, and category are attached to the content as a whole, a comment is attached to a limited part of the content and therefore tends to reflect the local context of the content.

As illustrated in FIG. 1, the text extraction device 100 according to the embodiment includes a text group input unit 101, a context vector calculation unit 102, an NN 103, a similarity calculation unit 104, a candidate text group storage unit 105, a similarity storage unit 106, and a similar text group search unit 107.

The text extraction device 100 is typically a computer, but is not limited to this. The text extraction device 100 includes a processor that calculates context vectors, calculates similarities, searches for similar text groups, and so on, for example a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit), and a memory that temporarily stores the programs executed by the processor to realize such processing and the data used by the processor.

The text extraction device 100 can further use a communication device for connecting to a network, an input device for accepting user input, and an auxiliary storage device for storing programs or data. The communication device, input device, and auxiliary storage device may be built into the text extraction device 100 or attached to it externally.

The communication device communicates, via a network, with external devices such as a Web server, a content distribution server, and a comment distribution server.

The input device may be, for example, a keyboard, a mouse, a numeric keypad, a remote controller, or a microphone, or may also function as a display device, as a touch screen does. User input is typically text input or voice input. Voice input can be converted into text using ASR (Automatic Speech Recognition).

The auxiliary storage device stores, for example, programs or data. The auxiliary storage device is preferably a non-volatile storage medium such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The auxiliary storage device may also be a file server connected to the text extraction device 100 via a network.

The text group input unit 101 acquires a target text group and sends it to the context vector calculation unit 102. The text group input unit 101 can acquire the target text group by methods (1) to (3) described below, or by other methods. The text group input unit 101 may be the input device or communication device described above, or an interface to them, or may be the processor and memory described above.

(1) The text group input unit 101 may acquire, for example, a text group that the user has entered by operating the input device described above. If similar text groups extracted based on such a target text group are used for automatic posting, or as postable comments that support comment posting, the user can post, in addition to the comments entered directly, comments that suggest a similar context. This lets the user post many comments that fit the context at each moment on content he or she wants to liven up, with only a light input burden.

(2) The text group input unit 101 may acquire as the target text group, for example during playback of content, at least some of the comments displayed in synchronization with the playback of that content. The text group input unit 101 reads the comments from, for example, a buffer (memory). If similar text groups extracted based on such a target text group are used as postable comments that support comment posting, the user can be presented with postable comments that fit the context of the content being played back. The user can then post a comment simply by selecting a favorite from the presented postable comments, without having to compose any text. In other words, the user's psychological hurdle to posting a comment can be lowered.

Note that the texts eligible for extraction may be limited to comments displayed at positions (display times) that fall within a time window set relative to the playback position of the content at the time of extraction. A text group extracted in this way is likely to suggest the context of the content within that time window.

FIG. 2 shows an example of a display screen on a user's playback terminal. In the moving image display area, comments are displayed superimposed on the moving image. Each comment has a defined playback position (display time) at which it is displayed. In the example of FIG. 2, an area to the right of the moving image display area shows a table listing the comments and their display times. Above the moving image display area is an area that displays metadata such as the moving image title, the description, the moving image category indicating the type of moving image, and moving image tags, which are keywords that users have attached to the moving image. Below the moving image display area is an area that displays a seek bar indicating the current playback position of the moving image, and buttons for playback control such as play and stop.

In the example of FIG. 2, if the time window is set to, for example, the 5 seconds immediately before the current playback time (00:01:00), the text group input unit 101 can extract four comments as the target text group, as illustrated in FIG. 3: "ロンドネタ混ぜていくｗ", "夜の攻撃回数" ("number of attacks at night"), "廃課金兵どもめぇｗ", and "自分の出番は？" ("when is my turn?"). The moving image category "Sports" may be included as part of the target text group, or may be handled as additional attribute information of the content, separate from the target text group.
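The time-window extraction in this example can be sketched as a simple filter over comments and their display times. This is an illustrative sketch only: the `Comment` type, the function name `extract_target_group`, the inclusive window boundaries, and the sample comments are assumptions, not details taken from the patent.

```python
from typing import List, NamedTuple


class Comment(NamedTuple):
    display_time: float  # seconds from the start of the content
    text: str


def extract_target_group(
    comments: List[Comment], position: float, window: float = 5.0
) -> List[str]:
    """Return the texts of comments displayed in [position - window, position].

    Treating both ends of the window as inclusive is an assumption.
    """
    start = position - window
    return [c.text for c in comments if start <= c.display_time <= position]


# Made-up comments; 60.0 s corresponds to the playback time 00:01:00.
comments = [
    Comment(12.0, "first"),
    Comment(55.5, "a"),
    Comment(57.0, "b"),
    Comment(60.0, "c"),
    Comment(61.0, "later"),
]
target = extract_target_group(comments, position=60.0)
```

With a 5-second window ending at 60.0 s, only the comments displayed between 55.0 s and 60.0 s are kept.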

(3) The text group input unit 101 may receive, for example, information designating specific content distributed by a content distribution service, and acquire the text group associated with this content from a comment distribution server (for example, by receiving it via a network). If similar text groups extracted based on such a target text group are used for automatic posting, many comments that fit the context at each moment can be posted on the specific content to liven it up. The specific content may be, for example, newly posted content, content whose comment count is slow to grow, or content whose comment count has begun to fall.

Note that the texts eligible for extraction may be limited to comments displayed at positions (display times) that fall within a time window set relative to a specific playback position of the content. A text group extracted in this way is likely to suggest the context of the content within that time window. The specific playback position may be designated explicitly together with the specific content, may be predetermined, or may be determined automatically based on, for example, the distribution of the number of comments.
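The patent does not say how a playback position would be derived from the distribution of the number of comments. One possible reading, sketched here purely as an assumption, is to pick the fixed-length window that contains the most comments; the function name `densest_window_start` and the choice to start candidate windows at comment display times are hypothetical.

```python
from typing import List


def densest_window_start(display_times: List[float], window: float = 5.0) -> float:
    """Return the start of the length-`window` interval holding the most comments.

    Candidate intervals start at each comment's display time; ties keep the
    earliest start. This is one heuristic among many, not the patent's method.
    """
    times = sorted(display_times)
    if not times:
        raise ValueError("at least one comment is required")
    best_start, best_count = times[0], 0
    right = 0
    for left, start in enumerate(times):
        # Advance `right` past every comment inside [start, start + window].
        while right < len(times) and times[right] <= start + window:
            right += 1
        count = right - left
        if count > best_count:
            best_start, best_count = start, count
    return best_start


# Made-up display times in seconds.
start = densest_window_start([1.0, 2.0, 30.0, 31.0, 32.0, 33.0, 50.0])
```

Because the times are sorted, the two-pointer scan visits each comment at most twice, so the cost is dominated by the sort.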

The context vector calculation unit 102 receives the target text group from the text group input unit 101 and calculates the context vector of this target text group using the machine-learned NN 103. For example, the context vector calculation unit 102 may give each text included in the target text group to the NN 103, receive a context vector for each text from the NN 103, and combine these to calculate the context vector of the target text group. The context vector calculation unit 102 sends the context vector of the target text group to the similarity calculation unit 104. The context vector calculation unit 102 may be the processor and memory described above.

The NN 103 is a neural network configured with the learning result (trained model) of machine learning for acquiring the ability to convert input text into a context vector. This machine learning is described later. The NN 103 may be the processor (usually a GPU) and memory described above.

Let the target text group T be a set of n texts t[1], t[2], ..., t[n]. Here n is any natural number of 2 or more and may vary. In the example of FIG. 3, n = 4. As shown in FIG. 4, the texts t[1], t[2], ..., t[n] that make up the target text group T are each given to the machine-learned NN 103 (g) and converted into context vectors g[1], g[2], ..., g[n]. A context vector is a d-dimensional real vector, where d can be set by design.

The context vector calculation unit 102 calculates the context vector g[T] of the target text group T by combining these context vectors g[1], g[2], ..., g[n]. Combining here refers to an operation that merges a plurality of context vectors into one. Specifically, it may simply be calculating the sum of the context vectors, or it may be performing a convolution operation using the context vectors. In the following description, the context vector g[T] of the target text group T may be referred to as the first context vector for convenience.
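The simplest combination mentioned above, the element-wise sum of the per-text context vectors, can be sketched as follows. The NN itself is not modeled: the per-text vectors are made-up numbers standing in for the outputs g[1], ..., g[n], and the function name `combine_by_sum` is hypothetical.

```python
from typing import List, Sequence


def combine_by_sum(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-text context vectors into one vector by element-wise sum."""
    if not vectors:
        raise ValueError("at least one context vector is required")
    d = len(vectors[0])
    if any(len(v) != d for v in vectors):
        raise ValueError("all context vectors must have the same dimension d")
    return [sum(components) for components in zip(*vectors)]


# Placeholder outputs for n = 3 texts with d = 3 (not real NN outputs).
g = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, -1.0],
    [0.0, 2.0, 1.0],
]
g_T = combine_by_sum(g)
```

A plain sum grows with the number of texts n, which does not affect cosine similarity because that measure ignores vector length.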

The similarity calculation unit 104 receives the first context vector from the context vector calculation unit 102 and reads the context vectors of a plurality (M) of candidate text groups from the candidate text group storage unit 105. In the following description, the context vector of a candidate text group may be referred to as a second context vector for convenience.

M, the number of candidate text groups, is any natural number of 2 or more, but since it defines the search range for similar text groups, it is preferably reasonably large. These candidate text groups can be collected, for example, by extracting in advance the text groups associated with a large amount of content, using the same extraction method as for the target text group. The number of texts in each candidate text group is any natural number, and may differ from group to group or be the same for all groups. The context vectors of these candidate text groups can be calculated in advance by the same method as the first context vector. From these candidate text groups S[1], S[2], ..., S[M] and their context vectors g[S[1]], g[S[2]], ..., g[S[M]], a corpus for searching for similar text groups is constructed as follows.
Corpus = {(S[1], g[S[1]]), (S[2], g[S[2]]), ..., (S[M], g[S[M]])}
The similarity calculation unit 104 calculates the similarity of each of the M second context vectors to the first context vector. The similarity may be any index for evaluating the similarity between vectors; typically it is the cosine similarity, or the Euclidean distance between normalized context vectors. The cosine similarity of the second context vector g[S[i]] of the i-th candidate text group to the first context vector g[T] is defined as cos(arg(g[T], g[S[i]])), that is, the cosine of the angle between the two vectors. Here i is any natural number not exceeding M. The similarity calculation unit 104 stores, in the similarity storage unit 106, the similarity between the first context vector and the second context vector of each candidate text group, associated with information identifying that candidate text group (for example, the value of i). The similarity calculation unit 104 may be the processor and memory described above.
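The two similarity measures named above can be sketched in plain Python as follows. The function names, the 1-based index keys, and the sample vectors are assumptions for illustration. For unit-length vectors the two measures are linked by d^2 = 2 - 2 cos(theta), so ranking by descending cosine similarity matches ranking by ascending Euclidean distance.

```python
import math
from typing import Dict, List, Sequence


def _norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b."""
    na, nb = _norm(a), _norm(b)
    if na == 0.0 or nb == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def normalized_euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between a and b after scaling each to unit length."""
    na, nb = _norm(a), _norm(b)
    if na == 0.0 or nb == 0.0:
        raise ValueError("a zero vector cannot be normalized")
    return math.sqrt(sum((x / na - y / nb) ** 2 for x, y in zip(a, b)))


def score_candidates(
    first: Sequence[float], second_vectors: List[Sequence[float]]
) -> Dict[int, float]:
    """Map each candidate index i (1-based, as in S[i]) to its cosine similarity."""
    return {
        i: cosine_similarity(first, v)
        for i, v in enumerate(second_vectors, start=1)
    }


# Made-up vectors standing in for g[T] and g[S[1]], g[S[2]], g[S[3]].
g_T = [1.0, 0.0]
g_S = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
similarities = score_candidates(g_T, g_S)
```

The returned dictionary plays the role of the content written to the similarity storage unit 106: a similarity keyed by the identifier of each candidate text group.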

Note that it is not essential to calculate the similarity for all M second context vectors. For example, locality-sensitive hashing (LSH) is known as an algorithm for rapidly extracting pairs of similar instances from a large amount of data. This algorithm uses a hash function defined to take a vector as input and output a scalar value. The hash function tends to return the same scalar value for similar input vectors. Therefore, the scalar values of the M second context vectors may be calculated in advance and stored in the candidate text group storage unit 105; the scalar value of the first context vector is then calculated with the hash function, and only the second context vectors associated with that same scalar value are extracted and given to the similarity calculation unit 104. By calculating the similarity only for the extracted second context vectors, the similarity calculation unit 104 can greatly reduce the amount of calculation.
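The pre-filtering step can be sketched as follows. The text does not say which LSH family is used; this sketch assumes the random-hyperplane (sign-of-projection) family, which suits cosine similarity, and all names (`make_hash`, `build_buckets`, `candidates_for`) are hypothetical.

```python
import random
from collections import defaultdict

def make_hash(dim, n_bits=8, seed=0):
    """Return a hash function mapping a dim-dimensional vector to an integer.

    Assumption: random-hyperplane LSH. Each bit records on which side of a
    random hyperplane the vector lies, so vectors separated by a small angle
    tend to receive the same integer.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_vector(v):
        key = 0
        for plane in planes:
            bit = 1 if sum(p * x for p, x in zip(plane, v)) >= 0 else 0
            key = (key << 1) | bit
        return key

    return hash_vector

def build_buckets(vectors, hash_vector):
    """Precompute scalar value -> list of candidate indices (1..M)."""
    buckets = defaultdict(list)
    for i, v in enumerate(vectors, start=1):
        buckets[hash_vector(v)].append(i)
    return buckets

def candidates_for(query, buckets, hash_vector):
    """Indices of candidates whose scalar value equals that of the query."""
    return buckets.get(hash_vector(query), [])
```

Only the indices returned by `candidates_for` would then be passed to the similarity calculation. A larger `n_bits` makes buckets smaller and the filter stricter, at the cost of missing more true neighbors.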

The above-described corpus is constructed in the candidate text group storage unit 105, from which the similarity calculation unit 104 reads the corpus or the second context vectors. The similar text group search unit 107 also reads from it the candidate text groups that correspond to similar text groups. The candidate text group storage unit 105 may be the above-described auxiliary storage device.

The similarity calculation unit 104 writes the similarities (and the information identifying the corresponding candidate text groups) to the similarity storage unit 106, and the similar text group search unit 107 reads them from it. The similarity storage unit 106 may be the above-described memory.

Based on the similarities stored in the similarity storage unit 106, the similar text group search unit 107 searches for similar text groups, that is, candidate text groups that are similar to the target text group in terms of context vector. Specifically, the similar text group search unit 107 may search for the single candidate text group most similar to the target text group in terms of context vector, for a predetermined number of candidate text groups in order of closeness to the target text group, or for candidate text groups whose similarity satisfies a predetermined numerical condition.

Note that, depending on how the similarity is defined, a smaller value may mean that the first context vector and the second context vector are more similar. Specifically, a larger cosine similarity means that the two context vectors are more similar, whereas a smaller Euclidean distance between normalized context vectors means that they are more similar.
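The three search modes and the direction caveat can be combined in one small sketch. The function name, its parameters, and the dictionary of stored similarities are assumptions made for illustration; the text does not prescribe a particular data layout for the similarity storage unit 106.

```python
import heapq

def search_similar(scores, top_n=1, higher_is_closer=True, threshold=None):
    """Return candidate ids ordered from most to least similar.

    scores: {candidate_id: similarity value}.
    higher_is_closer: True for cosine similarity, False for a distance.
    threshold: optional numerical condition (>= for similarities,
    <= for distances).
    """
    items = scores.items()
    if threshold is not None:
        if higher_is_closer:
            items = [(i, s) for i, s in items if s >= threshold]
        else:
            items = [(i, s) for i, s in items if s <= threshold]
    pick = heapq.nlargest if higher_is_closer else heapq.nsmallest
    return [i for i, _ in pick(top_n, items, key=lambda kv: kv[1])]
```

With `top_n=1` this returns the single closest candidate; passing `higher_is_closer=False` switches to distance semantics without changing the caller's logic.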

The similar text group search unit 107 reads the one or more similar text groups it has found from the candidate text group storage unit 105 and outputs them. The similar text group search unit 107 may be the above-described processor and memory.

The output similar text groups may be used for automatic posting as new comments on the content from which the target text group was extracted, or as postable comments that support comment posting by a user viewing that content.

FIG. 6 illustrates similar text groups, more specifically similar comment groups, extracted based on the target text group of FIG. 3. In FIG. 6, five similar text groups are sorted in ascending order of the Euclidean distance between normalized context vectors, that is, from most to least similar to the target text group in terms of context vector.

Note that the search may be performed in units of candidate texts instead of candidate text groups. This makes it possible to extract similar texts, that is, candidate texts that are similar to the target text group in terms of context vector. This is equivalent to setting the number of texts in every candidate text group to 1. Alternatively, similar text groups may be extracted first, and similar texts may then be further extracted from the texts belonging to those similar text groups. Compared with searching in units of candidate texts from the start, this reduces the amount of calculation required for computing similarities and searching for similar texts.
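The two-stage alternative can be sketched as follows. The corpus layout (a group vector plus per-text vectors), the parameter names, and the requirement that `similarity` return larger values for closer vectors are all assumptions made for this illustration.

```python
def two_stage_search(g_t, corpus, similarity, top_groups=1, top_texts=3):
    """Rank groups first, then rank only the texts inside the best groups.

    corpus: list of (group_vector, [(text, text_vector), ...]).
    similarity: callable(u, v) -> float, larger meaning more similar.
    """
    # Stage 1: one similarity computation per candidate text group.
    ranked_groups = sorted(corpus, key=lambda g: similarity(g_t, g[0]), reverse=True)
    # Stage 2: per-text similarities only within the selected groups.
    pool = [item for _, items in ranked_groups[:top_groups] for item in items]
    pool.sort(key=lambda item: similarity(g_t, item[1]), reverse=True)
    return [text for text, _ in pool[:top_texts]]
```

If there are M groups and the selected groups contain k texts in total, this computes M + k similarities instead of one per text in the whole corpus.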

FIG. 5 illustrates similar texts, more specifically similar comments, extracted based on the target text group of FIG. 3. In FIG. 5, five similar texts are sorted in ascending order of the Euclidean distance between normalized context vectors, that is, from most to least similar to the target text group in terms of context vector.

In the examples of FIGS. 5 and 6, candidate texts (or text groups) extracted from content associated with the category "game" or "entertainment" are extracted as similar texts (or text groups), even though the content (moving image) from which the target text group was extracted is associated with the different category "sports". Thus, by using context-vector similarity as the criterion, even candidate texts (or text groups) extracted from moving images of different categories can become candidates for similar texts (or text groups).

The following describes the machine learning for acquiring the ability to convert input text into a context vector. The NN that performs this machine learning may be the same as or different from the NN 103. In either case, the result of this machine learning is set in the NN 103.

When a first text group and a second text group suggest similar contexts, this machine learning aims to calculate context vectors such that the context vector of the first text group and that of the second text group are similar, for example such that their cosine similarity is sufficiently larger than 0. Conversely, when the first text group and the second text group suggest dissimilar contexts, it aims to calculate context vectors such that the two context vectors are dissimilar, for example such that their cosine similarity is sufficiently smaller than 1.

To achieve this learning goal, machine learning is performed using first learning data (positive data) and second learning data (negative data). The first learning data uses a pair of text groups that suggest similar contexts as input data, and a similarity value indicating that the two text groups are similar (1 in the case of cosine similarity) as teacher data. The second learning data uses a pair of text groups that suggest dissimilar contexts as input data, and a similarity value indicating that the two text groups are dissimilar (0 in the case of cosine similarity) as teacher data. For this machine learning to yield valid context vectors, it is preferable to prepare a sufficient number of both the first and the second learning data; however, a reasonable effect can be expected even if one of the two cannot be prepared in sufficient quantity.
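One way to turn these teacher values into a training signal is sketched here. The text specifies only the inputs (pairs of text groups) and the targets (1 or 0 for cosine similarity); the squared-error loss, the function names, and the `encode` callable standing in for the NN being trained are assumptions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def pair_loss(g_a, g_b, label):
    """Squared error between the cosine similarity and the teacher value.

    label is 1 for positive data (similar contexts) and 0 for negative data.
    The choice of squared error is an assumption, not taken from the text.
    """
    return (cosine_similarity(g_a, g_b) - label) ** 2

def batch_loss(encode, pairs):
    """Mean loss over (text_group_a, text_group_b, label) triples.

    encode is a hypothetical callable mapping a text group to its context vector.
    """
    losses = [pair_loss(encode(a), encode(b), y) for a, b, y in pairs]
    return sum(losses) / len(losses)
```

In an actual training loop the parameters of the network behind `encode` would be updated to reduce `batch_loss`, for example by gradient descent in an automatic-differentiation framework.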

The input data included in the first and second learning data can be extracted, for example, from the enormous volume of comment data accumulated by the provider of a moving image distribution service that allows users to post comments on moving images.

A pair of text groups that suggest similar contexts can be extracted from comments posted on the same content. Note that the context may change depending on the playback position of the content. Therefore, extracting comments near an arbitrary playback position of the content and dividing them into two groups yields a pair of text groups that are likely to suggest similar contexts. Alternatively, such a pair can be extracted from comments posted on two contents associated with the same metadata (for example, belonging to the same category).

A pair of text groups that suggest dissimilar contexts can be extracted, for example, from comments posted on two contents associated with different metadata (for example, belonging to different categories), with one text group taken from each content.

In either case, the extracted text groups must not share any comment.
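The pair construction described above can be sketched as follows. The comment representation (a playback position in seconds plus the text), the window size, and the function names are assumptions; the random split is just one way to divide nearby comments into two groups that share no comment.

```python
import random

def positive_pair(comments, position, window, rng):
    """Split the comments near one playback position into two disjoint groups.

    comments: list of (playback_position_seconds, text) for a single content.
    """
    nearby = [text for pos, text in comments if abs(pos - position) <= window]
    rng.shuffle(nearby)
    half = len(nearby) // 2
    # Slicing one list at a single index guarantees no comment is in both groups.
    return nearby[:half], nearby[half:]

def negative_pair(comments_a, comments_b, size, rng):
    """Draw one group from each of two contents in different categories.

    comments_a, comments_b: lists of comment texts from the two contents.
    """
    return rng.sample(comments_a, size), rng.sample(comments_b, size)
```

Passing a seeded `random.Random` instance as `rng` keeps the generated learning data reproducible.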

Through such machine learning, the NN being trained becomes able to calculate context vectors such that, for example, for a pair of text groups extracted from comments posted on the same content, or from comments posted on two contents associated with the same metadata, the context vectors of the two text groups are similar. Conversely, the NN becomes able to calculate context vectors such that, for example, for a pair of text groups extracted respectively from comments posted on two contents associated with different metadata, the context vectors of the two text groups are dissimilar.

Hereinafter, an operation example of the text extraction device 100 will be described with reference to FIG. 7.
First, the text group input unit 101 acquires a target text group (step S201). As described above, the target text group can be acquired by various methods. The context vector calculation unit 102 uses the machine-learned NN 103 to calculate the first context vector of the target text group acquired in step S201 (step S202).

The similarity calculation unit 104 calculates the similarity of each of the second context vectors of the plurality of candidate text groups to the first context vector calculated in step S202 (step S203). Based on the similarities calculated in step S203, the similar text group search unit 107 searches for similar text groups, that is, candidate text groups that are similar to the target text group in terms of context vector (step S204), and the process ends.
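Steps S201 to S204 can be read as a single function. In this sketch, `encode` is a hypothetical callable standing in for the machine-learned NN 103, cosine similarity is chosen as the similarity index, and the corpus is assumed to be a list of (text group, context vector) pairs; none of these names come from the text.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def extract_similar_text_groups(target_texts, corpus, encode, top_n=1):
    """Return the top_n candidate text groups closest to the target text group.

    target_texts: the target text group, already acquired (S201).
    corpus: list of (text_group, precomputed_context_vector) pairs.
    encode: hypothetical stand-in for the machine-learned NN.
    """
    g_t = encode(target_texts)                                     # S202
    scored = [(cosine_similarity(g_t, g_s), texts) for texts, g_s in corpus]  # S203
    scored.sort(key=lambda pair: pair[0], reverse=True)            # S204
    return [texts for _, texts in scored[:top_n]]
```

Because the second context vectors are precomputed in the corpus, the network is evaluated only once per query, on the target text group.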

The text extraction device 100 of FIG. 1 may be incorporated in a comment posting device 300, as illustrated in FIG. 8. The comment posting device 300 includes a receiving unit 301, the text extraction device 100, and a transmission unit 302. The comment posting device 300 automatically generates and posts comments associated with content.

The comment posting device 300 is typically a computer, but is not limited thereto. The comment posting device 300 includes a processor (for example, a CPU and a GPU) that performs text extraction processing corresponding to the functions of the text extraction device of FIG. 1, communication control, and the like, and a memory that temporarily stores the program executed by the processor to realize such processing, the data used by the processor, and the like.

The comment posting device 300 can further use a communication device for connecting to a network. The communication device communicates via the network with external devices such as a playback terminal, a content distribution server, and a comment distribution server. The communication device may be built into the comment posting device 300 or attached to it externally.

The receiving unit 301 receives, for example, information specifying the content on which a comment is to be posted. Such content may be intentionally selected by a user or by the operator of the content distribution service, or may be selected automatically according to some algorithm. The receiving unit 301 can also receive a plurality of texts associated with the content, that is, candidates for the target text group or the target text group itself. Here, the plurality of texts may include metadata associated with the content, such as comments, a title, a description, tags, and a category. Note that if the comment posting device 300 of FIG. 8 is incorporated in a server that distributes metadata such as comments attached to content, the metadata need not be received from an external device. The receiving unit 301 sends the information specifying the content to the text extraction device 100. The receiving unit 301 may be the above-described communication device or an interface with that communication device.

The text extraction device 100 extracts, as the target text group, a plurality of texts associated with the content specified by the information received from the receiving unit 301, and then searches for similar text groups that are similar to this target text group in terms of context vector. The text extraction device 100 sends at least one of the texts included in the similar text groups to the transmission unit 302.

The transmission unit 302 receives at least one text from the text extraction device 100 and transmits it for posting, for example to a comment distribution server, as a comment associated with the content. The transmission unit 302 may be the above-described communication device or an interface with that communication device.

Note that the transmission unit 302 may instead transmit the at least one text received from the text extraction device 100, as a postable comment, to a playback terminal that plays back the content. With this modification, the comment posting device 300 of FIG. 8 can be configured as a comment posting support device that supports comment posting by users.
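The division of roles among the receiving unit 301, the text extraction device 100, and the transmission unit 302 can be sketched as a small composition of callables. The class name, field names, and signatures below are hypothetical; the sketch only mirrors the data flow described above and says nothing about how each unit is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class CommentPostingDevice:
    # Stand-in for obtaining the texts associated with a content id
    # (received by unit 301 or looked up locally on a metadata server).
    fetch_texts: Callable[[str], List[str]]
    # Stand-in for text extraction device 100: target texts -> similar texts.
    extract: Callable[[List[str]], List[str]]
    # Stand-in for transmission unit 302: (content id, comment text).
    send: Callable[[str, str], None]

    def post_for(self, content_id: str, max_comments: int = 1) -> List[str]:
        """Extract similar texts for the content and send up to max_comments."""
        target_texts = self.fetch_texts(content_id)
        similar_texts = self.extract(target_texts)
        chosen = similar_texts[:max_comments]
        for text in chosen:
            self.send(content_id, text)
        return chosen
```

Pointing `send` at a playback terminal instead of a comment distribution server would correspond to the comment posting support variant, since the same texts are then offered as postable comments rather than posted directly.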

The text extraction device 100 of FIG. 1 may be incorporated in a playback terminal 400, as illustrated in FIG. 9. The playback terminal 400 includes a receiving unit 401, a buffer 402, a playback unit 403, an output unit 404, a user input unit 405, the text extraction device 100, and a transmission unit 406.

The playback terminal 400 has one or both of the following functions: (1) a function of automatically generating and posting a comment on the content being played back, based on text that the user has input regarding that content; and (2) a function of automatically generating postable comments based on text that the user has input regarding the content being played back or on received text associated with that content, outputting them together with the content, and actually posting a postable comment in response to the user's selection.

The playback terminal 400 is typically a television receiver (including an Internet television), a PC (Personal Computer), a mobile terminal (for example, a tablet, smartphone, laptop, feature phone, portable game machine, digital music player, or electronic book reader), a VR (Virtual Reality) terminal, or an AR (Augmented Reality) terminal, but is not limited to these.

The playback terminal 400 includes a processor (for example, a CPU and a GPU) that performs content playback control, output control for an output device connected to the playback terminal 400, data processing for user input, text extraction processing corresponding to the functions of the text extraction device of FIG. 1, communication control, and the like, and a memory that temporarily stores the program executed by the processor to realize such processing, the data used by the processor, and the like.

The playback terminal 400 can further use a communication device for connecting to a network, an output device for outputting content and the like, an input device for accepting user input, and an auxiliary storage device for storing programs or data. The communication device, output device, input device, and auxiliary storage device may be built into the playback terminal 400 or attached to it externally.

The communication device communicates via the network with external devices such as a Web server, a content distribution server, a comment distribution server, and the comment posting device 300.

The output device may include a display device for displaying moving images, still images, text, and the like, and/or a speaker for outputting voice, music, and the like. The display device is, for example, a liquid crystal display, an organic EL (electroluminescence) display, or a CRT (Cathode Ray Tube) display. The display device displays display data including the content. Note that the display device may also function as an input device, as a touch screen does.

The input device may be, for example, a keyboard, a mouse, a numeric keypad, or a remote controller, or may also function as a display device, as a touch screen does. User input is typically a tap, a click, a press of a specific key, or the like. In addition, user input may include, for example, voice captured by a microphone, biometric data detected by a biometric sensor (for example, body temperature or facial expression), position data identified by GPS (Global Positioning System) or base station information, and user actions estimated from acceleration data detected by an acceleration sensor (for example, swinging the playback terminal 400 around).

The auxiliary storage device stores, for example, programs or data. The auxiliary storage device is preferably a non-volatile storage medium such as an HDD or an SSD. The auxiliary storage device may also be a file server connected to the playback terminal 400 via a network.

The receiving unit 401 receives the content to be played back, for example from a content distribution server. The content to be played back may be intentionally selected by the user or may be selected automatically according to some algorithm. The receiving unit 401 sends the received content to the buffer 402. The receiving unit 401 may further receive metadata attached to the content to be played back, for example from a comment distribution server or another metadata distribution server. This metadata may include text associated with the content to be played back, such as comments, a title, a description, tags, and a category. When the receiving unit 401 receives metadata, it sends the received metadata to the buffer 402. The receiving unit 401 may be the above-described communication device or an interface with that communication device.

The buffer 402 temporarily stores data from the receiving unit 401, that is, the received content and/or the received metadata. The buffer 402 sends the stored data to the playback unit 403 in response to a request from the playback unit 403. At this time, the buffer 402 may also send the stored metadata to the text extraction device 100. The buffer 402 may be the above-described memory.

The playback unit 403 reads data from the buffer 402 as necessary. The playback unit 403 plays back the content read from the buffer 402 and sends the played-back content to the output unit 404 at a predetermined timing. Playback of the content may include, for example, decoding encoded content such as video data and audio data, and generating display data based on the decoded video data and the metadata, for example by compositing comments onto the decoded video data. The playback unit 403 may receive user input related to content playback control from the user input unit 405 and, in response, play or stop the content or jump the playback state of the content to a designated playback position. The playback unit 403 may be the above-described processor and memory.

The output unit 404 outputs the played-back content from the playback unit 403, for example display data and/or audio data. Further, the output unit 404 may receive at least one text from the text extraction device 100 as a postable comment and output it together with the content. The output unit 404 may be the above-described output device or an interface with that output device.

The user input unit 405 receives various user inputs. For example, the user input unit 405 may receive user input related to content playback control and send it to the playback unit 403. The user input unit 405 may receive user input for posting text associated with the content being played back and send it to the transmission unit 406. Such user input is not limited to direct text input and may be a selection of a postable comment output by the output unit 404. Further, the user input unit 405 may send text that the user has directly input for posting as text associated with the content being played back to the text extraction device 100 as a candidate for the target text group. The user input unit 405 may be the above-described input device or an interface with that input device.

The text extraction device 100 may receive the metadata attached to the content being played back from the buffer 402, or may receive from the user input unit 405 the text that the user has directly input for posting as text associated with that content. In either case, the text extraction device 100 searches for similar text groups that are similar to the target text group in terms of context vector, and sends at least one of the texts included in the similar text groups either to the output unit 404 as a postable comment or to the transmission unit 406 for comment posting.

The transmission unit 406 receives at least one text from the text extraction device 100 and transmits it, for example to a comment distribution server, so that it is posted as a comment associated with the content being played back. The transmission unit 406 may also receive text directly input by the user and transmit it, for example to a comment distribution server or another metadata distribution server, so that it is posted as text associated with the content being played back. The transmission unit 406 may be the above-described communication device or an interface with that communication device.

The comment posting device 300 of FIG. 8 and the playback terminal 400 of FIG. 9 can be incorporated into the content distribution system illustrated in FIG. 10. This content distribution system includes the comment posting device 300, the playback terminal 400, a Web server 501, a content distribution server 502, and a comment distribution server 503.

The devices in FIG. 10 are connected to one another via a network and can exchange data with one another. Note that the number of each device in FIG. 10 is not limited to one and may be more than one. The server configuration shown in FIG. 10 is merely an example; the functions of one server may be shared among a plurality of devices, and the functions of a plurality of servers may be assigned to a single device.

When accessed by the playback terminal 400 via the network, the Web server 501 provides the playback terminal 400, via the network, with a web page containing a list of the content that can be distributed by the content distribution system of FIG. 10.

The user of the playback terminal 400 selects the content to be distributed from the list in the web page displayed on the playback terminal 400 by performing a predetermined operation such as a tap, a click, or a press of a specific key. On detecting such an operation, the playback terminal 400 transmits a distribution request for the content selected by the user to the Web server 501 via the network.

On receiving a content distribution request from the playback terminal 400 via the network, the Web server 501 returns to the playback terminal 400, via the network, information for acquiring the content (typically a URL (Uniform Resource Locator)) and information (typically a URL) for acquiring the comments (and/or other metadata) associated with the content.

On receiving this information from the Web server 501 via the network, the playback terminal 400 accesses the content distribution server 502 via the network using the information for acquiring the content, and accesses the comment distribution server 503 via the network using the information for acquiring the comments.

On receiving content and comment distribution requests, respectively, from the playback terminal 400 via the network, the content distribution server 502 and the comment distribution server 503 distribute the content and the comments associated with the information used by the playback terminal 400 to that terminal via the network.
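The request flow described above can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class names, the `example` host names, and the dictionary-backed storage are all assumptions introduced for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class WebServer:
    """Stands in for Web server 501: answers a distribution request with two URLs."""
    content_host: str = "https://content.example"    # hypothetical host
    comment_host: str = "https://comments.example"   # hypothetical host

    def handle_distribution_request(self, content_id: str) -> dict:
        return {
            "content_url": f"{self.content_host}/{content_id}",
            "comment_url": f"{self.comment_host}/{content_id}",
        }


@dataclass
class StoreServer:
    """Stands in for server 502 or 503: returns the data stored under a URL."""
    store: dict = field(default_factory=dict)

    def deliver(self, url: str):
        return self.store[url]


def play(content_id, web, content_server, comment_server):
    """Playback terminal 400: ask 501 for URLs, then fetch from 502 and 503."""
    info = web.handle_distribution_request(content_id)
    content = content_server.deliver(info["content_url"])
    comments = comment_server.deliver(info["comment_url"])
    return content, comments


web = WebServer()
content_server = StoreServer({"https://content.example/v1": b"video-bytes"})
comment_server = StoreServer({"https://comments.example/v1": ["nice", "lol"]})
content, comments = play("v1", web, content_server, comment_server)
```

Keeping the URL lookup on the Web server separate from the two delivery servers mirrors the division of roles in FIG. 10, which the text notes may be merged or split differently in practice.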

On receiving a content distribution request from the playback terminal 400 via the network, the content distribution server 502 identifies the content targeted by the request and distributes it to the playback terminal 400 via the network. The content distribution server 502 typically stores a large amount of recorded moving image data in a storage unit, in association with the information (typically URLs) for acquiring each item of content.

The content distribution server 502 may also have a live distribution function. That is, the content distribution server 502 may sequentially receive, via the network, moving image data transmitted from a transmission source not shown in the figure (for example, a computer connected to a video camera) and distribute that data via the network to the playback terminal 400 that is playing it, either as is or after any necessary processing. The text extraction device 100 can also treat text associated with such live-distributed content as a target text group. The comment posting device 300 and the playback terminal 400 can likewise automatically post, or assist in posting, comments associated with such live-distributed content.

On receiving a comment data distribution request from the playback terminal 400 via the network, the comment distribution server 503 identifies the comments targeted by the request and distributes them to the playback terminal 400 via the network. The comment distribution server 503 stores, in a storage unit, the comments associated with each item of content that can be distributed from the content distribution server 502, in association with the information (typically a URL) for acquiring those comments.

On receiving, via the network, a new comment posted by the playback terminal 400 or the comment posting device 300, the comment distribution server 503 adds the new comment and thereby updates the comments associated with the content. The comment distribution server 503 may automatically distribute the updated comments via the network to any playback terminal 400 to which the pre-update comments have already been distributed.

The playback terminal 400 displays the web page provided by the Web server 501 via the network, detects a content selection operation performed by the user while the web page is displayed, and transmits a distribution request for that content to the Web server 501 via the network.

The playback terminal 400 receives, from the Web server 501 via the network, the information for acquiring this content and the information for acquiring the comments associated with it, and uses these to access the content distribution server 502 and the comment distribution server 503, respectively, via the network.

While the playback terminal 400 is playing content, the comment distribution server 503 may distribute the latest comment data associated with that content. In this case, the playback terminal 400 may discard the old comments and control playback of the content using the latest comments.
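The update behaviour described for the comment distribution server 503 and the playback terminal 400 can be sketched as follows. The class and method names are hypothetical, and the direct method call from server to terminal stands in for whatever network push mechanism an actual system would use.

```python
class Terminal:
    """Stands in for playback terminal 400."""

    def __init__(self):
        self.comments = []

    def on_comments(self, latest):
        # Discard the old comments and keep only the latest set.
        self.comments = latest


class CommentServer:
    """Stands in for comment distribution server 503."""

    def __init__(self):
        self.comments = {}     # content ID -> list of comments
        self.delivered = {}    # content ID -> terminals already served

    def deliver(self, content_id, terminal):
        # Remember the terminal so later updates can be pushed to it.
        self.delivered.setdefault(content_id, []).append(terminal)
        terminal.on_comments(list(self.comments.get(content_id, [])))

    def post(self, content_id, comment):
        # Add the new comment, then redistribute the updated comments.
        self.comments.setdefault(content_id, []).append(comment)
        for terminal in self.delivered.get(content_id, []):
            terminal.on_comments(list(self.comments[content_id]))


server = CommentServer()
terminal = Terminal()
server.post("v1", "first")
server.deliver("v1", terminal)   # terminal now holds ["first"]
server.post("v1", "second")      # update is pushed to the terminal
```

Sending the full updated list, rather than only the new comment, is one way to match the "discard the old comments" behaviour; an incremental update would be an equally plausible design.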

The comment posting device 300 automatically posts comments to the comment distribution server 503 in response to a request from the playback terminal 400 or another device.

In the content distribution system of FIG. 10, the comment posting device 300 and the playback terminal 400 may be omitted, simplified, or modified. For example, the comment posting device 300 may be omitted, or may be modified into the comment posting support device described above. In the latter case, the comment posting support device may send postable comments to the playback terminal 400 in response to a request from the playback terminal 400 or another device. In addition, the playback terminal 400 need not have one or both of the following functions described above: (1) automatically generating and posting a comment on the content being played, based on text the user has entered about that content; and (2) automatically generating postable comments based on text the user has entered about the content being played, or on received text associated with that content, outputting them together with the content, and actually posting a postable comment selected by the user.

As described above, the text extraction device according to the embodiment calculates a context vector that quantifies the context suggested by a target text group including a plurality of texts. The text extraction device then searches a plurality of candidate text groups for a similar text group whose context vector is similar to that of the target text group. This text extraction device can therefore extract, from a plurality of candidate text groups, a similar text group suggesting a context similar to that of the target text group. The extracted similar text group may be used, for example, for automatic posting as new comments on the content from which the target text group was extracted, or as postable comments that assist a user viewing that content in posting comments.
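The summarized pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: a bag-of-words count vector stands in for the machine-learned neural network, per-text vectors are combined by summation, and cosine similarity is used as the similarity measure. None of these choices is specified in this passage, and all function names are hypothetical.

```python
import math
from collections import Counter


def text_vector(text):
    """Placeholder for the trained neural network: a bag-of-words count vector."""
    return Counter(text.lower().split())


def context_vector(texts):
    """Combine per-text vectors into one group vector (summation is an assumption)."""
    total = Counter()
    for text in texts:
        total.update(text_vector(text))
    return total


def cosine(a, b):
    """Cosine similarity between two sparse vectors (an assumed similarity measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_similar(target, candidates, top_k=1):
    """Return the top_k candidate text groups ranked by similarity to the target."""
    first = context_vector(target)
    ranked = sorted(
        candidates,
        key=lambda group: cosine(first, context_vector(group)),
        reverse=True,
    )
    return ranked[:top_k]


target = ["what a goal", "great goal", "amazing shot"]
candidates = [
    ["nice goal", "what a shot"],
    ["so sleepy", "good night"],
    ["cute cat"],
]
best = find_similar(target, candidates)
```

In this toy data only the first candidate group shares any words with the target, so it ranks first. A learned context vector would be expected to capture similarity beyond literal word overlap, which this placeholder cannot do.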

(Modification 1)
The text extraction device according to the embodiment can also be used to check whether the context suggested by a target text group is similar to the context suggested by a text of interest. In this modification, the context vector of the text of interest is used as the second context vector and, for example, if the similarity is greater than or equal to a threshold, the context suggested by the target text group is determined to be similar to the context suggested by the text of interest. The text of interest may be, for example, an NG (prohibited) word or other text subject to regulation. When the target text group suggests a context similar to such a text of interest, the operator of the content distribution service may be notified, posting of the target text group may be refused, the target text group may be deleted if it has already been posted, or use of the content distribution service by the posting user (for example, server access) may be restricted.
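The threshold check in this modification can be sketched as follows, assuming the context vectors have already been computed as plain lists of floats, cosine similarity as the similarity measure, and a threshold of 0.8. The threshold value, the function names, and the returned action labels are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def matches_text_of_interest(first_vector, interest_vectors, threshold=0.8):
    """True if the target context is similar to any text of interest."""
    return any(cosine(first_vector, v) >= threshold for v in interest_vectors)


def moderate(first_vector, interest_vectors, already_posted):
    """Pick one of the responses named in the text (labels are hypothetical)."""
    if not matches_text_of_interest(first_vector, interest_vectors):
        return "accept"
    return "delete" if already_posted else "reject"


interest_vectors = [[0.9, 0.1]]   # context vector of a regulated text (made up)
decision_new = moderate([1.0, 0.0], interest_vectors, already_posted=False)
decision_old = moderate([1.0, 0.0], interest_vectors, already_posted=True)
decision_ok = moderate([0.0, 1.0], interest_vectors, already_posted=False)
```

Notifying the operator or restricting the posting user's access, which the text also mentions, could be added as further branches; they are omitted here to keep the sketch short.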

(Modification 2)
Attribute information of the content may be used to determine the comment to be automatically posted to the content as a new comment, to determine the postable comments presented to a user viewing the content in order to assist comment posting, or to correct the first context vector. The attribute information of the content may be, for example, a local feature of the content near the temporal position at which a comment is automatically posted or a postable comment is presented (for example, whether a specific person, character, object, or place appears, or whether a specific piece of music, melody, or sound effect is playing), or which part of the whole content that temporal position corresponds to (for example, just after the start or just before the end). Metadata such as tags and categories can also be used as the attribute information of the content referred to here.

(Modification 3)
A context vector calculation device can also be configured by focusing on the context vector calculation unit and the NN of the text extraction device according to the embodiment. This context vector calculation device makes it possible to calculate a context vector that quantifies the context formed by an arbitrary text group. Context vectors may be useful in various applications, such as automatic classification of large volumes of comments and automatic analysis of the contextual structure of content.

The embodiments described above merely show specific examples to aid understanding of the concept of the present invention and are not intended to limit its scope. Various components may be added to, deleted from, or replaced in the embodiments without departing from the gist of the present invention.

The various functional units described in the above embodiments may be realized using circuits. A circuit may be a dedicated circuit that realizes a specific function or a general-purpose circuit such as a processor.

At least part of the processing of each of the above embodiments can also be realized by using a general-purpose computer as the basic hardware. A program that realizes the above processing may be provided stored on a computer-readable recording medium. The program is stored on the recording medium as a file in an installable or executable format. Examples of the recording medium include a magnetic disk, an optical disk (CD-ROM, CD-R, DVD, etc.), a magneto-optical disk (MO, etc.), and a semiconductor memory. Any recording medium may be used as long as it can store the program and can be read by a computer. The program that realizes the above processing may also be stored on a computer (server) connected to a network such as the Internet and downloaded to a computer (client) via the network.

DESCRIPTION OF SYMBOLS
100 ... Text extraction device
101 ... Text group input unit
102 ... Context vector calculation unit
103 ... NN
104 ... Similarity calculation unit
105 ... Candidate text group storage unit
106 ... Similarity storage unit
107 ... Similar text group search unit
300 ... Comment posting device
301, 401 ... Reception unit
302, 406 ... Transmission unit
400 ... Playback terminal
402 ... Buffer
403 ... Playback unit
404 ... Output unit
405 ... User input unit
501 ... Web server
502 ... Content distribution server
503 ... Comment distribution server

Claims

A context vector calculation unit that calculates a first context vector of a target text group including a plurality of mutually independent texts using a machine-learned neural network;
A similarity calculation unit that calculates the similarity between each of the second context vectors of the plurality of candidate text groups each including one or more texts and the first context vector;
A search unit that searches, based on the similarity, for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtains a similar text group, wherein
The plurality of texts included in the target text group are texts associated with first content,
The one or more texts included in each of the candidate text groups are texts associated with the first content or content different from the first content.
Text extraction device.

The text extraction device according to claim 1, wherein the context vector calculation unit applies each of the plurality of texts included in the target text group to the neural network to calculate a context vector for each text, and calculates the first context vector by synthesizing the context vectors of the individual texts.

The text extraction device, wherein at least one of the texts included in the target text group is extracted from at least one comment whose display time is determined in synchronization with reproduction of the first content.

The text extraction device according to claim 1, wherein at least one of the texts included in the target text group is extracted from, among at least one comment whose display time is determined in synchronization with reproduction of the first content, a comment whose display time falls within a time window set with reference to a specific playback position of the first content.

The text extraction device according to claim 1, wherein, when the candidate text group corresponding to the second context vector most similar to the first context vector includes a plurality of texts, the search unit further searches the context vectors of the individual texts included in that candidate text group for the one most similar to the first context vector, and obtains the text corresponding to that context vector.

At least one of the texts included in the target text group is text related to content,
The content attribute information is used to correct the first context vector;
The text extraction device according to any one of claims 1 to 5.

A context vector calculation unit that calculates a first context vector of a target text group including a plurality of texts associated with content using a machine-learned neural network;
A similarity calculation unit that calculates the similarity between each of the second context vectors of the plurality of candidate text groups each including one or more texts and the first context vector;
A search unit that searches one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector based on the similarity, and obtains similar text groups;
A comment posting device comprising: a transmission unit that transmits at least one of the texts included in the similar text group as a comment associated with the content.

A context vector calculation unit that calculates a first context vector of a target text group including a plurality of texts associated with content using a machine-learned neural network;
A similarity calculation unit that calculates the similarity between each of the second context vectors of the plurality of candidate text groups each including one or more texts and the first context vector;
A search unit that searches one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector based on the similarity, and obtains similar text groups;
A comment posting support apparatus comprising: a transmission unit that transmits at least one of texts included in the similar text group as a postable comment to a playback terminal that plays back the content.

A receiving unit for receiving content;
A playback unit for playing back the content;
A context vector calculation unit that calculates a first context vector of a target text group including a plurality of texts associated with the content using a machine-learned neural network;
A similarity calculation unit that calculates the similarity between each of the second context vectors of the plurality of candidate text groups each including one or more texts and the first context vector;
A search unit that searches one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector based on the similarity, and obtains similar text groups;
A playback terminal comprising: a transmitting unit that transmits at least one of the texts included in the similar text group as a comment associated with the content.

A receiving unit for receiving content;
A playback unit for playing back the content;
A context vector calculation unit that calculates a first context vector of a target text group including a plurality of texts associated with the content using a machine-learned neural network;
A similarity calculation unit that calculates the similarity between each of the second context vectors of the plurality of candidate text groups each including one or more texts and the first context vector;
A search unit that searches one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector based on the similarity, and obtains similar text groups;
An output unit that outputs at least one of the texts included in the similar text group as a postable comment together with the reproduced content;
A playback terminal comprising: a transmission unit that transmits a postable comment selected by a user as a comment associated with the content.

A context vector calculation unit that calculates a context vector of a target text group including a plurality of mutually independent texts using a machine-learned neural network, wherein
The neural network is set with the result of machine learning performed using a plurality of learning data,
Each of the plurality of learning data includes a pair of text groups as input data and teacher data indicating whether the pair of text groups suggest similar contexts or dissimilar contexts,
The plurality of texts included in the target text group are texts associated with first content,
One or more texts included in each of the text groups as the input data are texts associated with the first content or content different from the first content.
Context vector calculation device.

Causing a computer to function as:
Means for calculating a first context vector of a target text group including a plurality of mutually independent texts using a machine-learned neural network;
Means for calculating the similarity between each of the second context vectors of a plurality of candidate text groups each including one or more texts and the first context vector; and
Means for searching, based on the similarity, for one or more candidate text groups respectively corresponding to one or more second context vectors similar to the first context vector, and obtaining a similar text group, wherein
The plurality of texts included in the target text group are texts associated with first content,
The one or more texts included in each of the candidate text groups are texts associated with the first content or content different from the first content.
Text extraction program.