JP5439050B2

JP5439050B2 - Related content display device and computer program

Info

Publication number: JP5439050B2
Application number: JP2009148687A
Authority: JP
Inventors: 正啓柴田; 淳後藤; 勝宮崎; 英樹住吉
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2009-06-23
Filing date: 2009-06-23
Publication date: 2014-03-12
Anticipated expiration: 2029-06-23
Also published as: JP2011008334A

Description

The present invention relates to a related content display device and a computer program that, in a system in which a large number of video contents such as broadcast programs are stored and viewed, search for and present related content in order to help a user view desired video content at a time of the user's choosing.

The conventional search function offered by search services for web content and the like finds desired content when the user specifies search keywords. When a search is performed while viewing video content such as a broadcast program, however, entering text is a cumbersome operation, and the user may have to look away from the video content while typing. It is therefore desirable to enable search operations that require no text entry. To that end, a function that uses an arbitrary program, such as the program being viewed, as the search key and retrieves programs, or individual scenes, that are closely related in content is effective.

In digital broadcasting, text outlining the content of each program (hereinafter, "program summary text") is distributed as part of the electronic program guide (EPG). By quantifying and evaluating the relevance between the program summary text of the program serving as the search key and the program summary text of each search-target program stored in a database, programs highly relevant to the search-key program can be retrieved. With this function, the user does not need to enter search keywords explicitly. For example, simply by specifying the news program currently being watched as the search key, the user can retrieve special-feature programs or educational programs that explain the covered topics in more detail.

Meanwhile, Patent Literature 1 describes a technique for searching program information using a keyword that the user selects from among keywords related to a received broadcast program, together with a related keyword that the user selects from among the related keywords of that keyword.

Patent Literature 1: JP 2005-348071 A

As described above, one way to search for other content related to content such as a broadcast program without using keywords is a related content search function based on the vector space method, applied to text that describes the outline or substance of the content, such as program summary text. Programs with a high degree of relevance are then selected as related programs. Because the vector space method computes relevance from the co-occurrence of many words between the texts accompanying the content, its search results contain a wider variety of content than those of a keyword search, which targets the occurrence of a few keywords. This has the drawback of making it harder for the user to grasp the search results.

To improve the quality of the user interface when offering related content search, it is therefore necessary to show clearly from what viewpoint each related content item detected as a search result was judged to be related. Apart from displaying a keyword string, however, no effective technique existed for showing from what viewpoint the detected related content is related. Patent Literature 1 displays to the user keywords and related keywords for searching related program information and has the user select among them; it does not solve this problem.

The present invention has been made in view of these circumstances. Its object is to provide a related content display device and a computer program that, when presenting related content, can present in an easily understood manner information on the viewpoint from which that related content is related.

[1] One aspect of the present invention is a related content display device comprising: a text information storage unit that stores content identification information in association with text information representing the substance of the content; an inter-vector relevance analysis unit that calculates the relevance between vectors; a first content vector generation unit that generates a key content vector representing the linguistic features of the text information associated with specified content identification information; a second content vector generation unit that generates candidate content vectors representing the linguistic features of the text information associated with other content identification information; a related content selection unit that, based on the relevance between the key content vector and each candidate content vector as calculated by the inter-vector relevance analysis unit, selects a candidate content vector highly relevant to the key content vector as a related content vector; a co-occurrence vector generation unit that, based on the key content vector and the related content vector, generates a word co-occurrence vector representing the features of words that co-occur in the text information corresponding to the key content vector and the text information corresponding to the related content vector; a first sentence vector generation unit that, for each sentence contained in the text information corresponding to the related content vector, generates a related content sentence vector representing the linguistic features of that sentence; a related display sentence selection unit that, based on the relevance between the word co-occurrence vector and each related content sentence vector as calculated by the inter-vector relevance analysis unit, selects a related content sentence vector highly relevant to the word co-occurrence vector and identifies the sentence corresponding to the selected related content sentence vector as the related display sentence of the related content; and a related content output unit that outputs the related display sentence of the related content identified by the related display sentence selection unit.
According to this invention, related content is selected based on the linguistic similarity between the text describing the specified content and the text describing each content item that is a candidate for related content. Then, from among the sentences making up the text that describes the related content, the sentence that best represents the linguistic features common to the text describing the specified content and the text selected as related content is selected and output.
This makes it possible to present information on the viewpoint from which content is related to the content that the user is viewing on a content display device such as a television or personal computer, so the user can more easily find related content of interest.

[2] One aspect of the present invention is the related content display device described above, further comprising a second sentence vector generation unit that, for each sentence contained in the text information corresponding to the key content vector, generates a key content sentence vector representing the linguistic features of that sentence, wherein the related display sentence selection unit, based on the relevance between the word co-occurrence vector and each key content sentence vector as calculated by the inter-vector relevance analysis unit, selects a key content sentence vector highly relevant to the word co-occurrence vector and identifies the sentence corresponding to the selected key content sentence vector as the related display sentence of the key content, and the related content output unit outputs the related display sentence of the key content identified by the related display sentence selection unit.
According to this invention, from among the sentences making up the text that describes the specified content, the sentence that best represents the linguistic features common to that text and the text selected as related content is selected and output.
This makes it possible to present information on the viewpoint from which the content the user is currently viewing is related to the related content obtained as a search result, so the user can more easily find related content of interest.

[3] One aspect of the present invention is the related content display device described above, further comprising a content storage unit that stores content identification information in association with related information on the content, wherein the related content output unit reads from the content storage unit the related information associated with the same content identification information as the text information corresponding to the related content vector, and outputs the read related information, or information generated from it, together with the related display sentence.
According to this invention, information on the related content obtained by the search, or information produced by processing that information, is added to the related display sentence and output.
This allows the title, a thumbnail, and the like of the related content obtained as a search result to be presented on the content display device, so the user can more easily find related content of interest.

[4] One aspect of the present invention is a computer program that causes a computer used as a related content display device to function as: a text information storage unit that stores content identification information in association with text information representing the substance of the content; an inter-vector relevance analysis unit that calculates the relevance between vectors; a first content vector generation unit that generates a key content vector representing the linguistic features of the text information associated with specified content identification information; a second content vector generation unit that generates candidate content vectors representing the linguistic features of the text information associated with other content identification information; a related content selection unit that, based on the relevance between the key content vector and each candidate content vector as calculated by the inter-vector relevance analysis unit, selects a candidate content vector highly relevant to the key content vector as a related content vector; a co-occurrence vector generation unit that, based on the key content vector and the related content vector, generates a word co-occurrence vector representing the features of words that co-occur in the text information corresponding to the key content vector and the text information corresponding to the related content vector; a first sentence vector generation unit that, for each sentence contained in the text information corresponding to the related content vector, generates a related content sentence vector representing the linguistic features of that sentence; a related display sentence selection unit that, based on the relevance between the word co-occurrence vector and each related content sentence vector as calculated by the inter-vector relevance analysis unit, selects a related content sentence vector highly relevant to the word co-occurrence vector and identifies the sentence corresponding to the selected related content sentence vector as the related display sentence of the related content; and a related content output unit that outputs the related display sentence of the related content identified by the related display sentence selection unit.
According to this invention, related content is selected based on the linguistic similarity between the text describing the specified content and the text describing each content item that is a candidate for related content. Then, from among the sentences making up the text that describes the related content, the sentence that best represents the linguistic features common to the text describing the specified content and the text selected as related content is selected and output.
This makes it possible to present information on the viewpoint from which content is related to the content that the user is viewing on a content display device such as a television or personal computer, so the user can more easily find related content of interest.

According to the present invention, related content for the key content is searched for based on the relevance between texts that accompany the content and describe its substance. A sentence indicating the viewpoint from which the related content obtained by the search was judged to be related is extracted from the text accompanying that related content and can be presented to the user together with the related content in the search results. The user can thus learn from what viewpoint each related content item in the search results is related to the key content, and can more easily find related content of interest.

FIG. 1 is a functional block diagram of a related content display device according to one embodiment of the present invention.
FIG. 2 is a diagram showing an example of the data stored in the text information storage unit according to the embodiment.
FIG. 3 is a diagram showing an example of the data stored in the content storage unit according to the embodiment.
FIG. 4 is a diagram showing the processing flow of the related content display device according to the embodiment.
FIG. 5 is a diagram showing an example of a search result display screen according to the embodiment.

Embodiments of the present invention are described in detail below with reference to the drawings.
FIG. 1 is a functional block diagram showing the configuration of a related content display device 1 according to one embodiment of the present invention; only the functional blocks relevant to the invention are shown. In the figure, the related content display device 1 comprises a text information storage unit 2, a content storage unit 3, a related content search unit 4, an inter-vector relevance analysis unit 5, a search result analysis unit 6, and a related content output unit 7.

The text information storage unit 2 and the content storage unit 3 are implemented with a hard disk device, semiconductor memory, or the like. The text information storage unit 2 stores a content ID, which is identification information that identifies a content item, in association with text information data representing text that describes the substance of that content. The content storage unit 3 stores the content ID in association with content data, which is the digital data of the content itself.

The inter-vector relevance analysis unit 5 receives a set of vectors and calculates the relevance between them.
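This section does not state which relevance measure the inter-vector relevance analysis unit 5 uses. As a minimal sketch, assuming cosine similarity (a common choice for the vector space method, but an assumption here), the unit could look like the following. The function name `relevance` is hypothetical.

```python
import math
from typing import Sequence


def relevance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors.

    Assumption: the patent text only says "relevance between vectors";
    cosine similarity is used here for illustration.
    Returns 0.0 when either vector has zero length (no words in common
    can be measured).
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of elements")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

With non-negative weights such as TF-IDF values, this measure ranges from 0 (no shared words) to 1 (same direction).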

The related content search unit 4 comprises a first content vector generation unit 41, a second content vector generation unit 42, and a related content selection unit 43.
The first content vector generation unit 41 obtains from the text information storage unit 2 the text information data identified by the content ID serving as the search key. It assigns each word appearing in the text represented by that data a weight according to the word's importance and, from these weight values, generates a content vector that expresses the linguistic features of the text as a vector. Hereinafter, the content identified by the content ID serving as the search key is called the "key content", the text information data identified by that content ID is called the "key content text information data", and the content vector generated from the text represented by the key content text information data is called the "key content vector".

The second content vector generation unit 42 obtains from the text information storage unit 2 the text information data of the content items to be searched as candidates for content related to the key content. Using the same method as for the key content vector, it generates a content vector from each text represented by the obtained text information data. Hereinafter, content that is a candidate for related content is called "candidate content", the text information data of candidate content is called "candidate content text information data", and the content vector generated from the text represented by candidate content text information data is called a "candidate content vector".

The related content selection unit 43 obtains from the inter-vector relevance analysis unit 5 the relevance between the key content vector generated by the first content vector generation unit 41 and each candidate content vector generated by the second content vector generation unit 42. Based on the obtained relevance values, it selects related content from among the candidate content, and outputs the candidate content text information data and candidate content vector of each selected candidate to the search result analysis unit 6 as the related content text information data and the related content vector, respectively.
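The selection step can be sketched as a ranking over candidate content vectors. The text says only that related content is selected "based on" the relevance values, so the top-n cut-off, the minimum score, and the cosine measure below are all illustrative assumptions, as are the names `cosine` and `select_related`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity (assumed relevance measure); 0.0 for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def select_related(
    key_vec: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_n: int = 3,          # hypothetical cut-off, not given in the source
    min_score: float = 0.0,  # hypothetical threshold, not given in the source
) -> List[Tuple[str, float]]:
    """Return up to top_n (content_id, relevance) pairs, most relevant first."""
    scored = [(cid, cosine(key_vec, vec)) for cid, vec in candidates.items()]
    # Drop candidates that share no weighted words with the key content.
    scored = [(cid, s) for cid, s in scored if s > min_score]
    # Sort by descending relevance; break ties by content ID for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]
```

Each selected content ID then identifies the related content text information data and related content vector that are passed on for analysis.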

The search result analysis unit 6 comprises a co-occurrence vector generation unit 61, a first sentence vector generation unit 62, a second sentence vector generation unit 63, and a related display sentence selection unit 64.
The co-occurrence vector generation unit 61 generates, from the key content vector and the candidate content vector selected as the related content vector, a word co-occurrence vector that represents the features of the words contained in both the text represented by the key content text information data and the text represented by the related content text information data.
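This section does not give the formula for the word co-occurrence vector. One plausible sketch, offered purely as an assumption, takes the element-wise product of the two content vectors: an element is non-zero only when the word has a non-zero weight in both texts, which matches the description of "words contained in both". The function name `cooccurrence_vector` is hypothetical.

```python
from typing import List, Sequence


def cooccurrence_vector(
    key_vec: Sequence[float], related_vec: Sequence[float]
) -> List[float]:
    """Element-wise product of two content vectors.

    Assumption: the source only says the vector represents words common
    to both texts. With the product, element i is non-zero only if word i
    has a non-zero weight in both the key and the related content.
    """
    if len(key_vec) != len(related_vec):
        raise ValueError("vectors must have the same number of elements")
    return [a * b for a, b in zip(key_vec, related_vec)]
```

Other combinations with the same zero pattern, such as the element-wise minimum, would serve equally well as an illustration.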

For each sentence in the text represented by the related content text information data, the first sentence vector generation unit 62 generates a sentence vector that expresses the linguistic features of the sentence as a vector, using the same method as for the key content vector and the candidate content vectors. Hereinafter, a sentence vector generated from a sentence in the text represented by the related content text information data is called a "related content sentence vector".

For each sentence in the text represented by the key content text information data, the second sentence vector generation unit 63 generates a sentence vector that expresses the linguistic features of the sentence as a vector, using the same method as for the key content vector and the candidate content vectors. Hereinafter, a sentence vector generated from a sentence in the text represented by the key content text information data is called a "key content sentence vector".

The related display sentence selection unit 64 obtains from the inter-vector relevance analysis unit 5 the relevance between the word co-occurrence vector and each related content sentence vector, and between the word co-occurrence vector and each key content sentence vector. Based on the obtained relevance values, it selects, from the text represented by the related content text information data and from the text represented by the key content text information data respectively, a related display sentence, that is, the sentence that best shows the relationship between the key content and the related content.
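The sentence selection can be sketched as an argmax over sentence vectors. The sketch below assumes cosine similarity as the relevance measure and assumes the sentence vectors have already been built over the same vocabulary as the word co-occurrence vector; `pick_display_sentence` is a hypothetical name. The same function would be called once with the related content sentences and once with the key content sentences.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity (assumed relevance measure); 0.0 for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def pick_display_sentence(
    cooc_vec: Sequence[float],
    sentences: List[Tuple[str, Sequence[float]]],
) -> Optional[str]:
    """Return the sentence whose vector is most relevant to cooc_vec.

    Returns None when no sentence has a relevance above zero, that is,
    when no sentence contains any of the co-occurring words.
    """
    best_sentence: Optional[str] = None
    best_score = 0.0
    for text, vec in sentences:
        score = cosine(cooc_vec, vec)
        if score > best_score:
            best_sentence, best_score = text, score
    return best_sentence
```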

The related content output unit 7 obtains the title and content data of the related content from the content storage unit 3 as related information on the content. It then outputs the title, image data such as a thumbnail generated from the content data, and the related display sentence selected by the search result analysis unit 6, and causes them to be shown on the display of the content display device, such as a television receiver or personal computer, on which the user is viewing content.

The following describes the case where the text information storage unit 2, the content storage unit 3, the related content search unit 4, the inter-vector relevance analysis unit 5, the search result analysis unit 6, and the related content output unit 7 are provided on one or more servers connected to the content display device over a network. However, all of these units may instead be provided in the content display device, or any subset of them may be. For example, the text information storage unit 2 and the content storage unit 3 may be provided on the server and the other functional units in the content display device, or only the content storage unit 3 may be provided on the server and the other functional units in the content display device.

FIG. 2 shows an example of the data stored in the text information storage unit 2. As shown in the figure, the text information storage unit 2 stores a content ID in association with text information data L. Here, the text information data L is program summary text that describes the content of a broadcast program, such as the text of an electronic program guide (EPG), stored in file form. The program summary text consists of multiple sentences.

FIG. 3 shows an example of the data stored in the content storage unit 3. As shown in the figure, the content storage unit 3 stores a content ID, content data, and data indicating the title of the content in association with one another. The content data is, for example, a file of the moving image data and audio data that make up the program video of a broadcast program.
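The two stores described for FIG. 2 and FIG. 3 can be pictured as two lookup tables keyed by content ID. The sketch below is only an in-memory illustration: the record layout, the field names, the sample IDs, titles, and file paths are all hypothetical, and the source says only that the units are implemented with a hard disk device, semiconductor memory, or the like.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ContentRecord:
    """One row of the content storage unit 3 (field names are hypothetical)."""
    title: str
    data_path: str  # location of the video/audio file; illustrative only


# Text information storage unit 2: content ID -> program summary text.
text_store: Dict[str, str] = {
    "c001": "Today's top stories. A report on the economy.",
    "c002": "A documentary on how the economy affects households.",
}

# Content storage unit 3: content ID -> content data and title.
content_store: Dict[str, ContentRecord] = {
    "c001": ContentRecord(title="Evening News", data_path="/media/c001.ts"),
    "c002": ContentRecord(title="Economy Special", data_path="/media/c002.ts"),
}


def lookup(content_id: str) -> Tuple[str, ContentRecord]:
    """Fetch the summary text and the content record that share a content ID."""
    return text_store[content_id], content_store[content_id]
```

Because both tables share the content ID as the key, a content item selected through its summary text can be resolved directly to its title and content data for output.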

FIG. 4 shows the processing flow of the related content display device 1.
In the figure, the related content search unit 4 of the related content display device 1 receives from the content display device a search key in which the content ID of the key content is set (step S105). For example, the content display device sends the content ID of a broadcast program that the user has selected for viewing, a content ID entered by the user, or the content ID of the broadcast program that was being displayed when the user entered a search instruction. Here, the search key is assumed to carry the content ID of key content k.

The first content vector generation unit 41 of the related content search unit 4 obtains the content ID from the search key received as the search request in step S105, and reads from the text information storage unit 2 the key content text information data $L_k$, which is the text information data L identified by that content ID. Next, the second content vector generation unit 42 reads from the text information storage unit 2 the candidate content text information data $L_j$ ($1 \le j \le m$, $j \ne k$), that is, the text information data L other than the key content text information data $L_k$ read in step S110 (step S110). Here, m is the number of text information data L stored in the text information storage unit 2.
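The read step can be sketched as splitting the text store into the key entry and everything else. The sketch assumes the store is an in-memory mapping from content ID to summary text; the function name `split_key_and_candidates` and the sample IDs are hypothetical.

```python
from typing import Dict, Tuple


def split_key_and_candidates(
    text_store: Dict[str, str], key_id: str
) -> Tuple[str, Dict[str, str]]:
    """Return (L_k, {j: L_j for all j != k}) for the given key content ID.

    Raises KeyError if the key content ID is not in the store.
    """
    key_text = text_store[key_id]
    candidates = {cid: text for cid, text in text_store.items() if cid != key_id}
    return key_text, candidates


# Usage with a hypothetical three-entry store (m = 3).
store = {"k": "news about the economy", "a": "economy special", "b": "cooking show"}
key_text, candidates = split_key_and_candidates(store, "k")
```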

The first content vector generation unit 41 generates a content vector from the program summary text indicated by the key content text information data L_k (step S115). Likewise, the second content vector generation unit 42 generates a content vector from each of the program summary texts indicated by the candidate content text information data L_j (step S120).

The content vector w_p of the text indicated by the text information data L_p of a content p is expressed by (Equation 1) below.

$w_p = (w_{p1}, \ldots, w_{pi}, \ldots, w_{pN})$ (Equation 1)

The element w_pi is the importance of word i (1 ≦ i ≦ N) in the text indicated by the text information data L_p of content p: w_pi = 0 when word i does not appear in that text, and w_pi > 0 when it does. N is the number of distinct words appearing in the texts indicated by all of the text information data L. The content vector can be, for example, a vector representation based on TF/IDF (a method for evaluating word importance). This representation is described, for example, in Takenobu Tokunaga, "Information Retrieval and Language Processing" (情報検索と言語処理), University of Tokyo Press, Chapter 2, pp. 32-33, ISBN 4130654055.

In TF/IDF, given a document group consisting of a plurality (DN) of documents, the TF-IDF value of a word i in a document is calculated as TF(i) × log(DN / DF(i)), where TF(i) is the number of times word i (a keyword) appears in that document and DF(i) is the number of documents in the group that contain word i. The vector representing the features of each document has the TF-IDF values of the words as its elements. In other words, the element w_pi of the content vector w_p is the TF-IDF value of word i.
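The TF-IDF calculation described above can be sketched as follows. This is a minimal illustration, not the implementation in the patent: the function name `tf_idf_vectors`, the sample documents, and the use of the natural logarithm are assumptions (the text does not state the base of the logarithm), and each document is assumed to be already reduced to a list of keywords.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """docs: list of keyword lists, one per document (hypothetical input format).

    Returns the vocabulary (one word per vector element) and one vector per document.
    """
    vocabulary = sorted({word for doc in docs for word in doc})
    dn = len(docs)                                            # DN: number of documents
    df = Counter(word for doc in docs for word in set(doc))   # DF(i): documents containing word i
    vectors = []
    for doc in docs:
        tf = Counter(doc)                                     # TF(i): occurrences in this document
        vectors.append([tf[word] * math.log(dn / df[word]) for word in vocabulary])
    return vocabulary, vectors

# Made-up sample data for illustration only.
docs = [
    ["president", "summit", "summit"],
    ["president", "economy"],
    ["weather", "forecast"],
]
vocabulary, vectors = tf_idf_vectors(docs)
```

Note that with this plain formula a word contained in every document receives the value 0 even though it appears, because log(DN / DF(i)) is then log(1).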

Specifically, the first content vector generation unit 41 performs morphological analysis on the key content text information data L_k read in step S110, and the second content vector generation unit 42 performs morphological analysis on each of the program summary texts indicated by the candidate content text information data L_j read in step S110. The first content vector generation unit 41 determines the words to be used as keywords by extracting specific parts of speech, such as nouns, from the results of these analyses, and determines the correspondence between each word and a vector element. It then obtains the TF(i) value of each keyword i (i = 1 to N) from the program summary texts indicated by the key content text information data L_k and by each candidate content text information data L_j, together with the DF(i) value of word i. Taking DN to be the number of content text information data L stored in the text information storage unit 2, the first content vector generation unit 41 calculates the TF-IDF value of each word i for the key content text information data L_k, and the second content vector generation unit 42 calculates the TF-IDF value of each word i for each of the program summary texts indicated by the candidate content text information data L_j.
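The keyword selection step can be illustrated with a short sketch. It assumes that a morphological analyzer (not shown, and not named in the source) has already produced `(surface, part_of_speech)` pairs; the function names, the `"noun"` tag, and the sample tokens are all hypothetical. The source numbers words from 1 to N, whereas this sketch uses 0-based positions.

```python
def extract_keywords(tokens, keep_pos=("noun",)):
    """Keep the surface forms whose part of speech is in keep_pos."""
    return [surface for surface, pos in tokens if pos in keep_pos]

def build_word_index(keyword_lists):
    """Map each distinct keyword to a vector element position 0..N-1."""
    vocabulary = sorted({word for keywords in keyword_lists for word in keywords})
    return {word: position for position, word in enumerate(vocabulary)}

# Hypothetical analyzer output for two program summary texts.
analyzed = [
    [("president", "noun"), ("visits", "verb"), ("summit", "noun")],
    [("summit", "noun"), ("ends", "verb"), ("early", "adverb")],
]
keyword_lists = [extract_keywords(tokens) for tokens in analyzed]
word_index = build_word_index(keyword_lists)
```

The resulting `word_index` fixes which vector element corresponds to which word, so that every content vector uses the same element order.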

Hereinafter, the content vector generated from the key content text information data L_k is called the key content vector w_k, and the content vector generated from a candidate content text information data L_j is called the candidate content vector w_j.

Next, the related content selection unit 43 selects one of the candidate content vectors w_j generated in step S120 and outputs the key content vector w_k and the selected candidate content vector w_j to the inter-vector relevance analysis unit 5. The inter-vector relevance analysis unit 5 substitutes the received key content vector w_k and candidate content vector w_j for w_a and w_b in (Equation 2) below and calculates the relevance R(w_k, w_j), which is a cosine measure. The more similar the two vectors are, the smaller the angle between them, and the larger the relevance value.

$R(w_a, w_b) = \dfrac{w_a \cdot w_b}{|w_a|\,|w_b|}$ (Equation 2)
Here, $w_a \cdot w_b$ is the inner product of the vectors w_a and w_b.
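As a rough illustration of (Equation 2), the cosine measure can be computed as below. The function name `relevance` is an assumption, and returning 0.0 for a zero vector is a choice made here for the sketch; the source does not say how that case is handled.

```python
import math

def relevance(w_a, w_b):
    """Cosine measure between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(w_a, w_b))        # inner product w_a . w_b
    norm_a = math.sqrt(sum(a * a for a in w_a))       # |w_a|
    norm_b = math.sqrt(sum(b * b for b in w_b))       # |w_b|
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a vector with no keywords is treated as unrelated
    return dot / (norm_a * norm_b)
```

Because all elements are non-negative in this setting, the value lies between 0 (no shared words) and 1 (same direction).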

The inter-vector relevance analysis unit 5 outputs the calculated relevance R(w_k, w_j) to the related content selection unit 43.
If there are candidate content vectors whose relevance to the key content vector w_k has not yet been calculated, the related content selection unit 43 selects one of them and repeats the above processing, thereby obtaining the relevance R(w_k, w_j) between the key content vector w_k and every candidate content vector w_j (step S125).

The related content selection unit 43 selects, from the relevance values R(w_k, w_j) obtained in step S125, either the single highest value or a predetermined number of the highest values (step S130). Here, one relevance R(w_k, w_j) is assumed to have been selected. The related content selection unit 43 identifies the candidate content vector w_j for which the selected relevance R(w_k, w_j) was calculated, and then identifies the candidate content text information data L_j from which that vector was generated and the content ID associated with that data. The related content selection unit 43 outputs the identified content ID, candidate content text information data L_j, and candidate content vector w_j to the search result analysis unit 6 as the content ID of the related content r, the related content text information data L_r, and the related content vector w_r, respectively, and also outputs the content ID of the key content k, the key content vector w_k, and the key content text information data L_k to the search result analysis unit 6 (step S135).
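Steps S125 and S130 amount to scoring every candidate against the key content vector and keeping the best ones. The following is a minimal sketch under stated assumptions: the names `cosine` and `select_related`, the content IDs, and the sample vectors are hypothetical, and candidates are held in a plain dictionary keyed by content ID.

```python
import math

def cosine(w_a, w_b):
    """Cosine measure; returns 0.0 if either vector is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(w_a, w_b))
    norm = math.sqrt(sum(a * a for a in w_a)) * math.sqrt(sum(b * b for b in w_b))
    return dot / norm if norm else 0.0

def select_related(key_vector, candidates, top_n=1):
    """candidates: dict mapping content ID -> candidate content vector.

    Returns up to top_n (content ID, relevance) pairs, highest relevance first.
    """
    scored = [(cosine(key_vector, vector), content_id)
              for content_id, vector in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(content_id, score) for score, content_id in scored[:top_n]]

# Made-up vectors for illustration only.
w_k = [1.0, 1.0, 0.0]
candidates = {"A": [1.0, 1.0, 0.0], "B": [0.0, 0.0, 1.0], "C": [1.0, 0.0, 0.0]}
related = select_related(w_k, candidates, top_n=2)
```

With `top_n=1` this corresponds to selecting the single most relevant content; a larger `top_n` corresponds to selecting a predetermined number of contents.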

The co-occurrence vector generation unit 61 of the search result analysis unit 6 generates the word co-occurrence vector w_kr from the received key content vector w_k = (w_k1, …, w_ki, …, w_kN) and related content vector w_r = (w_r1, …, w_ri, …, w_rN) according to (Equation 3) below (step S140).

$w_{kr} = (w_{k1} \cdot w_{r1}, \ldots, w_{ki} \cdot w_{ri}, \ldots, w_{kN} \cdot w_{rN})$ (Equation 3)

The word co-occurrence vector w_kr represents the features of the words contained in both the program summary text indicated by the key content text information data L_k and the program summary text indicated by the related content text information data L_r. In the word co-occurrence vector w_kr, the element of a word that does not co-occur is 0, and the element of a word that co-occurs is positive. The larger the value of a co-occurring word's element, the more important that word is in both program summary texts indicated by the key content text information data L_k and the related content text information data L_r.
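(Equation 3) is an element-wise product of the two content vectors. A small sketch, with a hypothetical function name and made-up values, shows why only words present in both texts keep a non-zero element:

```python
def co_occurrence_vector(w_k, w_r):
    """Element-wise product of the key and related content vectors."""
    if len(w_k) != len(w_r):
        raise ValueError("both vectors must cover the same N words")
    return [a * b for a, b in zip(w_k, w_r)]

# Made-up importance values for four words.
w_k = [0.5, 0.0, 2.0, 1.0]   # word 2 does not appear in the key content text
w_r = [1.5, 3.0, 0.0, 2.0]   # word 3 does not appear in the related content text
w_kr = co_occurrence_vector(w_k, w_r)
```

Here only the first and fourth words appear in both texts, so only those two elements of `w_kr` are non-zero.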

Next, the first sentence vector generation unit 62 generates, for each sentence s of the program summary text indicated by the related content text information data L_r, a related content sentence vector w_rs based on the importance of each word appearing in that sentence, using the same method as for the content vector (step S145). The related content sentence vector w_rs is expressed by (Equation 4) below.

$w_{rs} = (w_{rs1}, \ldots, w_{rsi}, \ldots, w_{rsN})$ (Equation 4)

The element w_rsi is the importance of word i (1 ≦ i ≦ N) in sentence s of the program summary text indicated by the related content text information data L_r: w_rsi = 0 when word i does not appear in sentence s, and w_rsi > 0 when it does.
When TF/IDF is used for the related content sentence vector, the first sentence vector generation unit 62 obtains the TF(i) value of each word i for each sentence s, based on the result of morphological analysis of the program summary text indicated by the related content text information data L_r, and generates the related content sentence vectors. The DF(i) value of each word i and the number of documents DN may be received from the related content search unit 4, or may be obtained by analyzing the text information data L stored in the text information storage unit 2.

Next, the related display sentence selection unit 64 selects one of the related content sentence vectors w_rs generated in step S145 and outputs the word co-occurrence vector w_kr and the selected related content sentence vector w_rs to the inter-vector relevance analysis unit 5. The inter-vector relevance analysis unit 5 substitutes the received word co-occurrence vector w_kr and related content sentence vector w_rs for w_a and w_b in (Equation 2) above, calculates the relevance R(w_kr, w_rs), and outputs it to the related display sentence selection unit 64.

If there are related content sentence vectors w_rs whose relevance to the word co-occurrence vector w_kr has not yet been calculated, the related display sentence selection unit 64 selects one of them and repeats the above processing, thereby obtaining the relevance R(w_kr, w_rs) between the word co-occurrence vector w_kr and every related content sentence vector w_rs (step S150).

The related display sentence selection unit 64 selects the highest of the relevance values R(w_kr, w_rs) obtained in step S150. It then selects, as the related display sentence of the related content r, the sentence of the program summary text from which the related content sentence vector w_rs with that relevance was generated (step S155). As a result, the sentence that best represents the features of the words common to the program summary text indicated by the key content text information data L_k and the program summary text indicated by the related content text information data L_r is selected from the program summary text indicated by the related content text information data L_r.
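Steps S145 to S155 can be sketched as picking the sentence whose vector is closest, by the cosine measure, to the word co-occurrence vector. The function names, the sample sentences, and the sentence vectors below are hypothetical; in the described device the sentence vectors would be generated from the morphological analysis of each sentence.

```python
import math

def cosine(w_a, w_b):
    """Cosine measure; returns 0.0 if either vector is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(w_a, w_b))
    norm = math.sqrt(sum(a * a for a in w_a)) * math.sqrt(sum(b * b for b in w_b))
    return dot / norm if norm else 0.0

def select_display_sentence(w_kr, sentences, sentence_vectors):
    """Return the sentence most relevant to the co-occurrence vector, and its score."""
    scores = [cosine(w_kr, vector) for vector in sentence_vectors]
    best = max(range(len(sentences)), key=lambda s: scores[s])
    return sentences[best], scores[best]

# Made-up data: two sentences of a program summary text and their sentence vectors.
w_kr = [0.75, 0.0, 0.0, 2.0]
sentences = ["Sentence about the summit.", "Sentence about the weather."]
sentence_vectors = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
chosen, score = select_display_sentence(w_kr, sentences, sentence_vectors)
```

The same function applies unchanged to the key content side, where the key content sentence vectors are compared against the same co-occurrence vector.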

Next, the second sentence vector generation unit 63 generates, for each sentence s of the program summary text indicated by the key content text information data L_k, a key content sentence vector w_ks = (w_ks1, …, w_ksi, …, w_ksN) based on the importance of each word appearing in that sentence, by the same processing as for the related content sentence vector w_rs (step S160). The element w_ksi is the importance of word i (1 ≦ i ≦ N) in sentence s of the program summary text indicated by the key content text information data L_k: w_ksi = 0 when word i does not appear in sentence s, and w_ksi > 0 when it does.

Next, the related display sentence selection unit 64 selects one of the key content sentence vectors w_ks generated in step S160 and outputs the word co-occurrence vector w_kr and the selected key content sentence vector w_ks to the inter-vector relevance analysis unit 5. The inter-vector relevance analysis unit 5 substitutes the received word co-occurrence vector w_kr and key content sentence vector w_ks for w_a and w_b in (Equation 2) above, calculates the relevance R(w_kr, w_ks), and outputs the result to the related display sentence selection unit 64.

If there are key content sentence vectors w_ks whose relevance to the word co-occurrence vector w_kr has not yet been calculated, the related display sentence selection unit 64 selects one of them and repeats the above processing, thereby obtaining the relevance R(w_kr, w_ks) between the word co-occurrence vector w_kr and every key content sentence vector w_ks (step S165).

The related display sentence selection unit 64 selects the highest of the relevance values R(w_kr, w_ks) obtained in step S165. It then selects, as the related display sentence of the key content k, the sentence of the program summary text from which the key content sentence vector w_ks with that relevance was generated (step S170). As a result, the sentence that best represents the features of the words common to the program summary text indicated by the key content text information data L_k and the program summary text indicated by the related content text information data L_r is selected from the program summary text indicated by the key content text information data L_k.

The related display sentence selection unit 64 outputs to the related content output unit 7 the content ID of the key content k, the content ID of the related content r, the related display sentence of the related content r selected in step S155, and the related display sentence of the key content k selected in step S170 (step S175).

The related content output unit 7 reads from the content storage unit 3 the title corresponding to the content ID of the key content k, reads the content data and title corresponding to the content ID of the related content r, and generates a still image such as a thumbnail from the content data of the related content r (step S180). The related content output unit 7 then generates screen data for a search result display screen that shows the title and related display sentence of the key content k and the title, related display sentence, and thumbnail of the related content r, and outputs the screen data to the content display device (step S185). The content display device shows the received screen data on its display. The user can thus see the related content r of the key content k currently being viewed, together with information on the viewpoint from which that related content r was judged to be similar.

If the related content selection unit 43 selects a plurality of related contents r in step S130, the processing of steps S140 to S180 is performed for each related content r. In step S185, screen data is then generated for a search result display screen that shows the related contents r in descending order of relevance.

FIG. 5 shows an example of the search result display screen displayed on the content display device. In the figure, the search result display screen shows the title g1 of the key content k and the related display sentences g2 and g3 of the key content k. The related display sentences g2 and g3 of the key content k are displayed in descending order of the relevance of the related content r to which they correspond. In the figure, the related display sentence g2 of the key content, which corresponds to the most relevant related content, is highlighted, and the title g4, related display sentence g5, and thumbnail g6 of the most relevant related content r are displayed.

When the user clicks the related display sentence g3 of the key content with a mouse or the like, the content display device sends the related content output unit 7 an instruction to display the related content with the second highest relevance. In this case, the related content output unit 7 generates screen data for a search result display screen in which the related display sentence g3 of the key content k is highlighted and which shows the title g1 of the key content k, the related display sentences g2 and g3 of the key content k, and the title g4, related display sentence g5, and thumbnail g6 of the related content r with the second highest relevance, and returns the screen data to the content display device.

Instead of obtaining the search key from the user's content display device, the related content display device 1 may receive, as the search key, the content ID of the content currently being distributed, or a content ID received from the content display device, from an external content distribution device that distributes content to the user's content display device. In this case, the related content output unit 7 may output the search result display screen to the content distribution device, and the content distribution device may transmit the screen to the content display device. The content distribution device may also include some or all of the functional units of the related content display device 1.

Although a broadcast program has been used above as an example of content, the content may be a moving image distributed over the Internet or the like, or may be a still image, text, audio data, or a combination of these.

In the above description, the words to be used as keywords are extracted from the results of morphological analysis when a content vector is generated. Alternatively, information on the keywords i and on the vector element corresponding to each word i may be stored in the text information storage unit 2 in advance.
Furthermore, the content vector generated from the program summary text indicated by each text information data may be stored in advance in the text information storage unit 2 in association with the content ID of that text information data. The content vector stored in association with the content ID received in step S105 is then read as the key content vector w_k, and the content vectors stored in association with the other content IDs are read as the candidate content vectors w_j.
Similarly, the sentence vectors generated from the individual sentences of the program summary text indicated by each text information data may be stored in advance in the text information storage unit 2 in association with the content ID of that text information data. The sentence vectors stored in association with the content ID of the related content r are then read as the related content sentence vectors w_rs, and the sentence vectors stored in association with the content ID of the key content k are read as the key content sentence vectors w_ks.
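The variation with vectors stored in advance can be pictured as a simple lookup keyed by content ID. This is only a sketch: the dictionary `vector_store`, the function `load_vectors`, and the content IDs are hypothetical stand-ins for whatever form the text information storage unit 2 actually takes.

```python
def load_vectors(vector_store, key_id):
    """Split precomputed content vectors into the key vector and the candidates."""
    key_vector = vector_store[key_id]
    candidates = {content_id: vector
                  for content_id, vector in vector_store.items()
                  if content_id != key_id}
    return key_vector, candidates

# Made-up precomputed content vectors, keyed by content ID.
vector_store = {"c001": [0.4, 0.0], "c002": [0.0, 1.1], "c003": [0.7, 0.2]}
w_k, candidates = load_vectors(vector_store, "c002")
```

Precomputing in this way trades storage for search-time work, since no morphological analysis is needed when a search key arrives.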

Although TF/IDF is used above to generate the content vectors, any method may be used that generates a vector representation of linguistic features based on the importance of each word appearing in the text. For example, an arbitrary set of texts, such as newspaper articles, may be analyzed in advance to determine the keywords and their weights; when such a word appears in the text information data, the weight corresponding to that word is used as the element of the content vector or the content sentence vector.
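The alternative with weights decided in advance can be sketched as follows. The weight table, the function name `weighted_vector`, and the sample words are hypothetical; how the weights are derived from the reference texts is not specified in the source and is not shown here.

```python
def weighted_vector(words_in_text, weights, vocabulary):
    """Use the predetermined weight for each vocabulary word that appears, else 0."""
    present = set(words_in_text)
    return [weights[word] if word in present else 0.0 for word in vocabulary]

# Hypothetical keyword weights, assumed to come from a prior analysis of reference texts.
weights = {"economy": 1.2, "president": 2.5, "summit": 3.0}
vocabulary = sorted(weights)
vector = weighted_vector(["summit", "president", "visit"], weights, vocabulary)
```

Words outside the predetermined keyword set, such as "visit" here, simply do not contribute to the vector.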

According to this embodiment, other related content is retrieved based on the relevance between the texts that accompany each content and describe what it contains. A sentence indicating the viewpoint from which the retrieved content was judged to be related is extracted from the text accompanying that content and presented to the user together with the retrieved content. The user can therefore understand from what viewpoint the content in the search result is related.

Conventionally, when related content was searched for, only keywords such as "President ○○" and "Prime Minister △△" were presented to the user. In this embodiment, by contrast, sentences containing many of the characteristic keywords shared by the key content and the search result content are presented as well, such as "President ○○ made preparations for a meeting with Prime Minister △△." or "President ○○'s meeting with Prime Minister △△ did not take place." In particular, negative expressions such as "did not take place" often cannot be retrieved when only keywords are displayed. In this embodiment, because the relationship between contents is shown by sentences, the user can obtain information that a mere list of keywords would not provide, which makes it easier to find content of interest.

The related content display device 1 described above contains a computer system. The operations of the related content search unit 4, the inter-vector relevance analysis unit 5, the search result analysis unit 6, and the related content output unit 7 of the related content display device 1 are stored as a program on a computer-readable recording medium, and the processing described above is carried out when the computer system reads and executes this program. The computer system here includes a CPU, various memories, an OS, and hardware such as peripheral devices.

The "computer system" also includes a homepage providing environment (or display environment) when a WWW system is used.
The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into the computer system. The "computer-readable recording medium" further includes media that hold the program dynamically for a short time, such as the communication lines used when the program is transmitted over a network such as the Internet or a communication line such as a telephone line, and media that hold the program for a certain period, such as the volatile memory inside the computer system acting as the server or client in that case. The program may implement only part of the functions described above, or may implement those functions in combination with a program already recorded in the computer system.

Description of symbols:
1 … Related content display device
2 … Text information storage unit
3 … Content storage unit
4 … Related content search unit
41 … First content vector generation unit
42 … Second content vector generation unit
43 … Related content selection unit
5 … Inter-vector relevance analysis unit
6 … Search result analysis unit
61 … Co-occurrence vector generation unit
62 … First sentence vector generation unit
63 … Second sentence vector generation unit
64 … Related display sentence selection unit
7 … Related content output unit

Claims

A related content display device comprising:
a text information storage unit that stores content identification information and text information representing the content in association with each other;
an inter-vector relevance analysis unit that calculates the relevance between vectors;
a first content vector generation unit that generates a key content vector representing the linguistic features of the text information associated with specified content identification information;
a second content vector generation unit that generates candidate content vectors, each representing the linguistic features of the text information associated with other content identification information;
a related content selection unit that, based on the relevance calculated by the inter-vector relevance analysis unit between the key content vector and each of the candidate content vectors, selects a candidate content vector having high relevance to the key content vector as a related content vector;
a co-occurrence vector generation unit that, based on the key content vector and the related content vector, generates a word co-occurrence vector representing the features of the words that co-occur in the text information corresponding to the key content vector and the text information corresponding to the related content vector;
a first sentence vector generation unit that generates, for each sentence included in the text information corresponding to the related content vector, a related content sentence vector representing the linguistic features of that sentence;
a related display sentence selection unit that, based on the relevance calculated by the inter-vector relevance analysis unit between the word co-occurrence vector and each of the related content sentence vectors, selects a related content sentence vector having high relevance to the word co-occurrence vector, and identifies the sentence corresponding to the selected related content sentence vector as a related display sentence of the related content; and
a related content output unit that outputs the related display sentence of the related content identified by the related display sentence selection unit.
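The processing chain recited in this claim can be illustrated with a minimal Python sketch. It is only one possible reading, not the claimed implementation: the claim does not fix the feature representation or the relevance measure, so the sketch assumes term-frequency vectors and cosine similarity, uses a small English stop-word list in place of proper linguistic analysis, and weights the word co-occurrence vector by the product of the two word counts. All function names, the stop-word list, and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use proper linguistic analysis.
STOP_WORDS = {"a", "an", "the", "on", "by", "in", "of", "were", "before"}


def text_vector(text):
    """Term-frequency vector of a text (assumed feature representation)."""
    words = re.findall(r"\w+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def relevance(u, v):
    """Cosine similarity between two sparse vectors (assumed relevance measure)."""
    dot = sum(weight * v[word] for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_related(key_id, store):
    """Return the ID of the stored content most relevant to the key content."""
    key_vec = text_vector(store[key_id])
    candidates = {cid: text_vector(t) for cid, t in store.items() if cid != key_id}
    return max(candidates, key=lambda cid: relevance(key_vec, candidates[cid]))


def cooccurrence_vector(key_vec, related_vec):
    """Keep only words found in both texts, weighted by the product of counts."""
    return Counter({w: key_vec[w] * related_vec[w] for w in key_vec if w in related_vec})


def split_sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def related_display_sentence(key_id, related_id, store):
    """Pick the sentence of the related content closest to the co-occurrence vector."""
    cooc = cooccurrence_vector(text_vector(store[key_id]), text_vector(store[related_id]))
    sentences = split_sentences(store[related_id])
    return max(sentences, key=lambda s: relevance(cooc, text_vector(s)))


# Hypothetical program summary texts keyed by content identification information.
SAMPLE_STORE = {
    "news": "The volcano erupted on the island. Residents were evacuated by ferry.",
    "science": "Magma rises before a volcano on an island erupts. "
               "The program visits a research lab.",
    "cooking": "A chef prepares a seasonal soup. The recipe uses fresh vegetables.",
}

related_id = select_related("news", SAMPLE_STORE)
display_sentence = related_display_sentence("news", related_id, SAMPLE_STORE)
```

With the sample data, the science program is selected for the news program, and the sentence sharing the co-occurring words "volcano" and "island" becomes the related display sentence. Passing the key content ID as the second argument of `related_display_sentence` would give the analogous selection over the sentences of the key content.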

The related content display device according to claim 1, further comprising a second sentence vector generation unit that generates, for each sentence included in the text information corresponding to the key content vector, a key content sentence vector representing the linguistic features of that sentence, wherein
the related display sentence selection unit, based on the relevance calculated by the inter-vector relevance analysis unit between the word co-occurrence vector and each of the key content sentence vectors, selects a key content sentence vector having high relevance to the word co-occurrence vector, and identifies the sentence corresponding to the selected key content sentence vector as a related display sentence of the key content, and
the related content output unit outputs the related display sentence of the key content identified by the related display sentence selection unit.

The related content display device according to claim 1 or 2, further comprising a content storage unit that stores the content identification information and related information of the content in association with each other, wherein
the related content output unit reads, from the content storage unit, the related information associated with the same content identification information as the text information corresponding to the related content vector, and outputs the read related information, or information generated based on that related information, together with the related display sentence.

A computer program that causes a computer used as a related content display device to function as:
a text information storage unit that stores content identification information and text information representing the content in association with each other;
an inter-vector relevance analysis unit that calculates the relevance between vectors;
a first content vector generation unit that generates a key content vector representing the linguistic features of the text information associated with specified content identification information;
a second content vector generation unit that generates candidate content vectors, each representing the linguistic features of the text information associated with other content identification information;
a related content selection unit that, based on the relevance calculated by the inter-vector relevance analysis unit between the key content vector and each of the candidate content vectors, selects a candidate content vector having high relevance to the key content vector as a related content vector;
a co-occurrence vector generation unit that, based on the key content vector and the related content vector, generates a word co-occurrence vector representing the features of the words that co-occur in the text information corresponding to the key content vector and the text information corresponding to the related content vector;
a first sentence vector generation unit that generates, for each sentence included in the text information corresponding to the related content vector, a related content sentence vector representing the linguistic features of that sentence;
a related display sentence selection unit that, based on the relevance calculated by the inter-vector relevance analysis unit between the word co-occurrence vector and each of the related content sentence vectors, selects a related content sentence vector having high relevance to the word co-occurrence vector, and identifies the sentence corresponding to the selected related content sentence vector as a related display sentence of the related content; and
a related content output unit that outputs the related display sentence of the related content identified by the related display sentence selection unit.