JP5989157B2 - Information presenting apparatus, method, and program

Info

Publication number: JP5989157B2
Application number: JP2015024404A
Authority: JP
Inventors: 正嗣服部; 一生青山; 哲生小林; 早苗藤田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-02-10
Filing date: 2015-02-10
Publication date: 2016-09-07
Anticipated expiration: 2035-02-10
Also published as: JP2016148927A

Description

The present invention relates to an information presentation apparatus, method, and program, and in particular to an information presentation apparatus, method, and program that present the results of a search for media data information similar to a query.

In recent years, with what is called the arrival of the big data era, the volume of data has been growing explosively. On the Internet, in addition to the long-established publishing of information by companies and computer-savvy people, many media such as SNS (social network services) that make publishing casual and easy have come into practical use, so people without specialized knowledge now publish information on a daily basis. As a result, the amount of electronic information generated on the Internet is enormous.

In the real world as well, as sensors have become cheaper, smaller, and more capable, mobile terminals such as mobile phones and smartphones that many people own privately now come with GPS as standard, so the position of the terminal can be acquired as the application requires. In commercial settings, sensor nodes placed in farms, server rooms, and the like constantly acquire data such as temperature and humidity, which is then used in operations. In this way, sensor information acquired by various sensors is continually stored, accumulated, and used.

To make use of the enormous amount of information generated every day, searching for information is indispensable, and it is important to present the results effectively so that the user can understand them easily. Take searching for and presenting information on the Internet as an example. Internet search engines, such as Google (registered trademark) web search and image search, YouTube (registered trademark) video search, and web and image search on Microsoft (registered trademark)'s search engine Bing (registered trademark), cover a wide range of media data including text, images, and video, yet their results are usually output as a list, in units of a few dozen items. The list format can be described as a method that presents results one-dimensionally, in descending order of a score that the search engine computes from the relationship between the query and each piece of media data. The user examines the results starting from the highest ranked and can stop checking once the desired information has been found.

However, presenting results as a one-dimensional list has a problem: when a search key in the query, such as a keyword, matches not only information about the desired target but also information about other targets, results for the desired target and results for other targets are mixed together. For example, suppose a user wants to find on the web "the set of HTML texts on the Internet that contain a description of person A", using A's name as the keyword, and to obtain them as a list in descending order of score. If a different person B has the same full name as A, and "the set of HTML texts on the Internet that contain a description of person B" includes high-scoring items, the result list will mix HTML texts describing person A with HTML texts describing person B.

In this case, to exclude the information about B that has been mixed into the results, the user must add to the query a search key for information that A has but B does not, which requires analyzing the characteristics of A and B. If the user has no prior knowledge of A and B, the user has to read and compare several of the listed texts to analyze those characteristics, which consumes a great deal of time. In the worst case, the user chooses an unsuitable additional search key and ends up excluding even the desired information about A from the result list.

To solve the problem of presenting search results as a one-dimensional list, a technique that presents information three-dimensionally has been proposed (see, for example, Non-Patent Document 1).

Masatsugu Hattori et al., "Similarity Search for Picture Books Using a Graph Index: Feature Fusion and Graph Visualization of Results" (「グラフ索引を用いた絵本の類似探索〜特徴の融合と結果のグラフ可視化〜」), IPSJ SIG Technical Report (情報処理学会研究報告), Information Processing Society of Japan, 2013.

However, the information presentation technique described in Non-Patent Document 1, which visualizes search results three-dimensionally, was not designed as a UI for use in a service.

The present invention has been made in view of the above circumstances, and its object is to provide an information presentation apparatus, method, and program capable of presenting information in an optimal way that is easy for the user to use.

To achieve the above object, an information presentation apparatus according to a first invention comprises: a media data search unit that searches, from a plurality of pieces of media data information each including at least one common type of feature relating to media data, for a plurality of pieces of media data information based on their similarity to a query input by a user; a search result presentation unit that presents to the user, as the search result of the media data search unit and giving priority to media data information with high similarity to the query, information including at least summary information predetermined for the media data information, for each of the pieces of media data information found; and a visualization result presentation unit that, when the user inputs a request to visualize the search result, presents to the user, as the search result of the media data search unit, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting pieces of media data information that are similar to each other.

In the information presentation apparatus according to the first invention, the media data search unit searches, from a plurality of pieces of media data information each including at least one common type of feature relating to media data, for a plurality of pieces of media data information based on their similarity to a query input by a user. The search result presentation unit presents to the user, as the search result of the media data search unit and giving priority to media data information with high similarity to the query, information including at least summary information predetermined for the media data information, for each of the pieces of media data information found. When the user inputs a request to visualize the search result, the visualization result presentation unit presents to the user, as the search result of the media data search unit, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting pieces of media data information that are similar to each other.

In this way, as the search result, the summary information, the similarity to the query, and the contribution to that similarity of each feature that is similar to the query are presented to the user, with priority given to media data information with high similarity to the query. When a request to visualize the search result is input, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting similar pieces of media data information is presented to the user. This makes it possible to present information in an optimal way that is easy for the user to use.
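As an illustration only, the ranked search and the node-and-edge graph described above might be sketched as follows. The cosine similarity measure, the `edge_threshold` parameter, and the sample picture-book feature vectors are all assumptions made for this sketch; the text does not fix how similarity is computed or when two items count as similar.

```python
from itertools import combinations
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse feature vectors (dicts). Assumed measure."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, items, top_k):
    """Return the top_k (item_id, similarity) pairs, most similar to the query first."""
    scored = [(item_id, cosine(query, feats)) for item_id, feats in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def build_graph(hit_ids, items, edge_threshold):
    """Nodes are the hits; an edge joins two hits whose mutual similarity
    reaches edge_threshold (a hypothetical criterion)."""
    edges = [(a, b) for a, b in combinations(hit_ids, 2)
             if cosine(items[a], items[b]) >= edge_threshold]
    return {"nodes": list(hit_ids), "edges": edges}

# Hypothetical media data information: word-count features per picture book.
items = {
    "book1": {"bear": 3, "forest": 2},
    "book2": {"bear": 1, "honey": 2},
    "book3": {"train": 4, "station": 1},
}
query = {"bear": 2, "forest": 1}

hits = search(query, items, top_k=2)
graph = build_graph([item_id for item_id, _ in hits], items, edge_threshold=0.3)
```

Here `hits` plays the role of the ranked list shown by the search result presentation unit, and `graph` is the structure the visualization result presentation unit would draw.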

When the user inputs a request to visualize the search result, the visualization result presentation unit can further present, as the search result of the media data search unit, each feature of the found media data information that is similar to the query, together with the contribution of each feature to the similarity.
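One possible reading of "contribution of each feature to the similarity" is sketched below, assuming the similarity is based on a dot product of feature vectors so that each shared feature's product is its share of the total. This definition and the function name `feature_contributions` are assumptions for illustration, not something this passage specifies.

```python
def feature_contributions(query, item):
    """Share of a dot-product similarity attributable to each shared feature.

    Assumes similarity is proportional to sum(query[k] * item[k]); the
    returned shares sum to 1 and could feed a chart of per-feature shares.
    """
    products = {k: query[k] * item[k] for k in query.keys() & item.keys()}
    total = sum(products.values())
    if total == 0:
        return {}
    return {k: v / total for k, v in products.items()}

contrib = feature_contributions(
    {"bear": 2, "forest": 1, "snow": 1},
    {"bear": 3, "forest": 2, "honey": 5},
)
```

Features that appear on only one side ("snow", "honey") contribute nothing under this assumption and are omitted from the result.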

The information presentation apparatus according to the first invention can further include a clustering unit that, when the user inputs a request to divide the graph, separates the nodes into a plurality of clusters based on the similarity between the pieces of media data information that the nodes represent. In that case, when the user inputs a request to divide the graph, the visualization result presentation unit presents, as the search result of the media data search unit, the graph with the edges between nodes that the clustering unit separated into different clusters deleted. It further presents side by side, for each of the clusters, each feature similar to the query that the media data information represented by the nodes belonging to that cluster has, together with the contribution of each feature to the similarity.

The information presentation apparatus according to the first invention can also further include a clustering unit that separates the found pieces of media data information into a plurality of clusters based on the similarity among them. In that case, the search result presentation unit presents to the user, as the search result of the media data search unit, for each of the clusters and giving priority to media data information with high similarity to the query, information including at least the summary information predetermined for the media data information, for each piece of media data information belonging to that cluster.
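The edge deletion that follows clustering can be sketched as below. The cluster assignment `cluster_of` is taken as given here; how the clustering unit actually derives it is outside this sketch, and the function names are hypothetical.

```python
def split_graph(edges, cluster_of):
    """Keep only edges whose two endpoints fall in the same cluster,
    i.e. delete every edge between nodes separated into different clusters."""
    return [(a, b) for a, b in edges if cluster_of[a] == cluster_of[b]]

def group_by_cluster(cluster_of):
    """Collect node ids per cluster label, e.g. for per-cluster result lists."""
    clusters = {}
    for node, label in cluster_of.items():
        clusters.setdefault(label, []).append(node)
    return clusters

edges = [("a", "b"), ("b", "c"), ("c", "d")]
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}  # hypothetical clustering result

kept_edges = split_graph(edges, cluster_of)
clusters = group_by_cluster(cluster_of)
```

The per-cluster grouping is what a per-cluster ranked list or a per-cluster display of feature contributions would iterate over.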

When the user inputs a request to divide the graph, the visualization result presentation unit can present, as the search result of the media data search unit, the graph in which the edges between nodes separated into different clusters by the clustering unit are deleted and in which the display coordinates of at least one node have changed from that node's display coordinates before the separation by the clustering unit.

When the user inputs a request to divide the graph, the visualization result presentation unit can present, as the search result of the media data search unit, the graph in which the edges between nodes separated into different clusters by the clustering unit are deleted and in which the display coordinates of at least one node have changed from that node's display coordinates before the separation, such that the display coordinates of the nodes are separated cluster by cluster.
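A minimal sketch of moving node display coordinates so that clusters drift apart is shown below. Shifting each cluster horizontally by a fixed `gap` is an assumption chosen for brevity; it only separates clusters visually if `gap` exceeds the width of the original layout, and the text does not prescribe any particular layout rule.

```python
def separate_layout(coords, cluster_of, gap):
    """Shift every node horizontally by (cluster index * gap).

    coords maps node id -> (x, y) before the split; the result holds the
    coordinates after the split. Nodes in the first cluster stay in place.
    """
    labels = sorted(set(cluster_of.values()))
    offset = {label: index * gap for index, label in enumerate(labels)}
    return {node: (x + offset[cluster_of[node]], y)
            for node, (x, y) in coords.items()}

coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.5, 1.0)}
cluster_of = {"a": 0, "b": 1, "c": 1}  # hypothetical clustering result

new_coords = separate_layout(coords, cluster_of, gap=10.0)
```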

An information presentation apparatus according to a second invention comprises: a media data search unit that searches, from a plurality of pieces of media data information each including at least one common type of feature relating to media data, for a plurality of pieces of media data information based on their similarity to a query input by a user; a visualization result presentation unit that presents to the user, as the search result of the media data search unit, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting pieces of media data information that are similar to each other; and a designated media data information presentation unit that, when the user designates a node of the graph, presents to the user information including at least the summary information predetermined for the media data information that the designated node represents.

In the information presentation apparatus according to the second invention, the media data search unit searches, from a plurality of pieces of media data information each including at least one common type of feature relating to media data, for a plurality of pieces of media data information based on their similarity to a query input by a user. The visualization result presentation unit presents to the user, as the search result of the media data search unit, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting pieces of media data information that are similar to each other. When the user designates a node of the graph, the designated media data information presentation unit presents to the user information including at least the summary information predetermined for the media data information that the designated node represents.

In this way, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting similar pieces of media data information is presented to the user as the search result, and when a node of the graph is designated, information including at least the summary information of the media data information that the designated node represents is presented to the user. This makes it possible to present information in an optimal way that is easy for the user to use.

The information presentation apparatus according to the second invention can further include a designated feature information presentation unit that, when the user designates a feature that is similar to the query, presents to the user the graph in which, only for the nodes representing media data information having the designated feature, coarse information extracted from the media data information is displayed as an identifiable low-information representation.

When the user designates a feature that is similar to the query, the designated feature information presentation unit can present to the user the graph in which, only for the nodes representing media data information having the designated feature, information carrying less information than the summary information is displayed as the coarse information extracted from the media data information that serves as the identifiable low-information representation.

The media data described above can be books.

An information presentation method according to a third invention is a method performed in an information presentation apparatus including a media data search unit, a search result presentation unit, and a visualization result presentation unit. In this method, the media data search unit searches, from a plurality of pieces of media data information each including at least one common type of feature relating to media data, for a plurality of pieces of media data information based on their similarity to a query input by a user; the search result presentation unit presents to the user, as the search result of the media data search unit and giving priority to media data information with high similarity to the query, information including at least summary information predetermined for the media data information, for each of the pieces of media data information found; and, when the user inputs a request to visualize the search result, the visualization result presentation unit presents to the user, as the search result of the media data search unit, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting pieces of media data information that are similar to each other.

An information presentation method according to a fourth invention is a method performed in an information presentation apparatus including a media data search unit, a visualization result presentation unit, and a designated media data information presentation unit. In this method, the media data search unit searches, from a plurality of pieces of media data information each including at least one common type of feature relating to media data, for a plurality of pieces of media data information based on their similarity to a query input by a user; the visualization result presentation unit presents to the user, as the search result of the media data search unit, a graph consisting of nodes each representing one of the pieces of media data information found and edges connecting pieces of media data information that are similar to each other; and, when the user designates a node of the graph, the designated media data information presentation unit presents to the user information including at least the summary information predetermined for the media data information that the designated node represents.

A program according to a fifth invention is a program for causing a computer to function as each unit of the information presentation apparatus described above.

As described above, the information presentation apparatus, method, and program of the present invention have the effect that information can be presented in an optimal way that is easy for the user to use.

FIG. 1 is a block diagram showing the configuration of the information presentation apparatus according to the first embodiment of the present invention.
FIG. 2 is a conceptual diagram showing an example of the input screen.
FIG. 3 is a conceptual diagram showing an example of a screen presenting the media data information used as the query.
FIG. 4 is a conceptual diagram showing an example of the media data information stored in the search target media data DB.
FIG. 5 is a conceptual diagram showing an example of the search result presentation screen.
FIG. 6 is a conceptual diagram showing an example of a visualization of the search result.
FIG. 7 is a schematic diagram for explaining the betweenness centrality of graph edges.
FIG. 8 is a conceptual diagram showing an example of a visualization of the search result when separated into two clusters.
FIG. 9 is a conceptual diagram showing an example of a visualization of the search result when separated into three clusters.
FIG. 10 is a conceptual diagram showing an example of a visualization of the search result, for explaining the change in node display coordinates.
FIG. 11 is a conceptual diagram showing an example of a visualization of the search result when separated into four clusters.
FIG. 12 is a conceptual diagram showing an example of a visualization of the search result when a node of the graph is designated.
FIG. 13 is a conceptual diagram showing an example of a visualization of the search result when a feature word is designated on the pie chart.
FIG. 14 is a conceptual diagram showing an example of a visualization of the search result when a feature word is designated on the pie chart.
FIG. 15 is a conceptual diagram showing an example of a visualization of the search result when a feature word is designated on the pie chart.
FIG. 16 is a conceptual diagram showing an example of a visualization of the search result when a feature word is designated on the pie chart.
FIG. 17 is a conceptual diagram showing an example of a visualization of the search result when a feature word is designated on the pie chart.
FIG. 18 is a flowchart showing the information presentation processing routine in the first embodiment of the present invention.
FIG. 19 is a flowchart showing the information presentation processing routine in the first embodiment of the present invention.
FIG. 20 is a block diagram showing the configuration of the information presentation apparatus according to the second embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings. Each embodiment is described using the case where the media data to be searched is picture books as an example. In the following embodiments, the term "search" (探索) is used as a broader concept than "retrieval" (検索). Here, "retrieval" is the task of finding, within the target set of media data, the media data that satisfies all of the conditions given as a query. "Search", by contrast, is the task that, when no media data satisfies all of the conditions given as a query, finds a specified number of media data items that satisfy many of the conditions from among all the media data.
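The distinction drawn above between retrieval and search can be sketched as follows. Modeling conditions and item attributes as sets of keywords, and ranking by the number of satisfied conditions, are simplifying assumptions for this illustration; the function and variable names are hypothetical.

```python
def retrieve(conditions, items):
    """Retrieval: return only the items that satisfy every condition."""
    return [item_id for item_id, attrs in items.items() if conditions <= attrs]

def search(conditions, items, count):
    """Search: return the `count` items that satisfy the most conditions,
    even when no item satisfies all of them."""
    ranked = sorted(items, key=lambda item_id: len(conditions & items[item_id]),
                    reverse=True)
    return ranked[:count]

# Hypothetical picture books described by keyword sets.
items = {
    "book1": {"bear", "forest"},
    "book2": {"bear", "honey"},
    "book3": {"train"},
}
conditions = {"bear", "snow"}

exact = retrieve(conditions, items)          # nothing matches both conditions
closest = search(conditions, items, count=2)
```

With these inputs retrieval comes back empty, while search still returns the two books that match one of the two conditions.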

<Configuration of the Information Presentation Apparatus according to the First Embodiment>
The information presentation apparatus 10 according to the first embodiment of the present invention is a computer comprising a CPU (Central Processing Unit), a RAM (Random Access Memory), and a ROM (Read Only Memory) that stores an information presentation program for executing the information presentation processing routine described later. The computer functions as the information presentation apparatus 10 when the CPU reads the information presentation program from the ROM, which is an internal storage device, and executes it.

As shown in FIG. 1, the information presentation apparatus 10 includes an input unit 12, a calculation unit 14, and a display unit 16.

[Input unit 12]
The input unit 12 accepts, on an input screen such as the one shown in FIG. 2, the input of a query for searching for desired information. In the example input screen of FIG. 2, the query is input by selecting the media data to be used as the query from the media data stored in the search target media data database 20 described later.

[Calculation unit 14]
The calculation unit 14 searches for a plurality of media data items according to the input query, and outputs through the display unit 16 a search result that visualizes information representing the media data found.

As shown in FIG. 1, the calculation unit 14 can be represented functionally as comprising a search target media data database 20, a query feature extraction unit 22, a media data search unit 24, a search result presentation unit 25, a visualization result presentation unit 26, a clustering unit 28, a designated media data information presentation unit 30, and a designated feature information presentation unit 32.

[Query feature extraction unit 22]
The query feature extraction unit 22 receives the input query, acquires from the search target media data DB 20 the media data information of the media data received as the query, and presents that media data information on the display unit 16, as shown in FIG. 3. The media data information presented may be all of the media data information, or only a part that is important to the user, such as the title.

The query feature extraction unit 22 also extracts features from the media data information of the media data received as the query, and passes the extracted query features to the media data search unit 24.

For example, from the media data information of the media data given as the query, the query feature extraction unit 22 can extract, as the features of the query, information representing each item of the bibliographic information, the number of occurrences of predetermined words contained in the content of the media data, and the like.
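A rough sketch of this kind of feature extraction is given below. The record layout (`title`, `author`, `text`), the `word:`/`title:`/`author:` feature naming, and whitespace tokenization are all assumptions for illustration; Japanese text would need a morphological analyzer in place of `split()`.

```python
from collections import Counter

def extract_features(media_info, vocabulary):
    """Build a feature dict from bibliographic items and word occurrence counts.

    media_info is a hypothetical record with 'title', 'author', and 'text';
    vocabulary is the set of predetermined words whose occurrences are counted.
    """
    words = media_info["text"].lower().split()  # stand-in for morphological analysis
    counts = Counter(word for word in words if word in vocabulary)
    features = {f"word:{word}": count for word, count in counts.items()}
    for field in ("title", "author"):
        if field in media_info:
            features[f"{field}:{media_info[field]}"] = 1
    return features

book = {
    "title": "The Bear in the Forest",
    "author": "A. Author",
    "text": "The bear met a bear in the forest",
}
features = extract_features(book, vocabulary={"bear", "forest"})
```

The resulting dict is the kind of query feature set that would be handed to the media data search unit 24.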

The query only needs to have features that can be compared with the features included in the media data information stored in the search target media data DB 20 (described later); it may or may not exist among the data in the search target media data DB 20. For example, if the search target media data DB 20 stores, as features of the media data, information about the text contained in the picture books to be searched, the input query only needs to contain text, and that text need not be an expression that actually appears in a picture book stored in the search target media data DB 20.

The description so far assumes that the input unit 12 accepts media data as the query and that the query feature extraction unit 22 extracts features from it. However, the input unit 12 may instead accept, as the query, at least one feature relating to media data rather than the media data itself. In this case the query feature extraction unit 22 is unnecessary, and the features entered at the input unit 12 are passed directly to the media data search unit 24 described next. For example, the user may enter characteristic morphemes and their numbers of occurrences as the query, or the author and title from the bibliographic information. When features such as morpheme occurrence counts or a bibliographic title are entered, they may be weighted to reflect what is emphasized in the search, or weighted for normalization against other features.

The input unit 12 may also accept data obtained by applying a processing step to the media data. For example, instead of the picture book itself, a scanned image of the picture book may be accepted as input. In this case, the query feature extraction unit 22 performs OCR on the input image to extract features such as the author name.

[Media data search unit 24]
Using the query features passed from the query feature extraction unit 22, the media data search unit 24 acquires media data information with high similarity to the query from the search target media data DB 20, and passes it to the search result presentation unit 25, the visualization result presentation unit 26, and the clustering unit 28.

As shown in FIG. 4, the search target media data DB 20 stores a plurality of pieces of media data information. Media data is the entity that is the target of the search, for example a book, music, an image, or a video. Media data information is information that includes at least one type of feature relating to such media data, and can include various information such as the properties and attributes of the media data. For example, when the media data is an image, the media data information for that image can include, as features of the image, the information given by each metadata item (such as the image title, size, and creation date and time) as well as information representing the content of the image itself, that is, image feature values extracted from the image data.

Note that having "the same type of feature" and having "the same feature" are treated here as different concepts. For example, suppose two pieces of media data A and B are an image of 100 × 100 pixels and an image of 200 × 200 pixels, respectively. Both have several features of the same type, such as image title, size, and image feature values. However, because their values for "size" differ, they are media data with different features.

Each piece of media data information is assigned an identification number (ID) that identifies the media data to be searched.

The example of FIG. 4 is the case where the search target media data are picture books. The stored media data information includes, as features, each item of bibliographic information (for example, the picture book's title, cover image, author, illustrator, and publisher) and the number of occurrences of each morpheme appearing in each picture book. Here, the per-morpheme occurrence count is used as one feature of the media data other than bibliographic information, but image feature values extracted from image data of the illustrations in a picture book, audio feature values extracted from audio data of the picture book being read aloud, and the like may also be used as features of the media data. Other bibliographic information, such as the series name or the physical size of the picture book, may also be included as features.

The media data search unit 24 calculates the similarity between the query and each piece of media data, using the query features extracted by the query feature extraction unit 22 and the features included in the media data information of each piece of media data stored in the search target media data DB 20. It then acquires from the search target media data DB 20 a predetermined number (for example, 30) of pieces of media data information, in descending order of similarity to the query.

The calculation of similarity between the query and media data is described below, taking the case where the search target media data are picture books as an example. First, among the features included in the media data information of each picture book stored in the search target media data DB 20, those comparable with the query features extracted by the query feature extraction unit 22 are used to create a feature vector for each picture book. Here, each item of bibliographic information and the number of occurrences of each morpheme appearing in the picture book are used as the features of the media data. Specifically, for each picture book, a vector is created whose number of dimensions equals the total number of distinct bibliographic information items and morphemes that appear in all picture books whose media data information is stored in the search target media data DB 20. Each element of the vector is set to the value tfidf, obtained by multiplying the number of occurrences tf (term frequency), in that picture book, of the bibliographic information item or morpheme corresponding to that dimension by the inverse picture-book frequency idf (inverse document frequency). Finally, the vector is normalized to length 1 to give the feature vector of the picture book. tfidf can be calculated, for example, as shown in equations (1) to (3) below.

Here, n_{i,j} is the number of occurrences of bibliographic information item i or morpheme i in picture book j. D is the total number of picture books whose media data information is stored in the search target media data DB 20, and d is the number of those picture books in which bibliographic information item i or morpheme i appears.
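Equations (1) to (3) are not reproduced in this text. A reconstruction consistent with the definitions of n_{i,j}, D, and d above, and with the worked example that follows (in which a term that appears in every picture book receives zero weight), would be the standard tf-idf form. The numbering of the individual equations, the division of tf by the total count in picture book j, and the base of the logarithm are assumptions; the last two do not affect the final unit-length vectors.

```latex
\begin{align}
\mathrm{tfidf}_{i,j} &= \mathrm{tf}_{i,j} \cdot \mathrm{idf}_{i} \tag{1} \\
\mathrm{tf}_{i,j} &= \frac{n_{i,j}}{\sum_{k} n_{k,j}} \tag{2} \\
\mathrm{idf}_{i} &= \log \frac{D}{d} \tag{3}
\end{align}
```

In (3), d is counted separately for each item or morpheme i, so a term found in all D picture books has idf equal to zero.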

For example, suppose the search target media data DB 20 stores media data information for a total of three picture books (D = 3), and the media data information of each picture book contains features expressed as bibliographic information items or morphemes together with their numbers of occurrences, as listed below. A picture book whose ID is n is referred to below as "picture book n". Bibliographic information is given a count as a weight, so that its importance is expressed relative to the morpheme counts. For example, if the weight for "the author is X" is set to 100 occurrences for a given book, the information that the author is X carries the same weight as a morpheme appearing 100 times. The weight may be fixed per feature type (for example, 100 for author and 50 for illustrator) or set individually.

Picture book 1: A 3 times, B 2 times, C 2 times
Picture book 2: A 1 time, C 3 times, D 2 times
Picture book 3: A 2 times, D 1 time, E 2 times

In this case, idf_i (i = A, B, C, D, E) for each item and morpheme is given by equation (4) below.

The value of each element of the vector for picture book 1 is given by equation (5) below.

Finally, to normalize the vector to length 1, each element is divided by the square root of the sum of the squares of the elements. The resulting feature vector v_1 of picture book 1 is given by equation (6) below.

The feature vector v_2 of picture book 2 and the feature vector v_3 of picture book 3 are created by the same procedure.

Next, the feature vector of the query is created. Suppose, for example, that the query feature extraction unit 22 has extracted from the query the following features, expressed as bibliographic information items or morphemes together with their numbers of occurrences.

Query: A 2 times, C 3 times

When creating the feature vector of the query, the idf_i values used to compute tfidf are those obtained from the search target media data DB 20, that is, the values of equation (4). Otherwise the vector is obtained by the same procedure as the picture book feature vectors, as shown in equation (7) below.

Finally, to normalize the vector to length 1, each element is divided by the square root of the sum of the squares of the elements, giving the query feature vector v_query = (0, 0, 1, 0, 0). Any bibliographic information item or morpheme that appears in the query but does not appear in any picture book whose media data information is stored in the search target media data DB 20 can simply be ignored; that is, it is not treated as an element of the query feature vector.

Next, the cosine similarity between the query feature vector v_query and each picture book feature vector v_n (n = 1, 2, 3) is calculated. Because every feature vector has been normalized to length 1, the cosine similarity is equal to the inner product of v_query and v_n. The inner product of a vector with itself is 1, which is the maximum value of the cosine similarity.
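The worked example above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes idf_i = log(D / d) with a natural logarithm and uses raw occurrence counts as tf (any per-book scaling of tf cancels when the vector is normalized). The names `unit_tfidf`, `cosine`, and `ranking` are hypothetical.

```python
import math

# Occurrence counts from the three-picture-book example in the text.
books = {
    1: {"A": 3, "B": 2, "C": 2},
    2: {"A": 1, "C": 3, "D": 2},
    3: {"A": 2, "D": 1, "E": 2},
}
query = {"A": 2, "C": 3}

terms = sorted({t for counts in books.values() for t in counts})  # A..E
D = len(books)
# Assumed form idf_i = log(D / d); d = number of books containing term i.
idf = {t: math.log(D / sum(1 for c in books.values() if t in c)) for t in terms}

def unit_tfidf(counts):
    """Return the tf-idf vector over `terms`, scaled to length 1."""
    # Terms absent from the database are ignored, as the text specifies.
    raw = [counts.get(t, 0) * idf[t] for t in terms]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm if norm else 0.0 for x in raw]

def cosine(u, v):
    """Inner product; equals cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))

v_query = unit_tfidf(query)
similarity = {n: cosine(v_query, unit_tfidf(c)) for n, c in books.items()}
# Picture book IDs in descending order of similarity to the query.
ranking = sorted(similarity, key=similarity.get, reverse=True)
```

Under these assumptions, term A has idf 0 because it appears in all three picture books, so only C contributes to the query vector, which matches the v_query = (0, 0, 1, 0, 0) stated in the text.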

The media data search unit 24 acquires from the search target media data DB 20 media data information containing features whose similarity to the query features is at least a predetermined value. Here, it acquires a predetermined number (for example, 30) of pieces of media data information in descending order of the cosine similarity between the query feature vector v_query and the feature vector v_n of each piece of media data. The number acquired is not limited to a fixed number of top-ranked results; the total number of pieces of media data information stored in the search target media data DB 20 may be specified so that all of them are acquired in order of similarity.

The media data search unit 24 also calculates the similarity between the pieces of media data, based on at least one of the features included in the acquired media data information. This similarity can be calculated, for example, as the cosine similarity between the feature vectors computed above. The media data search unit 24 passes the acquired media data information, the calculated similarities between media data, and the calculated similarities between the query and each piece of media data to the search result presentation unit 25 and the visualization result presentation unit 26.

The similarity between pieces of media data, and between the query and each piece of media data, need not be derived from the similarity of feature vectors. For example, a subjective similarity or dissimilarity can be assigned in advance to every pair of media data whose media data information is stored in the search target media data DB 20. In this case, the assigned similarity or dissimilarity can be used directly as the similarity between media data. For the similarity between the query and each piece of media data, the media data whose media data information contains features matching the query features is identified, and the similarity or dissimilarity assigned to each other piece of media data when paired with it is used.

In this embodiment, as information representing the similarity between media data, the media data search unit 24 sets a node for each piece of acquired media data information and creates a partial network structure in which each node is connected by an edge to the nodes whose similarity to it falls within a specified rank. From the edges of that partial network structure, it then prunes those whose removal does not change reachability under greedy search, producing a pruned partial network structure (see JP 2008-293444 A and JP 2008-305072 A). The media data search unit 24 stores, in each node of the partial network structure, the acquired media data information and the calculated similarity between the query and that media data, and passes the resulting partial network structure to the visualization result presentation unit 26. Although the partial network structure described above is used here, another kind of partial network structure may be used as long as nodes that are highly similar because they share features are connected by edges. For example, it may be a structure obtained by connecting nodes whose similarity is at or above a fixed threshold, or a structure in which each node is simply connected to the nodes whose similarity falls within a specified rank, without pruning.
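The unpruned variant mentioned at the end of the paragraph above, in which each node is linked to the nodes ranked closest to it, could look roughly like the sketch below. The function name `knn_graph` and the nested-dictionary input format are assumptions made for illustration; the greedy-search pruning step from the cited publications is not reproduced here.

```python
def knn_graph(similarity, k):
    """Connect each node to the k other nodes most similar to it.

    `similarity` maps node -> {other node -> similarity value}.
    Returns a set of undirected edges, each a frozenset of two nodes.
    """
    edges = set()
    for node, sims in similarity.items():
        # Rank the other nodes by similarity to `node`, highest first.
        others = sorted((o for o in sims if o != node),
                        key=lambda o: sims[o], reverse=True)
        for other in others[:k]:
            edges.add(frozenset((node, other)))
    return edges


# Hypothetical pairwise similarities between four retrieved items.
example = {
    "a": {"b": 0.9, "c": 0.1, "d": 0.2},
    "b": {"a": 0.9, "c": 0.3, "d": 0.1},
    "c": {"a": 0.1, "b": 0.3, "d": 0.8},
    "d": {"a": 0.2, "b": 0.1, "c": 0.8},
}
example_edges = knn_graph(example, k=1)
```

Because an edge is added whenever either endpoint ranks the other within its top k, a node can end up with more than k neighbors.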

[Search result presentation unit 25]
Based on the media data information passed from the media data search unit 24 and the similarities between media data, the search result presentation unit 25 presents the search result to the user on the display unit 16, as shown in FIG. 5. Giving priority to media data information with high similarity to the query, it presents, for each of the retrieved pieces of media data information (called "top rankers"), information that includes at least summary information predetermined for the media data information. For example, the display unit 16 presents a screen showing the summary information, the similarity to the query, and the contribution to that similarity of each feature word that the item shares with the query. The presented information only needs to include the predetermined summary information; it need not include the similarity to the query or the feature word contributions, and it may include other information.

As shown in FIG. 5, auxiliary information about the retrieved media data information is presented. The auxiliary information can include features of the media data (in particular features representing bibliographic items), the similarity and rank of each piece of media data with respect to the query, feature words, and the contribution of each feature word to the similarity. A feature word is a word that contributes strongly to an item having been selected as media data with high similarity to the query. For example, when the cosine similarity between the query feature vector and the media data feature vector is calculated, the bibliographic items or morphemes corresponding to the elements whose element-wise product is larger than that of the other elements can be taken as feature words. The contribution of a feature word is calculated from the product of the elements corresponding to that feature word in the same cosine similarity calculation.
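A minimal sketch of this feature word selection is shown below. It assumes that the contribution of a word is simply the element-wise product itself and that the top few positive products are reported; the text only says the contribution is "based on" the product, so both choices, along with the name `feature_word_contributions`, are illustrative assumptions.

```python
def feature_word_contributions(terms, v_query, v_item, top_n=3):
    """Return (similarity, [(term, contribution), ...]) for one item.

    `v_query` and `v_item` are unit-length vectors aligned with `terms`.
    """
    # Each term's share of the inner product (cosine similarity).
    products = {t: q * x for t, q, x in zip(terms, v_query, v_item)}
    similarity = sum(products.values())
    # Keep only terms that actually contribute, largest product first.
    ranked = sorted((t for t in products if products[t] > 0),
                    key=products.get, reverse=True)
    return similarity, [(t, products[t]) for t in ranked[:top_n]]


sim, words = feature_word_contributions(
    ["A", "B", "C"], [0.6, 0.8, 0.0], [0.6, 0.8, 0.0])
```

Since the products sum to the cosine similarity, each contribution can be read as the part of the similarity score explained by that word.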

[Visualization result presentation unit 26]
When the user presses the "graph visualization" button and a request to visualize the search result is input, the visualization result presentation unit 26 performs the following processes, described below: calculation of two-dimensional plane coordinates, calculation of altitude, calculation of feature information, and generation of the visualization result.

First, as the two-dimensional plane coordinate calculation process, the visualization result presentation unit 26 calculates the two-dimensional plane coordinates of the node representing each piece of media data information, based on the media data information passed from the media data search unit 24 and the similarities between media data. The coordinates are calculated so that nodes whose media data information has highly similar features are placed close together and nodes whose media data information has low similarity are placed far apart.

In this embodiment, the partial network structure passed from the media data search unit 24 is treated as a spring model with fixed spring length, and the two-dimensional plane coordinates of the node representing each piece of media data information are calculated from it. Specifically, the coordinates of each node are determined so that nodes connected by an edge in the partial network structure are placed close together, nodes not connected by an edge are placed far apart, and all nodes in the partial network structure reach a stable arrangement.
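One simple way to realize such a spring model is an iterative force-directed layout, sketched below. The text does not specify the force law or the solver, so everything here is an assumption: edges act as linear springs with natural length `spring_length`, unconnected pairs repel with an inverse-square force, and positions are updated by small capped steps from a deterministic circular start. The name `spring_layout` and its parameters are hypothetical.

```python
import math

def spring_layout(nodes, edges, spring_length=1.0, iterations=300,
                  step=0.05, max_move=0.1):
    """Place nodes in 2D: edges are fixed-length springs, other pairs repel."""
    n = len(nodes)
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    linked = {frozenset(e) for e in edges}
    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                if frozenset((u, v)) in linked:
                    # Spring: attracts when stretched, repels when compressed.
                    magnitude = spring_length - dist
                else:
                    # Unconnected pairs repel, more strongly when close.
                    magnitude = 1.0 / (dist * dist)
                fx, fy = magnitude * dx / dist, magnitude * dy / dist
                force[u][0] += fx
                force[u][1] += fy
                force[v][0] -= fx
                force[v][1] -= fy
        for v in nodes:
            mx, my = step * force[v][0], step * force[v][1]
            length = math.hypot(mx, my)
            if length > max_move:  # cap each move to keep the iteration stable
                mx, my = mx * max_move / length, my * max_move / length
            pos[v][0] += mx
            pos[v][1] += my
    return pos


# Three nodes in a chain: A-B and B-C are connected, A-C is not.
layout = spring_layout(["A", "B", "C"], [("A", "B"), ("B", "C")])
```

In the chain example, the connected pairs settle near the spring length while the unconnected pair A and C is pushed further apart, which is the qualitative behavior the text describes.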

Any method other than the spring model may be used, provided it yields two-dimensional plane coordinates in which nodes whose media data information has highly similar features are placed close together and nodes with low similarity are placed far apart. The coordinates also need not be calculated from the cosine similarity between feature vectors built from morpheme occurrence counts and bibliographic items, as described above; they may be calculated from any relationship between the media data information represented by the nodes. For example, coordinates may be calculated so that nodes whose media data match on one of the features in the media data information (for example, "author") are placed close together and nodes whose media data differ on that feature are placed far apart.

The visualization result presentation unit 26 passes the calculated two-dimensional plane coordinates of each node, together with the media data information and the query similarity held by each node, to the altitude calculation process.

As the altitude calculation process, the visualization result presentation unit 26 calculates, for the node representing each piece of media data information acquired by the media data search unit 24, an altitude corresponding to the similarity between the query and that media data, as passed from the two-dimensional coordinate calculation process. The altitude is height information that is added to the node's two-dimensional plane coordinates to form three-dimensional coordinates.

The altitude may be the cosine similarity between the feature vector of each piece of media data and the query feature vector used as is, or the cosine similarity multiplied by a predetermined constant (for example, 4) so that height is emphasized in the three-dimensional coordinates. Emphasizing height strengthens the contrast between the relationships among top rankers, expressed in the two-dimensional plane, and the similarity of each top ranker to the query, expressed in the height direction. This makes it easier to see how top rankers that resemble (or do not resemble) the query are grouped. To emphasize this contrast further, the altitude may be obtained with a function that accentuates small similarity values, such as the logarithm or the square root of the similarity, and the value of such a function may additionally be multiplied by a constant.
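The altitude options described above can be sketched as a small helper. The function name `altitude` and its parameters are hypothetical, and the default scale of 4 just follows the example constant in the text. The logarithmic variant is left out because the text does not say how a similarity of zero would be handled.

```python
import math

def altitude(similarity, scale=4.0, transform="linear"):
    """Map a query similarity in [0, 1] to a height for the 3D plot."""
    if transform == "linear":
        value = similarity
    elif transform == "sqrt":
        # Square root lifts small similarities relative to large ones.
        value = math.sqrt(similarity)
    else:
        raise ValueError(f"unknown transform: {transform}")
    return scale * value


heights = [altitude(s, transform="sqrt") for s in (0.04, 0.25, 1.0)]
```

With the square root, a similarity of 0.04 is mapped to a fifth of the maximum height rather than a twenty-fifth, so weakly similar items remain visible above the plane.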

The visualization result presentation unit 26 passes the calculated altitude of each node, the two-dimensional plane coordinates of each node received from the two-dimensional coordinate calculation process, and the query similarity held by each node to the visualization result generation process.

As the feature information calculation process, the visualization result presentation unit 26 calculates the feature words that the pieces of media data information retrieved by the media data search unit 24 share with the query, and the contribution of each feature word to the similarity. For example, it takes the union of the feature words shared with the query across the retrieved pieces of media data information. Then, for each such feature word, it takes the average of the contributions of that feature word calculated by the search result presentation unit 25 for each of the pieces of media data information, and uses this average as the feature word's contribution to the similarity.
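The union-and-average step above might be written as follows. The text does not say whether the average is taken over all retrieved items or only over the items that contain the word; this sketch assumes the former and counts a missing word as a contribution of zero. The name `mean_contributions` is hypothetical.

```python
def mean_contributions(per_item):
    """Average each feature word's contribution over all retrieved items.

    `per_item` is a list of {feature word: contribution} dicts,
    one dict per retrieved piece of media data information.
    """
    if not per_item:
        return {}
    # Union of the feature words found in any retrieved item.
    words = set().union(*per_item)
    # Assumption: an item lacking a word contributes 0 for that word.
    return {w: sum(c.get(w, 0.0) for c in per_item) / len(per_item)
            for w in words}


averaged = mean_contributions([{"A": 0.5, "B": 0.25}, {"A": 0.25}])
```

The resulting dictionary maps directly onto a bar chart with feature words on one axis and averaged contribution on the other.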

The visualization result presentation unit 26 adds the altitude calculated in the altitude calculation process to the two-dimensional plane coordinates of each node calculated in the two-dimensional coordinate calculation process, obtaining the three-dimensional coordinates of the node representing each piece of media data information, and generates a visualization result in which these are plotted on a three-dimensional graph. Using at least one of the features included in the media data information represented by each node, it also edits the result so that the media data represented by each node can be identified. The right side of FIG. 6 shows an example of a visualization result when the search target media data are picture books, with the three-dimensional graph viewed from a high-altitude position. Some low-altitude nodes are displayed overlapping high-altitude nodes. In this example, the cover image of the picture book, which is one of the features included in the media data information, is pasted onto each node, so that it is clear which picture book each node represents. Features other than the cover image may also be used, for example text showing the title of the picture book.

The visualization result presentation unit 26 may also highlight nodes representing media data with higher similarity to the query, that is, nodes at higher altitude, for example by giving the frame of the cover a darker color.

Although the case of presenting a graph such as that shown in FIG. 6 has been described as an example, the presented graph may be any graph from which the relationships among a plurality of picture books can be grasped, and is not limited to the graph shown in FIG. 6.

The visualization result presentation unit 26 may additionally present the feature words that the retrieved pieces of media data information share with the query, and the contribution of each feature word to the similarity, as obtained in the feature information calculation process. The left side of FIG. 6 shows an example of a bar graph of these feature words and their contributions. In the bar graph, the horizontal axis represents the feature words and the vertical axis represents the contribution.

[Clustering unit 28]
When the user inputs a graph division request, the clustering unit 28 separates the nodes into a plurality of clusters based on the similarity between the pieces of media data information that the nodes represent.

Specifically, the clustering unit 28 introduces the concept of betweenness centrality for the edges of the graph. Edge betweenness centrality is the idea that, over all pairs of nodes in the graph, the edge that appears most often on the shortest paths between the two nodes of a pair is the most central edge of the graph. Taking the graph shown in FIG. 7 as an example, the shortest path between each pair of nodes is as follows.

AB: A → B
AC: A → C
AD: A → B → D
AE: A → B → D → E
AF: A → B → D → F
AG: A → B → D → F → G
BC: B → C
BD: B → D
BE: B → D → E
BF: B → D → F
BG: B → D → F → G
CD: C → B → D
CE: C → B → D → E
CF: C → B → D → F
CG: C → B → D → F → G
DE: D → E
DF: D → F
DG: D → F → G
EF: E → F
EG: E → G
FG: F → G

Among these shortest paths, the edge that appears most often is B → D, which appears 12 times. Pruning this edge separates the graph shown in FIG. 7 into two clusters: the node group A, B, and C, and the node group D, E, F, and G.
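The count above can be checked with a short sketch. It tallies how often each undirected edge occurs in the listed shortest paths, removes the most frequent one, and reports the resulting connected components. The function names `edge_counts` and `components` are illustrative and not part of the patent; the paths are encoded as strings of node letters.

```python
from collections import Counter

# Shortest paths as listed for the FIG. 7 example (one path per node pair).
PATHS = [
    "AB", "AC", "ABD", "ABDE", "ABDF", "ABDFG",
    "BC", "BD", "BDE", "BDF", "BDFG",
    "CBD", "CBDE", "CBDF", "CBDFG",
    "DE", "DF", "DFG", "EF", "EG", "FG",
]

def edge_counts(paths):
    """Count how many of the given paths traverse each undirected edge."""
    counts = Counter()
    for path in paths:
        for u, v in zip(path, path[1:]):
            counts[frozenset((u, v))] += 1
    return counts

def components(nodes, edges):
    """Return the connected components as a list of sorted node lists."""
    adj = {n: set() for n in nodes}
    for edge in edges:
        u, v = tuple(edge)
        adj[u].add(v)
        adj[v].add(u)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        result.append(sorted(comp))
    return result

counts = edge_counts(PATHS)
busiest, hits = counts.most_common(1)[0]        # edge with the highest count
remaining = [e for e in counts if e != busiest]  # prune that edge
clusters = components("ABCDEFG", remaining)
```

Here `busiest` is the edge between B and D, and `clusters` holds the two node groups named in the text.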

However, removing a single edge does not always split the graph into clusters. In that case, after the edge is removed, the number of times each edge appears on the shortest paths is counted again, and the edge with the highest betweenness centrality is pruned. If the graph still does not split, the edge with the highest betweenness centrality continues to be pruned until the graph splits into clusters.
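A minimal sketch of this repeated pruning follows, under some assumptions that the text does not fix: one shortest path per node pair is found by breadth-first search, ties between edges are broken by node label, and the loop stops as soon as the number of connected components increases. The helper names and the example graph (two triangles joined by two bridging edges) are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """Return one shortest path from src to dst found by BFS, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def count_components(adj):
    """Count the connected components of an adjacency-set graph."""
    seen, total = set(), 0
    for start in adj:
        if start in seen:
            continue
        total += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            for nxt in adj[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return total

def split_once(edges):
    """Prune the most-used edge repeatedly until one more cluster appears."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    before = count_components(adj)
    pruned = []
    while count_components(adj) == before:
        counts = Counter()
        for a, b in combinations(sorted(adj), 2):
            path = shortest_path(adj, a, b)
            if path:
                counts.update(frozenset(p) for p in zip(path, path[1:]))
        if not counts:
            break  # no edges left to prune
        u, v = sorted(max(counts, key=lambda e: (counts[e], sorted(e))))
        adj[u].discard(v)
        adj[v].discard(u)
        pruned.append((u, v))
    return pruned, adj

# Two triangles joined by two bridges: one removal is not enough to split.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "E"),
         ("D", "F"), ("E", "F"), ("C", "D"), ("B", "E")]
pruned, adj = split_once(EDGES)
```

In this example both bridging edges have to be pruned before the graph falls apart into the two triangles, which is the situation the paragraph above describes.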

When the user inputs a graph division request, the visualization result presentation unit 26 presents, as the search result found by the media data search unit 24, a graph from which the edges pruned by the clustering unit 28 to separate the clusters have been deleted. For example, as shown on the right side of FIG. 8, a graph separated into clusters above and below the dotted line is presented.

In addition, for each cluster, the visualization result presentation unit 26 obtains the feature words that the media data information of the nodes belonging to that cluster shares with the query, together with the contribution of each feature word to the similarity, and presents these feature words and contributions side by side for each cluster. For example, as shown on the left side of FIG. 8, a bar graph of the feature words and their contributions is presented for each cluster. The bar graph on the left side of FIG. 8 shows that the graph was split into a cluster that has the feature word "○○△△こ" as author information and a cluster that does not. In this way, when a graph is divided in a network structure in which nodes that are highly similar because they share features are connected by edges, the resulting clusters consist of nodes that share the same features.

Each time the user inputs a further graph division request, the clustering unit 28 further separates the nodes into clusters based on the similarity between the pieces of media data information that the nodes represent.

Likewise, each time the user inputs a further graph division request, the visualization result presentation unit 26 presents, as the search result found by the media data search unit 24, a graph from which the edges pruned by the clustering unit 28 to separate the clusters have been deleted. For example, as shown on the right side of FIG. 9, a graph separated into three clusters is presented. As shown on the left side of FIG. 9, a bar graph of the feature words shared with the query and their contributions to the similarity is presented for each of the three clusters.

When the user inputs a graph division request, the visualization result presentation unit 26 may also present a graph from which the edges pruned by the clustering unit 28 have been deleted and in which the display coordinates of at least one node have been changed from its display coordinates before the separation, so that the display coordinates of the nodes are grouped by cluster. For example, the center coordinates of each cluster are adjusted by changing the display coordinates of the nodes that make up the cluster so that the nodes are grouped by cluster. As a result, as shown on the right side of FIG. 10, a graph is presented in which the division into three clusters is easy to see. In the example of FIG. 10, the edge lengths are adjusted so that nodes do not overlap one another. The example of FIG. 10 also shows that the large upper cluster has been divided into two clusters according to whether or not the nodes have the feature word "□□ Bookstore" as publisher information.

When the user inputs a further graph division request, a graph separated into four clusters is presented, as shown on the right side of FIG. 11. As shown on the left side of FIG. 11, a bar graph of the feature words shared with the query and their contributions to the similarity is presented for each of the four clusters.

FIG. 11 shows that, in three of the clusters, most of the contribution comes from, respectively, having the feature word "□□ Bookstore" as publisher information, having the feature word "wolf", and having the feature word "○○△△こ" as author information. In the remaining cluster, the feature words "AAA", "grandma", and "wolf" have relatively high contributions.

The user divides the graph one step at a time and ends the division upon judging that the graph has been divided appropriately. After the graph division has ended, or at an intermediate stage of the division, the screen that lists the searched media data information, as shown in FIG. 5, may be displayed again. In this example, the per-cluster bar graphs on the left side of FIG. 11 are limited to feature words that the query and the top-ranked results (top rankers) have in common, but feature words that only the top rankers have and the query does not may be used as well. When the specified number of top rankers is large, or when the media data DB contains little media data similar to the query, using only the feature words shared by the query and the top rankers leaves the clusters of lower-ranked top rankers with a low contribution for every feature word, which makes it hard for the user to recognize what a cluster has in common. Using feature words that only the top rankers have makes it easier to give each cluster a meaning.

[Designated media data information presentation unit 30]
When the user designates a node on the graph and thereby designates media data information, the designated media data information presentation unit 30 presents to the user, via the display unit 16, at least one of the following: summary information predetermined for the media data information represented by the designated node; the similarity to the query; and the contributions to the similarity of the feature words that the media data information shares with the query.

For example, as shown on the left side of FIG. 12, auxiliary information about the designated media data information is presented. The auxiliary information can include the features of the media data, in particular each item of bibliographic information, the similarity between the query and the media data and its rank, the feature words, and a pie chart showing the contribution of each feature word to the similarity.

[Designated feature information presentation unit 32]
When the user designates a feature word on the pie chart of feature word contributions presented by the designated media data information presentation unit 30, the designated feature information presentation unit 32 presents to the user, via the display unit 16, a graph in which only the nodes representing media data information that has the designated feature word display coarse information extracted from the media data information, in an identifiable, low-information representation. It suffices for this coarse information to carry less information than the summary information of the media data information. For example, as shown in FIGS. 13 to 17, a graph is presented in which the cover of the picture book is displayed only on the nodes representing picture books that have the designated feature word. In the examples of FIGS. 13 and 14, the feature word "AAA" is designated, and the cover is displayed only on the nodes of picture books that have the feature word "AAA". In the example of FIG. 15, the feature word "wolf" is designated, and the cover is displayed only on the nodes of picture books that have the feature word "wolf". In the example of FIG. 16, the feature word "□□ Bookstore" is designated as publisher information, and the cover is displayed only on the nodes of picture books that have that feature word as publisher information. In the example of FIG. 17, the feature word "○○△△こ" is designated as author information, and the cover is displayed only on the nodes of picture books that have that feature word as author information.

<Operation of the information presentation apparatus 10 according to the first embodiment>
Next, the operation of the information presentation apparatus 10 according to the present embodiment will be described. First, when the user selects the media data information to be used as the query on the input screen shown in FIG. 2 and the input of the query is accepted, the information presentation apparatus 10 executes the information presentation processing routine shown in FIGS. 18 and 19.

First, in step S100, the query feature extraction unit 22 accepts the input query, acquires the media data information selected as the query from the search target media data DB 20, and presents it on the display unit 16 as shown in FIG. 3. When execution of the similarity search is instructed, the process proceeds to step S102.

In step S102, the query feature extraction unit 22 extracts features from the media data information serving as the query and passes the extracted query features to the media data search unit 24.

Next, in step S104, the media data search unit 24 uses the query features passed from the query feature extraction unit 22 to acquire, from the search target media data DB 20, media data information with a high similarity to the query. For example, the media data search unit 24 creates one feature vector from the query features extracted by the query feature extraction unit 22 and, for each piece of media data stored in the search target media data DB 20, another feature vector from those features in its media data information that are comparable with the query features. It calculates the cosine similarity between the two vectors as the similarity between the query and that media data, and acquires from the search target media data DB 20 a predetermined number (for example, 30) of pieces of media data information in descending order of similarity.
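The search in step S104 might be sketched as follows, assuming that each feature vector is a sparse mapping from feature word to weight. How the weights are derived is not specified in this passage, and the names `cosine` and `search` are illustrative only.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {feature: weight}."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def search(query, database, count=30):
    """Return up to `count` (similarity, item) pairs, most similar first."""
    scored = [(cosine(query, vec), item) for item, vec in database.items()]
    # Sort by descending similarity; ties are broken by item name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:count]

# Hypothetical toy data: three picture books described by feature words.
query = {"wolf": 1.0, "grandma": 1.0}
database = {
    "book1": {"wolf": 1.0, "grandma": 1.0},
    "book2": {"wolf": 1.0, "pig": 1.0},
    "book3": {"train": 1.0},
}
top = search(query, database, count=2)
```

The default of 30 mirrors the example count given in the text; any other predetermined number works the same way.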

The media data search unit 24 then calculates the similarity between the pieces of media data based on at least one of the features included in each piece of acquired media data information, and passes the acquired media data information, the calculated similarities between the pieces of media data, and the calculated similarities between the query and each piece of media data to the search result presentation unit 25 and the visualization result presentation unit 26. For example, the media data search unit 24 sets up a node for each piece of acquired media data information, connects by an edge each pair of nodes whose media data information has a feature similarity equal to or greater than a predetermined value, stores in each node its media data information and its similarity to the query, and passes the resulting partial network structure to the visualization result presentation unit 26.
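As a rough illustration of the partial network structure, the sketch below connects two results by an edge when their pairwise similarity reaches a threshold. The use of cosine similarity between results, the threshold value, and the names `build_partial_network` and `query_similarity` are assumptions made for the example.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {feature: weight}."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def build_partial_network(results, query_similarity, threshold):
    """Build nodes and threshold edges from the retrieved items.

    results: {item: feature vector}; query_similarity: {item: float}.
    """
    # Each node keeps its media data features and its similarity to the query.
    nodes = {item: {"features": vec, "query_similarity": query_similarity[item]}
             for item, vec in results.items()}
    # An edge joins two items whose mutual similarity reaches the threshold.
    edges = [(a, b) for a, b in combinations(sorted(results), 2)
             if cosine(results[a], results[b]) >= threshold]
    return nodes, edges

results = {
    "book1": {"wolf": 1.0, "grandma": 1.0},
    "book2": {"wolf": 1.0, "pig": 1.0},
    "book3": {"train": 1.0},
}
nodes, edges = build_partial_network(
    results, {"book1": 1.0, "book2": 0.5, "book3": 0.0}, threshold=0.4)
```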

Next, in step S106, based on the media data information and the similarities passed from the media data search unit 24, the search result presentation unit 25 presents to the user, via the display unit 16, a screen such as that shown in FIG. 5 as the search result found by the media data search unit 24. Giving priority to media data information with a high similarity to the query, the screen shows, for each piece of searched media data information, the summary information predetermined for that media data information, its similarity to the query, and the contributions to the similarity of the feature words it shares with the query.

Then, in step S108, it is determined whether the user has input a request to visualize the search result. When the user presses the "graph visualization" button and the visualization request is input, the process proceeds to step S110.

In step S110, as the two-dimensional plane coordinate calculation process, the visualization result presentation unit 26 calculates the two-dimensional plane coordinates of the node representing each piece of media data information, based on the media data information and the similarities between pieces of media data passed from the media data search unit 24, so that nodes whose media data information has a high feature similarity are placed close together and nodes with a low similarity are placed far apart. For example, the visualization result presentation unit 26 treats the partial network structure passed from the media data search unit 24 as a spring model with fixed spring lengths, places nodes connected by an edge close together and nodes not connected by an edge far apart, and obtains the two-dimensional plane coordinates of each node so that all nodes in the partial network structure settle into a stable arrangement. The visualization result presentation unit 26 then passes the calculated two-dimensional plane coordinates of each node, the media data information held by each node, and the similarity between the query and each piece of media data to the altitude calculation process.
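One way to realize such a spring model is sketched below. It is only an illustration under assumed parameters: every node pair is joined by a spring whose rest length is `near` for connected pairs and `far` for unconnected pairs, and plain gradient descent lowers the total spring energy from a circular starting layout. The rest lengths, step size, iteration count, and the name `spring_layout` are not taken from the patent.

```python
import math
from itertools import combinations

def spring_layout(nodes, edges, near=1.0, far=3.0, step=0.02, iters=500):
    """Return (positions, energy_before, energy_after) for a 2D layout."""
    linked = {frozenset(e) for e in edges}
    # Deterministic start: nodes evenly spaced on a unit circle.
    pos = {n: [math.cos(2 * math.pi * i / len(nodes)),
               math.sin(2 * math.pi * i / len(nodes))]
           for i, n in enumerate(nodes)}

    def rest(a, b):
        # Fixed spring length: short if an edge exists, long otherwise.
        return near if frozenset((a, b)) in linked else far

    def energy():
        return sum(0.5 * (math.dist(pos[a], pos[b]) - rest(a, b)) ** 2
                   for a, b in combinations(nodes, 2))

    start = energy()
    for _ in range(iters):
        grad = {n: [0.0, 0.0] for n in nodes}
        for a, b in combinations(nodes, 2):
            d = math.dist(pos[a], pos[b]) or 1e-9  # avoid division by zero
            pull = (d - rest(a, b)) / d
            for k in range(2):
                g = pull * (pos[a][k] - pos[b][k])
                grad[a][k] += g
                grad[b][k] -= g
        for n in nodes:
            for k in range(2):
                pos[n][k] -= step * grad[n][k]
    return pos, start, energy()

# Hypothetical example: two triangles joined by a single edge.
NODES = list("ABCDEF")
EDGES = [("A", "B"), ("A", "C"), ("B", "C"),
         ("D", "E"), ("D", "F"), ("E", "F"), ("C", "D")]
pos, before, after = spring_layout(NODES, EDGES)
```

The "stable arrangement" in the text corresponds here to a low-energy configuration; a production layout would typically add a convergence check instead of a fixed iteration count.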

Next, in step S112, for the node representing each piece of media data information acquired by the media data search unit 24, the visualization result presentation unit 26 calculates an altitude according to the similarity between the query and that media data, as passed from the two-dimensional plane coordinate calculation process. The visualization result presentation unit 26 passes the calculated altitude of each node, together with the two-dimensional plane coordinates of each node, the media data information held by each node, and the similarity between the query and each piece of media data, to the visualization result generation process.

Then, in step S113, as the feature information calculation process, the visualization result presentation unit 26 calculates the feature words that the pieces of media data information found by the media data search unit 24 share with the query, and the contribution of each feature word to the similarity.
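This passage does not restate how the contribution is computed. One plausible reading, given the cosine similarity used in step S104, is to split the cosine similarity into its per-feature-word terms, so that the contributions of the shared feature words sum to the similarity itself. The sketch below follows that assumption; `contributions` is a hypothetical name.

```python
import math

def contributions(query, item):
    """Split the cosine similarity of two {feature: weight} vectors per feature.

    Only features present in both vectors contribute; the returned values
    sum to the cosine similarity of the two vectors.
    """
    norm = (math.sqrt(sum(w * w for w in query.values()))
            * math.sqrt(sum(w * w for w in item.values())))
    if norm == 0:
        return {}
    return {t: query[t] * item[t] / norm for t in query if t in item}

# Hypothetical weights for a query and one retrieved picture book.
parts = contributions({"wolf": 1.0, "grandma": 1.0},
                      {"wolf": 2.0, "pig": 1.0})
```

The resulting per-word values are the kind of quantity that could be drawn as the bar graph or pie chart described in the text.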

Next, in step S114, as the visualization result generation process, the visualization result presentation unit 26 adds the altitude calculated in step S112 to the two-dimensional plane coordinates of each node calculated in step S110 to obtain the three-dimensional coordinates of the node representing each piece of media data information, and generates a visualization result in which these nodes are plotted on a three-dimensional graph. The visualization result presentation unit 26 then edits the visualization result, using at least one of the features included in the media data information represented by each node, so that the media data represented by each node can be identified. For example, when the search target media data is picture books, the cover image of each picture book is pasted onto its node. The visualization result presentation unit 26 also adds a bar graph of the feature words obtained in step S113, which the searched pieces of media data information share with the query, and their contributions to the similarity, and outputs the visualization result so that it is presented on the display unit 16.
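Combining the two-dimensional coordinates with the altitude can be illustrated as below. The linear mapping from similarity to altitude and the `max_altitude` parameter are assumptions; this passage only states that the altitude depends on the similarity to the query.

```python
def to_3d(xy, similarity, max_altitude=1.0):
    """Attach an altitude to each node's 2D position.

    xy: {node: (x, y)}; similarity: {node: similarity to the query}.
    The altitude is assumed to grow linearly with the similarity.
    """
    return {node: (x, y, max_altitude * similarity[node])
            for node, (x, y) in xy.items()}

# Hypothetical positions and similarities for two nodes.
coords = to_3d({"book1": (0.0, 1.0), "book2": (2.0, -1.0)},
               {"book1": 0.5, "book2": 0.25},
               max_altitude=2.0)
```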

Then, in step S116, it is determined whether the user has input a graph division request. When a graph division request is input, the process proceeds to step S118.

In step S118, the clustering unit 28 introduces the concept of betweenness centrality for the edges of the graph and prunes edges based on the similarity between the pieces of media data information that the nodes represent, thereby separating the nodes into a plurality of clusters. The visualization result presentation unit 26 then generates a visualization result and a bar graph that both reflect the separation into clusters by the clustering unit 28, adds the bar graph to the visualization result, and outputs the visualization result so that it is presented on the display unit 16.

Then, in step S120, it is determined whether the user has input a graph division request. When a graph division request is input, the process returns to step S118. When no graph division request is input, the process proceeds to step S122.

In step S122, it is determined whether the user has designated a node on the graph. When a node on the graph is designated with the mouse cursor operated by the user, the process proceeds to step S124. When no node on the graph is designated, the process returns to step S120.

In step S124, the designated media data information presentation unit 30 presents to the user, via the display unit 16, at least one of the following: the summary information predetermined for the media data information represented by the designated node; the similarity to the query; and a pie chart showing the contributions to the similarity of the feature words that the media data information shares with the query.

Next, in step S126, it is determined whether the user has designated a feature word on the pie chart. When a feature word is designated on the pie chart with the mouse cursor operated by the user, the process proceeds to step S128.

In step S128, the designated feature information presentation unit 32 presents to the user, via the display unit 16, a graph in which only the nodes representing media data information that has the designated feature word display coarse information extracted from the media data information, in an identifiable, low-information representation.

In step S130, it is determined whether to end the presentation of the search result. If not, the process returns to step S126. When the user instructs the apparatus to end the presentation of the search result, the information presentation processing routine ends.

As described above, the information presentation apparatus according to the present embodiment presents to the user, as the search result and with priority given to media data information with a high similarity to the query, the summary information, the similarity to the query, and the contributions of the feature words to the similarity. When a request to visualize the search result is input, it presents to the user a graph made up of nodes representing the searched pieces of media data information and edges connecting pieces of media data information that are similar to one another. This makes it possible to present information in a way that is easy for the user to use.

In addition, the apparatus presents to the user, as the search result, a graph made up of nodes representing the searched pieces of media data information and edges connecting pieces of media data information that are similar to one another. When a node of the graph is designated, it presents to the user the summary information of the media data information represented by the designated node, its similarity to the query, and the contributions to the similarity of the feature words of that media data information. This also makes it possible to present information in a way that is easy for the user to use.

Unlike a list-type result display, a graph-type result display makes it easy to understand the relationships among pieces of media data that are highly similar to the query. As a result, unlike media data search based on collaborative filtering, information can be presented in a way that shows why a given piece of media data is recommended (same author, same morphemes).

It also becomes easy to exclude noise that the user does not need, which is difficult with a list-type result display unless the user reads through the contents.

The present invention is not limited to the above embodiment, and various modifications and applications are possible without departing from the gist of the invention.

For example, although the case where the media data to be searched is picture books has been described, the invention is not limited to this. The media data may be data that does not change over time, such as the text and images of a picture book, or time-series data including video. It may also be audio data.

Likewise, although the case where the media data is picture books and the cover is displayed on each graph node as information with less content than the summary information of the media data has been described, the invention is not limited to this. For example, when the media data is books, each graph node may display the cover, or a combination of the cover and the title string. When the media data is CDs, each node may display the jacket cover. When the media data is video, each node may display a thumbnail image, or a moving display such as an animated GIF. When the media data is audio, each node may display a speaker icon (for example, a male or female icon), a group icon, or a musical instrument icon. When the media data is newspaper articles, each node may display the title or the article number.

The case where the graph is divided so that the number of clusters increases by one each time a graph division request is input has been described as an example, but the invention is not limited to this, and cluster separation may be repeated automatically until a predetermined condition is satisfied. For example, a technique called modularity maximization can be used (see Non-Patent Document 1: Newman, M. E. J., "Modularity and community structure in networks", Proceedings of the National Academy of Sciences USA 103 (23), 2006). This technique introduces the modularity Q, a function that, intuitively, becomes larger when a node has more edges to nodes belonging to the same cluster than to nodes belonging to other clusters. Q may be examined each time an edge is pruned, and the edge with the maximum betweenness centrality may be pruned the number of times at which Q reaches its maximum value.
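For reference, the modularity Q of an undirected, unweighted graph is commonly written as below. The symbols are the usual notation from the network-science literature and are not defined in this specification: A_ij is 1 if nodes i and j are joined by an edge and 0 otherwise, k_i is the degree of node i, m is the total number of edges, c_i is the cluster to which node i belongs, and δ is the Kronecker delta.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)
```

The first term counts edges that fall inside clusters, and the second subtracts the number expected if edges were placed at random with the same node degrees, which matches the intuitive description given above.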

<Configuration of Information Presentation Device according to Second Embodiment>
Next, an information presentation device according to the second embodiment will be described. Parts having the same configuration as in the first embodiment are given the same reference numerals, and their description is omitted.

[Calculation unit 214]
As shown in FIG. 20, the calculation unit 214 of the information presentation device 200 according to the second embodiment can be represented functionally by a configuration including a search target media data database 20, a query feature extraction unit 22, a media data search unit 24, a clustering unit 228, a search result presentation unit 225, a visualization result presentation unit 226, a designated media data information presentation unit 30, and a designated feature information presentation unit 32.

[Media data search unit 24]
The media data search unit 24 uses the query features passed from the query feature extraction unit 22 to acquire, from the search target media data DB 20, media data information having a high similarity to the query, and passes it to the clustering unit 228, the search result presentation unit 225, and the visualization result presentation unit 226.
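The search step can be illustrated with a minimal sketch. It assumes that each piece of media data information is represented as a dictionary of feature-word weights and that similarity is cosine similarity; both are assumptions made for illustration, since this section does not restate the similarity measure of the first embodiment. The names `cosine_with_contributions`, `search`, and `database` are hypothetical.

```python
import math

def cosine_with_contributions(query, item):
    """Return (similarity, {feature: contribution}) for two sparse weight dicts.

    Assumption: cosine similarity, whose per-feature terms sum to the total.
    """
    norm = (math.sqrt(sum(w * w for w in query.values()))
            * math.sqrt(sum(w * w for w in item.values())))
    if norm == 0.0:
        return 0.0, {}
    contrib = {f: query[f] * item[f] / norm for f in query if f in item}
    return sum(contrib.values()), contrib

def search(query, database, top_k=3):
    """Return the top_k (similarity, item_id, contributions), most similar first."""
    scored = []
    for item_id, features in database.items():
        sim, contrib = cosine_with_contributions(query, features)
        scored.append((sim, item_id, contrib))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:top_k]

# Hypothetical search target database of feature-word weights.
database = {
    "book_a": {"bear": 1.0, "forest": 1.0},
    "book_b": {"bear": 1.0, "train": 1.0},
    "book_c": {"train": 1.0, "station": 1.0},
}
results = search({"bear": 1.0, "forest": 1.0}, database, top_k=2)
```

With cosine similarity, the contribution of each shared feature word is simply its term in the normalized dot product, so the contributions add up to the similarity itself.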

[Clustering unit 228]
Using the media data information passed from the media data search unit 24 and the similarity between each pair of media data, the clustering unit 228 separates each piece of searched media data information into a plurality of clusters based on the similarity between the pieces of searched media data information. The clustering unit 228 automatically repeats cluster separation until a predetermined condition is satisfied. As described above, for example, it introduces the modularity Q of the technique called modularity maximization (see Non-Patent Document 1: Newman, M. E. J., "Modularity and community structure in networks", Proceedings of the National Academy of Sciences USA 103 (23), 2006), examines Q each time an edge is pruned, and prunes the edge with the maximum betweenness centrality the number of times at which Q reaches its maximum value.
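The pruning procedure described above can be sketched as follows. This is a minimal illustration, not the implementation of the clustering unit 228: it assumes an unweighted, undirected graph with at least one edge and no duplicate edges, computes edge betweenness with a Brandes-style breadth-first search, and keeps the partition with the highest modularity Q seen while pruning. All function names are hypothetical.

```python
from collections import deque

def edge_betweenness(nodes, adj):
    """Edge betweenness for an unweighted, undirected graph (each pair counted twice)."""
    score = {}
    for s in nodes:
        sigma = {v: 0 for v in nodes}   # number of shortest paths from s
        dist = {v: -1 for v in nodes}
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):       # accumulate from the farthest nodes back
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                e = (v, w) if v < w else (w, v)
                score[e] = score.get(e, 0.0) + c
                delta[v] += c
    return score

def components(nodes, adj):
    """Connected components of the current (pruned) graph."""
    seen, comps = set(), []
    for s in nodes:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def modularity(comps, edges, degree):
    """Q of a partition, evaluated on the original edges and degrees."""
    m = len(edges)
    q = 0.0
    for comp in comps:
        inside = sum(1 for u, v in edges if u in comp and v in comp)
        q += inside / m - (sum(degree[v] for v in comp) / (2.0 * m)) ** 2
    return q

def cluster_by_modularity(nodes, edges):
    """Prune the max-betweenness edge repeatedly; return the best partition and its Q."""
    edges = [tuple(sorted(e)) for e in edges]
    adj = {v: set() for v in nodes}
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
        degree[u] += 1
        degree[v] += 1
    best = components(nodes, adj)
    best_q = modularity(best, edges, degree)
    for _ in range(len(edges)):
        score = edge_betweenness(nodes, adj)
        u, v = max(sorted(score), key=score.get)  # sorted() makes ties deterministic
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(nodes, adj)
        q = modularity(comps, edges, degree)
        if q > best_q:
            best_q, best = q, comps
    return best, best_q

# Two triangles joined by a single bridge edge (2, 3).
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters, q = cluster_by_modularity(nodes, edges)
```

In this example the bridge has the highest betweenness and is pruned first, which yields the two triangles as clusters; further pruning only lowers Q, so that partition is the one returned.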

[Search result presentation unit 225]
Based on the media data information passed from the media data search unit 24, the clustering result of the clustering unit 228, and the similarity between each pair of media data, the search result presentation unit 225 presents the search result of the media data search unit 24 to the user via the display unit 16. For each cluster, and for each piece of media data information belonging to that cluster, it presents information including at least the summary information predetermined for the media data information, giving priority to media data information having a high similarity to the query. For example, the display unit 16 presents to the user a screen showing the summary information, the similarity to the query, and the contribution to the similarity of each feature word that has similarity to the query.
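The cluster-by-cluster presentation can be sketched as a simple grouping and sorting step. The record layout (`id`, `similarity`, `summary`), the text output, and the function name `present_by_cluster` are assumptions for illustration; the actual screen layout of the display unit 16 is not specified here.

```python
def present_by_cluster(results, cluster_of):
    """Group search results by cluster, most similar first within each cluster.

    results: list of dicts with keys "id", "similarity", "summary" (hypothetical layout).
    cluster_of: dict mapping item id -> cluster label.
    """
    grouped = {}
    for r in results:
        grouped.setdefault(cluster_of[r["id"]], []).append(r)
    lines = []
    for label in sorted(grouped):
        lines.append(f"Cluster {label}")
        for r in sorted(grouped[label], key=lambda r: -r["similarity"]):
            lines.append(f"  {r['id']} (similarity {r['similarity']:.2f}): {r['summary']}")
    return lines

results = [
    {"id": "book_b", "similarity": 0.50, "summary": "A bear rides a train."},
    {"id": "book_a", "similarity": 0.90, "summary": "A bear lives in a forest."},
    {"id": "book_c", "similarity": 0.40, "summary": "A train leaves the station."},
]
cluster_of = {"book_a": 0, "book_b": 0, "book_c": 1}
screen = present_by_cluster(results, cluster_of)
```

Here `screen` lists cluster 0 with book_a before book_b, followed by cluster 1 with book_c.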

[Visualization result presentation unit 226]
When the user presses the "graph visualization" button provided for each cluster and a search result visualization request is input, the visualization result presentation unit 226 generates and presents, for the corresponding cluster, a graph from which the relationships among a plurality of picture books can be grasped, as in the first embodiment, based on the media data information having a high similarity to the query passed from the media data search unit 24. For example, taking as input the media data information passed from the media data search unit 24 and the similarity between each pair of media data, it performs the two-dimensional plane coordinate calculation process, the altitude calculation process, the feature information calculation process, and the visualization result generation process described in the first embodiment, and thereby generates and presents a graph of the media data information belonging to the corresponding cluster.

When the user presses the "graph visualization" button corresponding to all clusters and a search result visualization request is input, the visualization result presentation unit 226 generates and presents, as in the first embodiment, a graph from which the relationships among a plurality of picture books can be grasped, based on the media data information having a high similarity to the query passed from the media data search unit 24. For example, as described in the first embodiment, taking as input the media data information passed from the media data search unit 24 and the similarity between each pair of media data, it performs the two-dimensional plane coordinate calculation process, the altitude calculation process, the feature information calculation process, and the visualization result generation process, and generates and presents a graph from which the edges pruned by the clustering unit 228 to separate the clusters have been deleted. For example, the graph shown on the right side of FIG. 11 is presented.
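Deleting the edges that cross cluster boundaries can be sketched in a few lines. The edge list, the `cluster_of` mapping, and the function name `drop_inter_cluster_edges` are hypothetical and stand in for the output of the clustering unit 228.

```python
def drop_inter_cluster_edges(edges, cluster_of):
    """Keep only edges whose two endpoints belong to the same cluster."""
    return [(u, v) for u, v in edges if cluster_of[u] == cluster_of[v]]

edges = [("a", "b"), ("b", "c"), ("c", "d")]
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}
kept = drop_inter_cluster_edges(edges, cluster_of)
```

In this example the edge between "b" and "c" joins different clusters, so only the edges ("a", "b") and ("c", "d") remain in the displayed graph.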

The other configurations and operations of the information presentation device 200 according to the second embodiment are the same as those of the first embodiment, and their description is omitted.

The present invention is likewise not limited to the above embodiment, and various modifications and applications are possible without departing from the gist of the invention.

For example, as in the first embodiment, when a further graph division request is input by the user, each of the nodes may be further separated into clusters based on the similarity between the media data information indicated by each of the nodes, and a graph from which the edges pruned to separate the clusters have been deleted may be presented. In this case, as in the first embodiment, a graph may be presented in which the display coordinates of at least one node have changed from its display coordinates before separation by clustering, so that the display coordinates of the nodes are separated cluster by cluster.
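One way the display coordinates could change so that clusters are visually separated is sketched below. The rule used here, moving each cluster away from the overall centroid in proportion to its own offset, is purely an assumption for illustration; this section does not state how the new coordinates are computed. The function name `separate_clusters` and the `spread` parameter are hypothetical.

```python
def separate_clusters(pos, cluster_of, spread=1.0):
    """Shift every node of a cluster away from the overall centroid.

    pos: dict mapping node -> (x, y) display coordinates before separation.
    cluster_of: dict mapping node -> cluster label.
    spread: hypothetical factor controlling how far the clusters move apart.
    """
    n = len(pos)
    gx = sum(x for x, _ in pos.values()) / n
    gy = sum(y for _, y in pos.values()) / n
    members = {}
    for node, label in cluster_of.items():
        members.setdefault(label, []).append(node)
    new_pos = {}
    for label, group in members.items():
        cx = sum(pos[v][0] for v in group) / len(group)
        cy = sum(pos[v][1] for v in group) / len(group)
        dx, dy = spread * (cx - gx), spread * (cy - gy)
        for v in group:
            new_pos[v] = (pos[v][0] + dx, pos[v][1] + dy)
    return new_pos

pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "d": (3.0, 0.0)}
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}
moved = separate_clusters(pos, cluster_of, spread=1.0)
```

Nodes within a cluster keep their relative layout, while the gap between the two clusters widens from 1.0 to 3.0 along the x axis.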

Although this specification describes embodiments in which the program is installed in advance, a program stored in an external storage device, a recording medium, or the like may be read as needed, or downloaded via the Internet, and then executed. The program may also be provided stored in a computer-readable recording medium.

10, 200 Information presentation device
12 Input unit
14, 214 Calculation unit
16 Display unit
20 Search target media data database
22 Query feature extraction unit
24 Media data search unit
25, 225 Search result presentation unit
26, 226 Visualization result presentation unit
28, 228 Clustering unit
30 Designated media data information presentation unit
32 Designated feature information presentation unit

Claims

A media data search unit for searching a plurality of media data information based on a similarity to a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data;
A search result presentation unit that presents to the user, as a search result searched by the media data search unit and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each of the searched plurality of media data information;
A visualization result presentation unit that, when a search result visualization request is input by the user, presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity, together with each feature having similarity to the query included in the searched plurality of media data information and a contribution degree of each feature to the similarity;
An information presentation device including the above units.

A media data search unit for searching a plurality of media data information based on a similarity to a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data;
A search result presentation unit that presents to the user, as a search result searched by the media data search unit and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each of the searched plurality of media data information;
A visualization result presentation unit that, when a search result visualization request is input by the user, presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity;
A clustering unit that, when a graph division request is input by the user, separates each of the nodes into a plurality of clusters based on the similarity between the media data information indicated by each of the nodes;
An information presentation device including the above units, wherein, when the graph division request is input by the user, the visualization result presentation unit presents, as a search result searched by the media data search unit, the graph from which edges between nodes separated into different clusters by the clustering unit have been deleted, and presents side by side, for each of the plurality of clusters, each feature having similarity to the query included in the media data information indicated by the nodes belonging to the cluster and the contribution degree of each feature to the similarity.

The information presentation device according to claim 1, further including a clustering unit that separates each of the searched plurality of media data information into a plurality of clusters based on the similarity between the searched plurality of media data information,
Wherein the search result presentation unit presents to the user, as the search result searched by the media data search unit, for each of the plurality of clusters and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each piece of media data information belonging to the cluster.

The information presentation device according to claim 2 or 3, wherein, when the graph division request is input by the user, the visualization result presentation unit presents, as a search result searched by the media data search unit, the graph from which edges between nodes separated into different clusters by the clustering unit have been deleted and in which the display coordinates of at least one node have changed from the display coordinates of that node before separation by the clustering unit.

A media data search unit for searching a plurality of media data information based on a similarity to a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data;
A clustering unit for separating each of the searched plurality of media data information into a plurality of clusters based on the similarity between the searched plurality of media data information;
A search result presentation unit that presents to the user, as a search result searched by the media data search unit, for each of the plurality of clusters and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each piece of media data information belonging to the cluster;
A visualization result presentation unit that, when a search result visualization request is input by the user, presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity;
An information presentation device including the above units, wherein, when a graph division request is input by the user, the visualization result presentation unit presents, as a search result searched by the media data search unit, the graph from which edges between nodes separated into different clusters by the clustering unit have been deleted and in which the display coordinates of at least one node have changed from the display coordinates of that node before separation by the clustering unit.

The information presentation device according to claim 4 or 5, wherein, when the graph division request is input by the user, the visualization result presentation unit presents, as a search result searched by the media data search unit, the graph from which edges between nodes separated into different clusters by the clustering unit have been deleted and in which the display coordinates of at least one node have changed from the display coordinates of that node before separation by the clustering unit, so that the display coordinates of the nodes are separated cluster by cluster.

A media data search unit for searching a plurality of media data information based on a similarity to a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data;
A visualization result presentation unit that presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity;
A designated media data information presentation unit that, when a node of the graph is designated by the user, presents to the user information including at least summary information predetermined for the media data information indicated by the designated node;
A designated feature information presentation unit that, when a feature having similarity to the query is designated by the user, presents to the user the graph in which, only for the nodes indicating media data information having the designated feature, rough information extracted from the media data information is displayed, the rough information being identifiable information in a low-information-amount representation;
An information presentation device including the above units.

The information presentation device according to claim 7, wherein, when a feature having similarity to the query is designated by the user, the designated feature information presentation unit presents to the user the graph in which, only for the nodes indicating media data information having the designated feature, information that is identifiable in a low-information-amount representation and has a smaller amount of information than the summary information is displayed as the rough information extracted from the media data information.

The information presentation device according to any one of claims 1 to 8, wherein the media data is a picture book.

An information presentation method in an information presentation device including a media data search unit, a search result presentation unit, and a visualization result presentation unit,
The media data search unit searches a plurality of media data information based on a similarity with a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data,
The search result presentation unit presents to the user, as a search result searched by the media data search unit and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each of the searched plurality of media data information,
When a search result visualization request is input by the user, the visualization result presentation unit presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity, together with each feature having similarity to the query included in the searched plurality of media data information and the contribution degree of each feature to the similarity.

An information presentation method in an information presentation device including a media data search unit, a search result presentation unit, a visualization result presentation unit, and a clustering unit,
The media data search unit searches a plurality of media data information based on a similarity with a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data,
The search result presentation unit presents to the user, as a search result searched by the media data search unit and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each of the searched plurality of media data information,
When a search result visualization request is input by the user, the visualization result presentation unit presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity,
When a graph division request is input by the user, the clustering unit separates each of the nodes into a plurality of clusters based on the similarity between the media data information indicated by each of the nodes,
When the graph division request is input by the user, the visualization result presentation unit presents, as a search result searched by the media data search unit, the graph from which edges between nodes separated into different clusters by the clustering unit have been deleted, and presents side by side, for each of the plurality of clusters, each feature having similarity to the query included in the media data information indicated by the nodes belonging to the cluster and the contribution degree of each feature to the similarity.

An information presentation method in an information presentation device including a media data search unit, a search result presentation unit, a visualization result presentation unit, and a clustering unit,
The media data search unit searches a plurality of media data information based on a similarity with a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data,
The clustering unit separates each of the searched media data information into a plurality of clusters based on the similarity between the searched media data information,
The search result presentation unit presents to the user, as a search result searched by the media data search unit, for each of the plurality of clusters and with priority given to media data information having a high similarity to the query, information including at least summary information predetermined for the media data information, for each piece of media data information belonging to the cluster,
When a search result visualization request is input by the user, the visualization result presentation unit presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity,
When a graph division request is input by the user, the visualization result presentation unit presents, as a search result searched by the media data search unit, the graph from which edges between nodes separated into different clusters by the clustering unit have been deleted and in which the display coordinates of at least one node have changed from the display coordinates of that node before separation by the clustering unit.

An information presentation method in an information presentation device including a media data search unit, a visualization result presentation unit, a designated media data information presentation unit, and a designated feature information presentation unit,
The media data search unit searches a plurality of media data information based on a similarity with a query input by a user from a plurality of media data information including at least one common type of feature regarding the media data,
The visualization result presentation unit presents to the user, as a search result searched by the media data search unit, a graph composed of nodes indicating each of the searched plurality of media data information and edges connecting media data information having similarity,
When a node of the graph is designated by the user, the designated media data information presentation unit presents to the user information including at least summary information predetermined for the media data information indicated by the designated node,
When a feature having similarity to the query is designated by the user, the designated feature information presentation unit presents to the user the graph in which, only for the nodes indicating media data information having the designated feature, rough information extracted from the media data information is displayed, the rough information being identifiable information in a low-information-amount representation.

A program for causing a computer to function as each unit of the information presentation device according to any one of claims 1 to 9.