JP2009075777A - Document processing system and method

Info

Publication number: JP2009075777A
Application number: JP2007243002A
Authority: JP
Inventors: Tetsuya Sakai; 哲也酒井
Original assignee: NewsWatch Inc
Current assignee: NewsWatch Inc
Priority date: 2007-09-19
Filing date: 2007-09-19
Publication date: 2009-04-09

Abstract

PROBLEM TO BE SOLVED: To provide a graph structure in which the node representing one document and the node representing another document can be traced in both directions, even when the one document references the other document but the other document does not reference the one document.

SOLUTION: For each of a plurality of documents containing reference information to other documents, a graph structure is generated that includes a central node corresponding to the document and peripheral nodes corresponding to the other documents linked to it by the reference information. The central node has a node name representing the document, and each peripheral node has a node name representing the corresponding other document. The plurality of graph structures corresponding to the respective documents are then searched. Where a second graph structure has as its central node a first document corresponding to a first peripheral node of a first graph structure, and does not include as a peripheral node a second document corresponding to the central node of the first graph structure, a second peripheral node having a node name representing the second document is added to the second graph structure.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a document processing system for generating and displaying a graph structure that represents reference relationships between documents.

In recent years, the volume of electronic document data with mutual reference relationships, such as the World Wide Web, Wikipedia (registered trademark), and weblogs, has continued to grow.

A known method for presenting reference relationships between documents to the user in an easily understood form is to depict each document as a node and the reference relationships between documents as a graph.

For example, Patent Document 1 discloses a technique in which a graph structure composed of clickable nodes is presented to the user, the node that the user focuses on and clicks (the "focus" entity) is repositioned at the center of the graph, and detailed information on that node is expanded on the screen.

Non-Patent Document 1 proposes a framework that extends Wikipedia, an encyclopedia that any Internet user can edit, so that authors explicitly record in the text (for example, the text for England) knowledge such as "the text London and the text England are connected by the relationship 'is capital of'". An example of a graph display based on this framework is also illustrated there.

One way to display documents with reference information as a graph at low computational cost is to analyze a single document, generate a single graph structure with that document as the central node (the "focus" entity), and present it to the user. That is, where N is the total number of documents (nodes), rather than checking every pair of nodes (N(N-1)/2 combinations) for a reference relationship and building one huge graph structure containing all nodes, the documents are analyzed separately N times to prepare N separate graph structures.
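As a rough illustration of the difference in scale, the following Python sketch compares the number of document pairs that an all-pairs check would examine with the number of per-document analyses. The function name `pair_count` and the collection size of 1,000 are assumptions chosen for illustration only.

```python
def pair_count(n: int) -> int:
    """Number of unordered document pairs among n documents: n(n-1)/2."""
    return n * (n - 1) // 2

n_documents = 1000                      # hypothetical collection size
all_pairs = pair_count(n_documents)     # pairs an all-pairs check would examine
per_document = n_documents              # analyses needed for N separate graphs
print(all_pairs, per_document)          # 499500 1000
```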

One conceivable application is to present, to a user who has searched with the keyword "actress E", the graph structure obtained by analyzing the document about "actress E".

For example, if the document about actress E contains a reference to "male talent G", then "actress E" becomes the central node of the graph structure, "male talent G" becomes a peripheral node, and an arrow representing the reference relationship from the central node to the peripheral node is displayed. If the user then clicks the "male talent G" node, a new graph structure with "male talent G" as the central node, obtained by analyzing the document about "male talent G", can be presented.

Patent Document 1: JP 2005-122703 A

Non-Patent Document 1: "Volkel, et al.: Semantic Wikipedia, Proceedings of ACM WWW 2006, Edinburgh, Scotland, 2006"

In the clickable graph presentation interface described above, if, for example, the document corresponding to the node clicked by the user is displayed in a separate window and an advertisement is presented in that window, a company whose business depends on advertising revenue earns more the more nodes the user clicks. It is therefore desirable to let the user traverse the graph structure freely and click as many nodes as possible.

However, the approach of analyzing a single document to obtain a single graph structure, as described above, causes a problem when a reference relationship is not bidirectional.

Specifically, in the example of "actress E" and "male talent G" above, the document about "male talent G" may contain no reference to "actress E". This occurs frequently where, as in Wikipedia, multiple authors each write text according to their own policies.

In this case, the graph structure with "male talent G" as the central node does not include an "actress E" node. Consequently, when the user clicks the peripheral node "male talent G" in the graph structure centered on "actress E" and moves to the graph structure centered on "male talent G", the user has no means of returning to the "actress E" node. Because this is counterintuitive, it may discourage the user from clicking further.
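The dead end described here can be reproduced with a toy model. The sketch below is illustrative only: the `links` dictionary, the helper `build_star`, and the dictionary layout are assumptions, not structures taken from the specification.

```python
# Toy link data (hypothetical): E references G, but G does not reference E.
links = {
    "actress E": ["actress Y", "actor S", "male talent G"],
    "male talent G": ["male talent K", "actor T", "band P"],
}

def build_star(center: str) -> dict:
    """Build a per-document graph: one central node plus its outgoing links."""
    return {"center": center, "peripherals": list(links.get(center, []))}

graph_e = build_star("actress E")
graph_g = build_star("male talent G")

# The user can move from E's graph to G's graph ...
can_go_forward = "male talent G" in graph_e["peripherals"]
# ... but G's graph offers no node that leads back to E.
can_go_back = "actress E" in graph_g["peripherals"]
print(can_go_forward, can_go_back)  # True False
```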

Thus, conventionally, when one document references another document but the other document does not reference the one document, the graph structure with the other document as its central node lacks a peripheral node representing the one document. As a result, from the graph structure centered on the other document, the user can neither recognize the existence of the one document (which references the other document) nor move to the graph structure centered on the one document.

In view of the above problem, an object of the present invention is to provide a graph structure in which the node representing one document and the node representing another document can be traced in both directions, even when the one document references the other document but the other document does not reference the one document.

(1) A document processing system includes:
document collection means for collecting a plurality of documents containing reference information to other documents;
first storage means for storing the collected plurality of documents;
generation means for generating, for each document stored in the first storage means, a graph structure that includes a central node corresponding to the document and peripheral nodes corresponding to other documents linked to the document by the reference information, the central node having a node name representing the document and each peripheral node having a node name representing the corresponding other document;
second storage means for storing a plurality of graph structures corresponding to the respective documents;
graph structure complementing means for adding a second peripheral node having a node name representing a second document to a second graph structure that, among the plurality of graph structures stored in the second storage means, includes as its central node a first document corresponding to a first peripheral node in a first graph structure, and does not include as a peripheral node the second document, which corresponds to the central node of the first graph structure;
first search means for retrieving a graph structure from the second storage means; and
first display means for displaying the retrieved graph structure.

(2) The graph structure complementing means adds to the second graph structure a second peripheral node having reference information to the first graph structure and a node name representing the second document.

(3) Each peripheral node in a graph structure has reference information to another graph structure whose central node is the document corresponding to that peripheral node.

(4) Based on the node name or the reference information of a peripheral node selected from the graph structure displayed by the first display means, the first search means retrieves from the second storage means another graph structure whose central node is the document corresponding to that peripheral node.
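Items (3) and (4) can be pictured as a lookup that follows the reference information held by the selected peripheral node and falls back to its node name. The following is a minimal sketch under stated assumptions: the in-memory `store`, the graph IDs such as "g:E", the dictionary keys, and the function `follow` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical store standing in for the second storage means:
# graph ID -> graph structure. Each peripheral node carries its node name and
# a "link" holding the ID of the graph structure centered on that document.
store: Dict[str, dict] = {
    "g:E": {"center": "actress E",
            "peripherals": [{"name": "male talent G", "link": "g:G"}]},
    "g:G": {"center": "male talent G",
            "peripherals": [{"name": "actress E", "link": "g:E"}]},
}

def follow(store: Dict[str, dict], graph_id: str, node_name: str) -> Optional[dict]:
    """Return the graph structure centered on the selected peripheral node."""
    for node in store[graph_id]["peripherals"]:
        if node["name"] != node_name:
            continue
        if node.get("link") in store:      # look up by reference information
            return store[node["link"]]
        for graph in store.values():       # otherwise look up by node name
            if graph["center"] == node_name:
                return graph
    return None                            # node not present, or no graph stored

next_graph = follow(store, "g:E", "male talent G")
back_graph = follow(store, "g:G", "actress E")
```

In this toy store the "actress E" node inside "g:G" plays the role of the added second peripheral node, so the user can move in both directions.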

The invention can provide a graph structure in which the node representing one document and the node representing another document can be traced in both directions, even when the one document references the other document but the other document does not reference the one document.

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 shows a configuration example of a document processing system 1 according to the present embodiment.

In FIG. 1, the document processing system 1 broadly includes a graph structure generation unit 100, a search unit 101, a first storage unit 102, a second storage unit 103, and a third storage unit 104.

The graph structure generation unit 100 includes a document collection unit 11, a document analysis unit 12, and a complementing unit 13. The search unit 101 includes a search request processing unit 14, a first search unit 15, a second search unit 16, a third search unit 17, and a search result presentation unit 18.

FIG. 1 shows the document processing system 1 configured as a single server device connected to a network NW, but the system is not limited to this configuration and may be composed of a plurality of server devices.

The document collection unit 11 is composed of, for example, a crawler such as those used by search engines. It collects from the network NW documents in which reference information to other documents is embedded (documents with reference information), such as documents on the World Wide Web (WWW) or Wikipedia (registered trademark) documents, and stores them in the first storage unit 102. Here, reference information is information for identifying an information resource, such as a URI (Uniform Resource Identifier) or URL (Uniform Resource Locator). Alternatively, the document collection unit 11 may have documents with reference information delivered periodically or as needed from a specific site on the network NW, receive them, and store them in the first storage unit 102.

The first storage unit 102 stores the plurality of collected documents, each together with identification information (for example, a URI or URL) that uniquely identifies the document.

The document analysis unit 12 analyzes each document collected by the document collection unit 11 and stored in the first storage unit 102, generates graph structure data representing the graph structure corresponding to that document, and stores the data in the second storage unit 103.

One item of graph structure data is generated for each document stored in the first storage unit 102. Each item of graph structure data describes, in a structured document description language such as XML (Extensible Markup Language), a graph structure that includes a central node representing the document, peripheral nodes representing the other documents linked to the document by reference information, and edges from the central node to the peripheral nodes.

For each document stored in the first storage unit 102, the document analysis unit 12 generates the above graph structure data with that document as the central node. Each item of graph structure data is stored in the second storage unit 103 together with identification information (for example, a URI or URL) that uniquely identifies the graph structure data, and with the identification information of the document that, among the plurality of documents stored in the first storage unit 102 (and the third storage unit 104), corresponds to the graph structure data (that is, to its central node).

Details of the document analysis processing performed by the document analysis unit 12 will be described later.

For each item of graph structure data stored in the second storage unit 103, the complementing unit 13 adds a description of a peripheral node representing another document that is not linked from the document serving as the central node of that graph structure but that itself links to that document. Details of the complementing processing performed by the complementing unit 13 will be described later.
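The effect of the complementing processing can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the embodiment: graph structures are reduced to a mapping from a central node name to a list of peripheral node names, and the function name `complement` is hypothetical.

```python
from typing import Dict, List

def complement(graphs: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of `graphs` in which every one-way reference also has a
    peripheral node pointing back.

    `graphs` maps a central node name to the names of its peripheral nodes.
    """
    completed = {center: list(nodes) for center, nodes in graphs.items()}
    for second_doc, nodes in graphs.items():       # central node of the first graph
        for first_doc in nodes:                    # first peripheral node
            target = completed.get(first_doc)      # second graph, centered on first_doc
            if target is None:
                continue                           # no graph stored for that document
            if second_doc not in target:
                target.append(second_doc)          # add the second peripheral node
    return completed

graphs = {
    "actress E": ["actress Y", "male talent G"],
    "male talent G": ["male talent K"],
}
completed = complement(graphs)
print(completed["male talent G"])  # ['male talent K', 'actress E']
```

The input mapping is left unchanged, so the sketch can be rerun whenever new documents are analyzed.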

The search request processing unit 14 accepts a search request from a user terminal 2 connected to the network NW and activates the first search unit 15, the second search unit 16, and the third search unit 17.

The first search unit 15 retrieves graph structure data from the second storage unit 103 based on a keyword or the like.

The second search unit 16 retrieves from the first storage unit 102 the document corresponding to graph structure data (that is, to its central node).

The third storage unit 104 stores all documents collected from the network NW, together with their indexes.

The third search unit 17 retrieves documents from the third storage unit 104 based on a keyword or the like.

Although FIG. 1 shows the third storage unit 104 separately from the first storage unit 102, which stores the documents from which graph structure data is generated, the third storage unit 104 is not essential; the first storage unit 102 alone suffices. In that case, the third search unit 17 retrieves documents from the first storage unit 102 based on a keyword, and the graph structure generation unit 100 generates graph structure data for all or some of the documents stored in the first storage unit 102.

In the configuration shown in FIG. 1, the document collection unit 11 may collect from the network NW only the documents from which graph structure data is to be generated, and the first storage unit 102 may store only those documents.

The search result presentation unit 18 returns to the requesting user terminal 2 the graph structure data retrieved by the first search unit 15, the document retrieved by the second search unit 16, and the documents or document list retrieved by the third search unit 17. The browser on the user terminal 2 then displays the search results on the display of the user terminal 2.

Next, the processing operation of the document analysis unit 12 will be described with reference to the flowchart shown in FIG. 2.

The document analysis unit 12 retrieves documents with reference information one at a time from the first storage unit 102 and performs the following processing on every document to be processed that is stored in the first storage unit 102.

First, the document analysis unit 12 initializes a variable i to 1 (step S1). Next, the document analysis unit 12 retrieves the i-th document (initially i = 1) from among the plurality of documents to be processed stored in the first storage unit 102, and extracts from the i-th document a character string to be used as the node name of the central node (step S2). The node name of the central node is, for example, the title, headline, or heading of the i-th document.

Next, the document analysis unit 12 detects, from the i-th document, other documents linked to it by reference information (step S3). When another document linked to the i-th document is detected, the character string or image in the i-th document in which the reference information to that other document is embedded is extracted as the node name of the peripheral node representing that other document (step S4).

Further, a character string to be used as the label attached to the edge from the central node to the peripheral node is extracted from the i-th document (step S5).

For example, context information such as the character strings within a fixed range centered on the location in the i-th document where the reference information to the other document is embedded (for example, the character string in which the reference information is embedded together with the character strings immediately before and after it) is extracted as the label attached to the edge to that other document. Alternatively, the title of the chapter or section containing the character string or image in which the reference information is embedded is extracted as the label attached to the edge to that other document.
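The first labeling option, taking the text within a fixed range around the embedded reference, might look like the sketch below. The function name `context_label`, the character-based window, and its size are assumptions; the text specifies only "a fixed range".

```python
def context_label(text: str, start: int, end: int, window: int = 10) -> str:
    """Return text[start:end] (the anchor) plus up to `window` characters
    on each side, clamped to the bounds of the document."""
    left = max(0, start - window)
    right = min(len(text), end + window)
    return text[left:right]

# Hypothetical sentence from a document about actress E.
text = "She co-starred with [[male talent G]] in a drama."
anchor = "[[male talent G]]"
start = text.index(anchor)
label = context_label(text, start, start + len(anchor), window=5)
print(label)  # with [[male talent G]] in a
```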

The document analysis unit 12 then generates the graph structure data corresponding to the i-th document, which includes the node name of the central node representing the i-th document, the node names of the peripheral nodes representing the other documents linked to the i-th document, and the labels of the edges from the central node to the peripheral nodes, all obtained as described above. This graph structure data describes, in a structured document description language such as XML (Extensible Markup Language), a graph structure that includes the central node having a node name representing the i-th document, the peripheral nodes having node names representing the other documents linked to the i-th document, the edges from the central node to the peripheral nodes, and the edge labels. The document analysis unit 12 stores the generated graph structure data in the second storage unit 103 (step S7) and increments the variable i by one (step S8).

Steps S2 to S8 above are repeated until the value of the variable i exceeds the total number of documents to be processed stored in the first storage unit 102 (step S9).
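The control flow of this loop can be sketched in Python. The sketch assumes that each document has already been parsed into a dictionary with hypothetical keys "id", "title", and "references" (each reference holding an "anchor" and a "context"), and that a plain dictionary stands in for the second storage unit 103; none of these names come from the specification.

```python
from typing import Dict, List

def analyze_all(documents: List[dict]) -> Dict[str, dict]:
    """Generate one graph structure per document, loosely following S1 to S9."""
    storage: Dict[str, dict] = {}            # stands in for the second storage unit
    i = 1                                    # S1: initialize the counter
    while i <= len(documents):               # S9: stop once i exceeds the document count
        doc = documents[i - 1]
        center_name = doc["title"]           # S2: node name of the central node
        peripherals = []
        for ref in doc["references"]:        # S3: each linked document
            peripherals.append({
                "name": ref["anchor"],       # S4: node name of the peripheral node
                "label": ref["context"],     # S5: label of the edge to that node
            })
        graph = {"center": center_name, "peripherals": peripherals}
        storage[doc["id"]] = graph           # S7: store the graph structure data
        i += 1                               # S8: move to the next document
    return storage

docs = [
    {"id": "doc:E", "title": "actress E",
     "references": [{"anchor": "male talent G", "context": "co-starred with male talent G"}]},
    {"id": "doc:G", "title": "male talent G", "references": []},
]
result = analyze_all(docs)
```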

FIG. 3 shows an example of a document to be processed in the document analysis processing of FIG. 2 (a document stored in the first storage unit 102). It is a document with reference information that describes information about "actress E".

In the document shown in FIG. 3, other documents are linked to (that is, reference information is embedded in) character strings such as "actress Y", "actor S", and "male talent I". In FIG. 3, a character string to which another document is linked is enclosed in "[[" and "]]"; Wikipedia (registered trademark) in fact marks character strings with embedded reference information in this format. Because a character string or image in which reference information is embedded is enclosed in tags or symbols such as "[[" and "]]", as shown in FIG. 3, it is easy to detect, from the whole document, the documents corresponding to peripheral nodes and the character strings to be used as their node names.
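Detection of this markup can be sketched with a regular expression. The sample sentence below is hypothetical (the actual content of FIG. 3 is not reproduced here), and dropping repeated links is a design choice of this sketch rather than something the text prescribes.

```python
import re
from typing import List

# Matches a character string enclosed in "[[" and "]]" on a single line.
LINK_PATTERN = re.compile(r"\[\[(.+?)\]\]")

def linked_names(text: str) -> List[str]:
    """Return linked character strings in order of first appearance, without duplicates."""
    seen: List[str] = []
    for match in LINK_PATTERN.finditer(text):
        name = match.group(1).strip()
        if name not in seen:
            seen.append(name)
    return seen

sample = ("[[actress Y]] is her sister. She co-starred with [[actor S]] "
          "and [[male talent G]]; [[actor S]] appeared again.")
names = linked_names(sample)
print(names)  # ['actress Y', 'actor S', 'male talent G']
```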

From the document shown in FIG. 3, "actress E", which corresponds to the title, headline, or heading of the document, is extracted as the node name of the central node, and "actress Y", "actor S", and "male talent G" are each extracted as node names of peripheral nodes. In addition, the character string containing "actress Y" in the document is extracted as the label of the edge from the central node to the peripheral node "actress Y", the character string containing "actor S" is extracted as the label of the edge from the central node to the peripheral node "actor S", and the character string containing "male talent G" is extracted as the label of the edge from the central node to the peripheral node "male talent G".

As a result, a graph structure with the central node "actress E", as shown in FIG. 4, is obtained from the document of FIG. 3. In FIG. 4, "actress E" is the central node, and "actress Y", "actor S", and "male talent G", which are linked from that document, are represented as peripheral nodes. Each edge between nodes carries as its label either context information that the document analysis unit 12 cut out of the "actress E" document (such as the character strings around the location where the reference information is embedded) or the title of a chapter or section.

The document analysis unit 12 generates, from the document of FIG. 3, graph structure data representing the graph structure shown in FIG. 4. By applying a style sheet or the like prepared in advance to this graph structure data, the graph structure shown in FIG. 4 is displayed on the display of the user terminal 2.

In FIG. 4, one label is attached to each edge, but the arrangement is not limited to this. The document analysis unit 12 may extract a plurality of items of context information per edge from the document being processed as labels, and attach to the edge either a plurality of labels or a single label containing the plurality of items of context information. Furthermore, two kinds of labels, a context information label and a chapter or section title label, may be attached to a single edge.

When many documents are linked from one document, the resulting graph structure contains many peripheral nodes and may be hard for the user to read when displayed on the user terminal 2. To allow for such cases, the document analysis unit 12 may limit the number of peripheral nodes linked to one central node to a predetermined number. For example, among the documents linked from the document corresponding to the central node, only the first predetermined number of documents, in order of appearance from the beginning of the document, are adopted as peripheral nodes.
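The "first predetermined number in order of appearance" rule amounts to truncating the ordered list of linked documents. A minimal sketch, in which the function name `limit_peripherals` and the limit of 3 are assumptions:

```python
from typing import List

def limit_peripherals(names: List[str], max_nodes: int) -> List[str]:
    """Keep only the first `max_nodes` linked documents, in order of appearance."""
    if max_nodes < 0:
        raise ValueError("max_nodes must be non-negative")
    return list(names[:max_nodes])

# Linked documents in the order they appear in the source document (hypothetical).
appearing = ["actress Y", "actor S", "male talent G", "band P", "actor T"]
shown = limit_peripherals(appearing, max_nodes=3)
print(shown)  # ['actress Y', 'actor S', 'male talent G']
```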

Furthermore, suppose, for example, that the graph structure shown in FIG. 4 has been presented to many user terminals 2 and that the "actress Y" document has been referenced frequently from the document about "actress E" in FIG. 3. To reflect this information, the document analysis unit 12 may modify the graph structure data so that the "actress Y" node in the graph structure of FIG. 4 is displayed larger or the edge connecting "actress E" and "actress Y" is drawn thicker.
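One possible way to turn reference counts into display sizes is a linear scale, sketched below. The text says only that frequently referenced nodes may be shown larger or their edges thicker; the scaling rule, the function name `display_scales`, the `base` and `cap` parameters, and the counts are all assumptions.

```python
from typing import Dict

def display_scales(click_counts: Dict[str, int],
                   base: float = 1.0, cap: float = 3.0) -> Dict[str, float]:
    """Map each peripheral node to a scale factor between `base` and `cap`,
    growing linearly with how often its document was referenced."""
    busiest = max(click_counts.values(), default=0)
    if busiest == 0:
        return {name: base for name in click_counts}
    return {name: base + (cap - base) * count / busiest
            for name, count in click_counts.items()}

# Hypothetical reference counts gathered from many user terminals.
counts = {"actress Y": 80, "actor S": 20, "male talent G": 0}
scales = display_scales(counts)
print(scales)  # {'actress Y': 3.0, 'actor S': 1.5, 'male talent G': 1.0}
```

The resulting factor could be applied to either the node size or the edge thickness when the graph structure data is rewritten.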

The graph structure data may also be generated so that the label of each edge is normally hidden and is displayed as a pop-up when, for example, the user places the mouse pointer over the edge.

FIG. 5 shows an example of the document about "male talent G", which is linked from "actress E" in FIG. 4. From this document, the document analysis processing described above generates a graph structure such as that shown in FIG. 6, which includes the central node "male talent G" and its peripheral nodes "male talent K", "actor T", and "band P".

In the document of FIG. 5, the character string "affiliated band" is enclosed in "==" tags, which mark the title of a chapter or section; Wikipedia (registered trademark) in fact marks chapter titles in this format. Because chapter and section titles are enclosed in predetermined tags or symbols, it is easy to detect them in the whole document.
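Finding the title of the chapter or section that contains a given link can be sketched as follows. The sample text is hypothetical, the function name `enclosing_section` is an assumption, and the pattern handles only the two-equals-sign form mentioned above.

```python
import re
from typing import Optional

# A heading line of the form "== title ==".
HEADING = re.compile(r"^==\s*(.+?)\s*==\s*$", re.MULTILINE)

def enclosing_section(text: str, position: int) -> Optional[str]:
    """Return the title of the last heading that starts at or before `position`."""
    title = None
    for match in HEADING.finditer(text):
        if match.start() > position:
            break
        title = match.group(1)
    return title

sample = "Male talent G is a singer.\n== affiliated band ==\n[[band P]]\n"
link_position = sample.index("[[band P]]")
section = enclosing_section(sample, link_position)
print(section)  # affiliated band
```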

図６のグラフ構造では、中心ノード「男性タレントＧ」の周辺ノードとして、「男性タレントＫ」、「男優Ｔ」、および「バンドＰ」が得られている。ここで、「男性タレントＫ」および「男優Ｔ」へのエッジのラベルとして、図４と同様に、図５の文書中で、当該周辺ノードに対応する文書がリンクされている文字列及びその前後の文字列が採用されている。一方、「バンドＰ」へのエッジのラベルとしては、図５の文書から抽出された章・節のタイトルである文字列「所属バンド」が採用されている。 In the graph structure of FIG. 6, “male talent K”, “actor T”, and “band P” are obtained as peripheral nodes of the central node “male talent G”. Here, as the labels of edges to “male talent K” and “actor T”, as in FIG. 4, in the document of FIG. 5, the character string to which the document corresponding to the peripheral node is linked and before and after The character string is adopted. On the other hand, as the label of the edge to “band P”, the character string “belonging band” which is the title of the chapter / section extracted from the document of FIG. 5 is adopted.

このため、図６のグラフ構造では、各周辺ノードへ向かうエッジのラベルが、当該周辺ノードに対応する文書がリンクされている箇所周辺の文字列であるのか、章・節のタイトルであるのかを区別できるように、ラベルが章・節のタイトルである場合には、太線で囲んで表示されるようになっている。 Therefore, in the graph structure of FIG. 6, it is determined whether the edge label toward each peripheral node is a character string around the location where the document corresponding to the peripheral node is linked or the title of a chapter / section. As can be distinguished, when the label is the title of a chapter / section, it is displayed surrounded by a thick line.

FIG. 7 shows the graph structure data that is the source data of the graph structure of FIG. 6. The graph structure data of FIG. 7 is written in XML and is an example of graph structure data, with “male talent G” as the central node, generated by the document analysis unit 12.

The graph structure data of FIG. 7 includes a description portion C1 for the central node and description portions C2 and C3 for a plurality of (here, two) peripheral nodes. In the description portion C1, the node name of the central node is described in an <entryname> tag. In each of the peripheral-node description portions C2 and C3, the node name of that peripheral node is likewise described in an <entryname> tag. To link the peripheral node to the graph structure data that has that node as its central node, reference information (here, identification information) for that graph structure data is described in a <link> tag. The label of the edge from the central node to the peripheral node is described in a <labeltext> tag, and a <labeltype> tag indicates whether the label is context information or a chapter or section title. For example, the value “CONTEXT” is described if the label is context information, and the value “TITLE” if the label is a chapter or section title.

The description portion C2 of the first peripheral node contains two labels for the edge from the central node to that peripheral node: the first is a context-information label and the second is a chapter or section title. The description portion C3 of the second peripheral node contains one label for the edge from the central node to that peripheral node, and it is a context-information label.

In addition, each peripheral-node description portion contains an <added_node> tag indicating whether that portion was added by the complementing process described later. Here, the description portions C2 and C3 of the two peripheral nodes were not added by the complementing process (they were present in the graph structure data from the start), so the value “NO” is described.
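
To make the layout described above concrete, the following Python sketch builds a small XML fragment and parses it with the standard library. Only the tag names <entryname>, <link>, <labeltext>, <labeltype>, and <added_node> come from the description; the wrapper elements `graph`, `center`, `peripheral`, and `label`, the identifier values, and the placeholder label texts are assumptions made for this example, since FIG. 7 itself is not reproduced here. The sketch also assumes that “CONTEXT” marks context information and “TITLE” marks a chapter or section title.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: wrapper elements and identifier values are assumed.
GRAPH_XML = """
<graph>
  <center>
    <entryname>male talent G</entryname>
  </center>
  <peripheral>
    <entryname>band P</entryname>
    <link>graph-band-p</link>
    <label>
      <labeltext>(context text mentioning band P)</labeltext>
      <labeltype>CONTEXT</labeltype>
    </label>
    <label>
      <labeltext>Affiliated band</labeltext>
      <labeltype>TITLE</labeltype>
    </label>
    <added_node>NO</added_node>
  </peripheral>
  <peripheral>
    <entryname>male talent K</entryname>
    <link>graph-male-talent-k</link>
    <label>
      <labeltext>(context text mentioning male talent K)</labeltext>
      <labeltype>CONTEXT</labeltype>
    </label>
    <added_node>NO</added_node>
  </peripheral>
</graph>
"""


def parse_graph(xml_text: str) -> dict:
    """Convert graph structure XML into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    peripherals = []
    for node in root.findall("peripheral"):
        labels = [
            (label.findtext("labeltext"), label.findtext("labeltype"))
            for label in node.findall("label")
        ]
        peripherals.append({
            "name": node.findtext("entryname"),
            "link": node.findtext("link"),
            "labels": labels,
            "added": node.findtext("added_node") == "YES",
        })
    return {"center": root.findtext("center/entryname"),
            "peripherals": peripherals}


graph = parse_graph(GRAPH_XML)
```

Keeping each label as a (text, type) pair lets one peripheral node carry both a context label and a title label, as the first peripheral node does here.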

When the graph structure shown in FIG. 4 or FIG. 6 is displayed on the user terminal 2 and the user selects (clicks) a peripheral node in it, the reference information (identification information) contained in the <link> element of the description portion for the selected peripheral node, in the source data (graph structure data) of that graph structure, is passed from the search request processing unit 14 to the first search unit 15. Based on this identification information, the first search unit 15 retrieves from the second storage unit 103 the graph structure data specified by the identification information, that is, the graph structure data having the selected peripheral node as its central node. The retrieved graph structure data is displayed on the display of the user terminal 2 by the search result presentation unit 18.

For example, when the graph structure of FIG. 4 is displayed on the user terminal 2 and the user clicks “male talent G”, the display switches from the graph structure of FIG. 4 to that of FIG. 6. In other words, a transition is made from the graph structure of FIG. 4 to the graph structure of FIG. 6.

In this case, the graph structure of FIG. 4 has “male talent G” as a peripheral node of the central node “actress E”, so a transition from FIG. 4 to FIG. 6 is possible. In the graph structure of FIG. 6, however, the central node “male talent G” has no peripheral node “actress E”, so once the graph structure of FIG. 6 is displayed the user cannot choose to return to the graph structure of FIG. 4, whose central node is “actress E”. This is because the document corresponding to the central node of FIG. 6 (see FIG. 5) contains no link to the “actress E” document shown in FIG. 3.

The complementing process in the complementing unit 13 solves this problem. The processing operation of the complementing unit 13 is described next with reference to the flowchart of FIG. 8.

The complementing unit 13 takes graph structure data out of the second storage unit 103 one item at a time and performs the following processing on every graph structure to be processed that is stored in the second storage unit 103.

First, the complementing unit 13 initializes a variable i to 1 (step S21). Next, it takes the i-th (initially, the first) item of graph structure data, referred to as graph structure data (i), out of the plurality of items to be processed in the second storage unit 103 (step S22).

The complementing unit 13 then initializes a variable j to 1 (step S23). The description portion of the j-th peripheral node from the top of graph structure data (i) contains reference information to graph structure data (j), that is, the graph structure data having that peripheral node as its central node (the identification information of graph structure data (j)). Based on this identification information, the complementing unit 13 takes the graph structure data (j) specified by it out of the second storage unit 103 (step S24).

Hereinafter, the graph structure represented by graph structure data (i) is called graph structure (i), and the graph structure represented by graph structure data (j) is called graph structure (j).

The process then proceeds to step S25, where it is checked whether graph structure data (j) links to graph structure data (i). This is done by checking whether graph structure data (j) contains a peripheral-node description portion that includes reference information to graph structure data (i).

Graph structure data (i) already links to graph structure data (j). If graph structure data (j) also links to graph structure data (i), a transition is possible from graph structure (i) to graph structure (j) and also from graph structure (j) back to graph structure (i). If graph structure data (j) does not link to graph structure data (i), a transition is possible from graph structure (i) to graph structure (j) but not from graph structure (j) to graph structure (i).

If graph structure data (j) does not link to graph structure data (i), the process proceeds from step S25 to step S26, where a description of a new peripheral node is added to graph structure data (j). The new peripheral node has the same node name as the central node of graph structure data (i) and serves to link graph structure data (i) from graph structure data (j).

The description of this new peripheral node includes the reference information to graph structure data (i) (the identification information of graph structure data (i)) and the edge label contained in the description portion, within graph structure data (i), of the peripheral node corresponding to graph structure data (j).

For example, suppose that the source data of the graph structure shown in FIG. 4 is graph structure data (i), and that the graph structure data (j) whose central node is the j-th peripheral node of graph structure data (i) is the data of FIG. 7. FIG. 9 shows the description portion of the peripheral node “male talent G” within graph structure data (i).

In this case, in step S26, a description of a new peripheral node is added to the graph structure data (j) of FIG. 7. The description is built from, and contains, the identification information of graph structure data (i), the node name of the central node of graph structure data (i), and the label and label type contained in the description portion of FIG. 9.

FIG. 10 shows graph structure data (j) after the description of the new peripheral node has been added. In the description portion D1 of the added peripheral node in FIG. 10, the node name “actress E” of the central node of graph structure data (i) is described in an <entryname> tag. To link graph structure data (i) from the new peripheral node, the identification information of graph structure data (i) is described in a <link> tag. As the label and label type of the edge between the new peripheral node and the central node, the same label and type as those described in the <labeltext> and <labeltype> tags of FIG. 9 are described. Finally, the <added_node> tag holds the value “YES”, indicating that this description portion was added by the complementing process.

FIG. 11 shows a display example of the graph structure data of FIG. 10. In the graph structure of FIG. 11, a new peripheral node “actress E”, corresponding to the description portion D1 of FIG. 10, has been added to the graph structure of FIG. 6. The edge between the central node “male talent G” and the new peripheral node “actress E” carries the label “as the love interest of the protagonist played by male talent G…”. As described above, the character string used for this label was extracted not from the “male talent G” document of FIG. 5 but from the “actress E” document of FIG. 3; it is the label of the edge between the central node “actress E” and the peripheral node “male talent G” in the graph structure of FIG. 4.

In the display example of FIG. 11, the peripheral node “actress E” added to the graph structure by the complementing process has the <added_node> value “YES”, so it is displayed in a form different from that of ordinary peripheral nodes, which represent documents linked in the document corresponding to the central node. In FIG. 11, for example, edges between the central node and ordinary peripheral nodes are drawn as solid lines, whereas the edge between the central node and the added peripheral node “actress E” is drawn as a dotted line to make clear that the node was added by the complementing process. Instead of a dotted line, that edge may be drawn as a thick solid line or in a color different from that of the other edges.

As a display form for a peripheral node added by the complementing process, the edge between the central node and the added peripheral node may, as described above, be displayed differently from the edges between the central node and ordinary peripheral nodes. Alternatively, the added peripheral node itself may be displayed in a color, shape, font, line style, or the like different from that of ordinary peripheral nodes.

In either case, when the graph structure is displayed, an added peripheral node (including its edge) is desirably displayed in a form different from that of ordinary peripheral nodes (including their edges) so that the two can be told apart.
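
One way to realize this distinction is to choose the edge style from the added-node flag when the graph is rendered. The sketch below emits Graphviz DOT text with dotted edges for added nodes and solid edges for ordinary ones. The use of DOT, the dictionary layout, and the function names `quote` and `to_dot` are assumptions made for this example; the description does not name a rendering technology.

```python
def quote(text: str) -> str:
    """Quote a string so it can be used as a DOT identifier."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(graph: dict) -> str:
    """Render one graph structure as DOT text, one edge per peripheral node."""
    center = quote(graph["center"])
    lines = ["digraph G {"]
    for node in graph["peripherals"]:
        # Nodes added by the complementing process get a dotted edge.
        style = "dotted" if node["added"] else "solid"
        lines.append(f"  {center} -> {quote(node['name'])} [style={style}];")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical in-memory form of the graph structure of FIG. 11.
graph = {
    "center": "male talent G",
    "peripherals": [
        {"name": "male talent K", "added": False},
        {"name": "actress E", "added": True},
    ],
}
dot_text = to_dot(graph)
```

The same flag could equally drive a node color or font instead of the edge style.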

Returning to FIG. 8, if it is determined in step S25 that graph structure data (j) links to graph structure data (i), the process proceeds to step S27, where the variable j is incremented by one. The process also proceeds to step S27 after step S26 has added to graph structure data (j) the description of the new peripheral node linking graph structure data (i).

Steps S24 to S27 are repeated until the value of j exceeds the total number of peripheral nodes in graph structure data (i) (step S28). When j exceeds that number, the process proceeds to step S29, where the variable i is incremented by one. Steps S22 to S29 are then repeated until the value of i exceeds the total number of items of graph structure data to be processed that are stored in the second storage unit 103 (step S30).
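
The loop of steps S21 to S30 can be sketched in Python as follows. This is a minimal illustration, not the implementation of the complementing unit 13: the second storage unit 103 is modeled as an in-memory dictionary keyed by identification information, and the dictionary layout, the identifiers “E”, “G”, and “K”, and the placeholder label texts are all assumptions.

```python
def complement(graphs: dict) -> None:
    """Add a back-link node wherever (i) links to (j) but (j) does not link to (i)."""
    for graph_id, graph_i in graphs.items():             # steps S21, S22, S29, S30
        for node in list(graph_i["peripherals"]):         # steps S23, S27, S28
            graph_j = graphs.get(node["link"])             # step S24
            if graph_j is None:
                continue  # assumption: skip links that have no stored graph
            linked_back = any(                             # step S25
                other["link"] == graph_id for other in graph_j["peripherals"]
            )
            if not linked_back:                            # step S26
                graph_j["peripherals"].append({
                    "name": graph_i["center"],
                    "link": graph_id,
                    "labels": list(node["labels"]),        # label reused from (i)
                    "added": True,
                })


# Hypothetical data: E links to G, G links to K, and K links back to G.
graphs = {
    "E": {"center": "actress E", "peripherals": [
        {"name": "male talent G", "link": "G", "added": False,
         "labels": [("(text about G in the E document)", "CONTEXT")]},
    ]},
    "G": {"center": "male talent G", "peripherals": [
        {"name": "male talent K", "link": "K", "added": False,
         "labels": [("(text about K in the G document)", "CONTEXT")]},
    ]},
    "K": {"center": "male talent K", "peripherals": [
        {"name": "male talent G", "link": "G", "added": False,
         "labels": [("(text about G in the K document)", "CONTEXT")]},
    ]},
}
complement(graphs)
```

After the call, the graph for “male talent G” gains an “actress E” node flagged as added, while the graph for “male talent K” is unchanged because it already links back.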

In the complementing process of FIG. 8, when a transition is possible from graph structure (i) to graph structure (j) via the j-th peripheral node but graph structure (j) contains no peripheral node from which a transition to graph structure (i) is possible (step S25), a new peripheral node for transitioning to graph structure (i) is added to graph structure (j) (step S26). When graph structure (j) does contain such a peripheral node (step S25), processing simply moves on to the next peripheral node. The process is not limited to this, however, and the complementing process may instead be performed as shown in FIG. 12.

In the complementing process of FIG. 12, when graph structure (j) is found in step S25 of FIG. 8 to contain a peripheral node from which a transition to graph structure (i) is possible, the processing of steps S41 and S42 is performed before the process proceeds to step S27.

Steps S41 and S42 add to graph structure data (i) the information needed to display, in graph structure (i), the label of the edge from the j-th peripheral node to the central node together with the label of the edge from the central node to the j-th peripheral node.

First, in step S41, the edge label, its type, and the like are extracted from the description portion, within graph structure data (j), of the peripheral node that links to graph structure data (i). That is, the values of the <labeltext> and <labeltype> elements are taken out.

In step S42, <labeltext> and <labeltype> elements whose values are the label, type, and other information obtained in step S41 are added to the description portion of the j-th peripheral node of graph structure data (i).
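
Steps S41 and S42 can be sketched as below. The description adds the extracted values as further <labeltext> and <labeltype> elements; this sketch instead stores them under a separate, hypothetical key `reverse_labels` so that the two directions stay distinguishable and a later pass does not copy them again. The dictionary layout, the identifiers, and the placeholder label texts are likewise assumptions.

```python
def add_reverse_labels(graphs: dict) -> None:
    """Copy the label of the edge from (j) back to (i) into graph (i)."""
    for graph_id, graph_i in graphs.items():
        for node in graph_i["peripherals"]:
            graph_j = graphs.get(node["link"])
            if graph_j is None:
                continue  # assumption: skip links that have no stored graph
            for back_node in graph_j["peripherals"]:
                if back_node["link"] == graph_id:      # step S25: (j) links to (i)
                    # Step S41 extracts the labels; step S42 stores them in (i).
                    node["reverse_labels"] = list(back_node["labels"])
                    break


# Hypothetical data: G and K link to each other; band P has no stored graph.
graphs = {
    "G": {"center": "male talent G", "peripherals": [
        {"name": "male talent K", "link": "K",
         "labels": [("(text about K in the G document)", "CONTEXT")]},
        {"name": "band P", "link": "P",
         "labels": [("Affiliated band", "TITLE")]},
    ]},
    "K": {"center": "male talent K", "peripherals": [
        {"name": "male talent G", "link": "G",
         "labels": [("(text about G in the K document)", "CONTEXT")]},
    ]},
}
add_reverse_labels(graphs)
```

When the graph for “male talent G” is then displayed, the edge to “male talent K” can show both the forward label and the label taken from the “male talent K” document.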

For example, suppose that the graph structure of FIG. 6 is graph structure (i), and that graph structure data (j), linked from its peripheral node “male talent K”, contains a peripheral node “male talent G” from which a transition to graph structure data (i) is possible. In this case, the complementing process of FIG. 12 attaches the label of the edge from the central node “male talent K” to the peripheral node “male talent G” in graph structure (j) to the edge between the central node “male talent G” and the peripheral node “male talent K” in graph structure (i). As a result, as shown in FIG. 13, when graph structure (i) is displayed, both the label of the edge from the central node “male talent G” to the peripheral node “male talent K” and the label of the edge from the peripheral node “male talent K” to the central node “male talent G” are displayed.

FIG. 13 thus shows both the mention of “male talent K” in the document corresponding to the central node “male talent G” and the mention of “male talent G” in the document corresponding to the peripheral node “male talent K”. The user can refer to this information when selecting nodes and moving back and forth across the graph structures.

Graph structure data to which the complementing unit 13 has applied the complementing process of FIG. 8 or FIG. 12 is stored in the second storage unit 103.

In the above description, the documents with reference information to be processed are not limited to Japanese and may be written in any language. Node names of central and peripheral nodes are not limited to personal names either; an organization name, the title of a paper, the title of a Web page, or anything else that can identify the document with reference information corresponding to the node may be used. For example, a single graph structure may contain nodes named after persons alongside nodes named after organizations, or nodes with Japanese names alongside nodes with English names.

Next, the processing operation of the search unit 101 is described with reference to the flowchart of FIG. 14.

A keyword entered at the user terminal 2 is received by the search request processing unit 14, which passes it to the first search unit 15 and the third search unit 17 (step S100).

Based on the keyword received from the search request processing unit 14, the first search unit 15 searches the second storage unit 103 for graph structure data whose central node name contains the keyword (step S101). The retrieved graph structure data is transmitted to the user terminal 2 by the search result presentation unit 18, and the user terminal 2 displays on its display the graph structure described in the received graph structure data (step S102).

Based on the keyword received from the search request processing unit 14, the third search unit 17 searches the third storage unit 104 for documents containing the keyword (step S111). A list of the retrieved documents is transmitted to the user terminal 2 by the search result presentation unit 18, and the user terminal 2 displays the received list on its display (step S112).
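
The two keyword searches of steps S101 and S111 can be sketched as simple substring matches. In the following Python example the second storage unit 103 and the third storage unit 104 are modeled as in-memory dictionaries; the function names, identifiers, and sample texts are assumptions made for illustration, and a real system would more likely use an index than a linear scan.

```python
def search_graphs(graph_store: dict, keyword: str) -> list:
    """Step S101: graph structure data whose central node name contains the keyword."""
    return [graph for graph in graph_store.values() if keyword in graph["center"]]


def search_documents(document_store: dict, keyword: str) -> list:
    """Step S111: identifiers of documents whose text contains the keyword."""
    return [doc_id for doc_id, text in document_store.items() if keyword in text]


# Hypothetical contents of the second and third storage units.
graph_store = {
    "E": {"center": "actress E", "peripherals": []},
    "G": {"center": "male talent G", "peripherals": []},
}
document_store = {
    "doc-1": "An interview with actress E about her latest film.",
    "doc-2": "Band P announces a new tour.",
    "doc-3": "Male talent G and actress E appear together.",
}

graph_hits = search_graphs(graph_store, "actress E")
document_hits = search_documents(document_store, "actress E")
```

The graph hits would be shown as the graph structure and the document hits as the accompanying document list.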

FIG. 15 shows an example of a screen, displayed on the display of the user terminal 2, that presents a graph structure together with a document list of search results. Here, the user has entered the keyword “actress E” at the user terminal 2. The screen shows the graph structure with “actress E” as its central node retrieved by the first search unit 15 (the same graph structure as in FIG. 4), together with the list of documents that the third search unit 17 retrieved from the third storage unit 104 using “actress E” as the keyword.

The display screen of FIG. 15 has an area A1 for entering and displaying a keyword, an area A2 for displaying a graph structure, and an area A3 for displaying the list of retrieved documents.

When the user clicks the peripheral node “male talent G” in the graph structure on a display interface such as that of FIG. 15 (step S103), the search request processing unit 14 responds by, for example, setting the node name of the selected peripheral node, “male talent G”, as a new keyword (step S104). Accordingly, area A1 of the display screen of FIG. 15 shows “male talent G” in place of “actress E”. The search request processing unit 14 then passes the new keyword “male talent G” to the first search unit 15 and the third search unit 17, and the processing of steps S101 to S102 and steps S111 to S112 is performed.

That is, in step S101 the first search unit 15 searches for graph structure data whose central node name contains the new keyword “male talent G”. The retrieved graph structure data is transmitted to the user terminal 2, and a graph structure such as that of FIG. 11 is displayed in area A2 of its display screen (step S102).

In step S101, the graph structure data may instead be retrieved using the reference information held by the selected peripheral node “male talent G” (the value of the <link> element in the graph structure data), by searching for the graph structure data corresponding to that reference information.

In step S111, the third search unit 17 searches the third storage unit 104 for documents containing the new keyword “male talent G”. A list of the retrieved documents is transmitted to the user terminal 2 and displayed in area A3 of its display screen (step S112).

On the other hand, when the user clicks the central node “actress E” in the graph structure on a display interface such as that of FIG. 15 (step S103), the process proceeds to step S121. In step S122, the search request processing unit 14 passes the node name “actress E” of the central node to the second search unit 16. If the search request processing unit 14 can obtain the identification information of the document corresponding to the graph structure displayed on the user terminal 2 (that is, the document corresponding to its central node), for example because the identification information is attached to or included in the graph structure data, it passes that identification information to the second search unit 16 together with, or instead of, the node name “actress E”.

Based on the node name “actress E” or the identification information of the corresponding document received from the search request processing unit 14, the second search unit 16 searches the first storage unit 102 for a document. Here it searches for a document whose title, heading, or caption contains “actress E”, or for a document having the received identification information. The document retrieved in this way is transmitted to the user terminal 2 by the search result presentation unit 18, and the user terminal 2 opens a new window separate from the display screen of FIG. 15 and displays the received document in it.

Although not shown in FIG. 14, when one document is selected from the list displayed in area A3 of the display screen of FIG. 15, the selected document may be displayed (for example, in a new window separate from the display screen of FIG. 15), and the graph structure displayed in area A2 may be switched to the graph structure having that document as its central node.

In this case, when one document in the list displayed in area A3 of FIG. 15 is clicked and thereby selected, the search request processing unit 14 passes the identification information of the selected document (for example, a URI or URL) to the first search unit 15 and the third search unit 17.

The first search unit 15 searches the second storage unit 103 for the graph structure data stored in association with the document identification information received from the search request processing unit 14. The retrieved graph structure data is transmitted to the user terminal 2 and displayed in area A2 of its display screen. The third search unit 17 searches the third storage unit 104 for the document having the identification information received from the search request processing unit 14. The retrieved document is transmitted to the user terminal 2 and displayed in a new window.

In this way, when a desired peripheral node is selected in the graph structure displayed in area A2 of the display screen of FIG. 15, another graph structure having the selected peripheral node as its central node is displayed. Because this other graph structure always contains the central node of the original graph structure as a peripheral node, the original graph structure can be displayed again by selecting that peripheral node.

By selecting a peripheral node of the displayed graph structure, the user can move to the graph structure having that peripheral node as its central node and so follow nodes (documents) that have a reference relationship, while always remaining free to return to the original node (document).

That is, even when nodes are missing because the reference relationships between documents are asymmetric, the user can move freely between the nodes of graphs that depict those reference relationships.

Therefore, by selecting peripheral nodes on the displayed graph structure, the user can move bidirectionally among many different graph structures. Because the user sees not only the target information but also a large amount of related information, graph structure exploration that holds the user's interest can be realized.

The method of the present invention described in the embodiment (the processing operations of the graph structure generation unit 100 and the search unit 101) can also be stored, as a program executable by a computer, on a recording medium such as a magnetic disk (flexible disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory, and distributed in that form.

Note that the present invention is not limited to the above embodiment as it stands; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the embodiment. For example, some constituent elements may be deleted from all the constituent elements shown in the embodiment. Furthermore, constituent elements from different embodiments may be combined as appropriate.

FIG. 1 is a diagram showing a configuration example of a document processing system according to an embodiment of the present invention.
FIG. 2 is a flowchart for explaining the document analysis processing in the document analysis unit.
FIG. 3 is a diagram showing an example of a document containing reference information.
FIG. 4 is a diagram showing a display example of the graph structure data generated from the document of FIG. 3.
FIG. 5 is a diagram showing another example of a document containing reference information, namely a document referenced from the document of FIG. 3.
FIG. 6 is a diagram showing a display example of the graph structure data generated from the document of FIG. 5.
FIG. 7 is a diagram showing the graph structure data that is the source data of the graph structure of FIG. 6.
FIG. 8 is a flowchart for explaining the complement processing in the complementing unit.
FIG. 9 is a diagram showing a description example of the peripheral node "Male Talent G" in the graph structure data that is the source data of the graph structure of FIG. 4.
FIG. 10 is a diagram showing an example of graph structure data to which the description of a new peripheral node has been added.
FIG. 11 is a diagram showing a display example of the graph structure data of FIG. 10.
FIG. 12 is a flowchart for explaining another complement processing in the complementing unit.
FIG. 13 is a diagram showing a display example of the labels added, by the complement processing shown in FIG. 12, between two nodes that reference each other.
FIG. 14 is a flowchart for explaining the processing operation of the search unit.
FIG. 15 is a diagram showing a screen display example of the graph structure and the search result document list displayed on the display of the user terminal.

Explanation of symbols

1: document processing system, 2: user terminal, 11: document collection unit, 12: document analysis unit, 13: complementing unit, 14: search request processing unit, 15: first search unit, 16: second search unit, 17: third search unit, 18: search result presentation unit, 100: graph structure generation unit, 101: search unit, 102: first storage unit, 103: second storage unit, 104: third storage unit

Claims

A document processing system comprising:
document collection means for collecting a plurality of documents including reference information to other documents;
first storage means for storing the collected documents;
generation means for generating, for each document stored in the first storage means, a graph structure that includes a central node corresponding to the document and a peripheral node corresponding to another document linked to the document by the reference information, the central node having a node name representing the document and the peripheral node having a node name representing the other document;
second storage means for storing a plurality of graph structures corresponding to each of the plurality of documents;
graph structure complementing means for adding a second peripheral node having a node name representing a second document to a second graph structure that includes, as its central node, a first document corresponding to a first peripheral node in a first graph structure among the plurality of graph structures stored in the second storage means, and that does not include, as a peripheral node, the second document corresponding to the central node of the first graph structure;
first search means for searching the second storage means for a graph structure; and
first display means for displaying the retrieved graph structure.

The document processing system according to claim 1, wherein the first search means searches the second storage means for another graph structure having, as its central node, a peripheral node selected from the graph structure displayed by the first display means, and
the first display means displays the retrieved other graph structure.

The document processing system according to claim 2, wherein, when the first peripheral node is selected from the first graph structure displayed by the first display means, the second graph structure retrieved by the first search means is displayed by the first display means, and
when the second peripheral node is selected from the second graph structure displayed by the first display means, the first graph structure retrieved by the first search means is displayed by the first display means.

The document processing system according to claim 1, wherein the generation means generates the graph structure so that it includes, as a label of the edge between the central node and the peripheral node, either the character string in the document that carries the reference information to the other document together with its surrounding character string, or the title of the chapter or section that contains the character string carrying the reference information.

The document processing system according to claim 4, wherein the graph structure complementing means adds to the second graph structure, as the label of the edge between the central node and the second peripheral node in the second graph structure, the label of the edge between the central node and the first peripheral node in the first graph structure.

The document processing system according to claim 4, wherein, when a third graph structure having, as its central node, a third document corresponding to a third peripheral node in the first graph structure includes, as a peripheral node, the second document corresponding to the central node of the first graph structure, the graph structure complementing means adds, to the edge between the central node and the third peripheral node in the first graph structure, the label of the edge between the central node and the peripheral node corresponding to the second document in the third graph structure.

The document processing system according to claim 3, wherein the second peripheral node in the second graph structure displayed by the first display means is displayed in a display form different from that of a peripheral node corresponding to a document linked to the second document corresponding to the central node of the second graph structure.

The document processing system according to claim 1, further comprising:
second search means for searching the first storage means for a document that includes a keyword input as a search condition, or for a document that includes the node name of the central node of the graph structure displayed by the first display means; and
second display means for displaying the document retrieved by the second search means or a list of the retrieved documents,
wherein the first search means searches the second storage means for a graph structure whose central node has a node name including the keyword.

A document processing method in a document processing system that includes:
first storage means for storing a plurality of documents including reference information to other documents;
generation means for generating, for each document stored in the first storage means, a graph structure that includes a central node corresponding to the document and a peripheral node corresponding to another document linked to the document by the reference information, the central node having a node name representing the document and the peripheral node having a node name representing the other document;
second storage means for storing a plurality of graph structures corresponding to each of the plurality of documents;
graph structure complementing means for complementing each graph structure stored in the second storage means;
search means for searching the second storage means for a graph structure; and
display means for displaying the retrieved graph structure,
the method comprising:
a graph structure complementing step in which the graph structure complementing means adds a second peripheral node, having reference information to the first graph structure and a node name representing a second document, to a second graph structure that includes, as its central node, a first document corresponding to a first peripheral node in a first graph structure among the plurality of graph structures, and that does not include, as a peripheral node, the second document corresponding to the central node of the first graph structure;
a first search step in which the search means searches the second storage means for a graph structure whose central node has a node name including a keyword input as a search condition; and
a first display step in which the display means displays the retrieved graph structure.

The document processing method according to claim 9, further comprising:
a second search step of searching the second storage means for the second graph structure having, as its central node, the first peripheral node selected from the first graph structure displayed in the first display step;
a second display step in which the display means displays the retrieved second graph structure;
a third search step of searching the second storage means for the first graph structure having, as its central node, the second peripheral node selected from the second graph structure displayed in the second display step; and
a third display step in which the display means displays the retrieved first graph structure.

A program for causing a computer to function as:
first storage means for storing a plurality of documents including reference information to other documents;
generation means for generating, for each document stored in the first storage means, a graph structure that includes a central node corresponding to the document and a peripheral node corresponding to another document linked to the document by the reference information, the central node having a node name representing the document and the peripheral node having a node name representing the other document;
second storage means for storing a plurality of graph structures corresponding to each of the plurality of documents;
graph structure complementing means for adding a second peripheral node having a node name representing a second document to a second graph structure that includes, as its central node, a first document corresponding to a first peripheral node in a first graph structure among the plurality of graph structures stored in the second storage means, and that does not include, as a peripheral node, the second document corresponding to the central node of the first graph structure;
first search means for searching the second storage means for a graph structure; and
first display means for displaying the retrieved graph structure.