JP6572758B2

JP6572758B2 - Program and information processing apparatus

Info

Publication number: JP6572758B2
Application number: JP2015235551A
Authority: JP
Inventors: 岡本 洋
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2015-12-02
Filing date: 2015-12-02
Publication date: 2019-09-11
Anticipated expiration: 2035-12-02
Also published as: JP2017102712A

Description

The present invention relates to a program and an information processing apparatus.

Research on "complex networks", the huge and intricate networks that exist in the real world, is advancing, and various analyses of complex networks, as well as services that use the results of such analyses, have been proposed. Examples of complex networks include the WWW (World Wide Web), which consists of an enormous number of web pages, the Internet, social networks, document citation networks, user-product networks, gene regulatory networks, and protein interaction networks.

In complex network science, a part of a network in which links (also called edges) are dense is called a "community" (see Non-Patent Documents 1 and 2 listed below). A community can also be regarded as corresponding to an individual functional module of the complex system that the network represents. Various methods have been proposed for extracting communities from networks such as complex networks.

Non-Patent Documents 1 and 2 are reviews of algorithms for detecting communities in networks. The notable community detection algorithms proposed to date (for example, modularity maximization, information compression, label propagation, clique percolation, and block models) are all described in these documents.

Non-Patent Documents 3 to 6 disclose methods for detecting a community that includes, as a member, a single seed node designated by a user or the like. In the methods disclosed in these documents, nodes are added to the community while the nodes of the network are traversed starting from the designated seed node, and an evaluation value of the community is calculated with an evaluation function. The target community is obtained by optimizing that evaluation value.

Patent Document 1 describes a method that, in a network in which transitions between nodes follow a Markov chain, finds the path most likely to be followed among the paths starting from a designated node by iteratively computing time transitions based on the transition probabilities of the Markov chain.

Patent Document 2 describes a method that, in a network in which transitions between nodes follow a Markov chain, restricts the computationally expensive personalized PageRank algorithm, which is used to find the group of nodes matching a user's search condition, to the part of the network related to the search condition rather than the entire network. In this method, a clustering of the Markov chain on the network is learned from training data, a cluster matching the search condition is extracted from the group of nodes that satisfy the user's search condition and from the learned clustering, and the personalized PageRank algorithm is executed on the partial network consisting of the nodes belonging to that cluster.

The information processing apparatus described in Patent Document 3 includes: log acquisition means for acquiring log information from a document DB (database); network generation means for generating a network structure in which users who operate document information are nodes and links are established between users who have operated the same document information; community extraction means for extracting, from the network structure, sets of users having a common relationship as communities by using a probabilistic model based on link generation probabilities; and user index calculation means for calculating, as an index, either an affiliation rate, which is the probability distribution of users in a community, or a degree of influence, which is the expected value of a network index of users in a community.

Non-Patent Document 1: Newman, M.E.J.: Communities, modules and large-scale structure in networks. Nature Physics 8, 25-31 (2011)

Non-Patent Document 2: Fortunato, S.: Community detection in graphs. Physics Reports 486, 75-174 (2010)

Non-Patent Document 3: Bagrow, J.P., Bollt, E.M.: Local method for detecting communities. Phys. Rev. E 72, 046108 (2005)

Non-Patent Document 4: Clauset, A.: Finding local community structure in networks. Phys. Rev. E 72, 026132 (2005)

Non-Patent Document 5: Lancichinetti, A., Fortunato, S., Kertesz, J.: Detecting the overlapping and hierarchical community structure in complex networks. New J. Phys. 11, 033015 (2009)

Non-Patent Document 6: Okamoto, H.: Local Detection of Communities by Neural-Network Dynamics. LNCS 8131, 50-57 (2013)

Patent Document 1: JP 2013-41530 A

Patent Document 2: JP 2013-168127 A

Patent Document 3: JP 2015-46102 A

A method that starts from a designated seed node, adds nodes to a community while traversing the nodes of the network, and finds a community with a high evaluation value requires the user to designate the seed node. Such a method cannot be used when the user cannot identify a seed node or cannot narrow the choice down to a single seed node.

An object of the present invention is to make it possible to extract the community in a network that corresponds to a user's query even when the user cannot identify a single node in the network as a seed node.

The invention according to claim 1 is a program for causing a computer to function as: evaluation value calculation means for calculating, for each node in a network, an evaluation value indicating the degree to which the node matches a query; transition calculation means for calculating the evaluation value of each node after a transition in accordance with a calculation rule under which a first component of the evaluation value of each node transitions to one of the link-destination nodes along a link leaving that node, and a second component of the evaluation value of each node transitions to another node irrespective of the links leaving that node, wherein, under the calculation rule, the second component transitions in greater amounts to nodes whose evaluation values are higher than a first range close to 0 than to nodes whose evaluation values are within the first range, and the proportion of the first component in the evaluation value of a node increases monotonically as the evaluation value increases; and specifying means for specifying one or more nodes corresponding to the query on the basis of the post-transition evaluation values of the nodes calculated by the transition calculation means.

The invention according to claim 2 is the program according to claim 1, wherein the specifying means specifies, as the community corresponding to the query, the group of nodes whose post-transition evaluation values calculated by the transition calculation means are greater than a predetermined threshold.

The invention according to claim 3 is the program according to claim 1 or 2, wherein the specifying means generates a list that presents the specified nodes in descending order of the post-transition evaluation values calculated by the transition calculation means.

The invention according to claim 4 is the program according to any one of claims 1 to 3, wherein, under the calculation rule, the proportion of the first component in the evaluation value of a node is 0 for nodes whose evaluation value is less than or equal to a threshold.

The invention according to claim 5 is an information processing apparatus comprising: evaluation value calculation means for calculating, for each node in a network, an evaluation value indicating the degree to which the node matches a query; transition calculation means for calculating the evaluation value of each node after a transition in accordance with a calculation rule under which a first component of the evaluation value of each node transitions to one of the link-destination nodes along a link leaving that node, and a second component of the evaluation value of each node transitions to another node irrespective of the links leaving that node, wherein, under the calculation rule, the second component transitions in greater amounts to nodes whose evaluation values are higher than a first range close to 0 than to nodes whose evaluation values are within the first range, and the proportion of the first component in the evaluation value of a node increases monotonically as the evaluation value increases; and specifying means for specifying one or more nodes corresponding to the query on the basis of the post-transition evaluation values of the nodes calculated by the transition calculation means.

According to the invention of claim 1, 2, or 5, the community in a network that corresponds to a user's query can be extracted even when the user cannot identify a single node in the network as a seed node.

According to the invention of claim 3, a list of nodes with high evaluation values for the query can be provided that takes into account not only how well each node matches the query but also how the nodes are linked to one another in the network.

According to the invention of claim 4, the calculation by the transition calculation means separates nodes that belong to the community corresponding to the query from nodes that do not more clearly than when the proportion of the first component is set to a value greater than 0 for evaluation values less than or equal to the threshold.

FIG. 1 is a diagram showing an example of the functional configuration of the information processing apparatus of the embodiment.

FIG. 2 is a diagram showing an example of a graph of the relationship between the evaluation value p_n(t) and the link transition component f_n(t) contained in it.

FIG. 3 is a diagram for explaining that, whereas using only the degree of match with the query may extract false members that do not belong to the community, repeatedly calculating the time evolution of the evaluation values with the link structure of the network taken into account extracts a community from which the false members are excluded.

FIG. 4 is a diagram showing another example of a graph of the relationship between the evaluation value p_n(t) and the link transition component f_n(t) contained in it.

The configuration of an apparatus according to an embodiment of the present invention will be described with reference to FIG. 1.

The network information storage unit 10 is a storage device that stores information representing the network to be processed, for example the WWW (hereinafter "network information"). The network information includes information on the structure of the network, that is, connection relationship information describing how the nodes in the network are connected by links. This connection relationship information is expressed, for example, as an adjacency matrix. As is well known, the adjacency matrix A_nm is the matrix in which A_nm = 1 if there is a link from node m to node n and A_nm = 0 otherwise. The connection relationship information may instead be expressed as a transition matrix T_nm. The entry T_nm is the probability that an agent on node m follows the link from node m to node n and transitions to node n. If there is no link from node m to node n, T_nm is 0. For any given node m, the sum of T_nm over all nodes n in the network is 1.
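The relationship between the two representations can be illustrated with a minimal sketch. It assumes that an agent chooses among the outgoing links of a node with equal probability; the description only requires that each column of T_nm sum to 1. The function name `transition_matrix`, the handling of nodes without outgoing links, and the 3-node network are hypothetical and chosen only for illustration.

```python
def transition_matrix(adjacency):
    """Build T from A, where A[n][m] == 1 means a link from node m to node n.

    Assumption: every outgoing link of node m is followed with equal
    probability, so column m of T is column m of A divided by the
    out-degree of node m.
    """
    size = len(adjacency)
    transition = [[0.0] * size for _ in range(size)]
    for m in range(size):
        out_degree = sum(adjacency[n][m] for n in range(size))
        if out_degree == 0:
            # Assumption: a node without outgoing links keeps an all-zero
            # column; the description does not say how such nodes are handled.
            continue
        for n in range(size):
            transition[n][m] = adjacency[n][m] / out_degree
    return transition


# Hypothetical 3-node network with links 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
A = [
    [0, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
]
T = transition_matrix(A)
```

In this example node 0 has two outgoing links, so T[1][0] and T[2][0] are both 0.5, and every column of T sums to 1 as the description requires.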

The network information may also include attribute information for each node. The attribute information of a node contains the values of one or more attributes of the entity that the node represents. For example, when a node represents a web page, the content data of that web page (the HTML that describes the page) is one item of the node's attribute information. When a node represents a paper, the attribute information of the node may include the content (text) of the paper, the author names, the name of the journal in which it was published, and so on. When the target network is a social network, each node corresponds to a person, and the attribute information of the node may include that person's name, age, gender, occupation, affiliated organization, and so on.

When the information on the network to be processed exists outside the apparatus, the network information acquisition unit 12 acquires that information. For example, when the target network is the WWW, the network information acquisition unit 12 collects web pages by following links with a so-called crawler and extracts, from the collected information, the connection (link) relationships between nodes (web pages) and the attribute information of each node. When the target is a social network managed by an SNS (social networking service), the network information acquisition unit 12 acquires the information on that social network, for example through an interface published by the SNS. The network information acquired by the network information acquisition unit 12 is stored in the network information storage unit 10.

The query reception unit 14 receives a query for the target network from the user. A query is a condition for narrowing the nodes of the target network down to those that suit the user's intention. The format of the query is not particularly limited.

For example, the query may be a search condition on one or more attributes of the nodes. When the nodes correspond to documents such as web pages, a keyword (search term) or a logical expression composed of multiple keywords may be used as the query. A combination of conditions on multiple attributes, such as author and journal name, may also be used as the query.

The query may also directly designate one or more specific nodes among the nodes of the target network. This is the case, for example, when a user who wants to find the community of his or her junior high school classmates among the many people (nodes) that make up a social network designates several people he or she recalls as classmates and uses the combination of these designated people as the query.

The initial evaluation value calculation unit 16 calculates the initial value (initial evaluation value) of the evaluation value that indicates how well each node matches the query received by the query reception unit 14. When the query is a search condition, the initial evaluation value may be calculated with the method of a search algorithm that computes a degree of match (score) against the search condition. For example, the degree to which each node satisfies the search condition may be calculated from the attribute information of each node stored in the network information storage unit 10. The score obtained by the search algorithm may also be normalized (for example, so that the evaluation values of all nodes sum to 1) and used as the evaluation value. For a query that directly designates one or more nodes, the initial evaluation value of every node other than the designated nodes may be set to 0 and that of each designated node to a positive value (for example, 1 divided by the number of designated nodes). The user may also be asked to input the initial evaluation value of each designated node. The "evaluation value" of a node is a real number greater than or equal to 0.
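The two initialization options mentioned above can be sketched as follows. The function names `initial_values_from_nodes` and `normalize_scores` are hypothetical, and the search scores are assumed to be non-negative numbers produced by some search algorithm that is not specified here.

```python
def initial_values_from_nodes(num_nodes, specified):
    """Give each designated node 1/len(specified) and every other node 0."""
    chosen = set(specified)
    if not chosen:
        raise ValueError("the query must designate at least one node")
    share = 1.0 / len(chosen)
    return [share if n in chosen else 0.0 for n in range(num_nodes)]


def normalize_scores(scores):
    """Scale non-negative search scores so that they sum to 1."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return [s / total for s in scores]


# A query that designates nodes 1 and 3 in a hypothetical 5-node network.
p0 = initial_values_from_nodes(5, [1, 3])
```

Both helpers return values that sum to 1, which matches the normalization used for the evaluation values in the time evolution.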

The time evolution calculation unit 18 calculates the time evolution of the evaluation value of each node.

The PageRank algorithm is a well-known algorithm for calculating the time evolution of the evaluation value (PageRank value) of each node in a network. It is based on the analogy of a Markov chain random walk in which a single agent wanders randomly from node to node by following links, and a node at which the agent is more likely to be found has a higher PageRank value. The PageRank algorithm calculates the time evolution of the evaluation values of the nodes on the assumption that an agent on node m selects one of the links leaving node m with equal probability and transitions to the node at the end of that link. A method is also known in which the strength of each link is determined from actually available information and the probability that the agent selects a link is set according to the strength of that link. In conventional methods such as the PageRank algorithm, the calculation of the time evolution of the node evaluation values presupposes that the agent moves between nodes through links.

In contrast, the calculation rule used by the time evolution calculation unit 18 of the present embodiment assumes that the transitions of the agent between nodes can include not only a component that passes through links of the network but also a component that jumps directly to nodes with high evaluation values without passing through links. That is, the evaluation value of a node is divided into a first component and a second component: the first component transitions to other nodes through the links leaving the node, and the second component transitions to other nodes irrespective of the links. The first component is called the "link transition component" and the second component the "non-link transition component". The link transition component of a node's evaluation value can transition only to one of the node's link destinations, whereas the non-link transition component can also transition to nodes that are not link destinations of the node.

Under the calculation rule of the time evolution calculation unit 18, the non-link transition component transitions more readily to nodes with high evaluation values (that is, a larger amount transitions to them). For example, the degree (probability) with which the non-link transition component of a node transitions to a destination may increase monotonically with the evaluation value of the destination node. Seen from the other side, the part of a node's evaluation value that originates from non-link transition components of other nodes transitioning to that node increases monotonically with the node's evaluation value. Defining the non-link transition component in this way makes it tend to gather at nodes with high evaluation values. As one example, the destinations of the non-link transition component may be limited to nodes whose evaluation value is greater than or equal to a predetermined threshold. In this example, the non-link transition component does not transition to nodes whose evaluation value is less than or equal to the threshold.
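One way to picture the threshold-limited example is the sketch below. It is only an illustration under stated assumptions: the passage does not say how the non-link amount is divided among the eligible nodes, so the sketch assumes a split in proportion to each eligible node's evaluation value, which is one monotonically increasing choice. The function name `distribute_non_link` and the argument `pool` (the total non-link amount to hand out) are hypothetical.

```python
def distribute_non_link(values, pool, threshold):
    """Split `pool` among nodes whose value is at least `threshold`.

    Assumption: each eligible node receives a share proportional to its
    current evaluation value; nodes below the threshold receive nothing.
    """
    eligible = [p if p >= threshold else 0.0 for p in values]
    total = sum(eligible)
    if total == 0:
        raise ValueError("no node reaches the threshold")
    return [pool * weight / total for weight in eligible]


# Hypothetical evaluation values; node 3 lies below the threshold of 0.1.
shares = distribute_non_link([0.5, 0.3, 0.15, 0.05], pool=0.2, threshold=0.1)
```

With these inputs node 3 receives nothing, the shares of the other nodes add up to the pool, and a node with a higher evaluation value receives a larger share.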

Under the calculation rule of the time evolution calculation unit 18, the proportion of the link transition component in the evaluation value of a node also increases monotonically as the evaluation value of the node increases. Consequently, the proportion of the non-link transition component in the evaluation value decreases monotonically as the evaluation value increases. As a result, the evaluation value of a node with a low evaluation value tends to gather at nodes with high evaluation values instead of moving along links, whereas the evaluation value of a node with a high evaluation value tends to move to other nodes along links. Therefore, if the network contains a part in which a number of nodes with high evaluation values are gathered and connected to one another by links, the evaluation values keep circulating among the nodes of that part through the links even as the time evolution is repeated, so the evaluation values of the nodes in that part become high overall. This "part" forms a "community".

By contrast, if a node with a high evaluation value is isolated, far from the other nodes with high evaluation values, its evaluation value flows out to other nodes through links while little evaluation value flows into it from other nodes, so its evaluation value decreases as the time evolution proceeds. Thus, even a node with a high initial evaluation value obtained from the query is unlikely to be extracted as a member of the community if it is isolated from the others.

In one example, the link transition component is set to 0 for nodes whose evaluation value is less than (or, alternatively, less than or equal to) a threshold θ. In this example, the entire evaluation value of a node whose evaluation value is less than θ is treated as the non-link transition component and, as described above, transitions preferentially to nodes with high evaluation values irrespective of the links of the network.

The source of the evaluation value components that transition is the initial evaluation value of each node obtained from the query by the initial evaluation value calculation unit 16, and the evaluation value of a node with a high initial evaluation value flows out mainly through links to its link destinations. However, in a part where nodes with high initial evaluation values are densely connected by links (that is, a community), evaluation value components flow in through links from other nodes with high evaluation values, so the evaluation values of the nodes in that part stay high. Moving outward from that part, the evaluation values of the nodes fall and the non-link transition component becomes dominant, so that more of the evaluation value returns, so to speak, to nodes with high evaluation values than moves on to link destinations. In particular, in the scheme that sets the link transition component to 0 for nodes whose evaluation value is less than the threshold θ, the evaluation value of such a node never flows out of the community through links; all of it moves to nodes with high evaluation values (most of which are nodes inside the community).

Therefore, as the time evolution proceeds, the evaluation values of the nodes inside the community stay high, while the evaluation values of virtually all nodes outside the community approach 0. Extracting the nodes whose final evaluation value after the time evolution has converged is greater than 0 thus yields a set of nodes that is the community corresponding to the original query. The extraction threshold of 0 used here applies only to the ideal case. In practice, a node outside the community may for some reason retain a small positive evaluation value even after convergence, so the extraction threshold is set to a positive value (for example, the threshold θ described above) to keep such external nodes from being extracted as members of the community.
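The extraction step, combined with the ranked list mentioned in claim 3, can be sketched as follows. The function name `extract_community` and the converged evaluation values are hypothetical; only the rule itself (keep nodes above a positive threshold and list them in descending order of evaluation value) comes from the description.

```python
def extract_community(values, threshold):
    """Return the indices of nodes whose value exceeds `threshold`,
    ordered from the highest evaluation value to the lowest."""
    members = [n for n, p in enumerate(values) if p > threshold]
    return sorted(members, key=lambda n: values[n], reverse=True)


# Hypothetical converged evaluation values for a 5-node network.
converged = [0.4, 0.0, 0.35, 0.01, 0.24]
print(extract_community(converged, threshold=0.1))  # prints [0, 2, 4]
```

Node 3 keeps a small positive value after convergence, and the positive threshold is what keeps it out of the extracted community.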

As a specific example, the time evolution calculation unit 18 calculates the time evolution of the evaluation value p_n(t) of each node n using the following Equations 1 to 3.

Here, p_n(t) is the evaluation value that node n holds at time t, normalized so that the evaluation values of all nodes in the network sum to 1. That is,

$$\sum_{n=1}^{N} p_n(t) = 1.$$

N is the total number of nodes in the network, so the node identification number n takes values from 1 to N. θ is a threshold set for the evaluation value p_n(t).

The function f_n(t) defined by Equation 2 is the link transition component of the evaluation value p_n(t) of node n. In this example, f_n(t) is 0 if p_n(t) is less than the threshold θ, and (p_n(t) − θ) if p_n(t) is equal to or greater than θ. FIG. 2 shows an example graph of the link transition component f_n(t) against the evaluation value p_n(t) for this case. FIG. 2 uses θ = 0.1, but this is merely an example.
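As a minimal sketch of the piecewise definition just described, the link transition component can be written as a small Python function. The function name and the default value of `theta` (taken from the θ = 0.1 example of FIG. 2) are illustrative choices, not part of the specification.

```python
def link_transition_component(p, theta=0.1):
    """Link transition component f for a single evaluation value p.

    Follows the description of Equation 2: 0 when p is below theta,
    and p - theta when p is at or above theta.
    The name and default theta are hypothetical.
    """
    return p - theta if p >= theta else 0.0


def non_link_component(p, theta=0.1):
    """Remainder of the evaluation value that does not follow links."""
    return p - link_transition_component(p, theta)
```

A node exactly at the threshold has a link transition component of 0, so the function is continuous at p = θ.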

The thick solid line in FIG. 2 shows the link transition component f_n(t) according to Equation 2. For reference, the broken line shows the link transition component when the whole of the evaluation value p_n(t) is the link transition component (that is, f_n(t) = p_n(t)).

As shown in Equation 3, the function F(t) is the sum of the link transition components f_n(t) of all nodes in the network.

The first term on the right side of Equation 1 represents the evaluation value component brought about by the transition of link transition components. It is the sum of the evaluation values that arrive at node n when the link transition component f_m(t−1) of each node m at time (t−1) transitions according to the transition matrix T_nm, which represents the connection relationships between the nodes in the network.

The second term on the right side of Equation 1 represents the evaluation value component brought about by the transition of non-link transition components. This component is described in detail below.

The non-link transition component of a node's evaluation value is expressed by the following equation.

$$p_n(t) - f_n(t)$$

Summing the non-link transition components over all nodes in the network gives the following.

$$\sum_{n=1}^{N} \left[ p_n(t) - f_n(t) \right] = 1 - F(t)$$

As this shows, [1 − F(t−1)] in the second term on the right side of Equation 1 is the sum of the non-link transition components of all nodes in the network at time (t−1). The second term is this sum multiplied by the share that the link transition component f_n(t−1) of the node n of interest holds among the link transition components of all nodes. In other words, the second term distributes the sum of the non-link transition components of all nodes among the nodes in proportion to the size of each node's link transition component. In this example, therefore, a node with a higher evaluation value p_n(t) receives a larger share of the evaluation value originating from non-link transition components.
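The two terms described above can be sketched as a single update step in Python. This is an illustration under stated assumptions rather than the implementation of the embodiment: the function name `step` is hypothetical, the matrix is given as nested lists with `T[n][m]` holding the weight of the transition from node m to node n, and each column of `T` is assumed to sum to 1 so that the total evaluation value is preserved.

```python
def step(p, T, theta):
    """One cycle of the update described for Equation 1 (a sketch).

    p     : list of evaluation values, assumed to sum to 1
    T     : T[n][m] = weight of the link transition from node m to node n
            (assumption: every column of T sums to 1)
    theta : threshold below which the link transition component is 0
    """
    n_nodes = len(p)
    f = [max(x - theta, 0.0) for x in p]   # link transition components
    total_link = sum(f)                    # F(t-1)
    if total_link == 0.0:
        # The proportional redistribution is undefined in this case.
        raise ValueError("no node has an evaluation value above theta")
    non_link = 1.0 - total_link            # sum of non-link components
    return [
        # first term: inflow along links
        sum(T[n][m] * f[m] for m in range(n_nodes))
        # second term: non-link total shared in proportion to f[n]
        + non_link * f[n] / total_link
        for n in range(n_nodes)
    ]
```

For instance, with three nodes where nodes 0 and 1 link to each other and node 2 links only to node 0, starting from p = [0.5, 0.4, 0.1] with θ = 0.2, one step gives approximately [0.5, 0.5, 0.0]: the below-threshold node 2 receives nothing and its value is absorbed by the two linked nodes.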

For brevity, Equation 1 takes the length of the unit time, that is, one cycle of the time evolution, to be 1. The same approach applies when one cycle has a different length.

In this example, the time evolution calculation unit 18 repeatedly calculates the time evolution of Equation 1, starting from the set of initial evaluation values {p_n(0)} (time t = 0) of the nodes n calculated by the initial evaluation value calculation unit 16. This starting set of initial evaluation values {p_n(0)} is called the initial pattern. The time evolution calculation unit 18 repeats the calculation of Equation 1, advancing the time t by one unit time at each step, until the evaluation values p_n(t) converge. Convergence may be determined by a conventional method. For example, the evaluation values may be judged to have converged when, for every node n, the difference between the evaluation value p_n(t) at one time and the evaluation value p_n(t−1) at the preceding time is at or below a predetermined threshold.
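The repeated calculation and the convergence test described above might look as follows in Python. The names `step` and `evolve`, the tolerance `tol`, and the `max_steps` safeguard are all hypothetical. The update itself is the same sketch of Equation 1 under the assumption that each column of `T` sums to 1.

```python
def step(p, T, theta):
    """One cycle of the update described for Equation 1 (a sketch)."""
    n_nodes = len(p)
    f = [max(x - theta, 0.0) for x in p]   # link transition components
    total_link = sum(f)                    # F(t-1)
    if total_link == 0.0:
        raise ValueError("no node has an evaluation value above theta")
    non_link = 1.0 - total_link
    return [
        sum(T[n][m] * f[m] for m in range(n_nodes))
        + non_link * f[n] / total_link
        for n in range(n_nodes)
    ]


def evolve(p0, T, theta, tol=1e-9, max_steps=10_000):
    """Iterate from the initial pattern p0 until every node's evaluation
    value changes by at most tol between consecutive steps."""
    p = list(p0)
    for _ in range(max_steps):
        q = step(p, T, theta)
        if max(abs(a - b) for a, b in zip(p, q)) <= tol:
            return q
        p = q
    # max_steps is an added safeguard, not part of the description.
    raise RuntimeError("evaluation values did not converge")
```

The per-node maximum difference used here is one instance of the "conventional method" mentioned in the text; other norms would serve equally well.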

The converged evaluation value p_n^stead of node n obtained in this way represents the degree to which node n matches the user's query, taking into account the connection (link) relationships between the nodes in the network.

The community extraction unit 20 extracts a community (a set of nodes) matching the query from the converged (that is, steady-state) evaluation values p_n^stead obtained by the time evolution calculation unit 18. For example, it extracts as community members the nodes n whose steady-state evaluation value p_n^stead is greater than a predetermined threshold (a value of 0 or more). The community extraction unit 20 may also generate a list of the members of the extracted community and display it on a screen. This list may be sorted by final evaluation value.
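The extraction and sorting just described can be sketched in a few lines of Python. The function name `extract_community`, the use of list indices as node identifiers, and the choice of descending order for the list are illustrative assumptions.

```python
def extract_community(p_steady, threshold=0.0):
    """Return (node, evaluation value) pairs for nodes whose steady-state
    evaluation value exceeds the threshold, highest value first.

    Node identifiers are taken to be list indices (an assumption).
    """
    members = [(n, v) for n, v in enumerate(p_steady) if v > threshold]
    members.sort(key=lambda item: item[1], reverse=True)
    return members
```

With a threshold of 0.1, the values [0.5, 0.0, 0.3, 0.2] yield nodes 0, 2, and 3 in that order, and node 1 is excluded.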

The set {p_n^stead} of converged evaluation values obtained by the time evolution calculation unit 18 is one of the attractors among the states that the set of evaluation values of the network's nodes can take. Even if the initial pattern {p_n(0)} is changed slightly, the time evolution calculation reaches the same attractor. Therefore, the user need not specify the initial pattern exactly; as long as it is specified with at least a certain degree of accuracy, the time evolution calculation reaches the same attractor.

Moreover, even if the initial pattern {p_n(0)} includes nodes that lie outside the desired community but have high initial evaluation values, as long as most of the nodes with high initial evaluation values lie inside that community, the time evolution calculation reaches the same attractor, in which the evaluation values of all nodes outside the community are 0 or small. For example, suppose that the group of nodes 102 inside the broken-line circle 110 in the network 100 shown in FIG. 3(a) is the desired community, and that the six dark-colored nodes in (a) are the nodes with high initial evaluation values obtained from the query entered by the user. In this example, most of the nodes with high initial evaluation values are inside the community 110, but some are outside it. These high-valued nodes outside the community 110 are false members with respect to the community 110. When the iterative calculation of the time evolution calculation unit 18 is applied to an initial pattern containing such false members, the nodes 102 inside the community 110 attain high evaluation values (shown in dark color), as shown in FIG. 3(b), and all nodes outside the community 110, including the former false members, attain evaluation values of 0 or very close to 0. That is, even if the initial pattern obtained from the query contains some false members, the time evolution calculation unit 18 and the community extraction unit 20 extract the correct community, free of false members.

As described above, the desired community can be extracted as long as the query first entered by the user identifies that community with reasonable accuracy. A node that lies outside the desired community but has a high initial evaluation value is a false member from the community's point of view, yet the method of this embodiment can extract the desired community even when the initial pattern contains such false members.

The embodiment described above is merely an example, and various modifications are possible within the scope of the present invention.

For example, the definition of the link transition component f_n(t) given in Equation 2 above (see also FIG. 2) is merely one example. As another example, a definition may be used in which f_n(t) = 0 when the evaluation value p_n(t) is less than the threshold θ, and f_n(t) = p_n(t) when it is equal to or greater than θ. In this case, for a node whose evaluation value p_n(t) is equal to or greater than the threshold θ, the entire evaluation value p_n(t) is the link transition component and the non-link transition component is 0. The graph of the link transition component f_n(t) for this example is like the solid line shown in FIG. 4.
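For comparison with the definition of Equation 2, the alternative described above can be sketched as follows. The function name and the default θ are hypothetical.

```python
def link_transition_component_step(p, theta=0.1):
    """Alternative definition (the FIG. 4 variant as described in the text):
    0 below theta, and the whole evaluation value p at or above theta.
    The name and default theta are hypothetical.
    """
    return p if p >= theta else 0.0
```

Unlike the Equation 2 form, this variant jumps from 0 to θ at p = θ, and a node at or above the threshold contributes nothing to the non-link total.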

In the examples of FIG. 2 (Equation 2) and FIG. 4, the link transition component f_n(t) is 0 over the range where the evaluation value p_n(t) is less than the threshold θ, but this too is merely an example. As long as the proportion of the evaluation value p_n(t) occupied by the link transition component f_n(t) increases monotonically as p_n(t) increases, f_n(t) may take values greater than 0 over the range of p_n(t) from 0 to the threshold θ (for example, f_n(t) may increase gradually from 0 as p_n(t) increases from 0). If this condition is satisfied, evaluation value is less likely to flow out from nodes inside the community to nodes outside it, so the difference in evaluation value between the community and its exterior becomes clearer as the time evolution proceeds. Note that, because the link transition component f_n(t) is the amount of the evaluation value p_n(t) that transitions along the links exiting node n, it never exceeds p_n(t).
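One function that satisfies the stated condition is f = p² / (p + θ), whose share of the evaluation value, f / p = p / (p + θ), rises monotonically from 0 toward 1. This particular form is purely a hypothetical illustration and does not appear in the description. The sketch below defines it together with a small numerical check of the two stated properties, a monotonically increasing share and f never exceeding p.

```python
def smooth_link_component(p, theta=0.1):
    """Hypothetical smooth link transition component f = p^2 / (p + theta).

    Requires theta > 0. Its share f / p = p / (p + theta) increases
    monotonically with p, and f never exceeds p.
    """
    return p * p / (p + theta)


def satisfies_condition(f, theta, samples=100):
    """Check on a grid over (0, 1] that f(p) / p is strictly increasing
    and that f(p) <= p. A numerical spot check, not a proof."""
    grid = [i / samples for i in range(1, samples + 1)]
    shares = [f(p, theta) / p for p in grid]
    increasing = all(a < b for a, b in zip(shares, shares[1:]))
    bounded = all(f(p, theta) <= p for p in grid)
    return increasing and bounded
```

Calling `satisfies_condition(smooth_link_component, 0.1)` exercises the check for θ = 0.1. The same helper can be applied to other candidate functions with the signature `f(p, theta)`.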

In addition, (a) the threshold θ, which is the upper limit of the evaluation values for which the link transition component is 0 (see FIG. 2), and (b) the threshold that is the lower limit of the evaluation value a node must have to be extracted as a community member, both of which appear in the above description, need not be the same value.

The information processing apparatus exemplified above is realized by causing a computer to execute a program representing the functions of the apparatus. Here, the computer has, as hardware, a circuit configuration in which, for example, a microprocessor such as a CPU, memory (primary storage) such as random access memory (RAM) and read-only memory (ROM), a controller that controls fixed storage devices such as an HDD (hard disk drive) or SSD (solid state drive), various I/O (input/output) interfaces, and a network interface that controls connection to a network such as a local area network are connected via, for example, a bus. A disk drive for reading from and/or writing to portable disk recording media such as CDs and DVDs, a memory reader/writer for reading from and/or writing to portable nonvolatile recording media of various standards such as flash memory, and the like may also be connected to the bus, for example via an I/O interface. A program describing the processing of each functional module exemplified above is saved to a fixed storage device such as an HDD via a recording medium such as a CD or DVD, or via communication means such as a network, and installed on the computer. The functional modules exemplified above are realized when the program stored in the fixed storage device is read into the RAM and executed by a microprocessor such as a CPU.

Description of symbols: 10 network information storage unit, 12 network information acquisition unit, 14 query reception unit, 16 initial evaluation value calculation unit, 18 time evolution calculation unit, 20 community extraction unit.

Claims

1. A program for causing a computer to function as:
evaluation value calculation means for calculating, for each node in a network, an evaluation value indicating the degree to which the node matches a query;
transition calculation means for calculating the post-transition evaluation value of each node according to a calculation rule under which a first component of the evaluation value of each node transitions to one of the linked nodes along a link exiting the node, and a second component of the evaluation value of each node transitions to other nodes regardless of the links exiting the node, wherein, under the calculation rule, the second component transitions more to nodes whose evaluation value is higher than a first range close to 0 than to nodes whose evaluation value is within the first range, and the proportion of the first component within the evaluation value of a node increases monotonically as the evaluation value increases; and
specifying means for specifying one or more nodes corresponding to the query based on the post-transition evaluation value of each node calculated by the transition calculation means.

2. The program according to claim 1, wherein the specifying means specifies, as a community corresponding to the query, the group of nodes whose post-transition evaluation value calculated by the transition calculation means is greater than a predetermined threshold.

3. The program according to claim 1, wherein the specifying means generates a list presenting the specified group of nodes in descending order of the post-transition evaluation values of the nodes calculated by the transition calculation means.

4. The program according to claim 1, wherein, under the calculation rule, the proportion of the first component within the evaluation value of a node is 0 for a node whose evaluation value is equal to or less than a threshold.

5. An information processing apparatus comprising:
evaluation value calculation means for calculating, for each node in a network, an evaluation value indicating the degree to which the node matches a query;
transition calculation means for calculating the post-transition evaluation value of each node according to a calculation rule under which a first component of the evaluation value of each node transitions to one of the linked nodes along a link exiting the node, and a second component of the evaluation value of each node transitions to other nodes regardless of the links exiting the node, wherein, under the calculation rule, the second component transitions more to nodes whose evaluation value is higher than a first range close to 0 than to nodes whose evaluation value is within the first range, and the proportion of the first component within the evaluation value of a node increases monotonically as the evaluation value increases; and
specifying means for specifying one or more nodes corresponding to the query based on the post-transition evaluation value of each node calculated by the transition calculation means.