JP2008077403A - Evaluation device, method and program

Info

Publication number: JP2008077403A
Application number: JP2006255986A
Authority: JP
Inventors: Tsutomu Yonemori (米森 力); Tsutomu Matsunaga (松永 務)
Original assignee: NTT Data Corp
Current assignee: NTT Data Group Corp
Priority date: 2006-09-21
Filing date: 2006-09-21
Publication date: 2008-04-03
Anticipated expiration: 2026-09-21
Also published as: JP4391506B2

Abstract

PROBLEM TO BE SOLVED: To identify groups of companies that are useful for decision support, and to make it easy to select target companies through scoring.

SOLUTION: A data conversion unit 102 converts information stored in an inter-company relationship DB 201, which indicates the transaction relationships among a plurality of companies, into network data in which the companies are nodes and the relationships between companies are edges. The edges may be weighted, for example by taking credit scores into account in the transaction relationships. A community extraction unit 103 extracts communities by applying eigenvalue expansion to the network data, clustering the companies, calculating the degree of belonging of each company to its cluster, and comparing those degrees of belonging with a threshold value τ. An output unit 104 outputs the company information and scores (degrees of belonging) for each community.

COPYRIGHT: (C) 2008, JPO & INPIT

Description

The present invention relates to an evaluation apparatus, an evaluation method, and an evaluation program.

Relationships between companies are an important source of information for management decision making. For example, when choosing an investment target or setting a lending limit for a borrower, the risks arising from the target company's business relationships, partnerships, and the like are taken into account.

Conventionally, a technique has been proposed for grasping business relationships between companies by searching databases of credit investigation results, securities reports from the companies under analysis, and the like (see, for example, Patent Document 1).
Patent Document 1: JP 2003-30467 A

However, because the conventional technique described above grasps relationships between companies by searching a database, it can identify the companies directly related to a given company but cannot grasp inter-company relationships from an overall viewpoint, such as the business partners of a business partner. In addition, when selecting target companies on the basis of inter-company relationships, assigning priorities by scoring is considered effective, but the conventional technique cannot assign scores that reflect the overall relationships.

The present invention has been made in view of these circumstances. Its object is to provide an evaluation apparatus, an evaluation method, and an evaluation program that, by considering inter-company relationships comprehensively, make it possible to identify groups of companies useful for decision support and to select target companies easily through scoring.

To solve the above problem, the present invention comprises: conversion means for converting a plurality of data items into network data in which each data item is a node and the degree of association between data items is an edge, and for inputting the converted network data; classification means for calculating a degree of aggregation among the data items from the input network data and classifying the plurality of data items into a plurality of clusters on the basis of the calculated degree of aggregation; belonging degree calculation means for calculating, for each data item in each of the clusters classified by the classification means, a degree of belonging to the corresponding cluster; and extraction means for extracting, for each cluster, a community made up of a plurality of highly associated data items on the basis of each data item's degree of belonging to the corresponding cluster.

In the above invention, the extraction means updates the network data on the basis of each data item's degree of belonging to the corresponding cluster and inputs the updated network data to the classification means, repeating this until the number of extracted communities reaches a preset number of communities to be extracted.

In the above invention, the conversion means weights the degrees of association between the plurality of data items according to the purpose of the evaluation.

To solve the above problem, the present invention also includes: a conversion step of converting a plurality of data items into network data in which each data item is a node and the degree of association between data items is an edge, and inputting the converted network data; a classification step of calculating a degree of aggregation among the data items from the input network data and classifying the plurality of data items into a plurality of clusters on the basis of the calculated degree of aggregation; a belonging degree calculation step of calculating, for each data item in each of the classified clusters, a degree of belonging to the corresponding cluster; and an extraction step of extracting, for each cluster, a community made up of a plurality of highly associated data items on the basis of each data item's degree of belonging to the corresponding cluster.

To solve the above problem, the present invention also causes a computer to execute: a conversion step of converting a plurality of data items into network data in which each data item is a node and the degree of association between data items is an edge, and inputting the converted network data; a classification step of calculating a degree of aggregation among the data items from the input network data and classifying the plurality of data items into a plurality of clusters on the basis of the calculated degree of aggregation; a belonging degree calculation step of calculating, for each data item in each of the classified clusters, a degree of belonging to the corresponding cluster; and an extraction step of extracting, for each cluster, a community made up of a plurality of highly associated data items on the basis of each data item's degree of belonging to the corresponding cluster.

According to the invention, a plurality of data items are converted into network data in which each data item is a node and the degree of association between data items is an edge; the converted network data is input; a degree of aggregation among the data items is calculated from the input network data; the data items are classified into a plurality of clusters on the basis of the calculated degree of aggregation; for each data item in each classified cluster, a degree of belonging to the corresponding cluster is calculated; and, on the basis of those degrees of belonging, a community made up of a plurality of highly associated data items is extracted for each cluster. The evaluation result is therefore a set of communities that suits the purpose and reflects inter-company relationships comprehensively, which makes it easy to identify groups of companies useful for decision support.

Further, according to the invention, the network data is updated on the basis of each data item's degree of belonging to the corresponding cluster and input to the classification means until the number of extracted communities reaches a preset number of communities to be extracted. The classification into clusters therefore becomes more accurate, which helps users understand the analysis results.

Further, according to the invention, the degrees of association between data items are weighted according to the purpose of the evaluation. The degree to which each item belongs to a community is therefore obtained as a score (degree of belonging), and this score can be used as a priority in the criteria for selecting survey targets, making it easy to select target companies.

An evaluation apparatus according to an embodiment of the present invention is described below with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of an evaluation apparatus 1 according to the embodiment. In FIG. 1, an input unit 101 receives, as parameters, the number C of communities to be extracted and a threshold τ used to decide which companies belong to each extracted community. A data conversion unit 102 converts the information stored in an inter-company relationship information DB (database) 201, which indicates the business relationships among a plurality of companies, into network data and outputs it as network data 202.

A community extraction unit 103 uses the network data 202 to classify the companies into a plurality of clusters according to their degree of aggregation, and extracts communities, each made up of a group of companies, according to each company's degree of belonging to the cluster. In doing so, it outputs to an output unit 104 a score (degree of belonging) indicating how strongly each company belongs to the community, and stores the degree of aggregation in an aggregation degree DB 203. The output unit 104 outputs the company information and scores (degrees of belonging) for each community.

Next, the operation of this embodiment is described.
FIG. 2 is a flowchart for explaining the operation of the data conversion unit 102. The data conversion unit 102 reads the information stored in the inter-company relationship information DB 201, which indicates the business relationships among a plurality of companies (step S1), and converts it into the network data 202, with companies as nodes and inter-company relationships as edges (step S2).

For example, when the inter-company relationship is a transaction relationship, let i and j be company IDs (identification information) and let each company i be a node. An adjacency matrix E can then be constructed whose element e_ij represents the relationship between company i and company j. Depending on whether a purchase or sales transaction exists between the two companies, e_ij = 1 (transaction) or e_ij = 0 (no transaction). This adjacency matrix E is the network data 202.
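The conversion of steps S1 and S2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_adjacency` and the dictionary layout of the relationship data are assumptions, and the sample uses only the three rows of the FIG. 5(a) example that are spelled out in the text (the remaining rows are elided in the source and are not filled in here).

```python
def build_adjacency(relations):
    """Convert {company: [trading partners]} into (ids, E).

    E is a symmetric 0/1 adjacency matrix with a zero diagonal (e_ii = 0).
    The input layout is an assumption made for this sketch.
    """
    # Collect every company that appears as a key or as a partner.
    ids = sorted(set(relations) | {p for ps in relations.values() for p in ps})
    index = {company: k for k, company in enumerate(ids)}
    n = len(ids)
    E = [[0] * n for _ in range(n)]
    for company, partners in relations.items():
        for partner in partners:
            i, j = index[company], index[partner]
            if i != j:          # keep the diagonal at zero
                E[i][j] = 1     # a transaction exists
                E[j][i] = 1     # the relationship is treated as undirected
    return ids, E


# Rows taken from the FIG. 5(a) description; other rows are not reproduced.
relations = {
    "A": ["B", "C", "E", "F"],
    "B": ["A", "C", "G"],
    "C": ["A", "B"],
}
ids, E = build_adjacency(relations)
for company, row in zip(ids, E):
    print(company, row)
```

Treating the relationship as undirected keeps E symmetric, which the later eigenvalue step relies on.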

Edges may also be weighted according to the purpose of the analysis. For example, when credit scores (also called ratings) are taken into account in the transaction relationships, let i and j be company IDs (identification information), let score be the credit score of company i, and let dim_i be the degree of company i (the number of edges attached to node i, that is, the number of branches of the node). The weight e_ij of the edge between company i and company j is then determined by Equation (1), and the network data 202 is constructed from these weights.

The degree of aggregation, the degree of belonging, and clusters are now explained. The degree of aggregation is the maximum eigenvalue, and it is used for cluster classification. For the adjacency matrix E described above, let n be the number of companies from which communities are to be extracted and let x_i be the degree of belonging of node i to a cluster; the degree of aggregation of the cluster is then evaluated by Equation (2).

Here the superscript T denotes the transpose of x. E = [e_ij] (with e_ii = 0) is the weighted adjacency matrix, and the vector x = [x_1, ..., x_n] that maximizes the degree of aggregation is sought, subject to |x|^2 = 1. The cluster with the highest degree of aggregation is determined by the eigenvector of the maximum eigenvalue, obtained by solving the eigenvalue problem for the adjacency matrix E shown in Equation (3), that is, by performing eigenvalue expansion. The details of eigenvalue expansion are described later.

Because E is a non-negative symmetric matrix, the elements of the eigenvector x are non-negative. The degree of aggregation is the maximum eigenvalue. Here, the largest element of x, written x_i* (where i* is the index of the maximum element), is used as a normalization constant: Equation (4) rescales x so that its maximum value is 1, and the result is defined as the degree of belonging.
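The relationships described for Equations (2) to (4) can be written out as below. This is a reconstruction from the surrounding definitions (transpose, unit-norm constraint, eigenvalue problem, normalization by the largest element), offered as an assumption about their form rather than a copy of the published formulas; the symbol m_i for the degree of belonging follows its later use in step S14.

```latex
\begin{align}
  \text{degree of aggregation} &= x^{T} E\, x
      = \sum_{i=1}^{n}\sum_{j=1}^{n} e_{ij}\, x_i x_j ,
      \qquad \lVert x \rVert^{2} = 1 \tag{2} \\
  E\, x &= \lambda\, x \tag{3} \\
  m_i &= \frac{x_i}{x_{i^{*}}} ,
      \qquad i^{*} = \arg\max_{i} x_i \tag{4}
\end{align}
```

For a symmetric E, the maximum of x^T E x over unit vectors equals the largest eigenvalue λ_max and is attained at the corresponding eigenvector, which is why the degree of aggregation coincides with the maximum eigenvalue.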

FIG. 3 is a flowchart for explaining the operation of the community extraction unit 103. The community extraction unit 103 receives the parameters supplied through the input unit 101 (the number of communities C and the threshold τ) and the network data 202 (an n × n matrix) (step S10). It then applies eigenvalue expansion to the network data 202 (step S11). Eigenvalue expansion yields n eigenvalues and n eigenvectors (an n × n matrix). Of these, only the eigenvector x for the maximum eigenvalue, that is, for the degree of aggregation, is used in the subsequent processing. The eigenvector x is a single-column matrix (vector) corresponding to the maximum eigenvalue. Applying eigenvalue expansion thus forms a cluster made up of a group of companies (nodes). The maximum eigenvalue (degree of aggregation) is recorded in the aggregation degree DB 203. The details of eigenvalue expansion are described later.
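Since only the eigenvector of the maximum eigenvalue is used, steps S11 to S13 can be illustrated without a full decomposition. The sketch below uses power iteration on E + I, which is a substitute chosen for brevity and not the Jacobi method named in the text; the function names, the `>=` comparison with τ, and the three-company sample are all assumptions.

```python
import math


def leading_eigenpair(E, iters=1000, tol=1e-12):
    """Power iteration on E + I for a symmetric, non-negative matrix E.

    Adding the identity shifts every eigenvalue by 1, which avoids
    oscillation on bipartite graphs. Returns (largest eigenvalue of E,
    unit-norm eigenvector with non-negative entries).
    """
    n = len(E)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        # y = (E + I) x
        y = [x[i] + sum(E[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]
        delta = max(abs(a - b) for a, b in zip(x, y))
        x = y
        if delta < tol:
            break
    # Rayleigh quotient x^T E x gives the eigenvalue (degree of aggregation).
    Ex = [sum(E[i][j] * x[j] for j in range(n)) for i in range(n)]
    lam = sum(a * b for a, b in zip(x, Ex))
    return lam, x


def membership(x):
    """Step S12: divide by the largest element so the maximum becomes 1."""
    top = max(x)
    return [v / top for v in x]


def select_members(ids, m, tau):
    """Step S13: keep companies whose degree of belonging reaches tau.
    Whether the comparison is >= or > is not stated in the text."""
    return [c for c, v in zip(ids, m) if v >= tau]


# Hypothetical sample: company B trades with A and C; A and C do not trade.
ids = ["A", "B", "C"]
E = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
lam, x = leading_eigenpair(E)
m = membership(x)
selected = select_members(ids, m, tau=0.8)
print(lam, m, selected)
```

In this sample the hub company B receives a degree of belonging of 1, and the two peripheral companies receive a lower, equal value.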

Next, the degree of belonging of each company to the cluster is calculated by dividing the eigenvector x by x*, the maximum of its elements (step S12), and the companies in the community are determined by comparing the degrees of belonging with the threshold τ (step S13). The input data is then converted (step S14): each element e_ij of the network data 202 (E) is updated by Equation (5) using the degree of belonging m. This yields network data 202 (E) from which the degrees of belonging obtained up to the t-th iteration have been removed.

Here, t is the iteration count. Next, it is determined whether the desired number C of communities has been extracted (step S15). If not, the process returns to step S11 and the above processing is repeated. When the number of communities reaches C, the process ends.

In this way, steps S11 to S14 are repeated. When the next cluster is calculated, the influence of the clusters already extracted is excluded by removing from the adjacency matrix the degrees of belonging obtained up to the t-th iteration; this yields the (t+1)-th cluster, so that the companies can be classified into a plurality of clusters. The number of clusters contained in the network is determined from the change in the degree of aggregation.
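The loop over steps S11 to S15 can be sketched as follows. Two points are assumptions and should be read as such: the eigenvector is obtained by power iteration rather than the Jacobi method, and, because the form of Equation (5) is not reproduced in this text, the update in step S14 uses a hypothetical rule, e_ij ← e_ij (1 − m_i)(1 − m_j), chosen only because it removes edges among companies that already belong fully to an extracted cluster.

```python
import math


def top_eigvec(E, iters=2000):
    """Leading eigenpair of a symmetric non-negative E via power iteration
    on E + I (a stand-in for the eigenvalue expansion of step S11)."""
    n = len(E)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [x[i] + sum(E[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    lam = sum(x[i] * E[i][j] * x[j] for i in range(n) for j in range(n))
    return lam, x


def extract_communities(E, C, tau):
    """Repeat S11-S14 until C communities have been extracted (S15).

    Returns a list of (degree of aggregation, degrees of belonging,
    member indices), one entry per extracted cluster.
    """
    E = [row[:] for row in E]               # work on a copy
    n = len(E)
    results = []
    for _ in range(C):
        lam, x = top_eigvec(E)              # S11
        top = max(x)
        m = [v / top for v in x]            # S12
        members = [i for i in range(n) if m[i] >= tau]   # S13 (>= assumed)
        results.append((lam, m, members))
        # S14: ASSUMED update rule, standing in for Equation (5).
        for i in range(n):
            for j in range(n):
                E[i][j] *= (1.0 - m[i]) * (1.0 - m[j])
    return results


# Hypothetical network: a triangle of companies 0-1-2 and a separate pair 3-4.
E = [[0, 1, 1, 0, 0],
     [1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 1, 0]]
results = extract_communities(E, C=2, tau=0.3)
for lam, m, members in results:
    print(round(lam, 6), members)
```

In this sample the denser triangle is extracted first with the larger degree of aggregation, and the pair follows with a smaller one, which illustrates how the change in the degree of aggregation from one iteration to the next can indicate how many clusters the network contains.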

FIG. 5(a) is a conceptual diagram showing an example of the inter-company relationship information DB 201. In the illustrated example, company A trades with companies B, C, E, and F; company B trades with companies A, C, and G; company C trades with companies A and B; ...; and company K trades with companies E, H, I, and J. The network data 202 converted from the inter-company relationship information DB 201 is as shown, for example, in FIG. 5(b). Each node (circle) corresponds to one of the companies A to G, and transactions between companies are shown as edges. The network data 202 is also represented by the matrix shown in FIG. 5(c), in which each element is set to 1 or 0 according to whether a transaction exists on the network.

When the network data 202 is as shown in FIG. 5(c), the companies are classified into two clusters, 1 and 2, as shown in FIG. 6. As FIG. 6 also shows, a degree of belonging is calculated for each company in clusters 1 and 2, indicating how strongly the company belongs to that cluster. Then, according to the threshold τ (0.3 in the illustrated example), communities 1 and 2 are extracted from clusters 1 and 2, as shown in FIG. 7.

FIG. 4 is a flowchart showing the procedure of the Jacobi method, a typical algorithm for the eigenvalue expansion described above. In the Jacobi method, the data (the matrix to be processed) is first set as E (n × n) = [e_ij], the convergence parameter eps is set to a value close to 0 (for example, 0.001), and a matrix X (n × n) = [x_ij] is prepared for storing the eigenvectors (step S20). Next, the off-diagonal elements (i ≠ j) of E are searched for the element e_IJ with the largest value and its position I, J (step S21).

Next, convergence is tested (e_IJ < eps) (step S22). If the iteration has not converged, θ is calculated (step S23); e_Ij and e_Jj (j = 1, 2, ..., n) are updated (step S24); e_iI and e_iJ (i = 1, 2, ..., n) are updated (step S25); x_iI and x_iJ (i = 1, 2, ..., n) are updated (step S26); and the process returns to step S21 and repeats until convergence. On convergence, X is output as the eigenvectors and the diagonal of E as the eigenvalues (step S27), and the process ends.
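The flow of steps S20 to S27 corresponds to the classical Jacobi eigenvalue algorithm for symmetric matrices, which can be sketched as below. The rotation-angle formula and the use of the absolute value in the pivot search and convergence test are the textbook choices and are assumptions here, since the text does not spell them out; the function name, the default eps, and the rotation cap are likewise illustrative.

```python
import math


def jacobi_eigen(E, eps=1e-10, max_rotations=10000):
    """Classical Jacobi method for a symmetric matrix E.

    Returns (eigenvalues, X), where column k of X is the eigenvector
    for eigenvalues[k].
    """
    n = len(E)
    A = [[float(v) for v in row] for row in E]          # S20: working copy
    X = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_rotations):
        # S21: find the off-diagonal element of largest magnitude.
        I, J, big = 0, 1, 0.0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(A[i][j]) > big:
                    I, J, big = i, j, abs(A[i][j])
        if big < eps:                                   # S22: converged
            break
        # S23: angle that zeroes A[I][J] after the rotation.
        theta = 0.5 * math.atan2(2.0 * A[I][J], A[I][I] - A[J][J])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):                              # S24: rows I and J
            aIk, aJk = A[I][k], A[J][k]
            A[I][k] = c * aIk + s * aJk
            A[J][k] = -s * aIk + c * aJk
        for k in range(n):                              # S25: columns I and J
            akI, akJ = A[k][I], A[k][J]
            A[k][I] = c * akI + s * akJ
            A[k][J] = -s * akI + c * akJ
        for k in range(n):                              # S26: accumulate rotations
            xkI, xkJ = X[k][I], X[k][J]
            X[k][I] = c * xkI + s * xkJ
            X[k][J] = -s * xkI + c * xkJ
    return [A[i][i] for i in range(n)], X               # S27


# Hypothetical 3-company network: B trades with A and C.
E = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
vals, X = jacobi_eigen(E)
k = max(range(len(vals)), key=lambda i: vals[i])        # index of the maximum eigenvalue
x = [X[r][k] for r in range(len(E))]
print(vals[k], x)
```

Only the column of X belonging to the maximum eigenvalue is needed for the community extraction; its sign is arbitrary, so a caller may need to flip it to obtain non-negative elements before normalizing.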

Next, the results of community extraction from real data using the evaluation method of this embodiment are described. Purchase and sales transaction information for companies was used to examine whether communities corresponding to the constructed relationships are extracted. Two cases are described: one using transaction relationships alone, and one in which credit scores are reflected in the transaction relationships. The following data were used: (1) Nikkei NEEDS: purchase and sales transaction information for 2,903 companies listed in fiscal 2003; (2) Teikoku Databank COSMOS1: the credit scores contained in the summary file.

First, when the inter-company relationship is the transaction relationship, that is, when credit scores are not considered in the processing of FIG. 2, the results shown in FIG. 8 were obtained. A community of general trading companies was extracted as cluster 1, and a community from the automobile industry was extracted as cluster 2. The companies in each have close customer and supplier relationships, so the extracted communities are reasonable. This makes it easy to see which community a partner company belongs to.

FIG. 9 shows the network for the 336 companies with high degrees of belonging to the clusters. In the figure, the black circles, parallelograms, squares, triangles, and rhombuses each represent a node, that is, a company; the parallelograms, squares, triangles, and rhombuses indicate that the company belongs to the first, second, third, and fourth cluster, respectively. A company that belongs to more than one cluster is marked with the cluster extracted first. The figure confirms that the densely connected parts of the transaction network are extracted as communities.

Next, when credit scores (ratings) are considered in addition to the transaction relationships in the processing of FIG. 2, the results shown in FIG. 10 were obtained. As the bold lines show, companies that have low credit scores and are related to one another are extracted as communities. For example, when a partner company ranks high in such a community, the risk can be reduced by performing a more detailed analysis.

According to the embodiment described above, the evaluation result is a set of communities that suits the purpose and reflects inter-company relationships comprehensively, so groups of companies useful for decision support can be identified easily. Because the degree to which a company belongs to a community is obtained as a score (degree of belonging), this score can be used as a priority in the criteria for selecting survey targets. In addition, because the essential number of communities contained in the data is determined from the change in the degree of aggregation, the analysis results become easier to understand.

Although the embodiment described above uses company data as an example, the present invention is not limited to this and may be applied to various kinds of data.

The conversion means of the present invention corresponds to the data conversion unit 102, and the classification means, the belonging degree calculation means, and the extraction means correspond to the community extraction unit 103.

FIG. 1 is a block diagram showing the configuration of the evaluation apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart for explaining the operation of the data conversion unit 102.
FIG. 3 is a flowchart for explaining the operation of the community extraction unit 103.
FIG. 4 is a flowchart showing the procedure of the Jacobi method, a typical algorithm for eigenvalue expansion.
FIG. 5 is a conceptual diagram for explaining the conversion to network data.
FIG. 6 is a conceptual diagram for explaining clusters and degrees of belonging.
FIG. 7 is a conceptual diagram for explaining clusters in the network data.
FIG. 8 is a diagram showing the results when the inter-company relationship is the transaction relationship.
FIG. 9 is a conceptual diagram showing the network and clusters when the inter-company relationship is the transaction relationship.
FIG. 10 is a diagram showing the results when credit scores are reflected in the transaction relationships.

Explanation of symbols

101 Input unit
102 Data conversion unit
103 Community extraction unit
104 Output unit
201 Inter-company relationship information DB
202 Network data
203 Aggregation degree DB

Claims

An evaluation apparatus comprising:
conversion means for converting a plurality of data items into network data in which each data item is a node and the degree of association between data items is an edge, and for inputting the converted network data;
classification means for calculating a degree of aggregation among the data items from the input network data, and classifying the plurality of data items into a plurality of clusters on the basis of the calculated degree of aggregation;
belonging degree calculation means for calculating, for each data item in each of the clusters classified by the classification means, a degree of belonging to the corresponding cluster; and
extraction means for extracting, for each cluster, a community made up of a plurality of highly associated data items on the basis of each data item's degree of belonging to the corresponding cluster.

The evaluation apparatus according to claim 1, wherein the extraction means updates the network data on the basis of each data item's degree of belonging to the corresponding cluster and inputs the updated network data to the classification means until the number of extracted communities reaches a preset number of communities to be extracted.

The evaluation apparatus according to claim 1, wherein the conversion means weights the degrees of association between the plurality of data items according to the purpose of the evaluation.

An evaluation method comprising:
a conversion step of converting a plurality of data items into network data in which each data item is a node and the degree of association between data items is an edge, and inputting the converted network data;
a classification step of calculating a degree of aggregation among the data items from the input network data, and classifying the plurality of data items into a plurality of clusters on the basis of the calculated degree of aggregation;
a belonging degree calculation step of calculating, for each data item in each of the classified clusters, a degree of belonging to the corresponding cluster; and
an extraction step of extracting, for each cluster, a community made up of a plurality of highly associated data items on the basis of each data item's degree of belonging to the corresponding cluster.

An evaluation program that causes a computer to execute:
a conversion step of converting a plurality of data items into network data in which each data item is a node and the degree of association between data items is an edge, and inputting the converted network data;
a classification step of calculating a degree of aggregation among the data items from the input network data, and classifying the plurality of data items into a plurality of clusters on the basis of the calculated degree of aggregation;
a belonging degree calculation step of calculating, for each data item in each of the classified clusters, a degree of belonging to the corresponding cluster; and
an extraction step of extracting, for each cluster, a community made up of a plurality of highly associated data items on the basis of each data item's degree of belonging to the corresponding cluster.