JP2012064136A

JP2012064136A - Test data generation method, test data generation device, and test data generation program

Info

Publication number: JP2012064136A
Application number: JP2010209739A
Authority: JP
Inventors: Shigeki Hino; 滋樹日野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-09-17
Filing date: 2010-09-17
Publication date: 2012-03-29

Abstract

PROBLEM TO BE SOLVED: To generate test data that simulates a biased link structure within a latent community. SOLUTION: A cluster generation unit 11 generates a plurality of clusters by randomly dividing a node group, consisting of a plurality of nodes each given identification information, into a number of clusters determined by the total number of nodes and the cluster size. A graph data generation unit 12 generates graph data for each of the clusters by setting link attribute information, which represents the link state between the nodes constituting each cluster, using a random number generated for each node pair and a threshold. A test data generation unit 13 controls the cluster generation processing and the graph data generation processing so that they are repeated as many times as the multiple-membership count, and integrates all the graph data generated by the graph data generation unit 12 based on the identification information, thereby generating test data that simulates the link states between the nodes constituting the node group.

Description

The present invention relates to a test data generation method, a test data generation device, and a test data generation program.

Conventionally, the validity of algorithms used for information processing that involves external input has been evaluated. When the external input can easily be obtained as actual data, that actual data can be used as data for testing the validity of the algorithm. For example, graph data for analyzing the link structure between users in an SNS (Social Networking Service) has been generated from collectable actual data.

On the other hand, when actual data cannot easily be collected, test data simulating the external input is required to determine the validity of the algorithm. For an algorithm whose analysis target is a communication network, collecting actual data for testing is difficult because the analysis target is large-scale and because the state of the data may change in real time while it is being collected sequentially. That is, test data must be generated in order to determine the validity of an algorithm whose analysis target is a communication network.

For this reason, ever since the advent of the telephone network, devices that generate test data simulating external input have been used in testing exchanges and similar equipment in order to verify telephone network control algorithms. When, as with telephones, a communication path corresponds to a physical line and the connection structure of the lines is self-evident from geographical conditions, test data can be generated relatively easily. That is, because the graph structure can easily be obtained, test data can be generated simply by, for example, determining the communication density with random numbers.

JP 2008-052495 A

In current Internet services, however, the concept of a connection also applies to links between Web pages, which are not connections between physical devices. Moreover, a direct transition is possible between any two Web pages as long as a link is written. Consequently, in Internet services it has been difficult to extract a self-evident connection structure based on geographical conditions or the like, as is possible in a telephone network.

In addition, Web pages are not linked to one another at random with equal probability; links are biased by latent affiliations and the like. Furthermore, for search engines and similar systems there is demand for algorithms that exploit the relationships arising from biased links within latent communities. In verifying the operation of a search algorithm or the like, the validity of the verification becomes uncertain unless the test data simulates the biased link structure within latent communities.

However, because the conventional technique described above presupposes that the connection structure of the lines can be acquired, it could not generate test data simulating a biased link structure within a latent community, such as links between Web pages.

The present invention has been made to solve the above problem of the prior art, and its object is to provide a test data generation method, a test data generation device, and a test data generation program capable of generating test data that simulates a biased link structure within a latent community.

To solve the above problem and achieve the object, the test data generation method disclosed in the present application is characterized by including: a cluster generation step in which a computer generates a plurality of clusters by randomly dividing a node group, composed of a plurality of nodes each given identification information, into a number of clusters determined by the total number of nodes in the node group and a predetermined cluster size; a graph data generation step of generating graph data indicating the link structure in each of the plurality of clusters by setting link attribute information, which indicates the link state between the nodes constituting each cluster generated in the cluster generation step, using a random number generated for each node pair and a predetermined threshold; and a test data generation step of controlling the cluster generation processing of the cluster generation step and the graph data generation processing of the graph data generation step so that they are repeated a predetermined number of times, and integrating all the graph data generated in the graph data generation step based on the identification information given to each node, thereby generating test data that simulates the link states between the nodes constituting the node group.

The test data generation device disclosed in the present application is characterized by including: cluster generation means for generating a plurality of clusters by randomly dividing a node group, composed of a plurality of nodes each given identification information, into a number of clusters determined by the total number of nodes in the node group and a predetermined cluster size; graph data generation means for generating graph data indicating the link structure in each of the plurality of clusters by setting link attribute information, which indicates the link state between the nodes constituting each cluster generated by the cluster generation means, using a random number generated for each node pair and a predetermined threshold; and test data generation means for controlling the cluster generation processing by the cluster generation means and the graph data generation processing by the graph data generation means so that they are repeated a predetermined number of times, and integrating all the graph data generated by the graph data generation means based on the identification information given to each node, thereby generating test data that simulates the link states between the nodes constituting the node group.

The test data generation program disclosed in the present application is characterized by causing a computer to execute the above test data generation method.

According to the test data generation method, test data generation device, and test data generation program disclosed in the present application, it is possible to generate test data that simulates a biased link structure within a latent community.

FIG. 1 is a diagram for explaining the configuration of the test data generation device according to the first embodiment.
FIG. 2 is a diagram for explaining an overview of the processing executed by the test data generation device according to the first embodiment.
FIG. 3 is a diagram for explaining the configuration of the test data generation device according to the second embodiment.
FIG. 4 is a diagram for explaining the setting list shown in FIG. 3.
FIG. 5 is a diagram for explaining the node list shown in FIG. 3.
FIG. 6 is a diagram for explaining the data format of the test data shown in FIG. 3.
FIG. 7 is a flowchart for explaining a first cluster generation method executed by the cluster generation unit according to the second embodiment.
FIG. 8 is a flowchart for explaining a second cluster generation method executed by the cluster generation unit according to the second embodiment.
FIG. 9 is a flowchart for explaining the processing of the graph data generation unit according to the second embodiment.
FIG. 10 is a flowchart for explaining the processing of the test data generation device according to the second embodiment.
FIG. 11 is a diagram for explaining conventional test data.
FIG. 12 is a diagram for explaining the test data generated in the second embodiment.
FIG. 13 is a diagram illustrating a computer that executes the test data generation program according to the first embodiment.

Embodiments of the test data generation method, test data generation device, and test data generation program disclosed in the present application are described in detail below. In the following, a test data generation device that executes the disclosed test data generation method is described as an embodiment. The present invention is not limited by the following embodiments.

The test data generation device according to the first embodiment is described with reference to FIGS. 1 and 2. FIG. 1 is a diagram for explaining the configuration of the test data generation device according to the first embodiment, and FIG. 2 is a diagram for explaining an overview of the processing executed by that device.

As shown in FIG. 1, the test data generation device 10 according to the first embodiment includes a cluster generation unit 11, a graph data generation unit 12, and a test data generation unit 13.

The test data generation device 10 according to the first embodiment is a device that, when the validity of an algorithm used for information processing involving external input is to be determined and the external input cannot easily be obtained as actual data, generates test data simulating that external input. For example, the test data generation device 10 generates test data for verifying the operation of a Web page search algorithm in an Internet service. Specifically, it generates test data that simulates a biased link structure within a latent community, such as links between Web pages.

More specifically, as shown in FIG. 1, the cluster generation unit 11, the graph data generation unit 12, and the test data generation unit 13 cooperate to generate test data when a setting list is received from the operator of the test data generation device 10. As shown in FIG. 2(A), the setting list specifies the total number of nodes (n), the cluster size (m), the multiple-membership count (t), and the threshold (s). The processing that the cluster generation unit 11, the graph data generation unit 12, and the test data generation unit 13 execute using the values in the setting list is described below.

The cluster generation unit 11 first sets up a node group composed of a plurality of nodes, each of which is given identification information.

For example, when "total number of nodes (n): 4" is set as shown in FIG. 2(A), the cluster generation unit 11 sets up a node group of four nodes and assigns the node numbers 1 to 4 to the four nodes as identification information. As a result, as shown in FIG. 2(A), the cluster generation unit 11 sets the node group N = {1, 2, 3, 4}, composed of four nodes with node numbers 1 to 4. The identification information may be any information that uniquely identifies each node; for example, it may be a name.

When the total number of nodes n is set, the test data generation unit 13 generates test data simulating the link states for the n × n ordered pairs of nodes. For example, the test data generation unit 13 generates the test data by setting the link state between nodes in each element of an n × n square matrix. The test data generation unit 13 initializes the test data to either the n × n zero matrix or the n × n identity matrix.

That is, when the operator wants test data that simulates the strength of the relationship between nodes, it is appropriate to treat each node as fully connected to itself, so the test data generation unit 13, at the operator's instruction, initializes the test data to the n × n identity matrix. When the operator wants test data that simulates the presence or absence of links between nodes, it is appropriate to treat a node as not being its own link destination, so the test data generation unit 13, at the operator's instruction, initializes the test data to the n × n zero matrix.

For example, at the instruction of an operator who wants test data simulating the strength of the relationship between nodes, the test data generation unit 13 initializes the test data Ea to the 4 × 4 identity matrix, as shown in FIG. 2(A).
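The two initialization choices described above can be sketched as follows. This is only an illustrative sketch: the function name `init_test_data` and the mode labels `"strength"` and `"presence"` are hypothetical and do not appear in the specification.

```python
def init_test_data(n, mode="strength"):
    """Return the initial n x n test data matrix as a list of lists.

    mode="strength": identity matrix (each node fully related to itself).
    mode="presence": zero matrix (a node is not its own link destination).
    The mode names are assumptions made for this sketch.
    """
    if mode not in ("strength", "presence"):
        raise ValueError("mode must be 'strength' or 'presence'")
    diagonal = 1.0 if mode == "strength" else 0.0
    return [[diagonal if i == j else 0.0 for j in range(n)] for i in range(n)]


# Example corresponding to FIG. 2(A): Ea initialized to the 4 x 4 identity matrix.
Ea = init_test_data(4, mode="strength")
```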

The cluster generation unit 11 then generates a plurality of clusters by randomly dividing the node group into a number of clusters determined by the total number of nodes (n) and the cluster size (m). For example, when m is a divisor of n, the cluster generation unit 11 repeats c = n / m times the operation of randomly selecting m nodes from the nodes of the node group that have not yet been selected. In this way, the cluster generation unit 11 divides the node group of n nodes into clusters M(1) to M(c), each composed of m nodes.

In the example shown in FIG. 2(A), the setting list specifies "total number of nodes (n): 4, cluster size (m): 2", so the cluster generation unit 11 randomly selects 2 nodes at a time from the 4 nodes of the node group, 2 = 4 / 2 times in succession. The cluster generation unit 11 thereby divides the node group N of 4 nodes into the clusters M(1) and M(2), each composed of 2 nodes.

In the example shown in FIG. 2(B), the cluster generation unit 11 divides the node group N into M(1) = {1, 3} and M(2) = {2, 4}. The processing of the cluster generation unit 11 when the total number of nodes is not divisible by the cluster size is described in detail later.
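The random partition for the case where m divides n can be sketched as below. Shuffling the node list and cutting it into consecutive chunks of m nodes is used here as an equivalent of drawing m unselected nodes c = n / m times; the function name `generate_clusters` and the optional `rng` parameter are assumptions of this sketch, not part of the specification.

```python
import random


def generate_clusters(node_ids, m, rng=None):
    """Randomly split node_ids into clusters of exactly m nodes each.

    Only the divisible case (len(node_ids) % m == 0) is handled here;
    the non-divisible case is treated separately in the description.
    """
    if rng is None:
        rng = random.Random()
    if m <= 0 or len(node_ids) % m != 0:
        raise ValueError("m must be a positive divisor of the number of nodes")
    pool = list(node_ids)
    rng.shuffle(pool)  # random order == repeated random draws without replacement
    return [sorted(pool[k:k + m]) for k in range(0, len(pool), m)]


# n = 4, m = 2 as in FIG. 2: two clusters of two nodes, e.g. {1, 3} and {2, 4}.
clusters = generate_clusters([1, 2, 3, 4], 2, rng=random.Random(0))
```

Which nodes land in which cluster depends on the random seed, so the concrete split in FIG. 2(B) is only one possible outcome.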

The graph data generation unit 12 shown in FIG. 1 then generates graph data indicating the link structure in each of the clusters by setting link attribute information, which indicates the link state between the nodes constituting each cluster generated by the cluster generation unit 11, using a random number generated for each node pair and the threshold (s).

Specifically, when setting the link attribute information e(i, j) from node number i to node number j, the graph data generation unit 12 first generates a random number r. Then, for example, when r is greater than the threshold s, the graph data generation unit 12 sets e(i, j) to link attribute information based on r, and when r is less than or equal to s, it sets e(i, j) to 0. The graph data generation unit 12 sets the link attribute information e(i, i) from node number i to itself to 0.

For example, in cluster M(1), as shown in FIG. 2(B), the graph data generation unit 12 generates a random number r for setting the link attribute information from node number 1 to node number 3, and sets e(1, 3) based on the magnitude relationship between r and s. Likewise, as shown in FIG. 2(B), it generates a random number r for setting the link attribute information from node number 3 to node number 1, and sets e(3, 1) based on the magnitude relationship between r and s. As a result, as shown in FIG. 2(B), the graph data generation unit 12 generates, as the graph data of cluster M(1), a 2 × 2 square matrix E1 in which link attribute information is set in each element.

Similarly, as shown in FIG. 2(B), in cluster M(2) the graph data generation unit 12 generates a random number r to set e(2, 4), the link attribute information from node number 2 to node number 4, and generates another random number r to set e(4, 2), the link attribute information from node number 4 to node number 2. As a result, as shown in FIG. 2(B), the graph data generation unit 12 generates, as the graph data of cluster M(2), a 2 × 2 square matrix E2 in which link attribute information is set in each element. The link attribute information may be set in any order.
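The thresholding rule for one cluster can be sketched as follows. Two points are assumptions of this sketch rather than statements of the specification: the random number r is drawn uniformly from [0, 1), and "link attribute information based on r" is taken to be r itself. The function name `generate_graph_data` is likewise hypothetical.

```python
import random


def generate_graph_data(cluster, s, rng=None):
    """Return {(i, j): e(i, j)} for every ordered node pair in one cluster.

    e(i, i) is always 0. For i != j, a fresh random number r is drawn;
    e(i, j) = r if r > s, otherwise 0 (assumption: r itself is the attribute).
    """
    if rng is None:
        rng = random.Random()
    edges = {}
    for i in cluster:
        for j in cluster:
            if i == j:
                edges[(i, j)] = 0.0
                continue
            r = rng.random()  # assumption: uniform on [0, 1)
            edges[(i, j)] = r if r > s else 0.0
    return edges


# Cluster M(1) = {1, 3} from FIG. 2(B), with an illustrative threshold s = 0.5.
E1 = generate_graph_data([1, 3], 0.5, rng=random.Random(1))
```

Because e(1, 3) and e(3, 1) use separate random numbers, the resulting graph data is directed: a link may exist in one direction only.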

The test data generation unit 13 shown in FIG. 1 then controls the cluster generation processing by the cluster generation unit 11 and the graph data generation processing by the graph data generation unit 12 so that they are repeated as many times as the multiple-membership count (t), and integrates all the graph data generated by the graph data generation unit 12 based on the identification information given to each node, thereby generating test data that simulates the link states between the nodes constituting the node group.

For example, as shown in FIG. 2(B), the test data generation unit 13 incorporates E1 and E2, which the graph data generation unit 12 generated in the first round of processing, into the initialized Ea according to the node numbers. The test data generation unit 13 then instructs the cluster generation unit 11 and the graph data generation unit 12 to perform the second round of processing.

In response, as shown in FIG. 2(C), in the second round the cluster generation unit 11 divides the node group N into two clusters, for example M(1) = {1, 4} and M(2) = {2, 3}. As shown in FIG. 2(C), in cluster M(1) the graph data generation unit 12 generates a random number r to set e(1, 4), the link attribute information from node number 1 to node number 4, and generates another random number r to set e(4, 1), the link attribute information from node number 4 to node number 1. The graph data generation unit 12 thereby generates E1 as the second-round graph data of cluster M(1), as shown in FIG. 2(C).

Similarly, as shown in FIG. 2(C), in cluster M(2) the graph data generation unit 12 generates a random number r to set e(2, 3), the link attribute information from node number 2 to node number 3, and generates another random number r to set e(3, 2), the link attribute information from node number 3 to node number 2. The graph data generation unit 12 thereby generates E2 as the second-round graph data of cluster M(2), as shown in FIG. 2(C).

The test data generation unit 13 then incorporates the E1 and E2 generated in the second round, according to the node numbers, into the Ea that already contains the E1 and E2 generated in the first round. As a result, as shown in FIG. 2(C), the test data generation unit 13 generates an Ea that reflects the results of both the first and second rounds of processing by the graph data generation unit 12.

As shown in FIG. 2(C), the test data generation unit 13 causes the cluster generation unit 11 and the graph data generation unit 12 to repeat their processing t times. The test data generation unit 13 thereby generates test data Ea in which all the graph data generated by the graph data generation unit 12 in rounds 1 through t is integrated. The integration of graph data by the test data generation unit 13 may be executed each time a round of processing by the graph data generation unit 12 finishes, as described above, or may be executed all at once after all t rounds have finished.
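The overall loop (partition, per-cluster link generation, integration by node number, repeated t times) can be sketched end to end as below. Several details are assumptions made only for this sketch: Ea starts as the identity matrix, r is uniform on [0, 1) and is used directly as the link attribute, diagonal entries of Ea are left untouched during integration, and when the same ordered pair receives values in more than one round the larger value is kept. The specification only states that graph data is incorporated according to node numbers, so the merge rule in particular is hypothetical, as is the function name `generate_test_data`.

```python
import random


def generate_test_data(n, m, t, s, seed=None):
    """Sketch of t rounds of cluster generation, graph data generation and
    integration into one n x n matrix Ea (nodes are numbered 1..n)."""
    if m <= 0 or n % m != 0:
        raise ValueError("this sketch only handles the case where m divides n")
    rng = random.Random(seed)
    nodes = list(range(1, n + 1))
    # Assumption: identity initialization (relationship-strength variant).
    ea = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        pool = nodes[:]
        rng.shuffle(pool)                      # random partition of the node group
        for k in range(0, n, m):
            cluster = pool[k:k + m]
            for i in cluster:
                for j in cluster:
                    if i == j:
                        continue               # keep the initialized diagonal
                    r = rng.random()
                    value = r if r > s else 0.0
                    # Integration by node number; keeping the maximum across
                    # rounds is an assumption of this sketch.
                    ea[i - 1][j - 1] = max(ea[i - 1][j - 1], value)
    return ea


# Settings of FIG. 2 (n = 4, m = 2) with illustrative values t = 2, s = 0.3.
Ea = generate_test_data(4, 2, 2, 0.3, seed=42)
```

This version integrates after every round; collecting all per-cluster graph data first and merging once at the end would give the same result under the same merge rule.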

As described above, according to the first embodiment, the cluster generation unit 11 generates a plurality of clusters by randomly dividing a node group, composed of a plurality of nodes each given identification information, into a number of clusters determined by the total number of nodes in the node group and the cluster size. The graph data generation unit 12 generates graph data indicating the link structure in each of the clusters by setting link attribute information, which indicates the link state between the nodes constituting each cluster generated by the cluster generation unit 11, using a random number generated for each node pair and a threshold. The test data generation unit 13 controls the cluster generation processing by the cluster generation unit 11 and the graph data generation processing by the graph data generation unit 12 so that they are repeated as many times as the multiple-membership count, and integrates all the graph data generated by the graph data generation unit 12 based on the identification information given to each node, thereby generating test data that simulates the link states between the nodes constituting the node group.

That is, in the first embodiment, the cluster generation processing and the graph data generation processing are repeated a plurality of times, and the test data is generated by integrating all of the generated graph data. A cluster generated by the cluster generation unit 11 can be regarded as a latent community, and the graph data generated by the graph data generation unit 12 can be regarded as data simulating the biased link structure within that community. The data generated in the first embodiment is therefore data simulating a communication network in which multiple communities, each with a biased link structure, are intertwined. Accordingly, the first embodiment makes it possible to generate test data that simulates a biased link structure within a latent community.

The second embodiment describes in detail the cluster generation processing, graph data generation processing, and test data generation processing described in the first embodiment.

First, the test data generation device 10 according to the second embodiment is described with reference to FIG. 3 and other figures. FIG. 3 is a diagram for explaining the configuration of the test data generation device according to the second embodiment.

As shown in FIG. 3, the test data generation device according to the second embodiment includes an input unit 20, an output unit 30, an input/output control I/F unit 40, a storage unit 50, and a processing unit 60.

The input unit 20 includes a mouse, a keyboard, and a reader for recording media such as an FD (Flexible Disk Drive), a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical Disc), or a DVD (Digital Versatile Disc), and receives various setting information from the operator of the test data generation device 10. Specifically, the input unit 20 accepts from the operator the registration of a setting list in which the total number of nodes (n), the cluster size (m), the multiple-membership count (t), and the threshold (s) are set.

出力部３０は、モニタやスピーカなどを有し、例えば、後述する処理部６０の処理結果などをモニタに表示する。 The output unit 30 includes a monitor, a speaker, and the like, and displays, for example, a processing result of the processing unit 60 described later on the monitor.

入出力制御Ｉ／Ｆ部４０は、入力部２０および出力部３０と、記憶部５０および処理部６０との間におけるデータ転送を制御する。 The input / output control I / F unit 40 controls data transfer between the input unit 20 and the output unit 30, and the storage unit 50 and the processing unit 60.

記憶部５０は、処理部６０による各種処理に用いるデータと、処理部６０による各種処理結果を記憶する。記憶部５０は、例えば、ＨＤＤ（Hard Disc Drive）やＲＡＭ（Random Access Memory）などである。具体的には、記憶部５０は、図３に示すように、設定リスト５１、ノードリスト５２、クラスタ別グラフデータ５３及びテストデータ５４を有する。 The storage unit 50 stores data used for various processes by the processing unit 60 and various processing results by the processing unit 60. The storage unit 50 is, for example, an HDD (Hard Disc Drive) or a RAM (Random Access Memory). Specifically, as shown in FIG. 3, the storage unit 50 includes a setting list 51, a node list 52, cluster-specific graph data 53, and test data 54.

設定リスト５１は、入力部２０が受け付けた設定リストを記憶する。図４は、図３に示す設定リストを説明するための図である。 The setting list 51 stores the setting list received by the input unit 20. FIG. 4 is a diagram for explaining the setting list shown in FIG. 3.

設定リスト５１は、例えば、図４の（Ａ）に示すように、「総ノード数：ｎ、多重帰属可能数：ｔ、クラスタ規模数：ｍ、閾値：ｓ」が設定された設定リストを記憶する。あるいは、設定リスト５１は、例えば、図４の（Ｂ）に示すように、「総ノード数：ｎ、多重帰属可能数：ｔ」と、ｔ個のクラスタ規模数が配列された「ｍ１，ｍ２，・・・・・・・・，ｍｔ」と、ｔ個の閾値が配列された「ｓ１，ｓ２，・・・・・・・・，ｓｔ」とが設定された設定リストを記憶する。 For example, as shown in FIG. 4A, the setting list 51 stores a setting list in which “total number of nodes: n, number of possible multiple assignments: t, cluster size number: m, threshold: s” is set. Alternatively, as shown in FIG. 4B, the setting list 51 stores a setting list in which “total number of nodes: n, number of possible multiple assignments: t”, an array of t cluster size numbers “m1, m2, ..., mt”, and an array of t thresholds “s1, s2, ..., st” are set.

すなわち、図４の（Ａ）に示す一例では、ｔ回繰り返されるクラスタ生成処理を全て同じクラスタ規模数にて実行し、かつ、ｔ回繰り返されるグラフデータ生成処理を全て同じ閾値にて実行するように設定されている。一方、図４の（Ｂ）に示す一例では、ｔ回繰り返されるクラスタ生成処理に用いられるクラスタ規模数が変化するように設定され、ｔ回繰り返されるグラフデータ生成処理に用いられる閾値が変化するように設定されている。なお、ｔ個のクラスタ規模数それぞれは、互いに異なる値であってもよいし、一部又は全部が同じ値である場合であってもよい。同様に、ｔ個の閾値それぞれは、互いに異なる値であってもよいし、一部又は全部が同じ値である場合であってもよい。 That is, in the example shown in FIG. 4A, the cluster generation process repeated t times is always executed with the same cluster size number, and the graph data generation process repeated t times is always executed with the same threshold. In the example shown in FIG. 4B, on the other hand, the cluster size number used in the cluster generation process and the threshold used in the graph data generation process are set so as to change over the t repetitions. The t cluster size numbers may all differ from one another, or some or all of them may be the same value. Likewise, the t thresholds may all differ from one another, or some or all of them may be the same value.
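
The two forms of the setting list can be illustrated with a minimal sketch. The class name `SettingList`, its field layout, and its helper methods are hypothetical and are not part of the embodiment; the sketch only assumes that a single value of m or s applies to every repetition, whereas an array supplies one value per repetition.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SettingList:
    """Hypothetical container for the setting list (illustrative names only)."""
    n: int                        # total number of nodes
    t: int                        # number of possible multiple assignments (repetitions)
    m: Union[int, List[int]]      # cluster size number: one value (FIG. 4A) or t values (FIG. 4B)
    s: Union[float, List[float]]  # threshold: one value (FIG. 4A) or t values (FIG. 4B)

    def cluster_size(self, k: int) -> int:
        """Cluster size number for the k-th repetition (0-based)."""
        return self.m[k] if isinstance(self.m, list) else self.m

    def threshold(self, k: int) -> float:
        """Threshold for the k-th repetition (0-based)."""
        return self.s[k] if isinstance(self.s, list) else self.s

    def validate(self) -> None:
        """Check that any array-valued setting has exactly t entries."""
        for name, value in (("m", self.m), ("s", self.s)):
            if isinstance(value, list) and len(value) != self.t:
                raise ValueError(f"{name} must have exactly t={self.t} entries")


form_a = SettingList(n=10, t=3, m=4, s=0.7)                      # same m and s every time
form_b = SettingList(n=10, t=3, m=[4, 3, 5], s=[0.7, 0.5, 0.9])  # m and s vary per repetition
```

A mixed list, for example an array of cluster size numbers with a single threshold, is handled by the same accessors.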

ノードリスト５２は、設定リスト５１が記憶する設定リストの総ノード数「ｎ」に基づいて、処理部６０のクラスタ生成部１１が設定したノード群を構成する複数のノードのリストを記憶する。ノードリスト５２の記憶部５０における領域規模は、総ノード数に応じて、例えば、クラスタ生成部１１により確保される。図５は、図３に示すノードリストを説明するための図である。 The node list 52 stores a list of the nodes constituting the node group that the cluster generation unit 11 of the processing unit 60 sets based on the total number of nodes “n” in the setting list stored in the setting list 51. The area size of the node list 52 in the storage unit 50 is secured by, for example, the cluster generation unit 11 according to the total number of nodes. FIG. 5 is a diagram for explaining the node list shown in FIG. 3.

ノードリスト５２は、例えば、図５の（Ａ）に示すように、「ｎ」個のノードそれぞれに、ノード番号「１，２，３、・・・・、ｎ−１、ｎ」が識別情報として付与された形式のノードリストを記憶する。あるいは、ノードリスト５２は、例えば、図５の（Ｂ）に示すように、名称「名称（１），名称（２），名称（３）、・・・・、名称（ｎ−１）、名称（ｎ）」が識別情報として付与された形式のノードリストを記憶する。なお、以下では、ノードリスト５２が、「ｎ」個のノードそれぞれに、ノード番号が識別情報として付与された形式のノードリストを記憶する場合について説明する。 For example, as shown in FIG. 5A, the node list 52 stores a node list in which node numbers “1, 2, 3, ..., n−1, n” are assigned as identification information to the “n” nodes. Alternatively, as shown in FIG. 5B, the node list 52 stores a node list in which names “name (1), name (2), name (3), ..., name (n−1), name (n)” are assigned as identification information. The following description covers the case where the node list 52 stores a node list in which a node number is assigned as identification information to each of the “n” nodes.

図３に戻って、クラスタ別グラフデータ５３は、処理部６０のクラスタ生成部１１が生成した各クラスタを構成するノードのリスト及び処理部６０のグラフデータ生成部１２が生成した各クラスタのリンクリストを記憶する。すなわち、クラスタ別グラフデータ５３は、クラスタ生成部１１が生成した各クラスタを構成するノードのリストを、クラスタ別にまとめたクラスタ別ノードリストを記憶する。また、クラスタ別グラフデータ５３は、クラスタ生成部１１が生成した各クラスタを構成するノード間のリンク属性情報のリストを、クラスタ別にまとめたクラスタ別リンクリストを記憶する。換言すれば、クラスタ別グラフデータ５３は、クラスタ別ノードリストとクラスタ別リンクリストとをクラスタごとにペアとして記憶することで、各クラスタのグラフデータを記憶する。 Returning to FIG. 3, the cluster-specific graph data 53 stores the list of nodes constituting each cluster generated by the cluster generation unit 11 of the processing unit 60 and the link list of each cluster generated by the graph data generation unit 12 of the processing unit 60. That is, the cluster-specific graph data 53 stores a cluster-specific node list, in which the lists of nodes constituting the clusters generated by the cluster generation unit 11 are organized by cluster. The cluster-specific graph data 53 also stores a cluster-specific link list, in which the lists of link attribute information between the nodes constituting each cluster generated by the cluster generation unit 11 are organized by cluster. In other words, the cluster-specific graph data 53 stores the graph data of each cluster by storing the cluster-specific node list and the cluster-specific link list as a pair for each cluster.

なお、クラスタ別ノードリストの記憶部５０における領域規模は、総ノード数、クラスタ規模数、多重帰属可能数及び後述するクラスタ生成法に応じて、例えば、クラスタ生成部１１及びグラフデータ生成部１２により確保される。すなわち、クラスタ別ノードリストの領域規模は、クラスタ生成部１１が多重帰属可能数分繰り返して生成した各クラスタを構成するノード数を加算した領域規模となる。また、クラスタ別リンクリストの領域規模は、クラスタ生成部１１が多重帰属可能数分繰り返して生成した各クラスタを構成するノード数を二乗した数を加算した領域規模となる。 The area size of the cluster-specific node list in the storage unit 50 is secured by, for example, the cluster generation unit 11 and the graph data generation unit 12 according to the total number of nodes, the cluster size number, the number of possible multiple assignments, and the cluster generation method described later. That is, the area size of the cluster-specific node list is the sum of the numbers of nodes in all the clusters that the cluster generation unit 11 generates over the repetitions, whose count equals the number of possible multiple assignments. The area size of the cluster-specific link list is the sum of the squares of the numbers of nodes in those clusters.

テストデータ５４は、処理部６０のテストデータ生成部１３が生成したテストデータを記憶する。総ノード数「ｎ」が設定された場合、ノード間のリンク状態を模擬したテストデータは、「ｎ×ｎ」通りのノード間のリンク状態を模擬したデータとなる。すなわち、テストデータ５４の記憶部５０における領域規模は、総ノード数の二乗個に相当する領域規模として、例えば、テストデータ生成部１３により確保される。図６は、図３に示すテストデータのデータ形式を説明するための図である。 The test data 54 stores the test data generated by the test data generation unit 13 of the processing unit 60. When the total number of nodes “n” is set, the test data simulating the link states between nodes covers “n × n” combinations of link source and link destination. Accordingly, the area size of the test data 54 in the storage unit 50 is secured by, for example, the test data generation unit 13 as an area corresponding to the square of the total number of nodes. FIG. 6 is a diagram for explaining the data format of the test data shown in FIG. 3.

例えば、テストデータ５４は、図６の（Ａ）に示すように、「ｎ×ｎ」の正方行列のデータ形式にてテストデータを記憶する。図６の（Ａ）に示す一例では、ｉ行目、ｊ列目の要素が、リンク元であるｉ番目のノードからリンク先であるｊ番目のノードへのリンク属性情報となる。 For example, as shown in FIG. 6A, the test data 54 stores test data in a data format of an “n × n” square matrix. In the example shown in FIG. 6A, the element in the i-th row and j-th column is the link attribute information from the i-th node as the link source to the j-th node as the link destination.

あるいは、テストデータ５４は、図６の（Ｂ）に示すように、「リンク元、リンク先、リンク属性情報」といったｎ個の配列情報のデータ形式にてテストデータを記憶する。なお、以下では、テストデータ５４が、図６の（Ａ）に示す「ｎ×ｎ」の正方行列のデータ形式にてテストデータを記憶する場合について説明する。 Alternatively, as shown in FIG. 6B, the test data 54 stores the test data in a data format consisting of n pieces of array information, each of the form “link source, link destination, link attribute information”. The following description covers the case where the test data 54 stores the test data in the “n × n” square matrix format shown in FIG. 6A.
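
The relationship between the two data formats can be sketched as follows. The function names are hypothetical, node numbers are taken to be 1-based as in FIG. 5A, and omitting entries whose attribute is zero (no link) from the array form is an assumption of this sketch rather than something the embodiment states.

```python
def matrix_to_triples(matrix):
    """Convert an n x n matrix into (link source, link destination, attribute) triples.

    Zero entries are skipped; this is an assumption of the sketch.
    """
    return [(i + 1, j + 1, value)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row)
            if value != 0]


def triples_to_matrix(n, triples):
    """Rebuild the n x n matrix; pairs without a triple are treated as zero."""
    matrix = [[0] * n for _ in range(n)]
    for src, dst, value in triples:
        matrix[src - 1][dst - 1] = value
    return matrix


example = [[0, 0.4, 0],
           [0, 0, 0],
           [0.9, 0, 0]]
triples = matrix_to_triples(example)  # one triple per non-zero element
```

The element in row i, column j maps to the triple whose link source is node i and whose link destination is node j, so the two forms carry the same information.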

図３に戻って、処理部６０は、設定リスト５１が記憶する設定リストに基づいて、テストデータを生成するために、クラスタ生成部１１、グラフデータ生成部１２及びテストデータ生成部１３を有する。 Returning to FIG. 3, the processing unit 60 includes a cluster generation unit 11, a graph data generation unit 12, and a test data generation unit 13 in order to generate test data based on the setting list stored in the setting list 51.

クラスタ生成部１１は、まず、識別情報がそれぞれ付与された複数のノードから構成されるノード群を設定し、設定したノード群をノードリストとしてノードリスト５２に格納する。例えば、クラスタ生成部１１は、図５の（Ａ）に例示したノードリストをノードリスト５２に格納する。 First, the cluster generation unit 11 sets a node group composed of a plurality of nodes, each assigned identification information, and stores the node group in the node list 52 as a node list. For example, the cluster generation unit 11 stores the node list illustrated in FIG. 5A in the node list 52.

そして、クラスタ生成部１１は、設定したノード群を、総ノード数（ｎ）及びクラスタ規模数（ｍ）で決定されるクラスタ数に無作為に分割することで、複数のクラスタを生成する。そして、クラスタ生成部１１は、生成した各クラスタを構成するノードのリストを、クラスタ別ノードリストとしてクラスタ別グラフデータ５３に格納する。 Then, the cluster generation unit 11 generates a plurality of clusters by randomly dividing the set node group into the number of clusters determined by the total node number (n) and the cluster size number (m). The cluster generation unit 11 stores the generated list of nodes constituting each cluster in the cluster-specific graph data 53 as a cluster-specific node list.

ここで、「ｍ」が「ｎ」の約数である場合、クラスタ生成部１１は、「ｃ＝ｎ／ｍ」回にわたり、ノード群を構成する「ｎ」個のノードから「ｍ」個のノードを順次無作為に選択する。これにより、クラスタ生成部１１は、「ｎ」個のノードからなるノード群を「ｍ」個のノードから構成されるクラスタ「Ｍ（１）〜Ｍ（ｃ）」に分割する。 Here, when “m” is a divisor of “n”, the cluster generation unit 11 randomly selects “m” nodes at a time from the “n” nodes constituting the node group, “c = n / m” times in succession. The cluster generation unit 11 thereby divides the node group of “n” nodes into clusters “M (1) to M (c)”, each composed of “m” nodes.

しかし、「ｍ」が「ｎ」の約数でない場合、クラスタ生成部１１は、「ｎ」を「ｍ」を除算した余りの数分のノードを処理する必要がある。そこで、実施例２に係るクラスタ生成部１１は、以下に説明する第１のクラスタ生成方法、又は、第２のクラスタ生成方法により、クラスタ生成処理を行なう。 However, if “m” is not a divisor of “n”, the cluster generation unit 11 needs to process as many nodes as the remainder obtained by dividing “n” by “m”. Therefore, the cluster generation unit 11 according to the second embodiment performs cluster generation processing using a first cluster generation method or a second cluster generation method described below.

まず、実施例２に係るクラスタ生成部１１は、第１のクラスタ生成方法及び第２のクラスタ生成方法双方において、複数のノードの中でクラスタとして未抽出のノード群から無作為にクラスタ規模数のノードを抽出することでクラスタを順次生成する。 First, in both the first and the second cluster generation methods, the cluster generation unit 11 according to the second embodiment generates clusters one after another by randomly extracting as many nodes as the cluster size number from the nodes that have not yet been extracted into a cluster.

ここで、「ｍ」が「ｎ」の約数である場合、実施例２に係るクラスタ生成部１１は、上記の処理のみで、クラスタ生成処理を実行することとなる。一方、「ｍ」が「ｎ」の約数でないと、上記の処理を「ノード群の総ノード数をクラスタ規模数で除算した値以下で最大となる整数回」繰り返すと、クラスタとして未抽出のノード群のノード数は、クラスタ規模「ｍ」より少なくなる。 Here, when “m” is a divisor of “n”, the cluster generation unit 11 according to the second embodiment completes the cluster generation process with the above processing alone. When “m” is not a divisor of “n”, on the other hand, repeating the above processing as many times as the largest integer not exceeding the total number of nodes divided by the cluster size number leaves fewer than “m” nodes that have not yet been extracted into a cluster.

そこで、実施例２に係るクラスタ生成部１１は、第１のクラスタ生成方法として、クラスタとして未抽出のノード群のノード数がクラスタ規模数より少なくなった場合、当該クラスタとして未抽出のノード群を生成済みの各クラスタに無作為に割り振る。これにより、実施例２に係るクラスタ生成部１１は、第１のクラスタ生成方法を実行することで、ノード群の総ノード数をクラスタ規模数で除算した値以下で最大となる整数個のクラスタを生成する。 Therefore, as the first cluster generation method, when the number of nodes not yet extracted into a cluster falls below the cluster size number, the cluster generation unit 11 according to the second embodiment randomly allocates those nodes to the clusters already generated. By executing the first cluster generation method, the cluster generation unit 11 thus generates as many clusters as the largest integer not exceeding the total number of nodes divided by the cluster size number.

あるいは、実施例２に係るクラスタ生成部１１は、第２のクラスタ生成方法として、クラスタとして未抽出のノード群のノード数がクラスタ規模数より少なくなった場合、当該クラスタとして未抽出のノード群を新たなクラスタとして生成する。これにより、実施例２に係るクラスタ生成部１１は、第２のクラスタ生成方法を実行することで、ノード群の総ノード数をクラスタ規模数で除算した値以上で最小となる整数個のクラスタを生成する。 Alternatively, as the second cluster generation method, when the number of nodes not yet extracted into a cluster falls below the cluster size number, the cluster generation unit 11 according to the second embodiment makes those nodes a new cluster. By executing the second cluster generation method, the cluster generation unit 11 thus generates as many clusters as the smallest integer not less than the total number of nodes divided by the cluster size number.

以下、第１のクラスタ生成方法及び第２のクラスタ生成方法それぞれについて、図７及び図８のフローチャートを用いて詳細に説明する。図７は、実施例２に係るクラスタ生成部が実行する第１のクラスタ生成方法を説明するためのフローチャートであり、図８は、実施例２に係るクラスタ生成部が実行する第２のクラスタ生成方法を説明するためのフローチャートである。 Hereinafter, the first cluster generation method and the second cluster generation method will each be described in detail with reference to the flowcharts of FIGS. 7 and 8. FIG. 7 is a flowchart for explaining the first cluster generation method executed by the cluster generation unit according to the second embodiment, and FIG. 8 is a flowchart for explaining the second cluster generation method executed by the cluster generation unit according to the second embodiment.

第１のクラスタ生成方法を実行する場合、図７に示すように、クラスタ生成部１１は、設定リストを参照して、総ノード数「ｎ」のノード群「Ｎ」を設定し、クラスタ規模数「ｍ」を取得する（ステップＳＡ１）。ここで、クラスタ生成部１１は、内部処理用データとして、ノード群Ｎ（０）〜Ｎ（Ｃ）を準備する。第１のクラスタ生成方法を実行する場合、「Ｃ」は、総ノード数「ｎ」をクラスタ規模数「ｍ」で除算した値以下で最大となる整数から「１」を差し引いた値となる。 When executing the first cluster generation method, as shown in FIG. 7, the cluster generation unit 11 refers to the setting list, sets the node group “N” of the total number of nodes “n”, and sets the number of cluster sizes. “M” is acquired (step SA1). Here, the cluster generation unit 11 prepares node groups N (0) to N (C) as internal processing data. When the first cluster generation method is executed, “C” is a value obtained by subtracting “1” from an integer that is equal to or less than a value obtained by dividing the total node number “n” by the cluster size number “m”.

そして、クラスタ生成部１１は、「Ｎ（０）＝Ｎ」とし、クラスタ管理用変数「ｉ」を「０」とし（ステップＳＡ２）、Ｎ（ｉ）からランダムにｍ個の要素を選択してＭ（ｉ＋１）の要素とする（ステップＳＡ３）。 Then, the cluster generation unit 11 sets “N (0) = N”, sets the cluster management variable “i” to “0” (step SA2), and randomly selects m elements from N (i). The element is M (i + 1) (step SA3).

その後、クラスタ生成部１１は、「ｉ」をインクリメントして、「ｉ＝ｉ＋１」とし（ステップＳＡ４）、「Ｎ（ｉ）＝Ｎ（ｉ−１）−Ｍ（ｉ）」とする（ステップＳＡ５）。 Thereafter, the cluster generation unit 11 increments “i” to “i = i + 1” (step SA4) and “N (i) = N (i−1) −M (i)” (step SA5). ).

そして、クラスタ生成部１１は、Ｎ（ｉ）の要素数がｍ以上か否かを判定する（ステップＳＡ６）。ここで、Ｎ（ｉ）の要素数がｍ以上である場合（ステップＳＡ６肯定）、クラスタ生成部１１は、ステップＳＡ３に戻って、Ｎ（ｉ）からクラスタＭ（ｉ＋１）を構成する要素の選択処理を行なう。 Then, the cluster generation unit 11 determines whether the number of elements of N (i) is greater than or equal to m (step SA6). Here, when the number of elements of N (i) is greater than or equal to m (Yes at step SA6), the cluster generation unit 11 returns to step SA3 and selects elements that form the cluster M (i + 1) from N (i). Perform processing.

一方、Ｎ（ｉ）の要素数がｍより小さい値である場合（ステップＳＡ６否定）、クラスタ生成部１１は、剰余ノード分配管理用変数（Ｊ）を「Ｊ＝０」とし（ステップＳＡ７）、Ｎ（ｉ）からランダムに１個の要素を選択してＭ（Ｊ＋１）に追加し、追加した要素をＮ（ｉ）から取り除く（ステップＳＡ８）。 On the other hand, when the number of elements of N (i) is smaller than m (No at Step SA6), the cluster generation unit 11 sets the residual node distribution management variable (J) to “J = 0” (Step SA7). One element is selected at random from N (i) and added to M (J + 1), and the added element is removed from N (i) (step SA8).

そして、クラスタ生成部１１は、「Ｊ＝（Ｊ＋１）ｍｏｄ（ｎ／ｍを超えない最大整数）」とし（ステップＳＡ９）、Ｎ（ｉ）の要素数が「０」となったか否かを判定する（ステップＳＡ１０）。ここで、Ｎ（ｉ）の要素数が「０」でない場合（ステップＳＡ１０否定）、クラスタ生成部１１は、ステップＳＡ８に戻って、剰余ノードの分配処理を行なう。 Then, the cluster generation unit 11 sets “J = (J + 1) mod (maximum integer not exceeding n / m)” (step SA9), and determines whether the number of elements of N (i) is “0”. (Step SA10). Here, when the number of elements of N (i) is not “0” (No at Step SA10), the cluster generation unit 11 returns to Step SA8 and performs distribution processing of the surplus nodes.

一方、Ｎ（ｉ）の要素数が「０」となった場合（ステップＳＡ１０肯定）、クラスタ生成部１１は、処理を終了する。 On the other hand, when the number of elements of N (i) becomes “0” (Yes at step SA10), the cluster generation unit 11 ends the process.
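
Steps SA1 to SA10 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name and the `rng` parameter are hypothetical, and the sketch tests whether any nodes remain before each distribution step, so that the case where m divides n exactly (no surplus nodes) needs no special handling.

```python
import random


def generate_clusters_method1(nodes, m, rng=random):
    """Split nodes into floor(n / m) clusters; surplus nodes are dealt out in turn."""
    if m < 1:
        raise ValueError("m must be at least 1")
    remaining = list(nodes)             # N(i): nodes not yet extracted into a cluster
    clusters = []                       # M(1), M(2), ...
    while len(remaining) >= m:          # steps SA3 to SA6
        chosen = rng.sample(remaining, m)
        clusters.append(chosen)
        chosen_set = set(chosen)
        remaining = [v for v in remaining if v not in chosen_set]
    if not clusters:
        raise ValueError("m must not exceed the number of nodes")
    j = 0                               # step SA7
    while remaining:                    # steps SA8 to SA10
        node = remaining.pop(rng.randrange(len(remaining)))
        clusters[j].append(node)
        j = (j + 1) % len(clusters)     # step SA9
    return clusters
```

For 10 nodes and m = 4, for example, two clusters of four nodes are extracted and the two surplus nodes are added one to each, giving two clusters of five nodes.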

また、第２のクラスタ生成方法を実行する場合、図８に示すように、クラスタ生成部１１は、設定リストを参照して、総ノード数「ｎ」のノード群「Ｎ」を設定し、クラスタ規模数「ｍ」を取得する（ステップＳＢ１）。ここで、クラスタ生成部１１は、内部処理用データとして、ノード群Ｎ（０）〜Ｎ（Ｃ）を準備する。第２のクラスタ生成方法を実行する場合、「Ｃ」は、総ノード数「ｎ」をクラスタ規模数「ｍ」で除算した値以下で最大となる整数となる。 When executing the second cluster generation method, as illustrated in FIG. 8, the cluster generation unit 11 refers to the setting list, sets the node group “N” of the total number of nodes “n”, and The scale number “m” is acquired (step SB1). Here, the cluster generation unit 11 prepares node groups N (0) to N (C) as internal processing data. When the second cluster generation method is executed, “C” is an integer that becomes the maximum value not more than a value obtained by dividing the total node number “n” by the cluster size number “m”.

そして、クラスタ生成部１１は、「Ｎ（０）＝Ｎ」とし、クラスタ管理用変数「ｉ」を「０」とし（ステップＳＢ２）、Ｎ（ｉ）からランダムにｍ個の要素を選択してＭ（ｉ＋１）の要素とする（ステップＳＢ３）。 Then, the cluster generation unit 11 sets “N (0) = N”, sets the cluster management variable “i” to “0” (step SB2), and selects m elements at random from N (i). The element is M (i + 1) (step SB3).

その後、クラスタ生成部１１は、「ｉ」をインクリメントして、「ｉ＝ｉ＋１」とし（ステップＳＢ４）、「Ｎ（ｉ）＝Ｎ（ｉ−１）−Ｍ（ｉ）」とする（ステップＳＢ５）。 Thereafter, the cluster generation unit 11 increments “i” to “i = i + 1” (step SB4) and sets “N (i) = N (i−1) − M (i)” (step SB5).

そして、クラスタ生成部１１は、Ｎ（ｉ）の要素数がｍ以上か否かを判定する（ステップＳＢ６）。ここで、Ｎ（ｉ）の要素数がｍ以上である場合（ステップＳＢ６肯定）、クラスタ生成部１１は、ステップＳＢ３に戻って、Ｎ（ｉ）からクラスタＭ（ｉ＋１）を構成する要素の選択処理を行なう。 Then, the cluster generation unit 11 determines whether the number of elements of N (i) is greater than or equal to m (step SB6). Here, when the number of elements of N (i) is greater than or equal to m (Yes in step SB6), the cluster generation unit 11 returns to step SB3 and selects elements that constitute the cluster M (i + 1) from N (i). Perform processing.

一方、Ｎ（ｉ）の要素数がｍより小さい値である場合（ステップＳＢ６否定）、クラスタ生成部１１は、「Ｍ（ｉ＋１）＝Ｎ（ｉ）」とし（ステップＳＢ７）、処理を終了する。 On the other hand, when the number of elements of N (i) is smaller than m (No at Step SB6), the cluster generation unit 11 sets “M (i + 1) = N (i)” (Step SB7), and ends the process. .
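
Steps SB1 to SB7 can be sketched in the same way. The function name and the `rng` parameter are hypothetical, and skipping an empty final cluster when m divides n exactly is an assumption of this sketch.

```python
import random


def generate_clusters_method2(nodes, m, rng=random):
    """Split nodes into clusters of m; surplus nodes form one extra, smaller cluster."""
    if m < 1:
        raise ValueError("m must be at least 1")
    remaining = list(nodes)             # N(i): nodes not yet extracted into a cluster
    clusters = []                       # M(1), M(2), ...
    while len(remaining) >= m:          # steps SB3 to SB6
        chosen = rng.sample(remaining, m)
        clusters.append(chosen)
        chosen_set = set(chosen)
        remaining = [v for v in remaining if v not in chosen_set]
    if remaining:                       # step SB7 (empty remainder skipped: assumption)
        clusters.append(remaining)
    return clusters
```

For 10 nodes and m = 4, this yields clusters of four, four, and two nodes, that is, three clusters, the smallest integer not less than 10 / 4.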

すなわち、第１のクラスタ生成方法は、「ｎ」を「ｍ」を除算した余りの数分のノードを、生成済みのクラスタ（要素数：ｍ）に分配する方法である。一方、第２のクラスタ生成方法は、「ｎ」を「ｍ」を除算した余りの数分のノードを、そのまま新たなクラスタとするものである。第１のクラスタ生成方法及び第２のクラスタ生成方法のどちらを用いるかは、各クラスタを構成するノード数をなるべく均等にすることを優先するか、各クラスタを構成するノード数をなるべく「ｍ」から変えないことを優先するかに依存する。操作者は、テストデータを使用するテスト対象アルゴリズムの目的に応じて、第１のクラスタ生成方法又は第２のクラスタ生成方法のいずれかを選択する。例えば、クラスタ生成部１１が第１のクラスタ生成方法を実行するか、第２のクラスタ生成方法を実行するかは、設定リストの登録時や、クラスタ生成部１１の処理開始前などに操作者により指定される。 That is, the first cluster generation method distributes the nodes left over as the remainder of dividing “n” by “m” among the clusters already generated (number of elements: m). The second cluster generation method, on the other hand, makes those leftover nodes a new cluster as they are. Which of the two methods to use depends on whether priority is given to making the numbers of nodes in the clusters as equal as possible, or to keeping the number of nodes in each cluster as close to “m” as possible. The operator selects either the first or the second cluster generation method according to the purpose of the algorithm under test that uses the test data. For example, whether the cluster generation unit 11 executes the first or the second cluster generation method is specified by the operator when the setting list is registered, before the cluster generation unit 11 starts processing, or at a similar time.

図３に戻って、グラフデータ生成部１２は、クラスタ生成部１１により生成された各クラスタを構成するノード間のリンク状態を示すリンク属性情報を、当該ノード間に対して生成された乱数及び閾値（ｓ）を用いて設定することで、複数のクラスタそれぞれにおけるリンク構造を示すグラフデータを生成する。そして、グラフデータ生成部１２は、各クラスタから生成したグラフデータにて設定されたノード間のリンク属性情報のリストをクラスタ別リンクリストとして、生成対象であるクラスタのクラスタ別ノードリストに対応付けてクラスタ別グラフデータ５３に格納する。 Returning to FIG. 3, the graph data generation unit 12 generates graph data indicating the link structure in each of the plurality of clusters by setting link attribute information, which indicates the link state between the nodes constituting each cluster generated by the cluster generation unit 11, using a random number generated for each pair of nodes and the threshold (s). The graph data generation unit 12 then stores the list of link attribute information between nodes set in the graph data generated for each cluster in the cluster-specific graph data 53 as a cluster-specific link list, in association with the cluster-specific node list of the cluster concerned.

ここで、グラフデータ生成部１２は、ノード間のリンク状態を示すリンク属性情報として、ノード間のリンクの有無、又は、ノード間の関係強度を示す重み付けを設定する。リンク属性情報の種別は、操作者の指示により決定される。以下では、リンク属性情報として、グラフデータ生成部１２がノード間の関係強度を示す重み付けを設定する場合について説明する。 Here, as the link attribute information indicating the link state between nodes, the graph data generation unit 12 sets either the presence or absence of a link between the nodes or a weight indicating the strength of the relationship between the nodes. The type of link attribute information is determined by an instruction from the operator. The following description covers the case where the graph data generation unit 12 sets, as the link attribute information, a weight indicating the strength of the relationship between nodes.

具体的には、グラフデータ生成部１２は、ノード番号「ｉ」からノード番号「ｊ」への重み付け「ｅ（ｉ，ｊ）」を設定する際に、まず、乱数「ｒ」を生成する。例えば、グラフデータ生成部１２は、「０〜１」の範囲にある実数を乱数として生成する。グラフデータ生成部１２は、乱数「ｒ」が閾値「ｓ」以下の値である場合、「ｅ（ｉ，ｊ）」を「０」に設定する。また、グラフデータ生成部１２は、「ｉ＝ｊ」の場合、重み付け「ｅ（ｉ，ｊ）」を「０」に設定する。 Specifically, when setting the weighting “e (i, j)” from the node number “i” to the node number “j”, the graph data generation unit 12 first generates a random number “r”. For example, the graph data generation unit 12 generates a real number in the range of “0 to 1” as a random number. The graph data generation unit 12 sets “e (i, j)” to “0” when the random number “r” is equal to or less than the threshold value “s”. Further, the graph data generation unit 12 sets the weight “e (i, j)” to “0” when “i = j”.

一方、グラフデータ生成部１２は、「ｉ＝ｊ」でなく、乱数「ｒ」が閾値「ｓ」より大きい値である場合、「ｅ（ｉ，ｊ）」を「ｒ」に基づいて設定する。具体的には、グラフデータ生成部１２は、以下で説明する３つの設定方法のいずれかの方法により、「ｅ（ｉ，ｊ）」を設定する。 On the other hand, when “i = j” is not satisfied and the random number “r” is larger than the threshold “s”, the graph data generation unit 12 sets “e (i, j)” based on “r”. . Specifically, the graph data generation unit 12 sets “e (i, j)” by any one of the three setting methods described below.

第１の設定方法は、「ｒ＞ｓ」ならば、乱数「ｒ」をそのまま「ｅ（ｉ，ｊ）」と設定する方法である。また、第２の設定方法は、「ｒ＞ｓ」ならば「ｒ−ｓ」を「ｅ（ｉ，ｊ）」として設定する方法である。また、第３の設定方法は、「ｒ＞ｓ」ならば「（ｒ−ｓ）／（１−ｓ）」を「ｅ（ｉ，ｊ）」として設定する方法である。なお、グラフデータ生成部１２は、操作者が第１〜第３の設定方法から指定した設定方法により、ノード間の重み付けを設定する。例えば、グラフデータ生成部１２が第１の設定方法、第２の設定方法、又は、第３の設定方法のいずれかを実行するかは、設定リストの登録時や、グラフデータ生成部１２の処理開始前などに操作者により指定される。 The first setting method sets the random number “r” as it is as “e (i, j)” if “r > s”. The second setting method sets “r − s” as “e (i, j)” if “r > s”. The third setting method sets “(r − s) / (1 − s)” as “e (i, j)” if “r > s”. The graph data generation unit 12 sets the weights between nodes by the setting method that the operator designates from among the first to third setting methods. For example, which of the first, second, and third setting methods the graph data generation unit 12 executes is specified by the operator when the setting list is registered, before the graph data generation unit 12 starts processing, or at a similar time.
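
The three setting methods differ only in how a random number above the threshold is mapped to a weight. A minimal sketch, in which the function name `link_weight` and the integer `method` selector are hypothetical:

```python
def link_weight(r, s, method):
    """Map a random number r in the range 0 to 1 and a threshold s to e(i, j)."""
    if r <= s:
        return 0.0                  # at or below the threshold: weight 0
    if method == 1:
        return r                    # first method: r itself
    if method == 2:
        return r - s                # second method: excess over the threshold
    if method == 3:
        return (r - s) / (1 - s)    # third method: excess rescaled by (1 - s)
    raise ValueError("method must be 1, 2, or 3")
```

With r = 0.75 and s = 0.5, the three methods give 0.75, 0.25, and 0.5 respectively. Non-zero weights therefore lie above s for the first method, at most 1 − s for the second, and at most 1 for the third; because r > s together with r ≤ 1 implies s < 1, the division in the third method is never by zero.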

なお、グラフデータ生成部１２は、リンク属性情報としてノード間のリンクの有無を設定する場合、例えば、乱数「ｒ」が閾値「ｓ」より大きい値であるならば、「ｅ（ｉ，ｊ）」を「リンク有」に設定し、乱数「ｒ」が閾値「ｓ」以下の値であるならば、「ｅ（ｉ，ｊ）」を「リンク無」に設定する。なお、グラフデータ生成部１２は、「ｉ＝ｊ」の場合、「ｅ（ｉ，ｊ）」を「リンク無」に設定する。 When the graph data generation unit 12 sets the presence or absence of a link between nodes as the link attribute information, it sets “e (i, j)” to “link present” if, for example, the random number “r” is larger than the threshold “s”, and sets “e (i, j)” to “no link” if the random number “r” is equal to or less than the threshold “s”. When “i = j”, the graph data generation unit 12 sets “e (i, j)” to “no link”.

以下、グラフデータ生成部１２の処理について、図９を用いて説明する。図９は、実施例２に係るグラフデータ生成部の処理を説明するためのフローチャートである。なお、図９は、グラフデータ生成部１２がリンク属性情報としてノード間の関係強度を示す重み付けを第２の設定方法により設定する場合について説明する。 The processing of the graph data generation unit 12 is described below with reference to FIG. 9. FIG. 9 is a flowchart for explaining the processing of the graph data generation unit according to the second embodiment. Note that FIG. 9 illustrates the case where the graph data generation unit 12 sets, as the link attribute information, a weight indicating the strength of the relationship between nodes by the second setting method.

図９に示すように、実施例２に係るグラフデータ生成部１２は、クラスタ別グラフデータ５３のクラスタ別ノードリスト及び設定リスト５１を参照して、要素数「ｙ」のクラスタ及び閾値「ｓ」を取得する（ステップＳＣ１）。 As illustrated in FIG. 9, the graph data generation unit 12 according to the second embodiment refers to the cluster-specific node list and setting list 51 of the cluster-specific graph data 53, and includes the cluster having the number of elements “y” and the threshold value “s”. Is acquired (step SC1).

そして、グラフデータ生成部１２は、リンクの発着ノード番号であるリンク元ノード番号管理用変数（ｉ）とリンク先ノード番号管理用変数（ｊ）を「ｉ＝１，ｊ＝１」とし、クラスタ別グラフデータを格納する行列「Ｅ＝［ｙ×ｙ］」を零行列に初期化する（ステップＳＣ２）。 Then, the graph data generation unit 12 sets the link source node number management variable (i) and the link destination node number management variable (j), which are the link arrival and departure node numbers, to “i = 1, j = 1”, and the cluster A matrix “E = [y × y]” for storing another graph data is initialized to a zero matrix (step SC2).

その後、グラフデータ生成部１２は、「ｉ＝ｊ」であるか否かを判定する（ステップＳＣ３）。ここで、「ｉ＝ｊ」である場合（ステップＳＣ３肯定）、グラフデータ生成部１２は、ノード番号「ｉ」からノード番号「ｊ」への重み付け「ｅ（ｉ，ｊ）」を「０」と設定する（ステップＳＣ７）。 Thereafter, the graph data generation unit 12 determines whether or not “i = j” (step SC3). Here, if “i = j” (Yes at step SC3), the graph data generation unit 12 sets the weight “e (i, j)” from the node number “i” to the node number “j” to “0”. Is set (step SC7).

「ｉ＝ｊ」でない場合（ステップＳＣ３否定）、グラフデータ生成部１２は、「０〜１」の範囲内で乱数（ｒ）を発生する（ステップＳＣ４）。そして、グラフデータ生成部１２は、「ｒ＞ｓ」であるか否かを判定する（ステップＳＣ５）。ここで、「ｒ」が「ｓ」以下である場合（ステップＳＣ５否定）、グラフデータ生成部１２は、「ｅ（ｉ，ｊ）」を「０」と設定する（ステップＳＣ７）。 When it is not “i = j” (No at Step SC3), the graph data generation unit 12 generates a random number (r) within the range of “0 to 1” (Step SC4). Then, the graph data generating unit 12 determines whether or not “r> s” (step SC5). Here, when “r” is equal to or less than “s” (No in step SC5), the graph data generating unit 12 sets “e (i, j)” to “0” (step SC7).

一方、「ｒ＞ｓ」である場合（ステップＳＣ５肯定）、グラフデータ生成部１２は、「ｅ（ｉ，ｊ）」を「ｒ−ｓ」と設定する（ステップＳＣ６）。なお、第１の設定方法を行なう場合、グラフデータ生成部１２は、「ｅ（ｉ，ｊ）」を「ｒ」と設定する。また、第３の設定方法を行なう場合、グラフデータ生成部１２は、「ｅ（ｉ，ｊ）」を「（ｒ−ｓ）／（１−ｓ）」と設定する。 On the other hand, if “r > s” (Yes at step SC5), the graph data generation unit 12 sets “e (i, j)” to “r − s” (step SC6). When the first setting method is used, the graph data generation unit 12 sets “e (i, j)” to “r”. When the third setting method is used, the graph data generation unit 12 sets “e (i, j)” to “(r − s) / (1 − s)”.

ステップＳＣ６の処理、又は、ステップＳＣ７の処理を行なった後、グラフデータ生成部１２は、「ｊ」をインクリメントして「ｊ＝ｊ＋１」とし（ステップＳＣ８）、「ｊ＞ｙ」であるか否かを判定する（ステップＳＣ９）。ここで、「ｊ」が「ｙ」以下である場合（ステップＳＣ９否定）、グラフデータ生成部１２は、ステップＳＣ３に戻って「ｉ＝ｊ」であるか否かの判定処理を行なう。 After performing the process of step SC6 or the process of step SC7, the graph data generating unit 12 increments “j” to “j = j + 1” (step SC8), and whether “j> y” is satisfied. Is determined (step SC9). Here, when “j” is equal to or less than “y” (No at Step SC9), the graph data generating unit 12 returns to Step SC3 and determines whether or not “i = j”.

一方、「ｊ＞ｙ」である場合（ステップＳＣ９肯定）、グラフデータ生成部１２は、「ｉ」をインクリメントして「ｉ＝ｉ＋１」とし、「ｊ」をリセットして「ｊ＝１」とし、（ステップＳＣ１０）、「ｉ＞ｙ」であるか否かを判定する（ステップＳＣ１１）。ここで、「ｉ」が「ｙ」以下である場合（ステップＳＣ１１否定）、グラフデータ生成部１２は、ステップＳＣ３に戻って「ｉ＝ｊ」であるか否かの判定処理を行なう。 On the other hand, if “j > y” (Yes at step SC9), the graph data generation unit 12 increments “i” to “i = i + 1”, resets “j” to “j = 1” (step SC10), and determines whether or not “i > y” (step SC11). If “i” is equal to or less than “y” (No at step SC11), the graph data generation unit 12 returns to step SC3 and determines whether or not “i = j”.

一方、「ｉ＞ｙ」である場合（ステップＳＣ１１肯定）、グラフデータ生成部１２は、処理を終了する。 On the other hand, if “i> y” (Yes at step SC11), the graph data generating unit 12 ends the process.

なお、図９では、リンク先ノード番号管理用変数（ｊ）をインクリメントした後に、リンク元ノード番号管理用変数（ｉ）をインクリメントする場合について説明した。しかし、グラフデータ生成処理は、リンク元ノード番号管理用変数（ｉ）をインクリメントした後に、リンク先ノード番号管理用変数（ｊ）をインクリメントする場合であってもよい。 In FIG. 9, the case where the link source node number management variable (i) is incremented after the link destination node number management variable (j) is incremented has been described. However, the graph data generation process may be a case where the link destination node number management variable (j) is incremented after the link source node number management variable (i) is incremented.
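
The flow of FIG. 9 (steps SC1 to SC11) amounts to a double loop over link source and link destination. The sketch below uses the second setting method; the function name and the `rng` parameter are hypothetical, indices are 0-based positions within the cluster rather than node numbers, and `rng.random()` draws from the half-open interval from 0 up to but not including 1, which is one possible reading of "a random number in the range 0 to 1".

```python
import random


def generate_graph_data(y, s, rng=random):
    """Return a y x y weight matrix E for one cluster (second setting method)."""
    E = [[0.0] * y for _ in range(y)]   # step SC2: initialise to the zero matrix
    for i in range(y):                  # link source
        for j in range(y):              # link destination
            if i == j:
                continue                # steps SC3 and SC7: no self-link
            r = rng.random()            # step SC4
            if r > s:                   # step SC5
                E[i][j] = r - s         # step SC6
    return E
```

Because a separate random number is drawn for (i, j) and for (j, i), the resulting matrix is generally asymmetric, that is, the links are directed. Swapping the two loops changes only the order in which the random numbers are consumed.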

図３に戻って、テストデータ生成部１３は、クラスタ生成部１１によるクラスタ生成処理及びグラフデータ生成部１２によるグラフデータ生成処理を多重帰属可能数（ｔ）繰り返すように制御する。そして、テストデータ生成部１３は、グラフデータ生成部１２により生成された全てのグラフデータを、各ノードに付与された識別情報に基づいて統合することで、ノード群を構成する各ノード間のリンク状態を模擬したテストデータを生成する。そして、テストデータ生成部１３は、生成したテストデータをテストデータ５４に格納する。 Returning to FIG. 3, the test data generation unit 13 performs control so that the cluster generation process by the cluster generation unit 11 and the graph data generation process by the graph data generation unit 12 are repeated as many times as the number of possible multiple assignments (t). The test data generation unit 13 then integrates all the graph data generated by the graph data generation unit 12 based on the identification information assigned to each node, thereby generating test data that simulates the link states between the nodes constituting the node group. The test data generation unit 13 stores the generated test data in the test data 54.

具体的には、テストデータ生成部１３は、リンク属性情報がノード間の関係強度を示す重み付けである場合、グラフデータ生成部１２が生成した全てのグラフデータの算術和をとることでテストデータを生成する。例えば、テストデータ生成部１３は、「ｎ×ｎ」の単位行列に初期設定したテストデータ（Ｅａ）に対して、グラフデータ生成部１２が生成した全てのグラフデータ（ノード間の関係強度）の算術和をとることでテストデータを生成する。 Specifically, when the link attribute information is a weight indicating the relationship strength between the nodes, the test data generation unit 13 obtains the test data by calculating the arithmetic sum of all the graph data generated by the graph data generation unit 12. Generate. For example, the test data generation unit 13 generates all the graph data (relationship strength between nodes) generated by the graph data generation unit 12 with respect to the test data (Ea) that is initially set in the “n × n” unit matrix. Test data is generated by calculating the arithmetic sum.

なお、テストデータ生成部１３は、リンク属性情報がノード間のリンクの有無である場合、グラフデータ生成部１２が生成した全てのグラフデータの論理和をとることでテストデータを生成する。例えば、テストデータ生成部１３は、「ｎ×ｎ」の零行列に初期設定したテストデータ（Ｅａ）に対して、グラフデータ生成部１２が生成した全てのグラフデータ（リンクの有無）の論理和をとることでテストデータを生成する。 Note that the test data generation unit 13 generates test data by taking the logical sum of all the graph data generated by the graph data generation unit 12 when the link attribute information is the presence or absence of a link between nodes. For example, the test data generation unit 13 performs a logical sum of all the graph data (the presence / absence of a link) generated by the graph data generation unit 12 with respect to the test data (Ea) initially set to an “n × n” zero matrix. Test data is generated by taking
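
The integration step can be sketched as follows. The function name `integrate` and the representation of each cluster's graph data as a pair (list of 1-based node numbers, matrix) are assumptions of this sketch; the initial values follow the text, namely an identity matrix for weights and a zero matrix for link presence.

```python
def integrate(n, cluster_graphs, weighted=True):
    """Merge per-cluster matrices into one n x n test data matrix Ea.

    cluster_graphs: list of (node_numbers, E) pairs, where E[a][b] is the
    attribute of the link from node_numbers[a] to node_numbers[b].
    """
    if weighted:
        # "n x n" unit (identity) matrix, as described in the text
        Ea = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    else:
        Ea = [[False] * n for _ in range(n)]    # zero matrix for link presence
    for node_numbers, E in cluster_graphs:
        for a, src in enumerate(node_numbers):
            for b, dst in enumerate(node_numbers):
                if weighted:
                    Ea[src - 1][dst - 1] += E[a][b]            # arithmetic sum
                else:
                    Ea[src - 1][dst - 1] = Ea[src - 1][dst - 1] or bool(E[a][b])  # logical OR
    return Ea


graphs = [([1, 3], [[0, 0.5], [0.25, 0]]),   # repetition 1: cluster {1, 3}
          ([3, 1], [[0, 0.25], [0.5, 0]])]   # repetition 2: cluster {3, 1}
weights = integrate(3, graphs)
links = integrate(3, graphs, weighted=False)
```

Nodes 1 and 3 fall into the same cluster in both repetitions, so their weights accumulate, whereas node 2 receives no links.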

Here, the test data generation unit 13 may change at least one of the cluster size number and the threshold at each of the t repetitions of the cluster generation process and the graph data generation process. For example, as shown in FIG. 4(B), when the setting list specifies a cluster size number that varies across the t repetitions of the cluster generation process, the cluster generation unit 11, under the control of the test data generation unit 13, obtains from the setting list the cluster size number corresponding to the current repetition and performs the cluster generation process illustrated in FIG. 7 or FIG. 8.

Likewise, as shown in FIG. 4(B), when the stored setting list specifies a threshold that varies across the t repetitions of the graph data generation process, the graph data generation unit 12, under the control of the test data generation unit 13, obtains from the setting list the threshold corresponding to the current repetition and performs the graph data generation process illustrated in FIG. 9. The cluster size number and the threshold may both be set per repetition, as in FIG. 4(B), or either one may be a single value.
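As a rough illustration of a per-repetition lookup, the sketch below models a setting list in which a value is either a list with one entry per repetition or a single value shared by all repetitions. The dictionary layout, the key names, and the helper `value_for` are assumptions made for this example; the actual layout of FIG. 4(B) is not reproduced here.

```python
def value_for(setting, k):
    """Return the setting for 0-based repetition k.

    A list or tuple is read as one value per repetition; anything else is
    treated as a single value that applies to every repetition.
    """
    if isinstance(setting, (list, tuple)):
        return setting[k]
    return setting


# Hypothetical setting list: t = 3 repetitions, a cluster size number that
# changes per repetition, and one threshold shared by all repetitions.
setting_list = {"t": 3, "cluster_size": [4, 8, 16], "threshold": 0.7}

for k in range(setting_list["t"]):
    size = value_for(setting_list["cluster_size"], k)
    threshold = value_for(setting_list["threshold"], k)
    print(k, size, threshold)
```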

The cluster generation unit 11 may also execute either the first cluster generation method or the second cluster generation method at each repetition, under the control of the test data generation unit 13 based on, for example, an instruction from the operator. Which of the two methods the cluster generation unit 11 executes at each repetition is specified by the operator, for example, when the setting list is registered or before the cluster generation unit 11 starts processing. Similarly, when setting weights, the graph data generation unit 12 may execute any one of the first, second, or third setting methods at each repetition, under the control of the test data generation unit 13 based on, for example, an instruction from the operator. Which of the three setting methods the graph data generation unit 12 executes at each repetition is specified by the operator, for example, when the setting list is registered or before the graph data generation unit 12 starts processing.

Next, the processing procedure of the test data generation device 10 according to the second embodiment is described with reference to FIG. 10. FIG. 10 is a flowchart for explaining the processing of the test data generation device according to the second embodiment, for the case where the link attribute information is a weight indicating the relationship strength between nodes.

As shown in FIG. 10, the test data generation device 10 according to the second embodiment determines whether a setting list has been stored in the setting list 51 (step S101). If no setting list has been stored (No in step S101), the test data generation device 10 remains in a standby state.

If a setting list has been stored (Yes in step S101), the test data generation unit 13 sets the multiple-membership count management variable (k) to k = 0 and sets the n × n square matrix (Ea) that holds the test data to the identity matrix (step S102). When the link attribute information is the presence or absence of a link between nodes, the test data generation unit 13 instead sets Ea to the zero matrix in step S102.

The cluster generation unit 11 then performs the cluster generation process (step S103). Specifically, the cluster generation unit 11 performs either the first cluster generation process described with reference to FIG. 7 or the second cluster generation process described with reference to FIG. 8.

The test data generation unit 13 then sets the cluster number management counter variable (L) to L = 0 (step S104), and the graph data generation unit 12 performs the graph data generation process (step S105). For example, when the first setting method is configured as the weight setting method, the graph data generation unit 12 performs the graph data generation process described with reference to FIG. 9.

Next, the test data generation unit 13 increments L, giving L = 1 on the first pass (step S106), and determines whether L is greater than or equal to the number of clusters generated in step S103 (step S107). If L is smaller than the number of clusters generated in step S103 (No in step S107), the test data generation unit 13 determines that some clusters still have no graph data, and the graph data generation unit 12, under the control of the test data generation unit 13, returns to step S105 and performs the graph data generation process for the remaining clusters.

If L is greater than or equal to the number of clusters generated in step S103 (Yes in step S107), the test data generation unit 13 sets k = k + 1 (step S108) and takes the graph data group of the k-th repetition as E (step S109). In other words, when L is greater than or equal to the number of clusters generated in step S103, the test data generation unit 13 determines that graph data has been generated for every cluster generated in step S103.

The test data generation unit 13 then updates the test data as Ea = Ea + E (step S110) and determines whether k = t, the multiple-membership count (step S111).

If k has not reached t (No in step S111), the test data generation unit 13 determines that graph data groups have not yet been generated for all t repetitions, and the cluster generation unit 11, under the control of the test data generation unit 13, returns to step S103 and performs the cluster generation process again.

If k = t (Yes in step S111), the test data generation unit 13 takes the Ea updated in step S110 as the test data and ends the process.
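The flow of steps S102 to S111 can be sketched roughly as follows for the weighted case. Several details are assumptions made only to keep the example self-contained: nodes are identified by integers 0 to n − 1, clusters are formed by a simple shuffle-and-slice, a link is set when a uniform random number is greater than or equal to the threshold, and every generated link receives weight 1. The function names are hypothetical, and the procedures of FIG. 7 to FIG. 9 are not reproduced.

```python
import random


def generate_clusters(n, size, rng):
    """S103 (simplified): shuffle node IDs and slice them into clusters."""
    ids = list(range(n))
    rng.shuffle(ids)
    return [ids[i:i + size] for i in range(0, n, size)]


def generate_graph_data(cluster, threshold, rng, e):
    """S105 (assumed rule): link src -> dst when the random number >= threshold."""
    for src in cluster:
        for dst in cluster:
            if src != dst and rng.random() >= threshold:
                e[src][dst] = 1  # assumed weight of 1 per generated link


def generate_test_data(n, t, size, threshold, seed=0):
    rng = random.Random(seed)
    ea = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # S102
    k = 0
    while True:
        clusters = generate_clusters(n, size, rng)                   # S103
        e = [[0] * n for _ in range(n)]
        cluster_no = 0                                               # S104 (L = 0)
        while True:
            generate_graph_data(clusters[cluster_no], threshold, rng, e)  # S105
            cluster_no += 1                                          # S106
            if cluster_no >= len(clusters):                          # S107
                break
        k += 1                                                       # S108, S109
        for i in range(n):                                           # S110
            for j in range(n):
                ea[i][j] += e[i][j]
        if k == t:                                                   # S111
            return ea
```

For instance, `generate_test_data(6, 2, 3, 0.0)` links every pair inside each three-node cluster in both repetitions, while a threshold of 1.0 leaves Ea equal to the identity matrix.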

Although FIG. 10 describes the case where Ea is updated sequentially, in the second embodiment Ea may instead be generated by integrating the graph data groups all at once after the graph data groups for all t repetitions have been generated. Likewise, although FIG. 10 describes the case where the t rounds of the cluster generation process and the graph data generation process are performed sequentially, in the second embodiment they may be performed in parallel. Finally, although FIG. 10 describes the case where the graph data generation unit 12 generates graph data cluster by cluster under the control of the test data generation unit 13 using the cluster number management counter variable (L), in the second embodiment the graph data generation process for the clusters generated in step S103 may be performed in parallel.

As described above, in the second embodiment, as in the first embodiment, the test data is generated by the processing of the cluster generation unit 11, the graph data generation unit 12, and the test data generation unit 13.

In a telephone network, for example, communication paths correspond to physical lines, and the connection structure of the lines is self-evident from geographical conditions. Test data for verifying a telephone network control algorithm could therefore be generated as a graph structure simulating the flow of information along the communication paths, for example by determining communication density with random numbers, as shown in FIG. 11(A). In test data generation based on random graphs, every pair of nodes was given the possibility of being connected, and random numbers were then used to decide whether each connection exists or to assign a weight to it, as shown in FIG. 11(B). However, the methods shown in FIG. 11 could not generate test data simulating the biased link structure within a latent community, such as links between Web pages. FIG. 11 is a diagram for explaining conventional test data.

In the second embodiment, by contrast, the process of generating clusters from the node group is performed multiple times, as shown in FIG. 12(A), and a graph structure (graph data) is generated for each cluster in each of those cluster generation runs, as shown in FIG. 12(B). The graph structures of the clusters are then integrated, as shown in FIG. 12(C). Unlike the conventional test data shown in FIG. 11, the test data generated in the second embodiment therefore simulates a communication network in which multiple communities with biased link structures are intertwined. FIG. 12 is a diagram for explaining the test data generated in the second embodiment.

Accordingly, the second embodiment, like the first embodiment, can generate test data simulating the biased link structure within latent communities.

In the second embodiment, the graph data generation unit 12 sets, as the link attribute information, either the presence or absence of a link between nodes or a weight indicating the relationship strength between nodes. When the link attribute information is the presence or absence of a link between nodes, the test data generation unit 13 generates the test data by taking the logical OR of all the graph data. When the link attribute information is a weight indicating the relationship strength between nodes, the test data generation unit 13 generates the test data by taking the arithmetic sum of all the graph data.

Accordingly, depending on the type the operator wants, the second embodiment can generate test data simulating biased link presence within latent communities or test data simulating biased link density within latent communities.

In the second embodiment, as a first cluster generation method, the cluster generation unit 11 generates clusters one after another by randomly extracting a cluster size number of nodes from the nodes not yet extracted into a cluster. When the number of not-yet-extracted nodes falls below the cluster size number, it randomly allocates those remaining nodes to the clusters already generated. This yields the largest integer number of clusters not exceeding the total number of nodes divided by the cluster size number. Alternatively, as a second cluster generation method, the cluster generation unit 11 generates clusters one after another in the same way, but when the number of not-yet-extracted nodes falls below the cluster size number, it makes those remaining nodes into a new cluster. This yields the smallest integer number of clusters not less than the total number of nodes divided by the cluster size number.
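The two cluster generation methods can be sketched as follows. This is an illustrative reading of the paragraph above, not the procedure of FIG. 7 or FIG. 8: random extraction is modeled by shuffling the node pool and popping from it, each leftover node in the first method is assumed to pick an existing cluster independently, and the function names are hypothetical.

```python
import random


def _extract_full_clusters(node_ids, size, rng):
    """Randomly extract clusters of exactly `size` nodes; return them and the rest."""
    if size < 1 or size > len(node_ids):
        raise ValueError("cluster size number must be between 1 and the node count")
    pool = list(node_ids)
    rng.shuffle(pool)  # popping from a shuffled pool is a random extraction
    clusters = []
    while len(pool) >= size:
        clusters.append([pool.pop() for _ in range(size)])
    return clusters, pool


def first_cluster_method(node_ids, size, rng):
    """Leftover nodes join existing clusters at random: floor(n / size) clusters."""
    clusters, rest = _extract_full_clusters(node_ids, size, rng)
    for node in rest:
        rng.choice(clusters).append(node)
    return clusters


def second_cluster_method(node_ids, size, rng):
    """Leftover nodes form one extra cluster: ceil(n / size) clusters."""
    clusters, rest = _extract_full_clusters(node_ids, size, rng)
    if rest:
        clusters.append(rest)
    return clusters
```

With 10 nodes and a cluster size number of 4, the first method returns 2 clusters holding all 10 nodes, and the second returns 3 clusters of 4, 4, and 2 nodes.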

Accordingly, in the second embodiment, cluster generation can follow the operator's preference: either prioritizing making the number of nodes in each cluster as even as possible, or prioritizing keeping the number of nodes in each cluster as close as possible to the cluster size number. The choice between the first and second cluster generation methods can also be changed each time the cluster generation process is performed.

In the second embodiment, the test data generation unit 13 also changes at least one of the cluster size number and the threshold at each of the t repetitions of the cluster generation process and the graph data generation process, according to the values in the setting list. This increases the diversity of the communities that can be assumed. Accordingly, the diversity of test data simulating the biased link structure within latent communities can be adjusted freely.

In other words, when using the test data generation device 10, the operator can obtain the desired test data by adjusting the values in the setting list according to the nature of the target to be simulated. For example, when the target is the density (relationship strength) of interaction between sites on the Internet, the operator sets the values in the setting list as follows.

First, the multiple-membership count is set as a value representing the assumed number of communities from which a given site is accessed. When the target is a communication network containing many personal sites, which are numerous in absolute terms on the Internet, the multiple-membership count can be estimated from, for example, the number of fields of interest an individual can have. Even assuming a society with many people who have a great many hobbies, the range of things a person can be interested in at the same time is limited by factors such as the structure of the human brain. Recent findings on brain structure indicate that human short-term memory can hold about ten digits at most.

The amount of information that can be held in short-term memory is also similar to, for example, the number of items users consult in the result list of search systems in practical use. For this reason, even when latent interests are broad, the number of links placed in prominent positions when a site is actually built is often around ten. In view of this, the multiple-membership count is desirably set to 10. Alternatively, depending on the group of sites to be simulated, it may be set to a value smaller than 10, such as 7.

The cluster size number is set as a value representing the assumed maximum size of a community. Setting the cluster size number low simulates small communities, and setting it high simulates large communities.

The threshold is a value representing the probability that each source and destination node pair is not connected in the graph data generation process described above. Setting the threshold low simulates densely linked communities, and setting it high simulates sparsely linked communities.
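The relationship between the threshold and link density can be checked with a small experiment. The sketch below assumes that a link is generated when a uniform random number in [0, 1) is greater than or equal to the threshold, which makes the threshold the probability that a pair is not connected; this rule and the function name `link_density` are assumptions for illustration, not the procedure of FIG. 9.

```python
import random


def link_density(cluster_size, threshold, rng):
    """Fraction of ordered node pairs in one cluster that receive a link."""
    pairs = 0
    linked = 0
    for src in range(cluster_size):
        for dst in range(cluster_size):
            if src == dst:
                continue
            pairs += 1
            # Assumed rule: no link with probability `threshold`.
            if rng.random() >= threshold:
                linked += 1
    return linked / pairs


rng = random.Random(0)
for threshold in (0.2, 0.5, 0.8):
    # The measured density should be close to 1 - threshold.
    print(threshold, round(link_density(50, threshold, rng), 3))
```

Under this assumption the expected density is 1 − threshold, so a low threshold gives dense communities and a high threshold gives sparse ones.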

In other words, by setting the cluster size number and the threshold, communities with various link densities and sizes can be simulated. For example, a low threshold combined with a low cluster size number simulates small, densely linked communities, while a high threshold combined with a high cluster size number simulates large, sparsely linked communities.

Note that, because links are generated randomly, the first and second embodiments do not guarantee connectivity within an assumed community. That is, among the graph data generated by the graph data generation unit 12, the number of graph data items that describe connectivity within a community may fall below the configured cluster size number. The probability of this happening depends on the magnitudes of both the cluster size number and the threshold. When the threshold is high, data in which not every node has a path to every other node becomes more likely to be generated, and the probability that the number of graph data items describing connectivity within a community falls below the cluster size number rises accordingly. For this reason, the cluster size number is desirably determined in combination with the threshold.
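One way to detect generated data in which not every node can reach every other node is a reachability check over the adjacency matrix. The sketch below is not part of the described method; it is an auxiliary check, assuming directed links stored as a square 0/1 (or weight) matrix. It tests mutual reachability by running a breadth-first search from node 0 on the graph and on its reverse.

```python
from collections import deque


def reachable(adj, start):
    """Set of nodes reachable from `start` by following nonzero entries."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(adj[u]):
            if linked and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def is_mutually_reachable(adj):
    """True if every node has a directed path to every other node."""
    n = len(adj)
    if n == 0:
        return True
    # Node 0 reaches everyone, and everyone reaches node 0 (via the reverse graph).
    reverse = [[adj[j][i] for j in range(n)] for i in range(n)]
    return len(reachable(adj, 0)) == n and len(reachable(reverse, 0)) == n
```

A directed cycle over three nodes passes the check, while a one-way chain 0 → 1 → 2 fails because node 2 cannot reach node 0.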

If the target to be simulated is a latent community made up of sites that tend to be reluctant to link to one another, the threshold may be set high in order to generate test data representing a state in which links are sparse and not all nodes are connected.

The test data generated from setting data configured in this way superimposes, over t different cluster partitions (t being the multiple-membership count), links that simulate ties between nodes belonging to the same cluster. For example, if nodes are regarded as Web sites and clusters as the communities to which the Web site operators belong, each operator's membership in up to t communities simulates how relationships arise even between Web sites that do not directly belong to the same community. Using the test data generated in the second embodiment, the operator can therefore verify, for example, the important paths that determine the relationship strength between Web sites.

The first and second embodiments describe the case where the test data generation method is executed by a single device. However, the method may be executed in a distributed manner by multiple devices. For example, a device having the functions of the cluster generation unit 11 and the graph data generation unit 12 and a device having the function of the test data generation unit 13 may together function as one virtual test data generation device 10. Alternatively, in a system consisting of one device or of two or more devices, virtual machines may be started so that multiple virtualized computers run, and the test data generation method described in the first and second embodiments may be executed on those virtualized computers.

The test data generation method described in the first and second embodiments can be realized by executing a test data generation program prepared in advance on a computer such as a personal computer or a workstation. An example of a computer that executes a test data generation program having the same functions as the test data generation device 10 of the first embodiment is described below with reference to FIG. 13. FIG. 13 is a diagram showing a computer that executes the test data generation program according to the first embodiment. The following describes the case where the test data generation method is executed by a single computer. As noted above, however, the method may also be executed in a distributed manner by multiple computers or on virtualized computers.

As shown in FIG. 13, the computer 100, which is an information processing device such as a PC (Personal Computer), has a CPU (Central Processing Unit) 110, an input device 120, an output device 130, a medium reading device 140, and a network interface 150. The CPU 110 executes various kinds of arithmetic processing. The input device 120 receives input of various data from the user. The output device 130 outputs various data processed by the CPU 110. The medium reading device 140 reads programs and the like from a storage medium. The network interface 150 exchanges data with other computers via a network. For example, the network interface 150 can receive the setting list via the network.

The computer 100 also has a RAM 160 and an HDD 170. The RAM 160 temporarily stores various information processed by the CPU 110. The HDD 170 stores various data and programs.

For example, the HDD 170 stores a cluster generation program 171, a graph data generation program 172, and a test data generation program 173, which have the same functions as the cluster generation unit 11, the graph data generation unit 12, and the test data generation unit 13 shown in FIG. 1, respectively. The CPU 110 reads the cluster generation program 171, the graph data generation program 172, and the test data generation program 173 from the HDD 170 and loads them into the RAM 160 as a cluster generation program 161, a graph data generation program 162, and a test data generation program 163.

The CPU 110 then executes the cluster generation program 161, the graph data generation program 162, and the test data generation program 163 as a cluster generation process 111, a graph data generation process 112, and a test data generation process 113, respectively.

The cluster generation program 171, the graph data generation program 172, and the test data generation program 173 need not be stored in the HDD 170. For example, they may be stored on a storage medium such as a CD-ROM and read from that medium for execution. Alternatively, they may be stored on another computer and obtained for execution via a public line, the Internet, a LAN (Local Area Network), a WAN (Wide Area Network), or the like.

As described above, the test data generation method, test data generation device, and test data generation program according to the present invention are useful for generating test data that simulates external input in order to assess the validity of an algorithm used in information processing that takes external input, and are particularly suited to generating test data that simulates the biased link structure within latent communities.

Description of Symbols
10 Test data generation device
11 Cluster generation unit
12 Graph data generation unit
13 Test data generation unit
20 Input unit
30 Output unit
40 Input/output control I/F unit
50 Storage unit
51 Setting list
52 Node list
53 Graph data by cluster
54 Test data
60 Processing unit

Claims

A test data generation method in which a computer executes:
a cluster generation step of generating a plurality of clusters by randomly dividing a node group composed of a plurality of nodes, each assigned identification information, into a number of clusters determined by the total number of nodes of the node group and a predetermined cluster size number;
a graph data generation step of generating graph data indicating a link structure in each cluster by setting link attribute information, which indicates a link state between the nodes constituting each cluster generated by the cluster generation step, using a random number generated between the nodes and a predetermined threshold; and
a test data generation step of generating test data simulating the link state between the nodes constituting the node group by controlling the cluster generation processing by the cluster generation step and the graph data generation processing by the graph data generation step to be repeated a predetermined number of times, and integrating all graph data generated by the graph data generation step based on the identification information assigned to each node.
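The three steps of this claim can be sketched in Python as follows. This is only an illustrative sketch, not the implementation described in the specification: the function names, the treatment of links as undirected node pairs, and the rule that a link is set when the random number falls below the threshold are all assumptions, since the claim does not fix the direction of the comparison. For brevity, leftover nodes simply form a smaller final cluster.

```python
import random
from itertools import combinations


def make_clusters(node_ids, cluster_size, rng):
    """Cluster generation: shuffle the node IDs and cut them into chunks.

    Assumption: the last chunk may be smaller than cluster_size.
    """
    shuffled = list(node_ids)
    rng.shuffle(shuffled)
    return [shuffled[i:i + cluster_size]
            for i in range(0, len(shuffled), cluster_size)]


def cluster_links(cluster, threshold, rng):
    """Graph data generation: draw one random number per node pair.

    Assumption: a link is present when the number is below the threshold.
    """
    return {tuple(sorted(pair))
            for pair in combinations(cluster, 2)
            if rng.random() < threshold}


def generate_test_data(node_ids, cluster_size, threshold, repetitions, seed=0):
    """Test data generation: repeat both steps and merge links by node ID."""
    rng = random.Random(seed)
    links = set()
    for _ in range(repetitions):
        for cluster in make_clusters(node_ids, cluster_size, rng):
            links |= cluster_links(cluster, threshold, rng)
    return links
```

Because the clusters are redrawn on every repetition, a node can belong to several overlapping clusters in the merged result, which is what lets the output imitate membership in multiple communities.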

The test data generation method according to claim 1, wherein the graph data generation step sets, as the link attribute information, either the presence or absence of a link between nodes or a weight indicating the strength of the relationship between nodes, and
the test data generation step generates the test data by taking the logical sum of all graph data generated by the graph data generation step when the link attribute information is the presence or absence of a link between nodes, and by calculating the arithmetic sum of all graph data generated by the graph data generation step when the link attribute information is a weight indicating the strength of the relationship between nodes.
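The two integration rules in this claim can be illustrated with a short Python sketch. The data layout is an assumption made for the example: presence/absence graphs are represented as sets of node-ID pairs, and weighted graphs as dictionaries mapping a node-ID pair to a weight. The function names are hypothetical.

```python
from collections import Counter


def integrate_presence(graphs):
    """Logical sum (OR): a link exists if any per-cluster graph contains it."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def integrate_weights(graphs):
    """Arithmetic sum: weights recorded for the same node pair are added."""
    total = Counter()
    for graph in graphs:
        for pair, weight in graph.items():
            total[pair] += weight
    return dict(total)
```

With presence/absence data a pair that is linked in several clusters still counts once, whereas with weights each repeated co-occurrence strengthens the relationship.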

The test data generation method according to claim 1 or 2, wherein, in the cluster generation step, as a first cluster generation method, clusters are sequentially generated by randomly extracting the predetermined cluster size number of nodes from the nodes, among the plurality of nodes, that have not yet been extracted into a cluster, and, when the number of nodes not yet extracted into a cluster is smaller than the predetermined cluster size number, those nodes are randomly allocated to the already generated clusters, so that the number of clusters generated is the largest integer not exceeding the value obtained by dividing the total number of nodes of the node group by the predetermined cluster size number.

The test data generation method according to claim 1, wherein, in the cluster generation step, as a second cluster generation method, clusters are sequentially generated by randomly extracting the predetermined cluster size number of nodes from the nodes, among the plurality of nodes, that have not yet been extracted into a cluster, and, when the number of nodes not yet extracted into a cluster is smaller than the predetermined cluster size number, those nodes are generated as a new cluster, so that the number of clusters generated is the smallest integer not less than the value obtained by dividing the total number of nodes of the node group by the predetermined cluster size number.
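The first and second cluster generation methods differ only in how leftover nodes are handled, which a short Python sketch can make concrete. The function names are hypothetical, shuffling followed by slicing is used here as an equivalent of repeated random extraction without replacement, and the error raised when there are fewer nodes than one cluster is an assumption that the claims do not address.

```python
import random


def clusters_first_method(node_ids, cluster_size, rng):
    """Leftover nodes are randomly allocated to existing clusters.

    Yields floor(N / cluster_size) clusters.
    """
    pool = list(node_ids)
    rng.shuffle(pool)
    n_clusters = len(pool) // cluster_size
    if n_clusters == 0:
        # Assumption: the claims do not say what happens in this case.
        raise ValueError("need at least cluster_size nodes")
    clusters = [pool[i * cluster_size:(i + 1) * cluster_size]
                for i in range(n_clusters)]
    for node in pool[n_clusters * cluster_size:]:
        rng.choice(clusters).append(node)
    return clusters


def clusters_second_method(node_ids, cluster_size, rng):
    """Leftover nodes form one additional, smaller cluster.

    Yields ceil(N / cluster_size) clusters.
    """
    pool = list(node_ids)
    rng.shuffle(pool)
    return [pool[i:i + cluster_size]
            for i in range(0, len(pool), cluster_size)]
```

For 10 nodes and a cluster size number of 3, the first method gives 3 clusters with the tenth node added to one of them, and the second gives 4 clusters of sizes 3, 3, 3, and 1.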

The test data generation method according to any one of claims 1 to 4, wherein the test data generation step changes at least one of the predetermined cluster size number and the predetermined threshold each time the cluster generation processing by the cluster generation step and the graph data generation processing by the graph data generation step have been executed the predetermined number of times.
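One way to picture this parameter change is a schedule of (cluster size number, threshold) settings, each applied for the predetermined number of repetitions before moving to the next. The Python sketch below is illustrative only: the `schedule` argument, the function name, the undirected link representation, and the "random number below threshold" rule are assumptions rather than details taken from the specification.

```python
import random
from itertools import combinations


def run_schedule(node_ids, schedule, repetitions, seed=0):
    """Apply each (cluster_size, threshold) setting for `repetitions` passes.

    Links from all passes are merged by logical sum (set union).
    """
    rng = random.Random(seed)
    links = set()
    for cluster_size, threshold in schedule:
        for _ in range(repetitions):
            pool = list(node_ids)
            rng.shuffle(pool)
            for i in range(0, len(pool), cluster_size):
                cluster = sorted(pool[i:i + cluster_size])
                for a, b in combinations(cluster, 2):
                    # Assumption: link is set when the draw is below threshold.
                    if rng.random() < threshold:
                        links.add((a, b))
    return links
```

A schedule such as `[(50, 0.05), (5, 0.8)]` would overlay large, sparsely linked clusters with small, densely linked ones.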

A test data generation device comprising:
cluster generation means for generating a plurality of clusters by randomly dividing a node group composed of a plurality of nodes, each assigned identification information, into a number of clusters determined by the total number of nodes of the node group and a predetermined cluster size number;
graph data generation means for generating graph data indicating a link structure in each cluster by setting link attribute information, which indicates a link state between the nodes constituting each cluster generated by the cluster generation means, using a random number generated between the nodes and a predetermined threshold; and
test data generation means for generating test data simulating the link state between the nodes constituting the node group by controlling the cluster generation processing by the cluster generation means and the graph data generation processing by the graph data generation means to be repeated a predetermined number of times, and integrating all graph data generated by the graph data generation means based on the identification information assigned to each node.

A test data generation program for causing a computer to execute the test data generation method according to claim 1.