JPH11296487A

JPH11296487A - System and control method for distributed shared memory

Info

Publication number: JPH11296487A
Application number: JP10102838A
Authority: JP
Inventors: Hideaki Hirayama; 秀昭平山; Kuninori Tanaka; 邦典田中; Tetsuya Iinuma; 哲也飯沼
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-04-14
Filing date: 1998-04-14
Publication date: 1999-10-29

Abstract

PROBLEM TO BE SOLVED: To provide a distributed shared memory system that can extend the data structure that counts the appearance frequencies of items while that counting is in progress. SOLUTION: When a process 11 extends the data structure during counting, a lock is first acquired by a distributed lock acquisition unit 24 so that the extension operation is under exclusive control, and the data structure is extended in the shared memory space by a data structure extension unit 26. At this time, a data structure extension log 30 is recorded by a data structure extension log recording unit 28, and when the lock is released by a distributed lock release unit 25 the log is transferred to the other nodes 100 by a data structure extension log transfer unit 31. The transferred data structure extension log 30 is received at each node 100 by a data structure extension log receiving unit 29 and reflected in that node's data structure by a data structure extension log reflecting unit 27.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a distributed shared memory system, and a method of controlling distributed shared memory, suitable for application to a distributed-memory multiprocessor system that executes large-scale data mining, for example on the order of terabytes (TB).

[0002]

[Description of the Related Art] In recent years, with the development of barcode technology and the like, retailers such as supermarkets have come to accumulate large volumes of sales data. Advanced retailers analyze this accumulated sales data and reflect the results in store layouts and the like, thereby increasing their sales. Such techniques are generally called data mining.

[0003] Data mining seeks several kinds of information, the most representative of which is the association rule. An example of an association rule is "50% of customers who buy disposable diapers also buy canned beer." This example comes from supermarkets in the United States, where young fathers often come to buy disposable diapers and therefore often buy canned beer along with them. Using such information, a store can, for example, place disposable diapers and canned beer near each other to increase sales of canned beer. A method for finding association rules is given in R. Agrawal et al., "Mining Association Rules between Sets of Items in Large Databases", Proceedings of ACM SIGMOD, May 1993, and is summarized briefly below.

[0004] Let the set of attributes (items) be I = {i1, i2, ..., im} and the transaction database be D = {t1, t2, ..., tn}, where each ti is a set of items. An association rule is defined as X => Y, where X and Y are subsets of I and the intersection of X and Y is the empty set. Two evaluation values are defined: the support value and the confidence value. The support value is the proportion of the transactions in D that contain X; the confidence value is the proportion of transactions that contain both X and Y among the transactions in D that contain X.

Association rules are extracted by the following procedure.
(1) Find the item sets that satisfy the minimum support value (these are called frequent item sets).
(2) From the frequent item sets found in (1), find the association rules that satisfy the minimum confidence value.

An extraction example follows. Suppose the transactions are T1 = {1, 3, 4}, T2 = {1, 2, 3, 5}, T3 = {2, 4}, T4 = {1, 2}, and T5 = {1, 3, 5}, and association rules are to be found among them with a minimum support value of 60% and a minimum confidence value of 60%. The frequent item sets are then {1}, {2}, {3}, and {1, 3}, and 1 => 3 is found as an association rule.
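The two evaluation values can be checked against the example transactions with a few lines of Python. This is only an illustrative sketch: the function names `count`, `support`, and `confidence` are chosen here and do not appear in the patent.

```python
# Transactions T1..T5 from the worked example in the text.
transactions = [{1, 3, 4}, {1, 2, 3, 5}, {2, 4}, {1, 2}, {1, 3, 5}]

def count(itemset, db):
    """Number of transactions in db that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db)

def support(itemset, db):
    """Proportion of transactions in db that contain itemset."""
    return count(itemset, db) / len(db)

def confidence(x, y, db):
    """Among transactions containing X, the proportion also containing Y."""
    return count(set(x) | set(y), db) / count(x, db)

print(support({1}, transactions))          # item 1 appears in 4 of 5 transactions
print(support({1, 3}, transactions))       # {1, 3} appears in 3 of 5 transactions
print(confidence({1}, {3}, transactions))  # 3 of the 4 transactions with 1 also have 3
```

With a 60% threshold for both values, {1, 3} qualifies as a frequent item set and the rule 1 => 3 clears the confidence threshold, in agreement with the example.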

[0005] The Apriori algorithm is known as a method for extracting these frequent item sets efficiently. It is described in R. Agrawal et al., "Fast Algorithms for Mining Association Rules", Proceedings of 20th VLDB, 1994, and is summarized briefly below.
(1) Read the transaction database and count the number of occurrences of each item to obtain its support value. Here, counting the occurrences of items means counting how many times each item appears in the transaction database. Hereinafter, "counting" has this meaning.
(2) Extract the items that satisfy the minimum support value to form the frequent item sets of length 1.
(3) Form combinations of two items from the frequent item sets of length 1. These are called the candidate item sets of length 2.
(4) Scan the transaction database to obtain the support values.
(5) Extract the candidates that satisfy the minimum support value to form the frequent item sets of length 2.
(6) Thereafter, the processing for length k (>= 2) is as follows.
(a) Form the candidate item sets of length k from the frequent item sets of length k-1.
(b) Scan the transaction database to obtain the support values.
(c) Extract the candidates that satisfy the minimum support value to form the frequent item sets of length k.
(7) Repeat the above processing until the frequent item set becomes empty.
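Steps (1) through (7) can be sketched as a short single-machine Python function. This is a simplified illustration, not the cited paper's implementation: candidate generation is done by brute-force combination with a subset check, and the name `apriori` and its parameters are assumptions made for this example.

```python
from itertools import combinations

def apriori(db, min_support):
    """Return {frozenset: count} for every frequent item set in db."""
    # Length-1 candidates: every distinct item in the database.
    candidates = {frozenset([i]) for t in db for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Scan the database and count each candidate of length k.
        counts = {c: sum(c <= t for t in db) for c in candidates}
        level = {c: n for c, n in counts.items() if n / len(db) >= min_support}
        if not level:
            break  # the frequent item set of length k is empty
        frequent.update(level)
        k += 1
        # Build length-k candidates whose (k-1)-subsets are all frequent.
        items = sorted({i for c in level for i in c})
        candidates = {
            frozenset(c)
            for c in combinations(items, k)
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

db = [{1, 3, 4}, {1, 2, 3, 5}, {2, 4}, {1, 2}, {1, 3, 5}]
result = apriori(db, 0.6)
print(sorted(sorted(s) for s in result))
```

Run on the five example transactions with a 60% minimum support, the function yields the item sets {1}, {2}, {3}, and {1, 3}.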

[0006] As described above, conventional data mining has basically used the Apriori algorithm to find association rules. Japanese Patent Application No. 9-341384 makes it possible to process TB-order transactions at high speed by executing data mining based on the Apriori algorithm in parallel on a distributed-memory multiprocessor computer that has no shared memory.

[0007] By providing distributed shared memory, Japanese Patent Application No. 9-341384 eliminates the need for a distributed-memory programming model involving communication, even on a distributed-memory multiprocessor computer, and allows programs to be developed in the shared memory model, which is a natural extension of sequential processing.

[0008] In the Apriori algorithm, transaction data is read sequentially and the appearance frequency of each item is counted. In terms of supermarket POS data, one item of transaction data here corresponds to a single shopping receipt listing, for example, disposable diapers and canned beer.

[0009] FIG. 8 shows an example of the transaction data processed in Japanese Patent Application No. 9-341384. In FIG. 8, the first item of transaction data contains the three items a, b, and c, and the second contains the five items a, b, d, e, and x. Items such as a and b correspond to disposable diapers, canned beer, and the like in supermarket POS data.

[0010] FIG. 9 is a flowchart showing the processing flow of a program that executes the Apriori algorithm, a technique for efficiently extracting frequent item sets.

[0011] First, the candidate item sets of length 1 are created (step A1). An item set of length 1 is a set consisting of a single element, such as {a} or {b}. Next, the transaction data 17 is read, the appearance frequency of each item is counted, and the support values are obtained (step A2). The items that satisfy the minimum support value are then extracted to create the frequent item sets of length 1 (step A3).

[0012] Next, combinations of two items are formed from the frequent item sets of length 1 (step A4). These are the candidate item sets of length 2; an item set of length 2 is a set consisting of two elements, such as {a, b} or {a, c}. The transaction data 17 is then read, the appearance frequency of each item is counted, and the support values are obtained (step A5); the items that satisfy the minimum support value are extracted to create the frequent item sets of length 2 (step A6).

[0013] Here, whether the frequent item set of length k (>= 2) is empty is checked (step A7); if it is empty (YES in step A7), processing ends. If it is not empty (NO in step A7), k is incremented by 1 (step A8), and the candidate item sets of length k are created from the frequent item sets of length k-1 (step A9). The transaction data 17 is then read, the appearance frequency of each item is counted, and the support values are obtained (step A10); the items that satisfy the minimum support value are extracted to create the frequent item sets of length k (step A11), after which the processing is repeated from step A7.

[0014] FIG. 10 shows the hash table 16 that manages the statistical information 15 for pass 1 of the counting performed in the frequent item set extraction of FIG. 9, that is, statistical information consisting of the types of items of length 1 and their appearance frequencies. In FIG. 10, {a}, {b}, {c}, and so on denote item types, and the blank portion following each denotes the area in which the appearance frequency of that item is counted.

[0015] FIG. 11 shows the hash table 16 that manages the statistical information 15 for pass 2 of the counting performed in the frequent item set extraction of FIG. 9, that is, statistical information consisting of the types of items of length 2 and their appearance frequencies. In FIG. 11, {a, b}, {a, c}, {a, d}, and so on denote types of two-element items, and the blank portion following each denotes the area in which the appearance frequency of that item is counted.

[0016] FIG. 12 shows the structure of the counting log: each time counting is performed, the address of the area that counts the appearance frequency of the counted item is recorded. In Japanese Patent Application No. 9-341384, this counting log is transferred to the other nodes and reflected there, which keeps the distributed shared memory of the nodes consistent. Because the area that counts the appearance frequency is located in the distributed shared memory, it is at the same address on every node. The counting log therefore need contain only this address information.

[0017]

[Problems to Be Solved by the Invention] The processing of the Apriori algorithm described above does not assume that the data structure that counts the appearance frequencies changes while counting is in progress. Accordingly, Japanese Patent Application No. 9-341384 likewise provides no means of extending the data structure that counts the appearance frequencies during counting.

[0018] However, it is quite conceivable that, for example, the eight items a, b, c, d, e, f, g, and h have been the counting targets up to a certain point and that, while retaining the counts obtained so far, one wishes from then on to count a new item j as well. In such a case the data structure must be extended, for example as shown in FIG. 13. The data structure shown in FIG. 13 is a hash structure (hash links). As shown in FIG. 13(a), before the extension there are only the eight entries a, b, c, d, e, f, g, and h; as shown in FIG. 13(b), after the extension j has been added, giving nine entries. Thus, whereas before the extension one of the eight entries a through h is counted up, after the extension one of the nine entries, now including j, is counted up.
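The extension of FIG. 13 can be pictured with a small chained hash table in Python. This is a hypothetical sketch: the class `CountTable`, its bucket count, and its toy hash function are assumptions for illustration and do not reproduce the layout in the figure.

```python
class CountTable:
    """Chained hash table mapping an item to its appearance count."""

    def __init__(self, items, n_buckets=4):
        self.buckets = [[] for _ in range(n_buckets)]
        for item in items:
            self.extend(item)

    def _chain(self, item):
        # A deterministic toy hash keeps the example reproducible.
        return self.buckets[sum(map(ord, item)) % len(self.buckets)]

    def extend(self, item):
        """Add a new entry with count 0; existing entries are untouched."""
        chain = self._chain(item)
        if all(entry[0] != item for entry in chain):
            chain.append([item, 0])

    def count_up(self, item):
        """Increment the item's count; return False if it is not an entry."""
        for entry in self._chain(item):
            if entry[0] == item:
                entry[1] += 1
                return True
        return False

    def counts(self):
        return {key: n for chain in self.buckets for key, n in chain}

table = CountTable("abcdefgh")      # eight entries before the extension
table.count_up("a")
missed = table.count_up("j")        # j is not yet a counting target
table.extend("j")                   # nine entries after the extension
added = table.count_up("j")
```

The count already accumulated for a survives the extension, which is the property the paragraph asks for.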

[0019] As noted above, however, Japanese Patent Application No. 9-341384 provides no means of extending the data structure that counts the appearance frequencies while counting is in progress, and therefore cannot meet such a requirement.

[0020] The present invention has been made in view of these circumstances, and its object is to provide a distributed shared memory system that can extend the data structure that counts the appearance frequencies even while counting is in progress, and a distributed shared memory control method applied to that system.

[0021]

[Means for Solving the Problems] The present invention is a distributed shared memory system applied to a distributed-memory multiprocessor system in which a plurality of computers are loosely coupled, each of the plurality of computers comprising: shared memory space providing means for providing, to a process running on the same computer, a shared memory space that can be accessed in common, at the same addresses, with processes running on the other computers; data structure creating means for creating, in the shared memory space, a data structure that holds the appearance frequency of each set of specific items that the process running on the same computer extracts from input data; counting history acquisition means for acquiring a history of the counting performed by the process running on the same computer on the appearance frequencies held in the data structure; counting history transfer means for transferring the counting history acquired by the counting history acquisition means to the other computers; counting history receiving means for receiving counting histories transferred from the other computers; and counting history reflecting means for reflecting the counting histories received by the counting history receiving means in the appearance frequencies held in the data structure; the system being characterized in that each of the plurality of computers is further provided with: data structure extension means for extending the data structure; extension history acquisition means for acquiring a history of the extension of the data structure by the data structure extension means; extension history transfer means for transferring the extension history acquired by the extension history acquisition means to the other computers; extension history receiving means for receiving extension histories transferred from the other computers; extension history reflecting means for reflecting the extension histories received by the extension history receiving means in the data structure; and extension exclusive control means for performing exclusive control of the extension of the data structure between the data structure extension means and extension history reflecting means running on the same computer and the data structure extension means and extension history reflecting means running on the other computers; so that the data structure can be extended while the processes running on the plurality of computers are counting.

[0022] The present invention is further characterized in that each of the plurality of computers is further provided with history batch transfer means for temporarily accumulating the extension histories transferred by the extension history transfer means and the counting histories transferred by the counting history transfer means and transferring them to the other computers in a batch.

[0023] The present invention is further characterized in that the history batch transfer means has means for rearranging the accumulated extension histories and counting histories so that the extension histories are transferred before the counting histories.

[0024] The present invention is further characterized in that any operation for which the commutative law (a + b = b + a) holds is applied to the data held in the data structure. The present invention is further characterized in that the data structure may be configured in any format.
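One way to read the commutativity requirement is that the result of replaying logged operations must not depend on the order in which they arrive. The following sketch checks this for increments; the log contents and the function name `replay` are illustrative assumptions.

```python
from collections import Counter
from itertools import permutations

log = ["a", "b", "a", "c"]  # hypothetical counting log entries

def replay(entries):
    """Apply each logged increment and return the resulting counts."""
    counts = Counter()
    for key in entries:
        counts[key] += 1  # addition is commutative
    return counts

# Every ordering of the same log produces the same counts.
order_independent = all(replay(p) == replay(log) for p in permutations(log))
```

An operation such as "overwrite with the latest value" would fail this check, which suggests why the claim is restricted to commutative operations.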

[0025]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows the schematic configuration of a distributed-memory multiprocessor system according to an embodiment of the present invention and the functional blocks of the distributed shared memory system applied to it.

[0026] Each node 100 comprises one or more processors, memory, and I/O devices, and the nodes 100 are connected to one another via a network 200.

[0027] A process 11 runs on each node 100 and, as shown in FIG. 2, the address space of each process 11 contains a shared memory space 12 that is visible in common (at the same addresses) to all of the processes 11. This shared memory space 12 is provided by the shared memory space providing unit 13.

[0028] In the shared memory space 12, the shared-memory-space data structure creating unit 14 creates a hash table 16 with a hash structure, in which statistical information 15 consisting of item types and appearance frequencies is stored. In terms of supermarket POS data, an item here is a product such as disposable diapers or canned beer.

[0029] In the statistical information 15, the appearance frequency of each item is counted up by the item appearance frequency counting unit 18 of the process 11 as the process 11 reads the transaction data 17. In terms of supermarket POS data, one item of transaction data here corresponds to a single shopping receipt listing, for example, disposable diapers and canned beer.

[0030] FIG. 1 is drawn as though each node had only one disk device holding all of the transaction data 17, but in practice each node 100 has many disk devices, across which the transaction data 17 is divided. For example, 1 TB of transaction data requires 500 disk devices of 2 GB each, so in a 10-node system 50 disk devices are connected to each node, and the 1 TB of transaction data is divided across the 500 disk devices of the 10 nodes.

[0031] When the item appearance frequency counting unit 18 of the process 11 counts up an item's appearance frequency, the counting log recording unit 19 of the process 11 simultaneously records a counting log 20, which is the history of this counting. The recorded counting log 20 is transferred to the other nodes 100 by the counting log transfer unit 21.

[0032] The transferred counting log 20 is received at each node 100 by the counting log receiving unit 22 and reflected in the shared memory space 12 of that node 100 by the counting log reflecting unit 23.

[0033] "Counting" here means, for example, reading the transaction data 17 and incrementing the appearance frequency of each item it contains. A "counting log" entry is therefore the address of the area that records the item's appearance frequency. Because this area is allocated in the shared memory space 12, the process 11 on any node 100 can access it at the same address. Hence, if the address of the area that records the appearance frequency is sent as the counting log 20, the receiving node 100 can reflect counting performed on another node 100 in its own node 100 by incrementing the value in the area indicated by that address, and the data in the shared memory space 12, which is visible in common to the nodes 100, can be kept consistent.
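The address-based counting log can be simulated in a single Python process by treating a list index as the shared address. This is a minimal sketch under that assumption; the class `Node`, the constant `N_SLOTS`, and the direct method calls standing in for network transfer are all hypothetical.

```python
N_SLOTS = 8  # hypothetical number of counter areas in the shared memory space

class Node:
    def __init__(self):
        self.memory = [0] * N_SLOTS  # this node's copy of the shared memory space
        self.log = []                # counting log: addresses counted locally

    def count_up(self, address):
        """Increment the local counter and record its address in the log."""
        self.memory[address] += 1
        self.log.append(address)

    def send_log(self):
        """Hand over the accumulated log and start a fresh one."""
        sent, self.log = self.log, []
        return sent

    def reflect(self, received):
        """Replay another node's log by incrementing each logged address."""
        for address in received:
            self.memory[address] += 1

node0, node1 = Node(), Node()
for address in (0, 1, 1):
    node0.count_up(address)
for address in (1, 5):
    node1.count_up(address)

# Exchange logs in both directions; the two copies converge.
node1.reflect(node0.send_log())
node0.reflect(node1.send_log())
```

Because every node sees the counter at the same address, the log needs to carry nothing but that address.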

[0034] In addition, the distributed shared memory system of this embodiment comprises a distributed lock acquisition unit 24, a distributed lock release unit 25, a shared-memory-space data structure extension unit 26, a data structure extension log reflecting unit 27, a data structure extension log recording unit 28, a data structure extension log receiving unit 29, and a data structure extension log transfer unit 31, which make it possible to extend the hash table 16 while counting is in progress. This is the characteristic feature of the present invention. The extension of the hash table 16 during counting is described below.

[0035] To extend the hash table 16 while the process 11 on each node 100 is counting, the distributed lock acquisition unit 24 first acquires the lock for extension of the hash table 16, placing the data structure extension operation under exclusive control.

[0036] Next, the shared-memory-space data structure extension unit 26 extends the hash table 16 in the shared memory space 12. At this time, the data structure extension log recording unit 28 records a data structure extension log 30.

[0037] When the distributed lock release unit 25 releases the lock, the recorded data structure extension log 30 is transferred to the other nodes 100 by the data structure extension log transfer unit 31.

[0038] The transferred data structure extension log 30 is received at each node 100 by the data structure extension log receiving unit 29 and reflected in the hash table 16 of that node 100 by the data structure extension log reflecting unit 27.
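The sequence of acquiring the lock, extending the table, recording the extension log, transferring it at release, and reflecting it at the peers can be sketched in one Python process. Everything here is a stand-in: `threading.Lock` plays the role of the distributed lock, direct method calls play the role of network transfer, and the class and attribute names are assumptions for illustration.

```python
import threading

extension_lock = threading.Lock()  # stand-in for the distributed lock

class Node:
    def __init__(self, items):
        self.table = {item: 0 for item in items}  # stand-in for hash table 16
        self.peers = []
        self.extension_log = []

    def extend(self, item):
        with extension_lock:                  # acquire the extension lock
            self.table.setdefault(item, 0)    # extend the local table
            self.extension_log.append(item)   # record the extension log
            # Just before the lock is released, transfer the log to the peers.
            log, self.extension_log = self.extension_log, []
            for peer in self.peers:
                peer.reflect_extension(log)

    def reflect_extension(self, log):
        """Apply a received extension log without disturbing existing counts."""
        for item in log:
            self.table.setdefault(item, 0)

node0, node1 = Node("abc"), Node("abc")
node0.peers, node1.peers = [node1], [node0]
node1.table["a"] = 5   # counting already in progress on node 1
node0.extend("j")
```

After `node0.extend("j")`, node 1 has an entry for j while its existing count for a is unchanged.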

[0039] FIG. 3 shows counting being performed in parallel on node (0) and node (1) by the distributed shared memory system of Japanese Patent Application No. 9-341384 described above. For convenience of explanation, the buffer that records the log is assumed here to hold at most four log entries; in practice it is usually sized to hold several thousand entries or more.

[0040] At node (0), a, b, c, and d are counted and the log entries are stored in the buffer; when the buffer becomes full, its contents are transferred to node (1) and reflected there. Likewise, at node (1), 1, 2, 3, and 4 are counted and stored in the buffer; when the buffer becomes full, its contents are transferred to node (0) and reflected there. This is an example that does not involve extending the data structure, that is, the hash table 16.
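The flush-when-full behaviour of the log buffer can be sketched as follows. The class `LogBuffer` and its `send` callback are assumed names; the capacity of four mirrors the simplified figure rather than a realistic setting.

```python
class LogBuffer:
    """Accumulates log entries and transfers them as a batch when full."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send      # callback that transfers a batch to the peers
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)
        if len(self.entries) >= self.capacity:
            self.flush()

    def flush(self):
        if self.entries:
            batch, self.entries = self.entries, []
            self.send(batch)

sent = []                                  # records every transferred batch
buf = LogBuffer(capacity=4, send=sent.append)
for entry in "abcde":
    buf.append(entry)
```

After the five appends, one batch of four entries has been transferred and the fifth entry is still waiting in the buffer.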

[0041] FIG. 4, by contrast, shows counting being performed in parallel on node (0) and node (1) by the distributed shared memory system of this embodiment. Node (0) acquires the lock at time t1, extends the data structure, that is, the hash table 16, and releases the lock at time t3.

[0042] The data structure extension log x, which represents this extension of the hash table 16, is stored in the buffer together with the counting logs a, b, and c (as a, b, x, c), transferred to node (1), and reflected at node (1).

[0043] Node (1), meanwhile, issues a lock acquisition request at time t2. Because the lock has already been acquired by node (0) at that point, node (1) cannot acquire it immediately and is made to wait until time t4.

[0044] Then, at time t4, the lock held by node (0) is released, so node (1) can acquire it; node (1) then extends the hash table 16 and releases the lock at time t5. The data structure extension log Y produced at this time is stored in the buffer together with the counting logs 1, 2, and 3 (as 1, 2, Y, 3), transferred to node (0), and reflected at node (0).

[0045] Then, at time t6, the lock held by node (1) is released. In this way, the distributed shared memory system of this embodiment has a mechanism for exclusive control of the extension of the hash table 15 among the nodes 100, which makes it possible to extend the hash table 15 in the middle of a counting operation.
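The behavior of FIG. 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `Node` class, its `capacity` parameter, and the `("count", item)` / `("extend", item)` log tuples are hypothetical names, a `threading.Lock` stands in for the distributed lock, and a direct method call stands in for the network transfer.

```python
import threading


class Node:
    """Hypothetical node holding one replica of the shared count table."""

    def __init__(self, name, lock, capacity=4):
        self.name = name
        self.lock = lock          # stands in for the distributed lock
        self.capacity = capacity  # buffer size in log entries (assumed)
        self.table = {}           # item -> appearance frequency
        self.buffer = []          # logs waiting to be sent to the peer
        self.peer = None

    def count(self, item):
        # Count locally and record a counting log.
        self.table[item] += 1
        self._record(("count", item))

    def extend(self, item):
        # The extension is bracketed by lock acquisition and release.
        with self.lock:
            self.table.setdefault(item, 0)
            self._record(("extend", item))

    def _record(self, log):
        self.buffer.append(log)
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        # Direct call stands in for the transfer over the network.
        logs, self.buffer = self.buffer, []
        self.peer.apply(logs)

    def apply(self, logs):
        # Reflect the received logs on the local replica, in order.
        for kind, item in logs:
            if kind == "extend":
                self.table.setdefault(item, 0)
            else:
                self.table[item] += 1


lock = threading.Lock()
node0, node1 = Node("node0", lock), Node("node1", lock)
node0.peer, node1.peer = node1, node0
for node in (node0, node1):
    node.table.update({"a": 0, "b": 0})

# Buffer contents on node0 become (a, b, x, c-style count on the new slot).
node0.count("a")
node0.count("b")
node0.extend("x")
node0.count("x")  # fourth log fills the buffer and triggers the transfer
```

After the fourth log the buffer is flushed, and both replicas hold the same counts, including the slot added by the extension.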

[0046] FIG. 5 shows an example in which the method shown in FIG. 4 is modified. Before the counting logs and data structure extension logs stored in the buffer are transferred to another node, the logs in the buffer are rearranged so that the data structure extension logs come first and the counting logs come after them.

[0047] In this example only a small number of logs (four) are stored in the buffer, but in practice a large number are stored. By placing the data structure extension log first, the node that receives the buffer extends its data structure first, so the lock can be released before all the logs in the buffer have been reflected.
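The rearrangement described for FIG. 5 amounts to a stable partition of the buffer. A minimal sketch, assuming logs are represented as hypothetical `("extend", item)` and `("count", item)` tuples (the function name is also an assumption):

```python
def reorder_for_transfer(buffer):
    """Return the logs with extension logs first and counting logs after.

    The relative order inside each group is preserved (stable partition).
    """
    extensions = [log for log in buffer if log[0] == "extend"]
    counts = [log for log in buffer if log[0] == "count"]
    return extensions + counts


# Buffer (a, b, x, c) from FIG. 4, where x is the extension log.
fig4_buffer = [("count", "a"), ("count", "b"), ("extend", "x"), ("count", "c")]
reordered = reorder_for_transfer(fig4_buffer)
```

With the FIG. 4 buffer (a, b, x, c), the transfer order becomes (x, a, b, c), so the receiver can apply the extension before any counting log.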

[0048] FIG. 6 shows an example in which the method shown in FIG. 5 is further modified. Once a data structure extension log is stored in the buffer, the data structure extension logs and counting logs stored in the buffer at that point are immediately transferred to the other nodes 100, even if the buffer is not full.
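The FIG. 6 policy can be sketched as a buffer that flushes either when it is full or as soon as an extension log arrives. The `LogBuffer` class, its `send` callback, and the tuple log format are assumptions made for illustration only.

```python
class LogBuffer:
    """Hypothetical log buffer that flushes early on an extension log."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send      # callable that delivers a list of logs to the peer
        self.pending = []

    def record(self, log):
        self.pending.append(log)
        # Flush when full, or immediately when an extension log is stored.
        if log[0] == "extend" or len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # Stable sort: extension logs (key False) precede counting logs (key True).
        batch = sorted(self.pending, key=lambda log: log[0] != "extend")
        self.pending = []
        self.send(batch)


sent = []
buf = LogBuffer(capacity=4, send=sent.append)
buf.record(("count", "a"))
buf.record(("count", "b"))
buf.record(("extend", "x"))  # triggers an immediate transfer of three logs
buf.record(("count", "c"))   # stays in the buffer until the next flush
```

The trade-off is more, smaller transfers in exchange for the peer seeing the extension sooner.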

[0049] FIG. 7 is an example showing yet another method. Here, the buffer for counting logs and the buffer for data structure extension logs are separated, and when the lock is released the data structure extension log is sent ahead of the counting logs and reflected on the other nodes. That is, node (0) first counts a and b, then acquires the lock at time t1, extends the data structure, and releases the lock at time t3. At that point, only the data structure extension log x is transferred from node (0) to node (1) and reflected on node (1) as well.

[0050] Thereafter, node (0) further counts c and d. The buffer becomes full at this point, so the counting logs are sent from node (0) to node (1) and reflected on node (1) as well.

[0051] Meanwhile, node (1) issues a lock acquisition request at time t2. Because node (0) holds the lock at that point, node (1) is made to wait and acquires the lock at time t3, when node (0) releases it.

[0052] Node (1) then also extends the data structure and releases the lock at time t4. At that point, the data structure extension log Y is transferred from node (1) to node (0) and reflected on node (0) as well.

[0053] Thereafter, node (1) further counts 3 and 4. The buffer becomes full at this point, so the counting logs are sent from node (1) to node (0) and reflected on node (0) as well.
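The separated-buffer scheme of FIG. 7 can be sketched as follows. The `SplitBuffers` class, its method names, and the `send` callback are hypothetical; the sketch only shows that extension logs leave at lock release while counting logs leave when their own buffer fills.

```python
class SplitBuffers:
    """Hypothetical pair of buffers: one for counting logs, one for extension logs."""

    def __init__(self, capacity, send):
        self.capacity = capacity      # capacity of the counting-log buffer (assumed)
        self.send = send              # callable that delivers a list of logs to the peer
        self.count_logs = []
        self.extension_logs = []

    def record_count(self, item):
        self.count_logs.append(("count", item))
        if len(self.count_logs) >= self.capacity:
            batch, self.count_logs = self.count_logs, []
            self.send(batch)

    def record_extension(self, item):
        self.extension_logs.append(("extend", item))

    def on_lock_release(self):
        # Extension logs are sent at lock release, ahead of any counting logs.
        if self.extension_logs:
            batch, self.extension_logs = self.extension_logs, []
            self.send(batch)


sent = []
buffers = SplitBuffers(capacity=4, send=sent.append)
buffers.record_count("a")
buffers.record_count("b")
buffers.record_extension("x")   # performed while holding the lock
buffers.on_lock_release()       # only x is transferred here
buffers.record_count("c")
buffers.record_count("d")       # counting buffer is now full and is transferred
```

This reproduces the node (0) sequence of FIG. 7: x is transferred alone at lock release, and a, b, c, d follow once the counting buffer is full.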

[0054] In this way, in the examples of FIGS. 5 to 7, the order of the "counting" and "data structure extension" operations executed on one node is changed when those operations are performed on the other nodes, which improves execution efficiency.

[0055] Moreover, because the "data structure extension" operation is only ever moved ahead of the "counting" operations, no inconsistency arises in the processing. The embodiment described above uses the counting of data item occurrences in the Apriori algorithm as an example, but the scope of application of the present invention is not limited to this.
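Why moving an extension earlier is safe, while moving it later is not, can be checked with a small sketch. The `apply_logs` function and the tuple log format are assumptions; a counting log on a slot that does not yet exist is modeled as a `KeyError`.

```python
def apply_logs(table, logs):
    """Apply logs in order to a copy of the table and return the copy."""
    result = dict(table)
    for kind, item in logs:
        if kind == "extend":
            result.setdefault(item, 0)
        else:
            result[item] += 1  # raises KeyError if the slot does not exist yet
    return result


start = {"a": 0, "b": 0}
original = [("count", "a"), ("count", "b"), ("extend", "x"), ("count", "x")]
moved_earlier = [("extend", "x"), ("count", "a"), ("count", "b"), ("count", "x")]
moved_later = [("count", "a"), ("count", "b"), ("count", "x"), ("extend", "x")]

same_result = apply_logs(start, original) == apply_logs(start, moved_earlier)

try:
    apply_logs(start, moved_later)
    later_is_safe = True
except KeyError:
    later_is_safe = False
```

Here `same_result` is true, whereas delaying the extension past a counting log that depends on it fails.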

[0056] That is, the present invention is applicable to any processing for which the commutative law (a + b = b + a) holds, such as aggregation. Likewise, the data structure held in the distributed shared memory is not limited to a hash table; the invention is broadly applicable to arrays, queues, networks, and the like.
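The role of the commutative law can be illustrated with a short check: because addition commutes, counting logs produce the same table in whatever order a node reflects them. The function name and sample logs below are illustrative assumptions.

```python
from collections import Counter
from itertools import permutations


def reflect_counts(logs):
    """Apply counting logs (one item per log) and return the resulting table."""
    table = Counter()
    for item in logs:
        table[item] += 1
    return table


logs = ["a", "b", "a", "c"]

# Collect the distinct final tables over every possible arrival order.
distinct_results = {
    frozenset(reflect_counts(order).items()) for order in permutations(logs)
}
```

Every ordering yields the same table, so `distinct_results` contains exactly one element. An operation that does not commute, such as overwriting a value, would not have this property.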

[0057]

[Effects of the Invention] As described above in detail, according to the present invention, the extension of a data structure built in the distributed shared memory can be exclusively controlled among a plurality of computers. This makes it possible to extend the data structure in the middle of a counting operation and to flexibly accommodate the addition of items to be counted.

[Brief description of the drawings]

FIG. 1 is a diagram showing the schematic configuration of a distributed-memory multiprocessor system according to an embodiment of the present invention and the functional blocks of the distributed shared memory system applied to that multiprocessor system.

FIG. 2 is a diagram showing the internal configuration of a process in the embodiment.

FIG. 3 is a diagram showing how node (0) and node (1) count in parallel using a conventional distributed shared memory system.

FIG. 4 is a diagram showing how node (0) and node (1) count in parallel using the distributed shared memory system of the embodiment.

FIG. 5 is a diagram showing an example in which the method shown in FIG. 4 is modified so that, before the counting logs and data structure extension logs stored in the buffer are transferred to another node, the logs in the buffer are rearranged so that the data structure extension logs come first and the counting logs come after them.

FIG. 6 is a diagram showing an example in which the method shown in FIG. 5 is further modified so that, once a data structure extension log is stored in the buffer, the data structure extension logs and counting logs stored in the buffer at that point are immediately transferred to the other nodes, even if the buffer is not full.

FIG. 7 is a diagram showing an example in the embodiment in which the counting log buffer and the data structure extension log buffer are separated and, when the lock is released, the data structure extension log is sent ahead of the counting logs and reflected on the other nodes.

FIG. 8 is a diagram showing an example of transaction data in a conventional distributed shared memory system.

FIG. 9 is a flowchart showing the processing flow of a program that executes the Apriori algorithm, a technique for efficiently extracting frequent item sets.

FIG. 10 is a diagram showing a hash table that manages statistical information, consisting of item types and appearance frequencies, for pass 1 (items of length 1) counted in the frequent item set extraction process shown in FIG. 9.

FIG. 11 is a diagram showing a hash table that manages statistical information, consisting of item types and appearance frequencies, for pass 2 (items of length 2) counted in the frequent item set extraction process shown in FIG. 9.

FIG. 12 is a diagram showing the structure of a counting log in a conventional distributed shared memory system.

FIG. 13 is a diagram showing the extension of a data structure in a conventional distributed shared memory system.

[Explanation of symbols]

11: process, 12: shared memory space, 13: shared memory space providing unit, 14: shared-memory-space data structure creating unit, 15: statistical information consisting of item types and appearance frequencies, 16: hash table, 17: transaction data, 18: item appearance frequency counting unit, 19: counting log recording unit, 20: counting log, 21: counting log transfer unit, 22: counting log receiving unit, 23: counting log reflecting unit, 24: distributed lock acquisition unit, 25: distributed lock release unit, 26: shared-memory-space data structure extension unit, 27: data structure extension log reflecting unit, 28: data structure extension log recording unit, 29: data structure extension log receiving unit, 30: data structure extension log, 31: data structure extension log transfer unit, 100: node, 200: network.

Claims

[Claims]

1. A distributed shared memory system applied to a distributed-memory multiprocessor system in which a plurality of computers are loosely coupled, each of the plurality of computers comprising: shared memory space providing means for providing, to a process operating on the same computer, a shared memory space that can be accessed in common, at the same addresses, with a process operating on another computer; data structure creating means for creating, in the shared memory space, a data structure that holds appearance frequencies; counting history acquisition means for acquiring a history of the counting performed by processes operating on the same computer on the appearance frequencies held in the data structure; counting history transfer means for transferring the counting history acquired by the counting history acquisition means to the other computer; counting history receiving means for receiving a counting history transferred from the other computer; and counting history reflecting means for reflecting the counting history received by the counting history receiving means on the appearance frequencies held in the data structure, wherein each of the plurality of computers further comprises: data structure extending means for extending the data structure; extension history acquisition means for acquiring a history of the extension of the data structure by the data structure extending means; extension history transfer means for transferring the extension history acquired by the extension history acquisition means to the other computer; extension history receiving means for receiving an extension history transferred from the other computer; extension history reflecting means for reflecting the extension history received by the extension history receiving means on the data structure; and extension exclusive control means for performing exclusive control between the data structure extending means and extension history reflecting means operating on the same computer and the data structure extending means and extension history reflecting means operating on the other computer, whereby the data structure can be extended while the processes operating on the plurality of computers are counting.

2. The distributed shared memory system according to claim 1, wherein each of the plurality of computers further comprises history batch transfer means for temporarily accumulating the extension history to be transferred by the extension history transfer means and the counting history to be transferred by the counting history transfer means, and transferring them to the other computer collectively.

3. The distributed shared memory system according to claim 2, wherein the history batch transfer means includes means for rearranging the accumulated extension history and counting history so that the extension history is transferred before the counting history.

4. The distributed shared memory system according to claim 1, wherein an arbitrary operation that satisfies the commutative law (a + b = b + a) is performed on the data held in the data structure.

5. The distributed shared memory system according to claim 1, wherein the data structure can be constructed in any format.

6. A method for controlling a distributed shared memory, applied to a distributed shared memory system that is applied to a distributed-memory multiprocessor system in which a plurality of computers are loosely coupled, each of the plurality of computers comprising: shared memory space providing means for providing, to a process operating on the same computer, a shared memory space that can be accessed in common, at the same addresses, with a process operating on another computer; data structure creating means for creating, in the shared memory space, a data structure that holds appearance frequencies; counting history acquisition means for acquiring a history of the counting performed by processes operating on the same computer on the appearance frequencies held in the data structure; counting history transfer means for transferring the counting history acquired by the counting history acquisition means to the other computer; counting history receiving means for receiving a counting history transferred from the other computer; and counting history reflecting means for reflecting the counting history received by the counting history receiving means on the appearance frequencies held in the data structure, the method comprising the steps of: executing an extension of the data structure while performing exclusive control with the other computer; acquiring an extension history of the executed extension of the data structure; transferring the acquired extension history to the other computer; receiving an extension history transferred from the other computer; and reflecting the received extension history on the data structure while maintaining consistency with the other computer, whereby the data structure can be extended while the processes operating on the plurality of computers are counting.

7. The method for controlling a distributed shared memory according to claim 6, further comprising a step of temporarily accumulating the extension history and the counting history to be transferred and transferring them to the other computer collectively.

8. The method for controlling a distributed shared memory according to claim 7, further comprising a step of rearranging the accumulated extension history and counting history so that the extension history is transferred before the counting history.

9. The method for controlling a distributed shared memory according to claim 6, wherein an arbitrary operation that satisfies the commutative law (a + b = b + a) is performed on the data held in the data structure.

10. The method for controlling a distributed shared memory according to claim 6, wherein the data structure can be constructed in any format.