JP2871755B2

JP2871755B2 - Split control method in dynamic hash

Info

Publication number: JP2871755B2
Application number: JP1299210A
Authority: JP
Inventors: 孝司小幡; 一孝小沢
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1989-11-17
Filing date: 1989-11-17
Publication date: 1999-03-17
Anticipated expiration: 2014-03-17
Also published as: JPH03158966A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Overview]

The invention relates to a split control method for dynamic hashing that improves page splitting by copying pages at split time, in a data processing system that manages data with a storage structure based on dynamic hashing. Its purpose is to balance the overhead of split processing across the whole system so that response time and throughput in transaction processing can be guaranteed. The method comprises a first process that, at split time, allocates a new page and copies the contents of the split-target page to the new page, and a second process that rehashes the records of the split-target page and the new page after the copy and deletes unnecessary records. The two processes are configured so that they can run separately.

[Industrial Application Field]

The present invention relates to a split control method for dynamic hashing that improves page splitting by copying pages at split time, in a data processing system that manages data with a storage structure based on dynamic hashing.

Storage structures based on dynamic hashing have been studied for database management systems. Dynamic hashing schemes fall into two broad classes: directory-based, tree-like (Tree) dynamic hashing, and linear hashing, which does not use a directory.

At present, no commercial database appears to have put dynamic hashing into full practical use. Dynamic hashing has several problems when applied to database systems for large-scale online transaction processing (OLTP). One of them is that the page split function significantly affects response time and throughput in transaction processing, so that acceptable values for them cannot be adequately guaranteed.

[Conventional technology]

FIG. 5 shows an example of a conventional tree-like hash, FIG. 6 shows an example of a conventional linear hash, FIG. 7 shows an example of conventional split processing, and FIG. 8 is a conceptual diagram of conventional split control.

A database system is required to respond quickly and to use storage space efficiently. In online transaction processing in particular, it matters less that these are "good on average" than that they are "stable with little variation", that is, that "the worst-case value stays within a certain range".

Dynamic hashing has been studied as a data storage structure that meets these requirements. It is described in detail in, for example, the following reference.

(Reference) R. J. Enbody and H. C. Du, "Dynamic Hashing Schemes", ACM Computing Surveys, Vol. 20, No. 2, June 1988.

When data is stored using a hash function, the data inevitably becomes unevenly distributed and keys collide. With conventional, commonly used static hashing, trying to prevent the increase in access count caused by this skew degrades space utilization, while trying to maintain space utilization makes it impossible to guarantee the access count.

In dynamic hashing, the space in use grows and shrinks with the amount of data, so the trade-off between space utilization and access count is resolved to some extent.

Tree-like hashing and linear hashing are the representative dynamic hashing schemes.

The two schemes differ in how a record is located from a hash value. Tree-like hashing derives an entry identifier in the directory from the hash value. Linear hashing derives a page identifier (page number, page address, etc.) from the hash value. Here, a page is the physical or logical unit of I/O access on secondary storage; it is sometimes called a block or a bucket.

Both schemes improve the space efficiency of the storage structure by letting records with different hash values share a page. As for page maintenance, a tree-like hash splits a page when it becomes full, as in a conventional binary tree structure. FIG. 5 shows an example.

The hash function of this tree-like hash is $h_k(\mathrm{key}) = \mathrm{key} \bmod 2^k$ (the original writes the remainder operator as "//").

The value of the hash function corresponds to the directory entry number. Here k is the number of times the directory entries have been expanded, and mod denotes the remainder.

Assume that one page can hold two records. If, in the state shown in FIG. 5(a), a request arrives to store a record with key 005, a new page is added and the records with keys 002 and 003 are redistributed by rehashing. Then, as shown in FIG. 5(b), the page for the record with key 005 is likewise determined by hashing, and the record is stored there.
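As a rough illustration of the hash function above, the following Python sketch computes directory entry numbers for the keys of the FIG. 5 example. It assumes, since the figure is not reproduced here, that the directory has one entry before the split (k = 0) and two entries after it (k = 1); the function name `h` merely mirrors the notation h_k.

```python
def h(key: int, k: int) -> int:
    """Directory entry number for `key` after k directory expansions."""
    return key % (2 ** k)


# Assumption: k = 0 before the split, k = 1 after it.
before = {key: h(key, 0) for key in (2, 3)}
after = {key: h(key, 1) for key in (2, 3, 5)}

print(before)  # every key shares directory entry 0
print(after)   # keys are spread over entries 0 and 1
```

Under these assumptions, keys 003 and 005 end up under the same directory entry after the split while 002 stays under entry 0.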

A linear hash lets a page overflow and performs the split at some other, arbitrary time. FIG. 6 shows an example.

In the example of FIG. 6, hashing works as follows.

(1) Let H be a function that converts a key k into a random bit string of m bits.

(2) According to the number of expansions d, the number of pages in the physical space is $2^d$ ($d = 0, \dots, m$).

(3) The mapping function onto the physical space is $h_i(k) = H(k) \bmod 2^i$.

(4) Immediately after an expansion, and initially, the split pointer SP points to the first page of the physical space. In this state, $h_d$ is used to map records onto the physical space.

(5) The split pointer SP is advanced each time a fixed number or amount L of records has been stored, and the records in the part it has already passed are redistributed with $h_{d+1}$, the mapping function for the next physical space. In this state, the mapping function $h(k)$ from records to the physical space is therefore:

$$h(k) = \begin{cases} h_d(k) & \text{if } h_d(k) \ge SP \\ h_{d+1}(k) & \text{if } h_d(k) < SP \end{cases}$$

Here, for example, let the initial value of d be 2, let L be the storage of three records, and let each page hold two records.
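Rules (3) to (5) can be sketched in Python as follows. `H` is replaced by the identity function purely for illustration (the text only requires that it produce m random-looking bits), and the names `h_i` and `address` are hypothetical.

```python
def H(key: int) -> int:
    """Stand-in for the randomizing function H (assumption: identity)."""
    return key


def h_i(key: int, i: int) -> int:
    """Mapping function h_i(k) = H(k) mod 2^i."""
    return H(key) % (2 ** i)


def address(key: int, d: int, sp: int) -> int:
    """Page number for `key` given expansion count d and split pointer sp."""
    page = h_i(key, d)
    if page < sp:
        # The split pointer has already passed this page,
        # so the next-level function h_{d+1} applies.
        page = h_i(key, d + 1)
    return page


print(address(6, d=2, sp=0))    # no page split yet: h_2 is used
print(address(28, d=2, sp=1))   # page 0 already split: h_3 is used
```

The design point is that no directory is needed: the pair (d, SP) alone decides which of the two hash functions applies to a given key.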

A concrete example of the results of H(k) is as follows.

160, 177, 194, 061, 078, 095, 006, 023, 028.

In this case, the linear hash stores the records as shown in FIG. 6.

FIG. 6(a) shows the state in which the records up to 095 have been stored. When the records 006, 023, and 028 are then added, the page in which each is stored is determined by the following functions.

$$h_d(k) = 006 \bmod 2^2 = 2$$
$$h_d(k) = 023 \bmod 2^2 = 3$$
$$h_d(k) = 028 \bmod 2^2 = 0$$

Page 2, where record 006 is to be stored, already holds two records (194 and 078), so an overflow page OP is allocated and the record is stored there. The result of storing the records up to 028 is shown in FIG. 6(b).

When the three records 006 through 028 have been stored, split control begins. Because the page that the split pointer SP points to has an overflow page, a new page is allocated and records 194, 078, and 006 are rehashed with the new function.

$$h_{d+1}(k) = 194 \bmod 2^3 = 2$$
$$h_{d+1}(k) = 078 \bmod 2^3 = 6$$
$$h_{d+1}(k) = 006 \bmod 2^3 = 6$$

From this result, records 078 and 006 are moved to the new page 6, as shown in FIG. 6(c).

The split pointer SP is then advanced by one so that it points to the target of the next split.
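The arithmetic of this worked example can be checked with a short Python sketch. It takes the nine H(k) values listed in the text, d = 2, and two records per page; modelling a page and its overflow page as one list is a simplification introduced here, not something the text prescribes.

```python
CAPACITY = 2   # records per page, as in the example
D = 2          # initial number of expansions d
hashes = [160, 177, 194, 61, 78, 95, 6, 23, 28]   # H(k) values from the text

# Place every record with h_d. A list longer than CAPACITY stands for
# a prime page plus its overflow page.
pages = {p: [] for p in range(2 ** D)}
for value in hashes:
    pages[value % 2 ** D].append(value)

overflowing = [p for p, recs in pages.items() if len(recs) > CAPACITY]

# Rehash the records of the overflowing page with h_{d+1}.
rehashed = {value: value % 2 ** (D + 1) for value in pages[2]}

print(pages)
print(overflowing)
print(rehashed)
```

Only page 2 overflows, and rehashing its three records sends two of them to page 6, which matches the moves described for FIG. 6(c).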

Because a linear hash lets pages overflow and performs split control afterwards, this is called deferred splitting. Compared with a tree-like hash, a linear hash with deferred splitting has a larger split overhead, by the amount of page overflow. This overhead takes the form of more program execution steps during the split and of waits for resource locks.

With deferred splitting, the split can be controlled by a daemon function running in the background. However, if split control is naively handed to a daemon, response time and overall throughput may degrade, for example because the storage structure access functions running in the foreground must wait for exclusive access to records (resources).

Because page utilization fluctuates sharply with splits in a linear hash as described above, schemes such as partial expansion hashing, which improves space utilization, have been proposed, as also described in the reference cited above. However, the overhead of splitting has not been examined, so in terms of overhead these schemes are essentially the same as linear hashing.

The conventional split processing described above is outlined in FIG. 7.

(a) When a split becomes necessary, a new page is first allocated.

(b) A record of the split-target page is read.

(c) If there are no more records, processing ends.

(d) The record is rehashed with the new hash function.

(e) From the hash result, it is determined whether the record belongs on the new page or on the current page. If it belongs on the current page, the record is left as it is, and processing returns to (b) for the next record.

(f) If the record belongs on the new page, it is moved there. That is, the record is copied to the new page and the original record on the current page is deleted. Processing then returns to (b) and repeats.

If there are overflow pages, steps (b) to (f) are repeated for them in the same way.
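Steps (a) to (f) can be sketched in Python as one uninterrupted loop. The sketch assumes linear-hash style addressing (new page number = target + 2^d, rehash with mod 2^(d+1)) and folds overflow pages into the same record list; the function name `conventional_split` is hypothetical.

```python
def conventional_split(pages: dict, target: int, d: int) -> int:
    """Rehash every record of `target` and move it in a single pass."""
    new_page = target + 2 ** d            # (a) allocate the new page
    pages[new_page] = []
    kept = []
    for key in pages[target]:             # (b)/(c) read records until none remain
        dest = key % 2 ** (d + 1)         # (d) rehash with the new function
        if dest == target:                # (e) record stays on the current page
            kept.append(key)
        else:                             # (f) move: copy to new page, delete from old
            pages[new_page].append(key)
    pages[target] = kept
    return new_page


pages = {2: [194, 78, 6]}                 # page 2 of the FIG. 6 example
new_page = conventional_split(pages, target=2, d=2)
print(new_page, pages)
```

Every record is rehashed and possibly moved before the function returns, so any exclusive control taken for the split has to be held for the whole loop.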

In such a conventional split, when page P2 shown in FIG. 8(a) is split, for example, every one of the records R1 to R4 stored in page P2 is rehashed and, as shown in FIG. 8(b), the record moves (including deletions) are carried out as one continuous sequence of processing.

[Problems to be solved by the invention]

In conventional split control for dynamic hashing, as soon as a split becomes necessary, every record on the split-target page, including its overflow pages, is rehashed, and the records whose storage location must change are copied and deleted, all as one continuous sequence of processing. Because of the exclusive control held during this time, online transactions risk being kept waiting for a long time. In addition, the CPU load becomes concentrated while the split control runs.

The storage structure of a database for online transaction processing must minimize the impact of the self-adjustment function that changes the storage structure, and must satisfy the following three conditions.

(1) Separate the data-processing transactions (foreground) from the storage structure change processing (background), so that the response time of transaction processing is guaranteed.

(2) Prevent long waits for exclusive access between the data-processing transactions and the storage structure change processing.

(3) Balance, overall, the cost of the data-processing transactions against the cost of the storage structure change processing.

Condition (1) can be achieved to some extent with the conventional deferred split method. Conditions (2) and (3) must be satisfied as well.

The present invention aims to solve the above problems by balancing the overhead of split processing across the whole system, so that response time and throughput in transaction processing can be guaranteed.

[Means for solving the problem]

FIG. 1 illustrates the principle of the present invention.

In FIG. 1, 10 is a processing unit comprising a CPU, memory, and the like; 11 is a storage structure access processing unit that inserts, retrieves, updates, and deletes records; 12 is a storage structure change processing unit that performs split control; 13 is page copy processing; 14 is unnecessary record deletion processing; 15 is a buffer control unit that reads and writes pages; 16 is a secondary storage device such as a magnetic disk device; and 17 is a data storage area organized as a hash storage structure.

When a split becomes necessary in the dynamic hash, page copy processing 13 allocates a new page and copies the contents of the split-target page to the new page unchanged.

Unnecessary record deletion processing 14 rehashes the records of the original page and the new page after the copy made by page copy processing 13 and, for each record duplicated on the two pages, deletes the copy that is unnecessary.

In the present invention, page copy processing 13 and unnecessary record deletion processing 14 in the storage structure change processing unit 12 are, in particular, separated, so that if, for example, an urgent online transaction arrives between them, it can be executed with priority.

For example, suppose that in the state shown in FIG. 1(b), page P2 is the split target and records R1 to R4 are to be relocated.

Page copy processing 13 copies the contents of page P2 unchanged to a new page P6, as shown in FIG. 1(c), without rehashing any record. If there is an overflow page, an overflow page area is likewise allocated and the page is copied unchanged. The time from the start to the end of this processing is therefore very short.

Unnecessary record deletion processing 14 runs at some other, arbitrary time. It rehashes records R1 to R4, which are stored in duplicate on pages P2 and P6, and checks whether each copy is in the correct location. A copy in the wrong location is judged to be an unnecessary record and is deleted, as shown in FIG. 1(d).
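The separation of the two processes can be sketched in Python as two independent functions. The hash values chosen for R1 to R4 (2, 6, 10, 14), the mod 2^(d+1) addressing with d = 2, and the `limit` parameter are all assumptions made for illustration; the text only states that P2 is copied to P6 and that deletion happens later.

```python
def copy_page(pages: dict, target: int, d: int) -> int:
    """Page copy processing 13: duplicate the page without rehashing."""
    new_page = target + 2 ** d
    pages[new_page] = list(pages[target])
    return new_page


def delete_unnecessary(pages: dict, page_no: int, d: int, limit=None) -> int:
    """Unnecessary record deletion 14: rehash and drop misplaced copies.

    `limit` caps how many records are removed per call, so the cleanup
    can be spread over several calls.
    """
    removed = 0
    for key in list(pages[page_no]):
        if limit is not None and removed >= limit:
            break
        if key % 2 ** (d + 1) != page_no:   # copy sits on the wrong page
            pages[page_no].remove(key)
            removed += 1
    return removed


pages = {2: [2, 6, 10, 14]}                 # hypothetical hash values of R1..R4
p6 = copy_page(pages, target=2, d=2)        # fast step, no rehashing
first = delete_unnecessary(pages, 2, d=2, limit=1)   # partial cleanup of P2
rest = delete_unnecessary(pages, 2, d=2)             # finish P2 later
moved = delete_unnecessary(pages, p6, d=2)           # clean up P6
print(pages)
```

Other work can run between any two of these calls, which is the property the text relies on to keep exclusive-control periods short.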

[Action]

The present invention performs the following control in dynamic hashing schemes such as tree-like hashing, linear hashing, and partial expansion hashing.

- At split time, a new page is allocated and the split-target page is copied to it.

- When a split page is searched, the unnecessary records produced by the copy are not search targets, because the new hash function is used. In a sequential search that does not go through the hash value, however, each record is rehashed to confirm that it is a search target.

An unnecessary record is, in origin, a record that the split would not have moved. Because the whole page is copied, such records are copied to the new page as well.

- The self-adjustment function detects these unnecessary records by rehashing and deletes them.

As described above, a split finishes as soon as the page has been copied, so the period of exclusive control it requires is very short. The self-adjustment function does not have to delete all unnecessary records at once; it can delete them progressively at suitable times, taking CPU load and other factors into account, so the impact on online transactions and the like can be kept as small as possible.

[Example]

FIG. 2 shows an example of applying the present invention to a tree-like hash, FIG. 3 shows an example of applying it to a linear hash, and FIG. 4 shows an example of the processing configuration for a linear hash structure according to an embodiment of the present invention.

[Example of application to tree-like hash]

In split processing for a tree-like hash, if a page becomes full when a record is stored, a new page is allocated and the entire contents of the old page are copied to the new page. Unnecessary records are deleted from the new and old pages afterwards. Rehashing a record determines whether it belongs on the new page or on the old page, and therefore whether it is an unnecessary record.

Because the hash value yields the directory entry, and the directory entry yields the page, the unnecessary records produced by the copy fall outside key searches. Their presence therefore causes no problem for key searches.

In a real database, however, records are sometimes searched sequentially without going through the hash value. For this reason, a split-in-progress indicator is set at split time, for example on a per-page basis. Later, when the self-adjustment function has run and turned every unnecessary record in the page into a deleted record, the split-in-progress indicator is cleared.

In a sequential search, the records in a page whose split-in-progress indicator is set are rehashed, and only those that are not unnecessary records are treated as search targets.

FIG. 2 shows an example of applying the present invention to the same tree-like hash case as FIG. 5, which was described as the conventional example.

Inserting 002 and 003 as initial data produces the state shown in FIG. 2(a). The split-in-progress indicator F is "0".

Next, when an attempt is made to insert a record with the value 005, the page is full, so a split by page copy is performed. This produces the state shown in FIG. 2(b). (003) and (002) are unnecessary records, but nothing on the page marks them directly as unnecessary. The split-in-progress indicator F is set to "1".

Hashing the record 005 to be inserted shows that it belongs in the page indicated by directory entry "1". An unnecessary record on that page is therefore found by rehashing, and record 005 is inserted into its area, as shown in FIG. 2(c). Only enough unnecessary records to make room for record 005 need to be found; at this stage there is no need to find every unnecessary record in the page.

On another occasion, when all unnecessary records have been deleted, the split-in-progress indicator F is reset to "0".
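The FIG. 2 walk-through can be replayed with a small Python sketch. It covers only this single step, assumes the directory grows from one entry (k = 0) to two (k = 1), and uses a plain dictionary per page; these structures and the helper names are illustrative assumptions, not the patent's data layout.

```python
def entry(key: int, k: int) -> int:
    """Directory entry number, h_k(key) = key mod 2^k."""
    return key % 2 ** k


def is_unnecessary(key: int, entry_no: int, k: int) -> bool:
    """A record is unnecessary if rehashing sends it to another entry."""
    return entry(key, k) != entry_no


# FIG. 2(a): one page holding 002 and 003, indicator F = 0 (k = 0 assumed).
directory = {0: {"records": [2, 3], "F": 0}}
k = 0

# Inserting 005 finds the page full, so the split copies the whole page.
directory[1] = {"records": list(directory[0]["records"]), "F": 1}
directory[0]["F"] = 1
k = 1                                        # state of FIG. 2(b)

# 005 hashes to entry 1; reuse the slot of one unnecessary record there.
target_no = entry(5, k)
target = directory[target_no]
slot = next(i for i, r in enumerate(target["records"])
            if is_unnecessary(r, target_no, k))
target["records"][slot] = 5                  # state of FIG. 2(c)
snapshot = {no: list(p["records"]) for no, p in directory.items()}

# Later cleanup: drop the remaining unnecessary records and clear F.
for no, page in directory.items():
    page["records"] = [r for r in page["records"]
                       if not is_unnecessary(r, no, k)]
    page["F"] = 0

print(snapshot)
print(directory)
```

The search for a slot stops at the first unnecessary record, mirroring the remark that not every unnecessary record has to be found at insertion time.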

[Example of application to linear hash]

In a linear hash, when a page becomes full, records are stored in an overflow page and the split is deferred. For example, once a certain amount of records has been inserted, the self-adjustment function advances the split pointer SP by one and starts split processing.

Split processing allocates a new page and copies the contents of the old page to it. Because the page is derived from the hash value, the unnecessary records produced by the copy fall outside key searches.

In a real database, records are sometimes searched sequentially without going through the hash value. The split-in-progress indicator F is therefore set, and it is cleared later, once the self-adjustment function has run and turned every unnecessary record in the page into a deleted record. In a sequential search, the records in a page whose indicator F is set are rehashed, and only those that are not unnecessary records are treated as search targets.
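A sequential search that honours the split-in-progress indicator might look like the following Python sketch. The `Page` class, the mod 2^(d+1) membership test, and the sample records (page 2 of the FIG. 6 example copied to page 6) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Page:
    number: int
    records: List[int] = field(default_factory=list)
    splitting: bool = False        # split-in-progress indicator F


def belongs(key: int, page: Page, d: int) -> bool:
    """Rehash with h_{d+1} and compare with the page the record sits on."""
    return key % 2 ** (d + 1) == page.number


def sequential_scan(pages: List[Page], d: int) -> Iterator[int]:
    """Yield each live record once, skipping copies left by a page copy."""
    for page in pages:
        for key in page.records:
            if page.splitting and not belongs(key, page, d):
                continue           # unnecessary record, not a search target
            yield key


pages = [
    Page(2, [194, 78, 6], splitting=True),
    Page(3, [95, 23]),
    Page(6, [194, 78, 6], splitting=True),   # unchanged copy of page 2
]
result = list(sequential_scan(pages, d=2))
print(result)
```

Pages without the indicator are scanned without any rehashing, so the extra cost is paid only for pages whose cleanup is still pending.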

FIG. 3 shows an example of applying the present invention to the same linear hash case as FIG. 6, which was described as the conventional example.

The split pointer SP is currently at page 2. In the reorganization performed when it moves to page 3, the prime page PP and the overflow page OP of page 2 are copied unchanged, as shown in FIG. 3. A split-in-progress indicator (not shown) is then set, and the split pointer SP is advanced to point to page 3. Because this processing requires no rehashing, it completes quickly.

[Example of application to partial expansion hash]

In partial expansion hashing, the number of groups determines the number of pages copied. When the number of groups is 2, each page copy operation leaves each group and the new page consisting of two pages each (a prime page and an overflow page). When the number of groups is n, each group and the new page consist of n pages each. The rest is the same as for linear hashing.

FIG. 4 shows the processing configuration of one embodiment of the present invention for a linear hash structure.

The split control daemon processing unit 20 shown in FIG. 4 corresponds to the storage structure change processing unit 12 shown in FIG. 1. 21 is a buffer, and 22 is an address translation table that records, among other things, the relationship between storage locations in the secondary storage device 16 and the buffer 21.

The storage structure access processing unit 11 provides record insertion, record retrieval by key and by sequential search, record update, record deletion, and similar functions.

The split control daemon processing unit 20 provides prime page copying, overflow page copying, unnecessary record deletion, overflow page consolidation, and similar functions.

The buffer control unit 15 provides page reading, page writing, new page acquisition, updating of the address translation table 22, and similar functions.

When it accesses a record, the storage structure access processing unit 11 asks the buffer control unit 15 to read the page. The page read is associated with the secondary storage device 16 through the address translation table 22. When a record insertion request or the like fills the target page, the buffer control unit 15 allocates an overflow page and stores the record there.

The split control daemon processing unit 20 runs asynchronously with the storage structure access processing unit 11. When a split becomes necessary, it obtains the split-target page from the split pointer SP. In the figure this is page P, which currently has one overflow page (page O).

As the self-adjustment function of a database for online transaction processing, this split control divides split processing into several processing units so that system performance stays stable.

The flow of the processing units of split control in this embodiment is shown below. Each phase is processed by the split control daemon processing unit 20, and several instances can run concurrently. Various task control and process control methods for running programs concurrently are known, so a detailed description is omitted.

A new page entry is registered in the address translation table 22, and one copy of the page is made.

If an overflow page exists, a new page entry is registered and one copy of that page is made.

This is repeated until all overflow pages have been copied.

One record in the old page (copy source) or the new page (copy destination) is rehashed, and it is deleted if it is an unnecessary record.

Overflow pages are consolidated. When, after unnecessary records have been deleted, two or more pages have a utilization at or below some threshold, for example 40%, this processing merges them into one page and returns the freed overflow page. The consolidation may also be omitted, with an overflow page returned only once it has become completely empty.

The results of the above processing are written to the secondary storage device 16 in accordance with the buffer control logic.

The six processing units above may be executed individually or grouped into several phases. For example, an earlier group of units can be treated as a split control phase and a later group as a self-adjustment phase.
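The overflow-page consolidation among the processing units above can be illustrated with a minimal Python sketch. The function name `consolidate`, the list-of-lists page representation, and the `capacity` parameter are assumptions made for illustration; only the 40% threshold comes from the description, and the merge policy shown (fold each under-used page into an earlier under-used page while it fits) is one possible choice, not necessarily the one used in the embodiment.

```python
def consolidate(chain, capacity, threshold=0.4):
    """Merge under-used pages of one overflow chain.

    chain: list of pages, each page a list of records (hypothetical layout).
    Returns (kept_pages, freed_count); freed pages would be returned to the
    free-page pool.
    """
    kept = []
    freed = 0
    target = None  # index in `kept` of an under-used page that can absorb more
    for page in chain:
        low = len(page) <= threshold * capacity
        if low and target is not None and len(kept[target]) + len(page) <= capacity:
            kept[target].extend(page)  # fold this page into the earlier one
            freed += 1                 # its page frame can now be returned
        else:
            kept.append(list(page))
            if low:
                target = len(kept) - 1
    return kept, freed


chain = [[1, 2, 3], [4, 5, 6, 7, 8, 9, 10, 11], [12, 13], [14]]
kept, freed = consolidate(chain, capacity=10)
print(kept, freed)
# [[1, 2, 3, 12, 13, 14], [4, 5, 6, 7, 8, 9, 10, 11]] 2
```

Because consolidation only touches pages that have already been cleaned of unnecessary records, it can be scheduled as a separate unit from the copy and rehash work.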

In the example of FIG. 4, the following processing is performed specifically.

・Copy page P as page P'. In acquiring this new page, the buffer control unit 15 obtains a new page in primary storage.

・Register the P' entry in the address translation table 22.

・Set the split-in-progress indication on pages P and P'.

・Copy overflow page O to page O', and set the split-in-progress indication on each.

・Rehash the records in page P, one of the split pages, and delete unnecessary records.

・After all records have been processed, clear the split-in-progress indication.

・Process page P' in the same way as page P.

・Process pages O and O' in the same way as page P, except that an overflow page is returned once it has become completely empty.
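The separation of the copy step from the rehash-and-delete step in the list above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: keys are integers, the hash address is `key % modulus`, overflow pages are omitted, and the names `Page`, `copy_phase`, and `cleanup_phase` are hypothetical rather than taken from the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    records: list = field(default_factory=list)
    splitting: bool = False  # the split-in-progress indication


def copy_phase(table, p, p_new):
    """Register P' in the address table as a copy of P and flag both pages."""
    table[p_new] = Page(records=list(table[p].records))
    table[p].splitting = True
    table[p_new].splitting = True


def cleanup_phase(table, page_no, modulus):
    """Rehash one page, drop records that now belong elsewhere, clear the flag."""
    page = table[page_no]
    page.records = [k for k in page.records if k % modulus == page_no]
    page.splitting = False


# Hypothetical state: 4 pages, so splitting page 1 rehashes modulo 8
# and distributes its records between pages 1 and 5.
table = {n: Page() for n in range(4)}
table[1].records = [1, 5, 9, 13]

copy_phase(table, 1, 5)      # short step: both pages now hold all four keys
cleanup_phase(table, 1, 8)   # may run later and independently
cleanup_phase(table, 5, 8)
print(table[1].records, table[5].records)
# [1, 9] [5, 13]
```

Between `copy_phase` and the two `cleanup_phase` calls every record is still reachable on the page its new hash address points to, which is why the cleanup can be deferred to a background task.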

During the above processing, when the storage structure access processing unit 11 uses the sequential search function and finds the split-in-progress indication set on a page, it rehashes the records in that page to check whether each one is an unnecessary record.
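The check performed by the sequential search can be sketched as below. The sketch assumes integer keys hashed as `key % modulus` and a hypothetical `pages` mapping from page number to a `(records, splitting)` pair; neither detail is specified in the description.

```python
def scan(pages, modulus):
    """Yield every live record exactly once during a sequential search."""
    for page_no in sorted(pages):
        records, splitting = pages[page_no]
        for key in records:
            # On a page still marked split-in-progress, rehash the record and
            # skip it if it now belongs to the sibling page (stale copy).
            if splitting and key % modulus != page_no:
                continue
            yield key


pages = {
    1: ([1, 9], False),         # already cleaned up
    5: ([1, 5, 9, 13], True),   # copied from page 1, not yet cleaned up
}
print(list(scan(pages, 8)))
# [1, 9, 5, 13]
```

Without the rehash check, keys 1 and 9 would be reported twice, once from each copy.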

Although the description above uses a linear hash, a tree-structured hash, a partial expansion hash, and similar schemes can be handled with the same processing configuration.

In the embodiment described above, the overflow pages form a linked structure. However, a linear hash may itself be used as the overflow structure, as in the scheme known as recursive linear hashing. A binary tree structure can also be used as the overflow structure.

The split control daemon processing unit 20 shown in FIG. 4 operates asynchronously with the storage structure access processing unit 11, but it can also be made to operate synchronously with it. In that case, however, the system is no longer suited to full-scale online transaction processing.

In some configurations, the record search of the storage structure access processing unit 11 has no sequential search function.

[Effects of the Invention]

As described above, the present invention reduces the likelihood that transaction processing running in the foreground is left waiting on an exclusive lock for a long time because of the self-adjustment function running in the background. In addition, the costs of transaction processing and of the self-adjustment function that changes the storage structure can be balanced across the system as a whole.

[Brief description of the drawings]

FIG. 1 is a diagram illustrating the principle of the present invention; FIG. 2 is an example of application of the present invention to a tree-structured hash; FIG. 3 is an example of application of the present invention to a linear hash; FIG. 4 is an example of a processing configuration for a linear hash structure according to an embodiment of the present invention; FIG. 5 is an example of a conventional tree-structured hash; FIG. 6 is an example of a conventional linear hash; FIG. 7 is an example of conventional split processing; and FIG. 8 is a conceptual diagram of conventional split control.

In the figures, 10 denotes a processing device, 11 a storage structure access processing unit, 12 a storage structure change processing unit, 13 page copy processing, 14 unnecessary record deletion processing, 15 a buffer control unit, 16 a secondary storage device, and 17 a data storage unit.

Claims

(57) [Claims]

A split control method for dynamic hashing in a data processing system that manages data by a storage structure using a dynamic hash method, the method comprising: a processing step (13) of securing a new page at the time of a split and copying the contents of the split target page to the new page; and a processing step (14) of rehashing the split target page and the new page after the copy processing and deleting unnecessary records, wherein the two processing steps are configured to be operable separately.