JP3622443B2 - T-tree index construction method and apparatus, and storage medium storing T-tree index construction program

Info

Publication number: JP3622443B2
Application number: JP25723997A
Authority: JP
Inventors: 敏岡田; 淳一黒岩; 昌義梅田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-09-22
Filing date: 1997-09-22
Publication date: 2005-02-23
Anticipated expiration: 2017-09-22
Also published as: JPH1196058A

Description

【０００１】
【発明の属する技術分野】
本発明は、Ｔ木インッデックス構築方法及び装置及びＴ木インデックス構築プログラムを格納した記憶媒体に係り、特に、データベースをメモリ上に常駐させることで高速なデータアクセスを実現する、メモリ常駐データベース管理システム等におけるＴ木インデックスの構築方法及び装置及びＴ木インデックス構築プログラムを格納した記憶媒体に関する。
【０００２】
【従来の技術】
図８は、従来のＴ木インデックスの構成を示す。同図に示すように、従来のＴ木インデックス１０は、Ｔ木インデックスの構成要素であるノード１１１と当該ノード１１１の間をつなぐノード間ポインタ１１３から成り立っている。各ノード１１１は、キー値１２１と当該キー値１２１に対応するテーブル２０内のレコード２０２の位置を指すポインタ１１６を１組とするノード内要素１１２を含んでいる。ノード内要素１１２の数は、インデックス作成時に決定する同定値である。
【０００３】
レコード２０２検索時には、ノード間ポインタ１１３を使用し、各ノード１１１間を移動することにより、アプリケーションの指定した検索条件に合致するキー値１２１を検索する。
次に、キー値１２１と組になっているレコード２０２間のポインタ１１６を参照し、ポインタ１１６の指している目的のレコード２０２へアクセスする。
【０００４】
図９は、従来のＴ木インデックスにおけるノード構成を示す。
インデックスサイズの観点から見た場合、同図に示すように、ＣＬＭ１とＣＬＭ２の２カラムでキー値１２１を構成する場合と、ＣＬＭ１のみでキー値１２１を構成する場合とでは、ノード１１１のサイズが異なっている。これは、各ノード１１１の内部にキー値１２１を直接埋め込んでいるためであり、キー値１２１を構成するカラム２０１の数、長さによりノード１１１のサイズが変化し、ノード１１１のサイズは、Ｔ木インデックス１０毎に決まる。全てのノード内要素１１２にキー値１２１が挿入されていない場合には、その領域は空き１１５として管理される。
【０００５】
図１０は、従来のＴ木インデックスにおける重複キーの増加を示す。同図はＴ木インデックス１０における重複キー１３３の扱いについて示している。重複キー１３３であるＫＥＹ３は、直接ノード１１１の内部に埋め込まれている。このとき、ＫＥＹ３が更に挿入され、ＫＥＹ３の数が２から３に増加することによって、Ｔインデックス１０で扱うキー値１２１の数及び段数が増加する。このように、１つのキー値に対応するレコード数を表す重複度により検索時に参照対象となるキー１２１の数が変化し、場合によっては、Ｔ木インデックス１０の段数が変化する可能性がある。
【０００６】
図１１は、従来のＴ木インデックスにおけるキー値の挿入を示す。同図は、Ｔ木インデックス１０へのキー値１２１の挿入例を示している。Ｔ木インデックス１０にＫＥＹ６が挿入されたとき、ＫＥＹ５，ＫＥＹ６，ＫＥＹ７の順序性を保持するものであれば、ＫＥＹ４，ＫＥＹ５が移動し、ＫＥＹ６を挿入する領域を確保した後、ＫＥＹ６の挿入が行われる。このように、Ｔ木インデックス１０にキー値１２１が挿入された場合には、キー値１２１の移動が生じる可能性がある。
【０００７】
図１２は、従来のＴ木インデックスにおけるキー値の削除を示す。同図は、挿入時と同様に、従来のＴ木インデックス１０からのキー値１２１の削除を示している。Ｔ木インデックス１０からＫＥＹ６が削除されたとき、ＫＥＹ５，ＫＥＹ７，ＫＥＹ８の順序性を保持するのであれば、ＫＥＹ７，ＫＥＹ８の移動が行われる。このように、Ｔ木インデックス１０からキー値１２１を削除する場合にもキー値１２１の移動が生じる可能性がある。
【０００８】
図１３は、従来のＴ木インデックスのバックアップを示す。同図は、Ｔ木インデックス１０のバックアップ３０を取得している状態を示している。メモリ上に展開されたＴ木インデックス１０のバックアップ３０は、ディスク上に取得される。その内容は、Ｔ木インデックス１０のノード間の階層構造の情報、キー値１２１等のＴ木インデックス１０を構成する全ての情報である。
【０００９】
【発明が解決しようとする課題】
しかしながら、上記従来は、Ｔ木の各ノードの内部にキー値（重複キーを含む）を埋め込んでいたため、以下のような問題がある。
１．キーを構成するカラムの数、長さによって、ノードのサイズが変化するため、Ｔ木インデックスの構築がキー構成に影響され、Ｔ木インデックスの構築が複雑となり、構築に時間がかかる。
【００１０】
２．ノード内要素数は固定であるため、キー値のサイズが大きく、空きとして管理される領域が多い場合には、Ｔ木インデックスとして使用するメモリ量が増加する。
３．キーの挿入／削除時にキー値の移動が生じるため、レコードの挿入／削除の処理速度が遅い。
【００１１】
４．重複キーの重複度が増加するにつれて検索対象となるキーの数が増加し、多くのキー値との比較を行わなくてはならないため検索時の処理速度が悪化する。
５．インデックスに含まれる全ての情報についてバックアップを取得しているため、バックアップに要する時間が長い。
【００１２】
本発明は、上記の点に鑑みなされたもので、Ｔ木インデックス構築時間短縮、メモリ使用量削減、キー挿入／削除時の処理速度の向上、キーの重複時の検索速度の向上及びバックアップ時間の短縮を可能とするＴ木インッデックス構築方法及び装置及びＴ木インデックス構築プログラムを格納した記憶媒体を提供することを目的とする。
【００１３】
【課題を解決するための手段】
本発明は、データベースをメモリ上に常駐させ、高速なデータアクセスを実現するデータベース管理システムで使用するＴ木インデックスを構築するＴ木インデックス構築方法において、
Ｔ木インデックスを該Ｔ木インデックスの階層構造の情報を有するノードからなるＴ木本体、検索時に条件比較を行うキー値情報、一つキーに対して複数のレコードが対応する重複キー情報の３つの領域に分けて管理し、
Ｔ木インデックスのノード情報をキーを構成するカラム数、長さ及びキーの重複とは独立に構築し、
分割した３つの領域の中でＴ木本体を構成するノードのみを、ディスク上にバックアップを取得する。
【００１４】
図１は、本発明の原理構成図である。
本発明は、データベースをメモリ上に常駐させ、高速なデータアクセスを実現するデータベース管理システムで使用するＴ木インデックスを構築するＴ木インデックス構築装置において、
Ｔ木インデックスを該Ｔ木インデックスの階層構造の情報を有するノードからなるＴ木本体を格納するＴ木本体格納領域１００と、
検索時に条件比較を行うキー値情報を格納するキー値格納領域２００と、
一つキーに対して複数のレコードが対応する重複キー情報を格納する重複キー格納領域３００と、
Ｔ木本体格納領域１００、キー値格納領域２００及び重複キー格納領域３００の３つの領域を管理する管理手段４００とを有する。
【００１５】
上記のＴ木本体格納領域１００は、Ｔ木インデックスのノード情報をキーを構成するカラム数、長さ及びキーの重複とは独立に構築する。
上記の管理手段４００は、分割したＴ木本体格納領域１００、キー値格納領域２００及び重複キー格納領域３００の３つの領域の中でＴ木本体を構成するノードのみを、ディスク上にバックアップを取得するバックアップ取得手段４１０を含む。
【００１６】
上記のＴ木本体格納領域１００に格納されるノードは、ノード内要素にキー値格納領域内に格納されているキー値の位置を指すポインタを含む。
上記のキー値格納領域２００に格納されるキー値は、１つのキーに対して１つのレコードが対応している場合には、各キー値に対応するレコードの位置を指すポインタを保持し、
１つのキーに対して複数のレコードが対応している場合には、重複キー格納領域のポインタの集合内の一つの要素を指すポインタを保持する。
【００１７】
上記の重複キー格納領域３００に格納される重複キー情報は、重複キーに対応するレコードの位置を指すポインタを有する。
上記の管理手段４００は、Ｔ木本体格納領域、キー値格納領域及び重複キー格納領域内の要素をポインタで結合する手段を含む。
本発明は、データベースをメモリ上に常駐させ、高速なデータアクセスを実現するデータベース管理システムで使用するＴ木インデックスを構築するＴ木インデックス構築プログラムを格納した記憶媒体であって、
Ｔ木インデックスを該Ｔ木インデックスの階層構造の情報を有するノードからなるＴ木本体、検索時に条件比較を行うキー値情報、一つキーに対して複数のレコードが対応する重複キー情報の３つの領域に分けて管理する管理プロセスを有し、
管理プロセスは、
Ｔ木インデックスのノード情報をキーを構成するカラム数、長さ及びキーの重複とは独立に構築し、分割した３つの領域の中でＴ木本体を構成するノードのみを、ディスク上にバックアップを取得するプロセスを有する。
【００１８】
上記のように、本発明は、Ｔ木インデックスを以下の３領域に分割して格納する各領域内の要素をポインタで結合することにより、データへのアクセスを可能とする。
Ｔ木本体については、Ｔ木インデックスの各ノードを格納し、ノード間の階層構造の情報を保持する。
【００１９】
また、キー値格納領域については、キー値のみを格納し、管理する。
さらに、重複キー格納領域については、重複キーのみを管理する。
バックアップについては、Ｔ木本体のみを取得する。
これにより、従来の方式に比較して以下が可能となる。
１．キーを構成するカラムの数、長さとは独立にＴ木インデックスのノードの作成、ノード間の階層構造が構築できるため、インデックス構築の処理速度が向上する。
【００２０】
２．ノード内にキー値を埋め込まず、キー値を格納する領域は必要最小限に抑えることができるため、Ｔ木インデックス全体のメモリの使用量を少なくできる。
３．キー値の挿入／削除における、Ｔ木インデックスの成長／衰退がノード間のキー値の移動を伴わないで実施でき、キー値の挿入／削除時の処理速度が向上する。
【００２１】
４．重複キーを扱う場合に重複キーの重複度とは独立にＴ木インデックスの階層構造が構築でき、Ｔ木インデックスの検索時の処理速度が向上する。
５．バックアップのデータ量を削減でき、バックアップ取得に要する時間を短縮できる。
【００２２】
【発明の実施の形態】
図２は、本発明のＴ木インデックスの構成を示す。
同図は、本発明の構築方式を使用したＴ木インデックス１０における各領域の相関関係を示している。分割している各領域の内容は、以下の通りである。
１．Ｔ木本体領域１１：
Ｔ木インデックス１０構築のために必要なノード１１１を格納する領域である。各ノード１１１のノード内要素１１２は、キー値格納領域１２内に格納されているキー値の１２１の位置を指すポインタ１１４を保持している。
【００２３】
２．キー値格納領域１２：
検索時に使用するキー値１２１を格納する領域である。１つのキーに対して１レコードのみが対応している場合には、各キー値１２１は、キー値１２１に対応するレコード２０２の位置を指すポインタ１２２を保持している。１つのキーに対して複数のレコード２０２が対応する重複キーを扱う場合には、各キー値１２１は、重複キー格納領域１３内のポインタ１３２の集合１３１内の一つの要素を指すポインタ１２３を保持している。格納しているキー値１２１の並びには制限はない。
【００２４】
３．重複キー格納領域１３：
この領域は、重複キーを扱う場合にのみ作成される。重複キーに対応するレコード２０２の位置を指すポインタ１３２を管理している領域である。
図３は、本発明のＴ木インデックスのノードとキー値の構成を示す。
同図は、キー値１２１を構成するカラム２０１とＴ木インデックス１０の関係を例示している。ＣＬＭ１とＣＬＭ２の２つカラムでキー値１２１を構成する場合と、ＣＬＭ１のみでキー値１２１を構成する場合のＴ本本体１１の各ノード１１１のサイズはどちらも同一となる。そのため、新しいノードの作成時、１つのテーブルに複数のインデックスを張る場合等に、インデックスを構築する全てのノードが同じサイズとなり、Ｔ木インデックス１０の設計及び構築が簡易になる。
【００２５】
また、各ノード１１１の要素１１２内でキー値１２１へのポインタ１１４を格納していない空き１１５が生じてもこの空き１１５に対応するキー値１２１の領域は、キー値格納領域１２内に確保しないため、Ｔ木インデックス１０としてのメモリの使用量を少なく抑えることができる。
【００２６】
【実施例】
以下、図面と共に、本発明の実施例を説明する。
前述の図２、図３の構成に基づいて以下の実施例を説明する。
図４は、本発明の一実施例の重複キーの増加を示す。同図は、重複キー１３３の重複度とＴ木本体１１の関係を例示している。テーブルへのレコードの追加によりＫＥＹ３の重複度が２から３に増加した場合には、ＫＥＹ３のポインタ１２３が指している重複キー格納領域１３内のポインタの集合１３１に挿入されたレコードの位置を指すポインタ１３２を追加する。
【００２７】
Ｔ木本体１１、キー値格納領域１２は重要度が増加しても構成は変わらない。そのため、重複キーによるキー値１２１検索の処理速度の低下を防ぐことができ、重複のないキー値１２１に対する検索の処理速度が向上する。
図５は、本発明の一実施例のキー値の挿入を示す。同図は、Ｔ木インデックス１０にキー値１２１を挿入する方法を例示している。Ｔ木インデックス１０にＫＥＹ６を挿入する例を以下に示す。
【００２８】
１．ＫＥＹ６をキー値格納領域１２に挿入する。
２．Ｔ木本体１１のノード１１１内でＫＥＹ６を指すポインタ１１４を挿入する位置を見つける。キーの順序性より、ＫＥＹ５，ＫＥＹ６，ＫＥＹ７の順序となるように挿入するため、ポインタＰｔ５とポインタＰｔ７の間にＫＥＹ６を指すポインタＰｔ６を挿入する。このとき、ポインタＰｔ６を挿入するための領域がないため、ポインタＰｔ４，ポインタＰｔ５についてポインタの張り替えを行い、ポインタＰｔ６を挿入する領域を確保する。
【００２９】
３．ＫＥＹ６を指すポインタＰｔ６を挿入する。
図６は、本発明の一実施例のキー値の削除を示す。同図は、Ｔ木インデックス１０からキー値１２１を削除する方法を例示している。Ｔ木インデックス１０からＫＥＹ６を削除する例を以下に示す。
１．ＫＥＹ６を指すポインタ１１４Ｐｔ６を検索する。
【００３０】
２．キー値格納領域１２からＫＥＹ６を削除する。
３．ポインタＰｔ６をノード１１１から削除する。
４．ポインタＰｔ６を削除した領域が空き１１５となり、順序性が保てなくなるため、ポインタＰｔ７、ポインタＰｔ８についてポインタ１１４の張り替えを行う。
【００３１】
図７は、本発明の一実施例のバックアップを示す。同図は、バックアップ３０を例示している。Ｔ木インデックス１０を構成するＴ木本体１１、キー値格納領域１２、重複キー格納領域１３の３領域のうち、バックアップ３０をディスク上に取得するのは、Ｔ木本体１１のみとする。バックアップ３０をＴ木本体１１のみとすることで、バックアップ３０のデータ量を削減でき、通常運用の処理速度に影響を与えるバックアップ取得時のディスクへの書込み時間を短縮することができる。
【００３２】
障害からのリカバリ時には、Ｔ木本体１１のバックアップ３０をディスクからメモリ上にロードする。次に、データベース内のテーブル２０のバックアップを参照して、キー値１２１を取り出し、Ｔ木本体１１の各ノード１１１の各要素１１２とポインタ１１４で結合することで、キー値格納領域１２、重複キー格納領域１３を作成し、レコード２０２へのアクセスが可能な状態とする。
【００３３】
また、本発明は、上記において、Ｔ木インデックスを該Ｔ木インデックスの階層構造の情報を有するノードからなるＴ木本体、検索時に条件比較を行うキー値情報、一つキーに対して複数のレコードが対応する重複キー情報の３つの領域に分けて管理し、Ｔ木インデックスのノード情報をキーを構成するカラム数、長さ及びキーの重複とは独立に構築し、分割した３つの領域の中でＴ木本体を構成するノードのみを、ディスク上にバックアップを取得する処理をプログラムとして構築し、ディスク装置や、フロッピーディスク、ＣＤ−ＲＯＭ等の可搬記憶媒体に格納し、データベースをメモリ上に常駐させる場合に利用することにより汎用的に利用することが可能である。
【００３４】
なお、本発明は、上記の実施例に限定されることなく、特許請求の範囲内で種々変更・応用が可能である。
【００３５】
【発明の効果】
上述のように、本発明によれば、メモリ常駐データベース管理システムのＴ木インデックスにおいて、キー値を構成するカラムの数、長さとは独立にＴ木本体を構成でき、Ｔ木の成長、衰退時に、ノード間でキー値を移動することなく、ポインタの張り替えのみで対応できるため、Ｔ木の構築が高速化される。
【００３６】
また、キー値を格納する領域を最小限に抑えることができるため、メモリ使用量が削減できる。重複キーを扱う場合にもキー値は一つであるため、重複度が高いデータでも効率良く検索できる。
さらに、バックアップについては、Ｔ木本体のみであるため、バックアップに要する時間が削減できる。
【図面の簡単な説明】
【図１】本発明の原理を説明するための図である。
【図２】本発明のＴ木インデックスの構成図である。
【図３】本発明のＴ木インデックスのノードとキー値の構成図である。
【図４】本発明の一実施例の重複キーの増加を示す図である。
【図５】本発明の一実施例のキー値の挿入を示す図である。
【図６】本発明の一実施例のキー値の削除を示す図である。
【図７】本発明の一実施例のバックアップを示す図である。
【図８】従来のＴ木インデックスの構成図である。
【図９】従来のＴ木インデックスにおけるノード構成図である。
【図１０】従来のＴ木インデックスにおける重複キーの増加を示す図である。
【図１１】従来のＴ木インデックスにおけるキー値の挿入を示す図である。
【図１２】従来のＴ木インデックスにおけるキー値の削除を示す図である。
【図１３】従来のＴ木インデックスのバックアップを示す図である。
【符号の説明】
１０Ｔ木インデックス
１１Ｔ木本体
１２キー値格納領域
１３重複キー格納領域
２０テーブル
１００Ｔ木本体格納領域
１１１ノード
１１２ノード内要素
１１３ノード間ポインタ
１１４ノード内要素とキー値間ポインタ
１１５空き
１１６ノード内要素とレコード間ポインタ
１２１キー値
１２２キー値とレコード間ポインタ
１２３キー値と重複キーポインタ集合間ポインタ
１３１重複キーポインタ集合内要素とレコード間ポインタ
１３２重複キーポインタ集合内要素とレコード間ポインタ
１３３重複キー
２００キー値格納領域
３００重複キー格納領域
４００管理手段
４１０バックアップ取得手段

[0001]
[Technical field of the invention]
The present invention relates to a T-tree index construction method and apparatus, and to a storage medium storing a T-tree index construction program. In particular, it relates to a T-tree index construction method and apparatus, and a storage medium storing a T-tree index construction program, for use in a memory-resident database management system or the like that achieves high-speed data access by keeping the database resident in memory.
[0002]
[Prior art]
FIG. 8 shows the structure of a conventional T-tree index. As shown in the figure, the conventional T-tree index 10 consists of nodes 111, which are the components of the T-tree index, and inter-node pointers 113 that link the nodes 111. Each node 111 contains in-node elements 112, each of which pairs a key value 121 with a pointer 116 to the position of the corresponding record 202 in the table 20. The number of in-node elements 112 is a fixed value determined when the index is created.
[0003]
To search for a record 202, the inter-node pointers 113 are followed from node 111 to node 111 until a key value 121 that matches the search condition specified by the application is found.
Next, the pointer 116 paired with that key value 121 is followed to access the target record 202.
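As a rough illustration of the conventional layout described above, the following Python sketch models a node whose elements embed the key value next to the record pointer, together with a search that follows the inter-node pointers. The names `ConvNode` and `search`, the use of integers as record positions, and the bounding-node search rule are assumptions made for illustration; they are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConvNode:
    # Each in-node element (112) embeds a key value (121) together with a
    # record pointer (116); here the pointer is a plain integer position.
    elements: List[Tuple[str, int]] = field(default_factory=list)  # sorted by key
    left: Optional["ConvNode"] = None    # inter-node pointer (113)
    right: Optional["ConvNode"] = None   # inter-node pointer (113)


def search(node: Optional[ConvNode], key: str) -> Optional[int]:
    """Follow inter-node pointers to the node bounding `key`, then scan it."""
    while node is not None and node.elements:
        if key < node.elements[0][0]:
            node = node.left
        elif key > node.elements[-1][0]:
            node = node.right
        else:
            for value, record_pos in node.elements:
                if value == key:
                    return record_pos
            return None
    return None


# Hypothetical three-node index.
root = ConvNode(
    elements=[("KEY4", 3), ("KEY5", 4)],
    left=ConvNode(elements=[("KEY1", 0), ("KEY2", 1)]),
    right=ConvNode(elements=[("KEY7", 6), ("KEY8", 7)]),
)
found = search(root, "KEY7")
missing = search(root, "KEY3")
```

Because the key values live inside the nodes, every comparison in this sketch touches node storage whose size depends on the key definition, which is the property the following paragraphs identify as a problem.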
[0004]
FIG. 9 shows a node configuration in a conventional T-tree index.
In terms of index size, as shown in the figure, the size of a node 111 differs between the case where the key value 121 is composed of the two columns CLM1 and CLM2 and the case where it is composed of CLM1 alone. This is because the key value 121 is embedded directly in each node 111: the size of the node 111 changes with the number and length of the columns 201 that make up the key value 121, so the node size is determined separately for each T-tree index 10. When not every in-node element 112 holds a key value 121, the unused area is managed as free space 115.
[0005]
FIG. 10 shows an increase in duplicate keys in a conventional T-tree index, illustrating how the T-tree index 10 handles a duplicate key 133. KEY3, the duplicate key 133, is embedded directly in the node 111. When another KEY3 is inserted and the number of KEY3 entries increases from 2 to 3, the number of key values 121 handled by the T-tree index 10, and the number of levels, increase. Thus the number of keys 121 examined during a search varies with the degree of duplication, that is, the number of records corresponding to one key value, and in some cases the number of levels of the T-tree index 10 may change.
[0006]
FIG. 11 shows the insertion of a key value 121 into a conventional T-tree index 10. When KEY6 is inserted into the T-tree index 10 and the order KEY5, KEY6, KEY7 is to be preserved, KEY4 and KEY5 are moved to make room for KEY6, and KEY6 is then inserted. Thus, inserting a key value 121 into the T-tree index 10 may require key values 121 to be moved.
[0007]
FIG. 12 shows the deletion of a key value 121 from a conventional T-tree index 10. When KEY6 is deleted from the T-tree index 10 and the order KEY5, KEY7, KEY8 is to be preserved, KEY7 and KEY8 are moved. Thus, as with insertion, deleting a key value 121 from the T-tree index 10 may require key values 121 to be moved.
[0008]
FIG. 13 shows the backup of a conventional T-tree index, with a backup 30 of the T-tree index 10 being taken. The backup 30 of the T-tree index 10, which resides in memory, is written to disk. It contains all of the information that makes up the T-tree index 10, including the hierarchical structure between its nodes and the key values 121.
[0009]
[Problems to be solved by the invention]
However, because the conventional technique described above embeds key values (including duplicate keys) inside each node of the T-tree, it has the following problems.
1. Because the node size changes with the number and length of the columns that make up the key, construction of the T-tree index depends on the key structure, which makes construction complicated and time-consuming.
[0010]
2. Because the number of elements per node is fixed, the memory used by the T-tree index increases when key values are large and many areas are managed as free space.
3. Because key values are moved when keys are inserted or deleted, record insertion and deletion are slow.
[0011]
4. As the degree of duplication of duplicate keys increases, the number of keys to be searched increases and comparisons against many key values are required, so search performance degrades.
5. Because the backup covers all of the information in the index, taking a backup takes a long time.
[0012]
The present invention has been made in view of the above points. Its object is to provide a T-tree index construction method and apparatus, and a storage medium storing a T-tree index construction program, that shorten T-tree index construction time, reduce memory usage, speed up key insertion and deletion, speed up searches when keys are duplicated, and shorten backup time.
[0013]
[Means for Solving the Problems]
The present invention is a T-tree index construction method for constructing a T-tree index used in a database management system that achieves high-speed data access by keeping the database resident in memory, wherein
the T-tree index is managed in three separate areas: a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index, key value information that is compared against search conditions, and duplicate key information for keys to which a plurality of records correspond;
the node information of the T-tree index is constructed independently of the number and length of the columns that make up the key and of key duplication; and
of the three areas, only the nodes that make up the T-tree body are backed up to disk.
[0014]
FIG. 1 is a diagram showing the principle of the present invention.
The present invention is also a T-tree index construction apparatus for constructing a T-tree index used in a database management system that achieves high-speed data access by keeping the database resident in memory, comprising:
a T-tree body storage area 100 that stores a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index;
a key value storage area 200 that stores key value information that is compared against search conditions;
a duplicate key storage area 300 that stores duplicate key information for keys to which a plurality of records correspond; and
management means 400 that manages the three areas, namely the T-tree body storage area 100, the key value storage area 200, and the duplicate key storage area 300.
[0015]
In the T-tree body storage area 100, the node information of the T-tree index is constructed independently of the number and length of the columns that make up the key and of key duplication.
The management means 400 includes backup acquisition means 410 that backs up to disk only the nodes that make up the T-tree body, out of the three areas consisting of the T-tree body storage area 100, the key value storage area 200, and the duplicate key storage area 300.
[0016]
Each node stored in the T-tree body storage area 100 holds, in its in-node elements, pointers to the positions of key values stored in the key value storage area.
Each key value stored in the key value storage area 200 holds a pointer to the position of the corresponding record when one record corresponds to the key,
and holds a pointer to one element in a set of pointers in the duplicate key storage area when a plurality of records correspond to the key.
[0017]
The duplicate key information stored in the duplicate key storage area 300 has pointers to the positions of the records corresponding to the duplicate key.
The management means 400 includes means for linking the elements in the T-tree body storage area, the key value storage area, and the duplicate key storage area with pointers.
The present invention is also a storage medium storing a T-tree index construction program for constructing a T-tree index used in a database management system that achieves high-speed data access by keeping the database resident in memory,
the program having a management process that manages the T-tree index in three separate areas: a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index, key value information that is compared against search conditions, and duplicate key information for keys to which a plurality of records correspond,
wherein the management process
includes a process that constructs the node information of the T-tree index independently of the number and length of the columns that make up the key and of key duplication, and that backs up to disk only the nodes that make up the T-tree body out of the three areas.
[0018]
As described above, the present invention stores the T-tree index divided into the following three areas and makes data accessible by linking the elements in each area with pointers.
The T-tree body stores the nodes of the T-tree index and holds the information on the hierarchical structure between nodes.
[0019]
The key value storage area stores and manages only key values.
The duplicate key storage area manages only duplicate keys.
Only the T-tree body is backed up.
Compared with the conventional method, this makes the following possible.
1. Because T-tree index nodes can be created, and the hierarchical structure between nodes can be built, independently of the number and length of the columns that make up the key, index construction is faster.
[0020]
2. Because key values are not embedded in the nodes and the area that stores key values can be kept to the minimum necessary, the memory used by the T-tree index as a whole can be reduced.
3. When key values are inserted or deleted, the T-tree index can grow or shrink without moving key values between nodes, so key insertion and deletion are faster.
[0021]
4. When duplicate keys are handled, the hierarchical structure of the T-tree index can be built independently of their degree of duplication, so searches of the T-tree index are faster.
5. The amount of backup data can be reduced, shortening the time required to take a backup.
[0022]
[Embodiments of the invention]
FIG. 2 shows the structure of the T-tree index of the present invention.
The figure shows how the areas of a T-tree index 10 built with the construction method of the present invention relate to one another. The contents of each area are as follows.
1. T-tree body area 11:
This area stores the nodes 111 needed to build the T-tree index 10. Each in-node element 112 of a node 111 holds a pointer 114 to the position of a key value 121 stored in the key value storage area 12.
[0023]
2. Key value storage area 12:
This area stores the key values 121 used in searches. When only one record corresponds to a key, the key value 121 holds a pointer 122 to the position of the corresponding record 202. When the key is a duplicate key, with a plurality of records 202 corresponding to it, the key value 121 holds a pointer 123 to one element in a set 131 of pointers 132 in the duplicate key storage area 13. No restriction is placed on the order in which the key values 121 are stored.
[0024]
3. Duplicate key storage area 13:
This area is created only when duplicate keys are handled. It manages the pointers 132 to the positions of the records 202 corresponding to duplicate keys.
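The three areas just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names (`Node`, `KeyEntry`, `TTreeIndex`), the use of dictionary keys as pointers, and the modelling of pointer 123 as the identifier of a whole pointer set (rather than a pointer to one element of it) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class KeyEntry:
    value: tuple                    # key value (121), one item per key column
    record: Optional[int] = None    # pointer 122: record position for a unique key
    dup_set: Optional[int] = None   # pointer 123: id of a pointer set (131)


@dataclass
class Node:
    slots: List[Optional[int]]      # pointers 114 into the key store; None = free (115)
    left: Optional["Node"] = None   # inter-node pointers (113)
    right: Optional["Node"] = None


@dataclass
class TTreeIndex:
    root: Optional[Node] = None                               # T-tree body (11)
    keys: Dict[int, KeyEntry] = field(default_factory=dict)   # key value storage area (12)
    dups: Dict[int, List[int]] = field(default_factory=dict)  # duplicate key storage area (13)

    def records_for(self, key_id: int) -> List[int]:
        """Resolve a key entry to the record positions it leads to."""
        entry = self.keys[key_id]
        if entry.dup_set is not None:
            return list(self.dups[entry.dup_set])
        return [entry.record]

    def find(self, value: tuple) -> List[int]:
        """Walk the body, comparing key values reached through pointers 114."""
        node = self.root
        while node is not None:
            ids = [s for s in node.slots if s is not None]
            if not ids:
                return []
            if value < self.keys[ids[0]].value:
                node = node.left
            elif value > self.keys[ids[-1]].value:
                node = node.right
            else:
                for key_id in ids:
                    if self.keys[key_id].value == value:
                        return self.records_for(key_id)
                return []
        return []


# Hypothetical single-node index: KEY3 is a duplicate key with two records.
idx = TTreeIndex()
idx.keys = {
    0: KeyEntry(("KEY1",), record=10),
    1: KeyEntry(("KEY3",), dup_set=0),
    2: KeyEntry(("KEY5",), record=12),
}
idx.dups = {0: [20, 21]}
idx.root = Node(slots=[0, 1, 2, None])
```

In this sketch the node holds only fixed-size references, so neither the number of key columns nor the number of duplicates changes the shape of `Node`.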
FIG. 3 shows the structure of the nodes and key values of the T-tree index of the present invention.
The figure illustrates the relationship between the columns 201 that make up the key value 121 and the T-tree index 10. The size of each node 111 of the T-tree body 11 is the same whether the key value 121 is composed of the two columns CLM1 and CLM2 or of CLM1 alone. As a result, all nodes that make up an index have the same size, for example when a new node is created or when several indexes are defined on one table, which simplifies the design and construction of the T-tree index 10.
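A small worked calculation may make the node-size claim concrete. The figures used here are assumptions chosen only for illustration and do not appear in the patent: 8-byte pointers, four elements per node, two inter-node pointers per node, an 8-byte CLM1, and a 16-byte CLM2.

```python
POINTER_SIZE = 8        # assumed pointer width in bytes
ELEMENTS_PER_NODE = 4   # assumed fixed number of in-node elements
CHILD_POINTERS = 2      # assumed inter-node pointers per node


def conventional_node_size(column_lengths):
    """Node size when each element embeds the key value plus a record pointer."""
    key_size = sum(column_lengths)
    return ELEMENTS_PER_NODE * (key_size + POINTER_SIZE) + CHILD_POINTERS * POINTER_SIZE


def separated_node_size(column_lengths):
    """Node size when each element holds only a pointer to the key value.

    The column lengths are accepted but deliberately unused: the key
    definition does not influence the node layout.
    """
    return ELEMENTS_PER_NODE * POINTER_SIZE + CHILD_POINTERS * POINTER_SIZE


clm1_only = [8]         # key composed of CLM1 (assumed 8 bytes)
clm1_and_clm2 = [8, 16] # key composed of CLM1 and CLM2 (assumed 8 + 16 bytes)

conventional_sizes = (conventional_node_size(clm1_only),
                      conventional_node_size(clm1_and_clm2))
separated_sizes = (separated_node_size(clm1_only),
                   separated_node_size(clm1_and_clm2))
```

Under these assumed sizes the conventional node grows from 80 to 144 bytes when CLM2 is added to the key, while the pointer-only node stays at 48 bytes in both cases.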
[0025]
In addition, even when an element 112 of a node 111 is a free slot 115 that holds no pointer 114 to a key value 121, no area for a corresponding key value 121 is reserved in the key value storage area 12, so the memory used by the T-tree index 10 can be kept low.
[0026]
[Examples]
Examples of the present invention are described below with reference to the drawings.
The following examples are based on the configurations of FIGS. 2 and 3 described above.
FIG. 4 shows an increase in duplicate keys in one embodiment of the present invention, illustrating the relationship between the degree of duplication of a duplicate key 133 and the T-tree body 11. When the addition of a record to the table increases the degree of duplication of KEY3 from 2 to 3, a pointer 132 to the position of the inserted record is added to the pointer set 131 in the duplicate key storage area 13 that the pointer 123 of KEY3 points to.
[0027]
The structure of the T-tree body 11 and the key value storage area 12 does not change even when the degree of duplication increases. This prevents duplicate keys from slowing down searches for key values 121, and it improves search speed for key values 121 that have no duplicates.
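The duplicate-key behaviour described above can be sketched as follows. The dictionary layout, the record positions, and the function name `add_record` are hypothetical. Promoting a previously unique key to a duplicate key is not described in the text and is included here only as an assumption so that the function is complete.

```python
from typing import Dict, List, Optional

# Key value storage area (12): each entry holds either a record pointer (122)
# or the id of a pointer set in the duplicate key storage area (123).
key_store: Dict[str, Dict[str, Optional[int]]] = {
    "KEY1": {"record": 100, "dup_set": None},
    "KEY3": {"record": None, "dup_set": 0},
}
# Duplicate key storage area (13): pointer sets (131) of record pointers (132).
dup_store: Dict[int, List[int]] = {0: [102, 107]}
# One node of the T-tree body (11); pointers 114 are shown as key names.
node_slots: List[Optional[str]] = ["KEY1", "KEY3", None, None]


def add_record(key: str, record_pos: int) -> None:
    """Register one more record for an existing key without touching the body."""
    entry = key_store[key]
    if entry["dup_set"] is None:
        # Assumption: a unique key becomes a duplicate key by moving its
        # record pointer into a newly created pointer set.
        set_id = max(dup_store, default=-1) + 1
        dup_store[set_id] = [entry["record"]]
        entry["record"], entry["dup_set"] = None, set_id
    dup_store[entry["dup_set"]].append(record_pos)


slots_before = list(node_slots)
keys_before = len(key_store)
add_record("KEY3", 115)   # degree of duplication of KEY3 goes from 2 to 3
```

After the call, only the pointer set for KEY3 has grown; the node slots and the number of stored key values are the same as before.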
FIG. 5 shows the insertion of a key value in one embodiment of the present invention, illustrating how a key value 121 is inserted into the T-tree index 10. An example in which KEY6 is inserted into the T-tree index 10 is given below.
[0028]
1. KEY6 is inserted into the key value storage area 12.
2. The position at which to insert the pointer 114 to KEY6 is located within the node 111 of the T-tree body 11. To preserve key order, the keys must be ordered KEY5, KEY6, KEY7, so the pointer Pt6 to KEY6 is to be inserted between the pointers Pt5 and Pt7. Because there is no free slot there for the pointer Pt6, the pointers Pt4 and Pt5 are re-linked to make room for it.
[0029]
3. The pointer Pt6 to KEY6 is inserted.
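The three insertion steps can be sketched in Python as follows. This is a single-node illustration under assumptions: the key store is a plain list whose indices stand in for pointers 114, the node has one free slot at its left end, and node overflow or splitting is not handled. The function name `insert_key` is hypothetical.

```python
from typing import List, Optional


def insert_key(slots: List[Optional[int]], key_store: List[str], value: str) -> int:
    """Insert `value` by adding it to the key store and re-linking pointers."""
    used = [s for s in slots if s is not None]
    if len(used) == len(slots):
        raise OverflowError("node is full; splitting is outside this sketch")

    # Step 1: store the key value; its position in the key store is unrestricted.
    key_store.append(value)
    key_id = len(key_store) - 1

    # Step 2: find where the new pointer belongs by comparing the key values
    # that the existing pointers refer to.
    pos = 0
    while pos < len(used) and key_store[used[pos]] < value:
        pos += 1

    # Step 3: insert the pointer. Only pointers are rewritten in the node;
    # the key values already in the key store stay where they are.
    used.insert(pos, key_id)
    free = len(slots) - len(used)
    slots[:] = [None] * free + used
    return key_id


# Hypothetical node holding Pt4, Pt5, Pt7 with one free slot on the left.
key_store = ["KEY4", "KEY5", "KEY7"]
slots: List[Optional[int]] = [None, 0, 1, 2]
new_id = insert_key(slots, key_store, "KEY6")
ordered_keys = [key_store[s] for s in slots if s is not None]
```

Here the pointers to KEY4 and KEY5 shift one slot to the left to make room, while the key values themselves remain at their original positions in `key_store`.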
FIG. 6 shows the deletion of a key value in one embodiment of the present invention, illustrating how a key value 121 is deleted from the T-tree index 10. An example in which KEY6 is deleted from the T-tree index 10 is given below.
1. The pointer 114 (Pt6) that points to KEY6 is located.
[0030]
2. KEY6 is deleted from the key value storage area 12.
3. The pointer Pt6 is deleted from the node 111.
4. Because the slot from which the pointer Pt6 was deleted becomes a free slot 115 and the ordering would otherwise be broken, the pointers 114 are re-linked for the pointers Pt7 and Pt8.
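The four deletion steps can be sketched in the same spirit. The dictionary-based key store, the identifiers 5 to 8, and the function name `delete_key` are assumptions for illustration; node underflow and merging are not handled.

```python
from typing import Dict, List, Optional


def delete_key(slots: List[Optional[int]], key_store: Dict[int, str], value: str) -> None:
    """Delete `value` from the key store and close the gap in the node."""
    # Step 1: locate the pointer that refers to the key value.
    for index, key_id in enumerate(slots):
        if key_id is not None and key_store[key_id] == value:
            break
    else:
        raise KeyError(value)

    # Step 2: delete the key value from the key value storage area.
    del key_store[key_id]

    # Step 3: delete the pointer from the node, leaving a free slot.
    slots[index] = None

    # Step 4: re-link the remaining pointers so that they stay contiguous
    # and ordered; no key value is moved.
    used = [s for s in slots if s is not None]
    slots[:] = used + [None] * (len(slots) - len(used))


# Hypothetical full node holding Pt5, Pt6, Pt7, Pt8.
key_store = {5: "KEY5", 6: "KEY6", 7: "KEY7", 8: "KEY8"}
slots: List[Optional[int]] = [5, 6, 7, 8]
delete_key(slots, key_store, "KEY6")
remaining = [key_store[s] for s in slots if s is not None]
```

After the call, the pointers to KEY7 and KEY8 have each moved one slot to the left and the last slot is free.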
[0031]
FIG. 7 shows the backup in one embodiment of the present invention, illustrating the backup 30. Of the three areas that make up the T-tree index 10, namely the T-tree body 11, the key value storage area 12, and the duplicate key storage area 13, only the T-tree body 11 is backed up to disk as the backup 30. Limiting the backup 30 to the T-tree body 11 reduces the amount of backup data and shortens the disk write time during backup, which affects the processing speed of normal operation.
[0032]
When recovering from a failure, the backup 30 of the T-tree body 11 is loaded from disk into memory. Next, the key values 121 are extracted by referring to the backup of the table 20 in the database and are linked by pointers 114 to the elements 112 of the nodes 111 of the T-tree body 11. This recreates the key value storage area 12 and the duplicate key storage area 13 and makes the records 202 accessible again.
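One way the body-only backup and the recovery could fit together is sketched below. The text does not specify how restored slots are matched to key values, so this sketch makes a strong assumption: the backup records only the tree shape and the number of used slots per node, and recovery assigns the distinct key values from the table, in sorted order, to the used slots in an in-order traversal. The JSON format, the dictionary-based nodes, and the function names `backup_body` and `recover` are all hypothetical.

```python
import json
from collections import defaultdict


def backup_body(node):
    """Serialize only the T-tree body: tree shape and slot usage, no key values."""
    if node is None:
        return None
    return {
        "used": sum(s is not None for s in node["slots"]),
        "capacity": len(node["slots"]),
        "left": backup_body(node["left"]),
        "right": backup_body(node["right"]),
    }


def recover(body_json, table):
    """Rebuild the key and duplicate key areas from the table and re-link the body."""
    body = json.loads(body_json)

    # Extract key values from the table backup, grouping record positions per key.
    groups = defaultdict(list)
    for position, row in enumerate(table):
        groups[row["key"]].append(position)

    key_store, dup_store = {}, {}
    for key_id, value in enumerate(sorted(groups)):
        positions = groups[value]
        if len(positions) == 1:
            key_store[key_id] = {"value": value, "record": positions[0]}
        else:
            dup_store[key_id] = positions
            key_store[key_id] = {"value": value, "dup_set": key_id}

    next_id = iter(range(len(key_store)))

    def relink(b):
        # In-order traversal: left subtree, this node's slots, right subtree.
        if b is None:
            return None
        left = relink(b["left"])
        slots = [next(next_id) for _ in range(b["used"])]
        slots += [None] * (b["capacity"] - b["used"])
        right = relink(b["right"])
        return {"slots": slots, "left": left, "right": right}

    return relink(body), key_store, dup_store


# Hypothetical two-node body and a four-row table in which KEY3 is duplicated.
root = {
    "slots": [1, 2, None],
    "left": {"slots": [0, None, None], "left": None, "right": None},
    "right": None,
}
table = [{"key": "KEY1"}, {"key": "KEY3"}, {"key": "KEY2"}, {"key": "KEY3"}]

body_json = json.dumps(backup_body(root))
restored, key_store, dup_store = recover(body_json, table)
```

The serialized body contains no key values at all, which is what keeps the backup small in this sketch; the cost is that recovery has to rescan the table backup to rebuild the other two areas.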
[0033]
In addition, the processing described above can be implemented as a program: managing the T-tree index in three separate areas (a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index, key value information that is compared against search conditions, and duplicate key information for keys to which a plurality of records correspond), constructing the node information of the T-tree index independently of the number and length of the columns that make up the key and of key duplication, and backing up to disk only the nodes that make up the T-tree body out of the three areas. Stored on a disk device or on a portable storage medium such as a floppy disk or CD-ROM, the program can be used generally wherever a database is kept resident in memory.
[0034]
The present invention is not limited to the above-described embodiments, and various modifications and applications can be made within the scope of the claims.
[0035]
[Effects of the invention]
As described above, according to the present invention, in a T-tree index of a memory-resident database management system, the T-tree body can be built independently of the number and length of the columns that make up the key value, and growth and shrinkage of the T-tree can be handled by re-linking pointers alone, without moving key values between nodes, so construction of the T-tree is faster.
[0036]
In addition, because the area that stores key values can be kept to a minimum, memory usage can be reduced. Because each key value is stored only once even when duplicate keys are handled, data with a high degree of duplication can still be searched efficiently.
Furthermore, because only the T-tree body is backed up, the time required for backup can be reduced.
[Brief description of the drawings]
FIG. 1 is a diagram for explaining the principle of the present invention.
FIG. 2 is a configuration diagram of a T-tree index according to the present invention.
FIG. 3 is a configuration diagram of nodes and key values of a T-tree index according to the present invention.
FIG. 4 is a diagram illustrating an increase in duplicate keys according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating insertion of a key value according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating deletion of a key value according to an embodiment of the present invention.
FIG. 7 is a diagram showing backup according to an embodiment of the present invention.
FIG. 8 is a configuration diagram of a conventional T-tree index.
FIG. 9 is a node configuration diagram in a conventional T-tree index.
FIG. 10 is a diagram showing an increase in duplicate keys in a conventional T-tree index.
FIG. 11 is a diagram illustrating insertion of a key value in a conventional T-tree index.
FIG. 12 is a diagram showing deletion of a key value in a conventional T-tree index.
FIG. 13 is a diagram showing backup of a conventional T-tree index.
[Explanation of symbols]
10 T-tree index
11 T-tree body
12 Key value storage area
13 Duplicate key storage area
20 Table
100 T-tree body storage area
111 Node
112 In-node element
113 Inter-node pointer
114 Pointer from in-node element to key value
115 Free space
116 Pointer from in-node element to record
121 Key value
122 Pointer from key value to record
123 Pointer from key value to duplicate key pointer set
131 Set of duplicate key pointers
132 Pointer from element of duplicate key pointer set to record
133 Duplicate key
200 Key value storage area
300 Duplicate key storage area
400 Management means
410 Backup acquisition means

Claims

1. A T-tree index construction method for constructing a T-tree index used in a database management system that achieves high-speed data access by keeping a database resident in memory, characterized in that:
the T-tree index is managed in three separate areas: a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index, key value information that is compared against search conditions, and duplicate key information for keys to which a plurality of records correspond;
the node information of the T-tree index is constructed independently of the number and length of the columns that make up the key and of key duplication; and
of the three areas, only the nodes that make up the T-tree body are backed up to disk.

2. A T-tree index construction apparatus for constructing a T-tree index used in a database management system that achieves high-speed data access by keeping a database resident in memory, comprising:
a T-tree body storage area that stores a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index;
a key value storage area that stores key value information that is compared against search conditions;
a duplicate key storage area that stores duplicate key information for keys to which a plurality of records correspond; and
management means that manages the three areas, namely the T-tree body storage area, the key value storage area, and the duplicate key storage area.

3. The T-tree index construction apparatus according to claim 2, wherein, in the T-tree body storage area,
the node information of the T-tree index is constructed independently of the number and length of the columns that make up the key and of key duplication.

4. The T-tree index construction apparatus according to claim 2 or 3, wherein the management means includes
backup acquisition means that backs up to disk only the nodes that make up the T-tree body, out of the three areas consisting of the T-tree body storage area, the key value storage area, and the duplicate key storage area.

5. The T-tree index construction apparatus according to claim 2, wherein each node stored in the T-tree body storage area
holds, in its in-node elements, a pointer to the position of a key value stored in the key value storage area.

6. The T-tree index construction apparatus according to claim 2, wherein each key value stored in the key value storage area
holds a pointer to the position of the corresponding record when one record corresponds to the key, and
holds a pointer to one element in a set of pointers in the duplicate key storage area when a plurality of records correspond to the key.

7. The T-tree index construction apparatus according to claim 2, wherein the duplicate key information stored in the duplicate key storage area
has a pointer to the position of a record corresponding to the duplicate key.

8. The T-tree index construction apparatus according to claim 2, 5, 6, or 7, wherein the management means includes
means for linking the elements in the T-tree body storage area, the key value storage area, and the duplicate key storage area with pointers.

9. A storage medium storing a T-tree index construction program for constructing a T-tree index used in a database management system that achieves high-speed data access by keeping a database resident in memory,
the program having a management process that manages the T-tree index in three separate areas: a T-tree body consisting of nodes that hold information on the hierarchical structure of the T-tree index, key value information that is compared against search conditions, and duplicate key information for keys to which a plurality of records correspond,
characterized in that the management process
includes a process that constructs the node information of the T-tree index independently of the number and length of the columns that make up the key and of key duplication, and that backs up to disk only the nodes that make up the T-tree body out of the three areas.