JP3703874B2 - File management method and file management apparatus

Info

Publication number: JP3703874B2
Application number: JP05937895A
Authority: JP
Inventors: 志康小林; 康弘鈴木
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1995-03-17
Filing date: 1995-03-17
Publication date: 2005-10-05
Anticipated expiration: 2020-10-05
Also published as: JPH08255103A

Description

【０００２】
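【Field of Industrial Application】
The present invention relates to a file management method and a file management apparatus for managing files registered in a database, and in particular to a method and apparatus suited to accessing data that does not fit within a page, the unit of data input/output in the database.
【０００３】
【Prior Art】
Workstations composed of, for example, multiple terminals and a database, on which application programs run, have conventionally used a system such as a DBMS (Database Management System) to manage the files registered in the database.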
【０００４】
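That is, in such a DBMS, files made up of basic data types (atomic data types) such as numbers and character strings are accessed in the database by partial reads, updates, and the size changes that accompany insertion or deletion of data.
There is also growing demand on such workstations to represent multimedia data in the database, such as documents, image data, and variable-length coordinate arrays in CAD/CAE.
【０００５】
Multimedia data is variable in size, and no fixed semantics (such as ordering comparisons) can be assigned to it. Many RDBMSs (relational DBMSs) therefore represent multimedia data with a data type called BLOB (binary large object), which expresses a variable-length bit/byte string per record (tuple, row), and thereby distinguish it from the basic data types above.
【０００６】
Here, a record is the unit of data used when data in the database is exchanged with application programs.
When multimedia data is handled as a BLOB per data item (field, column) of a record, this can be realized by an address reference to the BLOB record.
【０００７】
BLOBs are also referred to in English as "long item data", "spatial data", or "bulk data".
As described above, when multimedia data is registered in a database as a file, the record size is variable and may be large enough to span multiple physical pages; in that case the multimedia data cannot be registered in the database in the same way as basic data.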
【０００８】
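A record made up of multimedia data is therefore large as a whole, and all of its data cannot be kept resident in main memory. In operating a workstation, application programs and queries thus need to be able to access only part of a whole record in the database.
【０００９】
One conceivable approach, shown in Fig. 37, is for a file management apparatus to split a large record (BLOB data) 101 that spans multiple physical pages into page-sized pieces, store them in the database, and chain the stored fragments 101-1 to 101-n with address links.
【００１０】
When BLOB data is registered as in Fig. 37, the file management apparatus performs a partial access to the record by navigating from the first fragment onward along the address links.
However, as the amount of data to be stored in databases has grown in recent years, the number of fragments spread over physical pages has grown too. Access time therefore increases, and operations such as skipping along the time axis of image data or instantly reaching part of a CAE array cannot be achieved.
【００１１】
As shown in Fig. 38, for a record made up of page-sized fragments (slices) 101-1 to 101-n, the file management apparatus can instead register a separate page 102 that holds list information (directory information) consisting of the size and address of each slice, which shortens partial access to BLOB data in the database.
【００１２】
That is, when the pages holding slices 101-1 to 101-n are accessed partially, the pointers 102-1 to 102-n on the directory page 102 are consulted, and this reduces the partial access time.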
【００１３】
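【Problems to Be Solved by the Invention】
However, in the access method of Fig. 38, the directory information stored in page 102 must itself fit in a single physical page, which in turn limits the maximum size of a whole BLOB record.
【００１４】
Moreover, even small BLOB data spanning only a few pages always requires a page for directory information. When the method of Fig. 38 is applied to small BLOB data that merely fails to fit in one page, allocating and maintaining the directory area becomes an overhead in both space and time.
【００１５】
For the convenience of users of application programs, it is also necessary to build an interface that spares the application side from having to be aware of the database-side environment.
The present invention was devised in view of these problems. Its object is to provide a file management method and a file management apparatus that impose no limit on record size and also perform partial accesses at high speed.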
【００１６】
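【Means for Solving the Problems】
Fig. 1 is a block diagram of the principle of the first invention. In Fig. 1, 1a is a file management apparatus that manages files registered in a database 6a and comprises record operation means 2a, long record operation means 3a, normal record operation means 4a, and index operation means 5a.
【００１７】
The record operation means 2a accepts various requests concerning records. A record here is the unit of data used when data in database 6a is exchanged with the requester on the record operation means 2a side.
The long record operation means 3a, on receiving from the record operation means 2a a request concerning a long record, that is, a record too large to fit in one page and made up of multiple slices divided in page units, decomposes that long record into slice units.
【００１８】
The normal record operation means 4a receives either a request from the record operation means 2a concerning a record within one page size or a request from the long record operation means 3a concerning a long record, and performs access within a single record, or partial access within a record, against database 6a.
The index operation means 5a creates or updates the index of a long record and accesses the long record in database 6a.
When the normal record operation means 4a receives from the record operation means 2a a request concerning a record within one page size, it accesses database 6a within a single record.
When the record operation means 2a receives a registration request for a new long record too large for one page, the long record operation means 3a decomposes the new long record into slices of one page each, and the normal record operation means 4a registers them in database 6a.
Further, when the long record operation means 3a receives a request concerning a long record that is registered in database 6a, is too large for one page, and is made up of multiple slices, the apparatus either performs partial access within the long record decomposed into slices, or creates or updates the index of the long record and accesses it slice by slice, in both cases against database 6a.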
【００１９】
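Fig. 2 is a block diagram of the principle of the second invention. In Fig. 2, 1b is a file management apparatus that manages files registered in a database 6b and comprises record operation means 2b, long record operation means 3b, data page operation means 4b, and tree index operation means 5b.
【００２０】
The record operation means 2b accepts, for a record or for partial data within a record, one of the following record operation requests: creation of a new record, removal, or fetch; or mid-record insertion, mid-record deletion, update, or read of partial data within a record. Here too, a record is the unit of data used when data in database 6b is exchanged with the requester on the record operation means 2b side.
【００２１】
When the data targeted by a request accepted by the record operation means 2b is a normal record that fits in one page, the data page operation means 4b allocates the target (a new record, or an existing record indicated by a record identifier registered in database 6b) as a normal record that fits in one page, performs the various record operations, and performs partial access within a single normal record.
【００２２】
A page is the minimum unit of physical input/output for data in the database.
When the data targeted by a request accepted by the record operation means 2b is a long record that does not fit in one page and is made up of multiple slices divided in page units, the long record operation means 3b decomposes the new record, or the long record registered in database 6b, into the slices to be operated on, and then requests the data page operation means 4b to perform the record operations slice by slice. It comprises an access start point determination control unit 3b-1 and a maintenance unit 3b-2, both described below.
【００２３】
The tree index operation means 5b builds the index so that each index entry at the lowest (leaf) level stores the size of one slice and the record identifier of that slice; each index entry at a higher level stores the sum of the sizes represented by the entries below it; the sum of the sizes represented by the top (root) level equals the size of the whole long record; and the order of the leaf-level entries matches the order of the offsets of the corresponding slices.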
【００２４】
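When a partial access is made to an existing long record, the access start point determination control unit 3b-1 of the long record operation means 3b determines, from the offset within the whole record, the slice at which the operation starts, either by navigating the address links between slices in order or by looking it up in the tree index.
【００２５】
The maintenance unit 3b-2 requests the tree index operation means 5b to maintain the tree index as slices are added or deleted.
The data page operation means 4b may include a record type recording unit that records, for each record in a page, whether it is a normal record or a long record. The record operation means 2b may then include an operation request unit that, based on the type recorded for the record identifier of an existing record, requests the record operation from the data page operation means 4b for a normal record and from the long record operation means 3b for a long record.
【００２６】
The record operation means 2b may also include a re-registration operation unit. When an operation would make an existing normal record too large for one page, this unit re-registers the normal record as a long record and then requests the operation from the long record operation means 3b. Conversely, when an existing long record fits in one page after an operation has completed, it re-registers that long record as a normal record.
【００２７】
When determining the slice at which a partial access to an existing long record starts, the access start point determination control unit 3b-1 can use the tree index operation means 5b if a tree index has been built, and otherwise navigate the address links held by the slices themselves.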
【００２８】
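That is, where the access start point is determined by the tree index operation means 5b under the control of the access start point determination control unit 3b-1, the maintenance unit 3b-2 can be configured to decide on creating or deleting the tree index by comparing the size of the whole long record with a predetermined threshold size, to decide on creating or deleting it according to whether partial access is needed, or to create the tree index when a partial access occurs.
【００２９】
The maintenance unit 3b-2 of the long record operation means 3b can also be configured to decide on creating or deleting the tree index from both the comparison of the whole long record size with the predetermined threshold size and whether partial access is needed.
Further, the tree index operation means 5b may include an index page operation unit that, when searching for a slice from an offset within the whole long record, finds at each level from the root index page down to the leaf index page the entry at which the accumulated sizes match or immediately precede the desired offset, moves from that entry to the index page of the next lower level and repeats the search, and thereby obtains from the entry on the leaf index page the record identifier of the slice that contains the offset.
【００３０】
The long record operation means 3b may include an inter-index-page operation unit that, on a partial insertion into a long record, merges the newly created slice with the slice immediately following it if the page holding the new slice has free space after the insertion, and that, on a partial deletion from a long record, merges the slice containing the start point of the deletion with the slice containing its end point.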
【００３１】
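Further, the long record operation means 3b may include: a sum calculation unit that, after a size-changing operation on a long record, calculates the sum of the slice sizes of all slices making up that long record; a ratio calculation unit that calculates the ratio of that sum to the product of the number of slices and the size of one page; a ratio comparison unit that compares the calculated ratio with a preset ratio; and a garbage collection control unit that, when the calculated ratio is smaller than the preset ratio, performs garbage collection by merging all slices in order so that the free space of each slice is filled with data from the slice that follows it.
【００３２】
Alternatively, the long record operation means 3b may include: a sum calculation unit that, after the maintenance unit 3b-2 has modified the index, calculates the sum of the slice sizes within the modified leaf index page; a ratio calculation unit that calculates the ratio of that sum to the product of the number of slices and the size of one page; a ratio comparison unit that compares the calculated ratio with a preset ratio; and a garbage collection control unit that, when the calculated ratio is smaller than the preset ratio, performs garbage collection on the slices corresponding to the entries in that leaf index page by merging them in order so that the free space of each slice is filled with data from the slice that follows it.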
【００３３】
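【Operation】
In the first invention, as shown in Fig. 1, when the record operation means 2a receives a request concerning a record within one page size, the normal record operation means 4a accesses database 6a within a single record.
When a registration request is received for a new long record too large for one page, the long record operation means 3a decomposes the new long record into slices divided in page units, and the normal record operation means 4a registers them in database 6a.
When the record operation means 2a receives a request concerning a long record that is registered in database 6a, is too large for one page, and is made up of multiple slices, the long record operation means 3a decomposes the long record into slice units.
【００３４】
Then either the normal record operation means 4a performs partial access within the long record decomposed into slices, or the index operation means 5a creates or updates the index of the long record and the long record is accessed slice by slice, in both cases against database 6a.
The file management apparatus 1a can thereby manage the files stored in database 6a.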
【００３５】
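In the file management apparatus 1b of the second invention, the record operation means 2b accepts, for a record or for partial data within a record, one of the following record operation requests: creation of a new record, removal, or fetch; or mid-record insertion, mid-record deletion, update, or read of partial data within a record.
When the data targeted by a request accepted by the record operation means 2b is a normal record that fits in one page, the data page operation means 4b allocates the target (a new record, or an existing record indicated by a record identifier registered in database 6b) as a normal record that fits in one page, performs the various record operations, and performs partial access within a single normal record.
【００３６】
When the data targeted by a request accepted by the record operation means 2b is a long record that does not fit in one page and is made up of multiple slices divided in page units, the long record operation means 3b decomposes the new record, or the long record registered in database 6b, into the slices to be operated on, and then requests the data page operation means 4b to perform the record operations slice by slice.
【００３７】
The tree index operation means 5b builds the index so that each index entry at the lowest (leaf) level stores the size of one slice and the record identifier of that slice; each index entry at a higher level stores the sum of the sizes represented by the entries below it; the sum of the sizes represented by the top (root) level equals the size of the whole long record; and the order of the leaf-level entries matches the order of the offsets of the corresponding slices.
【００３８】
When a partial access is made to an existing long record, the access start point determination control unit 3b-1 of the long record operation means 3b determines, from the offset within the whole record, the slice at which the operation starts, either by navigating the address links between slices in order or by looking it up in the tree index. The maintenance unit 3b-2 requests the tree index operation means 5b to maintain the tree index as slices are added or deleted.
【００３９】
The file management apparatus 1b can thereby manage the files registered in database 6b according to the record targeted by the operation request.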
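In the data page operation means 4b, the record type recording unit records, for each record in a page, whether it is a normal record or a long record. The operation request unit of the record operation means 2b, based on the type recorded for the record identifier of an existing record, requests the record operation from the data page operation means 4b for a normal record and from the long record operation means 3b for a long record. The record operation means 2b can thus route each record operation according to the type of the existing record.
【００４０】
Further, when an operation would make an existing normal record too large for one page, the re-registration operation unit of the record operation means 2b can re-register the normal record as a long record and then request the operation from the long record operation means 3b. Conversely, when an existing long record fits in one page after an operation has completed, it can re-register that long record as a normal record.
【００４１】
When determining the slice at which a partial access to an existing long record starts, the access start point determination control unit 3b-1 of the long record operation means 3b can use the tree index operation means 5b if a tree index has been built, and otherwise navigate the address links held by the slices themselves.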
【００４２】
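In this case, where the access start point is determined by the tree index operation means 5b under the control of the access start point determination control unit 3b-1, the maintenance unit 3b-2 can decide on creating or deleting the tree index by comparing the size of the whole long record with a predetermined threshold size, can decide on creating or deleting it according to whether partial access is needed, or can create the tree index when a partial access occurs.
【００４３】
The maintenance unit 3b-2 can also decide on creating or deleting the tree index from both the comparison of the whole long record size with the predetermined threshold size and whether partial access is needed.
Further, when searching for a slice from an offset within the whole long record, the index page operation unit of the tree index operation means 5b can find, at each level from the root index page down to the leaf index page, the entry at which the accumulated sizes match or immediately precede the desired offset, move from that entry to the index page of the next lower level and repeat the search, and thereby obtain from the entry on the leaf index page the record identifier of the slice that contains the offset.
【００４４】
On a partial insertion into a long record, the inter-index-page operation unit of the long record operation means 3b can merge the newly created slice with the slice immediately following it if the page holding the new slice has free space after the insertion. On a partial deletion from a long record, it can merge the slice containing the start point of the deletion with the slice containing its end point.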
【００４５】
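Further, after a size-changing operation on a long record, the sum calculation unit calculates the sum of the slice sizes of all slices making up that long record, the ratio calculation unit calculates the ratio of that sum to the product of the number of slices and the size of one page, and the ratio comparison unit compares the calculated ratio with a preset ratio. When the calculated ratio is smaller than the preset ratio, the garbage collection control unit can perform garbage collection by merging all slices in order so that the free space of each slice is filled with data from the slice that follows it.
【００４６】
Likewise, after the maintenance unit 3b-2 has modified the index, the sum calculation unit calculates the sum of the slice sizes within the modified leaf index page, the ratio calculation unit calculates the ratio of that sum to the product of the number of slices and the size of one page, and the ratio comparison unit compares the calculated ratio with a preset ratio. When the calculated ratio is smaller than the preset ratio, the garbage collection control unit can perform garbage collection on the slices corresponding to the entries in that leaf index page by merging them in order so that the free space of each slice is filled with data from the slice that follows it.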
【００４７】
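【Embodiments】
Embodiments of the present invention are described below with reference to the drawings.
(a) Overview of the file management apparatus according to one embodiment of the present invention
Fig. 3 is a block diagram of a system to which the file management apparatus of this embodiment is applied. The system of Fig. 3 consists of multiple terminals, a database, and so on, and functions as, for example, a workstation on which an application program for document editing by terminal users is run. The file management apparatus manages the files registered in the database.
【００４８】
In Fig. 3, 11 is a display device that shows the terminal user's processing, and 12 is a keyboard for entering data and commands from the terminal user.
13 is a central processing unit/main memory that, based on input from keyboard 12 and data stored in database 14, executes application programs such as document editing and reads from database 14 and holds the data used during execution. This central processing unit/main memory 13 functions as the file management apparatus of this embodiment.
【００４９】
Database 14 stores, file by file, the data for application programs run on the central processing unit/main memory 13, and performs physical input/output of data in page units.
The central processing unit/main memory 13 comprises an application processing unit 21, a record operation unit 22, a BLOB record operation unit 23, a tree index operation unit 24, a data page operation unit 25, and a page buffer 26, each of which can be implemented in software.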
【００５０】
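The application processing unit 21 executes the processing of an application program; in this embodiment it runs a document editing application.
The record operation unit (record operation means) 22 accepts record operation requests for a record, the unit of data used when data in database 14 is exchanged with application programs, or for partial data within a record. It comprises an operation request unit 22a and a re-registration operation unit 22b.
【００５１】
For example, the record operation unit 22 accepts creation of a new document (create), removal (remove), and fetch (fetch) as operation requests on whole records, and mid-record insertion (insert), mid-record deletion (cut), update (update), and read (read) as operation requests on partial data within a record.
【００５２】
The operation request unit 22a of the record operation unit 22, based on the record type (allocation) of the existing record indicated by the record identifier (RID) from the data page operation unit 25, requests the record operation from the data page operation unit 25 for a normal record and from the BLOB record operation unit 23 for a BLOB record (long record).
【００５３】
When an operation would make an existing normal record too large for one page, the re-registration operation unit 22b re-registers the normal record as a long record and then requests the operation from the BLOB record operation unit 23. Conversely, when an existing long record fits in one page after an operation has completed, it re-registers that long record as a normal record.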
【００５４】
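The data page operation unit (normal record operation means, data page operation means) 25 operates, page by page, on the data pages that store data. Specifically, when the data targeted by a request accepted by the record operation unit 22 is a normal record that fits in one page, it performs the various record operations on the new record, or on the existing record indicated by a record identifier registered in database 14, as a normal record that fits in one page, and performs partial accesses that stay within one page of a normal record.
【００５５】
The data page operation unit 25 also comprises a record type recording unit 25a that records, against its RID, whether each record in a page is a normal record or a BLOB record.
When the data targeted by a request accepted by the record operation unit 22 is a BLOB (binary large object) record, that is, a long record that does not fit in one page and is made up of multiple slices, the BLOB record operation unit (long record operation means) 23 decomposes the new record, or the BLOB record registered in database 14, into the slices to be operated on, and then requests the record operations from the data page operation unit 25. It comprises an access start point determination control unit 23a and a maintenance unit 23b.
【００５６】
As described later with reference to Fig. 17, the BLOB record operation unit 23 also comprises a sum calculation unit 23c, a ratio calculation unit 23d, a ratio comparison unit 23e, and a garbage collection control unit 23f, and can tidy up the empty areas of slices in database 14.
When a partial access is made to an existing BLOB record, the access start point determination control unit 23a determines, from the offset within the whole record, the slice at which the operation starts, either by navigating the address links between slices in order or by looking it up in the tree index information of the tree index operation unit 24 described below.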
【００５７】
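The maintenance unit 23b requests the tree index operation unit 24 to maintain the tree index information as slices are added or deleted.
The tree index operation unit 24 builds the tree index and stores it in index pages in database 14. It comprises an index page operation unit 24a and an inter-index-page operation unit 24b.
【００５８】
A partial access to an existing BLOB record can be made on the basis of the tree index information built by this tree index operation unit 24.
The tree index can be built so that each index entry at the lowest (leaf) level stores the size of one slice and the record identifier of that slice; each index entry at a higher level stores the sum of the sizes represented by the entries below it; the sum of the sizes represented by the top (root) level equals the size of the whole long record; and the order of the leaf-level entries matches the order of the offsets of the corresponding slices.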
【００５９】
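With the above configuration, the file management apparatus of this embodiment operates as follows.
The record operation unit 22 accepts access requests for records from the application program of the application processing unit 21. For example, it accepts create, remove, and fetch as accesses on whole records, and insert, cut, update, and read as partial accesses.
【００６０】
When a record is newly created, the record operation unit 22 sets it as a normal record or a BLOB record according to the requested size of the record.
If the accepted access request is for an existing record, the record operation unit 22 uses the RID from the data page operation unit 25 to tell whether the record is a normal record or a BLOB record.
【００６１】
If the record for the accepted access is a normal record, the operation request unit 22a of the record operation unit 22 requests the actual record operation directly from the data page operation unit 25; if it is a BLOB record, it requests the operation from the BLOB record operation unit 23.
Even when the existing record is a normal record, if the requested operation would make the whole record too large for one page, the re-registration operation unit 22b re-registers the existing normal record as a BLOB record, after which the operation request unit 22a requests the actual record operation from the BLOB record operation unit 23.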
【００６２】
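When the whole BLOB record fits in one page after an operation on an existing BLOB record has completed, the re-registration operation unit 22b re-registers that BLOB record as a normal record.
When the access request received by the record operation unit 22 is for an existing normal record, the RID itself indicates the location in database 14 of that single record.
【００６３】
When the access request is for a BLOB record, the RID instead indicates the location in database 14 of the management information for the BLOB record. This management information consists of information for navigating the slices and information about the tree index.
【００６４】
Based on requests from the record operation unit 22, the data page operation unit 25 performs record operations such as allocation, deletion, and update on a new record, or on an existing record indicated by an RID, as a normal record that fits in one page. It also performs partial accesses within a record (within a slice): mid-record deletion and partial reads within a single normal record, and mid-record insertion as long as the result fits in one page.
【００６５】
When the access request received by the record operation unit 22 is for a BLOB record, either a new one or an existing record indicated by an RID that is stored across slices scattered over multiple pages, the BLOB record operation unit 23 decomposes the BLOB record into the slices to be operated on, so that each access looks like an access to an individual normal record.
【００６６】
That is, the access start point determination control unit 23a of the BLOB record operation unit 23 determines, from the offset within the whole record, the slice at which the operation starts, either by navigating the address links between slices in order or by looking it up in the tree index information generated by the tree index operation unit 24.
【００６７】
Specifically, if the tree index operation unit 24 has created a tree index, the access start point determination control unit 23a has the tree index operation unit 24 determine the slice at which the partial access starts; if no tree index has been created, it navigates the address links held by the slices themselves to determine the start point.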
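The link-navigation fallback can be illustrated with a minimal Python sketch. It is not taken from the patent's figures: the `Slice` class, its `next` field, and the function name `locate_by_links` are hypothetical names chosen here, and slice payloads are modelled as in-memory byte strings rather than database pages.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Slice:
    """Hypothetical in-memory stand-in for one page-sized slice."""
    data: bytes
    next: Optional["Slice"] = None  # address link to the following slice


def locate_by_links(head: Optional[Slice], offset: int) -> Tuple[Slice, int]:
    """Walk the next links from the first slice and return the slice that
    contains `offset`, plus the offset relative to that slice's start."""
    pos = 0  # offset of the current slice within the whole record
    node = head
    while node is not None:
        if offset < pos + len(node.data):
            return node, offset - pos
        pos += len(node.data)
        node = node.next
    raise IndexError("offset is beyond the end of the record")
```

The cost is one slice visit per slice preceding the target, which is why this path suits records with few slices or purely sequential reads.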
【００６８】
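Once the starting slice has been determined by either method, the BLOB record operation unit 23 handles the slice-by-slice record operations from the start point onward by navigating the address links and having the data page operation unit 25 repeat the access for each slice, and it maintains the address links as slice data is added or deleted.
【００６９】
If the tree index operation unit 24 has created a tree index, the maintenance unit 23b maintains the tree index as slices are added, deleted, or resized.
The maintenance unit 23b also creates a new tree index in one batch, or discards an existing one, according to whether partial access is needed and the size of the whole BLOB record.
【００７０】
Whether partial access is needed is judged by consulting a dictionary defined in advance for each kind of data stored and managed as BLOB records, or by receiving a partial access request from an application program. The size of the whole BLOB record is judged on the size after the record operation, because the record may shrink and be converted to a normal record.
【００７１】
The tree index operation unit 24 operates on the index entries, one per slice, in the index pages of database 14. If X is the number of index entries stored per page and γ is the average fill factor (0.5 ≤ γ ≤ 1.0), the height H of the index hierarchy for n slices is log_{γX}(n), that is, the logarithm of n to base γX. The index therefore grows dynamically as the number of slices increases, while the degradation of access performance is kept small.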
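As a worked illustration of the height formula, the following Python sketch computes the number of index levels for a given slice count. Rounding the logarithm up to a whole number of levels is an assumption made here (the text gives only the logarithm itself), and the function and parameter names are hypothetical.

```python
def index_height(n_slices: int, entries_per_page: int, fill: float = 0.5) -> int:
    """Smallest whole H such that (fill * entries_per_page) ** H >= n_slices,
    i.e. the logarithm of n to base (gamma * X), rounded up.
    Uses repeated multiplication to avoid floating-point log edge cases."""
    if not 0.5 <= fill <= 1.0:
        raise ValueError("fill factor is expected in the range 0.5 to 1.0")
    fanout = fill * entries_per_page  # effective entries per index page
    height, capacity = 1, fanout
    while capacity < n_slices:
        capacity *= fanout
        height += 1
    return height
```

With X = 250 and γ = 0.5 (an effective fan-out of 125), one level covers up to 125 slices, two levels up to 15,625, and three levels up to 1,953,125, so even a record of a million slices is reached in three index page reads.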
【００７２】
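In the index generated by the tree index operation unit 24, each index entry stores the size of what it covers: each leaf-level entry stores the size of one slice, and as a result the sum of the sizes represented by the top-level entries equals the size of the whole BLOB record.
The order of the leaf-level entries matches the order of the offsets of the corresponding slices. The upper-level entries therefore make it possible to reach the leaf-level entry for a given offset within the whole BLOB with few page accesses, without navigating through the lower index pages.
【００７３】
At each level from the top (root) to the bottom (leaf), the entry at which the accumulated sizes on the index page match or immediately precede the desired offset is found, and the search moves from that entry to the index page of the next lower level; the entry reached on the leaf index page yields the RID of the slice containing the offset. A slice can thus be identified from an offset within the whole BLOB record.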
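The descent can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: an index page is modelled as a list of `(size, child)` pairs, where `child` is either another such list (a lower index page) or a string standing in for a slice RID; the function name `find_slice` is hypothetical.

```python
def find_slice(root_page, offset):
    """Descend from the root index page to a leaf entry.

    Each page is a list of (size, child) pairs. On each page, sizes are
    accumulated until the entry covering `offset` is found; the offset is
    then made relative to that entry before moving one level down.
    Returns (rid, offset_within_slice)."""
    page = root_page
    while True:
        acc = 0  # total size of the entries skipped on this page
        for size, child in page:
            if offset < acc + size:
                break
            acc += size
        else:
            raise IndexError("offset is beyond the end of the record")
        offset -= acc
        if isinstance(child, list):   # upper-level entry: go down a level
            page = child
        else:                         # leaf entry: child is the slice RID
            return child, offset


# Two leaf pages under one root; upper entries hold the sums of the sizes below.
leaf1 = [(100, "rid-1"), (50, "rid-2")]
leaf2 = [(200, "rid-3")]
root = [(150, leaf1), (200, leaf2)]
print(find_slice(root, 120))  # ('rid-2', 20)
```

Only one page per level is read, so the number of page accesses equals the height of the tree rather than the number of slices.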
【００７４】
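The index page operation unit 24a performs operations on individual index pages. When the index is searched, pages are searched in turn from the root index page toward the lower index pages; when the index is maintained, changes are propagated in turn from the leaf index page to the upper index pages via the inter-index-page operation unit 24b.
【００７５】
The inter-index-page operation unit 24b maintains the links between neighbouring index pages at the same level and the links between upper and lower index pages.
When the tree index operation unit 24 adds or removes index entries in index pages as the slices in data pages increase or decrease, it handles overflow and underflow, in which entries move between neighbouring index pages at the same level; splits, when the neighbouring pages cannot absorb the overflowing entries; and merges of index pages that have become extremely sparse.
【００７６】
Accordingly, when the index page operation unit 24a first accepts a maintenance request for the leaf-level index, it passes the actual maintenance request to the inter-index-page operation unit 24b.
The inter-index-page operation unit 24b then performs the maintenance that spans index pages (overflow, underflow, split, or merge), and requests the insertion and deletion of entries within a single index page from the index page operation unit 24a.
【００７７】
When maintenance of one level of the index is complete, the inter-index-page operation unit 24b reflects the change in the index page of the level above. For example, when the total size of a lower-level index page changes, it updates the size of the upper-level entry that represents that page and the total size of the upper-level index page.
【００７８】
The page buffer 26, following page addresses in the file of database 14, performs input/output in physical page units, buffers the pages read into main memory, allocates new pages in database 14, and releases from database 14 pages that are no longer needed because their data records or index entries have become empty.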
【００７９】
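Because the tree index operation unit 24 can thus generate an index for a BLOB record and the BLOB record can be accessed in database 14 through it, the limit on the maximum size of a BLOB record is removed and partial accesses are fast.
The re-registration operation unit 22b allows a record to be registered as a normal record or a BLOB record according to its overall size, even when that size changes through an operation. The record type recording unit 25a records whether each record in a page is a BLOB record or a normal record, and operations are branched between normal records and BLOB records on that basis. The application processing unit 21 therefore need not be aware of the difference between normal and BLOB records, and the file management apparatus acts as a consistent interface that does not discriminate by data length.
【００８０】
Further, the access start point determination control unit 23a of the BLOB record operation unit 23 can determine the starting slice from the offset within the whole record either by navigating the address links between slices in order or by looking it up in the tree index, so the access method best suited to the size of the record and the access pattern can be chosen. The file management apparatus also acts as an independent interface that spares the application processing unit 21 from having to be aware of the database-side environment.
【００８１】
The tree index operation unit 24 can also update, create, or delete the tree index dynamically according to the comparison of the whole BLOB record size with a predetermined threshold size or according to whether partial access is needed. The database area is thus used effectively, the limit on the maximum size of a long record is removed, and partial accesses are fast.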
【００８２】
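(b) Example header files for implementing the file management apparatus of this embodiment
When the functions of the file management apparatus are implemented in software, they can be realized by C programs such as those shown in Figs. 24 to 36.
【００８３】
Figs. 24 to 36 show example header files for implementing the file management apparatus of this embodiment. Fig. 24 performs the initial settings, Fig. 25 implements the record operation unit 22, and Fig. 26 implements the page buffer 26.
Fig. 27 defines the structure of the header part of a data page in database 14, Fig. 28 implements the data page operation unit 25, and Fig. 29 defines the structure of the record entry part of a data page.
【００８４】
Fig. 30 defines the BLOB management information, Fig. 31 implements the BLOB record operation unit 23, Fig. 32 implements the tree index operation unit 24, Fig. 33 defines the structure of the per-slice index entry in an index page, Fig. 34 defines the structure of the header part of an index page, Fig. 35 implements the index page operation unit 24a, and Fig. 36 implements the inter-index-page operation unit 24b.
【００８５】
When the header files for the file management apparatus are organized as in Figs. 24 to 35, a data page in database 14 begins with the header part (DataHead) defined by the program of Fig. 27, immediately followed by the record entries stored as a variable-length array. The number of elements in this record entry array corresponds to the variable entryMax in the program of Fig. 27. Variable-length records are stored from the end of the data page, within the range where the record entries and the record area do not collide.
【００８６】
An index page in database 14 begins with the header part (IndexHead) defined by the program of Fig. 34, immediately followed by the per-slice index entries (IndexTuple) defined by the program of Fig. 33, stored as a fixed-length array. Because the index entries are stored as a fixed-length array, the size of an index page is also fixed.
【００８７】
The index page operation unit 24a implemented by the program of Fig. 35 and the inter-index-page operation unit 24b implemented by the program of Fig. 36 are used by the tree index operation unit 24 implemented by the program of Fig. 32.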
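(c) Outline of operation when the file management apparatus of this embodiment is applied to a document editing system
The operation of the file management apparatus of Fig. 3 when applied to an application program that runs a document editing system is described below with reference to the flowchart of Fig. 4.
【００８８】
As shown in Fig. 4, the user edits documents by entering instructions at keyboard 12 and viewing the display device 11, and document information is read from and written to database 14.
The user first specifies at the keyboard whether the document is new or existing (steps A1, A2). If the specified document is new, the document header information is written to database 14 as a new record by a create operation (step A3).
【００８９】
If the specified document already exists, a fetch operation is performed to retrieve it from database 14 (step A4).
For page-by-page document editing (step A5), the user indicates at keyboard 12 the document page to be edited (step A6). A locate operation then positions to the offset in database 14 that corresponds to that document page of the document, which is stored as a BLOB record or a normal record (step A7).
【００９０】
In this case, fixing the total number of bytes per document page gives a one-to-one correspondence between document pages and offsets.
A read operation then reads from database 14 only the specific size corresponding to one document page within the BLOB record or normal record, at the offset indicated by the locate operation (step A8).
【００９１】
The contents of the document read from database 14 by the read operation are shown on the display device 11 (step A9). The user then uses keyboard 12 to direct corrections to the displayed document in units of substrings or lines (step A10), and the corresponding operations are performed on database 14: mid-record insertion, insert (step A11); mid-record deletion, cut (step A12); and partial modification, update (step A13).
【００９２】
This editing can be repeated for multiple document pages.
To delete a document, the user specifies the unwanted document at keyboard 12 (step A14), and a remove operation deletes it in units of a BLOB record or a normal record (step A15).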
【００９３】
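(d) Details of database 14 to which the file management apparatus of this embodiment is applied
Fig. 5 is a block diagram showing the details of database 14 to which the file management apparatus of this embodiment is applied. In Fig. 5, 14a is a record entry section. The record entry section 14a receives from the page buffer 26 the RID that identifies the record for which access was requested, and records the location of the record corresponding to that RID (for example, its relative byte position within the page) and whether it is a normal record or a BLOB record.
【００９４】
On the file management apparatus side, the record type recorded in the record entry section 14a is recorded by the record type recording unit 25a of the data page operation unit 25 described above.
The RID can be a unique number assigned to each record or a record address. Here the RID is built from the record address: it consists of the address of the page that stores the record together with the record's sequence number within that page.
【００９５】
Database 14 comprises data pages 14e, which store normal records and BLOB records under the operation of the data page operation unit 25, and index pages 14d, which store tree index information under the operation of the tree index operation unit 24.
Both kinds of page, 14d and 14e, are single physical pages in the file of database 14, and page input/output with the file and buffering are handled in common by the page buffer 26.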
【００９６】
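The tree index stored in index pages 14d and the n slices 14-1 to 14-n stored in data pages 14e as a BLOB record are related as shown in Fig. 6.
As shown in Fig. 6, the n slices 14-1 to 14-n make up one BLOB record as requested by the application processing unit 21. Each of slices 14-1 to 14-n carries control information such as a next address, the address of the slice that follows it, so that the BLOB record as a whole can be accessed by, for example, visiting slices 14-1 to 14-n in order along the next addresses.
【００９７】
The size of one BLOB record is therefore the sum S of the sizes S1 to Sn of slices 14-1 to 14-n, excluding their control information areas, as given by equation (1).
S = S1 + S2 + ... + Sn …（１）
The tree index in index pages 14d consists of index groups 14d-1 to 14d-3 arranged in a three-level hierarchy, and each of index groups 14d-1 to 14d-3 stores the sizes of the slices it covers.
【００９８】
That is, the lowest, leaf-level index group 14d-3 stores the size of each slice itself, the index group 14d-2 one level up stores the sums of the sizes represented by the entries below, and as a result the sum of the sizes represented by the top, root-level index group 14d-1 equals the size of the whole BLOB record.
【００９９】
The order of the leaf-level index group 14d-3 matches the order of the offsets of the corresponding slices. By consulting the upper index groups 14d-1 and 14d-2, the leaf-level entry for a given offset within the whole BLOB can therefore be found efficiently with few page accesses, without navigating through the lower index group 14d-3.
【０１００】
Each of index groups 14d-1 to 14d-3 can consist of multiple index pages. A single index entry in these groups consists of, for example, 8 bytes for the RID of the slice, 4 bytes for the slice size, and 4 bytes for the page address leading from an upper index to a lower one, 16 bytes in total. With a page size of 4 KB, about 250 index entries fit in one index page.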
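The arithmetic behind "about 250 entries per page" can be checked with a short Python sketch. The field order and the 96-byte page header used below are assumptions for illustration only; the actual layouts are defined in the header files of Figs. 33 and 34, which are not reproduced in the text.

```python
import struct

# Hypothetical packed layout of one index entry: 8-byte slice RID,
# 4-byte slice size, 4-byte lower-level page address (little-endian, no padding).
INDEX_ENTRY = struct.Struct("<QII")


def entries_per_page(page_size: int, header_size: int) -> int:
    """Number of fixed-size index entries that fit after the page header."""
    return (page_size - header_size) // INDEX_ENTRY.size


print(INDEX_ENTRY.size)            # 16
print(entries_per_page(4096, 0))   # 256
print(entries_per_page(4096, 96))  # 250
```

A 4 KB page divided by 16 bytes gives 256 entries before any header is subtracted, so a header of roughly a hundred bytes is consistent with the figure of about 250.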
【０１０１】
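With this configuration, when the contents recorded in the record entry section 14a show that the requested record is a normal record, the record resides on a single page and its location is given directly by the record entry (see (a) in Fig. 5).
When the requested record is a BLOB record made up of slices 14-1 to 14-n scattered over multiple pages, the location information recorded in the record entry section 14a points to the BLOB management information 14c for that BLOB record (see (b) in Fig. 5).
【０１０２】
The BLOB management information 14c gives the location of the first of the slices scattered over multiple pages and, if a tree index has been generated for the BLOB record, the top (root) of the tree index over those slices.
【０１０３】
When data on data pages 14e is accessed on the basis of the BLOB management information 14c, either the address links between slices are navigated in order or the tree index generated in index pages 14d is used.
Specifically, when the BLOB record has few slices or partial access is not needed, the address links between slices are navigated in order; when a tree index has been generated in index pages 14d or partial access is needed, the tree index is used.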
【０１０４】
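When the tree index is used for data access, the offset within the whole BLOB record, taken relative to the BLOB management information 14c, is used as follows. At each level from the top (root) to the bottom (leaf), the entry at which the accumulated sizes on the index page match or immediately precede the desired offset is found, and the search moves from that entry to the index page of the next lower level. The entry reached on the leaf index page yields the RID of the slice containing the offset, so the slice can be identified from the offset within the whole BLOB record.
【０１０５】
For example, when partial access to the i-th slice 14-i is needed, that slice is identified by the offset within the whole BLOB record given by the application processing unit 21.
When the tree index is used for access by this offset, offset 1 is obtained from the root-level index group 14d-1, offset 2 from index group 14d-2, and offset 3 from the leaf-level index group 14d-3, as described above.
【０１０６】
The slice 14-i that contains the data at the target offset can thus be found, and access to one part of a huge BLOB record is fast.
When the requested BLOB data actually consists of only a few pages of slices (for example, 10 pages), or is always read sequentially from the beginning (for example, when database 14 is merely a store for large data, or when the data is read in bulk into a vast memory space or never grows beyond a size that can be read in), access is made by navigating the address links between slices in order, without the tree index.
【０１０７】
The access method is thus chosen, and the tree index generated or deleted, according to the size of the whole BLOB record and whether partial record access is needed. This avoids wasting the storage area in database 14 needed for index pages 14d and the maintenance time spent on them.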
【０１０８】
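(e) State transitions accompanying changes in the size of records in the database
When a document file is corrected as described in (c) above, the state of the corresponding record in database 14 changes as shown in Fig. 7.
When a record of a document file is newly created (create; see state B1 in Fig. 7), it is registered as a normal record if its initial size fits in one physical page (state B1 to state B2 in Fig. 7) and as a BLOB record if it does not (state B1 to state B3).
【０１０９】
When a later mid-record insertion (insert) makes a normal record too large for one physical page, it is re-registered as a BLOB record (state B2 to state B3). Conversely, when a mid-record deletion (cut; see state B4 in Fig. 7) makes a BLOB record small enough to fit in one physical page, it is re-registered as a normal record (state B3 to state B2).
【０１１０】
Further, for a BLOB record registered in state B3, a tree index as shown in Fig. 6 is generated by the tree index operation unit 24 (state B3 to state B5) when an insert makes the record larger still, when the record was already large enough at the create stage, or when partial accesses such as insert, cut, update, or read are needed.
【０１１１】
If a later cut on the BLOB record makes the record small enough, the tree index becomes an overhead and is again regarded as unnecessary. The tree index generated in index pages 14d is then deleted (state B5 to state B3).
The decision to generate or delete the tree index is made in the record operation unit 22 on the basis of a reference value for the BLOB record size. This reference value is set as a threshold on the sum of the sizes of all slices, chosen so that the index page 14d is not wasteful in the minimum index configuration.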
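The transitions of Fig. 7 amount to choosing one of three storage states from the record size and the access pattern. The Python sketch below is one possible reading, with assumptions stated plainly: the state names, the function name `record_state`, and the use of the raw page size as the one-page limit (ignoring page header and record entry overhead) are simplifications introduced here, and the size and partial-access conditions are combined with a plain "or", following the list of triggers in the text.

```python
def record_state(size: int, page_size: int, tree_threshold: int,
                 needs_partial_access: bool = False) -> str:
    """Return the storage state a record of `size` bytes would settle in.

    'normal'    : fits in one page (state B2)
    'blob'      : slices chained by next links only (state B3)
    'blob+tree' : slices plus a tree index (state B5)
    """
    if size <= page_size:
        return "normal"
    if size >= tree_threshold or needs_partial_access:
        return "blob+tree"
    return "blob"
```

Evaluating this after every size-changing operation reproduces the re-registration in both directions: a record that grows past one page moves from 'normal' to 'blob', and one that shrinks below the threshold drops its tree index again.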
【０１１２】
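The condition of a minimum index configuration can be, for example, that the root and the leaf coincide, that the index consists of a single index page, and that the fill factor within that page is 0.5.
For the tree index of Fig. 6, for example, one index page can hold 250 index entries, so the threshold corresponds to the number of slices (equal to the number of index entries) that keeps at least 125 entries, a fill factor of 0.5, and to the sum of the sizes of that many slices.
【０１１３】
That is, with the slices packed densely by garbage collection control, the total size of the pages corresponding to 125 index entries is taken as the threshold size, which by equation (2) is 500 KB.
4 KB × 125 = 500 KB …（２）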
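Equation (2) generalizes to other page sizes and index capacities. The following Python sketch is an illustration only; the function and parameter names are hypothetical, and sizes are expressed in bytes with 1 KB taken as 1024 bytes.

```python
def tree_index_threshold(page_size: int, entries_per_page: int,
                         min_fill: float = 0.5) -> int:
    """Record size (in bytes) at which a single-page tree index is at least
    `min_fill` full, assuming every slice is packed to a full page."""
    min_entries = int(entries_per_page * min_fill)  # 250 * 0.5 = 125
    return page_size * min_entries


threshold = tree_index_threshold(4096, 250, 0.5)
print(threshold // 1024)  # 500 (KB)
```

Below this size a tree index would occupy a whole index page that is less than half used, which is the redundancy the threshold is meant to avoid.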
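When a record is removed (remove), the transition can start from either the BLOB record state or the normal record state, depending on the final record size. Even when a tree index exists, removing the record deletes slices, and as an intermediate transition the record passes through the state of a BLOB record without a tree index (state B5 to state B3 in Fig. 7).
【０１１４】
Because the type of each record in a page, BLOB or normal, is recorded and operations are branched between normal records and BLOB records on that basis, the application processing unit 21 need not be aware of the difference, and the file management apparatus acts as a consistent interface that does not discriminate by data length.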
【０１１５】
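(f) Details of operation when the file management apparatus of this embodiment creates a new record
Fig. 8 is a flowchart detailing the creation of a new record (create). As shown in Fig. 8, the apparatus first checks whether the record size requested for creation fits in a page (step C1), and according to the result selects either a normal record or a BLOB record as the kind of record to be stored in database 14 (step C2).
【０１１６】
If the record does not fit in one physical page and storage as a BLOB record is selected ("BLOB record" route from step C2), the BLOB record operation unit 23 first creates the management information for the BLOB record, and the data page operation unit 25 allocates this management information within one physical page just as if it were a normal record (step C3).
【０１１７】
The record holding the actual data is then divided into slices and stored across multiple pages by the data page operation unit 25. Each slice is stored by a data page operation as if it were an individual normal record, but each slice also carries control information: the RID link between slices and the size of the data the slice holds (step C4).
【０１１８】
Next, the BLOB record operation unit 23 checks whether the tree-index-required flag is set (step C6). If at this stage the BLOB record is already very large and the flag is set ("required" route from step C6), the tree index operation unit 24 creates the tree index in one batch (step C7).
【０１１９】
A tree index can also be created when partial access is needed, but at the creation stage that need is never judged from the actual occurrence of a partial access.
Information about the BLOB record created in this way, such as the RID of the first of the slices and the page address of the root index page of the tree index, is written back into the BLOB management information described above (step C8).
【０１２０】
As a result, the record entry that indicates the location of each area allocated in a page of database 14 records, together with that location, the kind of record: normal record, BLOB management information, or slice.
If the record is found in step C2 to be a normal record that fits in one physical page, the data page operation unit 25 simply stores the record in one page and returns the RID to database 14 (steps C9, C10).
【０１２１】
When the RID of the newly created record is returned, it therefore points directly at the record's area for a normal record and at the BLOB management information for a BLOB record.
Because a record can be registered as a normal record or a BLOB record according to its overall size even when that size changes through an operation, the application processing unit 21 need not be aware of the difference, and the file management apparatus acts as a consistent interface that does not discriminate by data length.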
【０１２２】
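(g) Details of operation when the file management apparatus of this embodiment inserts into a record
Figs. 9 to 12 show an example of the record operations for a mid-record insertion (insert), together with the subsequent merging of adjacent pages.
For example, as shown in Fig. 9, when r bytes of data 30 are inserted into a BLOB record made up of n slices 14-1 to 14-n, starting at an offset X that falls partway through the i-th slice 14-i, the apparatus first positions to the target offset (locate) and determines whether any data will be pushed out.
【０１２３】
As shown in Fig. 10, once the offset at which insertion starts has been positioned partway through the i-th slice 14-i, r bytes of data 30 are to be inserted from offset position X. Because slice 14-i had no free space to begin with, the r bytes of data occupying the area to be inserted into are pushed out.
When data is pushed out in this way, a new slice area 31 is allocated, the pushed-out data is saved to the new slice area 31, and the new data is then inserted into the area freed in slice 14-i.
【０１２４】
As shown in Fig. 11, with the insertion of r bytes of data 30 from offset position X, r bytes of data from the i-th slice 14-i are pushed out into the new slice area 31 on a newly allocated page, and the data of slice 14-(i+1), which had been the (i+1)-th slice immediately after the original i-th slice 14-i, is merged behind the data in the newly created slice area.
【０１２５】
As shown in Fig. 12, if all the data of the former (i+1)-th slice 14-(i+1) can be merged into the new slice, the former (i+1)-th slice becomes empty and can be deleted, and the new slice becomes the new (i+1)-th slice 14-(i+1).
Merging with the adjacent page immediately after an insertion thus limits the growth in the number of slices caused by insertion and prevents access performance from degrading.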
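The insert-then-merge sequence of Figs. 9 to 12 can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the patent's procedure: slices are modelled as an in-memory list of `bytes` objects with a common capacity `page` (control information and next links are omitted), the target slice is located by a linear scan, and the function name `insert_bytes` is hypothetical.

```python
from typing import List


def insert_bytes(slices: List[bytes], offset: int, data: bytes, page: int) -> List[bytes]:
    """Insert `data` at `offset` of the record held in `slices` (modified in place)."""
    # Locate the slice containing the insertion point.
    pos = 0
    for i, s in enumerate(slices):
        if offset <= pos + len(s):
            break
        pos += len(s)
    else:
        raise IndexError("offset is beyond the end of the record")
    x = offset - pos

    # Place the new data; whatever no longer fits is pushed out of slice i.
    combined = slices[i][:x] + data + slices[i][x:]
    slices[i] = combined[:page]
    spill = combined[page:]

    # Pushed-out bytes go into newly allocated slice(s) right after slice i.
    new = [spill[k:k + page] for k in range(0, len(spill), page)]
    slices[i + 1:i + 1] = new

    # Merge the former next slice into the free space of the last new slice.
    if new:
        last = i + len(new)
        if last + 1 < len(slices):
            free = page - len(slices[last])
            nxt = slices[last + 1]
            slices[last] += nxt[:free]
            if nxt[free:]:
                slices[last + 1] = nxt[free:]
            else:
                del slices[last + 1]  # fully absorbed: the old slice is deleted
    return slices


print(insert_bytes([b"abcd", b"ef", b"ghij"], 2, b"XY", page=4))
# [b'abXY', b'cdef', b'ghij']
```

In the example, the two bytes pushed out of the full first slice land in a new slice, which then absorbs the whole of the old second slice, so the record still occupies three slices after the insertion.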
【０１２６】
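(h) Details of operation when the file management apparatus of this embodiment deletes from a record
Figs. 13 to 16 show an example of the record operations for a mid-record deletion (cut), together with the subsequent merging of adjacent pages.
As shown in Fig. 13, when r (= k + m + n) bytes are deleted from a BLOB record made up of n slices 14-1 to 14-n, starting at an offset X that falls partway through the i-th slice 14-i, the apparatus first positions to the offset at which deletion starts (locate), which lies partway through the i-th slice 14-i.
【０１２７】
Then, as shown in Fig. 14, data is deleted within slices 14-i to 14-(i+2) until the whole target range has been deleted: the k bytes from offset X to the end of slice 14-i, the m bytes from the beginning to the end of slice 14-(i+1), and the first n bytes of slice 14-(i+2).
【０１２８】
Next, as shown in Fig. 15, the empty slice 14-(i+1) is deleted, the link between slice 14-i and slice 14-(i+2) is reconnected, and the used part of slice 14-(i+2) (from n bytes after its beginning to its end) is repacked at the head of that slice.
【０１２９】
Finally, as shown in Fig. 16, data from the new (i+1)-th slice 14-(i+1) (the slice that was 14-(i+2) before the deletion) is merged into the space opened in the original i-th slice 14-i (the k bytes from offset X to its end), and the data remaining in slice 14-(i+1) is repacked at its head in the same way as for slice 14-(i+2) in Fig. 15.
【０１３０】
This limits the growth in the number of sparsely filled slices left in pages after deletion and keeps access performance from degrading.
If, during this merge, all the data of the new (i+1)-th slice 14-(i+1) fits into the i-th slice 14-i, the new (i+1)-th slice is deleted as well.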
【０１３１】
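(i) First mode of garbage collection
As detailed in Fig. 17, the BLOB record operation unit 23 of the file management apparatus of this embodiment comprises a sum calculation unit 23c, a ratio calculation unit 23d, a ratio comparison unit 23e, and a garbage collection control unit 23f. With these it can perform garbage collection control, which tidies up the empty areas of slices in database 14, when the tree index operation unit 24 has not generated a tree index.
【０１３２】
The sum calculation unit 23c calculates, after a size-changing operation on a BLOB record, the sum of the slice sizes of all slices making up that BLOB record.
The ratio calculation unit 23d calculates the ratio S/D of the sum S calculated by the sum calculation unit to the product D of the number n of slices 14-1 to 14-n and the page unit size P. The ratio comparison unit 23e compares the ratio S/D calculated by the ratio calculation unit 23d with a preset ratio α.
【０１３３】
When the comparison by the ratio comparison unit 23e shows that the ratio S/D calculated by the ratio calculation unit 23d is smaller than the preset ratio α, the garbage collection control unit 23f directs, via the data page operation unit 25, garbage collection over all slices 14-1 to 14-n in database 14.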
【０１３４】
このような構成により、本発明のファイル管理装置では、ツリーインデクス操作部２４においてツリーインデクスを生成しておらず、ｎｅｘｔリンクのみでＢＬＯＢデータが管理されている場合は、以下に示すようにガベージコレクションを行なっている。
ところで、個々のスライスが１ページ丸々のサイズ分のデータを持てば、ＢＬＯＢレコード全体のサイズを満足すべき最小スライス数とすることができる。例えば図１８に示すように、スライス１４−１〜１４−ｎにデータが格納されている場合においては、スライス１４−１〜１４−ｎの個数ｎとページ単位サイズＰの積Ｄ（＝ｎ×Ｐ）が、管理することができるＢＬＯＢレコードの最大サイズである。
【０１３５】
ここで、スライス１４−１〜１４−ｎに格納されているデータの総和が所定の格納率αを下回った場合は、図１８〜図２０に示すように、ＢＬＯＢレコードを構成する全てのスライスに対し、ガベージコレクション制御を行なうことができる。
即ち、図１８に示すように、ＢＬＯＢレコードに対するサイズ変更操作が行なわれた後に、総和演算部２３ｃでは、当該ＢＬＯＢレコードを構成する全てのスライス１４−１〜１４−ｎについてのスライスサイズの総和Ｓを、前述の式（１）に従って演算する。
【０１３６】
比演算部２３ｄでは、総和演算部２３ｃにおいて演算された総和Ｓに対するスライス１４−１〜１４−ｎの個数ｎとページ単位サイズＰの積Ｄ（＝ｎ×Ｐ）との比Ｓ／Ｄを演算する。
そして、比率比較部２３ｅでは、比演算部２３ｄにおいて演算された比Ｓ／Ｄと予め設定された比率α（例えばα＝０．５）と比較し、Ｓ／Ｄがαよりも小さい場合は、スライスに格納されるデータ量が、極度に過疎の状態にあると判定され、ガベージコレクション制御部２３ｆにより、全てのスライス１４−１〜１４−ｎについてガベージコレクションを行なうように制御する。
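[0137]
When garbage collection is performed on all slices 14-1 to 14-n, merging is performed between adjacent slices, that is, between a slice and the next slice reached through its next address, as between slices 14-1 to 14-3 in FIG. 19.
For example, the data S₂ of slice 14-2 and part S_3-1 of the data of slice 14-3 are merged into the free area of slice 14-1 (see ① in FIG. 19). When a slice becomes empty, as slice 14-2 does, that slice is deleted (see ② in FIG. 19).
[0138]
For a slice that still holds data, such as slice 14-3, the remaining data S_3-2 is repacked at the head of that slice 14-3 (see ③ in FIG. 19).
Thereafter, the same garbage collection as in ① to ③ above is performed on the following slices 14-4 to 14-n, leaving a total of m (< n) slices as shown in FIG. 20. This total slice count m is given by equation (3) below.
[0139]
m = CEIL(S / P), where CEIL is a function that rounds any fractional part up ... (3)
Therefore, when an operation by the BLOB record operation unit 23 changes the size of the whole long record, the empty areas of the slices in the database 14 can be tidied up and the number of slices reduced, with the advantage that the area of the database 14 is used effectively.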
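As a non-limiting illustration, the determination and the merge sequence ① to ③ of the first garbage collection mode might be sketched as follows. The sketch is in Python; the names `needs_gc` and `compact`, and the use of an in-memory list of byte strings in place of slices chained by next links, are assumptions made for illustration and do not appear in the embodiment.

```python
def needs_gc(slice_sizes, page_size, alpha=0.5):
    """Return True when the fill ratio S/D is below the preset ratio alpha."""
    n = len(slice_sizes)
    if n == 0:
        return False
    s = sum(slice_sizes)      # S: total bytes held by all slices
    d = n * page_size         # D = n x P: capacity of the n slices
    return s / d < alpha


def compact(slices, page_size):
    """Merge adjacent slices front to back and return the surviving slices."""
    out = [bytearray(s) for s in slices]
    i = 0
    while i < len(out) - 1:
        free = page_size - len(out[i])
        nxt = out[i + 1]
        out[i] += nxt[:free]   # (1) fill the free area from the next slice
        del nxt[:free]         # (3) what remains now starts at the head
        if not nxt:
            del out[i + 1]     # (2) an emptied slice is deleted
        else:
            i += 1             # the current slice is full; move on
    return [bytes(s) for s in out]
```

With a page size of 4 bytes, four slices holding 2, 1, 3, and 1 bytes give S/D = 7/16, which is below α = 0.5, and compaction leaves CEIL(7/4) = 2 slices, in agreement with equation (3).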
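[0140]
(j) Description of the second garbage collection mode
The first garbage collection mode detailed in (i) above concerned the garbage collection control performed when the tree index operation unit 24 has not generated a tree index. As a second garbage collection mode, garbage collection control can also be performed when a tree index has been generated.
[0141]
In this case too, as shown in FIG. 17 referred to in (i) above, the BLOB record operation unit 23 comprises the sum calculation unit 23c, the ratio calculation unit 23d, the ratio comparison unit 23e, and the garbage collection control unit 23f, but their functions differ from those in (i).
That is, after the maintenance unit 23b has corrected an index, the sum calculation unit 23c calculates the sum of the slice sizes within the corrected leaf index page, and the ratio calculation unit 23d calculates the ratio of the sum calculated by the sum calculation unit 23c to the product of the number of slices and the page unit size.
[0142]
The ratio comparison unit 23e compares the ratio calculated by the ratio calculation unit 23d with a preset ratio. When this comparison shows the calculated ratio to be smaller than the preset ratio, the garbage collection control unit 23f controls so that garbage collection is performed on the slices corresponding to the indexes in the leaf index page.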
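[0143]
With this configuration, the garbage collection operation of the file management apparatus of the present invention for a BLOB record for which the tree index operation unit 24 has constructed a tree index is described below with reference to FIGS. 21 to 23.
When slices 14-1 to 14-N have been modified and the maintenance unit 23b has corrected the index accordingly, resulting in the index 15 shown in FIG. 21 and the slices 14-1 to 14-N corresponding to it, whether to perform garbage collection is decided as follows.
[0144]
The sum calculation unit 23c calculates the sum S of the slice sizes within the corrected leaf index page 15, and the ratio calculation unit 23d calculates the ratio S/D of the sum S to the product D (= N × P) of the number N of slices in the leaf index page 15 and the slice page unit size P.
[0145]
The ratio comparison unit 23e compares the ratio S/D calculated by the ratio calculation unit 23d with the preset ratio α (for example, α = 0.5). When the comparison shows S/D to be smaller than α, the garbage collection control unit 23f controls so that garbage collection is performed on the slices 14-1 to 14-N corresponding to the indexes in the leaf index page 15.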
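[0146]
As shown in FIG. 22, when garbage collection is performed on the slices 14-1 to 14-N in the leaf index page 15, merging is again performed between adjacent slices, that is, between a slice and the next slice reached through its next address, as between slices 14-1 to 14-3 in FIG. 19 described above.
[0147]
The maintenance unit 23b maintains the index page each time such a merge between adjacent slices is performed. For example, when a slice is deleted, the corresponding index is deleted too.
Specifically, when the data S₂ of slice 14-2 is merged into the free area of slice 14-1 (see ① in FIG. 22) and the emptied slice 14-2 is deleted (see ② in FIG. 22), the index 15A in the corresponding leaf index page 15 (the area in which the size S₂ is stored) is deleted and the indexes are repacked (see ③ in FIG. 22).
[0148]
Thereafter, the same garbage collection as in ① to ③ of FIG. 22 is performed on the following slices 14-3 to 14-N, leaving a total of m (≤ N) slices as shown in FIG. 23.
In this way, the range of slices subject to garbage collection can be limited to the slices 14-1 to 14-N within one leaf index page 15 of the tree index.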
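[0149]
Therefore, when an operation by the BLOB record operation unit 23 changes the size of the whole long record, the empty areas of the slices in the database 14 can be tidied up and the number of slices reduced, with the advantage that the area of the database 14 is used effectively.
In addition, when many slices are candidates for repacking, as is the case when the tree index operation unit 24 has generated a tree index, limiting the range of the slice group subject to garbage collection also suppresses the data input/output load.
[0150]
Moreover, the total size of the slices 14-1 to 14-N is unchanged by garbage collection, so the total managed size is unchanged as well. Even when the tree index has multiple levels, this simplifies maintenance of the corresponding upper index page when a leaf index page is maintained.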
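The second mode might be sketched in the same way. In the Python sketch below, a leaf index page is modeled as a list of `[size, rid]` entries and the slices as a dictionary keyed by record identifier; these structures and the name `gc_leaf_page` are assumptions for illustration only. The index entry is updated after every merge, and the entry of a deleted slice is removed, so the sum of the sizes in the leaf page stays the same and an upper index page needs no change.

```python
def gc_leaf_page(leaf, store, page_size, alpha=0.5):
    """Compact the slices under one leaf index page, maintaining its entries.

    leaf  : list of [size, rid] entries in offset order (hypothetical layout)
    store : dict mapping rid -> bytearray holding the slice data
    Returns True if garbage collection was performed.
    """
    total = sum(size for size, _ in leaf)
    if not leaf or total / (len(leaf) * page_size) >= alpha:
        return False                      # S/D is not below alpha: nothing to do
    i = 0
    while i < len(leaf) - 1:
        cur = store[leaf[i][1]]
        nxt = store[leaf[i + 1][1]]
        free = page_size - len(cur)
        cur += nxt[:free]                 # merge into the free area (in place)
        del nxt[:free]                    # repack the remainder at the head
        leaf[i][0] = len(cur)             # index maintenance after each merge
        leaf[i + 1][0] = len(nxt)
        if not nxt:
            del store[leaf[i + 1][1]]     # delete the emptied slice
            del leaf[i + 1]               # and its index entry
        else:
            i += 1
    return True
```

Because only the slices listed in one leaf index page are touched, the number of pages read and written per collection is bounded by the capacity of that page.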
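[0151]
(k) Others
In (i) and (j) above, whether to perform garbage collection is decided from the storage ratio relative to the ideal size of the slices. The present invention is not limited to this: the decision can instead be based on the ideal number of slices, and the same effects as in (i) and (j) are obtained.
[0152]
In this case, letting S be the size currently held (the size of the whole BLOB record) and P be the slice page unit size, the ideal, that is minimum, number of slices M is set as in equation (4) below.
M = CEIL(S / P), where CEIL is a function that rounds any fractional part up ... (4)
Using the ideal slice count M obtained from equation (4), the ratio N/M of the actual slice count N to the ideal slice count M is compared with a predetermined threshold α, and whether to perform garbage collection is decided from the result of this comparison.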
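Equation (4) and the count-based decision might be sketched as follows. The text does not state in which direction the ratio N/M is compared with the threshold; the sketch assumes that garbage collection is triggered when N/M exceeds the threshold, and the function names and the default threshold of 2.0 are likewise assumptions for illustration.

```python
def ideal_slice_count(total_size, page_size):
    """M = CEIL(S / P), computed with integer arithmetic."""
    return (total_size + page_size - 1) // page_size


def should_gc_by_count(n_slices, total_size, page_size, threshold=2.0):
    """Assumed rule: collect when the actual count N exceeds threshold x M."""
    m = ideal_slice_count(total_size, page_size)
    if m == 0:
        return n_slices > 0     # an empty record needs no slices at all
    return n_slices / m > threshold
```

For S = 7 bytes and P = 4 bytes the ideal count is M = 2, so a record spread over N = 4 slices has N/M = 2.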
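[0153]
[Effects of the Invention]
As described in detail above, according to the present invention, the tree index operation means generates an index for a long record and the long record is accessed in the database through it, so that the restriction on the maximum size of the long records to be managed is eliminated and partial access can be performed at high speed.
[0154]
Also, according to the present invention, even if the size of the whole record changes as a result of an operation, the re-registration operation unit can register the record as a normal record or as a long record according to that size. The application program therefore need not be aware of the distinction between normal records and BLOB records, and the file management apparatus can function as a consistent interface that does not distinguish data lengths.
[0155]
Furthermore, according to the present invention, the record distinction recording unit of the data page operation means records, for each record in a page, whether it is a long record or a normal record, and processing can branch between normal record operations and long record operations based on this distinction, giving the same advantage as above.
[0156]
Also, according to the present invention, the access start point determination control unit of the long record operation means can determine the start point of the slice to be operated on, from the offset into the whole record, either by sequentially navigating the address links between slices or by using the tree index. An access method suited to the scale of the record and to the access pattern can thus be selected, and the file management apparatus can function as an independent interface that frees the application program from any awareness of the database-side environment.
[0157]
Furthermore, according to the present invention, the tree index operation means can dynamically grow or shrink the tree index, updating, creating, or deleting it according to how the size of the whole long record compares with a predetermined threshold size or according to whether partial access is needed. The area of the database is thus used effectively while the restriction on the maximum size of the long records to be managed is eliminated and partial access is performed at high speed.
[0158]
Also, according to the present invention, when an operation changes the size of the whole long record, the long record operation means can tidy up the empty areas of the slices in the database and reduce the number of slices, so that, as above, the area of the database is used effectively.
[0159]
Furthermore, according to the present invention, when garbage collection is performed by the long record operation means and many slices are candidates for repacking, limiting the range of the slice group subject to garbage collection suppresses the input/output load.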
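[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the principle of the first invention.
[FIG. 2] A block diagram showing the principle of the second invention.
[FIG. 3] A block diagram showing a system to which the file management apparatus according to this embodiment is applied.
[FIG. 4] A flowchart for explaining the operation when the file management apparatus according to this embodiment is applied to an application program that runs a document editing system.
[FIG. 5] A block diagram showing details of the database to which the file management apparatus according to this embodiment is applied.
[FIG. 6] A diagram showing the relationship between slices and indexes in the database according to this embodiment.
[FIG. 7] A diagram for explaining state transitions of the database according to this embodiment.
[FIG. 8] A flowchart for explaining in detail the operation when a record is newly created in the database according to this embodiment.
[FIG. 9] A diagram showing an example of record operations when inserting into the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 10] A diagram showing an example of record operations when inserting into the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 11] A diagram showing an example of record operations when inserting into the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 12] A diagram showing an example of record operations when inserting into the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 13] A diagram showing an example of record operations when deleting from the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 14] A diagram showing an example of record operations when deleting from the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 15] A diagram showing an example of record operations when deleting from the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 16] A diagram showing an example of record operations when deleting from the middle of a record in the database according to this embodiment, together with the subsequent merging of adjacent pages.
[FIG. 17] A block diagram showing in detail the BLOB record operation unit of the file management apparatus according to this embodiment.
[FIG. 18] A diagram explaining garbage collection control in the file management apparatus according to this embodiment when no tree index has been generated.
[FIG. 19] A diagram explaining garbage collection control in the file management apparatus according to this embodiment when no tree index has been generated.
[FIG. 20] A diagram explaining garbage collection control in the file management apparatus according to this embodiment when no tree index has been generated.
[FIG. 21] A diagram explaining garbage collection control for a BLOB record when a tree index has been constructed.
[FIG. 22] A diagram explaining garbage collection control for a BLOB record when a tree index has been constructed.
[FIG. 23] A diagram explaining garbage collection control for a BLOB record when a tree index has been constructed.
[FIG. 24] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 25] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 26] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 27] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 28] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 29] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 30] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 31] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 32] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 33] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 34] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 35] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 36] A diagram showing a programming example of a header file for realizing the file management apparatus according to this embodiment.
[FIG. 37] A diagram showing a BLOB record access method used by a general file management apparatus.
[FIG. 38] A diagram showing a BLOB record access method used by a general file management apparatus.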
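[Explanation of Reference Numerals]
1a, 1b File management apparatus
2a, 2b Record operation means
3a, 3b Long record operation means
3b-1 Access start point determination control unit
3b-2 Maintenance unit
4a Normal record operation means
4b Data page operation means
5a Index operation means
5b Tree index operation means
6a, 6b Database
11 Display device
12 Keyboard
13 Central processing unit / main storage device
14 Database
14-1 to 14-n, 14-N Slice
14a Record entry
14b Normal record
14c BLOB management information
14d Index page
14d-1 to 14d-3 Index group
14e Data page
15, 15A Index
21 Application processing unit
22 Record operation unit (record operation means)
22a Operation request unit
22b Re-registration operation unit
23 BLOB record operation unit (long record operation means)
23a Access start point determination control unit
23b Maintenance unit
23c Sum calculation unit
23d Ratio calculation unit
23e Ratio comparison unit
23f Garbage collection control unit
24 Tree index operation unit
24a Index page operation unit
24b Inter-index-page operation unit
25 Data page operation unit (data page operation means)
25a Record distinction recording unit
26 Page buffer
30 Insertion data
31 New slice area
101 Record
101-1 to 101-n Slice
102 Page
[0002]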
[Field of Industrial Application]
The present invention relates to a file management method and a file management apparatus for managing files registered in a database, and in particular to a file management method and a file management apparatus suitable for accessing data that does not fit within a page, the unit of data input/output operations in the database.
[0003]
[Prior art]
Conventionally, a workstation that comprises, for example, a plurality of terminals and a database, and on which application programs are run, uses a system such as a DBMS (Database Management System) to manage the files registered in the database.
[0004]
That is, the DBMS performs access processing on the database, such as partial reads and updates of files composed of basic data types (atomic data types) such as numerical values and character strings, and size changes accompanying the insertion or deletion of data.
There is also a growing need on such workstations to represent multimedia data, such as documents, image data, and variable-length coordinate arrays in CAD/CAE, in a database.
[0005]
Multimedia data is variable in size, and no fixed semantics (such as magnitude comparison) can be assumed for it. Many RDBMSs (relational DBMSs) therefore represent multimedia data with a data type called BLOB (binary large object), which holds a variable-length bit/byte string per record (tuple, row), for example, and thereby distinguish it from the basic data types described above.
[0006]
Here, the record refers to a unit of data when data in the database is exchanged between application programs.
In the case where multimedia data is handled as a BLOB corresponding to the data items (fields, columns) constituting the record, it can be realized by referring to the address of the BLOB record.
[0007]
A BLOB may also be referred to in English as "long item data", "spatial data", or "bulk data".
As described above, when multimedia data is registered in a database as a file, the record size is variable and may be large enough to span multiple physical pages. In such cases, the multimedia data cannot be registered in the database in the same way as basic data.
[0008]
Moreover, because a record composed of multimedia data is large as a whole, not all of its data can be kept resident in main memory. When a workstation is in operation, application programs and queries therefore need to be able to access only part of a whole record in the database.
[0009]
One conceivable approach, shown in FIG. 37, is for the file management apparatus to divide a large record (BLOB data) 101 spanning multiple physical pages into page-sized pieces, store the pieces in the database, and link the stored fragments 101-1 to 101-n by address.
[0010]
When BLOB data is registered in the database as shown in FIG. 37, the file management apparatus performs partial access to the record by navigating sequentially from the first fragment along the address links.
However, as the amount of data to be stored in databases has grown in recent years, the number of fragments spread across physical pages has grown as well. Access time to the database therefore increases, and operations such as skipping along the time axis of image data, or instantaneous access to part of a CAE array, cannot be realized.
[0011]
To address this, as shown in FIG. 38 for example, the file management apparatus can separately register, for each record composed of pieces (slices) 101-1 to 101-n divided in units of pages, a page 102 that stores list information (directory information) consisting of the size and address of each slice 101-1 to 101-n, thereby reducing the time needed for partial access to BLOB data in the database.
[0012]
That is, when the pages in which slices 101-1 to 101-n are stored are partially accessed, the time for partial access is reduced by referring to the pointers 102-1 to 102-n in the page 102 that holds the directory information.
[0013]
[Problems to be solved by the invention]
However, in the database access method of FIG. 38, the directory information stored in page 102 must itself fit in one physical page, which limits the maximum size of an entire BLOB record.
[0014]
Furthermore, a directory page is always required, even for small BLOB data spanning only a few pages. That is, for small BLOB data that merely fails to fit within one page, the access method of FIG. 38 incurs spatial and temporal overhead for allocating and maintaining the directory information area.
[0015]
In addition, for the user's convenience, application programs need an interface that does not require them to be aware of the database-side environment.
The present invention was devised in view of these problems, and its object is to provide a file management method and a file management apparatus that impose no restriction on record size and that allow partial access to be performed at high speed.
[0016]
[Means for Solving the Problems]
FIG. 1 is a block diagram showing the principle of the first invention. In FIG. 1, reference numeral 1a denotes a file management apparatus that manages files registered in the database 6a and comprises a record operation means 2a, a long record operation means 3a, a normal record operation means 4a, and an index operation means 5a.
[0017]
The record operation means 2a accepts various requests for records. A record here is the unit of data in which data in the database 6a is exchanged with the request source on the record operation means 2a side.
When the long record operation means 3a receives from the record operation means 2a a request for a long record, that is, a record that does not fit within the size of one page and is composed of a plurality of slices divided in page units, it decomposes the long record into slices.
[0018]
The normal record operation means 4a, in response to a request from the record operation means 2a for a record within the size of one page, or a request from the long record operation means 3a for a long record, performs access within one record, or partial access within the record, with the database 6a.
The index operation means 5a generates or updates an index for a long record and accesses the long record in the database 6a.
More specifically, when the record operation means 2a receives a request regarding a record within the size of one page, the normal record operation means 4a accesses the database 6a within that one record.
When the record operation means 2a receives a registration request for a new long record that does not fit within the size of one page, the long record operation means 3a decomposes the new long record into slices, each of a size corresponding to one page, and the normal record operation means 4a registers them in the database 6a.
When the record operation means 2a receives a request for a long record that is registered in the database 6a, does not fit within the size of one page, and is composed of a plurality of slices, the long record operation means 3a decomposes the long record into slices. Then either the normal record operation means 4a performs partial access within the decomposed long record with the database 6a, or the index operation means 5a generates or updates the index of the long record; in both cases the long record is accessed in the database 6a in units of slices.
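As a non-limiting illustration, the registration path of the first invention might be sketched as follows in Python. The function names and the tuple returned by `register` are assumptions made for illustration; the specification defines the means 2a to 4a functionally and does not prescribe this interface.

```python
def split_into_slices(record, page_size):
    """Decompose a long record into slices of at most one page each."""
    return [record[i:i + page_size] for i in range(0, len(record), page_size)]


def register(record, page_size):
    """Classify a new record and return (kind, pieces to store).

    A record that fits in one page is stored whole as a normal record;
    anything larger is stored as a long record made of page-sized slices.
    """
    if len(record) <= page_size:
        return "normal", [record]
    return "long", split_into_slices(record, page_size)
```

Only the last slice of a newly registered long record can be shorter than one page; partially filled slices elsewhere arise later, through insertions and deletions.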
[0019]
FIG. 2 is a block diagram showing the principle of the second invention. In FIG. 2, reference numeral 1b denotes a file management apparatus that manages files registered in the database 6b and comprises a record operation means 2b, a long record operation means 3b, a data page operation means 4b, and a tree index operation means 5b.
[0020]
The record operation means 2b accepts record operation requests, for a record or for partial data within a record, of any of the following kinds: new creation, deletion, fetch, insertion in the middle, deletion in the middle, update, or read. Here too, a record is the unit of data in which data in the database 6b is exchanged with the request source on the record operation means 2b side.
[0021]
When the data targeted by a request received by the record operation means 2b is a normal record small enough to fit in one page, the data page operation means 4b treats the target, whether a new record or an existing record identified by a record identifier registered in the database 6b, as a normal record that fits in one page, performs the various record operations, and performs partial access within that one normal record.
[0022]
A page is the minimum unit of physical input/output operations on data in the database.
When the data targeted by a request received by the record operation means 2b is a long record, that is, a record that does not fit in one page and is composed of a plurality of slices divided in page units, the long record operation means 3b decomposes the new record, or the long record registered in the database 6b, into the slices to be operated on and then requests the various record operations from the data page operation means 4b slice by slice. It includes an access start point determination control unit 3b-1 and a maintenance unit 3b-2, described later.
[0023]
In the tree index operation means 5b, the lowest, leaf-level indexes store the size of each slice and the record identifier of that slice, and each index in a higher layer stores the sum of the sizes represented by the indexes below it. The sum of the sizes represented by the root-level indexes equals the size of the entire long record, and the order of the leaf-level indexes matches the offset order of the corresponding slices.
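The index structure just described might be sketched as follows in Python. The representation is an assumption for illustration: every index page is a list of `(size, child)` entries, the fan-out per page is a free parameter, and record identifiers are simply the positions of the slices.

```python
def chunk(items, fanout):
    """Group index entries into pages holding at most `fanout` entries."""
    return [items[i:i + fanout] for i in range(0, len(items), fanout)]


def build_tree_index(slice_sizes, fanout=4):
    """Build index levels bottom-up from a non-empty list of slice sizes.

    levels[0] holds the leaf pages, whose entries are (slice size, rid);
    each higher level holds (sum of sizes in a child page, child page number);
    levels[-1] holds the single root page.
    """
    entries = [(size, rid) for rid, size in enumerate(slice_sizes)]
    levels = [chunk(entries, fanout)]
    while len(levels[-1]) > 1:
        below = levels[-1]
        entries = [(sum(s for s, _ in page), idx) for idx, page in enumerate(below)]
        levels.append(chunk(entries, fanout))
    return levels
```

For slice sizes 4, 4, 3, 4, 2 and a fan-out of 2, the root page holds the entries (15, 0) and (2, 1), whose sizes add up to the record size of 17 bytes.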
[0024]
When partial access is performed on an existing long record, the access start point determination control unit 3b-1 of the long record operation means 3b determines the start point of the slice to be operated on, from the offset into the whole record, either by sequentially navigating the address links between slices or by looking it up in the tree index.
[0025]
The maintenance unit 3b-2 requests the tree index operation means 5b to maintain the tree index when slices are added or deleted.
The data page operation means 4b may have a record distinction recording unit that records, for each record in a page, whether it is a normal record or a long record. The record operation means 2b may then have an operation request unit that, based on the distinction recorded for the record identifier of an existing record, requests the record operation from the data page operation means 4b for a normal record and from the long record operation means 3b for a long record.
[0026]
The record operation means 2b may also have a re-registration operation unit. If an operation on an existing normal record would make the whole record too large to fit in one page, this unit re-registers the normal record as a long record and then requests the record operation from the long record operation means 3b. Conversely, if the whole long record fits in one page after an operation on an existing long record has completed, it re-registers that long record as a normal record.
[0027]
Further, when determining the start point of the slice for partial access to an existing long record, the access start point determination control unit 3b-1 may use the tree index operation means 5b if a tree index has been constructed and, if not, navigate the address links held by the slices themselves to determine the start point.
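The choice between the two methods might be sketched as follows in Python. Here a slice chain is modeled as a dictionary mapping a record identifier to `(size, next_rid)`, and the tree index is represented only by an optional lookup callable; both are assumptions for illustration.

```python
def start_by_links(slices, head, offset):
    """Walk the next links from the head slice until the offset is reached.

    slices maps rid -> (size, next_rid); returns (rid, offset inside slice).
    """
    rid = head
    while rid is not None:
        size, next_rid = slices[rid]
        if offset < size:
            return rid, offset
        offset -= size
        rid = next_rid
    raise IndexError("offset is beyond the end of the record")


def determine_start(slices, head, offset, index_lookup=None):
    """Use the tree index when one exists, otherwise navigate the links."""
    if index_lookup is not None:
        return index_lookup(offset)
    return start_by_links(slices, head, offset)
```

Link navigation visits every slice before the target, which is acceptable for a record of a few pages; the tree index avoids that walk for large records.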
[0028]
When the access start point is determined by the tree index operation means 5b under the control of the access start point determination control unit 3b-1, the maintenance unit 3b-2 may be configured to decide whether to create or delete the tree index based on how the size of the whole long record compares with a predetermined threshold size, or based on whether partial access is needed; it may also be configured to create the tree index at the moment partial access occurs.
[0029]
The maintenance unit 3b-2 of the long record operation means 3b may also be configured to decide whether to create or delete the tree index based on both the comparison with the predetermined threshold size and the need for partial access.
When searching for a slice from an offset into the whole long record, the tree index operation means 5b may use an index page operation unit. Starting at the highest, root index page and working down to the lowest, leaf index page, this unit accumulates the sizes of the indexes in each page to find the index whose cumulative size matches or immediately precedes the desired offset, continues the search in the lower-layer index page under that index, and finally obtains, from the index in the leaf index page, the record identifier of the slice containing the offset.
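The top-down search might be sketched as follows in Python. The layout is an assumption for illustration: `levels[0]` holds the leaf pages and `levels[-1]` the single root page, each page is a list of `(size, child)` entries, and `child` is a page number in the level below or, at the leaf level, the record identifier of a slice.

```python
def find_slice(levels, offset):
    """Return (rid, offset inside slice) for a byte offset into the record."""
    page = levels[-1][0]                       # start at the root index page
    for depth in range(len(levels) - 1, -1, -1):
        for size, child in page:
            if offset < size:                  # the offset falls under this entry
                break
            offset -= size                     # skip the bytes this entry covers
        else:
            raise IndexError("offset is beyond the end of the record")
        if depth == 0:
            return child, offset               # leaf level: child is the rid
        page = levels[depth - 1][child]        # descend one level
```

The number of index pages visited equals the height of the tree, whereas navigating next links visits every slice that precedes the target.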
[0030]
The long record operation means 3b may have an inter-index-page operation unit. When a partial insertion is made into a long record and the storage page of the slice newly generated by the insert operation has free space, this unit merges the newly generated slice with the slice immediately following it; when a partial deletion is made from a long record, it merges the slices containing the start point and the end point of the delete operation.
[0031]
The long record operation means 3b may also have: a sum calculation unit that, after a size change operation on a long record, calculates the sum of the slice sizes over all slices constituting the long record; a ratio calculation unit that calculates the ratio of that sum to the product of the number of slices and the size of one page; a ratio comparison unit that compares the calculated ratio with a preset ratio; and a garbage collection control unit that, when the comparison shows the calculated ratio to be smaller than the preset ratio, controls garbage collection over all slices by sequentially merging adjacent slices so that the free area of each slice is filled with data from the slice that follows it.
[0032]
Alternatively, the long record operation means 3b may have: a sum calculation unit that, after the maintenance unit 3b-2 has corrected an index, calculates the sum of the slice sizes within the corrected leaf index page; a ratio calculation unit that calculates the ratio of that sum to the product of the number of slices and the size of one page; a ratio comparison unit that compares the calculated ratio with a preset ratio; and a garbage collection control unit that, when the comparison shows the calculated ratio to be smaller than the preset ratio, controls garbage collection over the slices corresponding to the indexes in that leaf index page by sequentially merging adjacent slices so that the free area of each slice is filled with data from the slice that follows it.
[0033]
[Action]
In the first invention described above, as shown in FIG. 1, when the record operation means 2a receives a request relating to a record within the size of one page, the normal record operation means 4a accesses the database 6a within that one record.
When the record operation means 2a receives a registration request for a new long record that does not fit within the size of one page, the long record operation means 3a decomposes the new long record into slices divided in page units, and the normal record operation means 4a registers them in the database 6a.
When the record operation means 2a receives a request for a long record that is registered in the database 6a, does not fit within the size of one page, and is composed of a plurality of slices, the long record operation means 3a decomposes the long record into slices.
[0034]
Then either the normal record operation means 4a performs partial access to the database 6a within the long record decomposed into slices, or the index operation means 5a generates or updates the index of the long record; in both cases the long record is accessed in the database 6a in units of slices.
The file management apparatus 1a can thereby manage the files registered in the database 6a.
[0035]
In the file management apparatus 1b of the second invention described above, the record operation means 2b accepts record operation requests, for a record or for partial data within a record: new creation, deletion, fetch, and insertion, deletion, update, or read of partial data within the record.
When the data targeted by a request received by the record operation means 2b is a normal record small enough to fit in one page, the data page operation means 4b treats the target, whether a new record or an existing record identified by a record identifier registered in the database 6b, as a normal record that fits in one page, performs the various record operations, and performs partial access within that one normal record.
[0036]
When the data targeted by a request received by the record operation means 2b is a long record, that is, a record that does not fit in one page and is composed of a plurality of slices divided in page units, the long record operation means 3b decomposes the new record, or the long record registered in the database 6b, into the slices to be operated on and then requests the various record operations from the data page operation means 4b in units of slices.
[0037]
In the tree index operation means 5b, the lowest, leaf-level indexes store the size of each slice and the record identifier of that slice, and each index in a higher layer stores the sum of the sizes represented by the indexes below it. The sum of the sizes represented by the highest, root-level indexes equals the size of the entire long record, and the order of the leaf-level indexes matches the offset order of the corresponding slices.
[0038]
When partial access is performed on an existing long record, the access start point determination control unit 3b-1 of the long record operation means 3b determines the start point of the slice to be operated on, from the offset into the whole record, either by sequentially navigating the address links between slices or by looking it up in the tree index. The maintenance unit 3b-2 requests the tree index operation means 5b to maintain the tree index when slices are added or deleted.
[0039]
The file management apparatus 1b can thereby manage the files registered in the database 6b according to the record targeted by the operation request.
In the data page operation means 4b, the record distinction recording unit records, for each record in a page, whether it is a normal record or a long record. Based on the distinction recorded for the record identifier of an existing record, the operation request unit of the record operation means 2b requests the record operation from the data page operation means 4b for a normal record and from the long record operation means 3b for a long record. The record operation means 2b can thus request a record operation appropriate to the kind of record identified by the record identifier.
[0040]
Furthermore, if an operation on an existing normal record would make the whole record too large to fit in one page, the re-registration operation unit of the record operation means 2b re-registers the normal record as a long record and then requests the record operation from the long record operation means 3b. Conversely, if the whole long record fits in one page after an operation on an existing long record has completed, the unit can re-register that long record as a normal record.
[0041]
When determining the start point of the slice for partial access to an existing long record, the access start point determination control unit 3b-1 of the long record operation means 3b can use the tree index operation means 5b if a tree index has been constructed and, if not, navigate the address links held by the slices themselves to determine the start point.
[0042]
In this case, when the access start point is determined by the tree index operation means 5b under the control of the access start point determination control unit 3b-1, the maintenance unit 3b-2 can decide whether to create or delete the tree index based on how the size of the whole long record compares with a predetermined threshold size, or based on whether partial access is needed; it can also create the tree index at the moment partial access occurs.
[0043]
The maintenance unit 3b-2 can also decide whether to create or delete the tree index based on both the comparison with the predetermined threshold size and the need for partial access.
When searching for a slice from an offset into the whole long record, the index page operation unit of the tree index operation means 5b starts at the highest, root index page and works down to the lowest, leaf index page, accumulating the sizes of the indexes in each page to find the index whose cumulative size matches or immediately precedes the desired offset. It continues the search in the lower-layer index page under that index and finally obtains, from the index in the leaf index page, the record identifier of the slice containing the offset.
[0044]
When a partial insertion is made into a long record and the storage page of the slice newly generated by the insert operation has free space, the inter-index-page operation unit of the long record operation means 3b can merge the newly generated slice with the slice immediately following it. When a partial deletion is made from a long record, it can merge the slices containing the start point and the end point of the delete operation.
[0045]
After a size change operation on a long record, the sum calculation unit calculates the sum of the slice sizes over all slices constituting the record, and the ratio calculation unit calculates the ratio of that sum to the product of the number of slices and the size of one page. The ratio comparison unit compares the calculated ratio with a preset ratio. When the comparison shows the calculated ratio to be smaller than the preset ratio, the garbage collection control unit can control garbage collection over all slices by sequentially merging adjacent slices so that the free area of each slice is filled with data from the slice that follows it.
[0046]
Likewise, after the maintenance unit 3b-2 has corrected an index, the sum calculation unit calculates the sum of the slice sizes within the corrected leaf index page, and the ratio calculation unit calculates the ratio of that sum to the product of the number of slices and the size of one page. The ratio comparison unit compares the calculated ratio with a preset ratio. When the comparison shows the calculated ratio to be smaller than the preset ratio, the garbage collection control unit can control garbage collection over the slices corresponding to the indexes in that leaf index page by sequentially merging adjacent slices so that the free area of each slice is filled with data from the slice that follows it.
[0047]
【Example】
Embodiments of the present invention will be described below with reference to the drawings.
(A) Outline of file management apparatus according to one embodiment of the present invention
FIG. 3 is a block diagram showing a system to which the file management apparatus according to the present embodiment is applied. The system shown in FIG. 3 is composed of a plurality of terminals and a database, and functions as a workstation or the like on which an application program, for example one that performs document editing processing for a terminal user, is started. The file management apparatus manages the files registered in the database.
[0048]
Here, in FIG. 3, 11 is a display device for displaying the processing contents and the like by the terminal user, and 12 is a keyboard for inputting data, commands and the like from the terminal user.
Reference numeral 13 denotes a central processing unit / main storage device. It starts an application program, such as document editing processing, based on input from the keyboard 12 or on data stored in the database 14, reads from the database 14 the data used in executing the program, and has the function of the file management apparatus according to the present embodiment.
[0049]
The database 14 stores, file by file, various data for the application programs started in the central processing unit / main storage device 13. In this database 14, physical input/output operations on data are performed page by page.
Here, the central processing unit / main storage device 13 includes an application processing unit 21, a record operation unit 22, a BLOB record operation unit 23, a tree index operation unit 24, a data page operation unit 25, and a page buffer 26, each of which can be realized by software.
[0050]
The application processing unit 21 executes processing by an application program. In the present embodiment, the application processing unit 21 executes an application program for document editing processing.
The record operation unit (record operation means) 22 receives various operation requests, either for a whole record, which is the unit of data when data in the database 14 is exchanged between application programs, or for partial data within a record. It includes an operation request unit 22a and a re-registration operation unit 22b.
[0051]
For example, the record operation unit 22 accepts new creation (create), deletion (remove), and retrieval (fetch) as operation requests for a whole record, and accepts mid-record insertion (insert), mid-record deletion (cut), update (update), and reading (read) as operation requests for partial data in a record.
[0052]
In addition, the operation request unit 22a of the record operation unit 22 refers to the record distinction that the data page operation unit 25 provides for the existing record indicated by the record identifier (Record IDentifier, RID). When the record is a BLOB record (long record), it requests the BLOB record operation unit 23 to perform the record operation.
[0053]
Furthermore, when an operation on an existing normal record makes the entire record too large to fit on one page, the re-registration operation unit 22b re-registers the normal record as a long record and then requests the record operation from the BLOB record operation unit 23. Conversely, when the entire long record fits on one page after an operation on an existing long record is completed, the re-registration operation unit 22b re-registers the resulting long record as a normal record.
[0054]
A data page operation unit (normal record operation means, data page operation means) 25 operates on data pages, which store data page by page. Specifically, when the data targeted by the request received by the record operation unit 22 is a normal record that fits within one page, the data page operation unit 25 performs the various record operations on a new record, or on an existing record registered on the database 14 and indicated by a record identifier, as a normal record that fits on one page. It also performs partial access within the normal record, within the range of one page.
[0055]
Further, the data page operation unit 25 includes a record distinction recording unit 25a that records the distinction between the normal record and the BLOB record of each record in the page as an RID.
Further, the BLOB record operation unit (long record operation means) 23 operates when the data targeted by the request accepted by the record operation unit 22 does not fit within one page, that is, when the data is a BLOB (binary large objects) record, a long, large record composed of a plurality of slices. It decomposes a new record, or a BLOB record registered on the database 14, into the slices to be operated on and then requests the various record operations from the data page operation unit 25. It includes an access start point determination control unit 23a and a maintenance unit 23b.
[0056]
As will be described later with reference to FIG. 17, the BLOB record operation unit 23 also includes a sum calculation unit 23c, a ratio calculation unit 23d, a ratio comparison unit 23e, and a garbage collection control unit 23f, so that empty areas can also be consolidated.
Here, when partial access to an existing BLOB record is performed, the access start point determination control unit 23a determines the start point of the slice to be operated on, from the offset with respect to the entire record, either by sequentially navigating the address links between slices or by obtaining it from the tree index information of the tree index operation unit 24 described later.
[0057]
The maintenance unit 23b requests the tree index operation unit 24 to maintain the tree index information as slices are added and deleted.
Further, the tree index operation unit 24 constructs a tree index and stores it in index pages in the database 14, and includes an index page operation unit 24a and an inter-index page operation unit 24b.
[0058]
Further, when a partial access to an existing BLOB record is performed, the access can be performed based on the tree index information configured by the tree index operation unit 24.
The tree index described above can be structured as follows: each index at the lowest leaf level stores the size of one slice and the record identifier of that slice; each index in a level above stores the sum of the sizes represented by the indexes below it; the sum of the sizes represented by the indexes at the highest root level (Root Level) matches the size of the entire large record; and the order of the leaf-level indexes matches the order of the offsets of the corresponding slices.
[0059]
With the configuration described above, the file management apparatus according to one embodiment of the present invention performs the following processing.
That is, the record operation unit 22 receives an access request for a record from the application program of the application processing unit 21. For example, new creation (create), deletion (remove), or retrieval (fetch) is accepted as an access request for a whole record, and mid-record insertion (insert), mid-record deletion (cut), update (update), or reading (read) is accepted as partial access.
[0060]
Further, the record operation unit 22 sets a distinction between a normal record and a BLOB record according to a request size for the record when a new record is created.
If the access request received by the record operation unit 22 is for an existing record, it is distinguished based on the RID from the data page operation unit 25 whether the record is a normal record or a BLOB record.
[0061]
Here, when the record corresponding to the accepted access is a normal record, the operation request unit 22a of the record operation unit 22 requests the actual data operation directly from the data page operation unit 25; when it is a BLOB record, the operation request unit 22a requests the actual record operation from the BLOB record operation unit 23.
In addition, even when the existing record is a normal record, if the requested record operation would make the entire record too large to fit on one page, the re-registration operation unit 22b re-registers the existing normal record as a BLOB record, and the operation request unit 22a then requests the actual record operation from the BLOB record operation unit 23.
[0062]
In addition, when the entire BLOB record fits on one page after an operation on an existing BLOB record is completed, the re-registration operation unit 22b re-registers the resulting BLOB record as a normal record.
When the access request received by the record operation unit 22 is for an existing normal record, the RID itself indicates the location of one record in the database 14.
[0063]
On the other hand, when the access request received by the record operation unit 22 is a BLOB record, the RID indicates the location in the database 14 of management information for the BLOB record. The management information located in the database 14 includes information for navigating a plurality of slices and information regarding a tree index.
[0064]
Based on the request from the record operation unit 22, the data page operation unit 25 performs record operations such as allocation, deletion, and update on a new record, or on an existing record indicated by an RID, as a normal record that fits on one page. It also performs partial access within a record (within a slice), such as mid-record deletion, partial reading within a normal record, and mid-record insertion within a range that fits on one page.
[0065]
When the access request received by the record operation unit 22 concerns a new record, or an existing record indicated by an RID, that is a BLOB record stored in a plurality of slices scattered over a plurality of pages, the BLOB record operation unit 23 decomposes the BLOB record into the slice units to be operated on, so that each slice can be accessed as if it were an individual normal record.
[0066]
That is, the access start point determination control unit 23a of the BLOB record operation unit 23 determines the start point of the slice to be operated on, from the offset with respect to the entire record, either by sequentially navigating the address links between slices or by obtaining it from the tree index information generated by the tree index operation unit 24.
[0067]
Specifically, when a tree index has been created by the tree index operation unit 24, the access start point determination control unit 23a has the tree index operation unit 24 determine the start point of the slice to be partially accessed. When no tree index has been created, it navigates the address links of the slices themselves to determine the start point.
[0068]
When the start point of the slice has been determined by either of the methods described above, the BLOB record operation unit 23 navigates the address links to carry out the record operation slice by slice from the start point onward, repeatedly requesting slice-unit access processing from the data page operation unit 25 and maintaining the address links as slice-unit data is added or deleted.
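The address-link method of finding the start slice can be sketched as below. In-memory pointers stand in for the on-disk next addresses, and the type and function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for a slice on a data page. */
typedef struct Slice {
    size_t size;          /* bytes of record data held by this slice */
    struct Slice *next;   /* address link to the following slice, or NULL */
} Slice;

/* Walk the address links from the first slice until the slice containing
 * `offset` (an offset within the entire record) is found. On success the
 * slice is returned and *local receives the offset within that slice.
 * Returns NULL when the offset lies beyond the end of the record. */
Slice *locate_by_links(Slice *first, size_t offset, size_t *local)
{
    assert(local != NULL);
    for (Slice *s = first; s != NULL; s = s->next) {
        if (offset < s->size) {
            *local = offset;
            return s;
        }
        offset -= s->size;   /* skip this slice and continue with the next */
    }
    return NULL;
}
```

The cost of this walk grows with the number of slices before the target, which is the reason the text prefers the tree index once a record has many slices.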
[0069]
Furthermore, when a tree index is created in the tree index operation unit 24, the maintenance unit 23b maintains the tree index in accordance with operations related to addition, deletion, and size change of each slice unit.
The maintenance unit 23b newly generates a tree index or deletes an existing tree index according to the necessity of partial access and the size of the entire BLOB record.
[0070]
Here, the necessity of partial access is determined by referring to a dictionary defined in advance for each type of data stored and managed as a BLOB record, or from the receipt of a partial access request from an application program. As for the size of the entire BLOB record, since the record may be converted into a normal record when the record operation reduces its size, the determination is made based on the size after the record operation.
[0071]
Incidentally, the tree index operation unit 24 operates the index corresponding to each slice in the index pages on the database 14. If the number of indexes stored per page is X and the average storage ratio is γ (0.5 ≦ γ ≦ 1.0), the height H of the index hierarchy for n slices is H = log_{γX}(n). The index can therefore grow dynamically as the number of slices increases, and the deterioration in access performance can be alleviated.
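As a rough illustration of the height formula, the following C sketch computes the smallest whole number of levels H for which (γX)^H is at least n, using integer arithmetic. The function name is hypothetical, and rounding up to a whole level is an assumption not stated in the text.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: smallest H such that fanout^H >= n_slices,
 * where fanout = floor(gamma * entries_per_page). */
unsigned index_height(unsigned long n_slices, unsigned entries_per_page,
                      double gamma)
{
    assert(gamma >= 0.5 && gamma <= 1.0);
    unsigned long fanout = (unsigned long)(gamma * entries_per_page);
    assert(fanout >= 2);

    unsigned height = 1;
    unsigned long reachable = fanout;  /* slices addressable with `height` levels */
    while (reachable < n_slices) {
        height++;
        if (reachable > ULONG_MAX / fanout)
            break;                     /* next level already covers any n */
        reachable *= fanout;
    }
    return height;
}
```

With X = 250 and γ = 0.5 (125 indexes per page), one level covers up to 125 slices and two levels cover up to 15,625 slices.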
[0072]
In addition, in the index generated by the tree index operation unit 24, each index stores the size of the corresponding slices: each index at the lowest leaf level stores the size of one slice itself, each index at a higher level stores the sum of the sizes represented by the indexes below it, and the sum of the sizes indicated by the root-level indexes matches the size of the entire BLOB record.
In addition, the order of the leaf-level indexes matches the order of the offsets of the corresponding slices. As a result, by referring to the higher-level indexes, the leaf-level index corresponding to an offset within the entire BLOB can be reached efficiently, with few page accesses, without navigating through the lower-level index pages.
[0073]
In addition, on each index page from the highest level (root) to the lowest level (leaf), the index at which the total of the index sizes on the page equals the desired offset, or the one immediately before it, is found, and the search moves from that index to the index page of the lower level. By repeating this search, the RID of the slice containing the offset is acquired from the index on the leaf index page (Leaf Index Page). The slice can thereby be identified from the offset within the entire BLOB record.
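The descent from the root index page to a leaf index can be sketched as follows. This is a simplified in-memory model: pointers replace page addresses, an integer replaces the slice RID, the page capacity of 4 entries is arbitrary, and all names are hypothetical. The rule used to pick an entry (subtract entry sizes until the remaining offset falls inside one) is one plausible reading of the search described above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct IndexPage IndexPage;

typedef struct {
    size_t size;             /* bytes represented by this index entry */
    const IndexPage *child;  /* lower-level index page, NULL at leaf level */
    int slice_id;            /* stand-in for the slice RID (leaf level only) */
} IndexEntry;

struct IndexPage {
    size_t count;            /* number of valid entries */
    IndexEntry entries[4];   /* small fixed array, for illustration only */
};

/* Returns the id of the slice containing `offset`, or -1 when the offset is
 * beyond the record. *local receives the offset within that slice. */
int locate_by_index(const IndexPage *root, size_t offset, size_t *local)
{
    assert(local != NULL);
    const IndexPage *page = root;
    while (page != NULL) {
        size_t i = 0;
        /* skip entries that end at or before the wanted offset */
        while (i < page->count && offset >= page->entries[i].size) {
            offset -= page->entries[i].size;
            i++;
        }
        if (i == page->count)
            return -1;                        /* offset past the last entry */
        if (page->entries[i].child == NULL) { /* leaf level reached */
            *local = offset;
            return page->entries[i].slice_id;
        }
        page = page->entries[i].child;        /* descend one level */
    }
    return -1;
}
```

Each level costs one page visit, so the number of page accesses depends on the height of the index rather than on the number of slices before the target.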
[0074]
The index page operation unit 24a performs operations on individual index pages. That is, when searching for an index, the search proceeds sequentially from the root index page (Root Index Page) toward the lower index pages. When maintaining an index, on the other hand, changes are reflected sequentially from the leaf index page to the upper index pages via the inter-index page operation unit 24b.
[0075]
In addition, the inter-index page operation unit 24b performs maintenance of links between previous and next index pages in the same hierarchy and links between upper / lower index pages.
By the way, when the tree index operation unit 24 increases or decreases the indexes on an index page as slices on data pages increase or decrease, the following operations occur: overflow and underflow, in which indexes move between the previous and next index pages in the same level; split, when the overflowing indexes cannot be accommodated by the previous and next index pages; and merge of extremely sparse index pages.
[0076]
Thus, when the index page operation unit 24a first receives a maintenance request for a leaf-level index, it passes the request for the actual index maintenance to the inter-index page operation unit 24b.
In response to the maintenance request, the inter-index page operation unit 24b performs maintenance between index pages (overflow, underflow, split, or merge) and requests the index page operation unit 24a to insert or delete indexes on each index page.
[0077]
Further, when the index maintenance for one level is completed, the inter-index page operation unit 24b reflects the changes on the index page of the upper level. For example, when the total size of an index page in the lower level changes, the size of the upper-level index representing that lower-level index page, and the total size of the upper-level index page, are updated.
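The upward propagation of a size change can be sketched as below, with one node per level and parent pointers standing in for the links between upper and lower index pages. The structure and names are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplification: one index entry per level, linked upward. */
typedef struct IndexNode {
    size_t total_size;         /* sum of the slice sizes represented below */
    struct IndexNode *parent;  /* index in the upper level, NULL at the root */
} IndexNode;

/* Set a new size on a leaf-level index and reflect the difference in every
 * upper level, from the leaf toward the root. */
void propagate_size_change(IndexNode *leaf, size_t new_leaf_size)
{
    assert(leaf != NULL);
    size_t old_size = leaf->total_size;
    leaf->total_size = new_leaf_size;
    for (IndexNode *n = leaf->parent; n != NULL; n = n->parent)
        n->total_size = n->total_size - old_size + new_leaf_size;
}
```

After the update, the root total again equals the size of the entire record, which is the invariant the search relies on.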
[0078]
By the way, the page buffer 26 performs, according to page addresses on the file in the database 14, input/output in units of physical pages, buffering of pages read into the main storage device, allocation of new pages to the database 14, and release to the database 14 of pages that are no longer needed because their data records or indexes have become empty.
[0079]
As described above, the tree index operation unit 24 can generate the index of the BLOB record and access the BLOB record to and from the database 14, thereby eliminating the restriction on the maximum record size of the BLOB record to be managed. There is also an advantage that partial access can be performed at high speed.
In addition, even if the size of the entire record changes as a result of an operation, the re-registration operation unit 22b can register the record as a normal record or a BLOB record according to the size of the entire record. Moreover, the record distinction recording unit 25a records the distinction between a BLOB record and a normal record for each record in the page, and branching between normal record operations and BLOB record operations can be controlled based on this distinction. The application processing unit 21 therefore does not need to be aware of the distinction between normal records and BLOB records, and there is an advantage that the file management apparatus can function as a consistent interface that does not distinguish data lengths.
[0080]
Further, the access start point determination control unit 23a of the BLOB record operation unit 23 can determine the start point of the slice to be operated on, from the offset with respect to the entire record, either by sequentially navigating the address links between slices or by obtaining it from the tree index information. There is therefore an advantage that the optimum access means can be selected according to the size of the record and the access pattern, and also an advantage that the apparatus can function as an independent interface that does not require awareness of which access method is used.
[0081]
Also, the tree index operation unit 24 dynamically grows or shrinks the tree index, updating, creating, or deleting it, according to how the size of the entire BLOB record compares with a predetermined threshold size and according to the necessity of partial access. There is therefore an advantage that the restriction on the maximum record size of the long records to be managed can be eliminated and partial access can be performed at high speed, while the database area is used effectively.
[0082]
(B) Description of a programming example of the header file for realizing the file management apparatus according to the present embodiment
When the functions of the above-described file management apparatus are configured by software, they can be realized, for example, by C language programs as shown in FIGS. 24 to 36.
[0083]
That is, FIGS. 24 to 36 are diagrams showing examples of header file programming for realizing the file management apparatus according to the present embodiment. FIG. 24 is for initial setting, FIG. 25 is for realizing the record operation unit 22, and FIG. 26 is for realizing the page buffer 26.
FIG. 27 is for setting the structure of the header part in the data page in the database 14, FIG. 28 is for realizing the data page operation unit 25, and FIG. 29 is for setting the structure of the record entry part in the data page.
[0084]
FIG. 30 is for setting the BLOB management information, FIG. 31 is for realizing the BLOB record operation unit 23, FIG. 32 is for realizing the tree index operation unit 24, FIG. 33 is for setting the index structure for each slice in the index page, FIG. 34 is for setting the header structure in the index page, FIG. 35 is for realizing the index page operation unit 24a, and FIG. 36 is for realizing the inter-index page operation unit 24b.
[0085]
When the header files for realizing the file management apparatus are configured as shown in FIGS. 24 to 35 described above, each data page in the database 14 stores at its head the header part (DataHead) set by the program of FIG. 27, and immediately after it the record entries are stored as a variable-length array. The number of elements in the record entry array corresponds to the variable entryMax in the program. Variable-length records are stored from the end of the data page toward the front, within a range where the record entry array and the record area do not collide.
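The data page layout described above (header, then a growing record entry array, with records filled from the end of the page) can be sketched as follows. Only the names DataHead and entryMax appear in the text; the field recordStart, the RecordEntry type, and the 4 KB page size used here are assumptions, and the actual structures of FIGS. 27 and 29 may differ.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u   /* assumed page size */

/* Assumed fields; the real header of FIG. 27 is not reproduced here. */
typedef struct {
    unsigned entryMax;     /* number of record entries in the array */
    unsigned recordStart;  /* page offset where the record area begins */
} DataHead;

/* Hypothetical record entry: location plus normal/BLOB distinction. */
typedef struct {
    unsigned offset;       /* relative byte position of the record in the page */
    unsigned isBlob;       /* 0 = normal record, 1 = BLOB record */
} RecordEntry;

/* Bytes still free between the end of the record entry array (growing up
 * from the header) and the record area (growing down from the page end). */
size_t data_page_free(const DataHead *h)
{
    size_t entries_end = sizeof(DataHead)
                       + (size_t)h->entryMax * sizeof(RecordEntry);
    assert(entries_end <= h->recordStart && h->recordStart <= PAGE_SIZE);
    return h->recordStart - entries_end;
}

/* 1 when a new record of record_size bytes and its record entry both fit
 * without the entry array colliding with the record area. */
int data_page_can_store(const DataHead *h, size_t record_size)
{
    return record_size + sizeof(RecordEntry) <= data_page_free(h);
}
```

A record that fails this check on an empty page is a candidate for registration as a BLOB record.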
[0086]
Furthermore, in the index page in the database 14, the header part (IndexHead) set by the program of FIG. 34 is stored at the head, and immediately after it the indexes for each slice (IndexTuple), set by the program of FIG. 33, are stored as a fixed-length array. Since the indexes are stored as a fixed-length array, the page size of the index is also fixed.
[0087]
Also, the index page operation unit 24a realized by the program of FIG. 35 and the inter-index page operation unit 24b realized by the program of FIG. 36 are used from the tree index operation unit 24 realized by the program of FIG. 32.
(C) Outline of operation when the file management apparatus according to this embodiment is applied to a document editing system.
Next, the operation when the file management apparatus shown in FIG. 3 is applied to an application program for starting the document editing system will be described below with reference to the flowchart shown in FIG.
[0088]
That is, as shown in FIG. 4, the user advances the document editing work by inputting instructions from the keyboard 12 and displaying on the display device 11, and inputs / outputs document information from the database 14.
First, the user designates a new or existing document from the keyboard 12 (step A1, step A2). If the designated document is a new document, the document header information is written to the database 14 as a new record by a create operation (step A3).
[0089]
If the designated document is an existing document, a fetch operation is performed to retrieve the document from the database 14 (step A4).
Further, when document editing is performed in units of pages (step A5), the document page to be edited is designated by the keyboard 12 (step A6). Thereby, the offset corresponding to the document page of the document composed of the BLOB record or the normal record is positioned on the database 14 by the locate operation (step A7).
[0090]
In this case, the document page and the offset can be uniquely associated by fixing the total number of bytes per unit document page.
Further, only the specific size corresponding to one unit document page is read from the database 14 by the read operation, starting at the offset in the BLOB record or normal record positioned by the locate operation (step A8).
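Because the number of bytes per unit document page is fixed, the offset passed to the locate operation can be computed directly from the document page number. A minimal sketch, assuming 1-based page numbers and a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of a document page within the record,
 * assuming document pages are numbered from 1 and all have the same size. */
size_t document_page_offset(size_t page_number, size_t bytes_per_document_page)
{
    assert(page_number >= 1);
    return (page_number - 1) * bytes_per_document_page;
}
```

The read operation would then fetch bytes_per_document_page bytes starting at that offset.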
[0091]
Further, the contents of the document read from the database 14 by the read operation are displayed on the display device 11 (step A9). Thereafter, the keyboard 12 is used to instruct corrections to partial character strings or lines of the displayed document (step A10). In response to these instructions, mid-record insertion (insert, step A11), mid-record deletion (cut, step A12), and partial change (update, step A13) are performed on the database 14.
[0092]
Thereby, the editing operation as described above can be repeated for a plurality of document pages.
Also, in response to an instruction to delete a document, if an unnecessary document is designated by the keyboard 12 (step A14), the remove operation is performed to delete the BLOB record or the normal record (step A15).
[0093]
(D) Detailed description of the database 14 to which the file management apparatus according to this embodiment is applied
FIG. 5 is a block diagram showing details of the database 14 to which the file management apparatus according to the present embodiment is applied. In FIG. 5, 14a is a record entry part. The record entry part 14a receives, from the page buffer 26, the RID specifying the record for which access is requested, and records the location of the record corresponding to this RID (for example, the relative byte position in the page) and the distinction between a normal record and a BLOB record.
[0094]
On the file management apparatus side, the record distinction recording unit 25a of the data page operation unit 25 records the distinction based on the record distinction recorded in the record entry part 14a.
The RID can be a unique number assigned for each record or a record address. In this case, the RID is configured using the record address. The RID is composed of the address of the page storing the record and the record serial number within the page.
[0095]
Here, the database 14 is provided with a data page 14e that stores normal records and BLOB records based on the operation of the data page operation unit 25, and an index page 14d that stores tree index information based on the operation of the tree index operation unit 24.
Note that each of the pages 14d and 14e is one physical page on the file of the database 14, and input/output and buffering of the page file are performed in common by the page buffer operation unit 26.
[0096]
Further, the tree index stored in the index page 14d and the n slices 14-1 to 14-n as the BLOB records stored in the data page 14e have a relationship as shown in FIG.
That is, as shown in FIG. 6, n slices 14-1 to 14-n constitute one BLOB record requested from the application processing unit 21, and each of the slices 14-1 to 14-n is provided with control information such as a next address, which is the address information of the slice that follows it. Thus, for example, by sequentially accessing the slices 14-1 to 14-n according to the next address, the BLOB record as a whole can be accessed.
[0097]
Therefore, the size of one BLOB record is the sum S of the sizes S1 to Sn of the slices 14-1 to 14-n, excluding the control information areas, as shown in the following equation (1).
S = S1 + S2 + ... + Sn (1)
Further, the tree index of the index page 14d is composed of index groups 14d-1 to 14d-3 in a three-level hierarchical structure, and each of the index groups 14d-1 to 14d-3 stores the sizes of the corresponding slices.
[0098]
That is, the size of each slice itself is stored in the index group 14d-3 at the lowest leaf level, and the sum of the sizes represented by the lower indexes is stored in the index group 14d-2 in the hierarchy above it. As a result, the sum of the sizes represented by the index group 14d-1 at the highest root level matches the size of the entire BLOB record.
[0099]
Furthermore, the order of the indexes in the leaf-level index group 14d-3 matches the offset order of the corresponding slices. Therefore, by referring to the higher-level index groups 14d-1 and 14d-2, the leaf-level index corresponding to an offset within the entire BLOB can be found efficiently, with few page accesses, without navigating through the index group 14d-3.
[0100]
By the way, each of the index groups 14d-1 to 14d-3 can be composed of a plurality of index pages. One index in the index groups 14d-1 to 14d-3 consists of, for example, 8 bytes for the RID of the slice, 4 bytes for the size of the slice, and 4 bytes for the page address from the higher-level index to the lower-level one, for a total of 16 bytes. If the size of one page is 4 KB, about 250 indexes can be accommodated in one index page.
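The 16-byte index layout and the resulting page capacity can be checked with a small C sketch. The field names and the 96-byte header size used in the usage note are assumptions; only the 8 + 4 + 4 byte breakdown and the name IndexTuple come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names are hypothetical; sizes follow the 8 + 4 + 4 byte breakdown. */
typedef struct {
    uint8_t  rid[8];      /* RID of the slice */
    uint32_t size;        /* size of the slice, or sum of sizes below */
    uint32_t lower_page;  /* page address of the lower-level index page */
} IndexTuple;

/* Number of indexes that fit on one index page after its header part. */
size_t index_tuples_per_page(size_t page_size, size_t header_size)
{
    assert(header_size <= page_size);
    return (page_size - header_size) / sizeof(IndexTuple);
}
```

A 4096-byte page holds 256 such indexes with no header, and 250 if the header part is assumed to occupy 96 bytes, which is consistent with the figure of about 250 given above.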
[0101]
With such a configuration, if the record requested to be accessed is a normal record based on the content recorded in the record entry unit 14a, the record exists on one page and the location of the record is Directly shown from the record of the record entry (see (a) in FIG. 5).
Further, when the record for which an access request is made is a BLOB record composed of slices 14-1 to 14-n distributed over a plurality of pages, the location information recorded in the record entry unit 14a is: The BLOB management information 14c for the BLOB record for which an access request has been made is shown (see (b) in FIG. 5).
[0102]
The BLOB management information 14c indicates the position of the first slice among the slices distributed over a plurality of pages and, if a tree index corresponding to this BLOB record has been generated, also indicates the top (root) of the tree index to the distributed slices that make up this BLOB record.
[0103]
Here, when accessing the data on the data page 14e based on the BLOB management information 14c described above, access is made by a method of sequentially navigating address links between slices or by a tree index generated on the index page 14d. Adopt the method to do.
Specifically, when the number of slices constituting a BLOB record is small or partial access is not required, the method of sequentially navigating the address links between slices is adopted. When a tree index has been generated on the index page 14d and partial access is required, the method of accessing by the tree index is adopted.
[0104]
Here, when data access using the tree index is adopted, the search is based on the offset within the entire BLOB record, starting from the BLOB management information 14c. On each index page from the highest level (root) to the lowest level (leaf), the index at which the total of the index sizes equals the desired offset, or the one immediately before it, is found, and the search moves from that index to the index page of the lower level. In this way, the RID of the slice containing the offset is acquired from the index on the leaf index page, and the slice can be identified from the offset within the entire BLOB record.
[0105]
For example, when partial access to the i-th slice 14-i is required, the i-th slice is specified by an offset in the entire BLOB record from the application processing unit 21.
Here, when the tree index is used for access based on this offset, as described above, offset 1 is obtained from the root-level index group 14d-1, offset 2 is obtained from the index group 14d-2, and offset 3 is obtained from the leaf-level index group 14d-3.
[0106]
As a result, the slice 14-i including the content of the target offset data can be searched, and access to one part of the entire huge BLOB record can be realized at high speed.
For BLOB data for which access has been requested, when the actual slices occupy only a few pages (for example, 10 pages), or when the data is always read sequentially from the beginning (for example, when the database 14 is used merely as a large data store and the data grows only to a size that can be read in one pass), the address links between slices are sequentially navigated and accessed without using the tree index.
[0107]
  As a result, the access method can be selected, and tree index generation or deletion can be chosen, according to the size of the entire BLOB record and the necessity of partial record access. The storage area on the database 14 needed for generating the index page 14d and the redundant maintenance time for the index page 14d can therefore be suppressed.
[0108]
(E) Explanation of state transitions accompanying record size changes in the database
Further, when a record on the database 14 is modified for the document file as described in (C) above, the state on the database 14 transitions as shown in FIG. 7.
That is, for a new record creation (create, see state B1 in FIG. 7) in the document file, if the initial record size fits in one physical page, it is registered as a normal record (from state B1 to state B2 in FIG. 7). If it does not fit, it is registered as a BLOB record (from state B1 to state B3).
[0109]
If the record size does not fit in one physical page as a result of subsequent insertion into the normal record, it is re-registered as a BLOB record (from state B2 to state B3). If the BLOB record is deleted halfway (cut, see state B4 in FIG. 7), if the record size fits in one physical page, it is re-registered as a normal record (states B3 to B2).
[0110]
Furthermore, for a BLOB record registered in the state B3, a tree index as shown in FIG. 6 is generated by the operation of the tree index operation unit 24 (from state B3 to state B5) in any of the following cases: the record size increases further because of mid-record insertion (insert); the record size is already sufficiently large at the time of new creation; or partial access such as mid-record insertion (insert), mid-record deletion (cut), mid-record update (update), or mid-record reading (read) is necessary.
[0111]
Conversely, when midway deletion (cut) from the BLOB record makes the record size sufficiently small, the tree index becomes overhead and is regarded as unnecessary, so the tree index generated in the index page 14e is deleted (from state B5 to state B3).
Whether to generate or delete the tree index is determined in the record operation unit 22 based on a reference value for the size of the BLOB record. This reference value is determined from the size of the index page 14e and is set as a threshold on the total size of all slices, chosen so that even the minimum index configuration is not redundant.
[0112]
The minimum index configuration can be defined, for example, as one in which the root and the leaf coincide, so that there is a single index page, and the storage rate within that page is 0.5.
For example, in the tree index shown in FIG. 6 described above, one index page can store 250 indexes, so the threshold is the number of slices (that is, the total slice size) at which 125 or more indexes are maintained, corresponding to a storage rate of 0.5.
[0113]
That is, assuming that slices are densely packed by garbage collection control, the total size of the pages corresponding to 125 indexes is used as the threshold size, which can be set to 500 KB according to the following equation (2).
4KB × 125 = 500KB (2)
Note that when a record is removed (remove), the transition can start from either a BLOB record or a normal record, depending on the size of the record at that point. Even when a tree index has been generated, removal of the record deletes its slices, and in the course of the intermediate state transitions a transition to a BLOB record without a tree index occurs (state B5 to state B3 in FIG. 7).
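The threshold of equation (2) and the size-driven states of FIG. 7 can be sketched as follows. The function names are hypothetical, page overhead such as headers is ignored, and the use of "greater than or equal to" at the threshold is an assumption; the text states only that the threshold is compared with the total slice size.

```python
PAGE_SIZE = 4 * 1024      # 4 KB page, as used in equation (2)
INDEXES_PER_PAGE = 250    # capacity of one index page in the FIG. 6 example
MIN_FILL = 0.5            # storage rate of the minimum index configuration


def index_threshold(page_size=PAGE_SIZE, per_page=INDEXES_PER_PAGE, fill=MIN_FILL):
    """Total slice size at which a one-page tree index is no longer redundant."""
    return int(per_page * fill) * page_size


def record_state(size, needs_partial_access=False, page_size=PAGE_SIZE):
    """Return the FIG. 7 state a record of `size` bytes settles in."""
    if size <= page_size:
        return "B2"   # normal record: fits in one physical page
    if needs_partial_access or size >= index_threshold(page_size):
        return "B5"   # BLOB record with a tree index
    return "B3"       # BLOB record navigated by slice links only
```

With the default values, `index_threshold()` evaluates to 125 × 4 KB = 500 KB, matching equation (2).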
[0114]
Accordingly, whether each record in a page is a BLOB record or a normal record is recorded, and branch control between normal record operations and BLOB record operations can be performed based on this distinction. The application program side therefore need not be aware of the distinction between normal records and BLOB records, and the file management apparatus can function as a consistent interface that does not distinguish data lengths.
[0115]
(F) Detailed description of the operation when creating a new record by the file management apparatus in this embodiment
FIG. 8 is a flowchart for explaining in detail the operation when a new record is created. As shown in FIG. 8, it is first checked whether the record size required for the new creation fits within one page (step C1), and according to the result, either a normal record or a BLOB record is selected as the record type to be stored in the database 14 (step C2).
[0116]
Here, when the record size does not fit in one physical page and storage as a BLOB record is selected (the "BLOB record" route from step C2), the BLOB record operation unit 23 first creates BLOB management information and, by operating the data page operation unit 25, assigns this management information to one physical page as if it were a normal record (step C3).
[0117]
Thereafter, the record holding the actual data is divided across a plurality of pages and stored in slice form by the data page operation unit 25. Each slice is stored by data page operations as if it were a normal record, but a link by RID between slices and the size of the data held by the slice are added to each slice as control information (step C4).
[0118]
  Next, the BLOB record operation unit 23a determines whether a tree index required flag is set (step C6). If the BLOB record has already become large at this stage and the BLOB record operation unit 23a holds a flag indicating that a tree index is required (the "required" route in step C6), a tree index is created in a batch by operating the tree index operation unit 24 (step C7).
[0119]
  A tree index can also be created when partial access is necessary. However, that necessity is determined only when partial access actually occurs, so this case does not arise at the new-creation stage.
  Information related to the BLOB record, such as the RID of the top slice of the slices created in this way and the page address of the root index page of the tree index, is returned and recorded in the BLOB management information (step C8).
[0120]
As a result, the record entry indicating the location of each area allocated within a page in the database 14 records the location of the record and its record type, that is, the distinction among normal record, BLOB management information, and slice.
In step C2, if the record size is a normal record that fits in one physical page, the data page operation unit 25 simply stores the record in one page and returns the RID to the database 14 (step C9, step C10).
[0121]
Therefore, as a result of returning the RID for the newly created record, if the record is a normal record, the RID indicates the area as it is. If the record is a BLOB record, the RID indicates the BLOB management information.
Therefore, even if the size of the entire record changes as a result of operations, the record can be registered as a normal record or a BLOB record according to its overall size. The application program side need not be aware of the distinction, and the file management apparatus can function as a consistent interface that does not distinguish data lengths.
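The creation flow of FIG. 8 can be sketched as follows. This is an illustration under stated assumptions: the database is modelled as a plain dictionary mapping a hypothetical integer RID to a stored object, page headers are ignored, and tree index creation (steps C6 and C7) is omitted.

```python
def _alloc(store, obj):
    """Store `obj` in a new area and return its RID (hypothetical integer RID)."""
    rid = len(store)
    store[rid] = obj
    return rid


def create_record(store, data, page_size):
    """Register `data` as a normal record or a BLOB record and return its RID."""
    if len(data) <= page_size:                                   # steps C1, C2
        return _alloc(store, {"type": "normal", "data": data})   # steps C9, C10

    # Step C3: the BLOB management information is stored like a normal record.
    mgmt = {"type": "blob", "top": None, "root_index": None}
    mgmt_rid = _alloc(store, mgmt)

    # Step C4: split the data into page-sized slices, each with size and link.
    chunks = [data[k:k + page_size] for k in range(0, len(data), page_size)]
    rids = [
        _alloc(store, {"type": "slice", "size": len(c), "data": c, "next": None})
        for c in chunks
    ]
    for cur, nxt in zip(rids, rids[1:]):
        store[cur]["next"] = nxt          # link between slices by RID

    mgmt["top"] = rids[0]                 # step C8: record the top slice RID
    return mgmt_rid
```

In both branches the caller receives a single RID, which is what lets the caller ignore whether the record is stored as a normal record or as a BLOB record.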
[0122]
(G) Detailed description of the operation at the time of record insertion by the file management apparatus in this embodiment
FIG. 9 to FIG. 12 are diagrams showing an example of the operation of the record when the record is inserted halfway (insert), and the state of the subsequent adjacent page merge.
For example, as shown in FIG. 9, when r bytes of data 30 are to be inserted at an offset X in the middle of the i-th slice 14-i of a BLOB record consisting of n slices 14-1 to 14-n, the target offset position is first located and it is determined whether any data will be pushed out.
[0123]
That is, as shown in FIG. 10, when the offset that is the starting point of the insertion is located in the middle of the i-th slice 14-i and r bytes of data 30 are inserted from this offset position X, the slice 14-i has no free area, so r bytes of data in the area receiving the insertion are pushed out.
When there is data to be pushed out, as in this case, a new slice area 31 is secured, the pushed-out data is saved in the new slice area 31, and the new data is then inserted into the area freed in the slice 14-i.
[0124]
That is, as shown in FIG. 11, when r bytes of data 30 are inserted from offset position X, the r bytes of data that existed in the i-th slice 14-i are pushed out to the new slice area 31 on a newly allocated page, and the data of the (i+1)-th slice 14-(i+1), which immediately followed the original i-th slice 14-i, is merged behind the data in the newly generated slice area.
[0125]
Here, as shown in FIG. 12, if all the data in the (i+1)-th slice 14-(i+1) can be merged into the new slice, the (i+1)-th slice becomes empty and can be deleted, and the new slice then becomes the new (i+1)-th slice 14-(i+1).
Therefore, by merging with adjacent pages immediately after the insertion operation, an increase in slices due to the insertion operation can be suppressed, and deterioration of access performance can be prevented.
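The insertion and adjacent-slice merge of FIGS. 9 to 12 can be sketched as follows. The sketch models the BLOB record as a Python list of `bytes` objects (one per slice) instead of RID-linked pages, and assumes the inserted data is no longer than one page; `insert_bytes` is a hypothetical name.

```python
def insert_bytes(slices, i, x, data, page_size):
    """Insert `data` at offset `x` of slice `i`, then merge with the next slice."""
    assert len(data) <= page_size, "sketch assumes at most one page of new data"
    out = list(slices)
    combined = out[i][:x] + data + out[i][x:]
    out[i], pushed = combined[:page_size], combined[page_size:]
    if not pushed:
        return out                     # the slice had enough free area

    # Save the pushed-out data in a new slice area right after slice i.
    out.insert(i + 1, pushed)

    # Merge the data of the original (i+1)-th slice behind the new slice.
    if i + 2 < len(out):
        room = page_size - len(out[i + 1])
        follower = out[i + 2]
        out[i + 1] += follower[:room]
        if follower[room:]:
            out[i + 2] = follower[room:]
        else:
            del out[i + 2]             # emptied slice is deleted (FIG. 12)
    return out
```

For example, with a page size of 8, inserting `b"xxx"` at offset 4 of the middle slice of `[b"AAAAAAAA", b"BBBBBBBB", b"CC"]` leaves three slices rather than four, because the short third slice is absorbed into the new slice area.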
[0126]
(H) Detailed description of the operation when deleting a record halfway by the file management apparatus in this embodiment
FIGS. 13 to 16 are diagrams showing an example of a record operation when performing mid-record deletion (cut), and a state of merging adjacent pages thereafter.
That is, as shown in FIG. 13, when r (= k + m + n) bytes are to be deleted from an offset X in the middle of the i-th slice 14-i of a BLOB record consisting of n slices 14-1 to 14-n, the offset that is the starting point of the deletion is first located, in this case in the middle of the i-th slice 14-i.
[0127]
Then, as shown in FIG. 14, data in the slices 14-i to 14- (i + 2) is deleted until all the targets are deleted. That is, k bytes from the offset X to the last position in the slice 14-i, m bytes from the head position to the last position in the slice 14- (i + 1), and the first n bytes in the slice 14- (i + 2) are deleted.
[0128]
Next, as shown in FIG. 15, the emptied slice 14-(i+1) is deleted, the link between the slice 14-i and the slice 14-(i+2) is reconnected, and the data remaining in the slice 14-(i+2) is repacked to the head of the slice 14-(i+2).
[0129]
Finally, as shown in FIG. 16, data from the new (i+1)-th slice 14-(i+1) (the slice 14-(i+2) before the deletion, as in FIG. 15 described above) is merged into the free space of the original i-th slice 14-i (the k bytes from the offset X to the final position), and the data remaining in the new slice 14-(i+1) is repacked to its head.
[0130]
As a result, it is possible to suppress an increase in sparse slices in the page resulting from the deletion, and it is possible to suppress deterioration in access performance.
When the data in the new (i+1)-th slice 14-(i+1) is merged in this way, if all of that data fits in the i-th slice 14-i, the new (i+1)-th slice is also deleted.
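The midway deletion and merge of FIGS. 13 to 16 can be sketched in the same list-of-slices model. The name `cut_bytes` is hypothetical, and relinking of slices is implicit in the list order rather than done through RIDs.

```python
def cut_bytes(slices, i, x, r, page_size):
    """Delete `r` bytes starting at offset `x` of slice `i`, then merge."""
    out = list(slices)
    j, off, remaining = i, x, r
    while remaining > 0 and j < len(out):
        n = min(remaining, len(out[j]) - off)
        out[j] = out[j][:off] + out[j][off + n:]   # survivors are repacked to the head
        remaining -= n
        j, off = j + 1, 0

    # Delete slices after slice i that became empty (FIG. 15).
    out = out[:i + 1] + [s for s in out[i + 1:j] if s] + out[j:]

    # Merge the head of the new (i+1)-th slice into the free space of slice i.
    if i + 1 < len(out):
        room = page_size - len(out[i])
        follower = out[i + 1]
        out[i] += follower[:room]
        if follower[room:]:
            out[i + 1] = follower[room:]
        else:
            del out[i + 1]                          # fully merged slice is deleted
    return out
```

With four full 8-byte slices and a deletion of 13 bytes from offset 5 of the first slice (k = 3, m = 8, n = 2), the second slice disappears and the first slice is refilled from the third.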
[0131]
(I) Description of the first garbage collection mode
The BLOB record operation unit 23 of the file management apparatus according to the present embodiment can perform garbage collection control, which tidies up the empty areas of slices in the database 14, when the tree index operation unit 24 has not generated a tree index. For this purpose, as shown in detail in FIG. 17, the BLOB record operation unit 23 includes a sum calculation unit 23c, a ratio calculation unit 23d, a ratio comparison unit 23e, and a garbage collection control unit 23f.
[0132]
Here, the sum calculating unit 23c calculates the sum of the slice sizes for all the slices constituting the BLOB record after the size changing operation is performed on the BLOB record.
The ratio calculation unit 23d calculates the ratio S / D of the sum S calculated by the sum calculation unit 23c to the product D of the number n of slices 14-1 to 14-n and the page unit size P. The ratio comparison unit 23e compares the ratio S / D calculated by the ratio calculation unit 23d with a preset ratio α.
[0133]
Furthermore, when the ratio comparison in the ratio comparison unit 23e shows that the ratio S / D calculated by the ratio calculation unit 23d is smaller than the preset ratio α, the garbage collection control unit 23f performs control, via the data page operation unit 25, so that garbage collection is carried out on the slices 14-1 to 14-n.
[0134]
With such a configuration, in the file management apparatus of the present invention, when the tree index operation unit 24 has not generated a tree index and the BLOB data is managed only by the links to the next slice, garbage collection is performed as described below.
Incidentally, if every slice holds data amounting to the size of one page, the BLOB record is covered by the minimum number of slices. For example, as shown in FIG. 18, when data is stored in slices 14-1 to 14-n, the product D (= n × P) of the number n of slices 14-1 to 14-n and the page unit size P is the maximum BLOB record size that these slices can manage.
[0135]
Here, when the total amount of data stored in the slices 14-1 to 14-n falls below a predetermined storage rate α relative to this maximum size, garbage collection control can be performed on all the slices constituting the BLOB record, as shown in FIGS. 18 to 20.
That is, as shown in FIG. 18, after a size change operation is performed on the BLOB record, the sum calculation unit 23c calculates the sum S of the slice sizes for all slices 14-1 to 14-n constituting the BLOB record according to the above-described equation (1).
[0136]
The ratio calculation unit 23d calculates the ratio S / D of the sum S calculated by the sum calculation unit 23c to the product D (= n × P) of the number n of slices 14-1 to 14-n and the page unit size P.
Then, the ratio comparison unit 23e compares the ratio S / D calculated by the ratio calculation unit 23d with a preset ratio α (for example, α = 0.5). When S / D is smaller than α, it is determined that the data stored in the slices is extremely sparse, and the garbage collection control unit 23f performs control so that garbage collection is carried out on all slices 14-1 to 14-n.
[0137]
Here, when garbage collection is performed on all slices 14-1 to 14-n, each slice is merged with the slice that follows it at the next address, as between adjacent slices such as slices 14-1 to 14-3 in FIG. 19.
That is, for example, the data S_2 of the slice 14-2 and part S_3-1 of the data of the slice 14-3 are merged into the empty area of the slice 14-1 (see (1) in FIG. 19), and the slice 14-2 becomes empty (see (2) in FIG. 19).
[0138]
For the slice 14-3, whose data is not exhausted, the remaining data S_3-2 is repacked to the head of the slice 14-3 (see (3) in FIG. 19).
Thereafter, garbage collection similar to the processing in (1) to (3) is performed on the subsequent slices 14-4 to 14-n, so that the total number of slices becomes m (< n) as shown in FIG. 20. Here, the total number of slices m can be expressed by the following equation (3).
[0139]
m = CEIL(S / P) (3)
where CEIL is a function that rounds a fractional value up to the next integer.
Therefore, when the size of the entire long record is changed by an operation of the BLOB record operation unit 23, the empty areas of slices in the database 14 can be tidied up and the number of slices reduced, which has the advantage that the database area can be used effectively.
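The decision and the result of the first garbage collection mode can be sketched as follows. The sketch again models slices as a list of `bytes` objects; `needs_gc` and `collect` are hypothetical names. Note that `collect` simply re-slices the concatenated data, which yields the same final layout as the pairwise merge and repack of FIG. 19 but is not the page-by-page procedure itself.

```python
import math


def needs_gc(slices, page_size, alpha=0.5):
    """True when the storage rate S / D is below the preset ratio alpha."""
    s = sum(len(x) for x in slices)     # S: sum of slice sizes
    d = len(slices) * page_size         # D = n x P
    return d > 0 and s / d < alpha


def collect(slices, page_size):
    """Pack the data densely, leaving CEIL(S / P) slices as in equation (3)."""
    data = b"".join(slices)
    return [data[k:k + page_size] for k in range(0, len(data), page_size)]
```

For a page size of 8, the slices `[b"abc", b"de", b"f", b"g"]` have S / D = 7 / 32, so collection is triggered and the record shrinks to a single slice, in agreement with `math.ceil(7 / 8)`.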
[0140]
(J) Explanation of second garbage collection mode
The first garbage collection mode described in detail in (I) above concerns garbage collection control performed when the tree index operation unit 24 has not generated a tree index. As a second mode, garbage collection control can also be performed when a tree index has been generated.
[0141]
In this case as well, as shown in FIG. 17 referred to in (I) above, the BLOB record operation unit 23 includes a sum calculation unit 23c, a ratio calculation unit 23d, a ratio comparison unit 23e, and a garbage collection control unit 23f. However, the function of each unit differs from that in (I) above.
That is, after the maintenance unit 23b corrects the index, the sum calculation unit 23c calculates the sum of the slice sizes in the corrected leaf index page, and the ratio calculation unit 23d calculates the ratio of the sum calculated by the sum calculation unit 23c to the product of the number of slices and the page unit size.
[0142]
Further, the ratio comparison unit 23e compares the ratio calculated by the ratio calculation unit 23d with a preset ratio. When the comparison in the ratio comparison unit 23e shows that the ratio calculated by the ratio calculation unit 23d is smaller than the preset ratio, the garbage collection control unit 23f performs control so that the slices corresponding to the indexes in the leaf index page are garbage collected.
[0143]
With such a configuration, the garbage collection operation of the file management apparatus of the present invention for a BLOB record for which a tree index has been constructed by the tree index operation unit 24 is described below with reference to FIGS. 21 to 23.
That is, when the slices 14-1 to 14-N have been modified and the index has been corrected by the maintenance unit 23b, resulting in the index 15 shown in FIG. 21 and the slices 14-1 to 14-N corresponding to the index 15, whether to perform garbage collection is determined as follows.
[0144]
That is, the sum calculation unit 23c calculates the sum S of the slice sizes in the modified leaf index page 15, and the ratio calculation unit 23d calculates the ratio S / D of the sum S calculated by the sum calculation unit 23c to the product D (= N × P) of the number N of slices in the leaf index page 15 and the slice page unit size P.
[0145]
Here, the ratio comparison unit 23e compares the ratio S / D calculated by the ratio calculation unit 23d with a preset ratio α (for example, α = 0.5). When this comparison shows that the calculated ratio is smaller than α, the garbage collection control unit 23f performs control so that garbage collection is carried out on the slices 14-1 to 14-N corresponding to the indexes in the leaf index page 15.
[0146]
That is, as shown in FIG. 22, when garbage collection is performed on the slices 14-1 to 14-N in the leaf index page 15, each slice is likewise merged with the slice that follows it at the next address, as between the adjacent slices 14-1 to 14-3 in FIG. 19 described above.
[0147]
Here, the maintenance unit 23b performs index page maintenance each time the above-described merge processing between adjacent slices is performed. That is, for example, when a slice deletion occurs, the corresponding index is also deleted.
Specifically, when the data S_2 of the slice 14-2 is merged into the empty area of the slice 14-1 (see (1) in FIG. 22) and the emptied slice 14-2 is deleted (see (2) in FIG. 22), the index 15A (size S_2) in the corresponding leaf index page 15 is deleted and the indexes are repacked (see (3) in FIG. 22).
[0148]
Thereafter, garbage collection similar to the processing in (1) to (3) in FIG. 22 is performed on the subsequent slices 14-3 to 14-N, so that, as shown in FIG. 23, the slices are reduced to a smaller total number (≦ n).
As a result, the range of slice groups to be subjected to garbage collection can be limited to slices 14-1 to 14-N in the leaf index page 15 constituting the tree index.
[0149]
Therefore, when the size of the entire long record is changed by an operation of the BLOB record operation unit 23, the empty areas of slices in the database 14 can be tidied up and the number of slices reduced, which has the advantage that the database area can be used effectively.
In addition, when the number of slices to be repacked is large, as in the case where a tree index has been generated by the tree index operation unit 24, limiting the range of the slice group subject to garbage collection suppresses the data input/output load.
[0150]
Further, since the total size of the slices 14-1 to 14-N does not change after the garbage collection, the total managed size does not change either. Even if the tree index consists of multiple levels, maintenance of the corresponding upper index pages at the time of leaf index page maintenance is thereby made easier.
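The second mode, in which garbage collection is limited to the slices under one leaf index page, can be sketched as follows. Each leaf index page is modelled as a list of `bytes` slices, and the index entries are taken to be just the slice sizes; `gc_leaf_pages` and `leaf_index` are hypothetical names, and as before the repacking is done by re-slicing rather than by the pairwise merge of FIG. 22.

```python
def gc_leaf_pages(leaves, page_size, alpha=0.5):
    """Garbage collect only the leaves whose storage rate S / D is below alpha."""
    result = []
    for leaf in leaves:
        s = sum(len(x) for x in leaf)          # S for this leaf index page
        d = len(leaf) * page_size              # D = N x P for this leaf
        if leaf and s / d < alpha:
            data = b"".join(leaf)
            leaf = [data[k:k + page_size] for k in range(0, len(data), page_size)]
        result.append(list(leaf))
    return result


def leaf_index(leaf):
    """Index entries of a leaf page: one size per remaining slice."""
    return [len(s) for s in leaf]
```

Because the total size under each leaf is unchanged by the collection, the size recorded for that leaf in any upper index page stays valid; only the entries inside the collected leaf need to be rewritten.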
[0151]
(K) Other
In (I) and (J) above, whether to perform garbage collection is determined based on the storage rate relative to the ideal total slice size. However, the present invention is not limited to this: the determination can also be made based on the ideal number of slices, and the same effects as in (I) and (J) above can be obtained in this way as well.
[0152]
In this case, assuming that the currently held size (the size of the entire BLOB record) is S and the slice page unit size is P, the ideal number of slices, that is, the minimum number of slices M, is given by the following equation (4).
M = CEIL(S / P) (4)
where CEIL is a function that rounds a fractional value up to the next integer.
Using the ideal number of slices M obtained by the above equation (4), the ratio N / M of the actual number of slices N to the ideal number of slices M is then compared with a predetermined threshold α, and whether to perform garbage collection is determined based on the result of this comparison.
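The slice-count criterion can be sketched as follows. The text does not state the direction of the comparison or a threshold value, so both are assumptions here: collection is triggered when N / M exceeds a hypothetical `limit` of 2.0, which roughly corresponds to a storage rate below 0.5.

```python
import math


def ideal_slice_count(total_size, page_size):
    """M = CEIL(S / P), the minimum number of slices, as in equation (4)."""
    return math.ceil(total_size / page_size)


def should_collect(slice_sizes, page_size, limit=2.0):
    """Assumed rule: collect when actual count N exceeds `limit` times M."""
    n = len(slice_sizes)
    m = ideal_slice_count(sum(slice_sizes), page_size)
    if m == 0:
        return n > 0          # only empty slices remain
    return n / m > limit
```

Five slices holding one byte each with an 8-byte page give M = 1 and N / M = 5, so collection would be triggered under this assumed rule, whereas three slices of sizes 8, 8, and 3 give N / M = 1 and are left alone.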
[0153]
[Effects of the invention]
  As detailed above, according to the present invention, the tree index operation means can generate an index of a long record and access the long record in the database, so that the restriction on the maximum record size of the long records to be managed is eliminated and partial access can be performed at high speed.
[0154]
  Also, according to the present invention, even if the size of the entire record changes as a result of operations, the re-registration operation unit can register it as a normal record or a long record according to its overall size. There is thus no need to be aware of the distinction between normal records and BLOB records, and the file management apparatus can advantageously function as a consistent interface that does not distinguish data lengths.
[0155]
  Furthermore, according to the present invention, the record distinction recording unit of the data page operation means records, for each record in a page, whether it is a long record or a normal record, and branch control between normal record operations and long record operations can be performed based on this distinction, so that advantages similar to those described above are obtained.
[0156]
  Also, according to the present invention, the access start point determination control unit of the long record operation means can determine the access start point from an offset into the entire record either by sequentially navigating the address links between slices or by using the tree index. The access method best suited to the record size and access pattern can therefore be selected, and the file management apparatus can advantageously function as an independent interface that does not require the application program side to be aware of the database-side environment.
[0157]
  Furthermore, according to the present invention, the tree index operation means can dynamically create or delete the tree index as the long record grows, shrinks, or is updated, depending on how the size of the entire long record compares with a predetermined threshold size or on whether partial access is necessary. This has the advantage that the restriction on the maximum record size of the long records to be managed is eliminated and partial access can be performed at high speed while the database area is used effectively.
[0158]
  Also, according to the present invention, when the size of the entire long record is changed by an operation of the long record operation means, the empty areas of slices in the database can be tidied up and the number of slices reduced, so that, as in the case described above, the database area can be used effectively.
[0159]
  Furthermore, according to the present invention, when garbage collection is performed by the long record operation means and the number of slices to be repacked is large, the range of the slice group subject to garbage collection is limited, which has the advantage that the data input/output load can be suppressed.
[Brief description of the drawings]
FIG. 1 is a principle block diagram of a first invention.
FIG. 2 is a principle block diagram of a second invention.
FIG. 3 is a block diagram illustrating a system to which the file management apparatus according to the embodiment is applied.
FIG. 4 is a flowchart for explaining an operation when the file management apparatus according to the embodiment is applied to an application program for starting a document editing system.
FIG. 5 is a block diagram showing details of a database to which the file management apparatus according to the embodiment is applied.
FIG. 6 is a diagram illustrating a relationship between slices and indexes in the database according to the embodiment.
FIG. 7 is a diagram for explaining database state transition according to the embodiment.
FIG. 8 is a flowchart for explaining in detail an operation when creating a new record in the database according to the embodiment.
FIG. 9 is a diagram illustrating an example of a record operation when performing midway insertion of a record in the database according to the embodiment, and a state of merging adjacent pages thereafter.
FIG. 10 is a diagram illustrating an example of a record operation when performing midway insertion of a record in the database according to the embodiment, and a state of merging adjacent pages thereafter.
FIG. 11 is a diagram illustrating an example of a record operation when performing midway insertion of a record in the database according to the embodiment and a state of merging adjacent pages thereafter.
FIG. 12 is a diagram illustrating an example of a record operation when a record is inserted halfway in the database according to the embodiment and a state of merging adjacent pages thereafter.
FIG. 13 is a diagram illustrating an example of a record operation when performing midway deletion of a record in the database according to the embodiment and a state of merging adjacent pages thereafter.
FIG. 14 is a diagram illustrating an example of a record operation when performing midway deletion of a record in the database according to the embodiment, and a state of merging adjacent pages thereafter.
FIG. 15 is a diagram illustrating an example of a record operation when performing midway deletion of a record in the database according to the embodiment and a state of merging adjacent pages thereafter.
FIG. 16 is a diagram illustrating an example of a record operation when performing midway deletion of a record in the database according to the embodiment, and a state of merging adjacent pages thereafter.
FIG. 17 is a block diagram showing in detail a BLOB record operation unit of the file management apparatus according to the embodiment.
FIG. 18 is a diagram for explaining garbage collection control when a tree index is not generated in the file management apparatus according to the embodiment.
FIG. 19 is a diagram for explaining garbage collection control when a tree index is not generated in the file management apparatus according to the embodiment.
FIG. 20 is a diagram for explaining garbage collection control when a tree index is not generated in the file management apparatus according to the embodiment.
FIG. 21 is a diagram illustrating garbage collection control for a BLOB record when a tree index is configured.
FIG. 22 is a diagram illustrating garbage collection control for a BLOB record when a tree index is configured.
FIG. 23 is a diagram illustrating garbage collection control for a BLOB record when a tree index is configured.
FIG. 24 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 25 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 26 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 27 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 28 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 29 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 30 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 31 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 32 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 33 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 34 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 35 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 36 is a diagram illustrating a programming example of a header file for realizing the file management apparatus according to the embodiment.
FIG. 37 is a diagram showing a BLOB record access method by a general file management apparatus.
FIG. 38 is a diagram showing a BLOB record access method by a general file management apparatus.
[Explanation of symbols]
1a, 1b File management device
2a, 2b Record operation means
3a, 3b Long record operation means
3b-1 Access start point determination control unit
3b-2 Maintenance unit
4a Normal record operation means
4b Data page operation means
5a Index operation means
5b Tree index operation means
6a, 6b database
11 Display device
12 Keyboard
13 Central processing unit / Main memory
14 Database
14-1 to 14-n, 14-N slices
14a Record entry
14b Normal record
14c BLOB management information
14d index page
14d-1 to 14d-3 Index group
14e Data page
15,15A index
21 Application processing unit
22 Record operation part (record operation means)
22a Operation request part
22b Re-registration operation unit
23 BLOB record operation section (long record operation means)
23a Access start point determination control unit
23b Maintenance unit
23c Summation unit
23d Ratio calculator
23e Ratio comparison unit
23f Garbage collection controller
24 Tree index operation section
24a Index page operation section
24b Index page operation section
25 Data page operation section (data page operation means)
25a Record distinction recording part
26 page buffer
30 Inserted data
31 New slice area
101 records
101-1 to 101-n slices
102 pages

Claims

In a file management method for managing files registered in a database,
When a request for a record within one page size is received, the database is accessed within one record,
When a registration request regarding a new long record having a size that does not fit within the size of one page is received, the new long record is decomposed into slices divided into page units and registered in the database.
When a request is received for a long record that is registered in the database, has a size that does not fit within the size of one page, and is composed of a plurality of slices, partial access within the long record, decomposed into slice units, is performed with the database, or an index of the long record is generated or updated and access to the long record is performed with the database in units of the slice. A file management method characterized by the above.

In a file management device that manages files registered in the database,
A record operation means for receiving various requests for records;
Long record operation means for decomposing a long record into slice units when a request for the long record, which has a size that does not fit within the size of one page and is composed of a plurality of slices divided into page units, is received from the record operation means,
In response to a request for a record within one page size from the record operation means or a request for a long record from the long record operation means, an access within one record or a partial access within a record is performed with the database. Normal record operation means to be performed between,
An index operation means for generating or updating the index of the long record and accessing the long record with the database ;
When the normal record operation means receives a request for a record within one page size from the record operation means, the normal record operation means accesses the database within one record,
When the long record operation means receives a registration request for a new long record whose size does not fit within one page, the new long record is decomposed into slice units each having a size corresponding to one page and is registered in the database by the normal record operation means,
When the long record operation means receives a request for a long record that is registered in the database, has a size that does not fit within one page, and is composed of a plurality of slices, either partial access within the long record, decomposed into the slice units, is performed with the database, or the index of the long record is generated or updated and access to the long record is performed with the database in units of the slice. A file management apparatus configured as described above.

In a file management device that manages files registered in the database,
Record operation means for accepting various record operation requests for records or partial data in the records;
Data page operation means for, when the data subject to the request received by the record operation means is a normal record of a size that fits within one page, allocating the new record targeted by the request, or the existing record indicated by a record identifier registered in the database, as a normal record that fits within one page, performing the various record operations, and performing partial access within one normal record;
Long record operation means for, when the data subject to the request received by the record operation means is a long record of a size that does not fit within one page and is composed of a plurality of slices divided in page units, decomposing the new record or the long record registered in the database into the slice units to be operated on, and then requesting the data page operation means to perform the various record operations for each slice unit;
Tree index operation means for constructing an index such that the index at the lowest leaf level stores the size of each slice itself and the record identifier of that slice, the index at each level above stores the sum of the sizes indicated by the indexes below it, the sum of the sizes represented by the index at the highest root level corresponds to the size of the entire long record, and the order of the leaf-level indexes matches the order of the offsets of the corresponding slices;
A file management apparatus characterized in that the long record operation means comprises:
An access start point determination control unit that, when partial access is performed on an existing long record, determines the access start point either by sequentially navigating the address links between slices or by obtaining it from the tree index using the offset within the entire record; and
A maintenance unit that requests the tree index operation means to perform maintenance of the tree index accompanying addition and deletion of slice units.
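The tree index structure recited above can be sketched as follows. This is only an illustration under stated assumptions: `LeafEntry`, `InnerEntry`, `IndexPage`, `build_tree_index`, and the `fanout` parameter are hypothetical names, record identifiers are modelled as integers, and the index is built bottom-up in memory rather than on database pages.

```python
from dataclasses import dataclass


@dataclass
class LeafEntry:
    size: int  # size of the slice itself
    rid: int   # record identifier of the slice (modelled as an int here)


@dataclass
class IndexPage:
    entries: list  # LeafEntry objects on leaf pages, InnerEntry objects above
    is_leaf: bool

    def total(self) -> int:
        """Sum of the sizes held by the entries on this page."""
        return sum(e.size for e in self.entries)


@dataclass
class InnerEntry:
    size: int         # sum of the sizes indicated by the lower-level page
    child: IndexPage


def build_tree_index(slices: list, fanout: int = 4) -> IndexPage:
    """Build the index from (size, rid) pairs given in offset order."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not slices:
        raise ValueError("a long record has at least one slice")
    # Leaf level: one entry per slice, kept in offset order.
    level = [
        IndexPage([LeafEntry(s, r) for s, r in slices[i:i + fanout]], True)
        for i in range(0, len(slices), fanout)
    ]
    # Upper levels: each entry stores the sum of the page below it.
    while len(level) > 1:
        level = [
            IndexPage([InnerEntry(p.total(), p) for p in level[i:i + fanout]], False)
            for i in range(0, len(level), fanout)
        ]
    return level[0]
```

With this layout, `root.total()` equals the size of the entire long record, which is the property the claim attaches to the root level.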

The file management apparatus according to claim 2 or 3, wherein the various record operations for which the record operation means receives requests are any of new creation, deletion, fetching, midway insertion, midway deletion, and update or reading of partial data in a record.

The file management apparatus according to claim 3, wherein
The data page operation means has a record distinction recording unit that records, for each record in a page, whether the record is a normal record or a long record, and
The record operation means has an operation request unit that, based on the record distinction obtained from the record distinction recording unit of the data page operation means for the record identifier of an existing record, requests the record operation from the data page operation means in the case of a normal record and from the long record operation means in the case of a long record.

The file management apparatus according to claim 2, wherein the record operation means includes a re-registration operation unit that, when the size of the entire record no longer fits within one page as a result of an operation on an existing normal record, re-registers the normal record as a long record and then requests the record operation from the long record operation means, and that, when the size of the entire long record fits within one page after an operation on the long record has been completed, re-registers that long record as a normal record.

The file management apparatus according to claim 3, wherein the access start point determination control unit of the long record operation means, when determining the slice at which partial access to an existing long record starts, performs control so that the start point is determined by using the tree index operation means if a tree index has been constructed, and otherwise by navigating the address links of the slices themselves.

The file management apparatus according to claim 7, wherein the maintenance unit of the long record operation means, when the access start point is determined by the tree index operation means under the control of the access start point determination control unit, decides whether to create or delete the tree index according to how the size of the entire long record compares with a predetermined threshold size.

The file management apparatus according to claim 7, wherein the maintenance unit of the long record operation means is configured to decide whether to create or delete the tree index according to whether partial access is necessary.

The file management apparatus according to claim 9, wherein the maintenance unit of the long record operation means is configured to create the tree index when a partial access occurs.

The file management apparatus according to claim 7, wherein the maintenance unit of the long record operation means is configured to decide whether to create or delete the tree index according to both how the size of the entire long record compares with a predetermined threshold size and whether partial access is necessary.
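The claims name two inputs to the create-or-delete decision (record size against a threshold, and the need for partial access) without stating how they are combined. The sketch below assumes one plausible combination, namely that an index is wanted only when the record is at least the threshold size and partial access is needed. The function name `index_action` and its return values are hypothetical.

```python
def index_action(record_size: int, threshold: int,
                 partial_access_needed: bool, has_index: bool) -> str:
    """Return 'create', 'delete', or 'keep' for a long record's tree index.

    Assumption (not stated in the claims): the index is wanted only when
    the record is at least `threshold` bytes AND partial access is needed.
    """
    want_index = record_size >= threshold and partial_access_needed
    if want_index and not has_index:
        return "create"
    if not want_index and has_index:
        return "delete"
    return "keep"
```

Below the threshold, walking the address links between a handful of slices is cheap, so dropping the index avoids its maintenance cost.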

The file management apparatus according to claim 3, wherein the tree index operation means includes an index page operation unit that, when searching for a slice from an offset within the entire long record, searches each index page, from the highest root index page down to the lowest leaf index page of the tree index, for the index at which the running total of the sizes matches or immediately precedes the desired offset, moves in turn from the retrieved index to the index page of the next lower level and repeats the search, and thereby obtains, from the index on the leaf index page, the record identifier of the slice containing the offset.
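The descent described in this claim can be sketched as below. The representation is an assumption made for brevity: an index page is a Python list of `(size, child)` pairs, where `child` is another list on upper levels and an integer record identifier on the leaf level. The function name `find_slice` is hypothetical, and the sketch selects the entry whose size range contains the offset, which is one reading of "matches or immediately precedes".

```python
def find_slice(root: list, offset: int) -> tuple:
    """Return (record identifier, offset within that slice) for a byte offset."""
    if offset < 0:
        raise IndexError("offset must be non-negative")
    page = root
    while True:
        running = 0  # running total of the sizes already skipped on this page
        for size, child in page:
            if offset < running + size:
                offset -= running  # make the offset relative to this entry
                break
            running += size
        else:
            raise IndexError("offset beyond end of long record")
        if isinstance(child, list):
            page = child          # descend to the index page of the lower level
        else:
            return child, offset  # leaf entry: child is the record identifier


# Usage: two leaf pages under one root page.
leaf1 = [(100, 11), (50, 12)]
leaf2 = [(70, 13)]
root = [(150, leaf1), (70, leaf2)]
print(find_slice(root, 149))  # (12, 49)
```

The number of pages visited equals the height of the tree, rather than the number of slices that precede the offset, which is the advantage over following the address links from the first slice.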

The file management apparatus according to claim 4, wherein the long record operation means includes an index page operation unit that, when partial insertion is performed on a long record, merges the slice newly generated by the insertion operation with the immediately following slice if the page storing the newly generated slice has free space after the insertion.

The file management apparatus according to claim 4, wherein the long record operation means includes an index page operation unit that, when partial deletion is performed on a long record, merges the slice containing the start point of the deletion operation with the slice containing its end point.

The file management apparatus according to claim 3, wherein the long record operation means comprises:
A sum calculation unit that, after a resize operation has been performed on a long record, calculates the sum of the slice sizes of all slices constituting the long record;
A ratio calculation unit that calculates the ratio of the sum calculated by the sum calculation unit to the product of the number of slices and the size of one page;
A ratio comparison unit that compares the ratio calculated by the ratio calculation unit with a preset ratio; and
A garbage collection control unit that, when the comparison by the ratio comparison unit shows the calculated ratio to be smaller than the preset ratio, performs garbage collection by sequentially merging all the slices so that, between adjacent slices, the free area of each slice is filled with data.
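The ratio test and the resulting compaction can be sketched as follows. The sketch assumes slices are byte strings, that each slice may grow to a full page, and that free areas are filled from the slices that follow. `fill_ratio`, `garbage_collect`, and `min_ratio` are hypothetical names. Repacking the concatenated data gives the same final layout as merging adjacent slices one after another, so the sketch takes that shortcut.

```python
def fill_ratio(slice_sizes: list, page_size: int) -> float:
    """Sum of slice sizes divided by (number of slices x one page size)."""
    if not slice_sizes:
        return 1.0  # nothing stored, so nothing is wasted
    return sum(slice_sizes) / (len(slice_sizes) * page_size)


def garbage_collect(slices: list, page_size: int, min_ratio: float) -> list:
    """Repack the slices only when the fill ratio is below min_ratio."""
    sizes = [len(s) for s in slices]
    if fill_ratio(sizes, page_size) >= min_ratio:
        return slices  # dense enough; leave the slices untouched
    data = b"".join(slices)
    # Every slice except possibly the last ends up completely full.
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]
```

For example, three 10-byte slices on 20-byte pages have a fill ratio of 0.5; with a preset ratio of 0.8 they are repacked into one 20-byte slice and one 10-byte slice, freeing a page.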

The file management apparatus as described above, wherein the long record operation means comprises:
A sum calculation unit that, after the maintenance unit has corrected the index, calculates the sum of the slice sizes in the corrected leaf index page;
A ratio calculation unit that calculates the ratio of the sum calculated by the sum calculation unit to the product of the number of slices and the size of one page;
A ratio comparison unit that compares the ratio calculated by the ratio calculation unit with a preset ratio; and
A garbage collection control unit that, when the comparison by the ratio comparison unit shows the calculated ratio to be smaller than the preset ratio, performs garbage collection by sequentially merging the slices corresponding to the indexes in the leaf index page so that, between adjacent slices, the free area of each slice is filled with the data of the slice that follows it.