JP4251725B2

JP4251725B2 - File management method

Info

Publication number: JP4251725B2
Application number: JP19458999A
Authority: JP
Inventors: 光則郡
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1999-07-08
Filing date: 1999-07-08
Publication date: 2009-04-08
Anticipated expiration: 2019-07-08
Also published as: JP2001022622A

Description

【０００１】
【発明の属する技術分野】
本発明はファイル管理方法、特に可変長データを取り扱うことのできる改良されたファイル管理方法に関する。
【０００２】
【従来の技術】
データがレコード単位に格納されたファイルから所望するデータを取り出す場合、当該データを含むレコード全体を入出力バッファに書き込み、その入出力バッファの中から該当するフィールドのみを切り出す処理が必要であった。つまり、例えば５１２バイトで１レコードが形成されている場合には、所望するデータが４バイトであっても１レコードすなわち５１２バイト分をファイルから読み出さなければならなかった。例をあげて説明すると、社員データベースの中から氏名と住所だけを取り出して社員の住所録を作成する場合、上記のデータ読出し方法に従えば社員全員のレコードを入出力バッファへ読み出して、その中から氏名と住所だけを取り出さなければならない。このような方法だと必要以外のデータも読み出さなくてはならず、効率的でないし、処理負荷が無用に増大してしまう。
【０００３】
そこで、本願と同一の出願人は、データをレコード単位に読み出すのではなく各レコードにおいてフィールド単位で読み出すことができるようにしたファイル管理方法を提案した（特願平９−３１９５２７号、以下「先行文献」）。
【０００４】
この管理方法について図８を用いて説明する。元ファイル１には複数のフィールド２から構成されるレコ−ド３が複数格納されているとき、レコード３を一定件数、例えばＮレコードずつ分割する。次に、分割した各グループにおいて各レコードの先頭から順に一フィールドずつ分割することによって同一位置すなわち同一項目を格納するために設けられたフィールドをまとめてブロック４を生成する。レコード３においてフィールド２が行方向に並んでいるとしたならば、フィールド２をグループ内において列方向にまとめることでブロック４を生成するということができる。そして、分割した各グループにおいて分割したブロック４を先頭から行方向に順次連結してグループ５を再編成する。これをレコード３の全件について行い、その後グループ５を連結することで転置ファイル６を生成する。
【０００５】
このような構成の転置ファイル６を生成することにより、例えば、上記例において各レコード３に含まれる氏名のみを取り出したいときには氏名が記憶されたフィールド２を含むブロック４のみを転置ファイル６から順次読み出せば、社員番号や年齢など氏名以外のデータを社員データベースから読み出す必要がないので、入出力データ量の少ない効率的なデータ読出し処理を行うことができる。
【０００６】
ところで、先行文献においては、処理速度の高速化を実現するためにブロック４を構成する各フィールド（以下、「内部フィールド」）２を固定長としている。特に、内部フィールド長をワード境界などの固定境界と合致させることでディスク装置に対する物理的な入出力回数が増えないように配慮している。従って、元ファイル１を構成する各フィールド（以下、「論理フィールド」）が可変長である場合、図９に示したように、論理フィールドを１乃至複数の固定境界に従った固定長の内部フィールドに変換していた。そして、論理フィールドのデータで満たされなかった内部フィールド内の領域に対してパディング（ｐａｄｄｉｎｇ）を施している。
【０００７】
なお、転置ファイルの生成処理の説明に関し、元ファイル１に含まれるフィールド２からブロック４が生成されるように図８を用いて上述したが、実際には、図９に示したように論理フィールドから固定長の内部フィールドに変換する処理と、内部フィールドを集めてブロック４を生成する処理という２段階の手順で構成されている。
【０００８】
【発明が解決しようとする課題】
しかしながら、先行文献においては、同一項目を格納するための内部フィールド長は、フィールド内に格納するデータ長に関係なく全レコード共通の長さとなる。上記例に従って説明すると、社員データベースに格納される住所の実際のデータ長は全て同じではないが、先行文献においては、住所データを登録するために同一長であって固定長の内部フィールドが全レコード共通に割り当てられる。例えば、住所データのために１００バイト分の内部フィールドが用意された場合、住所データが４０バイトであっても６０バイトであっても１００バイト分のフィールド長が割り当てられる。
【０００９】
本発明は以上のような問題を解決するためになされたものであり、その目的は、可変長データを可変長フィールドを用いて管理しうる改良されたファイル管理方法を提供することにある。
【００１０】
【課題を解決するための手段】
以上のような目的を達成するために、本発明に係るファイル管理方法は、少なくとも１つの可変長のフィールドを含むレコードを複数格納した元ファイルの管理を行うファイル管理方法において、元ファイルに格納されたレコードを構成する各フィールドを内部フィールドに１対１に変換する際、可変長データを含む可変長フィールドをそれぞれ、データの区切りを示す固定境界と区切りが合致する１つの内部フィールドに変換するフィールド変換ステップと、変換された内部フィールドにより構成される全レコードを、複数の群に分割することによってレコードグループを生成するレコードグループ生成ステップと、各レコードグループにおいて、各レコードにおける同一フィールドが同じグループに含まれるように分割することでブロックを生成するブロック生成ステップと、レコ−ドグループ毎に、当該レコ−ドグループにおいて生成されたブロックを並べて含むグループを生成し、更にその生成したグループを並べて含むファイルを転置ファイルとして生成し、更に可変長データを含むブロックの転置ファイルにおける格納位置を特定できるブロック管理情報を生成する転置ファイル生成ステップとを含み、元ファイルからの可変長データ読出し要求に対して当該格納位置情報を参照することによって転置ファイルから該当する可変長データを特定して読み出すものである。
【００１１】
他の発明に係るファイル管理方法は、少なくとも１つの可変長のフィールドを含むレコードを複数格納した元ファイルの管理を行うファイル管理方法において、元ファイルに格納されたレコードを構成する各フィールドを内部フィールドに１対１に変換する際、可変長データを含む可変長フィールドに対して当該可変長データを可変長データ記憶手段に登録するとともに、可変長データ記憶手段における当該可変長データ格納位置情報を内部フィールドに設定するフィールド変換ステップと、変換された内部フィールドにより構成される全レコードを、複数の群に分割することによってレコードグループを生成するレコードグループ生成ステップと、各レコードグループにおいて、各レコードにおける同一フィールドが同じグループに含まれるように分割することでブロックを生成するブロック生成ステップと、レコ−ドグループ毎に、当該レコ−ドグループにおいて生成されたブロックを並べて含むグループを生成し、更にその生成したグループを並べて含むファイルを転置ファイルとして生成する転置ファイル生成ステップとを含み、元ファイルからの可変長データ読出し要求に対して転置ファイルに格納された当該可変長データの格納位置情報を参照することによって可変長データ記憶手段に格納された当該可変長データを特定して読み出すものである。
【００１２】
また、上記各発明において、前記フィールド変換ステップは、内部フィールドに空き領域を付加することによって各内部フィールドの区切りを物理的な処理単位となる境界に合致させるものである。
【００１３】
また、上記各発明において、前記ブロック生成ステップは、生成したブロックサイズが物理的な最小入出力単位の整数倍と一致しないときには、最小入出力単位の整数倍になるように当該ブロックに空き領域を付加するものである。
【００１４】
【発明の実施の形態】
以下、図面に基づいて、本発明の好適な実施の形態について説明する。
【００１５】
実施の形態１．
図１は、本発明に係るファイル管理方法の実施の形態１に用いられる転置ファイルの生成方法を示すための模式図であり、図２は、元ファイルに含まれているレコードを構成する論理フィールドの構造と本実施の形態におけるファイル管理システムが取り扱う内部フィールドの構造並びに各フィールドの対応関係を示した図である。これらの図及び図３に示したフローチャートを用いて本実施の形態における転置ファイル生成処理について説明する。
【００１６】
本実施の形態におけるこの処理の手順自体は、先行文献に記載された手順と基本的には同様である。すなわち、元ファイルの論理フィールドを内部フィールドに変換し、同一の内部フィールドをまとめることでブロックを生成し、そのブロックを連結することで転置ファイルを生成する。以下、この各処理について詳述する。
【００１７】
ステップ１０１において、まず、図２に示したようにレコードを構成するＭ個の各論理フィールドField0,Field1,...,Field(M-1)をＭ個の内部フィールドField#0,Field#1,...,Field#(M-1)に変換する。本実施の形態における論理フィールドには、図２（ａ）に示したように実際のフィールドデータに加えてフィールド長が付加されており、このため各論理フィールドを可変長に形成することができる。各論理フィールドを内部フィールドに変換する際、内部フィールドの区切りがワード境界に合致するように必要に応じて空き領域を設け、その空き領域に図２（ｂ）に示したようにｐａｄｄｉｎｇ（パディング）を挿入する。これは、処理効率を向上させるためにワード境界などの固定境界に内部フィールドの区切りを合致させるためである。この空き領域の大きさは、論理フィールド長と固定境界を意識した内部フィールド長との差分に相当する。従って、図２（ｂ）におけるField1とField#1の関係から明らかなように、論理フィールドの対応する内部フィールドの長さがその論理フィールドと一致するとは限らない。なお、図２（ｂ）にはデータの後ろにパディングを挿入した例を示したが、一内部フィールド内におけるパディングの挿入箇所及び数はこの例に限定されるものではない。
【００１８】
本実施の形態において特徴的なことは、内部フィールドにより構成されるレコード（以下「内部レコード」ともいう）に含まれる各内部フィールド（Field#0,Field#1,...,Field#(M-1)）の長さが可変であることのみならず、同一内部フィールド、例えば図２（ａ），（ｂ）に示した各Field#1のように各内部レコードにおける同一位置におけるフィールド長をも可変として扱うことができるようにしたことである。これにより、可変長データを可変長のまま取り扱うことができる。上記例に従って説明すると、社員データベースに格納される住所データが４０バイトであれば４０バイト長の内部フィールドを、６０バイトであれば６０バイト長の内部フィールドを割り当てることができるようになる。
【００１９】
以上のようにして元ファイルに含まれていたレコード（以下「論理レコード」ともいう）が内部レコードへ変換されると、次に、内部レコードを複数の群に分割することによってレコードグループ１１を生成する（ステップ１０２）。本実施の形態の場合、先行文献と同様に元ファイルを予め決められたＮレコードずつに分割して等しい大きさのレコードグループ１１を生成するものとする。
【００２０】
続いて、各レコードグループ１１内においてブロック１２を生成する（ステップ１０３）。これは、各レコードグループ１１に対して同じ処理が施される。また、各レコードグループ１１における生成ブロック数は全て同じであり、本実施の形態の場合、内部レコードを先頭から一内部フィールドずつ行方向へ順番に分割する。そして、分割した同一の内部フィールドをまとめることによってブロック１２を生成する。つまり、例えば図１によると、先頭のＮレコードから構成される先頭のレコードグループ１１に含まれる先頭のブロック１２は、１番目からＮ番目の内部レコードにおける同一内部レコードField#0により生成されることになる。端的に言うと、このブロック生成処理は、各内部レコードを行方向に一内部フィールドずつ分割した後、内部フィールドを列方向にまとめてブロック１２を生成する。結果として、内部フィールドの数と等しい個数のブロック１２が各レコードグループ１１において生成されることになる。本実施の形態のように、固定数Ｎのレコードで構成されるレコードグループを一内部グループずつに分割してブロックを生成する場合の処理自体は先行文献と同じである。
【００２１】
ところで、本実施の形態における内部フィールドは、論理フィールドからの変換時にパディングが施されることによって隣接する内部フィールドとの区切りがワード境界に合致するように調整されている。しかし、ブロック１２の大きさが物理的な入出力単位の整数倍となるとは限らない。すなわち、ディスク装置などの記憶装置では、物理的に入出力できる最小単位は固定されている場合が多い。この最小単位を以下、「セクタサイズ」と呼ぶ。ディスク装置に格納されているファイルにアクセスを行うときの入出力単位がセクタサイズの整数倍であれば、記憶装置と入出力バッファとの間で直接入出力を行うことができる。しかし、入出力単位がセクタサイズの整数倍でなければ、ディスク装置内のデータをメモリにいったん読み出した後、改めて入出力バッファにコピーを行うようにしなければ入出力を行えない場合がある。このことは、本実施の形態においても同様なことがいえ、効率的な入出力を行うためにはブロック１２の大きさをセクタサイズの整数倍にすることが望ましい。本実施の形態においては、内部フィールド長をワード境界などの固定境界に合致させるようにはしたが、内部フィールドをまとめて生成するブロック１２の大きさが必ずしもセクタサイズの整数倍になるとは限らない。例えば、セクタサイズが５１２バイト、ワード境界が４バイトである場合、１０バイトの論理フィールドは１２バイトの内部フィールドに変換されるが、このとき１２×４２＋８＝５１２であるから４２レコード分の１２バイト長の内部フィールドをまとめて生成されたブロック１２は、セクタサイズの整数倍にはならない。
【００２２】
そこで、本実施の形態においては、内部フィールドをまとめてブロック１２を生成した結果、そのブロック１２の大きさがセクタサイズの整数倍にならなかったときにはパディングを施してセクタサイズの整数倍となるように調整する。図４は、可変長である一ブロック１２のデータ構造例を示した図であるが、内部フィールドを連結したときのブロックサイズがセクタサイズの整数倍とならない場合には、最終内部フィールドに続けて空き領域１６を設けて、そこにパディングを施す。このようにして、ブロックサイズがセクタサイズの整数倍となるように調整され、このようなブロックサイズの調整を行うようにすることで入出力処理の効率化を図ることができる。
【００２３】
なお、このブロック生成処理においてパディングを挿入するのであれば、内部フィールド内にパディングを施す必要がなくなる。つまり、ブロック１２の生成の元になる論理フィールド長の総和とセクタサイズの整数倍との差異の数のパディングをブロック１２に施せばよいことになる。内部フィールド内にパディングを施さず、内部フィールドの区切りをワード境界に合致させないようにするのであれば、論理フィールドを内部フィールドにマッピング（変換）する必要もなくなる。
【００２４】
最後に、転置ファイルを生成するわけであるが（ステップ１０４）、ここでは、まず各レコードグループ１１において生成されたブロック１２を先頭から順番に連結していくことによってグループ１３を新たに生成する。図１に基づけば、横（行）方向に並んでいるブロック１２をグループというファイルに順番（列方向）に追加登録していくようなイメージである。この処理ではレコードグループ数個のグループ１３が作成されることになる。各レコードグループ１１に対してブロック生成処理が行われると、各グループ１３を連結することで転置ファイル１４を生成する。なお、レコードグループ１１と、レコードグループ１１に対応するグループ１３とは、それぞれを構成する内部フィールドは同一であるが、内部フィールドの並び順の相違により内部構造が異なるので、異なる符号を付け異なる構成要素として示している。
【００２５】
更に、可変長の内部フィールドから構成される各ブロック１２の大きさも可変であるため、ブロック１２を連結する際に各ブロック１２の大きさを特定するための情報を生成しておく必要がある。本実施の形態では、管理ファイル１５を生成し、ブロック管理情報として各グループ１３における各ブロック１２の転置ファイル１４におけるオフセット位置を管理ファイル１５で保持管理するようにした。これにより、転置ファイル１４へアクセスする際には管理ファイル１５に基づき転置ファイル１４における所望するブロック１２の格納位置を特定でき、アクセス対象となるフィールドデータを読み出すことができる。以上のようにして、単一の元ファイルからフィールドが転置した単一の転置ファイル１４が生成される。
【００２６】
本実施の形態では、以上のようにして転置ファイル１４を事前に生成し、元ファイルからのデータ読出し要求に対して転置ファイル１４へアクセスを行うことで処理効率を向上させることを特徴としている。次に、本実施の形態におけるファイル管理方法に基づくデータの読出し処理について図５に示したフローチャートを用いて説明する。
【００２７】
ファイル管理システムは、要求されたフィールドデータを転置ファイル１４から読み出す。この際、内部処理では、要求されたデータを含むブロックを入出力バッファへ読み出すことになるが、フィールド長が可変の場合その読出し先となる入出力バッファをどのような大きさにするかが問題となる。そこで、本実施の形態では、最大ブロックサイズ（全ブロック１２の最大長）を予め求めておき、その最大ブロックサイズに等しい（あるいは整数倍）の大きさの入出力バッファを用意する。
【００２８】
ファイル管理システムは、入出力バッファを確保すると（ステップ２０１）、管理ファイル１５に設定されている各ブロック１２のオフセットを参照することによって要求されたフィールドデータを格納するブロック位置及びそのブロック１２における格納位置を特定して当該フィールドデータを順次読み出す。すなわち、ファイル管理システムは、管理ファイル１５の読出し位置を先頭グループに位置づけて（ステップ２０２）、当該グループに含まれる各ブロックのオフセットを得る（ステップ２０３）。管理ファイル１５に基づき各ブロック１２のオフセット位置が求まると、ブロック１２内に含まれる内部フィールドの構造及びフィールド長は既知なので、読出し対象となるフィールドデータの格納位置を容易に特定できる（ステップ２０４）。この後は転置ファイル１４の特定したグループ１３において特定したブロック１２から読出し対象とされたフィールドデータを順次読み出せばよい（ステップ２０５〜２０９）。本実施の形態では、転置ファイル１４からの読出し処理を非同期に行っている。そして、先頭のグループに対する処理が終了すると、次のグループに処理を移行し、最終的には転置ファイル１４に含まれる全グループに対して前述した読出し処理を施す（ステップ２１０，２１１）。以上のようにして、各グループ１３へのデータ読出し処理が終了すると、この時点で入出力バッファを解放する（ステップ２１２）。
【００２９】
例えば、社員データベースの中から「住所」という論理フィールドField1に格納されているフィールドデータのみを全て取り出したいという要求が送られてきた場合、本実施の形態においては、論理フィールドField1に対応した転置ファイル１４内の内部フィールドField#1からデータが取り出されることになる。このとき、内部フィールドField#1はブロック単位にまとめられて格納されているので、ファイル管理システムは、転置ファイル１４における各グループ１３から内部フィールドField#1を含むブロックのみを各グループ１３から読み出せばよい。この際、各グループ１３に含まれている内部フィールドField#1のブロック１２の転置ファイル１４における格納位置は、管理ファイル１５を参照することによって特定することができる。もし、元ファイルに対して内部フィールドField#1を読出しにいくのであれば、結果として元ファイル全体をアクセスしなければならないが、転置ファイル１４では、内部フィールドField#1をまとめてブロック化してあるので、不要なデータを読み出すことなく「住所」を効率的に読み出すことができる。
【００３０】
本実施の形態によれば、ブロック管理情報を保持管理するだけで、各ブロック長が可変であってもフィールドデータの読出しを容易に行うことができる。なお、本実施の形態では、管理ファイル１５を別個な構成として設けたが、転置ファイル１４の内部に組み込んでもよいし、障害対策としてレコード数等の情報を管理ファイル１５と転置ファイル１４の双方に持たせて二重化するようにしてもよい。
【００３１】
なお、上記説明では、管理ファイル１５に最大ブロックサイズを設けることで読出し処理の開始前に必要なバッファサイズを確実に確保することができるようにした。もし、可変長データの最大長が制限されているような場合には、管理ファイル１５に最大ブロックサイズを設けずにその最大長に基づき計算により最大ブロックサイズを求めるようにしてもよい。これにより、管理ファイル１５の実装を簡素化できる。あるいは、ブロック１２からフィ−ルドデータを読み出す度に必要な入出力バッファの獲得と解放を繰り返し行うようにしてもよい。これにより、必要最小限のサイズのバッファでメモリを確保することができるので、メモリを効率的に使用することができる。必要なときにバッファの獲得と解放を繰り返し行うように処理することは、特にブロックサイズに極端なばらつきがある場合には効果的である。
【００３２】
また、上記実施の形態では、ブロック管理情報として各ブロック１２のオフセット位置を保持管理するようにしたが、直前のブロックサイズとの差異を保持するようにしてもよい。これにより、管理ファイル１５の大きさを縮小することができる。
【００３３】
また、本実施の形態では、転置ファイル１４からの読出し処理を非同期に行うようにしたが、同時に発行できる非同期読出し要求の数に制限がある場合を想定してこの制限数を考慮した非同期発行制御を行う必要がある。あるいは、同期処理として実行してもよい。
【００３４】
実施の形態２．
上記実施の形態１では、可変長データをそのままブロック１２に組み入れたためブロック自体も可変長となり、ブロックサイズを管理するための管理ファイル１５が必要となった。そこで、本実施の形態では、フィールド長がレコードによって異なってくる可変長である場合、可変長データをブロック１２に組み入れず別途可変長データファイルに格納するとともにブロック１２には可変長データファイルへのポインタ情報を設定するようにした。
【００３５】
図６は、本発明に係るファイル管理方法の実施の形態２に用いられる転置ファイルの生成方法を示すための模式図である。本実施の形態においては、実施の形態１に示した管理ファイル１５を設けずに、可変長データを転置ファイル１４と別個に格納する可変長データ記憶手段として可変長データファイル１７を設けており、また、フィールドデータが通常格納される内部フィールドには、扱うデータが可変長である場合には可変長データファイル１７に格納されている該当する可変長データの格納位置を特定するためにオフセット情報及び可変長データのサイズ情報が格納される。このように、可変長となるデータを別ファイルに格納し、可変長データが格納されるべき内部フィールドには固定長で表現できるオフセット及びサイズのデータを格納するようにしたので、転置ファイル１４を固定長のデータ、すなわち実際の固定長データとオフセット及びサイズで生成することができる。図６には、オフセットとサイズとを別の内部フィールドに設定し、別のブロック１２として生成しているが、オフセットとサイズの各内部フィールドをまとめて１つの可変長データ用ブロック１８として生成するようにしてもよいし、大きいサイズの同一の内部フィールドに設けるようにしてもよい。なお、本実施の形態においても上記実施の形態１と同様に、必要に応じてブロック１２にパディングを施して転置ファイル１４に格納することができる。
【００３６】
本実施の形態における転置ファイル生成処理は、基本的には実施の形態１とほぼ同様である。異なる処理のみを説明すると、ステップ１０１において論理フィールドから内部フィールドへ変換する際、扱うデータが可変長であるときには、その可変長データを内部フィールドに設定するのではなく可変長データファイル１７に追加登録する。その際、可変長データファイル１７における可変長データの格納位置情報として、可変長データファイル１７の先頭からのオフセット情報を取得してサイズ情報（可変長データ長）とともに内部フィールドに設定する。その後の処理は、内部フィールドに含まれるデータが可変長データそのものであるかオフセット及びサイズであるかの相違しかなく実施の形態１と同じである。この結果、転置ファイル１４には、固定長データを含むブロック１２とオフセット情報及びサイズ情報を含む可変長データ用ブロック１８とが含まれることになる。
【００３７】
次に、本実施の形態におけるファイル管理方法に基づくデータの読出し処理について図７に示したフローチャートを用いて説明する。ここでは、転置ファイル１４から固定長データを読み出す処理については実施の形態１と同様なので、可変長データを読み出す際の処理についてのみ説明する。なお、内部フィールドに含まれているのが固定長データであるかオフセット情報又はサイズ情報であるかは、内部フィールド内に識別フラグを持たせるなどして識別することができる。
【００３８】
ファイル管理システムは、データの読出し要求を受け取ると、転置ファイル１４の先頭グループ１３にポインタ（読出し位置）を位置づけ（ステップ２２１）、読出し対象とされた可変長データに関するオフセット情報及びサイズ情報を含む可変長データ用ブロック１８を図示しない転置ファイル用ブロックバッファに読み出す（ステップ２２２）。そして、転置ファイル用ブロックバッファの先頭レコードにポインタ（処理位置）を位置づけ（ステップ２２３）、読出し対象とされた可変長データに関するオフセット情報及びサイズ情報を読み出す（ステップ２２４）。続いて、読み出したオフセット情報及びサイズ情報に基づきそのオフセット情報により特定される格納位置からサイズ情報で示された大きさのデータを可変長データファイル１７から図示しない可変長データバッファに読み出す（ステップ２２５）。
【００３９】
この後は転置ファイル用ブロックバッファにおいてポインタを次に移し、次のオフセット情報及びそのオフセット情報に対応したサイズ情報を取得して、可変長データファイル１７から可変長データを読み出す処理を繰り返し行う（ステップ２２６，２２７，２２４，２２５）。
【００４０】
転置ファイル用ブロックバッファに読み出した可変長データ用ブロック１８に対して上記処理（ステップ２２４〜２２７）を繰り返し行い、全てのオフセット情報及びサイズ情報に基づく処理が終了すると、転置ファイル１４に未処理のグループ１３がなくなるまで上記処理（ステップ２２２〜２２７）を繰り返し行う（ステップ２２８，２２９）。
【００４１】
本実施の形態によれば、以上のようにして可変長データを扱うことができる。特に、本実施の形態においては、可変長データを転置ファイル１４とは別個に設けた記憶手段に格納し、転置ファイル１４を構成するブロックには固定長で表現できる可変長データのオフセット情報及びサイズ情報を格納するようにしたので、転置ファイルを固定長ブロックに基づき生成することができる。
【００４２】
【発明の効果】
本発明によれば、内部処理において各レコードにおける同一フィールドに対して異なる長さを持つ可変長フィールドを割り当てることができるようにしたので、可変長データを可変長のまま取り扱うことができる。
【００４３】
また、実際の可変長データを別個に設けた可変長データ記憶手段で保持するようにし、転置ファイルに固定長で表現可能なオフセット情報やサイズ情報などの格納位置情報を格納したので、転置ファイルを固定長ブロックに基づき生成することができる。
【００４４】
また、空き領域を内部フィールドに付加することによって各内部フィールドの区切りを物理的な処理単位となる境界に合致させることができるので、効率的な読出し処理を実現することができる。
【００４５】
また、内部フィールドをまとめてブロックを生成した結果、そのブロックサイズが物理的な最小入出力単位の整数倍にならなかったときにはパディングを施してその最小入出力単位の整数倍となるように調整するようにしたので、入出力処理を効率的に行うことができる。
【図面の簡単な説明】
【図１】本発明に係るファイル管理方法の実施の形態１に用いられる転置ファイルの生成方法を示すための模式図である。
【図２】実施の形態１において元ファイルの論理フィールドと内部フィールドとの対応関係を示したレコード構造図である。
【図３】実施の形態１における転置ファイル生成処理の流れを示したフローチャートである。
【図４】実施の形態１における可変長ブロックのデータ構造例を示した図である。
【図５】実施の形態１におけるデータ読出し処理を示したフローチャートである。
【図６】本発明に係るファイル管理方法の実施の形態２に用いられる転置ファイルの生成方法を示すための模式図である。
【図７】実施の形態２におけるデータ読出し処理を示したフローチャートである。
【図８】従来のファイル管理方法において転置ファイルの生成方法を示すための模式図である。
【図９】従来において元ファイルの論理フィールドと内部フィールドとの対応関係を示したレコード構造図である。
【符号の説明】
１１レコードグループ、１２ブロック、１３グループ、１４転置ファイル、１５管理ファイル、１６空き領域、１７可変長データファイル、１８可変長データ用ブロック。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a file management method, and more particularly to an improved file management method capable of handling variable length data.
[0002]
[Prior art]
When retrieving desired data from a file in which data is stored in record units, it has been necessary to read the entire record containing that data into an input/output buffer and then cut out only the relevant field from the buffer. For example, when one record is 512 bytes long, all 512 bytes of the record must be read from the file even if the desired data is only 4 bytes. To take a concrete case, when creating an employee address book by extracting only the names and addresses from an employee database, this reading method requires the records of all employees to be read into the input/output buffer and only the names and addresses to be extracted from them. With such a method, unneeded data must also be read, which is inefficient and increases the processing load unnecessarily.
[0003]
Therefore, the applicant of the present application has proposed a file management method that allows data to be read field by field within each record rather than record by record (Japanese Patent Application No. 9-319527, hereinafter referred to as the "prior document").
[0004]
This management method will be described with reference to FIG. 8. When a plurality of records 3, each composed of a plurality of fields 2, are stored in the original file 1, the records 3 are divided into groups of a fixed number, for example N records each. Next, within each group, the records are split one field at a time from the head of each record, and the fields at the same position, that is, the fields provided for storing the same item, are gathered to generate a block 4. If the fields 2 are regarded as arranged in the row direction within a record 3, the block 4 can be said to be generated by gathering the fields 2 in the column direction within the group. Then, within each group, the blocks 4 are concatenated sequentially from the head in the row direction to reorganize the group as a group 5. This is performed for all of the records 3, and the groups 5 are then concatenated to generate the transposed file 6.
[0005]
By generating the transposed file 6 with this structure, when, for example, only the name contained in each record 3 in the above example is to be extracted, it suffices to sequentially read from the transposed file 6 only the blocks 4 containing the field 2 in which the name is stored. Data other than the name, such as the employee number and age, need not be read from the employee database, so efficient data reading with a small amount of input/output data can be performed.
[0006]
In the prior document, each field 2 constituting the block 4 (hereinafter, "internal field") has a fixed length in order to increase processing speed. In particular, the internal field length is matched to a fixed boundary such as a word boundary so that the number of physical input/output operations on the disk device does not increase. Accordingly, when a field constituting the original file 1 (hereinafter, "logical field") has a variable length, the logical field was converted, as shown in FIG. 9, into one or more fixed-length internal fields aligned to the fixed boundary. Padding was then applied to the area of the internal field not filled by the logical field data.
[0007]
Note that, although the generation of the transposed file was described above with reference to FIG. 8 as if the blocks 4 were generated directly from the fields 2 contained in the original file 1, the process actually consists of two stages: converting the logical fields into fixed-length internal fields as shown in FIG. 9, and gathering the internal fields to generate the blocks 4.
[0008]
[Problems to be solved by the invention]
However, in the prior document, the length of the internal field for storing a given item is common to all records, regardless of the length of the data stored in the field. In terms of the above example, the actual lengths of the addresses stored in the employee database are not all the same, yet the prior document assigns an internal field of the same fixed length to every record for registering the address data. For example, when a 100-byte internal field is provided for address data, a field length of 100 bytes is assigned whether the address data is 40 bytes or 60 bytes.
[0009]
The present invention has been made to solve the above problems, and an object of the present invention is to provide an improved file management method capable of managing variable length data using a variable length field.
[0010]
[Means for Solving the Problems]
In order to achieve the above object, a file management method according to the present invention is a file management method for managing an original file that stores a plurality of records each including at least one variable-length field, the method comprising: a field conversion step of, when converting each field constituting a record stored in the original file into an internal field on a one-to-one basis, converting each variable-length field containing variable-length data into one internal field whose delimiter matches a fixed boundary indicating a data delimiter; a record group generation step of generating record groups by dividing all of the records composed of the converted internal fields into a plurality of groups; a block generation step of generating blocks in each record group by dividing the record group so that the same field of each record is included in the same group; and a transposed file generation step of generating, for each record group, a group containing the blocks generated in that record group arranged side by side, generating as a transposed file a file containing the generated groups arranged side by side, and further generating block management information that can specify the storage position in the transposed file of a block containing variable-length data; wherein, in response to a request to read variable-length data from the original file, the corresponding variable-length data is identified in and read from the transposed file by referring to the storage position information.
[0011]
A file management method according to another invention is a file management method for managing an original file that stores a plurality of records each including at least one variable-length field, the method comprising: a field conversion step of, when converting each field constituting a record stored in the original file into an internal field on a one-to-one basis, registering, for each variable-length field containing variable-length data, the variable-length data in variable-length data storage means and setting in the internal field the storage position information of that variable-length data in the variable-length data storage means; a record group generation step of generating record groups by dividing all of the records composed of the converted internal fields into a plurality of groups; a block generation step of generating blocks in each record group by dividing the record group so that the same field of each record is included in the same group; and a transposed file generation step of generating, for each record group, a group containing the blocks generated in that record group arranged side by side, and generating as a transposed file a file containing the generated groups arranged side by side; wherein, in response to a request to read variable-length data from the original file, the variable-length data stored in the variable-length data storage means is identified and read by referring to the storage position information of that variable-length data stored in the transposed file.
[0012]
In each of the above inventions, the field conversion step adds a free area to the internal field so that the delimiter of each internal field matches a boundary that serves as a physical processing unit.
[0013]
In each of the above inventions, when the size of a generated block does not equal an integer multiple of the physical minimum input/output unit, the block generation step adds a free area to the block so that its size becomes an integer multiple of the minimum input/output unit.
[0014]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, preferred embodiments of the present invention will be described with reference to the drawings.
[0015]
Embodiment 1.
FIG. 1 is a schematic diagram illustrating the transposed file generation method used in Embodiment 1 of the file management method according to the present invention, and FIG. 2 is a diagram showing the structure of the logical fields constituting the records contained in the original file, the structure of the internal fields handled by the file management system in the present embodiment, and the correspondence between the fields. The transposed file generation process in the present embodiment will be described using these figures and the flowchart shown in FIG. 3.
[0016]
The procedure of this process in the present embodiment is basically the same as the procedure described in the prior document. That is, the logical fields of the original file are converted into internal fields, blocks are generated by gathering the same internal fields, and the transposed file is generated by concatenating the blocks. Each of these processes is described in detail below.
[0017]
In step 101, first, as shown in FIG. 2, each of the M logical fields Field0, Field1, ..., Field(M-1) constituting a record is converted into one of M internal fields Field#0, Field#1, ..., Field#(M-1). As shown in FIG. 2(a), a logical field in the present embodiment carries a field length in addition to the actual field data, which allows each logical field to have a variable length. When a logical field is converted into an internal field, a free area is provided as necessary so that the delimiter of the internal field matches a word boundary, and padding is inserted into that free area as shown in FIG. 2(b). This aligns the internal field delimiters with a fixed boundary such as a word boundary in order to improve processing efficiency. The size of the free area corresponds to the difference between the logical field length and the internal field length adjusted to the fixed boundary. Therefore, as is apparent from the relationship between Field1 and Field#1 in FIG. 2(b), the length of the internal field corresponding to a logical field does not always match the length of that logical field. Although FIG. 2(b) shows an example in which padding is inserted after the data, the position and number of padding insertions within one internal field are not limited to this example.
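As an illustration only, the field conversion of step 101 can be sketched in Python as follows. The function names, the 4-byte word boundary default, and the zero fill byte are assumptions made for this sketch and do not appear in the specification; padding is placed after the data, which is just one of the placements the text allows.

```python
def pad_to_boundary(data: bytes, boundary: int = 4, fill: bytes = b"\x00") -> bytes:
    """Return data followed by enough fill bytes to reach a multiple of boundary.

    The boundary (word size) and the fill byte are assumptions of this sketch.
    """
    if boundary <= 0:
        raise ValueError("boundary must be positive")
    padding = -len(data) % boundary  # 0 when the data is already aligned
    return data + fill * padding


def to_internal_record(logical_fields: list[bytes], boundary: int = 4) -> list[bytes]:
    """Convert each logical field into exactly one internal field (one-to-one)."""
    return [pad_to_boundary(field, boundary) for field in logical_fields]
```

With a 4-byte boundary, a 10-byte logical field becomes a 12-byte internal field, while a field that is already a multiple of 4 bytes long is left unchanged, so internal fields at the same position can differ in length from record to record.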
[0018]
What is characteristic of the present embodiment is that not only are the lengths of the internal fields (Field#0, Field#1, ..., Field#(M-1)) contained in a record composed of internal fields (hereinafter also referred to as an "internal record") variable, but the field length at the same position in each internal record, such as each Field#1 shown in FIGS. 2(a) and 2(b), can also be treated as variable. As a result, variable-length data can be handled as variable-length data. In terms of the above example, if the address data stored in the employee database is 40 bytes, an internal field 40 bytes long can be allocated, and if the address data is 60 bytes, an internal field 60 bytes long can be allocated.
[0019]
When the records contained in the original file (hereinafter also referred to as "logical records") have been converted into internal records as described above, the internal records are next divided into a plurality of groups to generate record groups 11 (step 102). In the present embodiment, as in the prior document, the original file is divided into groups of a predetermined N records each to generate record groups 11 of equal size.
[0020]
Subsequently, blocks 12 are generated within each record group 11 (step 103). The same processing is applied to every record group 11, and the number of blocks generated is the same in every record group 11. In the present embodiment, each internal record is divided from its head, one internal field at a time, in the row direction. The blocks 12 are then generated by gathering the same internal fields. For example, according to FIG. 1, the first block 12 in the first record group 11, which consists of the first N records, is generated from the same internal field Field#0 of the 1st to Nth internal records. In short, this block generation process divides each internal record into individual internal fields in the row direction and then gathers the internal fields in the column direction to generate the blocks 12. As a result, as many blocks 12 as there are internal fields are generated in each record group 11. As in the present embodiment, the process of generating blocks by dividing a record group of a fixed number N of records one internal field at a time is itself the same as in the prior document.
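Steps 102 and 103 amount to slicing the internal records into groups of N and then transposing each group. The following Python sketch shows this under the assumption that an internal record is represented as a list of byte strings; the function names and the sample data are hypothetical.

```python
def make_record_groups(records, n):
    """Step 102 (sketch): split the internal records into groups of n records.

    The last group may hold fewer than n records if the total is not a multiple of n.
    """
    return [records[i:i + n] for i in range(0, len(records), n)]


def make_blocks(record_group):
    """Step 103 (sketch): gather the fields at the same position into one block.

    zip(*rows) turns the rows (records) into columns (same-position fields).
    """
    return [list(column) for column in zip(*record_group)]


# Hypothetical two-field records: an ID and a variable-length address.
records = [
    [b"id1", b"Tokyo"],
    [b"id2", b"Osaka-shi"],
    [b"id3", b"Kobe"],
]
groups = make_record_groups(records, 2)
first_group_blocks = make_blocks(groups[0])
```

Each record group yields as many blocks as there are internal fields, and the fields inside one block keep their individual lengths.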
[0021]
The internal fields in the present embodiment are padded at the time of conversion from the logical fields so that the boundary with the adjacent internal field matches a word boundary. However, the size of a block 12 is not necessarily an integer multiple of the physical input/output unit. In storage devices such as disk devices, the minimum unit that can be physically input or output is often fixed. This minimum unit is hereinafter referred to as the "sector size". If the input/output unit used when accessing a file stored in the disk device is an integer multiple of the sector size, input/output can be performed directly between the storage device and the input/output buffer. If it is not, input/output may be impossible unless the data in the disk device is first read into memory and then copied to the input/output buffer. The same holds for the present embodiment, and it is desirable to make the size of a block 12 an integer multiple of the sector size in order to perform efficient input/output. In the present embodiment, the internal field length is matched to a fixed boundary such as a word boundary, but the size of a block 12 generated by gathering internal fields is not necessarily an integer multiple of the sector size. For example, if the sector size is 512 bytes and the word boundary is 4 bytes, a 10-byte logical field is converted into a 12-byte internal field. Since 12 × 42 + 8 = 512, a block 12 generated by gathering the 12-byte internal fields of 42 records is 504 bytes long, 8 bytes short of one sector, and is therefore not an integer multiple of the sector size.
[0022]
Therefore, in the present embodiment, when a block 12 generated by gathering internal fields does not have a size that is an integer multiple of the sector size, padding is applied so that the size becomes an integer multiple of the sector size. FIG. 4 shows an example of the data structure of one variable-length block 12. When the block size obtained by concatenating the internal fields is not an integer multiple of the sector size, a free area 16 is provided after the last internal field and padding is applied there. The block size is thus adjusted to an integer multiple of the sector size, and this adjustment makes input/output processing more efficient.
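The block-size adjustment can be checked with a short Python sketch. The function name, the zero fill byte, and the 512-byte default are assumptions of this example; the free area is appended after the last internal field, as in FIG. 4.

```python
SECTOR_SIZE = 512  # assumed minimum physical input/output unit, in bytes


def pad_block(block: bytes, sector_size: int = SECTOR_SIZE) -> bytes:
    """Append a free area so the block length is a multiple of sector_size."""
    free_area = -len(block) % sector_size  # 0 when already sector-aligned
    return block + b"\x00" * free_area


# 42 internal fields of 12 bytes each give 504 bytes, 8 bytes short of a sector.
block = b"x" * (12 * 42)
padded = pad_block(block)
```

Here `len(block)` is 504 and `len(padded)` is 512, so the free area is the 8 bytes from the arithmetic in the text.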
[0023]
If padding is inserted in this block generation process, padding within the internal fields becomes unnecessary. That is, it suffices to pad the block 12 by the difference between the sum of the lengths of the logical fields from which the block 12 is generated and an integer multiple of the sector size. If no padding is applied within the internal fields and the internal field delimiters are not matched to word boundaries, there is also no need to map (convert) the logical fields to internal fields.
[0024]
Finally, the transposed file is generated (step 104). First, the blocks 12 generated in each record group 11 are concatenated in order from the head to generate a new group 13. In terms of FIG. 1, the blocks 12 arranged in the horizontal (row) direction are appended one after another (in the column direction) to a file called a group. This process creates as many groups 13 as there are record groups. Once the block generation process has been performed on every record group 11, the groups 13 are concatenated to generate the transposed file 14. A record group 11 and the group 13 corresponding to it consist of the same internal fields, but their internal structures differ because the internal fields are arranged in a different order; they are therefore given different reference numerals and shown as different components.
[0025]
Further, since the size of each block 12 composed of variable-length internal fields is also variable, information for identifying the size of each block 12 must be generated when the blocks 12 are concatenated. In the present embodiment, a management file 15 is generated, and the offset position in the transposed file 14 of each block 12 in each group 13 is held and managed in the management file 15 as block management information. Thus, when the transposed file 14 is accessed, the storage position of the desired block 12 in the transposed file 14 can be identified from the management file 15, and the field data to be accessed can be read. In this way, a single transposed file 14 in which the fields are transposed is generated from a single original file.
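A minimal Python sketch of step 104 together with the block management information follows. It assumes that each block is already serialized as a byte string and that the management information is simply a nested list of byte offsets; the actual layout of the management file 15 is not specified in the text, so this representation is hypothetical.

```python
def build_transposed_file(groups, sector_size=512):
    """Concatenate blocks group by group and record each block's offset.

    groups: list of groups, each a list of blocks (bytes).
    Returns (transposed_bytes, offsets), where offsets[g][b] is the byte
    offset of block b of group g within the transposed data.
    """
    out = bytearray()
    offsets = []
    for blocks in groups:
        group_offsets = []
        for block in blocks:
            group_offsets.append(len(out))  # where this block starts
            out += block
            out += b"\x00" * (-len(block) % sector_size)  # free area
        offsets.append(group_offsets)
    return bytes(out), offsets


# Two groups of two variable-length blocks each (sizes chosen arbitrarily).
groups = [[b"a" * 504, b"b" * 700], [b"c" * 10, b"d" * 512]]
data, offsets = build_transposed_file(groups)
```

Because every block is padded to a sector multiple, all recorded offsets are themselves sector-aligned; in this example they are `[[0, 512], [1536, 2048]]`.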
[0026]
The present embodiment is characterized in that the transposed file 14 is generated in advance as described above and that processing efficiency is improved by accessing the transposed file 14 in response to a request to read data from the original file. Next, the data read process based on the file management method of the present embodiment will be described with reference to the flowchart shown in FIG. 5.
[0027]
The file management system reads the requested field data from the transposed file 14. Internally, the block containing the requested data is read into an input/output buffer, but when the field length is variable, the question arises of how large that input/output buffer should be. In the present embodiment, therefore, the maximum block size (the maximum length over all blocks 12) is obtained in advance, and an input/output buffer whose size equals the maximum block size (or an integer multiple of it) is prepared.
[0028]
After securing the input/output buffer (step 201), the file management system refers to the offset of each block 12 set in the management file 15 to identify the position of the block storing the requested field data and the storage position within that block 12, and sequentially reads the field data. That is, the file management system positions the read position of the management file 15 at the first group (step 202) and obtains the offset of each block included in that group (step 203). Once the offset position of each block 12 has been obtained from the management file 15, the storage position of the field data to be read can easily be identified, because the structure and field lengths of the internal fields contained in the block 12 are known (step 204). After that, the field data to be read is sequentially read from the identified block 12 in the identified group 13 of the transposed file 14 (steps 205 to 209). In the present embodiment, reading from the transposed file 14 is performed asynchronously. When processing of the first group is complete, processing moves to the next group, and eventually the read process described above is applied to all groups contained in the transposed file 14 (steps 210 and 211). When the data read process for every group 13 has finished in this way, the input/output buffer is released (step 212).
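The read loop can be sketched in Python as shown below. This is a simplified, synchronous illustration rather than the asynchronous procedure of FIG. 5: it assumes the block offsets are available as a nested list, derives each block's size from the next block's offset (or the file size for the last block), and returns whole blocks without splitting them into individual fields. All names are hypothetical.

```python
import io


def read_field_blocks(f, offsets, field_index, file_size):
    """Read, from every group, only the block that holds the requested field.

    f: seekable binary file-like object containing the transposed file.
    offsets[g][b]: byte offset of block b in group g (block management info).
    file_size: total length of the transposed file, used to size the last block.
    """
    # A block ends where the next block starts, so flatten all offsets once.
    flat = [off for group in offsets for off in group] + [file_size]
    result = []
    pos = 0  # index into flat of the first block of the current group
    for group in offsets:
        start = flat[pos + field_index]
        end = flat[pos + field_index + 1]
        f.seek(start)
        result.append(f.read(end - start))
        pos += len(group)
    return result


# Hypothetical transposed file: two groups, two sector-padded blocks each.
data = b"A" * 512 + b"B" * 1024 + b"C" * 512 + b"D" * 512
offsets = [[0, 512], [1536, 2048]]
address_blocks = read_field_blocks(io.BytesIO(data), offsets, 1, len(data))
```

Only the second block of each group is touched; the blocks of the other field are never read.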
[0029]
For example, when a request is made to retrieve from the employee database only the field data stored in the logical field Field1 “address”, in the present embodiment the data is extracted from the internal field Field # 1 of the transposed file that corresponds to the logical field Field1. Because the internal field Field # 1 is stored in units of blocks, the file management system need only read, from each group 13 in the transposed file 14, the block containing the internal field Field # 1. The storage position in the transposed file 14 of the block 12 of the internal field Field # 1 in each group 13 can be identified by referring to the management file 15. If this field were read from the original file, the entire original file would end up being accessed. In the transposed file 14, however, the internal field Field # 1 is gathered into blocks, so the “address” can be read efficiently without reading unnecessary data.
[0030]
According to the present embodiment, field data can be read easily even when the length of each block is variable, simply by holding and managing the block management information. In this embodiment the management file 15 is provided as a separate component, but it may instead be incorporated into the transposed file 14. Alternatively, as a countermeasure against failures, information such as the number of records may be held redundantly in both the management file 15 and the transposed file 14.
[0031]
In the above description, the maximum block size is held in the management file 15 so that the necessary buffer size can be secured before read processing starts. If the maximum length of the variable-length data is bounded, the maximum block size may instead be calculated from that maximum length, without holding it in the management file 15. This simplifies the implementation of the management file 15. Alternatively, the necessary input/output buffer may be acquired and released each time field data is read from a block 12. The buffer then has only the minimum necessary size, so memory is used efficiently. Repeatedly acquiring and releasing buffers as needed is particularly effective when block sizes vary widely.
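As a rough illustration of calculating the maximum block size from a bounded field length, the following Python sketch assumes that a block holds one internal field per record of a group, that each internal field is rounded up to a fixed boundary, and that the block is then rounded up to the minimum input/output unit. The 4-byte boundary, the 512-byte unit, and the function names are assumptions chosen for the example, not values taken from the embodiment.

```python
def round_up(value, unit):
    """Round `value` up to the nearest multiple of `unit`."""
    return -(-value // unit) * unit


def max_block_size(records_per_group, max_field_len, boundary=4, io_unit=512):
    """Upper bound on the size of one block, in bytes.

    Each internal field is padded to `boundary`, and the finished block
    is padded to a multiple of `io_unit` (both values are assumptions).
    """
    padded_field = round_up(max_field_len, boundary)
    return round_up(records_per_group * padded_field, io_unit)
```

With 100 records per group and a maximum field length of 98 bytes, each field pads to 100 bytes and the 10,000-byte block pads to 10,240 bytes, so a buffer of that size is always sufficient.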
[0032]
In the above embodiment, the offset position of each block 12 is held and managed as the block management information; however, the difference from the size of the preceding block may be held instead. This reduces the size of the management file 15.
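One possible reading of this variation is that each entry records how much a block's size differs from that of the preceding block, so that the values stored are small, and the offsets are recovered by summation when the management information is loaded. The Python sketch below follows that reading; the function names and the choice of zero as the size preceding the first block are assumptions.

```python
from itertools import accumulate


def encode_size_deltas(sizes):
    """Store each block size as its difference from the preceding block size."""
    deltas = []
    previous = 0  # assumed baseline for the first block
    for size in sizes:
        deltas.append(size - previous)
        previous = size
    return deltas


def decode_sizes_and_offsets(deltas, base=0):
    """Recover block sizes and block offsets from the stored differences."""
    sizes = list(accumulate(deltas))
    if not sizes:
        return [], []
    # Each block starts where the preceding block ends.
    offsets = list(accumulate(sizes[:-1], initial=base))
    return sizes, offsets
```

For block sizes of 512, 520, and 516 bytes, the stored values are 512, 8, and -4, and decoding yields offsets 0, 512, and 1032.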
[0033]
Further, in the present embodiment the read processing from the transposed file 14 is performed asynchronously. If the number of asynchronous read requests that can be issued simultaneously is limited, the issuance of asynchronous requests must be controlled with that limit taken into account. Alternatively, the reads may be performed synchronously.
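Issue control under a limit on simultaneous requests can be sketched with a counting semaphore. The Python example below uses `asyncio` purely as an illustration; the embodiment does not specify a mechanism, and `read_block` stands for any hypothetical coroutine that reads one block.

```python
import asyncio


async def read_blocks_limited(read_block, requests, max_in_flight):
    """Issue one asynchronous read per request, with at most
    `max_in_flight` reads outstanding at any time.

    Results are returned in the same order as `requests`.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def issue(request):
        async with semaphore:       # wait here while the limit is reached
            return await read_block(request)

    return await asyncio.gather(*(issue(r) for r in requests))
```

Setting `max_in_flight` to 1 degenerates to the synchronous alternative mentioned above, with one read completing before the next is issued.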
[0034]
Embodiment 2.
In the first embodiment, the variable length data is incorporated directly into the block 12, so the block itself becomes variable in length and the management file 15 for managing the block size is required. In the present embodiment, therefore, when the field length varies from record to record, the variable length data is not incorporated into the block 12 but is stored separately in a variable length data file, and pointer information into that variable length data file is set in the block 12.
[0035]
FIG. 6 is a schematic diagram showing the transposed file generation method used in the second embodiment of the file management method according to the present invention. In the present embodiment, the management file 15 of the first embodiment is not provided; instead, a variable length data file 17 is provided as variable length data storage means that stores the variable length data separately from the transposed file 14. When the data to be handled is variable length, the internal field, which would normally store the field data itself, stores offset information identifying the storage location of the corresponding variable length data in the variable length data file 17, together with size information of that variable length data. Because the variable length data is stored in a separate file and the internal field that would have held it stores an offset and a size, both of which can be expressed in a fixed length, every block can be generated from fixed-length data only, that is, from actual fixed-length data or from an offset and a size. In FIG. 6, the offset and the size are set in different internal fields and would otherwise form separate blocks 12, but here the offset and size internal fields are generated together as one variable length data block 18. Alternatively, both may be placed in a single internal field of larger size. In the present embodiment, as in the first embodiment, the block 12 can be padded as necessary before being stored in the transposed file 14.
[0036]
The transposed file generation process in the present embodiment is basically the same as in the first embodiment. To describe only the differences: when a logical field is converted into an internal field in step 101 and the data to be handled is variable length, the variable length data is not set in the internal field but is appended to the variable length data file 17. At this time, the offset from the head of the variable length data file 17 is acquired as the storage position information of the variable length data and is set in the internal field together with the size information (the length of the variable length data). The subsequent processing is the same as in the first embodiment, except that the internal field holds an offset and a size rather than the variable length data itself. As a result, the transposed file 14 consists of blocks 12 containing fixed-length data and variable length data blocks 18 containing offset information and size information.
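The conversion of a variable-length logical field into a fixed-length internal field can be sketched as follows. The 12-byte layout (an 8-byte offset followed by a 4-byte size, little-endian) and the names `POINTER` and `register_variable_data` are assumptions made for the example; the embodiment only requires that the offset and size be expressible in a fixed length.

```python
import io
import struct

# Hypothetical fixed-length internal field: 8-byte offset + 4-byte size.
POINTER = struct.Struct("<QI")


def register_variable_data(var_file, data):
    """Append `data` to the variable length data file and return the
    fixed-length internal field (offset, size) that points to it."""
    var_file.seek(0, io.SEEK_END)
    offset = var_file.tell()       # offset from the head of the file
    var_file.write(data)           # additional registration of the data
    return POINTER.pack(offset, len(data))
```

Every internal field produced this way is `POINTER.size` bytes long regardless of the length of the data, which is what allows the blocks of the transposed file to be built from fixed-length fields.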
[0037]
Next, the data read processing based on the file management method of the present embodiment will be described with reference to the flowchart shown in FIG. 7. Because the process of reading fixed-length data from the transposed file 14 is the same as in the first embodiment, only the process of reading variable-length data is described here. Whether an internal field holds fixed-length data, offset information, or size information can be determined by providing an identification flag in the internal field.
[0038]
When the file management system receives a data read request, it positions the pointer (read position) at the head group 13 of the transposed file 14 (step 221) and reads the variable length data block 18, which contains the offset information and size information for the variable length data to be read, into a transposed file block buffer (not shown) (step 222). It then positions the pointer (processing position) at the first record in the transposed file block buffer (step 223) and reads the offset information and size information for the variable length data to be read (step 224). Subsequently, based on that offset information and size information, it reads data of the size indicated by the size information, starting at the storage position specified by the offset information, from the variable length data file 17 into a variable length data buffer (not shown) (step 225).
[0039]
Thereafter, the pointer is advanced to the next record in the transposed file block buffer, the next offset information and its corresponding size information are acquired, and the process of reading the variable length data from the variable length data file 17 is repeated (steps 226, 227, 224, 225).
[0040]
The above processing (steps 224 to 227) is repeated for the variable length data block 18 read into the transposed file block buffer. When the processing based on all the offset information and size information is completed, the above processing (steps 222 to 227) is repeated until no unprocessed group 13 remains in the transposed file 14 (steps 228 and 229).
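The inner loop of steps 223 to 227 can be sketched in Python as below. The sketch assumes that one variable length data block 18 has already been read into memory as `pointer_block`, that any padding has been stripped, and that each record in it is a 12-byte pair of offset and size; this layout and the function name are hypothetical.

```python
import io
import struct

# Hypothetical fixed-length internal field: 8-byte offset + 4-byte size.
POINTER = struct.Struct("<QI")


def read_variable_data(pointer_block, var_file):
    """Follow every (offset, size) pair in one variable length data block
    and return the corresponding data from the variable length data file."""
    values = []
    # Steps 223, 226, 227: walk the records of the block in order.
    for offset, size in POINTER.iter_unpack(pointer_block):  # step 224
        var_file.seek(offset)
        values.append(var_file.read(size))                   # step 225
    return values
```

Calling this function once per group, in group order, corresponds to the outer loop of steps 222, 228, and 229.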
[0041]
According to the present embodiment, variable length data can be handled as described above. In particular, in the present embodiment the variable length data is stored in storage means provided separately from the transposed file 14, and the offset information and size information of the variable length data, which can be expressed in a fixed length, are stored in the blocks constituting the transposed file 14, so the transposed file can be generated from fixed-length blocks.
[0042]
[Effect of the invention]
According to the present invention, variable length fields having different lengths can be assigned to the same field in each record in internal processing, so that variable length data can be handled with a variable length.
[0043]
In addition, since the actual variable length data is held in separately provided variable length data storage means, and storage position information such as offset information and size information, which can be expressed in a fixed length, is stored in the transposed file, the transposed file can be generated from fixed-length blocks.
[0044]
In addition, by adding an empty area to the internal field, the delimiter of each internal field can be aligned with a boundary that is a physical processing unit, so that efficient read processing can be realized.
[0045]
In addition, when the block size is not an integer multiple of the physical minimum input/output unit as a result of generating the block by combining the internal fields, padding is performed to adjust the block size to an integer multiple of the minimum input/output unit. As a result, input/output processing can be performed efficiently.
[Brief description of the drawings]
FIG. 1 is a schematic diagram for illustrating a transposed file generation method used in Embodiment 1 of a file management method according to the present invention.
FIG. 2 is a record structure diagram showing a correspondence relationship between a logical field and an internal field of the original file in the first embodiment.
FIG. 3 is a flowchart showing a flow of transposed file generation processing in the first embodiment.
FIG. 4 is a diagram showing an example of a data structure of a variable-length block according to Embodiment 1.
FIG. 5 is a flowchart showing data read processing in the first embodiment.
FIG. 6 is a schematic diagram for illustrating a transposed file generation method used in the second embodiment of the file management method according to the present invention.
FIG. 7 is a flowchart showing data read processing in the second embodiment.
FIG. 8 is a schematic diagram for illustrating a method for generating a transposed file in a conventional file management method.
FIG. 9 is a record structure diagram showing the correspondence between logical fields and internal fields of the original file in the conventional method.
[Explanation of symbols]
11 record group, 12 block, 13 group, 14 transposed file, 15 management file, 16 free area, 17 variable length data file, 18 variable length data block.

Claims

1. In a file management method for managing an original file storing a plurality of records including at least one variable-length field,
A field conversion step of converting each field constituting a record stored in the original file into an internal field on a one-to-one basis, wherein each variable-length field containing variable-length data is converted into one internal field aligned with a fixed boundary indicating a data delimiter;
A record group generation step for generating a record group by dividing all records composed of converted internal fields into a plurality of groups;
In each record group, a block generation step for generating a block by dividing so that the same field in each record is included in the same group;
A transposed file generation step of generating, for each record group, a group including the blocks generated in that record group, generating a file including the generated groups as a transposed file, and further generating block management information capable of specifying the storage position in the transposed file of a block including variable-length data;
A file management method characterized in that, in response to a request to read variable-length data from the original file, the corresponding variable-length data is identified in and read out from the transposed file by referring to the storage position information.

2. In a file management method for managing an original file storing a plurality of records including at least one variable-length field,
A field conversion step of converting each field constituting a record stored in the original file into an internal field on a one-to-one basis, wherein, for a variable-length field containing variable-length data, the variable-length data is registered in variable length data storage means and storage position information of the variable-length data in the variable length data storage means is set in the internal field;
A record group generation step for generating a record group by dividing all records composed of converted internal fields into a plurality of groups;
In each record group, a block generation step for generating a block by dividing so that the same field in each record is included in the same group;
A transposed file generation step of generating, for each record group, a group in which the blocks generated in that record group are arranged side by side, and further generating, as a transposed file, a file in which the generated groups are arranged side by side;
A file management method characterized in that, in response to a request to read variable-length data from the original file, the variable-length data stored in the variable length data storage means is identified and read out by referring to the storage position information of the variable-length data stored in the transposed file.

3. The file management method according to claim 1 or 2, wherein the field conversion step matches each internal field delimiter with a boundary which is a physical processing unit by adding an empty area to the internal field.

4. The file management method according to claim 1 or 2, wherein, in the block generation step, when the generated block size is not an integer multiple of the physical minimum input/output unit, an empty area is added to the block so that its size becomes an integer multiple of the minimum input/output unit.