JP2012150562A

JP2012150562A - Method and device and program for compressing n-branch tree internal node

Info

Publication number: JP2012150562A
Application number: JP2011007196A
Authority: JP
Inventors: Ken Yamamuro; 健山室; Harushio Hidaka; 東潮日高; Masashi Yamamuro; 雅司山室
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-01-17
Filing date: 2011-01-17
Publication date: 2012-08-09
Anticipated expiration: 2031-01-17
Also published as: JP5523360B2

Abstract

PROBLEM TO BE SOLVED: To reduce the size of a single comparison key (the numerical value used for branch decisions, set in the nodes of an N-ary tree). SOLUTION: An N-ary tree is created for an input sequence of M integers. The N-ary tree is decomposed from the ROOT node into subtrees of height H each, and the subtrees are rearranged breadth-first. The representation space of the comparison keys in each subtree is reduced from 8×S bits to 8×T bits, and blocks containing the converted comparison keys are stored in storage means. To search, the block containing the ROOT node is read from the storage means, the input search key is converted in the same way as in the conversion performed by the initialization means, and the converted search key is compared with the comparison keys of the read block; this is repeated until a leaf node is reached. If the leaf node contains a comparison key equal to the search key, the position information in the integer sequence corresponding to that comparison key is output.

Description

The present invention relates to a method, apparatus, and program for compressing the internal nodes of an N-ary tree, and in particular to a method, apparatus, and program for reducing the data size of the internal nodes of an N-ary tree that is structured to speed up the search for an arbitrary integer (search key) in a sequence of M integers.

The M integers to be searched are arranged in ascending order, and the search determines whether the sequence contains an integer with the same key value as the search key. Starting from the ROOT node at the top of the tree, the search repeatedly compares the search key with the comparison keys in the current node, passes through the BRANCH nodes at the intermediate levels, identifies the LEAF node corresponding to the search key (LEAF nodes hold position information into the integer sequence), and finally determines whether the search key is present in the integer sequence. In the example of FIG. 9, the search key is "323", so the search first moves to the BRANCH node ("201", "393") pointed to by the leftmost arrow of the ROOT node. Since "201" < search key ("323") < "393", the search then moves to the LEAF node at the end of the middle arrow of that BRANCH node. That LEAF node contains a comparison key equal to the search key, so the search is judged successful and the position information of "323" in the integer sequence is returned.

There are two N-ary tree data structures for speeding up search over a sequence of M integers: one that uses pointers (FIG. 10) and one that uses an array (FIG. 11). In the pointer-based structure, as shown in FIG. 10, each BRANCH node consists of (N−1) comparison keys for locating the target position of the search key and N pointers to the next BRANCH nodes (see, for example, Patent Document 1). In the array-based structure, as shown in FIG. 11, each BRANCH node consists only of (N−1) comparison keys (see, for example, Non-Patent Documents 1 and 2); the offset to the next BRANCH node is calculated from the comparison result at each BRANCH node, and the search moves there (see, for example, Patent Document 2).
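As an illustration of the array-based layout, the following is a minimal Python sketch, not taken from the cited documents. It assumes a complete 3-ary tree whose internal nodes are stored breadth-first, so that the child reached through branch b of node i sits at index i × N + 1 + b. Apart from "201", "393", and "323", all key values, the leaf contents, and the names `internal`, `leaves`, `find_leaf`, and `search` are hypothetical.

```python
from bisect import bisect_right

N = 3          # fan-out: each internal node holds N - 1 comparison keys
LEAF_SIZE = 3  # hypothetical number of integers per LEAF node

# Internal nodes in breadth-first order: the ROOT node first, then its N children.
internal = [
    [400, 800],  # node 0 (ROOT)
    [201, 393],  # node 1
    [550, 700],  # node 2
    [900, 950],  # node 3
]

# LEAF nodes in left-to-right order; together they form the sorted integer sequence.
leaves = [
    [10, 57, 120], [201, 250, 323], [393, 395, 399],
    [400, 480, 520], [550, 600, 650], [700, 750, 790],
    [800, 850, 880], [900, 910, 940], [950, 970, 990],
]


def find_leaf(search_key):
    """Walk the internal nodes using only computed offsets (no pointers)."""
    node = 0
    while node < len(internal):
        # Branch number = count of comparison keys that are <= the search key.
        branch = bisect_right(internal[node], search_key)
        node = node * N + 1 + branch
    # Indices past the internal nodes address the LEAF nodes.
    return node - len(internal)


def search(search_key):
    """Return the position of search_key in the integer sequence, or None."""
    leaf = find_leaf(search_key)
    if search_key in leaves[leaf]:
        return leaf * LEAF_SIZE + leaves[leaf].index(search_key)
    return None


print(search(323))  # position of "323" in the flattened integer sequence
```

Because the child index is computed from the comparison result, each BRANCH node needs no pointer fields, which is the property the array-based structure relies on.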

Patent Document 1: Japanese Patent No. 04351530
Patent Document 2: JP 2005-235209 A

Non-Patent Document 1: Changkyu K. et al.: FAST: fast architecture sensitive tree search on modern CPUs and GPUs, Proceedings of the 2010 international conference on Management of data (SIGMOD'10), 2010.
Non-Patent Document 2: Benjamin S. et al.: k-ary search on modern processors, Proceedings of the 19th International Workshop on Data Management on New Hardware, 2009.

If an N-ary tree is constructed for a sequence of M integers, the height of the N-ary tree is given by

(equation not reproduced)

and the total number of comparison keys in the internal nodes of the N-ary tree (N_key) is a geometric series with first term (N−1) and common ratio N:

(equation not reproduced)

In Non-Patent Documents 1 and 2 and Patent Document 2 of the prior art described above, the data size of each comparison key in the internal nodes is the same as that of the integers that make up the N-ary tree (call this size S bytes). The total size of the internal nodes is therefore N_key × 4 bytes when the integers are 4 bytes each (S = 4), and N_key × 8 bytes when they are 8 bytes each (S = 8). This total node size is not a problem when the integer sequence is short (M is small), but when an N-ary tree is built over integer sequences in large-scale data such as sensor information or access log data, the internal nodes of the N-ary tree become very large.
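The growth of the internal nodes can be made concrete with a small calculation. The Python sketch below sums the geometric series with first term (N−1) and common ratio N. The number of internal levels is passed in as a parameter named `levels`, which is an assumption because the height expression is not available in this text; the function names are also hypothetical.

```python
def total_comparison_keys(n, levels):
    """Sum (N-1) + (N-1)*N + (N-1)*N^2 + ... over `levels` internal levels."""
    return sum((n - 1) * n ** i for i in range(levels))


def internal_node_bytes(n, levels, key_bytes):
    """Total internal node size when every comparison key takes key_bytes bytes."""
    return total_comparison_keys(n, levels) * key_bytes


# The series collapses to N^levels - 1, so the size grows exponentially with height.
assert total_comparison_keys(16, 5) == 16 ** 5 - 1

print(internal_node_bytes(16, 5, 4))  # S = 4
print(internal_node_bytes(16, 5, 8))  # S = 8
```

For a fixed fan-out the only remaining lever on the total size is the per-key size S, which is what the invention targets.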

The present invention has been made in view of the above points, and its object is to provide a method, apparatus, and program for compressing N-ary tree internal nodes that can reduce the size of a single comparison key (the numerical value used for branch decisions, set in a node of the N-ary tree).

To solve the above problem, the present invention is an N-ary tree internal node compression apparatus for reducing the data size of the internal nodes of an N-ary tree structured to speed up the search for an arbitrary integer (search key) in a sequence of M integers, the apparatus comprising:
initialization means that creates an N-ary tree (N is a system parameter) for the input sequence of M integers, performs a conversion that reduces the representation space of the comparison keys, which are the numerical values used for branch decisions, within the subtrees of the N-ary tree whose subtrees have been rearranged breadth-first with the subtree containing the ROOT node first, and stores blocks containing the converted comparison keys in storage means; and
search key processing means that reads the block containing the ROOT node from the storage means, applies to the input search key the same conversion as that performed by the initialization means, repeats the comparison of the converted search key with the comparison keys of the read block until a leaf node is reached, and, if the leaf node contains a comparison key equal to the search key, outputs the position information in the integer sequence corresponding to that comparison key.

The initialization means includes means that decomposes the N-ary tree from the ROOT node into subtrees of height H each, rearranges them breadth-first, and reduces the representation space of the comparison keys in each subtree from 8×S bits to 8×T bits (where T is an arbitrary parameter and S > T).

The search key processing means includes:
means that converts the search key into a value comparable within each block, using the information that was used to convert the comparison keys in the initialization means; and
means that searches to the target position by comparing the search key with the comparison keys to determine the next block to search, moving from subtree to subtree.

The search key processing means also includes means that uses the comparison keys in the leaf node to determine whether the leaf node reached first is the correct leaf node for the search key and, if it is not, reads leaf nodes in order to correct the position.

As described above, a single comparison key in an internal node of the N-ary tree can be represented in 8×T bits instead of 8×S bits, so the total size of the internal nodes falls from N_key × S bytes to N_key × T bytes, that is, to about T/S of the original (ignoring the metadata area, which is relatively small). More specifically, if the M integers are 4 bytes each (S = 4) and T is 1, the total size of the internal nodes of the N-ary tree becomes about 25% of the size before conversion.

FIG. 1 is an apparatus configuration diagram in one embodiment of the present invention.
FIG. 2 is a diagram showing decomposition into subtrees and breadth-first rearrangement in the N-ary tree initialization unit in one embodiment of the present invention.
FIG. 3 is a specific example of comparison key reduction conversion using the four arithmetic operations in one embodiment of the present invention.
FIG. 4 is a flowchart given as one specific example of the comparison key conversion module for a subtree in one embodiment of the present invention.
FIG. 5 is a diagram showing the problem of shift in the search process (T = 1) in one embodiment of the present invention.
FIG. 6 is a diagram showing the correction to the correct position in one embodiment of the present invention.
FIG. 7 is a flowchart of the processing of the N-ary tree initialization unit in one embodiment of the present invention.
FIG. 8 is a flowchart of the processing of the search key processing unit in one embodiment of the present invention.
FIG. 9 is a configuration diagram of an N-ary tree.
FIG. 10 is the data structure of an N-ary tree using pointers.
FIG. 11 is the data structure of an N-ary tree using an array.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 shows the apparatus configuration in one embodiment of the present invention.

The apparatus shown in the figure consists of an N-ary tree initialization unit 10, a search key processing unit 20, and an N-ary tree storage unit 30. An integer sequence input device 1, which inputs a sequence of M integers, is connected to the N-ary tree initialization unit 10; a search key input device 2, which inputs a search key, and a result output device 3 are connected to the search key processing unit 20.

The processing in the above configuration is described below.

To process search keys at high speed, the N-ary tree initialization unit 10 creates an N-ary tree (N is a system parameter) for the sequence of M integers input from the integer sequence input device 1. It first builds a conventional N-ary tree from the input integer sequence, which is already sorted in ascending order. Then, as shown in FIG. 2, it decomposes the N-ary tree into subtrees of height H each, rearranges the subtrees breadth-first, and lays out the comparison keys inside each subtree as an array in ascending order to form the data structure (block) of the N-ary tree. The feature of the present invention is that, when converting to blocks, the representation space of the comparison keys in each block is reduced from 8×S bits to 8×T bits (T is an arbitrary parameter of the present invention, with S > T). Details are given with reference to FIGS. 7 and 8. As a specific example of comparison key conversion within a subtree, FIG. 3 shows a comparison key reduction conversion that uses the four arithmetic operations.

The search key processing unit 20 determines whether the search key input from the search key input device 2 is contained in the integer sequence input from the integer sequence input device 1, and returns its position information. During the search, it uses the metadata contained in each block to be compared (the information used to convert the comparison keys in that block) to convert the search key into a value comparable within that block, and then compares it with the comparison keys in the block. It determines the next block to search and proceeds to the target position, moving from subtree to subtree. The search must deal with a problem that arises when the comparison keys in a subtree are compressed: different comparison keys can be converted to the same value (FIG. 5 illustrates this problem, using the flowchart of FIG. 4 as a concrete example). This is the penalty for reducing the representation space. More specifically, because different comparison keys are converted to the same value, a "shift" away from the subtree that should have been searched can occur. As shown in FIG. 6, this shift is handled by adding a correction step: on arrival at a LEAF node, the search key is compared with the comparison keys inside it and the shift is corrected.
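The shift can be reproduced with a few lines of Python. This is a minimal sketch, assuming the four-arithmetic-operation conversion that the description gives for FIG. 4 (subtract V_min, add x, divide by V_max − V_min + x, multiply by 2^{8T}, discard the fraction) with T = 1 and x = 1. The key values, the use of `bisect_right` as the in-block comparison, the clamp to the largest 8T-bit value, and all function names are assumptions made for illustration.

```python
from bisect import bisect_right


def convert(value, v_min, v_max, t_bytes=1, x=1):
    """Map a key into the reduced 8*T-bit space using a block's metadata."""
    limit = 2 ** (8 * t_bytes)
    scaled = (value - v_min + x) * limit // (v_max - v_min + x)
    # Assumption: clamp so that V_max still fits in 8*T bits.
    return min(scaled, limit - 1)


# Hypothetical block: closely spaced keys next to one distant key.
original_keys = [1000, 1200, 1300, 100000]
v_min, v_max = original_keys[0], original_keys[-1]
block_keys = [convert(k, v_min, v_max) for k in original_keys]


def branch_exact(search_key):
    """Branch chosen when comparing against the uncompressed keys."""
    return bisect_right(original_keys, search_key)


def branch_compressed(search_key):
    """Branch chosen after converting the search key with the block metadata."""
    return bisect_right(block_keys, convert(search_key, v_min, v_max))


print(block_keys)                                  # three distinct keys collapse to one value
print(branch_exact(1100), branch_compressed(1100)) # the two branches differ: a shift
```

Here 1000, 1200, and 1300 all convert to the same value, so a search key of 1100 can no longer be placed between them and the compressed comparison selects a different branch from the exact one.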

Next, the processing of the N-ary tree initialization unit 10 is described in detail.

FIG. 7 is a flowchart of the processing of the N-ary tree initialization unit in one embodiment of the present invention.

Step 500) A sequence of M integers in ascending order is input from the integer sequence input device 1.

Step 505) An N-ary tree is created for the M input integer values by a conventional method, giving an ordinary N-ary tree data structure.

Step 510) The N-ary tree obtained in step 505 is decomposed, starting from the ROOT node, into subtrees of height H each as shown in FIG. 2, and the subtrees are rearranged breadth-first.

Step 515) i is initialized to 1.

Step 520) The i-th subtree is taken from the S subtrees obtained by decomposing the N-ary tree at every height H.

Step 525) The "comparison key conversion within a subtree" module (for example, the flowchart of FIG. 4) is used to convert the comparison keys contained in the i-th subtree and form a block. In this conversion, an arbitrary conversion is applied that reduces the representation space of the comparison keys in each subtree from 8×S bits to 8×T bits. One specific example of this reduction conversion is a method that uses the four arithmetic operations, which is described later.

Step 530) When a block converted and compressed in step 525 is input, it is stored in a FIFO buffer (not shown).

Step 535) 1 is added to i.

Step 540) All blocks in the FIFO buffer (not shown) are read out in order and stored in the N-ary tree storage unit 30.
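The decomposition of step 510 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the implementation in the flowchart: the tree contains internal nodes only, it is complete, and the names `Node`, `build`, `cut_subtree`, and `decompose` are hypothetical. N = 2 and 15 keys are chosen only to keep the example short.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)


def build(keys, n, levels):
    """Build a complete N-ary search tree from n**levels - 1 sorted keys."""
    step = n ** (levels - 1)
    node = Node(keys=[keys[i * step - 1] for i in range(1, n)])
    if levels > 1:
        node.children = [
            build(keys[i * step:(i + 1) * step - 1], n, levels - 1)
            for i in range(n)
        ]
    return node


def cut_subtree(root, height):
    """Return the nodes in the top `height` levels under root, plus the
    nodes just below them, which become the roots of the next subtrees."""
    nodes, level = [], [root]
    for _ in range(height):
        nodes.extend(level)
        level = [child for node in level for child in node.children]
        if not level:
            break
    return nodes, level


def decompose(root, height):
    """List the subtrees breadth-first, the one containing ROOT first.
    Each subtree is reported as its comparison keys in ascending order."""
    blocks, queue = [], deque([root])
    while queue:
        nodes, frontier = cut_subtree(queue.popleft(), height)
        blocks.append(sorted(k for node in nodes for k in node.keys))
        queue.extend(frontier)
    return blocks


tree = build(list(range(1, 16)), n=2, levels=4)
blocks = decompose(tree, height=2)
print(blocks)
```

Each inner list corresponds to one subtree before the key conversion of step 525; the FIFO order of steps 530 and 540 is the order of the outer list.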

The reduction conversion of step 525 above is described below with reference to FIG. 4.

Step 600) When the set of comparison keys of a subtree is input to the N-ary tree initialization unit 10, the comparison keys of the subtree are sorted in ascending order as follows.

V_1 (V_min), V_2, …, V_{N-1} (V_max)

Step 605) V_min, the minimum comparison key, is subtracted from every comparison key in the set, and x (a correction term) is added. The correction term x exists to map the minimum value in the subtree to "1"; as shown in FIG. 3, it makes it possible to search into the leftmost subtree after the comparison keys are converted (assuming a "greater than" comparison).

Step 610) Every comparison key in the set (V_n − V_min + x) is divided by (V_max − V_min + x).

Step 615) Every comparison key in the set is multiplied by 2^{8T} (T is a system parameter), and the fractional part is discarded.

Through steps 605, 610, and 615 above, the comparison keys in the subtree are converted by

$$V^{*}_{n} = \frac{V_{n} - V_{\min} + x}{V_{\max} - V_{\min} + x} \times 2^{8T}$$

(where T < S, and V*_n is the value obtained by rounding the right-hand side).

Step 620) The converted comparison keys, in ascending order and with the metadata V_min and V_max prepended, are output as a block.
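Steps 600 to 620 can be sketched as a single Python function. This is a minimal sketch, assuming x = 1, integer arithmetic in place of the division and truncation of steps 610 and 615, and a clamp to 2^{8T} − 1 so that V_max fits in 8×T bits (the clamp is an assumption, since the formula as written maps V_max to 2^{8T}). The function name `compress_block`, the dictionary layout, and the key values are hypothetical.

```python
def compress_block(keys, t_bytes=1, x=1):
    """Convert one subtree's comparison keys into a block of 8*T-bit values."""
    ordered = sorted(keys)                      # step 600: ascending order
    v_min, v_max = ordered[0], ordered[-1]
    span = v_max - v_min + x                    # divisor of step 610
    limit = 2 ** (8 * t_bytes)                  # multiplier of step 615
    converted = [
        # Steps 605 to 615 in one expression; // discards the fraction.
        min((v - v_min + x) * limit // span, limit - 1)
        for v in ordered
    ]
    # Step 620: metadata first, then the converted keys.
    return {"v_min": v_min, "v_max": v_max, "keys": converted}


block = compress_block([393, 201, 700, 550], t_bytes=1)
print(block)
print(len(bytes(block["keys"])))  # bytes needed for the keys when T = 1
```

With S = 4 the same four keys would occupy 16 bytes, so the key payload shrinks to a quarter; the two metadata values are stored once per block at full width.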

Next, the processing of the search key processing unit 20 is described with reference to the flowchart of FIG. 8.

Step 700) A search key is input to the search key processing unit 20 from the search key input device 2.

Step 705) The block of the ROOT node is read from the N-ary tree storage unit 30.

Step 710) The metadata contained in the read block (the information used to convert the comparison keys in the block) is read, the same conversion that was applied to the comparison keys in the block is applied to the search key, and the converted search key is output.

Step 715) Using the converted search key, a comparison with the comparison keys in the block is performed, for example by the method shown at "URL: http://www.geocities.jp/m_hiroi/light/abcruby13.html". The next block to read is then determined, and that block is read from the N-ary tree storage unit 30. If the next block to read is a LEAF node, processing moves to step 720; if it is not a LEAF block, processing returns to step 710.

Step 720) Block information containing the LEAF node is obtained. To deal with the "shift" in search position caused by the conversion of the comparison keys of the N-ary tree internal nodes (FIG. 5), the comparison keys in the LEAF node are used to determine whether the LEAF node reached first is the correct LEAF node for the search key. If it is not, the LEAF nodes are read in order and the position is corrected to the LEAF node that should have been reached, as shown in FIG. 6.

Step 725) The LEAF node is checked for a comparison key equal to the search key. If there is one, the pointer into the integer sequence corresponding to that comparison key is output to the result output device 3. If there is none, the output indicates that the integer sequence contains no value equal to the search key.
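Steps 720 and 725 can be sketched in Python as follows. This is a sketch under assumptions, not the procedure of FIG. 6 itself: LEAF nodes are modeled as sorted lists held in left-to-right order, the correction scans neighbouring LEAF nodes in whichever direction the search key lies, and the names `correct_leaf` and `lookup` and all key values are hypothetical.

```python
def correct_leaf(leaves, start, search_key):
    """Step 720: move from the LEAF node reached first to the one whose
    key range can contain search_key, reading LEAF nodes in order."""
    i = start
    while i > 0 and search_key < leaves[i][0]:
        i -= 1  # the key is smaller than everything here: go left
    while i < len(leaves) - 1 and search_key > leaves[i][-1]:
        i += 1  # the key is larger than everything here: go right
    return i


def lookup(leaves, start, search_key):
    """Step 725: return the position in the integer sequence, or None."""
    i = correct_leaf(leaves, start, search_key)
    if search_key in leaves[i]:
        offset = sum(len(leaf) for leaf in leaves[:i])
        return offset + leaves[i].index(search_key)
    return None


leaves = [[10, 57, 120], [201, 250, 323], [393, 395, 399], [400, 480, 520]]

# Suppose the compressed comparison led to LEAF node 3 although "323" is in node 1.
print(lookup(leaves, 3, 323))
# A key that is absent is reported as not found after the correction.
print(lookup(leaves, 2, 300))
```

The correction costs extra LEAF reads only when a shift actually occurred; when the first LEAF node is already correct, both loops exit immediately.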

As described above, the feature of the present invention is that, when the sequence of subtrees rearranged breadth-first with the subtree containing the ROOT node first is converted into blocks, which are the internal data structure of the N-ary tree (steps 520 to 535), the representation space of the comparison keys in each block is reduced (step 525). The case where H (height) = 1 and the comparison key conversion within the subtree in step 525 is not performed corresponds to the prior art. By converting the comparison keys as in the present invention, the representation space is reduced (from 0 to 2^{8S} − 1 down to 0 to 2^{8T} − 1) while the keys remain comparable with the search key, which makes it possible to compress the nodes of the N-ary tree that hold the comparison keys.

The operations of the components of the N-ary tree internal node compression apparatus shown in FIG. 1 can be built as a program, which can be installed and executed on a computer used as the N-ary tree internal node compression apparatus, or distributed via a network.

The present invention is not limited to the above embodiment, and various modifications and applications are possible within the scope of the claims.

1 Integer sequence input device
2 Search key input device
3 Result output device
10 N-ary tree initialization unit
20 Search key processing unit
30 N-ary tree storage unit

Claims

An N-ary tree internal node compression apparatus for reducing the data size of the internal nodes of an N-ary tree structured to speed up the search for an arbitrary integer (search key) in a sequence of M integers, the apparatus comprising:
initialization means that creates an N-ary tree (N is a system parameter) for the input sequence of M integers, performs a conversion that reduces the representation space of the comparison keys, which are the numerical values used for branch decisions, within the subtrees of the N-ary tree whose subtrees have been rearranged breadth-first with the subtree containing the ROOT node first, and stores blocks containing the converted comparison keys in storage means; and
search key processing means that reads the block containing the ROOT node from the storage means, applies to the input search key the same conversion as that performed by the initialization means, repeats the comparison of the converted search key with the comparison keys of the read block until a leaf node is reached, and, if the leaf node contains a comparison key equal to the search key, outputs the position information in the integer sequence corresponding to that comparison key.

The N-ary tree internal node compression apparatus according to claim 1, wherein the initialization means includes
means that decomposes the N-ary tree from the ROOT node into subtrees of height H each, rearranges them breadth-first, and reduces the representation space of the comparison keys in each subtree from 8×S bits to 8×T bits (where T is an arbitrary parameter and S > T).

The N-ary tree internal node compression apparatus according to claim 1, wherein the search key processing means includes:
means that converts the search key into a value comparable within each block, using the information that was used to convert the comparison keys in the initialization means; and
means that searches to the target position by comparing the search key with the comparison keys to determine the next block to search, moving from subtree to subtree.

The N-ary tree internal node compression apparatus according to claim 1, wherein the search key processing means includes
means that uses the comparison keys in the leaf node to determine whether the leaf node reached first is the correct leaf node for the search key and, if it is not, reads leaf nodes in order to correct the position.

An N-ary tree internal node compression method for reducing the data size of the internal nodes of an N-ary tree structured to speed up the search for an arbitrary integer (search key) in a sequence of M integers, the method comprising:
an initialization step in which initialization means creates an N-ary tree (N is a system parameter) for the input sequence of M integers, performs a conversion that reduces the representation space of the comparison keys, which are the numerical values used for branch decisions, within the subtrees of the N-ary tree whose subtrees have been rearranged breadth-first with the subtree containing the ROOT node first, and stores blocks containing the converted comparison keys in storage means; and
a search key processing step in which search key processing means reads the block containing the ROOT node from the storage means, applies to the input search key the same conversion as that performed by the initialization means, repeats the comparison of the converted search key with the comparison keys of the read block until a leaf node is reached, and, if the leaf node contains a comparison key equal to the search key, outputs the position information in the integer sequence corresponding to that comparison key.

The N-ary tree internal node compression method according to claim 5, wherein, in the initialization step,
the N-ary tree is decomposed from the ROOT node into subtrees of height H each and rearranged breadth-first, and the representation space of the comparison keys in each subtree is reduced from 8×S bits to 8×T bits (where T is an arbitrary parameter and S > T).

The N-ary tree internal node compression method according to claim 5, wherein, in the search key processing step,
the search key is converted into a value comparable within each block, using the information that was used to convert the comparison keys in the initialization means, and
the search proceeds to the target position by comparing the search key with the comparison keys to determine the next block to search, moving from subtree to subtree.

The N-ary tree internal node compression method according to claim 5 or 7, wherein, in the search key processing step,
the comparison keys in the leaf node are used to determine whether the leaf node reached first is the correct leaf node for the search key and, if it is not, leaf nodes are read in order to correct the position.

A program for causing a computer to function as each of the means constituting the N-ary tree internal node compression apparatus according to any one of claims 1 to 4.