JP5526985B2

JP5526985B2 - Search program, search device, and search method

Info

Publication number: JP5526985B2
Application number: JP2010104013A
Authority: JP
Inventors: 高志渡辺; 芳浩土屋; 泰生野口
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2010-04-28
Filing date: 2010-04-28
Publication date: 2014-06-18
Anticipated expiration: 2030-04-28
Also published as: JP2011233014A

Description

The present invention relates to a search program, a search device, and a search method for searching for data using a Bloom filter.

Conventionally, when large-scale data is managed in a tree structure, a data structure called a B-tree has been used relatively often. Compared with a simple binary tree, a B-tree stores multiple data entries in one block, which has the advantage of narrowing the range over which a change in the shape of the tree propagates when data entries are added. For this reason, B-trees are often used as a data management method for disks such as hard disks.

However, searching data managed in a tree structure on a disk requires actually reading multiple data blocks. In addition, because disk I/O (input/output) is generally slower than memory access, searching for data on a disk may take considerable time and effort.

For this reason, measures such as holding the tree structure in memory have recently been considered in order to avoid search delays caused by disk I/O. With a B-tree, however, the required amount of memory may grow as the number of data entries increases. A method that keeps only the most frequently read part of the tree structure in memory (a cache) has therefore also been considered.

Meanwhile, a data structure called a Bloom filter has recently become known. A Bloom filter is a method for efficiently checking whether an entry belongs to an existing set. Also disclosed is group processing for dial pulse handling in an electronic exchange, in which a dial pulse is given two bits, a pulse speed bit and an even/odd bit, and those bits are fetched.

JP 2007-52698 A
Japanese Patent Laid-Open No. Hei 4-18895

As described above, a B-tree can handle a large amount of data, so disk I/O can be reduced if a cache is implemented appropriately. However, the number of I/O operations cannot be reduced beyond a certain point. In addition, when the tree structure changes because data entries are added, I/O for managing the tree structure may be required. Furthermore, a Bloom filter indicates only whether a data entry exists, so it cannot be used for data management as it is.

To solve the above problems of the prior art, an object of the present invention is to provide a search program, a search device, and a search method capable of speeding up data search when a Bloom filter is used.

To solve the above problems and achieve the object, the following is required. Two structures are assumed to be accessible: a data block set containing a registered data group for each data block, and a Bloom filter row in which n Bloom filters are arranged in units of a predetermined number of data blocks, each Bloom filter being an array of m bits that indicate negatives within the predetermined number of data blocks. In that case, a transposition request for the Bloom filter row is accepted. When the transposition request is accepted, the Bloom filter row is transposed into a transposed Bloom filter row in which m transposed Bloom filters of n bits are arranged, each transposed Bloom filter gathering the bits at the same position across the Bloom filters. The presence or absence of data in the data block set is then determined using the transposed Bloom filter row.
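The transposition described above can be illustrated with a minimal Python sketch. It assumes each Bloom filter is held as a list of m values that are 0 or 1. The function names `transpose` and `candidates` are illustrative and do not come from the patent, and combining the transposed rows with a bitwise AND is an assumption about how the transposed row would be used.

```python
from typing import List

def transpose(filters: List[List[int]]) -> List[List[int]]:
    """Turn n Bloom filters of m bits into m transposed filters of n bits."""
    m = len(filters[0])
    # transposed[j][i] is bit j of Bloom filter i
    return [[bf[j] for bf in filters] for j in range(m)]

def candidates(transposed: List[List[int]], positions: List[int]) -> List[int]:
    """Return 0-based indices of Bloom filters that are ON at every position."""
    n = len(transposed[0])
    hits = [1] * n
    for j in positions:
        # Assumption: AND the n-bit rows; one row covers all n filters at once
        hits = [a & b for a, b in zip(hits, transposed[j])]
    return [i for i, bit in enumerate(hits) if bit]

# Toy data: n = 3 Bloom filters of m = 4 bits each
bfs = [[1, 0, 1, 0],
       [1, 1, 1, 0],
       [0, 1, 1, 1]]
tbfs = transpose(bfs)
```

With this layout, checking one bit position for all n filters reads a single n-bit row instead of touching n separate filters.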

The search program, search device, and search method according to the present invention have the effect of speeding up data search when a Bloom filter is used.

A block diagram showing a hardware configuration example of the management device according to the embodiment.
A block diagram showing a configuration example of the management device according to the present embodiment.
An explanatory diagram showing an example of the hash table group.
An explanatory diagram showing an example of the hierarchical Bloom filter.
An explanatory diagram showing an example of learning processing of the hierarchical Bloom filter by the registration processing unit.
An explanatory diagram showing an example of search processing of the hierarchical Bloom filter by the search processing unit.
An explanatory diagram showing an example of transposing the p-th stage Bloom filter row in the hierarchical Bloom filter.
An explanatory diagram showing an example of search processing of the hierarchical transposed Bloom filter by the search processing unit.
A block diagram showing a functional configuration example of the search processing unit.
A flowchart showing the learning processing procedure of the hierarchical Bloom filter by the registration processing unit.
A flowchart (part 1) showing the search processing procedure by the search processing unit.
A flowchart (part 2) showing the search processing procedure by the search processing unit.
An explanatory diagram showing an example of learning processing of the hierarchical transposed Bloom filter tBF by the registration processing unit.
A flowchart showing the learning processing procedure of the hierarchical Bloom filter BF by the registration processing unit.

Embodiments of a search program, a search device, and a search method according to the present invention are described in detail below with reference to the accompanying drawings. First, a configuration example of the management device according to the present embodiment is described.

<Hardware configuration example of the management device>
FIG. 1 is a block diagram showing a hardware configuration example of the management device according to the embodiment. In FIG. 1, the search device includes a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, a magnetic disk drive 104, a magnetic disk 105, an optical disk drive 106, an optical disk 107, a display 108, an I/F (Interface) 109, a keyboard 110, a mouse 111, a scanner 112, and a printer 113. The components are connected to one another by a bus 100.

Here, the CPU 101 controls the management device as a whole. The ROM 102 stores programs such as a boot program. The RAM 103 is used as a work area for the CPU 101. The magnetic disk drive 104 controls reading and writing of data on the magnetic disk 105 under the control of the CPU 101. The magnetic disk 105 stores data written under the control of the magnetic disk drive 104.

The optical disk drive 106 controls reading and writing of data on the optical disk 107 under the control of the CPU 101. The optical disk 107 stores data written under the control of the optical disk drive 106 and allows a computer to read the data stored on it.

The display 108 displays a cursor, icons, and tool boxes, as well as data such as documents, images, and function information. A CRT, a TFT liquid crystal display, or a plasma display, for example, can be used as the display 108.

The interface (hereinafter abbreviated as "I/F") 109 is connected through a communication line to a network 114 such as a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet, and is connected to other devices via the network 114. The I/F 109 serves as the interface between the network 114 and the inside of the device, and controls the input and output of data from external devices. A modem or a LAN adapter, for example, can be used as the I/F 109.

The keyboard 110 has keys for entering characters, numbers, and various instructions, and is used to input data. It may instead be a touch-panel input pad or a numeric keypad. The mouse 111 is used to move the cursor, select a range, move a window, or change its size. A trackball or a joystick may be used instead, as long as it provides the same functions as a pointing device.

The scanner 112 optically reads an image and loads the image data into the management device. The scanner 112 may also have an OCR (Optical Character Reader) function. The printer 113 prints image data and document data. A laser printer or an inkjet printer, for example, can be used as the printer 113.

<Functional configuration example of the management device>
FIG. 2 is a block diagram showing a configuration example of the management device according to the present embodiment. The management device 200 includes a data block set db, a hash table group HTs, a hierarchical Bloom filter BF, a hierarchical transposed Bloom filter tBF, a registration processing unit 201, and a search processing unit 202.

The data block set db has a plurality of data blocks, and each data block holds registered data. An individual data block is written as db#, where # is a number indicating the block number. The block number # corresponds to the bit position of the data block db#.

The hash table group HTs is a set of hash tables, one for each data block db# in the data block set db. An individual hash table is written as HT#, where # is a number that matches the block number of the data block db#. The hash table HT# is a table that associates the hash value obtained by giving data to a specific hash function with the data from which the hash value was generated (either the data itself or a pointer to the data).

FIG. 3 is an explanatory diagram showing an example of the hash table group HTs. In FIG. 3, SHA-1 is used as the hash function. Which hash function to use may be set in advance.
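As a rough illustration of the hash table group, the following Python sketch keeps one table per block number and maps a SHA-1 digest to the registered data. The dictionary layout and the names `hash_tables`, `register`, and `lookup` are assumptions made for this example; the patent does not prescribe an in-memory representation.

```python
import hashlib
from typing import Dict, Optional

# Assumed layout: block number -> {SHA-1 hex digest -> data}
hash_tables: Dict[int, Dict[str, bytes]] = {}

def sha1_hex(data: bytes) -> str:
    """Hash value of the data under SHA-1, as a hex string."""
    return hashlib.sha1(data).hexdigest()

def register(block_no: int, data: bytes) -> str:
    """Add (hash value, data) to the hash table HT# of the given block."""
    digest = sha1_hex(data)
    hash_tables.setdefault(block_no, {})[digest] = data
    return digest

def lookup(block_no: int, data: bytes) -> Optional[bytes]:
    """Return the stored data if its hash value hits in HT#, else None."""
    return hash_tables.get(block_no, {}).get(sha1_hex(data))
```

A hit in `lookup` corresponds to a confirmed positive for that block, and a miss to a negative.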

Returning to FIG. 2, the hierarchical Bloom filter BF is index information in which Bloom filters are arranged in a hierarchical structure. The hierarchical Bloom filter BF is described later. A Bloom filter is index information in which a plurality of bits are arranged, each indicating a false positive or a negative within a predetermined number of data blocks. A Bloom filter bit that is ON indicates a false positive, and a bit that is OFF indicates a negative. A bit value of 1 may be treated as ON and 0 as OFF, or conversely 0 may be treated as ON and 1 as OFF. In the present embodiment, a bit value of 1 is ON and 0 is OFF.

The hierarchical transposed Bloom filter tBF is index information obtained by transposing the hierarchical Bloom filter BF. The hierarchical transposed Bloom filter tBF is generated by the search processing unit 202. Details of the hierarchical transposed Bloom filter tBF are described later.

The data block set db, the hash table group HTs, and the hierarchical Bloom filter BF are stored in a storage device such as the ROM 102, the RAM 103, or the magnetic disk 105 shown in FIG. 1. In FIG. 2, the data block set db, the hash table group HTs, and the hierarchical Bloom filter BF are stored inside the management device 200, but they may be stored in an external device outside the management device 200. In that case, the management device 200 reads from and writes to the external device via the network.

When data to be registered in the data block set db is entered, the registration processing unit 201 registers it in a free area of the data block set db. When registering the data, the unit obtains a hash value with the hash function and adds the data (or a pointer to it) and the hash value to the hash table HT# corresponding to the destination data block db#. The registration processing unit 201 also updates the hierarchical Bloom filter BF so that it learns that data has been newly registered in the destination data block db#.

When search target data is input, the search processing unit 202 refers to the hierarchical Bloom filter BF and identifies the data block db# in which the search target data is likely to exist. If no such data block db# is identified, the search target data does not exist in any data block db# (negative). Conversely, even if such a data block db# is identified, the search target data does not necessarily exist in it (false positive).

Whether a false positive turns out to be a positive or a negative depends on the search result in the hash table HT# corresponding to the data block db# finally identified by the search processing unit 202. For example, if the hash value of the search target data hits in that hash table HT#, the result is positive; if it does not hit, the result is negative.

Specifically, the registration processing unit 201 and the search processing unit 202 described above realize their functions by causing the CPU 101 to execute a program stored in a storage device such as the ROM 102, the RAM 103, the magnetic disk 105, or the optical disk 107 shown in FIG. 1.

<Hierarchical Bloom filter BF>
FIG. 4 is an explanatory diagram showing an example of the hierarchical Bloom filter BF. The hierarchical Bloom filter BF consists of a memory area h stages deep and s bits wide. The s-bit width corresponds to the bit width of the data block set db. The s bits of each stage are divided on the basis of the division number d of the h-th stage, which is the uppermost stage. Each division is a Bloom filter, and the divisions in each stage form a Bloom filter row. The division number d is basically an integer of 2 or more, but d = 1 may be used when the uppermost h-th stage is to be a single Bloom filter.

Let p denote an arbitrary stage. The bit width m of each Bloom filter bf(p) constituting the p-th stage Bloom filter row BF(p) is $m = s / d^{h-(p-1)}$. In FIG. 4, d = 2. The number n of Bloom filters bf(p) arranged in the p-th stage Bloom filter row BF(p) is $n = d^{h-(p-1)}$.
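The two formulas can be checked with a short Python sketch. The function name `stage_layout` is illustrative, and the sample values s = 4096, h = 3, d = 2 are the ones used in the FIG. 5 example of this document.

```python
from typing import Tuple

def stage_layout(s: int, h: int, d: int, p: int) -> Tuple[int, int]:
    """Return (n, m): number of Bloom filters and bits per filter at stage p."""
    if not 1 <= p <= h:
        raise ValueError("stage p must satisfy 1 <= p <= h")
    n = d ** (h - (p - 1))  # n = d^(h-(p-1))
    m = s // n              # m = s / d^(h-(p-1))
    return n, m

# Values from the FIG. 5 example: s = 4096, h = 3, d = 2
layouts = [stage_layout(4096, 3, 2, p) for p in (1, 2, 3)]
# layouts == [(8, 512), (4, 1024), (2, 2048)]
```

In every stage the product n × m equals s, so each stage occupies the same s-bit width.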

Therefore, in the hierarchical Bloom filter BF, the number of Bloom filters bf(p) arranged in the p-th stage Bloom filter row BF(p) increases as the stage gets lower (as the stage number p becomes smaller). The number of Bloom filters bf(1) arranged in the lowest-stage (first-stage) Bloom filter row BF(1) is the same as the number of data blocks db#.

As a result, a Bloom filter bf(1) that is hit on reaching the first stage corresponds one-to-one with a data block db#. The number of stages h of the hierarchical Bloom filter BF is basically more than one, but a single stage (h = 1) may also be used. In that case, however, d ≠ 1.

<Example of learning processing of the hierarchical Bloom filter BF>
FIG. 5 is an explanatory diagram showing an example of learning processing of the hierarchical Bloom filter BF by the registration processing unit 201. For the purposes of explanation, FIG. 5 assumes a total bit width s = 4096 bits, a number of stages h = 3, and a division number d = 2 at the h-th stage.

Accordingly, the first-stage Bloom filter row BF(1), which is the lowest stage, is divided into 8 (= $d^{h-(p-1)} = 2^3$) parts and consists of Bloom filters bf(1-1) to bf(1-8). The second-stage Bloom filter row BF(2) is divided into 4 (= $d^{h-(p-1)} = 2^2$) parts and consists of Bloom filters bf(2-1) to bf(2-4). The third-stage Bloom filter row BF(3), which is the uppermost stage, is divided into 2 (= $d^{h-(p-1)} = 2^1$) parts and consists of Bloom filters bf(3-1) and bf(3-2).

The number k of kinds of hash functions to which the target data D to be registered is given is k = 3. Here, the hash functions H1(), H2(), and H3() are used. The hash function whose values are registered in the hash table is H1().

First, assume that the target data D has been registered in the data block db3 of the data block set db. As an example, the hash values obtained by giving the target data D to the hash functions H1(), H2(), and H3() are as follows.
H1(D) = 1234567
H2(D) = 3984012
H3(D) = 9803323

In the learning processing of the hierarchical Bloom filter BF, specific bits in the Bloom filter to be updated are turned ON. If a specific bit is already ON, it is left as it is.

Here, the registration processing unit 201 creates a hash table entry E3 for the hash table HT3, which corresponds to block number 3 of the destination data block db3. The registration processing unit 201 then adds the created hash table entry E3 to the hash table HT3.

Next, the registration processing unit 201 identifies the Bloom filter to be updated from the first-stage Bloom filter row BF(1). At the lowest stage, the corresponding Bloom filter is bf(1-3), whose array number equals block number 3 of the destination data block db3. The Bloom filter bf(1-3) is therefore the update target. The Bloom filter bf(1-3) is a 512-bit bit string.

The registration processing unit 201 also divides each hash value by 512, the bit width of a first-stage Bloom filter bf(1), and calculates the remainder. The remainder of the hash value H1(D) is 135, the remainder of H2(D) is 140, and the remainder of H3(D) is 59.

Next, the registration processing unit 201 turns ON the bits at the positions corresponding to the remainder values in the Bloom filter to be updated. If a remainder value is 0, the last bit of the Bloom filter to be updated is turned ON. In the example of FIG. 5, the Bloom filter bf(1-3) has 512 bits, so for the remainder value 135 the 135th bit from the beginning is turned ON. Similarly, for the remainder value 140 the 140th bit from the beginning is turned ON, and for the remainder value 59 the 59th bit from the beginning is turned ON. This completes the learning processing at the first stage.
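The mapping from hash value to bit position at a single stage can be sketched in Python as follows. The names `bit_position` and `set_bits` are illustrative, and the Bloom filter is assumed to be a plain list of 0/1 values whose first element is the first bit.

```python
from typing import List

def bit_position(hash_value: int, width: int) -> int:
    """1-based bit position; a remainder of 0 maps to the last bit."""
    remainder = hash_value % width
    return remainder if remainder != 0 else width

def set_bits(bloom: List[int], hash_values: List[int]) -> List[int]:
    """Turn ON the bit for each hash value; bits already ON stay ON."""
    positions = [bit_position(hv, len(bloom)) for hv in hash_values]
    for pos in positions:
        bloom[pos - 1] = 1  # convert the 1-based position to a list index
    return positions

# First stage of the example: bf(1-3) has 512 bits
bf_1_3 = [0] * 512
positions = set_bits(bf_1_3, [1234567, 3984012, 9803323])
# positions == [135, 140, 59]
```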

Next, processing moves to the second-stage learning processing. The registration processing unit 201 identifies the Bloom filter to be updated from the second-stage Bloom filter row BF(2). Specifically, it identifies, from the second-stage Bloom filter row BF(2), the Bloom filter whose bit range contains the bit positions of the first-stage update target bf(1-3). In this example, that is the Bloom filter bf(2-2). More specifically, the array number 3 of the first-stage update target bf(1-3) is divided by the division number d (= 2) and the fraction is rounded up, which gives an array number of 2. The Bloom filter bf(2-2) is therefore identified.
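The step from one stage to the next is a ceiling division of the array number by d. A minimal Python sketch, with the illustrative name `parent_index` and 1-based array numbers as in the text:

```python
def parent_index(index: int, d: int) -> int:
    """1-based array number of the enclosing Bloom filter one stage up."""
    return -(-index // d)  # ceiling division using integer arithmetic only

# bf(1-3) is contained in bf(2-2), which is contained in bf(3-1)
second_stage = parent_index(3, 2)            # 2
third_stage = parent_index(second_stage, 2)  # 1
```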

The registration processing unit 201 then divides each hash value by 1024, the bit width of a second-stage Bloom filter bf(2), and calculates the remainder. The remainder of the hash value H1(D) is 647, the remainder of H2(D) is 652, and the remainder of H3(D) is 571.

Next, the registration processing unit 201 turns ON the bits at the positions corresponding to the remainder values in the Bloom filter to be updated. If a remainder value is 0, the last bit of the Bloom filter to be updated is turned ON. In the example of FIG. 5, the Bloom filter bf(2-2) has 1024 bits, so for the remainder value 647 the 647th bit from the beginning is turned ON. Similarly, for the remainder value 652 the 652nd bit from the beginning is turned ON, and for the remainder value 571 the 571st bit from the beginning is turned ON. This completes the learning processing at the second stage.

Next, processing moves to the learning processing of the third stage, which is the uppermost stage. The registration processing unit 201 identifies the Bloom filter to be updated from the third-stage Bloom filter row BF(3). Specifically, it identifies, from the third-stage Bloom filter row BF(3), the Bloom filter whose bit range contains the bit positions of the second-stage update target bf(2-2). In this example, that is the Bloom filter bf(3-1). More specifically, dividing the array number 2 of the second-stage update target bf(2-2) by the division number d (= 2) gives an array number of 1. The Bloom filter bf(3-1) is therefore identified.

The registration processing unit 201 then divides each hash value by 2048, the bit width of a third-stage Bloom filter bf(3), and calculates the remainder. The remainder of the hash value H1(D) is 1671, the remainder of H2(D) is 652, and the remainder of H3(D) is 1595.

Next, the registration processing unit 201 turns ON the bits at the positions corresponding to the remainder values in the Bloom filter to be updated. If a remainder value is 0, the last bit of the Bloom filter to be updated is turned ON. In the example of FIG. 5, the Bloom filter bf(3-1) has 2048 bits, so for the remainder value 1671 the 1671st bit from the beginning is turned ON. Similarly, for the remainder value 652 the 652nd bit from the beginning is turned ON, and for the remainder value 1595 the 1595th bit from the beginning is turned ON. This completes the learning processing at the third stage.

Through this procedure, the registration processing unit 201 makes the hierarchical Bloom filter BF learn a data entry.
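The whole bottom-up procedure can be condensed into one Python sketch. This is only an illustration under stated assumptions: the hierarchical Bloom filter is held as a dictionary keyed by (stage, array number) that creates filters on first use, and the function name `learn` and its return value (a log of the updated filters and bit positions) are hypothetical.

```python
from typing import Dict, List, Tuple

Filters = Dict[Tuple[int, int], List[int]]  # (stage, 1-based number) -> bits

def learn(bf: Filters, s: int, h: int, d: int, block_no: int,
          hash_values: List[int]) -> List[Tuple[int, int, List[int]]]:
    """Update one Bloom filter per stage, from stage 1 up to stage h."""
    log = []
    index = block_no  # at stage 1 the array number equals the block number
    for p in range(1, h + 1):
        m = s // d ** (h - (p - 1))              # bit width at stage p
        bits = bf.setdefault((p, index), [0] * m)
        # 1-based positions; a remainder of 0 selects the last bit
        positions = [(hv % m) or m for hv in hash_values]
        for pos in positions:
            bits[pos - 1] = 1                    # already-ON bits stay ON
        log.append((p, index, positions))
        index = -(-index // d)                   # enclosing filter one stage up
    return log

# Example values from the text: s = 4096, h = 3, d = 2, data block db3
bf: Filters = {}
log = learn(bf, 4096, 3, 2, 3, [1234567, 3984012, 9803323])
```

For the example values, `log` lists bf(1-3), bf(2-2), and bf(3-1) with the same remainder values as in the walkthrough above.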

<Example of search processing using the hierarchical Bloom filter BF>
FIG. 6 is an explanatory diagram showing an example of search processing of the hierarchical Bloom filter BF by the search processing unit 202. FIG. 6 is described using the same hierarchical Bloom filter BF as FIG. 5. The example in FIG. 6 searches for the target data D registered in FIG. 5 as the search target data.

The learning processing shown in FIG. 5 starts from the first stage, which is the lowest stage, whereas the search processing starts from the uppermost stage (the third stage in FIG. 6). First, the search processing unit 202 obtains the remainder values 1671, 652, and 1595 by dividing the three hash values of the target data D by 2048, the bit width of each third-stage Bloom filter bf(3).

Next, the search processing unit 202 identifies the Bloom filters to be filtered from the third-stage Bloom filter row BF(3). Because the third stage is the uppermost stage, all of its Bloom filters, bf(3-1) and bf(3-2), are identified unconditionally.

The search processing unit 202 then identifies, from the group of Bloom filters selected for filtering, those in which the bits at the positions corresponding to the calculated remainder values are all ON. For the third stage, assume that in both bf(3-1) and bf(3-2) the bits at the positions corresponding to the calculated remainder values are all ON. This completes the filtering processing at the third stage.

Next, the process moves to the second stage. The search processing unit 202 divides each of the three hash values of the target data D by 1024, the bit width of each second-stage Bloom filter bf(2), and obtains the remainders "647", "652", and "571".

Next, the search processing unit 202 identifies the Bloom filters to be filtered from the second-stage Bloom filter row BF(2). For every stage other than the highest, it looks in the stage immediately above for the Bloom filters bf(p+1) in which every bit at the positions given by the remainders calculated for that stage is ON, and takes as filtering targets the Bloom filters bf(p) contained in the bit range of those Bloom filters bf(p+1).

In the second stage, the filtering targets are the Bloom filters bf(2-1) to bf(2-4), which are contained in the bit ranges of the Bloom filters bf(3-1) and bf(3-2) whose bits at the positions given by the third-stage remainders are all ON.

Then, from the Bloom filters selected as filtering targets, the search processing unit 202 identifies those in which every bit at the positions given by the calculated remainders is ON. In the second stage, bf(2-2) and bf(2-3) are assumed to have all of those bits ON, whereas bf(2-1) and bf(2-4) are assumed to have at least one of those bits OFF.

Consequently, the lower-stage Bloom filters bf(1-1), bf(1-2), bf(1-7), and bf(1-8), which are contained in the bit ranges of bf(2-1) and bf(2-4), are excluded from filtering, and the data blocks db# that may hold the target data D are narrowed down to those contained in the bit ranges of bf(2-2) and bf(2-3). This completes the filtering process for the second stage.

Next, the process moves to the first stage, which is the lowest. The search processing unit 202 divides each of the three hash values of the target data D by 512, the bit width of each first-stage Bloom filter bf(1), and obtains the remainders "135", "140", and "59".

Next, the search processing unit 202 identifies the Bloom filters to be filtered from the first-stage Bloom filter row BF(1). In the first stage, the filtering targets are the Bloom filters bf(1-3) to bf(1-6), which are contained in the bit ranges of the Bloom filters bf(2-2) and bf(2-3) whose bits at the positions given by the second-stage remainders are all ON.

Then, from the Bloom filters selected as filtering targets, the search processing unit 202 identifies those in which every bit at the positions given by the calculated remainders is ON. In the first stage, bf(1-3) and bf(1-6) are assumed to have all of those bits ON, whereas bf(1-4) and bf(1-5) are assumed to have at least one of those bits OFF.

Because no stage exists below the lowest stage, the Bloom filters bf(1-3) and bf(1-6) may be false positives. The search processing unit 202 therefore determines whether the hash value H1(D) is registered in the hash table HT3, which corresponds to the array number "3" of the identified Bloom filter bf(1-3). Because the entry E3 is registered in the hash table HT3, it is found that the target data D is registered in the data block db3 corresponding to the hash table HT3.

The search processing unit 202 likewise determines whether the hash value H1(D) is registered in the hash table HT6, which corresponds to the array number "6" of the identified Bloom filter bf(1-6). Because the hash value H1(D) = 1234567 is not registered in the hash table HT6, it is found that the target data D is not registered in the data block db6 corresponding to the hash table HT6. This ends the search process.

Through this procedure, the search processing unit 202 can use the hierarchical Bloom filter BF to identify the data block in which the target data D exists.
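The top-down narrowing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names `make_levels`, `register`, and `search` are hypothetical, filters are plain lists of 0/1, indices are 0-based (block index 2 stands for db3), and the three hash values are simply taken to be the third-stage remainders quoted in the text.

```python
def make_levels(s, d, h):
    """Build h stages; stage p (1 = lowest) holds d**(h-p+1) filters of s // d**(h-p+1) bits."""
    levels = {}
    for p in range(1, h + 1):
        n = d ** (h - p + 1)
        levels[p] = [[0] * (s // n) for _ in range(n)]
    return levels


def register(levels, d, block, hashes):
    """Set the bits for one item stored in data block `block` (0-based) at every stage."""
    for p, filters in levels.items():
        bf = filters[block // d ** (p - 1)]  # filter at stage p that covers this block
        for hv in hashes:
            bf[hv % len(bf)] = 1


def search(levels, d, hashes):
    """Return 0-based indices of lowest-stage filters that pass every stage.

    The result may still contain false positives, so each candidate block
    has to be confirmed afterwards (for example against its hash table).
    """
    top = max(levels)
    candidates = range(len(levels[top]))  # highest stage: every filter is a target
    for p in range(top, 0, -1):
        width = len(levels[p][0])
        positions = [hv % width for hv in hashes]
        passed = [j for j in candidates
                  if all(levels[p][j][q] for q in positions)]
        if not passed or p == 1:
            return passed
        # Only the d filters contained in each passing filter are checked next.
        candidates = [j * d + c for j in passed for c in range(d)]


levels = make_levels(s=4096, d=2, h=3)   # widths 512 / 1024 / 2048, as in FIG. 6
hashes = [1671, 652, 1595]               # assumed hash values (illustrative only)
register(levels, 2, 2, hashes)           # store the item in db3 (0-based index 2)
print(search(levels, 2, hashes))         # candidate block indices
```

Filters that fail at an upper stage remove all of the filters they contain from consideration, which is what keeps the number of filters examined per stage small.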

Next, the influence of Bloom filter false positives will be described.

By the properties of Bloom filters, when Bloom filters with a bit length of m are arranged in h stages, the false-positive probability FPR of a Bloom filter with N registered data items (N < m) and k hash functions can be expressed by the following equation (1).

$$\mathrm{FPR} = \{1 - (1 - 1/m)^{kN}\}^{k} \approx \{1 - e^{-kN/m}\}^{k} \quad \cdots (1)$$

In this case, the false-positive probability FPR can be made very small by adjusting k, m, and N. That is, in the present embodiment, depending on the settings of k, m, and N, the FPR can be set to a value far smaller than 1 (nearly 0). Therefore, in the example of FIG. 5, the Bloom filter bf(1-6) would almost never be selected.
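To make the effect of k, m, and N concrete, the following Python sketch evaluates both the exact form and the exponential approximation of equation (1). The function names `fpr_exact` and `fpr_approx` and the parameter values in the loop are illustrative assumptions, not values taken from the embodiment.

```python
import math


def fpr_exact(m, k, n):
    """False-positive probability {1 - (1 - 1/m)^(kN)}^k from equation (1)."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k


def fpr_approx(m, k, n):
    """Approximation {1 - e^(-kN/m)}^k from equation (1)."""
    return (1.0 - math.exp(-k * n / m)) ** k


# Illustrative settings: k = 3 hash functions, N = 50 registered items.
for m in (512, 1024, 2048):
    print(m, fpr_exact(m, 3, 50), fpr_approx(m, 3, 50))
```

For a fixed k and N, widening the filter (larger m) lowers the probability, which is the lever the text refers to.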

In the present embodiment, the number of data blocks Ndb is d^h, so the number of stages h (the height) can be expressed by the following equation (2).

$$h = \log(N_{db}) / \log(d) + 1 \quad \cdots (2)$$

Equation (2) assumes that log(Ndb)/log(d) is an integer. If it is not, h can be determined by using, in some stages, a value of d that differs from the value used in the other stages.

In the search process described above, matching must be performed once for each hash value (k times, a constant), and the number of filtering targets per stage is at most d. Therefore, the number of memory accesses MA for a search is at most approximately the value given by the following equation (3).

$$MA = k \times d \times \log(N_{db}) / \log(d) \quad \cdots (3)$$

That is, the number of stages h (= memory usage) can be reduced by increasing the number of divisions d, while the number of lookups grows as d increases; the two are in a trade-off relationship. Taking this relationship into account makes appropriate memory use possible.
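The trade-off can be tabulated with a short Python sketch that evaluates equations (2) and (3) exactly as they are written in the text. The helper names and the sample values (Ndb = 4096, k = 3) are assumptions chosen for illustration, and Ndb is restricted to exact powers of d, matching the assumption stated for equation (2).

```python
def log_ratio(n_db, d):
    """Return log(n_db)/log(d), assuming n_db is an exact power of d."""
    r, x = 0, 1
    while x < n_db:
        x *= d
        r += 1
    if x != n_db:
        raise ValueError("n_db must be an exact power of d")
    return r


def stages(n_db, d):
    """Number of stages h, following equation (2) as written."""
    return log_ratio(n_db, d) + 1


def max_accesses(k, n_db, d):
    """Upper bound MA on memory accesses, following equation (3)."""
    return k * d * log_ratio(n_db, d)


for d in (2, 4, 8, 16, 64):
    print(d, stages(4096, d), max_accesses(3, 4096, d))
```

Printing the two columns side by side shows how the stage count and the access bound move in opposite directions as d changes.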

<Hierarchical transposed Bloom filter>
Next, the hierarchical transposed Bloom filter will be described. The description above covered the registration and search processes for the hierarchical Bloom filter BF. To make the search faster still, the hierarchical Bloom filter BF is transposed.

FIG. 7 is an explanatory diagram showing an example of transposing the p-th stage Bloom filter row BF(p) in the hierarchical Bloom filter BF. (A) shows the Bloom filter row BF(p). Here, as an example, the Bloom filter row BF(p) consists of four Bloom filters bf(p-1) to bf(p-4). That is, the Bloom filter row BF(p) is a bit array of 4 filters, each 10 bits wide. After transposition, it becomes a bit array of 10 filters, each 4 bits wide.

(B) shows the transposition of the Bloom filter row BF(p). To transpose, the bits at the same position in each of the Bloom filters bf(p-1) to bf(p-4) are collected, and the bit strings collected for each position are arranged in order of bit position.

Specifically, the first bits of the Bloom filters bf(p-1) to bf(p-4) are collected in array-number order to form the bit string {0110}. From the left, the first bit "0" is the first bit of bf(p-1), the second bit "1" is the first bit of bf(p-2), the third bit "1" is the first bit of bf(p-3), and the last bit "0" is the first bit of bf(p-4).

This bit string {0110} is called the transposed Bloom filter tbf(p-1). Collecting the bits at the second through last positions in the same way yields the transposed Bloom filters tbf(p-1) to tbf(p-10). The index information in which tbf(p-1) to tbf(p-10) are arranged in order of bit position is called the transposed Bloom filter row tBF(p). Generating the transposed Bloom filter row tBF(p) for every stage yields the hierarchical transposed Bloom filter tBF.
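The transposition itself amounts to a matrix transpose over bits, as the following Python sketch suggests. The function name `transpose` is hypothetical, and the four 10-bit filters are made-up values; only their first bits (0, 1, 1, 0) are taken from the description of FIG. 7.

```python
def transpose(filters):
    """Turn n bit strings of m bits each into m bit strings of n bits each."""
    if len({len(f) for f in filters}) != 1:
        raise ValueError("all filters must have the same bit width")
    return ["".join(column) for column in zip(*filters)]


# Hypothetical contents of bf(p-1) to bf(p-4), 10 bits each.
bf = ["0001000010", "1001000101", "1000010000", "0100000100"]
tbf = transpose(bf)      # tbf[0] plays the role of tbf(p-1), and so on
print(tbf[0])            # first bits of the four filters, in array-number order
print(len(tbf), len(tbf[0]))
```

Transposing twice returns the original filters, so the two layouts carry the same information.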

(C) shows a comparison of searching with the Bloom filter row BF(p) and with the transposed Bloom filter row tBF(p). Here, two hash values of the target data D are obtained using two hash functions, and the remainders obtained by dividing them by 10, the bit width of each Bloom filter bf(p) in the Bloom filter row BF(p), are assumed to be "4" and "8".

When searching with the Bloom filter row BF(p), the row is scanned for a Bloom filter bf(p) in which bit positions "4" and "8", given by the remainders, are both ON. In this example, the Bloom filter bf(p-2) matches.

In contrast, when the transposed Bloom filter row tBF(p) is used, there is no need to scan for a Bloom filter bf(p) whose bit positions "4" and "8" are both ON. Instead, the transposed Bloom filters tbf(p-4) and tbf(p-8), whose array numbers equal the remainders "4" and "8", are extracted. ANDing the extracted tbf(p-4) and tbf(p-8) then identifies bit position "2", where both are ON.

With the Bloom filter row BF(p), the 4th and 8th bits of each of the four Bloom filters bf(p-1) to bf(p-4) are referenced, so 8 (= 4 × 2) memory accesses are required. The transposed Bloom filter row tBF(p), in contrast, is index information in which the pre-transposition Bloom filters bf(p-1) to bf(p-4) are folded together by bit position. The determination can therefore be made with only two memory accesses, which extract tbf(p-4) and tbf(p-8), followed by an AND operation. This reduces the number of memory accesses and speeds up the search.
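The difference in access counts can be reproduced with a small Python sketch. The filter contents are hypothetical values chosen so that only the second filter has bits 4 and 8 ON, as in the example; the function names and the way accesses are counted (one per bit read for the row layout, one per transposed row fetched) are assumptions of this sketch.

```python
def lookup_rows(filters, positions):
    """Row layout: read every requested bit (1-based) of every filter."""
    accesses = 0
    hits = []
    for index, bf in enumerate(filters, start=1):
        bits = []
        for pos in positions:
            bits.append(bf[pos - 1])
            accesses += 1
        if all(b == "1" for b in bits):
            hits.append(index)
    return hits, accesses


def lookup_transposed(tbf, positions):
    """Transposed layout: fetch one row per position and AND the rows."""
    rows = [int(tbf[pos - 1], 2) for pos in positions]  # one access per row
    result = rows[0]
    for row in rows[1:]:
        result &= row
    width = len(tbf[0])
    hits = [i + 1 for i in range(width) if (result >> (width - 1 - i)) & 1]
    return hits, len(rows)


bf = ["0001000010", "1001000101", "1000010000", "0100000100"]  # hypothetical
tbf = ["".join(column) for column in zip(*bf)]
print(lookup_rows(bf, [4, 8]))
print(lookup_transposed(tbf, [4, 8]))
```

Both lookups report the same filter, but the transposed layout touches one row per hash value regardless of how many filters the stage contains.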

<Example of search processing using hierarchical transposed Bloom filter>
FIG. 8 is an explanatory diagram showing an example of search processing on the hierarchical transposed Bloom filter by the search processing unit 202. For the purposes of explanation, FIG. 8 assumes a total bit width s = 64 bits, a number of stages h = 3, and a number of divisions d = 2 at the h-th stage.

Each Bloom filter in the first-stage (lowest) Bloom filter row BF(1) has a bit width of 8 (= s/d^h = 64/2^3) bits, so the first-stage transposed Bloom filter row tBF(1) consists of 8 transposed Bloom filters tbf(1-1) to tbf(1-8).

Each Bloom filter in the second-stage Bloom filter row BF(2) has a bit width of 16 (= s/d^(h-1) = 64/2^2) bits, so the second-stage transposed Bloom filter row tBF(2) consists of 16 transposed Bloom filters tbf(2-1) to tbf(2-16).

Each Bloom filter in the third-stage (highest) Bloom filter row BF(3) has a bit width of 32 (= s/d^(h-2) = 64/2^1) bits, so the third-stage transposed Bloom filter row tBF(3) consists of 32 transposed Bloom filters tbf(3-1) to tbf(3-32).

For comparison, FIG. 8 shows the pre-transposition Bloom filter rows BF(1) to BF(3) alongside the transposed Bloom filter rows tBF(1) to tBF(3).

First, the search processing unit 202 divides each of the three hash values obtained by applying the hash functions H1() to H3() to the search target data Dx by 32, the number of third-stage transposed Bloom filters, and obtains the remainders "2", "19", and "27".

Next, the search processing unit 202 identifies the transposed Bloom filters to be filtered from the third-stage transposed Bloom filter row tBF(3). Specifically, it identifies the transposed Bloom filters tbf(3-2), tbf(3-19), and tbf(3-27), whose array positions equal the remainders (a remainder of 0 maps to the last position). It then ANDs their bit strings {10}, {11}, and {10}. The AND result is {10}.

If the AND result contains no "1", the search processing unit 202 determines that the search target data Dx does not exist in the data block set db. If the AND result contains a "1", the search target data Dx may have been registered, so the process moves down one stage.

In the second stage as well, the search processing unit 202 first divides each of the three hash values of the search target data Dx by 16, the number of second-stage transposed Bloom filters, and obtains the remainders "8", "11", and "13".

Next, the search processing unit 202 identifies the transposed Bloom filters to be filtered from the second-stage transposed Bloom filter row tBF(2). Specifically, it identifies the transposed Bloom filters tbf(2-8), tbf(2-11), and tbf(2-13), whose array positions equal the remainders (a remainder of 0 maps to the last position). It then ANDs their bit strings {0110}, {0100}, and {0110}. The AND result is {0100}.

In the first stage, which is the lowest, the search processing unit 202 likewise first divides each of the three hash values of the search target data Dx by 8, the number of first-stage transposed Bloom filters, and obtains the remainders "2", "5", and "7".

Next, the search processing unit 202 identifies the transposed Bloom filters to be filtered from the first-stage transposed Bloom filter row tBF(1). Specifically, it identifies the transposed Bloom filters tbf(1-2), tbf(1-5), and tbf(1-7), whose array positions equal the remainders (a remainder of 0 maps to the last position). It then ANDs their bit strings {00110110}, {10011010}, and {00110111}. The AND result is {00010010}.

Because no lower stage exists and false positives are possible, the search target data Dx may exist in the data blocks db4 and db7, which correspond to bit positions 4 and 7, where the AND result {00010010} is "1".

In this case, the hash tables HT4 and HT7 are searched using the hash value from the hash function H1() as the key. Assume that the data block db4 hits and the data block db7 does not. It is thus found that the search target data Dx is registered in the data block db4. This ends the search process.

Through this procedure, by using the hierarchical transposed Bloom filter, the search processing unit 202 can search for the target data even faster than with the hierarchical Bloom filter BF.
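The AND results quoted in the FIG. 8 walk-through can be checked with the following Python sketch. The bit strings are copied from the text; the helper names `and_rows` and `on_positions` and the loop structure are assumptions of this sketch and only mirror the per-stage steps as described.

```python
from functools import reduce


def and_rows(rows):
    """AND equal-width bit strings and return the result as a bit string."""
    width = len(rows[0])
    value = reduce(lambda a, b: a & b, (int(r, 2) for r in rows))
    return format(value, "0{}b".format(width))


def on_positions(bits):
    """1-based positions, counted from the left, of the bits that are ON."""
    return [i for i, b in enumerate(bits, start=1) if b == "1"]


# Transposed Bloom filters selected at each stage in the FIG. 8 example.
selected = {
    3: ["10", "11", "10"],
    2: ["0110", "0100", "0110"],
    1: ["00110110", "10011010", "00110111"],
}

candidates = []
for p in (3, 2, 1):                 # highest stage first
    result = and_rows(selected[p])
    if "1" not in result:
        candidates = []             # the data is definitely not registered
        break
    candidates = on_positions(result)

print(candidates)                   # block numbers still to be confirmed
```

The surviving positions at the lowest stage are only candidates; the hash table lookup described above is still needed to separate the real hit from the false positive.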

<Detailed functional configuration of search processing unit 202>
Next, a detailed example of the functional configuration of the search processing unit 202 will be described.

FIG. 9 is a block diagram showing an example of the functional configuration of the search processing unit 202. The search processing unit 202 includes a receiving unit 901, a transposing unit 902, a converting unit 903, a first specifying unit 904, a second specifying unit 905, a decision unit 906, a determination unit 907, an extracting unit 908, and an output unit 909.

The receiving unit 901 has a function of accepting a transposition request for the Bloom filter row BF(p). For example, it accepts a request to transpose the hierarchical Bloom filter BF into the hierarchical transposed Bloom filter tBF.

Here, a transposition request is a request to transpose the Bloom filter row BF(p) into the transposed Bloom filter row tBF(p). For example, the startup completion notification issued when the management apparatus 200 has finished starting up is treated as a transposition request. In that case, the hierarchical Bloom filter BF is transposed into the hierarchical transposed Bloom filter tBF when the management apparatus 200 starts. Transposing everything at startup in this way makes the hierarchical transposed Bloom filter tBF available at all times until shutdown.

Alternatively, a search request for the data block set db may be treated as a transposition request. In that case, the search processing unit 202 waits until a search request arrives and, upon a search request, transposes those Bloom filter rows BF(p) of the hierarchical Bloom filter BF that it uses. As searches become more frequent, the number of transposed Bloom filter rows tBF(p) grows. Because Bloom filter rows are transposed gradually, starting with those needed for searches, wasted transposition of Bloom filter rows that are never used before shutdown can be reduced.

The transposing unit 902 has a function of transposing the Bloom filter row BF(p) into the transposed Bloom filter row tBF(p) when the receiving unit 901 accepts a transposition request. Specifically, the transposing unit 902 transposes the Bloom filter row BF(p) into the transposed Bloom filter row tBF(p) by the method shown in FIG. 7.

For example, the Bloom filter row BF(p) is index information in which n (= d^[h-(p-1)]) Bloom filters bf(p-1) to bf(p-n) are arranged. Each of the Bloom filters bf(p-1) to bf(p-n) has a bit width of m (= s/n) bits.

When the Bloom filter row BF(p) is transposed as in FIG. 7, the transposed Bloom filter row tBF(p) becomes index information in which m (= s/n) transposed Bloom filters tbf(p-1) to tbf(p-m) are arranged. Each of the transposed Bloom filters tbf(p-1) to tbf(p-m) has a bit width of n (= d^[h-(p-1)]) bits. Transposition thus swaps the number of array elements and the bit width.
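As a quick check of these dimensions, the following Python sketch evaluates n and m for each stage using the FIG. 8 parameters (s = 64, d = 2, h = 3). The function name `row_shape` is hypothetical.

```python
def row_shape(s, d, h, p):
    """Return (n, m): filters per row and bits per filter at stage p, before transposition."""
    n = d ** (h - (p - 1))
    m = s // n
    return n, m


for p in (1, 2, 3):
    n, m = row_shape(64, 2, 3, p)
    # BF(p) is n filters of m bits; tBF(p) is m transposed filters of n bits.
    print(p, "BF:", (n, m), "tBF:", (m, n))
```

The stage-by-stage values agree with the 8, 16, and 32 transposed Bloom filters stated for tBF(1), tBF(2), and tBF(3).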

The converting unit 903 has a function of converting, for each of a plurality of (k) hash functions, the search target data into position information indicating an array position among the transposed Bloom filters. Specifically, for example, supplying the search target data to k different hash functions H1() to Hk() yields k hash values.

When an entry was registered in the Bloom filter row BF(p) as shown in FIG. 5, each hash value was divided by m (= s/n), the bit width of the Bloom filters in the Bloom filter row BF(p), and the bit position corresponding to the remainder was turned ON.

Accordingly, once the k hash values are obtained, the converting unit 903 divides each hash value by m (= s/n), the number of transposed Bloom filters in the transposed Bloom filter row tBF(p), and calculates the remainder. These k remainders are the position information indicating the array positions of the transposed Bloom filters. When a calculated remainder is 0, the position information is m.
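The conversion can be sketched in Python as follows. The text does not specify the hash functions H1() to Hk(), so `hash_values` below uses salted SHA-256 purely as a stand-in; that choice, both function names, and the sample hash values are assumptions of this sketch.

```python
import hashlib


def hash_values(data, k):
    """Return k hash values for `data` (bytes); salted SHA-256 is an assumed stand-in."""
    return [int.from_bytes(hashlib.sha256(bytes([i]) + data).digest(), "big")
            for i in range(k)]


def to_positions(hashes, m):
    """Map each hash value to a 1-based array position; a remainder of 0 maps to m."""
    return [(hv % m) or m for hv in hashes]


# Illustrative hash values whose remainders modulo 32 are 2, 19, 27, and 0.
print(to_positions([34, 51, 59, 64], 32))
print(to_positions(hash_values(b"Dx", 3), 32))
```

Mapping a remainder of 0 to m keeps every position in the range 1 to m, which matches the 1-based array numbers tbf(p-1) to tbf(p-m).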

The first specifying unit 904 has a function of identifying, for each piece of position information produced by the converting unit 903, the corresponding transposed Bloom filter in the transposed Bloom filter row tBF(p). Specifically, it identifies, in the transposed Bloom filter row tBF(p), the transposed Bloom filters whose array numbers match the position information obtained from the converting unit 903.

In the example shown in FIG. 8, the remainders (position information) for the third-stage transposed Bloom filter row tBF(3) are "2", "19", and "27", so the transposed Bloom filters tbf(3-2), tbf(3-19), and tbf(3-27), whose array numbers are "2", "19", and "27", are identified.

The second specifying unit 905 has a function of identifying, in the Bloom filter row BF(p), the Bloom filter bf(p) that corresponds to the position information common to the plurality of transposed Bloom filters tbf(p) identified by the first specifying unit 904. Specifically, it ANDs the bit strings of the transposed Bloom filters tbf(p) identified by the first specifying unit 904. Each bit position at which "1" appears in the AND result indicates a Bloom filter in the pre-transposition Bloom filter row BF(p) whose bits would have been turned ON when the entry was registered.

In the example shown in FIG. 8, ANDing the transposed Bloom filters tbf(3-2), tbf(3-19), and tbf(3-27) in the third stage yields a result whose first bit position is "1". This shows that, in the pre-transposition Bloom filter row BF(3), bits of the Bloom filter bf(3-1) may have been turned ON when the search target data was registered.

In this way, with the first specifying unit 904 and the second specifying unit 905, the number of accesses to the transposed Bloom filter row tBF(p) is smaller than the number of accesses to the pre-transposition Bloom filter row BF(p). Therefore, as the number of stages p of the transposed Bloom filters tbf(p) increases, the number of accesses is reduced all the more compared with the pre-transposition Bloom filter row BF(p), and a faster search can be achieved.

The decision unit 906 has a function of deciding whether there exists a Bloom filter row whose array elements are contained in the Bloom filter identified by the second specifying unit 905. Specifically, it only has to decide whether a transposed Bloom filter row exists one stage below the transposed Bloom filter row tBF(p). More specifically, it only has to check whether p = 1.

When a transposed Bloom filter row exists one stage below (p ≠ 1), the first specifying unit 904 treats that lower-stage row as the new transposed Bloom filter row tBF(p). When none exists (p = 1), the transposed Bloom filter row tBF(p) is the first-stage (lowest) transposed Bloom filter row tBF(1).

判定部９０７は、判断部９０６によって存在しないと判断された場合、データブロック集合ｄｂのうち、第２の特定部９０５によって特定されたブルームフィルタｂｆ（ｐ）に対応するデータブロックｄｂ＃内での検索対象データの有無を判定する機能を有する。ｐ＝１の場合、第２の特定部９０５によって特定されたブルームフィルタｂｆ（１）はデータブロックと一対一対応するため、第２の特定部９０５でのＡＮＤ結果で「１」になったビットの位置をブロック番号＃とするデータブロックｄｂ＃を判定対象とする。判定部９０７では、判定対象のデータブロックｄｂ＃は擬陽性であるため、陽性であるか陰性であるかを判定することとなる。 If the determination unit 906 determines that the data does not exist, the determination unit 907 includes the data block db # corresponding to the Bloom filter bf (p) specified by the second specification unit 905 in the data block set db. It has a function of determining the presence or absence of search target data. When p = 1, since the Bloom filter bf (1) specified by the second specifying unit 905 has a one-to-one correspondence with the data block, the bit that is “1” in the AND result of the second specifying unit 905 The data block db # having the block number # as the position is determined as a determination target. The determination unit 907 determines whether the determination target data block db # is positive or negative and therefore is positive or negative.

The hash table HT# of the determination target data block db# is then referenced to determine whether the search target data is truly positive or negative. In this case, as in the search process using the hierarchical Bloom filter BF, the hash value of the search target data under a specific hash function (for example, H1()) is used as a key to determine whether the data (or a pointer to it) is present in the hash table HT#. If it is not present, the hit was a false positive, that is, a miss.

The extraction unit 908 has a function of extracting the search target data from the determination target data block db# when the determination unit 907 determines that the result is positive. Specifically, for example, the hash table HT# corresponding to the determination target data block gives the storage position of the data, so the extraction unit 908 extracts the search target data, or data related to the search target data, from that storage position.

For example, the search target data itself is extracted from the determination target data block db#. If it can be extracted, the search target data is known to be registered. When the search target data is a file number, the file data associated with that file number is extracted. When the search target data is a headword of a dictionary or glossary, the explanatory data for that headword is extracted.

出力部９０９は、判定部９０７による判定結果や抽出部９０８によって抽出されたデータを出力する。たとえば、検索対象データの陽性または陰性といった判定結果や、陽性である場合の抽出データを出力する。出力形式としては、ディスプレイ１０８への表示、音声出力、印刷出力、他の装置への送信などがある。 The output unit 909 outputs the determination result by the determination unit 907 and the data extracted by the extraction unit 908. For example, the determination result such as positive or negative of the search target data and the extracted data when it is positive are output. Examples of the output format include display on the display 108, audio output, print output, transmission to another apparatus, and the like.

＜登録処理部２０１による階層型ブルームフィルタＢＦの学習処理手順＞
図１０は、登録処理部２０１による階層型ブルームフィルタＢＦの学習処理手順を示すフローチャートである。まず、登録処理部２０１は、登録したいデータ（対象データＤ）があるか否かを判断する（ステップＳ１００１）。対象データＤがある場合（ステップＳ１００１：Ｙｅｓ）、登録処理部２０１は、段数ｐをｐ＝１に設定し（ステップＳ１００２）、ｐ＞ｈ（ｈは階層型ブルームフィルタＢＦの最上段）であるか否かを判断する（ステップＳ１００３）。ｐ＞ｈでない場合（ステップＳ１００３：Ｎｏ）、登録処理部２０１は、ｋ種類のハッシュ関数を用いて、対象データのｋ個のハッシュ値を算出する（ステップＳ１００４）。そして、ｋ個のハッシュ値をブルームフィルタ列ＢＦ（ｐ）のビット幅で除算して、ｋ個の余り値を算出する（ステップＳ１００５）。 <Learning Processing Procedure of Hierarchical Bloom Filter BF by Registration Processing Unit 201>
FIG. 10 is a flowchart showing the learning processing procedure of the hierarchical Bloom filter BF by the registration processing unit 201. First, the registration processing unit 201 determines whether there is data to be registered (target data D) (step S1001). When there is target data D (step S1001: Yes), the registration processing unit 201 sets the stage number p to p = 1 (step S1002) and determines whether p > h, where h is the uppermost stage of the hierarchical Bloom filter BF (step S1003). If p > h does not hold (step S1003: No), the registration processing unit 201 calculates k hash values of the target data using k types of hash functions (step S1004). It then divides the k hash values by the bit width of the Bloom filter row BF(p) to obtain k remainder values (step S1005).

つぎに、登録処理部２０１は、ｐ段目のブルームフィルタ列ＢＦ（ｐ）の中から登録先のブルームフィルタｂｆ（ｐ）ｒを特定する（ステップＳ１００６）。ｐ＝１の場合は、保存先のデータブロックｄｂ＃のブロック番号＃に対応するブルームフィルタｂｆ（１−＃）を登録先のブルームフィルタｂｆ（ｐ）ｒとする。 Next, the registration processing unit 201 identifies the bloom filter bf (p) r that is the registration destination from the p-th Bloom filter row BF (p) (step S1006). When p = 1, the bloom filter bf (1- #) corresponding to the block number # of the data block db # of the storage destination is set as the bloom filter bf (p) r of the registration destination.

Specifically, as shown in FIG. 5, when the target data D is registered in the data block db3, the first-stage Bloom filter bf(1-3), whose array number is the same as block number 3 of the data block db3, becomes the registration destination Bloom filter bf(1)r.

When p ≠ 1, the Bloom filter bf(p) in the p-th stage Bloom filter row BF(p) that corresponds to the bit position of the (p−1)-th stage registration destination Bloom filter bf(p−1)r becomes the new registration destination Bloom filter bf(p)r.

Specifically, as shown in FIG. 5, when p = 2, the Bloom filter bf(2-2) in the second-stage Bloom filter row BF(2), which covers the bit positions of the first-stage registration destination Bloom filter bf(1)r, namely bf(1-3), becomes the new registration destination Bloom filter bf(2)r.

そして、登録先のブルームフィルタｂｆ（ｐ）ｒに、ステップＳ１００５で算出されたｋ個の余り値をエントリする（ステップＳ１００７）。すなわち、余り値と同一ビット位置のビットをＯＮにする。余り値が０の場合は、末尾ビットをＯＮにする。そして、段数ｐをインクリメントして（ステップＳ１００８）、ステップＳ１００３に戻る。 Then, the k remainder values calculated in step S1005 are entered in the registered Bloom filter bf (p) r (step S1007). That is, the bit at the same bit position as the remainder value is turned ON. When the remainder value is 0, the tail bit is turned ON. Then, the stage number p is incremented (step S1008), and the process returns to step S1003.

ステップＳ１００３において、ｐ＞ｈである場合（ステップＳ１００３：Ｙｅｓ）、対象データのハッシュテーブルエントリを追加する（ステップＳ１００９）。具体的には、たとえば、図４に示したように、ハッシュテーブルエントリＥ３をハッシュテーブルＨＴ３に追加登録する。 If p> h in step S1003 (step S1003: Yes), a hash table entry of the target data is added (step S1009). Specifically, for example, as shown in FIG. 4, the hash table entry E3 is additionally registered in the hash table HT3.

そして、ステップＳ１００１に戻る。ステップＳ１００１において、対象データＤがない場合（ステップＳ１００１：Ｎｏ）、登録処理部２０１による階層型ブルームフィルタＢＦの学習処理を終了する。このような処理手順により、階層型ブルームフィルタＢＦが構築されることとなる。 Then, the process returns to step S1001. In step S1001, if there is no target data D (step S1001: No), the learning process of the hierarchical Bloom filter BF by the registration processing unit 201 is terminated. By such a processing procedure, the hierarchical Bloom filter BF is constructed.
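The learning procedure of FIG. 10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the salted SHA-256 helper `hashes` stands in for the k hash functions, the sizes `D`, `H`, `S`, and `K` are example values, the hash table is a plain `dict` of sets, and all function names are hypothetical. Array numbers and bit positions are 1-based as in the text, and a remainder of 0 maps to the tail bit.

```python
import hashlib

D, H, S, K = 2, 3, 64, 3  # division number d, stages h, bits per row s, hash count k (example values)


def hashes(data: bytes) -> list:
    """Hypothetical stand-ins for the k hash functions (salted SHA-256)."""
    return [int.from_bytes(hashlib.sha256(bytes([i]) + data).digest()[:8], "big")
            for i in range(K)]


def new_hierarchy() -> dict:
    """Build empty rows BF(1)..BF(h); BF(p) holds n filters of m bits each."""
    rows = {}
    for p in range(1, H + 1):
        n = D ** (H - (p - 1))
        m = S // n
        rows[p] = [[0] * m for _ in range(n)]
    return rows


def register(rows: dict, hash_tables: dict, data: bytes, block_no: int) -> None:
    """Register `data`, stored in data block `block_no`, at every stage."""
    index = block_no                       # stage 1: filter number = block number (S1006)
    for p in range(1, H + 1):              # S1002, S1003, S1008
        if p > 1:
            index = -(-index // D)         # ceil(index / d): enclosing filter one stage up
        target = rows[p][index - 1]
        m = len(target)
        for value in hashes(data):         # S1004
            remainder = value % m          # S1005
            position = remainder if remainder else m   # remainder 0 -> tail bit
            target[position - 1] = 1       # S1007
    hash_tables.setdefault(block_no, set()).add(data)  # S1009
```

Registering data for block 3 with these example sizes touches only bf(1-3), bf(2-2), and bf(3-1), which matches the FIG. 5 walk-through.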

＜検索処理部２０２による検索処理手順＞
図１１および図１２は、検索処理部２０２による検索処理手順を示すフローチャートである。なお、検索処理部２０２により利用されるブルームフィルタは、すでに一括転置済みの階層型転置ブルームフィルタｔＢＦでもよく、階層型ブルームフィルタＢＦでもよい。階層型ブルームフィルタＢＦを用いる場合は、未転置のブルームフィルタが特定される都度転置処理をおこなう。 <Search Processing Procedure by Search Processing Unit 202>
FIG. 11 and FIG. 12 are flowcharts showing a search processing procedure by the search processing unit 202. Note that the Bloom filter used by the search processing unit 202 may be a hierarchical transposed Bloom filter tBF that has already been transposed, or a hierarchical Bloom filter BF. When the hierarchical Bloom filter BF is used, a transposition process is performed each time an untransposed Bloom filter is specified.

That is, the hierarchical transposed Bloom filter tBF is built up during search. FIGS. 11 and 12 describe the processing procedure for building the hierarchical transposed Bloom filter tBF while searching. When batch transposition has already been performed, the check for whether a row has been transposed (step S1104) and the transposition itself (step S1105) are simply omitted.

First, in FIG. 11, the search processing unit 202 waits for search target data Dx via the receiving unit 901 (step S1101: No). When the search target data Dx is received (step S1101: Yes), the search processing unit 202 uses the conversion unit 903 to feed the search target data Dx to the k types of hash functions and calculate k hash values (step S1102).

The search processing unit 202 then sets p = h, that is, sets the stage number p to the maximum stage number h (step S1103), and determines whether the p-th stage Bloom filter row BF(p) has already been transposed (step S1104). If it has been transposed (step S1104: Yes), the process proceeds to step S1106. If it has not been transposed (step S1104: No), the search processing unit 202 uses the transposing unit 902 to transpose the p-th stage Bloom filter row BF(p) (step S1105) and then proceeds to step S1106.

In step S1106, the search processing unit 202 uses the conversion unit 903 to divide the k hash values by the number of transposed Bloom filters in the row and calculate k remainder values (step S1106). The search processing unit 202 then uses the first specifying unit 904 to specify, from the p-th stage transposed Bloom filter row tBF(p), the k transposed Bloom filters tbf(p)r that correspond to the k remainder values (step S1107).

そして、検索処理部２０２は、第２の特定部９０５により、ｋ個の転置ブルームフィルタｔｂｆ（ｐ）ｒをＡＮＤ演算し（ステップＳ１１０８）、図１２のステップＳ１２０１に移行する。 Then, the search processing unit 202 performs an AND operation on the k transposed Bloom filters tbf (p) r by the second specifying unit 905 (step S1108), and proceeds to step S1201 in FIG.

In FIG. 12, the search processing unit 202 uses the second specifying unit 905 to set the first bit of the AND result as the target bit (step S1201) and determines whether the target bit is ON (step S1202). If it is not ON (step S1202: No), the search processing unit 202 uses the determination unit 907 to determine whether the target bit can be shifted (step S1203). Specifically, it determines whether the target bit is the tail bit; if it is, no further shift is possible.

If the target bit can be shifted (step S1203: Yes), the search processing unit 202 shifts the target bit by one bit toward the tail (step S1204) and returns to step S1202. If it cannot be shifted (step S1203: No), the search processing unit 202 uses the determination unit 907 to determine that the search result is negative and outputs the result from the output unit 909 (step S1205). This ends the processing procedure for the case where the search result is negative.

If the target bit is ON in step S1202 (step S1202: Yes), the search processing unit 202 uses the determination unit 906 to determine whether the current stage number p is p = 1 (step S1206). If p = 1 does not hold (step S1206: No), p is decremented (step S1207) and the process returns to step S1104.

一方、ｐ＝１である場合（ステップＳ１２０６：Ｙｅｓ）、検索処理部２０２は、判定部９０７により、対象ビットのビット位置に対応するハッシュテーブルを検索する（ステップＳ１２０８）。そして、検索対象データＤｘが存在するか否かを判断する（ステップＳ１２０９）。 On the other hand, if p = 1 (step S1206: Yes), the search processing unit 202 uses the determination unit 907 to search for a hash table corresponding to the bit position of the target bit (step S1208). Then, it is determined whether or not the search target data Dx exists (step S1209).

If the search target data does not exist (step S1209: No), the process returns to step S1203 to determine whether the target bit can be shifted. If the search target data exists (step S1209: Yes), the search processing unit 202 outputs the search result (positive) (step S1210). As needed, the search processing unit 202 uses the extraction unit 908 to extract related data and outputs it as the search result. This ends the processing procedure for the case where the search result is positive.
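The search flow of FIGS. 11 and 12 can be sketched as follows, assuming the rows have already been transposed (so steps S1104 and S1105 are skipped). The salted SHA-256 helper, the list-of-bits representation, the `dict` hash tables, and the rule that a remainder of 0 selects the last transposed filter are all assumptions made for illustration; the names are hypothetical.

```python
import hashlib

K = 3  # number of hash functions k (example value)


def hashes(data: bytes) -> list:
    """Hypothetical stand-ins for the k hash functions (salted SHA-256)."""
    return [int.from_bytes(hashlib.sha256(bytes([i]) + data).digest()[:8], "big")
            for i in range(K)]


def search(tbf: dict, hash_tables: dict, data: bytes, h: int):
    """Return the block number holding `data`, or None if the result is negative.

    tbf[p] is the transposed row tBF(p): a list of transposed filters,
    each a list of 0/1 bits (one bit per Bloom filter of stage p).
    """
    values = hashes(data)                                        # S1102
    p = h                                                        # S1103
    while True:
        row = tbf[p]
        count = len(row)
        picked = [row[(v % count or count) - 1] for v in values]  # S1106, S1107
        and_result = [int(all(bits)) for bits in zip(*picked)]    # S1108
        on_positions = [i + 1 for i, bit in enumerate(and_result) if bit]
        if not on_positions:
            return None                                          # S1205: negative
        if p > 1:
            p -= 1                                               # S1207: go one stage down
            continue
        for block_no in on_positions:                            # S1202 to S1204
            if data in hash_tables.get(block_no, set()):         # S1208, S1209
                return block_no                                  # S1210: positive
        return None                                              # every candidate was a false positive
```

As in the flowchart, the upper stages only decide whether to continue; the data blocks are consulted solely at the lowest stage.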

As described above, according to the present embodiment, transposing the Bloom filter row BF(p) into the transposed Bloom filter row tBF(p) reduces memory accesses and speeds up search. In particular, with the hierarchical transposed Bloom filter tBF, memory accesses are reduced at every stage, so search can be performed even faster.
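A small sketch may help show where the saving comes from. Without transposition, each of the n Bloom filters has to be probed at each of the k positions; after transposition, only k transposed filters are read and ANDed together. The function names and the list-of-bits representation below are illustrative assumptions, with 1-based positions as in the text.

```python
def candidates_rowwise(row, positions):
    """Untransposed: probe every one of the n Bloom filters at each of the k positions."""
    return [j for j, bf in enumerate(row, start=1)
            if all(bf[pos - 1] for pos in positions)]


def transpose(row):
    """Turn n filters of m bits into m transposed filters of n bits."""
    return [list(bits) for bits in zip(*row)]


def candidates_transposed(trow, positions):
    """Transposed: read only k transposed filters and AND them bit by bit."""
    picked = [trow[pos - 1] for pos in positions]
    return [j for j, bits in enumerate(zip(*picked), start=1) if all(bits)]
```

Both functions return the same candidate filter numbers; the transposed form simply touches k filters instead of n.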

Furthermore, by treating the startup completion notification as the transposition request, the hierarchical Bloom filter BF is transposed into the hierarchical transposed Bloom filter tBF when the management apparatus 200 starts up. Once the whole filter has been transposed in one batch at startup, the hierarchical transposed Bloom filter tBF can be used continuously until shutdown.

Also, by treating a search request for the data block set db as the transposition request, each search request causes the Bloom filter row BF(p) used by the search processing unit 202 to be transposed. As a result, the number of transposed Bloom filter rows tBF(p) grows as searches are performed. Because only the Bloom filter rows BF(p) actually needed for search are transposed, and only gradually, wasted transposition of rows that are never used before shutdown can be reduced.
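The two transposition triggers described above (batch at startup, or on demand during search) can be sketched with a small cache. The class name, its methods, and the list-of-bits representation are hypothetical and chosen only for illustration.

```python
def transpose(row):
    """Turn n filters of m bits into m transposed filters of n bits."""
    return [list(bits) for bits in zip(*row)]


class TransposeCache:
    """Holds BF(p) rows and produces tBF(p) rows either in one batch or on demand."""

    def __init__(self, bf_rows):
        self.bf_rows = bf_rows       # stage p -> Bloom filter row BF(p)
        self.tbf_rows = {}           # stage p -> transposed row tBF(p)
        self.transpose_count = 0     # how many rows were actually transposed

    def get(self, p):
        """On-demand case: transpose BF(p) the first time a search needs it."""
        if p not in self.tbf_rows:                           # S1104: not yet transposed
            self.tbf_rows[p] = transpose(self.bf_rows[p])    # S1105
            self.transpose_count += 1
        return self.tbf_rows[p]

    def transpose_all(self):
        """Startup case: transpose every stage in one batch."""
        for p in self.bf_rows:
            self.get(p)
```

With the on-demand variant, a stage that no search ever reaches is never transposed, which is the saving the paragraph above describes.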

Also, with the hierarchical transposed Bloom filter tBF transposed from the hierarchical Bloom filter BF, until the lowest stage is reached, whether the result is a false positive or a negative is decided solely by whether the AND result contains an ON bit: if even one bit of the AND result is ON, the result is treated as a false positive and the process moves one stage down. Using the AND result in this way makes it easy to decide whether the result is negative, so the absence of the search target data can be identified early.

また、階層型転置ブルームフィルタｔＢＦだけをメモリに保持しておけば検索が可能であるため、階層型ブルームフィルタＢＦを保持しておく必要なない。これにより、省メモリ化を図ることができる。 In addition, since it is possible to search if only the hierarchical transposed Bloom filter tBF is held in the memory, it is not necessary to hold the hierarchical Bloom filter BF. Thereby, memory saving can be achieved.

＜階層型転置ブルームフィルタｔＢＦの学習処理例＞
図１３は、登録処理部２０１による階層型転置ブルームフィルタｔＢＦの学習処理例を示す説明図である。ここでは、図８に示した階層型転置ブルームフィルタｔＢＦを例にあげて説明する。また、図８では、検索対象データＤｘを検索してヒットさせた例について説明したが、図１３では、検索対象データＤｘ（図１３では、対象データＤｘとする）がデータブロック集合ｄｂにまだ登録されていないものとして説明する。 <Example of learning process of hierarchical transposed Bloom filter tBF>
FIG. 13 is an explanatory diagram illustrating a learning process example of the hierarchical transposed Bloom filter tBF by the registration processing unit 201. Here, the hierarchical transposed Bloom filter tBF shown in FIG. 8 will be described as an example. Further, FIG. 8 illustrates an example in which the search target data Dx is searched and hit, but in FIG. 13, the search target data Dx (referred to as target data Dx in FIG. 13) is still registered in the data block set db. It will be described as not being done.

まず、データブロック集合ｄｂのうちデータブロックｄｂ４に対象データＤｘが登録されたものとする。対象データＤｘを各ハッシュ関数Ｈ１（），Ｈ２（），Ｈ３（）に与えたときのハッシュ値は、例として以下の値とする。
Ｈ１（Ｄｘ）＝ｘ１
Ｈ２（Ｄｘ）＝ｘ２
Ｈ３（Ｄｘ）＝ｘ３ First, it is assumed that the target data Dx is registered in the data block db4 in the data block set db. As an example, the hash value when the target data Dx is given to each hash function H1 (), H2 (), H3 () is as follows.
H1 (Dx) = x1
H2 (Dx) = x2
H3 (Dx) = x3

In the learning process of the hierarchical transposed Bloom filter tBF, a specific bit in each transposed Bloom filter tbf(p) to be updated is turned ON; if that bit is already ON, it is left as it is.

ここで、登録処理部２０１は、登録先のデータブロックｄｂ４のブロック番号４のハッシュテーブルＨＴ４に対するハッシュテーブルエントリＥ４を作成する。そして、登録処理部２０１は、作成されたハッシュテーブルエントリＥ４を、ハッシュテーブルＨＴ４に追加登録する。 Here, the registration processing unit 201 creates a hash table entry E4 for the hash table HT4 of the block number 4 of the registration destination data block db4. Then, the registration processing unit 201 additionally registers the created hash table entry E4 in the hash table HT4.

そして、第１段の学習処理に移る。登録処理部２０１は、更新対象となる転置ブルームフィルタｔｂｆ（１）を、第１段の転置ブルームフィルタ列ｔＢＦ（１）の中から特定する。具体的には、登録処理部２０１は、各ハッシュ値ｘ１〜ｘ３を、第１段の転置ブルームフィルタ列ｔＢＦ（１）の配列個数である８で割り算し、余り値を算出する。ハッシュ値ｘ１の余り値は「２」、ハッシュ値ｘ２の余り値は「５」、ハッシュ値ｘ３の余り値は「７」になったものとする。したがって、第１段での更新対象となる転置ブルームフィルタｔｂｆ（１）は、転置ブルームフィルタｔｂｆ（１−２），ｔｂｆ（１−５），ｔｂｆ（１−７）となる。 Then, the process proceeds to the first stage learning process. The registration processing unit 201 identifies the transposed Bloom filter tbf (1) to be updated from the first-stage transposed Bloom filter row tBF (1). Specifically, the registration processing unit 201 divides each hash value x1 to x3 by 8, which is the number of arrays of the first-stage transposed Bloom filter string tBF (1), and calculates a remainder value. It is assumed that the remainder value of the hash value x1 is “2”, the remainder value of the hash value x2 is “5”, and the remainder value of the hash value x3 is “7”. Therefore, the transposed Bloom filter tbf (1) to be updated in the first stage is the transposed Bloom filter tbf (1-2), tbf (1-5), tbf (1-7).

また、最下段では、登録先のデータブロックｄｂ４のブロック番号４に対応するビット位置を更新対象ビットとする。したがって、更新対象となる転置ブルームフィルタの先頭から４ビット目のビットをＯＮにする。これにより、第１段の転置ブルームフィルタ列ｔＢＦ（１）の学習処理を終了する。 In the lowest level, the bit position corresponding to the block number 4 of the registration-destination data block db4 is set as the update target bit. Therefore, the fourth bit from the beginning of the transposed Bloom filter to be updated is turned ON. Thereby, the learning process of the first-stage transposed Bloom filter row tBF (1) is completed.

つぎに、第２段の学習処理に移る。登録処理部２０１は、更新対象となる転置ブルームフィルタｔｂｆ（２）を、第２段の転置ブルームフィルタ列ｔＢＦ（２）の中から特定する。具体的には、登録処理部２０１は、各ハッシュ値ｘ１〜ｘ３を、第２段の転置ブルームフィルタ列ｔＢＦ（２）の配列個数である１６で割り算し、余り値を算出する。ハッシュ値ｘ１の余り値は「８」、ハッシュ値ｘ２の余り値は「１１」、ハッシュ値ｘ３の余り値は「１３」になったものとする。したがって、第２段での更新対象となる転置ブルームフィルタｔｂｆ（２）は、転置ブルームフィルタｔｂｆ（２−８），ｔｂｆ（２−１１），ｔｂｆ（２−１３）となる。 Next, the learning process of the second stage is started. The registration processing unit 201 identifies the transposed Bloom filter tbf (2) to be updated from the second-stage transposed Bloom filter row tBF (2). Specifically, the registration processing unit 201 divides each hash value x1 to x3 by 16, which is the number of arrays of the second-stage transposed Bloom filter string tBF (2), and calculates a remainder value. It is assumed that the remainder value of the hash value x1 is “8”, the remainder value of the hash value x2 is “11”, and the remainder value of the hash value x3 is “13”. Therefore, the transposed Bloom filter tbf (2) to be updated in the second stage is the transposed Bloom filter tbf (2-8), tbf (2-11), tbf (2-13).

Next, which bit position is turned ON in the transposed Bloom filters tbf(2-8), tbf(2-11), and tbf(2-13) is described. In the hierarchical Bloom filter BF before transposition, with d as the division number, each Bloom filter row BF(p) was divided into n = d^(h−(p−1)) Bloom filters. As a result, the bit width of each Bloom filter in the row BF(p) is m = s/n bits.
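The relation between n and m can be checked with a short calculation. The function name is hypothetical, and s = 64 is an assumed row size, chosen because with h = 3 and d = 2 it reproduces the array counts 8, 16, and 32 used for tBF(1) to tBF(3) in this example.

```python
def stage_shape(p: int, h: int, d: int, s: int) -> tuple:
    """Return (n, m) for stage p: n Bloom filters of m bits each in BF(p)."""
    n = d ** (h - (p - 1))   # number of Bloom filters in the row
    m = s // n               # bit width of each Bloom filter
    return n, m


shapes = [stage_shape(p, h=3, d=2, s=64) for p in (1, 2, 3)]
# After transposition the two values swap roles: tBF(p) has m filters of n bits.
```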

For this reason, in the learning process of the hierarchical Bloom filter BF, the Bloom filter bf(p) that covers the bit positions of the Bloom filter bf((p−1)-#) updated in the (p−1)-th stage was specified from the p-th stage Bloom filter row BF(p).

For example, in FIG. 5, at the second stage, array number 3 of the Bloom filter bf(1-3) updated in the preceding first stage is divided by the division number d (= 2) and rounded up, giving array number 2 as the update target. The Bloom filter bf(2-2) is therefore specified.

これに対し、階層型転置ブルームフィルタｔＢＦでは、配列個数ｎとビット幅ｍが入れ替わっているため、第（ｐ−１）段の更新対象のブルームフィルタｂｆ（（ｐ−１）−＃）の配列番号＃ではなく、第（ｐ−１）段での更新対象ビット位置を分割数ｄで割り算し、端数を切り上げる。 On the other hand, in the hierarchical transposed Bloom filter tBF, the array number n and the bit width m are interchanged, so the array of Bloom filters bf ((p-1)-#) to be updated in the (p-1) stage. Instead of the number #, the bit position to be updated at the (p−1) -th stage is divided by the division number d, and the fraction is rounded up.

For the second stage, the update target bit in the first stage was the fourth bit from the head, and the fourth bit of the transposed Bloom filters tbf(1-2), tbf(1-5), and tbf(1-7) was turned ON. Since d = 2, the update target bit in the second stage is bit 4/d = 2 from the head. In this example, the second bit from the head of the transposed Bloom filters tbf(2-8), tbf(2-11), and tbf(2-13) is turned ON. This completes the learning process of the second-stage transposed Bloom filter row tBF(2).

つぎに、第３段の学習処理に移る。登録処理部２０１は、更新対象となる転置ブルームフィルタｔｂｆ（３）を、第３段の転置ブルームフィルタ列ｔＢＦ（３）の中から特定する。具体的には、登録処理部２０１は、各ハッシュ値ｘ１〜ｘ３を、第３段の転置ブルームフィルタ列ｔＢＦ（３）の配列個数である３２で割り算し、余り値を算出する。ハッシュ値ｘ１の余り値は「２」、ハッシュ値ｘ２の余り値は「１９」、ハッシュ値ｘ３の余り値は「２７」になったものとする。したがって、第３段での更新対象となる転置ブルームフィルタｔｂｆ（３）は、転置ブルームフィルタｔｂｆ（３−２），ｔｂｆ（３−１９），ｔｂｆ（３−２７）となる。 Next, the process moves to the third stage learning process. The registration processing unit 201 identifies the transposed Bloom filter tbf (3) to be updated from the third-stage transposed Bloom filter row tBF (3). Specifically, the registration processing unit 201 divides each hash value x1 to x3 by 32, which is the number of arrays of the third-stage transposed Bloom filter string tBF (3), and calculates a remainder value. It is assumed that the remainder value of the hash value x1 is “2”, the remainder value of the hash value x2 is “19”, and the remainder value of the hash value x3 is “27”. Therefore, the transposed Bloom filter tbf (3) to be updated in the third stage is the transposed Bloom filter tbf (3-2), tbf (3-19), tbf (3-27).

つぎに、転置ブルームフィルタｔｂｆ（３−２），ｔｂｆ（３−１９），ｔｂｆ（３−２７）内の更新対象ビットを決める。第２段と同様、第（ｐ−１）段の更新対象のブルームフィルタｂｆ（（ｐ−１）−＃）の配列番号＃ではなく、第（ｐ−１）段での更新対象ビット位置を分割数ｄで割り算し、端数を切り上げる。 Next, the update target bits in the transposed Bloom filters tbf (3-2), tbf (3-19), and tbf (3-27) are determined. Similar to the second stage, not the array number # of the Bloom filter bf ((p-1)-#) to be updated in the (p-1) stage, but the bit position to be updated in the (p-1) stage. Divide by the division number d and round up the fraction.

For the third stage, the update target bit in the preceding second stage was the second bit from the head, and the second bit of the transposed Bloom filters tbf(2-8), tbf(2-11), and tbf(2-13) was turned ON. Since d = 2, the update target bit in the third stage is bit 2/d = 1 from the head. In this example, the first bit of the transposed Bloom filters tbf(3-2), tbf(3-19), and tbf(3-27) is turned ON. This completes the learning process of the third-stage transposed Bloom filter row tBF(3).
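The chain of update target bit positions in this example (4, then 2, then 1) follows from repeated division by d with rounding up. A minimal sketch, with a hypothetical function name and 1-based positions as in the text:

```python
def update_bit_positions(block_no: int, h: int, d: int) -> list:
    """Return the update target bit position at stages 1..h for a given block number."""
    positions = [block_no]                          # stage 1: the block number itself
    for _ in range(2, h + 1):
        positions.append(-(-positions[-1] // d))    # ceil(previous position / d)
    return positions


example = update_bit_positions(4, h=3, d=2)
```

For data block db4 with h = 3 and d = 2, `example` holds the positions 4, 2, and 1 used in the walk-through above.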

＜登録処理部２０１による階層型転置ブルームフィルタｔＢＦの学習処理手順＞
図１４は、登録処理部２０１による階層型ブルームフィルタＢＦの学習処理手順を示すフローチャートである。まず、登録処理部２０１は、登録したいデータ（対象データＤｘ）があるか否かを判断する（ステップＳ１４０１）。対象データＤｘがある場合（ステップＳ１４０１：Ｙｅｓ）、登録処理部２０１は、段数ｐをｐ＝１に設定し（ステップＳ１４０２）、ｐ＞ｈ（ｈは階層型転置ブルームフィルタｔＢＦの最上段）であるか否かを判断する（ステップＳ１４０３）。ｐ＞ｈでない場合（ステップＳ１４０３：Ｎｏ）、登録処理部２０１は、ｋ種類のハッシュ関数を用いて、対象データＤｘのｋ個のハッシュ値を算出する（ステップＳ１４０４）。 <Learning Processing Procedure of Hierarchical Transposed Bloom Filter tBF by Registration Processing Unit 201>
FIG. 14 is a flowchart showing the learning processing procedure of the hierarchical transposed Bloom filter tBF by the registration processing unit 201. First, the registration processing unit 201 determines whether there is data to be registered (target data Dx) (step S1401). When there is target data Dx (step S1401: Yes), the registration processing unit 201 sets the stage number p to p = 1 (step S1402) and determines whether p > h, where h is the uppermost stage of the hierarchical transposed Bloom filter tBF (step S1403). If p > h does not hold (step S1403: No), the registration processing unit 201 calculates k hash values of the target data Dx using k types of hash functions (step S1404).

Next, the registration processing unit 201 divides the k hash values by the number of transposed Bloom filters in the p-th stage transposed Bloom filter row tBF(p) to obtain k remainder values (step S1405). The registration processing unit 201 then specifies the k transposed Bloom filters tbf(p)r whose array numbers equal the k remainder values (step S1406).

After that, it is determined whether p = 1 (step S1407). If p = 1 (step S1407: Yes), the registration processing unit 201 enters the block number # of the data block db# to which the target data Dx belongs into the k specified transposed Bloom filters tbf(p)r (step S1408). That is, it sets the block number # of the data block db# to which the target data Dx belongs as the update target bit position # and turns ON the bit at the update target bit position # in each of the k specified transposed Bloom filters tbf(p)r. The process then proceeds to step S1410.

If p ≠ 1 in step S1407 (step S1407: No), the quotient (rounded up) obtained by dividing the update target bit position of the target data Dx at the (p−1)-th stage by the division number d is entered into the k specified transposed Bloom filters tbf(p)r (step S1409). That is, that rounded-up quotient is set as the update target bit position, and the bit at the update target bit position in each of the k specified transposed Bloom filters tbf(p)r is turned ON. The process then proceeds to step S1410.

ステップＳ１４１０において、登録処理部２０１は、段数ｐをインクリメントし（ステップＳ１４１０）、ステップＳ１４０３に戻る。これにより、最下段から最上段まで更新対象ビットをＯＮにすることができる。 In step S1410, the registration processing unit 201 increments the number of stages p (step S1410) and returns to step S1403. Thereby, the update target bit can be turned ON from the lowest level to the highest level.

一方、ステップＳ１４０３において、ｐ＞ｈである場合（ステップＳ１４０３：Ｙｅｓ）、登録処理部２０１は、対象データのハッシュテーブルエントリを追加して（ステップＳ１４１１）、ステップＳ１４０１に戻る。対象データＤｘがない場合（ステップＳ１４０１：Ｎｏ）、登録処理部２０１による階層型転置ブルームフィルタｔＢＦの学習処理を終了する。 On the other hand, in step S1403, when p> h (step S1403: Yes), the registration processing unit 201 adds a hash table entry of the target data (step S1411), and returns to step S1401. If there is no target data Dx (step S1401: No), the learning process of the hierarchical transposed Bloom filter tBF by the registration processing unit 201 is terminated.

このような手順により、登録処理部２０１は、階層型転置ブルームフィルタｔＢＦにデータエントリを学習させることとなる。すなわち、転置後にデータが登録される場合も、階層型転置ブルームフィルタｔＢＦから階層型ブルームフィルタＢＦに戻す必要がなく、転置後の階層型転置ブルームフィルタｔＢＦのまま学習させることができる。したがって、転置前に戻すといった無駄な処理がなくなり、検索効率の向上も図ることができる。 By such a procedure, the registration processing unit 201 causes the hierarchical transposed Bloom filter tBF to learn the data entry. That is, even when data is registered after transposition, it is not necessary to return from the hierarchical transposed Bloom filter tBF to the hierarchical Bloom filter BF, and learning can be performed with the transposed hierarchical transposed Bloom filter tBF. Therefore, useless processing such as returning before transposition is eliminated, and search efficiency can be improved.
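The FIG. 14 procedure, which updates the transposed filters directly, can be sketched as follows. As before this is an illustration under assumptions rather than the patented implementation: the salted SHA-256 helper stands in for the k hash functions, `D`, `H`, `S`, and `K` are example sizes, the hash table is a `dict` of sets, the names are hypothetical, and mapping a remainder of 0 to the last transposed filter is an assumption made by analogy with the tail-bit rule of FIG. 10.

```python
import hashlib

D, H, S, K = 2, 3, 64, 3  # division number d, stages h, bits per row s, hash count k (example values)


def hashes(data: bytes) -> list:
    """Hypothetical stand-ins for the k hash functions (salted SHA-256)."""
    return [int.from_bytes(hashlib.sha256(bytes([i]) + data).digest()[:8], "big")
            for i in range(K)]


def new_transposed_hierarchy() -> dict:
    """Build empty rows tBF(1)..tBF(h); tBF(p) holds m transposed filters of n bits each."""
    rows = {}
    for p in range(1, H + 1):
        n = D ** (H - (p - 1))     # bits per transposed filter
        m = S // n                 # number of transposed filters in tBF(p)
        rows[p] = [[0] * n for _ in range(m)]
    return rows


def register_transposed(trows: dict, hash_tables: dict, data: bytes, block_no: int) -> None:
    """Register `data`, stored in data block `block_no`, without undoing the transposition."""
    bit = block_no                                    # S1408: stage 1 uses the block number
    for p in range(1, H + 1):                         # S1402, S1403, S1410
        if p > 1:
            bit = -(-bit // D)                        # S1409: ceil(previous bit position / d)
        row = trows[p]
        count = len(row)
        for value in hashes(data):                    # S1404
            remainder = value % count                 # S1405
            number = remainder if remainder else count   # assumed: remainder 0 -> last filter
            row[number - 1][bit - 1] = 1              # S1406, then turn the update target bit ON
    hash_tables.setdefault(block_no, set()).add(data)    # S1411
```

Registering data for block 4 sets bit 4 at stage 1, bit 2 at stage 2, and bit 1 at stage 3, in at most k transposed filters per stage.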

なお、本実施の形態で説明した検索方法は、予め用意されたプログラムをパーソナル・コンピュータやワークステーション等のコンピュータで実行することにより実現することができる。本検索プログラムは、ハードディスク、フレキシブルディスク、ＣＤ−ＲＯＭ、ＭＯ、ＤＶＤ等のコンピュータで読み取り可能な記録媒体に記録され、コンピュータによって記録媒体から読み出されることによって実行される。また本検索プログラムは、インターネット等のネットワークを介して配布してもよい。 Note that the search method described in this embodiment can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. The search program is recorded on a computer-readable recording medium such as a hard disk, a flexible disk, a CD-ROM, an MO, and a DVD, and is executed by being read from the recording medium by the computer. The search program may be distributed via a network such as the Internet.

200 management apparatus
201 registration processing unit
202 search processing unit
901 receiving unit
902 transposing unit
903 conversion unit
904 first specifying unit
905 second specifying unit
906 determination unit
907 determination unit
908 extraction unit
909 output unit
BF hierarchical Bloom filter
db data block set
HTs hash table group
tBF hierarchical transposed Bloom filter

Claims

A search program that causes a computer, which can access a data block set containing a registered data group for each data block and a Bloom filter row in which n m-bit Bloom filters, each indicating negatives for a predetermined number of data blocks, are arrayed in units of the predetermined number of data blocks, to execute:
an accepting step of accepting a transposition request for the Bloom filter row;
a transposing step of, when the transposition request is accepted by the accepting step, transposing the Bloom filter row on a storage device into a transposed Bloom filter row in which m n-bit transposed Bloom filters are arrayed, each transposed Bloom filter gathering the bits at the same position in the bit strings of the Bloom filters;
a converting step of converting search target data, for each of a plurality of types of hash functions, into position information representing an array position of a transposed Bloom filter;
a first specifying step of specifying, from the transposed Bloom filter row and for each piece of position information, the transposed Bloom filter corresponding to the position information obtained by the converting step; and
a determining step of determining the presence or absence of the search target data in the data block set, based on position information common to the plurality of transposed Bloom filters specified by the first specifying step.
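The transposing, converting, first specifying, and determining steps recited above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the salted SHA-256 hash family, the values `m = 64` and `n = 4`, and the names `positions`, `register`, `transpose`, and `candidate_blocks` do not come from the claims. Each Bloom filter and each transposed Bloom filter is held as a Python integer, so the "position information common to the plurality of transposed Bloom filters" becomes the set bits of a bitwise AND.

```python
import hashlib


def positions(data: str, m: int, k: int = 3) -> list[int]:
    """Converting step: hash data to k array positions in [0, m) (assumed hash family)."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{data}".encode()).digest()[:8], "big") % m
        for i in range(k)
    ]


def register(bf: list[int], j: int, data: str, m: int) -> None:
    """Set the bits for data in the j-th (untransposed) m-bit Bloom filter."""
    for p in positions(data, m):
        bf[j] |= 1 << p


def transpose(bf: list[int], m: int) -> list[int]:
    """Transposing step: gather bit p of every filter into one n-bit integer."""
    return [
        sum(((filt >> p) & 1) << j for j, filt in enumerate(bf))
        for p in range(m)
    ]


def candidate_blocks(tbf: list[int], n: int, data: str) -> list[int]:
    """First specifying and determining steps: AND the selected transposed filters."""
    common = (1 << n) - 1
    for p in positions(data, len(tbf)):
        common &= tbf[p]  # one n-bit AND per hash function
    # An empty result means the data is definitely absent from all n filters.
    return [j for j in range(n) if (common >> j) & 1]


m, n = 64, 4
bf = [0] * n
register(bf, 1, "apple", m)
register(bf, 3, "pear", m)
tbf = transpose(bf, m)
print(candidate_blocks(tbf, n, "apple"))  # always contains 1; false positives are possible
```

With the untransposed row, the same query would test k bits in each of the n filters separately; in the transposed layout it reads k transposed filters and combines them with AND, covering all n filters at once.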

The search program according to claim 1, wherein the accepting step accepts a startup completion notification of the computer as the transposition request for the Bloom filter row.

The search program according to claim 1, wherein the accepting step accepts a search request for the data block set as the transposition request for the Bloom filter row.

The search program according to any one of claims 1 to 3, further causing the computer to execute:
a second specifying step of specifying, from the Bloom filter row, the Bloom filter corresponding to the position information common to the plurality of transposed Bloom filters specified by the first specifying step; and
a judging step of judging whether there is a Bloom filter row having the Bloom filter specified by the second specifying step as an array element,
wherein the determining step, when the judging step judges that no such Bloom filter row exists, determines whether the search target data exists in the data block, within the data block set, that corresponds to the Bloom filter specified by the second specifying step.

A search program that causes a computer, which can access a data block set containing a registered data group for each data block and a transposed Bloom filter row in which m n-bit transposed Bloom filters are arrayed, each transposed Bloom filter gathering the bits at the same position in the bit strings of n m-bit Bloom filters that indicate negatives for a predetermined number of data blocks, to execute:
a converting step of converting search target data, for each of a plurality of types of hash functions, into position information representing an array position of a transposed Bloom filter;
a first specifying step of specifying, from the transposed Bloom filter row and for each piece of position information, the transposed Bloom filter corresponding to the position information obtained by the converting step;
a second specifying step of specifying, from the Bloom filter row, the Bloom filter corresponding to position information common to the plurality of transposed Bloom filters specified by the first specifying step;
a judging step of judging whether there is a Bloom filter row having the Bloom filter specified by the second specifying step as an array element; and
a determining step of determining, when the judging step judges that no such Bloom filter row exists, whether the search target data exists in the data block, within the data block set, that corresponds to the Bloom filter specified by the second specifying step.
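The interplay of the second specifying, judging, and determining steps can be sketched in Python as a recursive descent. This is a hedged reading, not the claimed implementation: it assumes that the judging step amounts to asking whether a further Bloom filter row is associated with the matched Bloom filter, and the `Node` layout, the `slots` field, and the names `make_node`, `insert`, and `search` are hypothetical. When no further row exists, the data block itself is inspected, which removes false positives from the final answer.

```python
import hashlib
from dataclasses import dataclass
from typing import Union


def positions(data: str, m: int, k: int = 3) -> list[int]:
    """Hash data to k array positions in [0, m) (assumed salted SHA-256 family)."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{data}".encode()).digest()[:8], "big") % m
        for i in range(k)
    ]


@dataclass
class Node:
    """One Bloom filter row kept in transposed form (hypothetical layout)."""
    tbf: list[int]                    # m transposed filters, one bit per slot
    slots: list[Union["Node", set]]   # per Bloom filter: a further row, or a data block


def make_node(slots: list, m: int = 64) -> Node:
    return Node([0] * m, slots)


def insert(node: Node, path: list[int], data: str) -> None:
    """Register data along path (one slot index per level) and store it in the block."""
    j = path[0]
    for p in positions(data, len(node.tbf)):
        node.tbf[p] |= 1 << j
    slot = node.slots[j]
    if isinstance(slot, Node):
        insert(slot, path[1:], data)
    else:
        slot.add(data)


def search(node: Node, data: str) -> bool:
    common = (1 << len(node.slots)) - 1
    for p in positions(data, len(node.tbf)):
        common &= node.tbf[p]              # first specifying step, combined by AND
    for j, slot in enumerate(node.slots):  # second specifying step: matched filters
        if not (common >> j) & 1:
            continue
        if isinstance(slot, Node):         # judging step: a further row exists
            if search(slot, data):
                return True
        elif data in slot:                 # determining step: check the data block
            return True
    return False


leaf_a = make_node([set(), set()])
leaf_b = make_node([set(), set()])
root = make_node([leaf_a, leaf_b])
insert(root, [1, 0], "apple")
print(search(root, "apple"), search(root, "absent"))
```

Because the last step looks inside the data block, a Bloom filter false positive at any level costs extra work but cannot produce a wrong "present" result in this sketch.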

The first specifying step, when the judging step judges that a Bloom filter row having the Bloom filter specified by the second specifying step as an array element exists, takes that Bloom filter row as a specifying-target Bloom filter row and takes the transposed Bloom filter row transposed from the specifying-target Bloom filter row as a specifying-target transposed Bloom filter row, and thereby specifies, from the specifying-target transposed Bloom filter row and for each piece of position information, the specifying-target transposed Bloom filter corresponding to the position information obtained by the converting step, and
the second specifying step specifies, from the specifying-target Bloom filter row, the Bloom filter corresponding to the position information common to the plurality of specifying-target transposed Bloom filters specified by the first specifying step. Search program described in.

A search device that can access a data block set containing a registered data group for each data block and a Bloom filter row in which n m-bit Bloom filters, each indicating negatives for a predetermined number of data blocks, are arrayed in units of the predetermined number of data blocks, the search device comprising:
accepting means for accepting a transposition request for the Bloom filter row;
transposing means for, when the transposition request is accepted by the accepting means, transposing the Bloom filter row on a storage device into a transposed Bloom filter row in which m n-bit transposed Bloom filters are arrayed, each transposed Bloom filter gathering the bits at the same position in the bit strings of the Bloom filters;
converting means for converting search target data, for each of a plurality of types of hash functions, into position information representing an array position of a transposed Bloom filter;
first specifying means for specifying, from the transposed Bloom filter row and for each piece of position information, the transposed Bloom filter corresponding to the position information obtained by the converting means; and
determining means for determining the presence or absence of the search target data in the data block set, based on position information common to the plurality of transposed Bloom filters specified by the first specifying means.

A search method in which a computer, which can access a data block set containing a registered data group for each data block and a Bloom filter row in which n m-bit Bloom filters, each indicating negatives for a predetermined number of data blocks, are arrayed in units of the predetermined number of data blocks, executes:
an accepting step of accepting a transposition request for the Bloom filter row;
a transposing step of, when the transposition request is accepted by the accepting step, transposing the Bloom filter row on a storage device into a transposed Bloom filter row in which m n-bit transposed Bloom filters are arrayed, each transposed Bloom filter gathering the bits at the same position in the bit strings of the Bloom filters;
a converting step of converting search target data, for each of a plurality of types of hash functions, into position information representing an array position of a transposed Bloom filter;
a first specifying step of specifying, from the transposed Bloom filter row and for each piece of position information, the transposed Bloom filter corresponding to the position information obtained by the converting step; and
a determining step of determining the presence or absence of the search target data in the data block set, based on position information common to the plurality of transposed Bloom filters specified by the first specifying step.