JP2005071115A

JP2005071115A - Registration/retrieval method for object in p2p environment and program therefor

Info

Publication number: JP2005071115A
Application number: JP2003300613A
Authority: JP
Inventors: Hiroyuki Kitagawa; 博之北川; Akira Matsushita; 亮松下
Original assignee: Japan Science and Technology Agency
Current assignee: Japan Science and Technology Agency
Priority date: 2003-08-25
Filing date: 2003-08-25
Publication date: 2005-03-17

Abstract

PROBLEM TO BE SOLVED: To provide an object registration/retrieval method for a P2P environment that supports efficient object retrieval using various feature vectors, load equalization, and node addition and deletion. SOLUTION: A desired object is retrieved, by a common retrieval method using signatures, from objects registered by a common registration method using signatures in up to 2^n nodes, each of which is connected to a network and has a node ID identifying itself. Locators are generated in both the registration and the retrieval method. In registration, index entries are generated on the basis of the locators and registered in the one or more nodes specified by the locators. In retrieval, index entries are collected on the basis of query locators, and all index entries satisfying the conditions of the query frame signatures are transmitted to the node performing the retrieval. COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a method for registering and retrieving objects using signatures in a P2P environment, and to a program used to implement the method.

In recent years, peer-to-peer (P2P) technology has attracted attention as computers have become more powerful and network infrastructure has developed. In P2P, each computer becomes a peer node (node), and together the nodes form a large-scale distributed network. There is no clear distinction between server and client; each node plays both roles. P2P networks are classified into the hybrid P2P type, which incorporates a client-server system, and the pure P2P type, which is a fully distributed environment. In the hybrid P2P type, a specific server exists to provide certain services. For object search, the server provides an index function, which makes it possible to share information distributed across the nodes. However, when many nodes request service from the server, the server becomes a bottleneck, and if the server stops, the entire service stops. In the pure P2P type, by contrast, each node operates autonomously. It therefore scales well (nodes can be added without being limited by a server's processing capacity) and allows processing without a bottleneck.

However, because the pure P2P type cannot keep a global index or the like on a server, sharing information is generally not easy. A representative pure P2P system is the file sharing system Gnutella [Non-Patent Document 1] (Gnutella website, http://www.gnutella.com/). Gnutella searches by broadcasting to and traversing neighboring nodes, so the communication cost of object search is a major problem. To solve this problem, methods have been proposed in the paper "A Self-Organizing Access Structure for P2P Information Systems" by Karl Aberer [Non-Patent Document 2], the paper "A Scalable Peer-to-Peer Lookup Service for Internet Applications" by Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek and Hari Balakrishnan [Non-Patent Document 3], the paper "A Scalable Content-Addressable Network" by Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp and Scott Shenker [Non-Patent Document 4], and the paper "Distributed Object Location in a Dynamic Network" by Kirsten Hildrum, John D. Kubiatowicz, Satish Rao and Ben Y. Zhao [Non-Patent Document 5]. These methods have the following characteristics.

- Efficient object search
These methods provide an efficient search whose cost is O(log N), or close to it, for a total of N nodes. They are therefore applicable to environments with a large number of nodes.

- Load equalization
Data objects are distributed evenly across the nodes, so the load per node is equalized.

- Support for node addition and deletion
When nodes are added or deleted, node information is corrected autonomously, so the service can continue.

However, these methods consider only object search by a key called the object ID, and cannot directly search by the various feature quantities of an object.

For this reason, signatures are generally considered as a means of realizing flexible search using the various feature quantities of an object. They are described, for example, in the paper "Design and Performance Comparison of Some Signature Extraction Methods" by Christos Faloutsos [Non-Patent Document 6] and in the paper "Cost Evaluation of Set Value Retrieval Using Signature Files" by Yoshiharu Ishikawa, Hiroyuki Kitagawa and Nobuo Ohbo [Non-Patent Document 7]. Signatures make it possible to search by a variety of feature quantities, for example keyword matching or arbitrary substring matching against an object name or a description of its content.
[Non-Patent Document 1] Gnutella website. http://www.gnutella.com/
[Non-Patent Document 2] "A Self-Organizing Access Structure for P2P Information Systems", CoopIS 2001, LNCS 2172, pp. 179-194, 2001.
[Non-Patent Document 3] "A Scalable Peer-to-Peer Lookup Service for Internet Applications", SIGCOMM '01, pp. 149-160, 2001.
[Non-Patent Document 4] "A Scalable Content-Addressable Network", SIGCOMM '01, pp. 161-172, 2001.
[Non-Patent Document 5] "Distributed Object Location in a Dynamic Network", SPAA '02, pp. 41-52, 2002.
[Non-Patent Document 6] "Design and Performance Comparison of Some Signature Extraction Methods", Proc. ACM SIGMOD 1985, pp. 63-82, 1985.
[Non-Patent Document 7] "Cost Evaluation of Set Value Retrieval Using Signature Files", Transactions of Information Processing Society of Japan, Vol. 36, No. 2, pp. 383-395, 1995.

However, conventional signature-based techniques have not been studied for use in a P2P environment, and so could not be used for signature-based object search in such an environment.

An object of the present invention is to provide a method for registering and retrieving objects in a P2P environment that supports efficient object search using various feature quantities, load equalization, and node addition and deletion, and to provide a program for realizing the method.

Another object of the present invention is to provide a method for registering and retrieving objects in a P2P environment that can optimize processing efficiency according to the relative frequency of search requests and object registration requests, and to provide a program for realizing the method.

Still another object of the present invention is to provide a method for registering and retrieving objects in a P2P environment that can continue object search even when offline nodes exist, and to provide a program for realizing the method.

The method of the present invention is directed to an object registration/retrieval method for a P2P environment in which a desired object is retrieved, by a common retrieval method using signatures, from objects registered by a common registration method using signatures in up to 2^n nodes, each of which is connected to a network and has a node ID identifying itself.

In the registration method used in the present invention, one or more feature quantities of the object to be registered are first converted into an object signature consisting of a bit string of length m (m is a positive integer of 1 or more) (first registration step). Next, the m-bit string of the object signature is divided into p divided frames, each consisting of a frame signature that is a bit string of length m/p (p is a positive integer of 1 or more) (second registration step). A bit string identifying each of the p divided frames is then combined with the m/p-bit string of the corresponding frame signature to create a composite signature for each divided frame (third registration step). After that, p locators, each a bit string of length n, are generated from the bit strings of the composite signatures (fourth registration step). In the fourth registration step, the locator that identifies the registration destination node is generated from at most n bits at the head of the bit string of the composite signature. This allows the locator to identify a node easily and reliably. Next, an index entry consisting of the node ID of the node performing the registration, a local ID identifying the object being registered within that node, data identifying the divided frame, and the frame signature is created for each of the p locators (fifth registration step). Finally, each index entry is registered in the one or more nodes designated by its locator (sixth registration step).
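The six registration steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the parameter values (m = 32, p = 4, n = 6), the bigram-hashing used to build the object signature, and all function and field names are assumptions chosen for the example. The text above fixes only the overall structure (signature, frame split, composite signature, locator, index entry).

```python
import hashlib
from typing import NamedTuple

M = 32                                    # object signature length m (bits)
P = 4                                     # number of divided frames p
N = 6                                     # locator length n (up to 2**N nodes)
FRAME_ID_BITS = max(1, (P - 1).bit_length())  # bits used to name a frame


class IndexEntry(NamedTuple):
    node_id: int      # node performing the registration
    local_id: int     # ID of the object inside that node
    frame_no: int     # data identifying the divided frame
    frame_sig: str    # frame signature as a bit string


def object_signature(name: str, m: int = M) -> str:
    """Step 1 (assumed encoding): superimpose one hashed bit per character bigram."""
    bits = 0
    for i in range(len(name) - 1):
        digest = hashlib.sha256(name[i:i + 2].encode("utf-8")).digest()
        bits |= 1 << (int.from_bytes(digest[:4], "big") % m)
    return format(bits, f"0{m}b")


def split_frames(signature: str, p: int = P) -> list[str]:
    """Step 2: cut the m-bit signature into p frame signatures of m/p bits."""
    size = len(signature) // p
    return [signature[i * size:(i + 1) * size] for i in range(p)]


def locator(frame_no: int, frame_sig: str, n: int = N) -> str:
    """Steps 3-4: prepend the frame number, then keep the leading n bits."""
    composite = format(frame_no, f"0{FRAME_ID_BITS}b") + frame_sig
    return composite[:n]


def registration_plan(node_id: int, local_id: int, name: str) -> list[tuple[str, IndexEntry]]:
    """Steps 5-6: one (locator, index entry) pair per divided frame."""
    frames = split_frames(object_signature(name))
    return [(locator(k, sig), IndexEntry(node_id, local_id, k, sig))
            for k, sig in enumerate(frames)]
```

Calling `registration_plan(5, 1, "chord.txt")` yields four pairs; each locator names the node (or nodes) that should store the corresponding index entry.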

In the search method used in the method of the present invention, the node performing the search converts one or more feature quantities of the object to be retrieved into a query signature consisting of a bit string of length m (m is a positive integer of 1 or more) (first query entry generation step). Next, the m-bit string of the query signature is divided into p divided frames, each consisting of a query frame signature that is a bit string of length m/p (p is a positive integer of 1 or more) (second query entry generation step). A bit string identifying each of the p divided frames is then combined with the m/p-bit string of the corresponding query frame signature to create a composite signature for each divided frame (third query entry generation step). Next, p query locators, each a bit string of length n, are generated from the bit strings of the composite signatures (fourth query entry generation step). The first search step consists of executing these first through fourth query entry generation steps. Next, index entries satisfying the conditions of the query frame signatures are collected at the one or more nodes determined from the p query locators, and the node that finishes the collection sends all of the collected index entries satisfying those conditions to the node performing the search (second search step). With this search method, the first and second search steps make it possible to collect the index entries usable for object search in a P2P environment easily and at low communication cost. The target object may then be found among the collected index entries either manually or by a known search technique.
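The text does not spell out what it means for an index entry to satisfy the condition of a query frame signature. A minimal sketch, assuming the conventional signature-file test (every bit set in the query frame signature must also be set in the stored frame signature), might look like this. The function names and the `(label, frame_sig)` pair layout are hypothetical.

```python
def satisfies(entry_frame_sig: str, query_frame_sig: str) -> bool:
    """Assumed condition: the stored frame signature covers every 1 bit of the query."""
    entry_bits = int(entry_frame_sig, 2)
    query_bits = int(query_frame_sig, 2)
    return (entry_bits & query_bits) == query_bits


def matching_entries(entries: list[tuple[str, str]], query_frame_sig: str) -> list[str]:
    """Return the labels of (label, frame_sig) pairs that pass the test."""
    return [label for label, sig in entries if satisfies(sig, query_frame_sig)]
```

Under this assumption a match only marks a candidate: with superimposed signatures, unrelated objects can set the same bits, so candidates still have to be checked against the actual objects.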

When the index entries created for the p locators are registered in the nodes designated by those locators in this way, the index needed to search for a single object is distributed across the nodes of the P2P environment. This yields the following characteristic advantages.

1. Using signatures supports flexible search over a variety of feature quantities.

2. Chord, P-Grid, and similar systems described later can serve as the underlying framework, and their characteristics listed above are inherited by signature-based search.

3. Because signatures are divided into frames, choosing appropriate parameters yields a configuration suited to the frequencies of object searches and object additions in the usage environment.

4. Object search can continue even when offline nodes exist.

First, all search target locators, which determine the nodes to be searched, are obtained from the first of the p query locators, and a search target locator set is generated. Because index entries may be registered in nodes other than the node determined by a single query locator, "all search target locators" means the locators needed to search for those index entries. Query data containing the node ID of the node performing the search and all of the query frame signatures is then sent to the node determined by the first search target locator in the set (first step). At this first node, all index entries satisfying the condition of the query frame signature corresponding to the first search target locator are obtained and taken as a partial index entry candidate set (second step). The query data and the index entry candidates are then sent to the next node, which is determined by the next search target locator (third step). At that node, all index entries satisfying the condition of the query frame signature corresponding to the next search target locator are obtained, and the union of this index entry set and the partial index entry candidate set becomes the next partial index entry candidate set (fourth step). The third and fourth steps are repeated until the last search target locator is reached, and the final partial index entry candidate set becomes the index entry candidate set (fifth step). A search target locator set is then generated from the next of the p query locators, and the query data and the index entry candidate set are sent to the next node, which is determined by the first search target locator of that set (sixth step). Starting from that node, the second through fifth steps are repeated until the last search target locator of the set is reached, and the intersection of the resulting index entry candidate set with the index entry candidate set held at the start of this step becomes the next index entry candidate set (seventh step). The sixth and seventh steps are repeated until the last query locator is reached, and the final result is sent to the node performing the search (eighth step).
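The eight steps amount to a union over the search target locators of one query locator, followed by an intersection across query locators. The sketch below simulates this in a single process and rests on several assumptions not stated in the text: nodes are modeled as a dictionary keyed by locator, a stored entry matches when it covers every 1 bit of the query frame signature, the search target locators are obtained by allowing extra 1 bits in the signature part of the query locator, and candidates are compared by `(node_id, local_id)`. All names are hypothetical.

```python
from itertools import product
from typing import Optional

Entry = tuple[int, int, int, str]   # (node_id, local_id, frame_no, frame_sig)


def covers(stored: str, query: str) -> bool:
    """Assumed match test: stored bits include every 1 bit of the query."""
    q = int(query, 2)
    return (int(stored, 2) & q) == q


def search_target_locators(query_locator: str, id_bits: int) -> list[str]:
    """Assumed expansion: keep the frame prefix, allow extra 1 bits after it."""
    prefix, sig_part = query_locator[:id_bits], query_locator[id_bits:]
    choices = ["1" if bit == "1" else "01" for bit in sig_part]
    return [prefix + "".join(bits) for bits in product(*choices)]


def collect(nodes: dict[str, list[Entry]], query_frames: list[str],
            n: int, id_bits: int) -> set[tuple[int, int]]:
    """Return the (node_id, local_id) pairs that survive every query frame."""
    candidates: Optional[set[tuple[int, int]]] = None
    for frame_no, q_sig in enumerate(query_frames):
        q_loc = (format(frame_no, f"0{id_bits}b") + q_sig)[:n]
        partial: set[tuple[int, int]] = set()
        # Steps 1-5: visit each search target locator and take the union.
        for loc in search_target_locators(q_loc, id_bits):
            for node_id, local_id, f_no, f_sig in nodes.get(loc, []):
                if f_no == frame_no and covers(f_sig, q_sig):
                    partial.add((node_id, local_id))
        # Steps 6-7: intersect with the candidates carried so far.
        candidates = partial if candidates is None else candidates & partial
    # Step 8: the final candidate set goes back to the searching node.
    return candidates or set()
```

In the real protocol the candidate set travels from node to node inside the query message; the loop above only reproduces the set arithmetic.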

In this way, the search is executed at every node in which an index entry for a given object is registered.

At each node determined by a query locator, the first through fourth query entry generation steps are executed to determine the next query locator.

The present invention can also be specified as an object registration/retrieval program for a P2P environment. The program is used to cause the computers of up to 2^n nodes, each connected to a network and having a node ID identifying itself, to register objects by a common registration procedure using signatures and to retrieve a desired object from the registered objects by a common search procedure using signatures. Specified in this way, the program is configured to execute the following registration procedure and search procedure. The registration procedure consists of: a first registration step of converting one or more feature quantities of the object to be registered into an object signature consisting of a bit string of length m (m is a positive integer of 1 or more); a second registration step of dividing the m-bit string of the object signature into p divided frames, each consisting of a frame signature that is a bit string of length m/p (p is a positive integer of 1 or more); a third registration step of combining a bit string identifying each of the p divided frames with the m/p-bit string of the corresponding frame signature to create a composite signature for each divided frame; a fourth registration step of generating p locators, each a bit string of length n, from the bit strings of the composite signatures; a fifth registration step of creating, for each of the p locators, an index entry consisting of the node ID of the node performing the registration, a local ID identifying the object being registered within that node, data identifying the divided frame, and the frame signature; and a sixth registration step of registering each index entry in the one or more nodes designated by its locator.

The search procedure consists of a first search step and a second search step. The first search step executes, at the node performing the search: a first query entry generation step of converting one or more feature quantities of the object to be retrieved into a query signature consisting of a bit string of length m (m is a positive integer of 1 or more); a second query entry generation step of dividing the m-bit string of the query signature into p divided frames, each consisting of a query frame signature that is a bit string of length m/p (p is a positive integer of 1 or more); a third query entry generation step of combining a bit string identifying each of the p divided frames with the m/p-bit string of the corresponding query frame signature to create a composite signature for each divided frame; and a fourth query entry generation step of generating p query locators, each a bit string of length n, from the bit strings of the composite signatures. The second search step collects index entries satisfying the conditions of the query frame signatures at the one or more nodes determined from the p query locators, and the node that finishes the collection sends all of the collected index entries satisfying those conditions to the node performing the search.

The present invention provides a method for registering and retrieving objects using signature information distributed in a P2P environment. Increasing the number of divided frames reduces the number of messages, the total amount of data transferred, and the response time during search. Increasing the number of divided frames does increase the number of messages when data is added, but because the additions can be processed in parallel, the response time does not change significantly. Furthermore, even when the ratio of searches to additions is taken into account, processing efficiency can be optimized by changing the number of divided frames. In addition, with respect to search accuracy when offline nodes exist, a certain level of accuracy can be maintained even in the presence of offline nodes, particularly when the number of divided frames is large.

An example of an embodiment of the present invention is described in detail below with reference to the drawings. The description takes as its concrete target the case where the Chord framework is used as the basic architecture. Substring matching against object names is shown as an example of feature search using signatures. In addition, simulation experiments are used to evaluate the number of messages and the response time required for object search and object addition, the number of messages when searches and additions are mixed, and the search accuracy when offline nodes exist.

Before the embodiment of the present invention is described, conventional search methods are briefly described for comparison. These methods introduce a structure for key-based search onto a P2P environment and perform appropriate routing to achieve efficient search. Known methods of this kind include Chord, P-Grid, CAN, and Tapestry. In these methods, however, search is limited to lookups based on an object ID used as a key, so flexible search by object feature quantities and the like is not possible.

The method of the present invention uses signatures to realize object search in a P2P environment. There is also research on using parallel processing to make signature matching more efficient (Zheng Lin, "Concurrent Frame Signature Files", Kluwer Academic Publishers, Distributed and Parallel Databases, Vol. 1, pp. 231-249, 1993). This known method divides the signatures, assigns the parts evenly to all computers (nodes), and improves efficiency by performing the matching in parallel. However, because signature matching takes place on every node and the division and assignment of signatures are static, this known method is very difficult to adapt to a P2P environment.

In the present invention, by contrast, the way signatures are divided and placed keeps the number of nodes that must perform signature matching small. Signatures can also be relocated dynamically when nodes are added or deleted.

Because the embodiment of the present invention adopts the Chord architecture, the Chord architecture is described first. As shown in FIG. 1, in the Chord architecture the entire network space is defined as a circular virtual space called the ID circle (Identifier Circle), and all nodes and all data objects are placed on this ID circle. The size of the Chord network space is expressed by a value called scale, and the network consists of up to 2^scale nodes, that is, 2^n nodes (n is a positive integer). Chord uses a hash function to place nodes and data objects evenly on the ID circle. The hash function gives each node a scale-bit node ID (nid) and each data object a scale-bit object ID (oid). Each node is placed on the ID circle according to its node ID. In addition, so that a data object with any object ID can be assigned to one of the nodes present on the ID circle, each node has an ID set.

Let n(id) be the node ID of node n, and let n(pre)id be the node ID of node n(pre), the node located immediately before node n (taking the clockwise direction as forward). The ID set of node n is then the set of IDs contained in the following section interval(id).

interval(id) = (n(pre)id, n(id)]

Each data object is assigned to the node whose ID set contains the object's object ID.
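The assignment rule can be sketched in a few lines. The code below is an illustration under assumptions: node IDs are plain integers on a circle of size 2^scale, the function names are hypothetical, and the ten node IDs in the usage note are arbitrary example values.

```python
from bisect import bisect_left


def in_id_set(oid: int, pre_id: int, nid: int, scale: int) -> bool:
    """True if oid lies in the half-open section (pre_id, nid] on the ID circle."""
    size = 2 ** scale
    offset = (oid - pre_id) % size or size   # clockwise distance from pre_id, 1..size
    span = (nid - pre_id) % size or size     # clockwise length of the section
    return offset <= span


def responsible_node(node_ids: list[int], oid: int, scale: int) -> int:
    """Node whose ID set contains oid: the first node ID at or after oid, clockwise."""
    ring = sorted(node_ids)
    i = bisect_left(ring, oid % 2 ** scale)  # first node ID >= oid
    return ring[i % len(ring)]               # wrap past the largest node ID
```

With scale = 6 and nodes `[1, 8, 14, 21, 32, 38, 42, 48, 51, 56]`, object ID 54 is assigned to node 56, and object ID 57 wraps around the circle to node 1.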

Each node holds a list of the object IDs actually assigned to it, routing information, and information on the nodes located immediately after and before it (successor and predecessor). The routing information identifies the node to which a query should be forwarded (sent) next when the object with the requested object ID does not exist at the node. There are scale entries of this information; each entry indicates the node (called the successor node) to which a search request should be forwarded next when searching for an object whose object ID falls within the corresponding interval. Here, an interval is the range defined by the following equation.

The size of each interval is thus 2^(k−1), increasing in clockwise order. By holding this routing information, each node covers the entire ID circle and can route any query appropriately.
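
The following sketch assumes the standard Chord definition of the routing intervals, in which the k-th interval of a node with ID nid is [nid + 2^(k−1), nid + 2^k) modulo 2^scale for k = 1, ..., scale. That definition is an assumption here, although it agrees with the interval sizes stated above and with the intervals [4, 6) and [14, 6) quoted for FIGS. 1 and 2. The node IDs 0, 2, and 6 and the function names are hypothetical.

```python
SCALE = 4   # identifier length in bits (assumed)


def intervals(nid, scale=SCALE):
    """Return the intervals [start, end) of node nid for k = 1..scale."""
    ring = 1 << scale
    return [((nid + (1 << (k - 1))) % ring, (nid + (1 << k)) % ring)
            for k in range(1, scale + 1)]


def successor(ident, node_ids):
    """First node reached when moving clockwise from ident (inclusive)."""
    ordered = sorted(node_ids)
    return next((n for n in ordered if n >= ident), ordered[0])


def routing_table(nid, node_ids, scale=SCALE):
    """One (start, end, successor node) row per interval."""
    return [(start, end, successor(start, node_ids))
            for start, end in intervals(nid, scale)]
```

For a node with ID 2 this yields the intervals [3, 4), [4, 6), [6, 10), and [10, 2), whose sizes are 1, 2, 4, and 8. Together with the node itself they cover all 16 identifiers.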

The processing performed when a node is added to or deleted from Chord is as follows. When a new node is added, the routing information and the successor/predecessor information of the affected nodes, including the new node itself, are updated, and the objects held by the successor (the next node) are relocated. When a node is deleted, the objects it holds are relocated to its successor, and the routing information and successor/predecessor information of the affected nodes are updated.

Next, search processing in Chord is described in more detail. Chord considers only searches based on object IDs. A query therefore first specifies the object ID to be obtained. A query can be started from any node. When a query is passed to a node, the node checks its list of assigned object IDs to determine whether the data object with the queried object ID exists at that node. If it does, the node returns the data object and the search ends. If it does not, the node consults its routing information to decide which node to forward to next and sends the query to that node. Repeating this determination process realizes a search by object ID. Assuming N = 2^scale nodes, the search space is halved each time the query is forwarded, so the number of messages required to find a data object is O(log N).

In FIG. 1, the numbers written inside the ellipses are object IDs, and the bits written beside the nodes, which are marked with circles, are node IDs. The frame on the right side of FIG. 1 shows that the object ID list contains 2. It also shows the relationship between each interval (given by the first and last object IDs of the range to be searched, where the former is included and the latter is not) and its successor (next node), as well as the successor (next node) and predecessor (previous node) of the node. This shows, for example, that for the interval [4, 6), that is, when the requested object ID is '4' or '5', the successor (next node) is node c, and that when the object ID is '10' to '16' or '1', the successor (next node) is node a.

A specific search example is described with reference to FIG. 2. Consider a query that searches for the object whose object ID is '2', and assume that the query starts at node c. Node c holds no object with the queried object ID, so it forwards the query to node a, the successor for [14, 6), the interval in its routing information that contains '2'. Likewise, node a holds no object with the queried object ID, so it forwards the query to node b, the successor for [2, 4), the interval in its routing information that contains '2'. Node b holds a data object matching the query condition, so it returns the result to the starting node c and the search ends. As described above, Chord enables efficient search when the search uses object IDs. In addition, because the ID sets of the nodes contain almost the same number of elements, the load is equalized. Node addition and deletion are also supported. The three features mentioned earlier are therefore realized.
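
The walk from node c to node a to node b can be reproduced with a small simulation. It is a sketch under assumptions: the node IDs 0, 2, and 6 for nodes a, b, and c are hypothetical values chosen so that the intervals [14, 6) and [2, 4) appear as in the example, and the routing intervals follow the standard Chord definition rather than an equation given in this text.

```python
SCALE = 4
RING = 1 << SCALE
NODE_IDS = [0, 2, 6]                 # hypothetical IDs for nodes a, b, c
NAMES = {0: "a", 2: "b", 6: "c"}


def successor(ident):
    """First node reached when moving clockwise from ident (inclusive)."""
    ordered = sorted(NODE_IDS)
    return next((n for n in ordered if n >= ident % RING), ordered[0])


def predecessor(nid):
    ordered = sorted(NODE_IDS)
    return ordered[ordered.index(nid) - 1]


def owns(node, oid):
    """True if oid is in the ID set (predecessor ID, node ID] of node."""
    pre = predecessor(node)
    return 0 < (oid - pre) % RING <= (node - pre) % RING


def lookup(oid, start_node):
    """Return the list of nodes visited while resolving oid."""
    path, node = [start_node], start_node
    while not owns(node, oid):
        for k in range(1, SCALE + 1):
            start = node + (1 << (k - 1))
            if (oid - start) % RING < (1 << (k - 1)):   # oid in the k-th interval
                node = successor(start)
                break
        path.append(node)
    return path
```

Calling `lookup(2, 6)` visits the nodes with IDs 6, 0, and 2 in that order, which corresponds to c, a, and b.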

Next, the signatures used in the present invention are described. A signature is a fixed-length bit string generated from an individual data object and represents the features of the object. Converting the features of an object into the simple representation of a bit string makes it easy to determine whether a particular feature is present, and enables object search based on a variety of features.

First, the object signature generated from each data object is described. FIG. 3 shows an example in which the well-known trigrams generated from the name of a data object, such as a file name, are used as the features (the characters of the name are taken three at a time, and each group is a feature). The object signature, which is a bit string, is generated by the well-known superimposed coding technique. Each feature is hashed to generate an element signature of the signature length (the bit-string representations of pur, ure, reP, eP2, and P2P). The bitwise logical OR of the resulting element signatures is then taken, and the resulting bit string is the object signature. An object signature generated in this way represents all the features of the data object, expressed by the positions of the bits set to '1'. The number of bits set to '1' is called the weight of the signature.
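
A minimal sketch of superimposed coding over trigrams follows. The hash function (SHA-256), the 16-bit signature length, and the choice of two bits per element signature are assumptions made for illustration; the text does not specify them, and the function names are hypothetical.

```python
import hashlib

SIG_LEN = 16          # signature length in bits (assumed; FIG. 7 uses a 16-bit example)
BITS_PER_FEATURE = 2  # bits set per element signature (assumed design parameter)


def trigrams(name):
    """Every run of three consecutive characters in name."""
    return [name[i:i + 3] for i in range(len(name) - 2)]


def element_signature(feature, sig_len=SIG_LEN, bits=BITS_PER_FEATURE):
    """Hash one feature to a sig_len-bit integer with at most `bits` bits set."""
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    sig = 0
    for byte in digest[:bits]:
        sig |= 1 << (byte % sig_len)
    return sig


def object_signature(name):
    """Bitwise OR of the element signatures of all trigrams of name."""
    sig = 0
    for feature in trigrams(name):
        sig |= element_signature(feature)
    return sig


def weight(sig):
    """Number of bits set to 1 in the signature."""
    return bin(sig).count("1")
```

For the name pureP2P, `trigrams` returns pur, ure, reP, eP2, and P2P, and every element signature is contained in the resulting object signature.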

For a query, as shown in FIG. 4, a query signature is created in the same way as the object signature of FIG. 3. The example of FIG. 4 creates a query signature for the case where the name of the object to be searched for is pure. In this example, pure is decomposed into trigrams, and the query signature is created by taking the logical OR of the element signatures of the features. The positions of the bits set to '1' in the query signature represent all the features of the query. In object search, checking whether every bit position that is '1' in the query signature is also '1' in the object signature determines whether the object may contain all the features of the query. Accordingly, when the bitwise logical AND of the object signature and the query signature equals the query signature, the object may contain all the features of the query and becomes a candidate solution satisfying the query condition. Such a candidate solution is called a drop; a drop that is actually a correct answer is called an actual drop, and one that is not is called a false drop. The process of making this determination is called false drop resolution. The probability that a candidate solution is a false drop is called the false drop probability Fd and is given by the following equation. The false drop probability is a measure of the search accuracy of signatures.

Fd = Num_FalseDrop / Num_UnqualifiedDataObject

In the above equation, Num_FalseDrop is the number of false drops, and Num_UnqualifiedDataObject is the number of data objects that do not satisfy the query condition. Thus, when trigrams are taken as the features, using signatures makes it possible to realize partial-match search of data object names by an arbitrary character string.
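
The matching test and false drop resolution can be sketched as below. The signatures are hand-picked 8-bit integers rather than hashed values so that the outcome is easy to follow, the names are hypothetical, and Fd is assumed to be the ratio of the two quantities named above.

```python
def matches(object_sig, query_sig):
    """True if every bit set in the query signature is also set in the object signature."""
    return (object_sig & query_sig) == query_sig


def resolve(objects, query_sig, query_text):
    """Split the drops into actual drops and false drops.

    objects maps an object name to its signature. False drop resolution is done
    here by checking the name itself for the query text.
    """
    drops = [name for name, sig in objects.items() if matches(sig, query_sig)]
    actual = [name for name in drops if query_text in name]
    false = [name for name in drops if query_text not in name]
    return actual, false


def false_drop_probability(num_false_drop, num_unqualified):
    """Fd as the ratio of false drops to unqualified data objects (assumed form)."""
    return num_false_drop / num_unqualified if num_unqualified else 0.0


# Hand-picked example signatures (illustrative values only).
OBJECTS = {"pureP2P": 0b10110110, "chord": 0b00110110, "grid": 0b10000001}
QUERY_SIG = 0b00100110
```

In this example the object named chord passes the signature test without containing the query text, so it is a false drop. One of the two unqualified objects is a false drop, which gives Fd = 0.5.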

In the present invention, signature information is distributed over the nodes of a P2P network in order to realize object search based on a variety of features. The simplest distribution method is to distribute each object signature as a single unit. With this method, adding an object can be processed efficiently, but a search may need to refer to the many signatures registered at each node, and matching at a large number of nodes is difficult to avoid. Another distribution method is to divide each object signature into several parts and distribute the partial signatures as units. In this case, matching only needs to refer to the partial signatures that contain bit positions set to '1' in the query signature, so search processing becomes more efficient. However, adding an object requires placing each of its partial signatures. The present invention therefore combines the distribution method based on whole object signatures with the distribution method based on partial signatures, realizing a method that can be configured to suit the frequency of object searches and object additions in the environment of use. The method of the present invention is built on top of a framework that realizes efficient data retrieval on a P2P network, and it can be implemented with any such framework, including Chord and P-Grid mentioned earlier. The present embodiment uses Chord.

FIG. 5 is a flowchart showing the algorithm of the registration process in the program used to carry out the embodiment of the method of the present invention, and FIG. 6 is a flowchart showing the algorithm of the search process in that program. An example embodiment of the method and program of the present invention is described below by following the steps of FIGS. 5 and 6. The program implementing the algorithms of FIGS. 5 and 6 is installed on the computers of all of the at most 2^n nodes (n is a positive integer) that are connected to the network in the P2P environment, each having a node ID identifying itself. The functions shown in FIG. 5 have the following meanings. len() returns the length of the array or bit string given as its argument. value() returns the value of its argument converted to decimal representation. binary() returns the value of its argument converted to binary representation. concatenate() returns the bit string obtained by concatenating its first and second arguments. prefix() returns, for the bit string given as its first argument, the bits from the beginning up to the position given by its second argument. zeropadding() pads the bit string given as its first argument with '0' until its length equals the second argument and returns the resulting bit string. In FIG. 6, check() returns, from the index entry set given as its first argument, the set of index entries that satisfy the following two conditions 1 and 2.

1. Frame number of each entry = value of the first argument
2. Frame signature of each entry Λ signature of the third argument = signature of the third argument
(Λ denotes bitwise AND)

next() calculates, from the frame signature part of the locator given as its argument, a frame signature nextsig that satisfies the following two conditions 1 and 2, and generates and returns a new locator.

1. nextsig Λ frame signature part = frame signature part
2. nextsig has the smallest value(nextsig) satisfying value(nextsig) > value(frame signature part)

In the registration process, as shown in FIG. 5, in step ST1 the user places the data object to be made searchable at an arbitrary node, so that the data object addition process is executed at the node performing the registration work. Then, in accordance with steps ST2 to ST15, index entries containing signature information are distributed to predetermined nodes. Index entry placement and search both use the Chord framework. Because each data object is managed node by node, the pair consisting of the ID of the data object within its node and the ID of the node storing it serves as the key that uniquely identifies the data object. To distinguish clearly between the Chord object ID (oid) and the ID of a data object in the present invention, the ID of a data object within a node is called the local ID (lid) in the present embodiment.

Steps ST2 and ST3 perform the preprocessing for distributing the signature information. First, in step ST2, an object signature is generated by the technique described with reference to FIG. 3. Then, in step ST3, the object signature generated from the data object is divided into slice frame signatures, where slice is the number of divided frames, as shown in FIG. 7 (second registration step) (see Zheng Lin and Christos Faloutsos, "Frame-Sliced Signature Files," IEEE TKDE, Vol. 4, No. 3, pp. 281-289, 1992). The number of divided frames p is a positive integer of 1 or more. In the example of FIG. 7, the number of divided frames is 4; for a 16-bit data signature or object signature (a query signature in the case of search), four divided frames (f0 to f3) are generated, each consisting of a frame signature that is a 4-bit bit string. When the number of divided frames is 1, the frame signature coincides with the object signature. In particular, the case where slice equals the signature length is called the bit-slice configuration.

Next, in step ST4, the frame signatures are numbered. Then, in steps ST5 to ST12, the bit string obtained by converting each divided frame number to binary representation is concatenated with the frame signature of that divided frame to obtain a composite signature (third registration step). In the present embodiment, p locators, each scale bits long (n = 4 bits), are generated from the composite signatures. The p locators are used to place index entries on the ID circle and to perform search processing. Locators are generated as follows. First, as shown by i + fi in the upper right part of FIG. 7, the bit string identifying each divided frame is combined with the bit string of length m/p constituting its frame signature to create the composite signature corresponding to that divided frame (step ST7). A locator is then generated from at most the leading n bits of this composite signature (steps ST8 to ST12). In step ST5, it is determined whether all divided frames have been processed; if so, the process ends. If not, the process proceeds to step ST6. If the bit string constituting the frame signature consists entirely of 0s, the process proceeds to step ST15 and the next divided frame is processed. If the bit string contains at least one 1, the process proceeds to step ST7 and the composite signature is generated. Steps ST8 to ST12 are then executed according to the length of the generated composite signature to generate the locator. Steps ST8 to ST11 perform the following processing.

[Case 1] When the composite signature length equals scale (4 in the case of FIG. 7), the composite signature itself is used as the locator (steps ST8, ST10, ST12).

[Case 2] When the composite signature length is greater than scale (greater than 4 in the case of FIG. 7), the prefix consisting of the first scale bits of the bit string (the first 4 bits in the case of FIG. 7) is used as the locator (steps ST8, ST9).

[Case 3] When the composite signature length is less than scale (less than 4 in the case of FIG. 7), '0's are added to the composite signature until its length is scale (4 bits in the case of FIG. 7), and the result is used as the locator (steps ST8, ST10, ST11).
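
The frame division and the three locator cases can be sketched with bit strings as follows. Two details are assumptions not stated in the text: the frame number is encoded with a fixed width of just enough bits to number all frames, and zero padding is applied on the right. The function names are hypothetical.

```python
SCALE = 4  # locator length in bits (n = 4 in the example of FIG. 7)


def split_frames(signature, slices):
    """Divide an m-bit signature string into `slices` frame signatures of m/slices bits."""
    if len(signature) % slices:
        raise ValueError("signature length must be a multiple of the frame count")
    width = len(signature) // slices
    return [signature[i * width:(i + 1) * width] for i in range(slices)]


def frame_number_bits(i, slices):
    """Binary frame number, fixed width (assumed encoding); empty when slices == 1."""
    width = (slices - 1).bit_length()
    return format(i, "b").zfill(width) if width else ""


def locator(i, frame_sig, slices, scale=SCALE):
    """Locator for frame i, or None when the frame signature is all zeros (ST6)."""
    if "1" not in frame_sig:
        return None
    composite = frame_number_bits(i, slices) + frame_sig   # composite signature (ST7)
    if len(composite) > scale:
        return composite[:scale]                           # Case 2: prefix
    return composite.ljust(scale, "0")                     # Cases 1 and 3: pad if short
```

For the 16-bit signature 0110000010010001 divided into four frames, the locators are 0001, none, 1010, and 1100. The second frame is all zeros and therefore produces no locator.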

With this locator generation method, locators can be generated for any number of divided frames and any signature length (fourth registration step). Next, the process proceeds to step ST13, where index entries are generated. In step ST13, an index entry consisting of the node ID of the node performing the registration work, the local ID identifying the object being registered within that node, the data identifying the divided frame, and the frame signature is created for each corresponding locator (fifth registration step). Then, in step ST14, each index entry is placed, that is, registered, at the one or more nodes designated by its locator (sixth registration step). Placing index entries by means of such locators allows the index entries to be distributed evenly over the nodes. The order of locators at node ni is defined as follows. Letting nid be the node ID of node ni and loc be a locator, the following ord is calculated.

The order of locators is the ascending order of ord in the above equation.

Next, the generation of index entries and the method of placing them are described in more detail. An index entry is generated for each frame signature, except that no index entry is generated for a frame signature whose bits are all '0'. An index entry consists of the local ID of the data object, the ID of the node storing it, the frame number, and the frame signature.

The method of placing index entries when a data object is added at a node is as follows. In the present embodiment, index entries are placed in parallel to shorten the response time of the placement process. First, a locator is generated from the frame number and frame signature in each index entry. Then, as shown in FIG. 8, each index entry is placed at the appropriate node in accordance with the Chord algorithm, treating its locator as an object ID. The index entries to be placed are not processed one by one. Instead, based on the routing information of the node at which the data object was added, the index entries are first grouped by the node to which they should be forwarded next. Each resulting set of index entries is then sent to the corresponding node. The receiving nodes repeat the same processing until the index entries are placed. This placement method minimizes the number of index entry placement messages that each node handles. In the following, the set of index entries placed under a locator Loc is called the placement index entry set of Loc.
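
The grouping of index entries by next hop can be sketched as follows. The ring of three nodes with IDs 0, 2, and 6, the locator values, and all names are hypothetical, and the routing intervals again follow the standard Chord definition as an assumption.

```python
from collections import defaultdict
from typing import NamedTuple

SCALE = 4
RING = 1 << SCALE


class IndexEntry(NamedTuple):
    nid: int        # ID of the node that stores the data object
    lid: int        # local ID of the data object within that node
    frame_no: int   # frame number
    frame_sig: str  # frame signature as a bit string


def next_hop(node, loc, node_ids):
    """Node that should receive an entry with locator loc; node itself if it owns loc."""
    ordered = sorted(node_ids)
    pre = ordered[ordered.index(node) - 1]
    if len(ordered) == 1 or 0 < (loc - pre) % RING <= (node - pre) % RING:
        return node
    for k in range(1, SCALE + 1):
        start = node + (1 << (k - 1))
        if (loc - start) % RING < (1 << (k - 1)):       # loc lies in the k-th interval
            return next((n for n in ordered if n >= start % RING), ordered[0])


def group_by_next_hop(node, located_entries, node_ids):
    """Group (locator, entry) pairs by the node each entry is forwarded to next."""
    groups = defaultdict(list)
    for loc, entry in located_entries:
        groups[next_hop(node, loc, node_ids)].append(entry)
    return dict(groups)
```

With locators 1, 10, and 12 generated at the node with ID 6, all three entries travel to node 0 in a single message. Node 0 then keeps the entries for locators 10 and 12 and forwards only the entry for locator 1 to node 2.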

Next, the method of searching index entries in order to find objects is described with reference to FIG. 6. In object search as well, a query signature, frame signatures, composite signatures, and locators are generated in turn from the features given as the query condition. However, no locator is generated for a frame signature consisting entirely of '0's. These operations are realized by steps ST21 to ST31 of FIG. 6 (first search step). Steps ST21 to ST31 are substantially the same as steps ST1 to ST13 of FIG. 5. Specifically, at the node performing the search work, one or more features of the object to be searched for are converted into a query signature consisting of a bit string of length m (m is a positive integer) (first query entry generation step, ST21). The length-m bit string of the query signature is then divided into p divided frames, each consisting of a query frame signature that is a bit string of length m/p (p is a positive integer of 1 or more) (second query entry generation step, ST22). Next, the bit string identifying each of the p divided frames is combined with the length-m/p bit string constituting its query frame signature to create the composite signature corresponding to that divided frame (third query entry generation step, ST26). From the bit strings constituting these composite signatures, p query locators, each consisting of a bit string of n bits, are generated (fourth query entry generation step, ST27 to ST31). The first search step is initially executed at the node performing the search work. Thereafter, steps ST24 to ST31 (the first to fourth query entry generation steps described above) are executed at each node determined by the preceding query locator, and the next query locator is determined.

In step ST32, the set of intermediate results is initialized at the first node; that is, its storage area is emptied. In the subsequent steps ST33 to ST41, index entry sets are acquired and narrowed down as follows. First, all search target locators, which determine the nodes to be searched, are obtained from the first of the p query locators to generate a search target locator set (steps ST36 and ST37), and query data containing the node ID of the node performing the search work and all query frame signatures (that is, the partial query entries) is sent to the node determined by the first search target locator in the set (step ST33: first step). Next, the process proceeds to step ST34, where, at this first node, all index entries satisfying the condition of the query frame signature corresponding to the first search target locator are acquired to form a partial index entry candidate set (second step). The query data and the index entry candidates are then sent to the next node, determined by the next search target locator (steps ST36 and ST37: third step). At that node, all index entries satisfying the condition of the query frame signature corresponding to that search target locator are acquired (step ST33), and the union of this index entry set and the partial index entry candidate set is taken to form the next partial index entry candidate set (steps ST34 and ST35: fourth step). The third and fourth steps are repeated until the last search target locator is reached, and the final partial index entry candidate set becomes the index entry candidate set (fifth step). These first to fifth steps are executed in steps ST33 to ST37 of FIG. 6. In step ST36, because the locators are calculated one after another, the termination condition is that the frame signature part of the locator has become all '1's.

Then, in step ST38, it is determined whether the query locator just processed was the first one. If it was, the process proceeds to step ST39, and the collected index entry candidates are sent as they are to the node determined by the next query locator (steps ST40, ST42, ST24 to ST31).

Next, a search target locator set is generated from the next of the p query locators, and the query data and the index entry candidate set are sent to the next node, determined by the first search target locator of that set (steps ST39 and ST40: sixth step). Starting from that node, the second, third, fourth, and fifth steps described above are repeated until the last search target locator of the set is reached; the intersection (step ST41) of the resulting index entry candidate set and the index entry candidate set held at the start of this step is taken to form the next index entry candidate set (seventh step: steps ST24 to ST38 and ST41). The sixth and seventh steps are repeated until the last query locator is reached, and the final result is sent as the solution to the node performing the search work (eighth step: steps ST24 and ST43).

Each partial query entry described above consists of a locator, a frame number, and a frame signature. A query is processed by matching these partial query entries one after another. This is done by visiting, in clockwise order, the nodes that hold the index entries to be consulted for the query. Because the matching narrows down the solutions step by step, the amount of intermediate-result data transferred is expected to be small.
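
The collection and narrowing of candidates can be sketched on a single machine, ignoring which node holds which locator. The sketch makes several simplifying assumptions: each frame signature is 2 bits wide, the composite signature is exactly as long as the locator (Case 1), signatures are integers, and the next frame signature is found by enumerating the supersets of the query frame signature in ascending order, which is one reading of next(). All names are hypothetical.

```python
from typing import NamedTuple

FRAME_BITS = 2   # bits per frame signature (assumed)


class IndexEntry(NamedTuple):
    nid: int
    lid: int
    frame_no: int
    frame_sig: int


def make_locator(frame_no, frame_sig):
    """Composite of frame number and frame signature, used directly as the locator."""
    return (frame_no << FRAME_BITS) | frame_sig


def register(index, nid, lid, frame_sigs):
    """Store one index entry per non-zero frame signature under its locator."""
    for i, sig in enumerate(frame_sigs):
        if sig:
            index.setdefault(make_locator(i, sig), []).append(IndexEntry(nid, lid, i, sig))


def target_signatures(query_sig):
    """All supersets of query_sig in ascending order, ending at the all-ones signature."""
    sig, full = query_sig, (1 << FRAME_BITS) - 1
    while True:
        yield sig
        if sig == full:
            return
        sig = (sig + 1) | query_sig


def search(index, query_frame_sigs):
    """Union within each query frame, intersection across query frames."""
    result = None
    for i, q in enumerate(query_frame_sigs):
        if q == 0:
            continue                       # no locator for an all-zero frame
        partial = set()
        for sig in target_signatures(q):
            for e in index.get(make_locator(i, sig), []):
                if e.frame_no == i and (e.frame_sig & q) == q:
                    partial.add((e.nid, e.lid))
        result = partial if result is None else result & partial
    return result if result is not None else set()
```

The returned pairs of node ID and local ID are candidates only; false drop resolution against the actual data objects would still be needed.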

FIG. 9 restates the first and second search steps above as a search algorithm. First, a set of partial query entries is generated from the query (line 1). The partial query entries are processed in ascending locator order, following the locator order at the node where the query originated (line 2). Next, a search target locator set is generated from the locator of the partial query entry (line 3). The search target locator set is the set of all bit strings obtained by replacing each '0' in the frame-signature part of the locator with either '0' or '1'. Every element of the set therefore satisfies

element ∧ locator = locator

where ∧ denotes bitwise AND. At this point no search has yet been performed for the partial query entry, so the candidate solution set is empty (line 4). The candidate solution set is a set whose elements are pairs of a node ID and a local ID. Next, for each element of the search target locator set, the current candidate solution set, the set of unprocessed partial query entries, and the set of unprocessed search target locators are sent to the node that holds the placed index entry set for that locator. Locators are selected from the search target locator set according to the locator order at that node. The node that receives these selects, from the placed index entry set it holds, the index entries that satisfy the following two conditions (line 6) and adds the node ID and local ID pair of each to the candidate solution set (line 7).

[Condition 1] Frame number in the partial query entry = frame number in the index entry.

[Condition 2] (Frame signature in the partial query entry) ∧ (frame signature in the index entry) = frame signature in the partial query entry.

When the candidate solution set has been obtained for every element of the search target locator set, processing of the partial query entry is complete. The intersection of the solution set Ans computed so far and the candidate solution set is then taken to narrow down the solutions (line 11). After this has been done for all partial query entries, Ans is returned to the starting node (line 12).
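
The matching and narrowing logic of FIG. 9 can be sketched in Python as follows. This is a minimal single-process illustration, not the patented implementation: the `IndexEntry` and `PartialQuery` types, the `placed` dictionary standing in for the index entries held across nodes, the `targets_of` callback, and the use of integers for frame signatures are all assumptions made for the example. Message routing between nodes is omitted, and locators are simply sorted as strings rather than in clockwise order from the originating node.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexEntry:
    nid: str        # node ID of the registering node
    lid: int        # local ID of the object at that node
    frame_no: int   # which divided frame this entry describes
    frame_sig: int  # frame signature as an integer bit mask

@dataclass(frozen=True)
class PartialQuery:
    locator: str    # locator bit string of this partial query entry
    frame_no: int
    frame_sig: int

def matches(q: PartialQuery, e: IndexEntry) -> bool:
    # Condition 1: same frame number.
    # Condition 2: every bit set in the query frame signature is set in the entry.
    return q.frame_no == e.frame_no and (q.frame_sig & e.frame_sig) == q.frame_sig

def search(queries, placed, targets_of):
    """placed: locator -> list of IndexEntry; targets_of: PartialQuery -> locators to visit."""
    ans = None  # assumption: the first partial query entry initialises Ans
    for q in sorted(queries, key=lambda q: q.locator):
        candidates = set()
        for loc in targets_of(q):
            for e in placed.get(loc, []):
                if matches(q, e):
                    candidates.add((e.nid, e.lid))
        # Narrow Ans by intersecting it with the candidates of this partial query.
        ans = candidates if ans is None else ans & candidates
    return ans if ans is not None else set()
```

Within one partial query entry the candidates from all visited locators are merged; across partial query entries the result is intersected, which is what shrinks the intermediate data carried from node to node.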

The query in FIG. 10 is used as an example. Suppose the query originates at node n4. Query processing follows the locator order at node n4 and starts with partial query entry d_0. The search target locator set d_0Set {'0010', '0011'} is generated from d_0, and node n4 checks whether it holds the placed index entry set for '0010'. It does not, so routing information is used to reach node n1, where that set resides. At this time the unprocessed partial query entry set {d_0, d_2}, the unprocessed search target locator set d_0Set, and the candidate solution set are sent to node n1. Node n1 holds the placed index entry sets for all locators in d_0Set ('0010' and '0011'), so it uses them to obtain the candidate solution set. All locators in d_0Set have now been processed, so processing of d_0 is complete.
Next, the search target locator set d_2Set {'1000', '1010', '1001', '1011'} is generated from the unprocessed partial query entry d_2, and node n1 checks whether it holds the placed index entry set for '1000'. It does not, so routing information is used to reach node n3, where that set resides. At this time the unprocessed partial query entry set {d_2}, the unprocessed search target locator set d_2Set, and the solution set Ans are sent to node n3. Node n3 holds the placed index entry sets for all locators in d_2Set, so processing of d_2 completes there. Ans is then recalculated. Signature matching ends at this stage, and the search result Ans is returned to the starting node n4.
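
The two search target locator sets in this example can be reproduced with a short Python sketch. It assumes, as the FIG. 10 values suggest, that each 4-bit locator consists of 2 frame-number bits followed by 2 frame-signature bits; the function name `search_target_locators` and the `frame_bits` parameter are hypothetical.

```python
from itertools import product

def search_target_locators(locator: str, frame_bits: int) -> list[str]:
    """Expand the frame-signature part of a locator: each '0' may be '0' or '1'."""
    prefix, sig = locator[:frame_bits], locator[frame_bits:]
    # A '1' bit must stay '1'; a '0' bit may take either value.
    choices = [("1",) if bit == "1" else ("0", "1") for bit in sig]
    return [prefix + "".join(bits) for bits in product(*choices)]

d0_set = search_target_locators("0010", frame_bits=2)  # ['0010', '0011']
d2_set = search_target_locators("1000", frame_bits=2)  # ['1000', '1001', '1010', '1011']

# Every element satisfies: element AND locator == locator.
assert all(int(e, 2) & int("1000", 2) == int("1000", 2) for e in d2_set)
```

The frame-number bits are left untouched, so all expanded locators still refer to the same divided frame.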

The starting node n4 retrieves the data objects corresponding to all elements of the search result Ans and finally performs false drop resolution to obtain the final answer to the query.
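
False drop resolution can be pictured as a final filter over the fetched objects. The sketch below is an illustration under assumptions: `fetch_features` is a hypothetical callback returning the actual feature values of the object identified by a (node ID, local ID) pair, and the query is assumed to ask for objects that contain every query feature.

```python
def resolve_false_drops(ans, query_features, fetch_features):
    """Keep only candidates whose real features include all query features."""
    wanted = set(query_features)
    # A candidate that passed signature matching but lacks a feature is a false drop.
    return {key for key in ans if wanted <= set(fetch_features(key))}
```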

Experiments evaluating the above embodiment are described below. The method of this embodiment was evaluated by simulation. The experiments measure: 1) the number of messages, the total data transfer amount, and the response time required for a search; 2) the number of messages and the response time required to update index entries when an object is added; 3) the number of messages when searches and additions are mixed; and 4) the search accuracy when offline nodes exist. As described earlier, there is a trade-off between search cost and addition cost, and the number of divided frames is a parameter that strongly affects both. Confirming this relationship is the main purpose of Experiments 1), 2), and 3). One characteristic of signature-based search is that the target objects can be narrowed down without referring to all frame signatures. Because of this, even if a node holding a required index entry is offline, narrowing can still be performed using only the index entries held by the remaining online nodes. Confirming this is the purpose of Experiment 4). Tables 1 and 2 show the main parameters used in the experiments.

Experiments are run for the three combinations of signature length and weight shown in Table 2. When a data object has 164 feature values, the false drop probability for a query based on one feature value is about 5% in all three cases. For response time measurements, the time needed to transfer one message between nodes is taken as the unit time. Other processing times are ignored because they are considered very small compared with the message transfer time.

[Object search]
First, the number of messages, the total data transfer amount, and the response time of a search are evaluated. The experiment varies the number of divided frames, slice, and the signature length F. The number of nodes is 128, and the number of query feature values is varied over 2, 4, and 6. As already described, the search cost is expected to decrease as the number of divided frames increases. When the signature length is increased, the weight of the query signature becomes smaller, so fewer frame signatures need to be referenced and the search cost is also expected to fall. Conversely, when the number of query feature values is increased, the weight of the query signature increases and the search cost is expected to rise. Each measured value is the average over 50 executions of the query. For measuring the data transfer amount, the element sizes of an index entry are taken as 6 bytes for nid and 4 bytes for lid. The sizes of the frame number and the frame signature depend on the value of slice. If bit_slice is the number of bits needed to represent the value of slice, the frame number requires ⌈bit_slice/8⌉ bytes (⌈x⌉ denotes the smallest integer greater than or equal to x), and the frame signature requires ⌈F/bit*slice⌉ bytes.

FIG. 11 shows the average number of messages. The number of messages decreases as the number of divided frames increases. When the query signature is divided, frame signatures that need no matching (frame signatures consisting entirely of '0') appear with high probability, which reduces the number of messages. In addition, the frame signature becomes shorter, so the search target locator set has fewer elements and fewer nodes must be visited. For these reasons the number of messages drops substantially.
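
The first effect can be illustrated numerically. The sketch below assumes, purely for illustration, that the w set bits of a query signature fall uniformly at random among its F positions; under that assumption the probability that one particular frame of length F/slice is all '0' is C(F - F/slice, w) / C(F, w). The values F = 256 and w = 8 are hypothetical and are not taken from Table 2.

```python
from math import comb

def all_zero_frame_probability(F: int, w: int, slices: int) -> float:
    """Probability that a given frame contains none of the w set bits (uniform model)."""
    frame_len = F // slices
    return comb(F - frame_len, w) / comb(F, w)

# Print the probability for 2^0 .. 2^5 divided frames.
for k in range(6):
    slices = 2 ** k
    print(slices, round(all_zero_frame_probability(256, 8, slices), 3))
```

Under this model the probability is zero without division and rises steadily as the frames get shorter, which matches the direction of the trend reported for FIG. 11.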

The total data transfer amount shows a similar tendency (FIG. 12). When no frame division is performed (number of divided frames 2^0), the data transfer amount is small because a single index entry yields the entire object signature of a data object, so the solutions can be narrowed down immediately. The total data transfer amount is large when the number of divided frames is around 2^1 or slightly above because sequential traversal cannot narrow the solutions sufficiently, so the intermediate results grow and the number of messages needed for the search is also large.

Response time is considered next. The proposed method searches by visiting the nodes one after another and returns the result once all narrowing is complete. The response time therefore follows the same curve as the number of messages during a search (FIG. 11). Here too, a larger number of divided frames gives better results.

The experimental results show that, in terms of the number of messages during a search, efficiency improves as the number of divided frames increases. They also confirm that reducing the signature weight reduces the number of messages.

[Object addition]
Next, the number of messages needed to place index entries when a new data object is added, and the corresponding response time, are measured. The experiment varies the number of divided frames and the signature length F, with the number of nodes set to 128 and 256. Placement (that is, registration) follows the method described earlier. As already noted, the number of messages needed for an addition is expected to grow as the number of divided frames increases. As with object search, a longer signature is expected to require fewer messages for placing the index entries.

FIG. 13 shows the average number of messages. The number of messages grows as the number of divided frames increases, because more index entries must be newly placed. When the signature weight is reduced, the number of messages is comparatively small, because fewer index entries need to be placed.

Response time was also measured; FIG. 14 shows the results. Because index entries are placed in parallel, the response time rises only gradually as the number of index entries grows with the number of divided frames. The response time also hardly changes with signature length.

Thus, although the number of messages tends to become very large when the number of divided frames is increased, the response time does not differ much from the case without frame division.

[Mixed search and addition]
The average number of messages is now examined taking into account how often object searches and object additions occur. A search is assumed to occur with probability p and an addition with probability (1-p). In the experiment the number of nodes is 128, and the signature division and the signature length are varied. The average number of messages is the sum of the number of messages for a search and the number of messages for an addition, each multiplied by its probability of occurrence.
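
The weighted average defined here is simple to compute. In the sketch below the per-slice message counts used in the usage example are made-up placeholder numbers, not measurements from the experiment, and the function names are hypothetical.

```python
def average_messages(p: float, search_msgs: float, add_msgs: float) -> float:
    """Expected messages per operation when a search occurs with probability p."""
    return p * search_msgs + (1.0 - p) * add_msgs

def best_slice(p: float, search_by_slice: dict, add_by_slice: dict) -> int:
    """Return the number of divided frames that minimises the weighted average."""
    return min(
        search_by_slice,
        key=lambda s: average_messages(p, search_by_slice[s], add_by_slice[s]),
    )

# Placeholder counts: searches get cheaper and additions costlier as slice grows.
search_cost = {1: 100, 16: 20, 256: 10}
add_cost = {1: 2, 16: 30, 256: 400}
print(best_slice(0.9, search_cost, add_cost))
print(best_slice(0.99, search_cost, add_cost))
```

Because the two costs move in opposite directions, the minimising number of divided frames shifts upward as searches come to dominate the workload.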

FIG. 15 shows the results. Searches are generally considered to occur more often than additions, so p is varied over 0.7, 0.8, and 0.9. For p = 0.7, the number of messages is smallest when the number of divided frames is around 2^4. For p = 0.8, it is smallest around 2^5, and for p = 0.9 the number of messages stays almost constant at its minimum once the number of divided frames is 2^5 or more. The optimal number of divided frames thus depends on the ratio of searches to additions.

[Search accuracy when offline nodes exist]
Finally, search accuracy is examined under the assumption that a given proportion of nodes are offline and signature matching cannot be performed correctly at them. In the Chord framework, routing information can be updated dynamically even when offline nodes exist, so routing information is assumed to be always correctly available. This experiment calculates the false drop probability when a query is executed. When offline nodes exist, the frame signatures held by those nodes cannot be matched, so the number of false drops is expected to increase.

In the experiment, the number of query feature values is 1, and the false drop probability is obtained while varying the number of divided frames, slice, over 2^0, 2^4, and 2^8. The number of nodes is 128, and the proportion of nodes at which matching cannot be performed correctly is varied from 0% to a maximum of 50% in steps of 2.5%.

FIG. 16 shows the results. The false drop probability increases as the proportion of offline nodes grows. The increase is more gradual when the number of divided frames is larger. The likely reason is that a larger number of divided frames makes each frame signature smaller, so the portion of the signature that cannot be matched is also smaller. Thus, even when offline nodes exist, a certain level of search accuracy can be maintained by increasing the number of divided frames.

FIG. 1 is a diagram used to explain the Chord architecture.
FIG. 2 is a diagram used to explain search processing in Chord.
FIG. 3 is a diagram used to explain how an object signature is generated.
FIG. 4 is a diagram used to explain how a query signature is generated.
FIG. 5 is a flowchart showing the algorithm of the registration process in the program used to carry out an embodiment of the method of the present invention.
FIG. 6 is a flowchart showing the algorithm of the search process in the program.
FIG. 7 is a diagram used to explain how locators are generated.
FIG. 8 is a diagram used to explain the placement of index entries.
FIG. 9 is a diagram presenting the search algorithm in tabular form.
FIG. 10 is a diagram used to explain object search.
FIG. 11 is a diagram showing experimental results for the average number of messages during a search.
FIG. 12 is a diagram showing experimental results for the average total data transfer amount during a search.
FIG. 13 is a diagram showing experimental results for the average number of messages during placement.
FIG. 14 is a diagram showing experimental results for the average response time during placement.
FIG. 15 is a diagram showing experimental results for the number of messages when searches and additions are mixed.
FIG. 16 is a diagram showing experimental results for the false drop probability when offline nodes exist.

Claims

A method for registering and retrieving objects in a P2P environment, in which a desired object is retrieved, using a common search method that uses signatures, from among objects registered according to a common registration method that uses signatures in at most 2^n nodes (n is a positive integer), each of which is connected to a network and has a node ID identifying itself.
The registration method is:
A first registration step in which one or more feature quantities of an object to be registered is an object signature consisting of a bit string of length m (m is a positive integer of 1 or more);
A second registration step of dividing the bit string of length m of the object signature into p divided frames (p is a positive integer of 1 or more), each consisting of a frame signature that is a bit string of length m/p;
A third registration step of combining a bit string identifying each of the p divided frames with the bit string of length m/p constituting its frame signature to create a composite signature corresponding to each divided frame;
A fourth registration step of generating p locators comprising a bit string of length n from the bit string constituting the composite signature;
A node ID of the node that is performing the registration work, a local ID that identifies an object to be registered in the node that is performing the registration work, data that identifies the divided frame, and the frame signature A fifth registration step of creating an index entry corresponding to the p locators;
A sixth registration step of registering the index entry with one or more of the nodes specified by the locator;
The search method is:
A first search step of performing, at the node performing the search operation: a first query entry generation step of converting one or more feature quantities of an object to be searched for into a query signature consisting of a bit string of length m (m is a positive integer of 1 or more); a second query entry generation step of dividing the bit string of length m of the query signature into p divided frames (p is a positive integer of 1 or more), each consisting of a query frame signature that is a bit string of length m/p; a third query entry generation step of combining a bit string identifying each of the p divided frames with the bit string of length m/p constituting its query frame signature to create a composite signature corresponding to each divided frame; and a fourth query entry generation step of generating, from the bit strings constituting the composite signatures, p query locators each consisting of a bit string of length n; and
A second search step of collecting index entries that satisfy the conditions of the query frame signatures at one or more nodes determined based on the p query locators, and transmitting, from the node that collects index entries last, all the index entries collected at the one or more nodes that satisfy the conditions of the query frame signatures to the node performing the search operation.

2. The method for registering and retrieving objects in a P2P environment according to claim 1, wherein, in the fourth registration step, the locator specifying the registration destination node is generated using at most the n most significant bits of the bit string constituting the composite signature.

The second search step includes
Based on the first query locator among the p query locators, all search target locators that determine the node to be searched are obtained to generate a search target locator set, and a node ID of a node that performs the search operation and A first step of sending query data including all the query frame signatures to the node defined by the first search target locator of the search target locator set;
A second step of obtaining all index entries satisfying the condition of the query frame signature corresponding to the first search target locator as the partial index entry candidate set in the first node;
A third step of sending the query data and the index entry candidate to the next node determined by a next search target locator;
In the next node, all index entries that satisfy the conditions of the query frame signature corresponding to the next search target locator are acquired, and the union of the index entry set and the partial index entry candidate set is obtained. A fourth step in which the result is the next partial index entry candidate set;
A fifth step of repeating the third step and the fourth step until the last search target locator is reached, and setting the last partial index entry candidate set as an index entry candidate set;
A sixth step of generating the search target locator set from the next query locator among the p query locators, and sending the query data and the index entry candidate set to the next node determined by the first search target locator of that search target locator set;
A seventh step of repeating the second step, the third step, the fourth step, and the fifth step, starting from the next node, until the last search target locator of the search target locator set is reached, taking the intersection of the index entry candidate set obtained as a result and the index entry candidate set at the start of this step, and setting the result as the next index entry candidate set; and
An eighth step of repeating the sixth step and the seventh step until the last query locator is reached and transmitting the final result to the node that performs the search operation, in the method for registering and retrieving objects in the P2P environment.

3. The method for registering and retrieving objects in a P2P environment according to claim 2, wherein the first through fourth query entry generation steps are executed at each of the nodes determined by the query locators in order to determine the next query locator.

A program for registering and retrieving objects in a P2P environment, used to cause the computers of at most 2^n nodes, each connected to a network and having a node ID identifying itself, to register objects according to a common registration procedure that uses signatures and to retrieve a desired object from among the registered objects using a common search procedure that uses signatures.
The registration procedure includes:
A first registration step in which one or more feature quantities of an object to be registered is an object signature consisting of a bit string of length m (m is a positive integer of 1 or more);
A second registration step of dividing the bit string of length m of the object signature into p divided frames (p is a positive integer of 1 or more), each consisting of a frame signature that is a bit string of length m/p;
A third registration step of combining a bit string identifying each of the p divided frames with the bit string of length m/p constituting its frame signature to create a composite signature corresponding to each divided frame;
A fourth registration step of generating p locators comprising a bit string of length n from the bit string constituting the composite signature;
A node ID of the node that is performing the registration work, a local ID that identifies an object to be registered in the node that is performing the registration work, data that identifies the divided frame, and the frame signature A fifth registration step of creating an index entry corresponding to the p locators;
A sixth registration step of registering the index entry with one or more of the nodes specified by the locator;
The search procedure is
A first search step of performing, at the node performing the search operation: a first query entry generation step of converting one or more feature quantities of an object to be searched for into a query signature consisting of a bit string of length m (m is a positive integer of 1 or more); a second query entry generation step of dividing the bit string of length m of the query signature into p divided frames (p is a positive integer of 1 or more), each consisting of a query frame signature that is a bit string of length m/p; a third query entry generation step of combining a bit string identifying each of the p divided frames with the bit string of length m/p constituting its query frame signature to create a composite signature corresponding to each divided frame; and a fourth query entry generation step of generating, from the bit strings constituting the composite signatures, p query locators each consisting of a bit string of length n; and
A second search step of collecting index entries that satisfy the conditions of the query frame signatures at one or more nodes determined based on the p query locators, and transmitting, from the node that collects index entries last, all the index entries collected at the one or more nodes that satisfy the conditions of the query frame signatures to the node performing the search operation;
The program causing the computers to execute the registration procedure and the search procedure.

The second search step includes
Based on the first query locator among the p query locators, all search target locators that determine the node to be searched are obtained to generate a search target locator set, and a node ID of a node that performs the search operation and A first step of sending query data including all the query frame signatures to the node defined by the first search target locator of the search target locator set;
A second step of obtaining all index entries satisfying the condition of the query frame signature corresponding to the first search target locator as the partial index entry candidate set in the first node;
A third step of sending the query data and the index entry candidate to the next node determined by a next search target locator;
In the next node, all index entries that satisfy the conditions of the query frame signature corresponding to the next search target locator are acquired, and the union of the index entry set and the partial index entry candidate set is obtained. A fourth step in which the result is the next partial index entry candidate set;
A fifth step of repeating the third step and the fourth step until the last search target locator is reached, and setting the last partial index entry candidate set as an index entry candidate set;
A sixth step of generating the search target locator set from the next query locator among the p query locators, and sending the query data and the index entry candidate set to the next node determined by the first search target locator of that search target locator set;
A seventh step of repeating the second step, the third step, the fourth step, and the fifth step, starting from the next node, until the last search target locator of the search target locator set is reached, taking the intersection of the index entry candidate set obtained as a result and the index entry candidate set at the start of this step, and setting the result as the next index entry candidate set; and
An eighth step of repeating the sixth step and the seventh step until the last query locator is reached and transmitting the final result to the node that performs the search operation, in the program for registering and retrieving objects in the P2P environment.

7. The program for registering and retrieving objects in a P2P environment according to claim 6, further realizing a function of determining the next query locator by executing the first through fourth query entry generation steps at each of the nodes determined by the query locators.