JP2011034254A

JP2011034254A - Similar search device, similar search system and similar search method

Info

Publication number: JP2011034254A
Application number: JP2009178559A
Authority: JP
Inventors: Takao Murakami; 隆夫村上; Kenta Takahashi; 健太高橋; Hiroshi Koya; 博小屋
Original assignee: Hitachi Information and Control Systems Inc; Hitachi Information and Control Solutions Ltd
Current assignee: Hitachi Information and Control Systems Inc; Hitachi Information and Control Solutions Ltd
Priority date: 2009-07-31
Filing date: 2009-07-31
Publication date: 2011-02-17

Abstract

PROBLEM TO BE SOLVED: To achieve high-speed similarity search using few hardware resources.
SOLUTION: A client terminal stores in a database the feature quantity data, representative data information indicating which of the feature quantity data are representative data, and a distance table giving the similarity or distance between each representative data item and each feature quantity data item. The terminal includes: a registration feature quantity selection part that selects one of the feature quantity data; a similarity calculation part that calculates the similarity or distance between the selected feature quantity data and the search feature quantity; a determination part that determines, based on the calculated similarity or distance, whether the selected data and the extracted feature quantity are similar; and a processing part that performs processing according to the determination result. The registration feature quantity selection part uses the distance table to select representative data and non-representative data alternately from among the feature quantity data not yet selected.
COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a technique for searching for data similar to multimedia data.

Searching for multimedia data similar to given digitized multimedia data, such as images, video, music, or documents, is called similarity search. In general, similarity search extracts from the target multimedia data (hereinafter, raw data) information called a feature quantity, which is used to calculate a similarity or a distance. Items are regarded as similar when their similarity (the degree of agreement between feature quantities) is large, or when their distance (the degree of disagreement between feature quantities) is small.

The definition of "similar" depends on the use of the similarity search. For example, among the multimedia data registered in a database or the like, the data whose feature quantity has the greatest similarity to the feature quantity of the multimedia data received as input may be defined as similar. Alternatively, all data whose feature quantities have a similarity greater than a certain threshold may be defined as similar.

Let N be the total number of feature quantities registered in a database or the like as targets of similarity search (hereinafter, registration feature quantities). Calculating the similarity for every registration feature quantity requires N similarity calculations. Because similarity calculation generally has a large time cost, the search time grows in proportion to N.

To address this, a technique has been proposed that calculates the similarities between registration feature quantities in advance and uses them to search the registration feature quantities efficiently, thereby reducing the number of similarity calculations (Patent Document 1).

Paragraphs 0011 to 0017 of Patent Document 1 disclose a technique in which registered data are classified into a plurality of groups and, for each group, a matrix describing the degree of matching (similarity) between the registered data (hereinafter, matching degree table) is prepared. At search time these tables are used to search the registered data efficiently and to reduce the number of matching degree calculations.

Non-Patent Document 1 describes a technique called LAESA. M representative registration feature quantities (hereinafter, pivots) are selected from the N registration feature quantities, and a matrix describing the distance in feature space between each pivot and each registration feature quantity (hereinafter, distance table) is prepared. Using this table, the search first calculates the distance between the input data and each pivot while pruning, that is, excluding some registration feature quantities from the candidates. Once every pivot has either had its distance calculated or been pruned, the distances between the input data and the registration feature quantities other than pivots (hereinafter, non-pivots) are calculated. This reduces the number of distance calculations in similarity search.
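The two-phase flow described for LAESA can be sketched as follows. This is a simplified illustration, not the algorithm exactly as published: it assumes the distance is a metric, so that the triangle inequality gives the lower bound |d(q, p) - d(p, x)| on d(q, x); it computes the distance to every pivot rather than pruning pivots; and the names `laesa_nn`, `table`, and `dist` are hypothetical.

```python
from typing import Any, Callable, List, Sequence, Tuple


def laesa_nn(query: Any,
             data: Sequence[Any],
             pivot_ids: List[int],
             table: List[List[float]],
             dist: Callable[[Any, Any], float]) -> Tuple[int, float, int]:
    """Nearest-neighbour search in the style of LAESA (simplified sketch).

    table[j][i] holds dist(data[pivot_ids[j]], data[i]), precomputed.
    Returns (index of nearest item, its distance, number of dist() calls).
    """
    n = len(data)
    lower = [0.0] * n              # lower bound on dist(query, data[i])
    best_id, best_d = -1, float("inf")
    calls = 0

    # Phase 1: compute the distance to every pivot and tighten the bounds.
    for j, p in enumerate(pivot_ids):
        d = dist(query, data[p])
        calls += 1
        if d < best_d:
            best_id, best_d = p, d
        for i in range(n):
            lower[i] = max(lower[i], abs(d - table[j][i]))

    # Phase 2: visit non-pivots in order of increasing lower bound.
    pivot_set = set(pivot_ids)
    order = sorted((i for i in range(n) if i not in pivot_set),
                   key=lambda i: lower[i])
    for i in order:
        if lower[i] >= best_d:
            break                  # this and all later candidates are pruned
        d = dist(query, data[i])
        calls += 1
        if d < best_d:
            best_id, best_d = i, d
    return best_id, best_d, calls


def abs_dist(a: float, b: float) -> float:
    """Toy metric on 1-D points, used only for the example."""
    return abs(a - b)


points = [0.0, 1.0, 2.0, 5.0, 9.0, 10.0]
pivots = [0, 5]
dist_table = [[abs_dist(points[p], x) for x in points] for p in pivots]
result = laesa_nn(4.6, points, pivots, dist_table, abs_dist)
```

In this toy example the non-pivots are reached only after both pivots have been processed, which is the behaviour the description goes on to identify as leaving room for improvement.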

As another similarity search technique, one called AESA has long been proposed. An N × N matrix describing the distances between the N registered data items is prepared, and at search time it is used to search and prune the registered data efficiently, reducing the number of distance calculations. These similarity search techniques have been surveyed (Non-Patent Document 2).

Pages 78-80 of Non-Patent Document 2 state that the number of distance calculations in AESA is constant with respect to the total number N of registered data, that is, O(1), and that the number of distance calculations in the LAESA of Non-Patent Document 1 is M + O(1).

Patent Document 1: JP 2007-286970 A

Non-Patent Document 1: L. Mico et al., "An algorithm for finding nearest neighbours in constant average time with a linear space complexity," Proc. ICPR 1992, pp. 557-560 (1992)
Non-Patent Document 2: P. Zezula et al., Similarity Search - The Metric Space Approach, Springer (2006)

In Patent Document 1, if the number of groups is G, the size of the matching degree table is O(N^2/G). Therefore, if G is held constant, the table size is O(N^2), and it grows dramatically as the number of registered data N increases, so large-capacity hardware resources are required.

On the other hand, if G is increased to keep the matching degree table small, the number of matching degree calculations increases in proportion. Setting G to 1 in Patent Document 1 is equivalent to the AESA described above without pruning. Without pruning, the number of calculations may increase, so with G = 1 the number of similarity calculations in Patent Document 1 is O(1) or more.

When G is increased, the search of each group requires O(1) or more calculations, so the total number of calculations, and hence the search time, is O(G) or more. For example, if G is increased in proportion to the number of registration feature quantities N, the size of the similarity table becomes O(N), the same order as the size of the distance table in AESA, but the number of similarity calculations then increases in proportion to N.
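The table sizes discussed above can be checked with a few lines of arithmetic. The sketch below only counts table entries; it assumes G divides N evenly, and the function names and the sample values N = 10000 and M = 50 are illustrative rather than taken from the source.

```python
def matching_table_entries(n: int, g: int) -> int:
    """Entries in G per-group tables of size (N/G) x (N/G), i.e. N^2 / G.

    Assumes g divides n evenly.
    """
    per_group = n // g
    return g * per_group * per_group


def pivot_table_entries(n: int, m: int) -> int:
    """Entries in an M x N pivot-to-item distance table."""
    return m * n


n_items = 10_000
single_group = matching_table_entries(n_items, 1)     # grows as N^2
many_groups = matching_table_entries(n_items, 100)    # N^2 / G
pivot_based = pivot_table_entries(n_items, 50)        # M * N
```

With these sample values a single group needs 100,000,000 entries, 100 groups need 1,000,000, and a 50-pivot table needs 500,000; the grouped table shrinks only at the cost of searching more groups.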

Moreover, because the number of distance calculations in the LAESA of Non-Patent Document 1 is M + O(1), when N is large LAESA requires far fewer distance calculations than Patent Document 1 requires similarity calculations when G is increased in proportion to N.

However, the LAESA of Non-Patent Document 1 calculates distances to non-pivots only after every pivot has either had its distance calculated or been pruned. Consequently, when the similar registration feature quantity is a non-pivot, all pivots must first be processed (distance calculated or pruned) before the distance to that non-pivot is calculated. This leaves room for improvement in reducing the number of distance calculations.

In large-scale similarity search systems, such as a similar-image search service operated by a company, the number of registration feature quantities N can become enormous. In addition, applications such as biometric authentication for room entry/exit management require similarity search to be performed in a short time using limited hardware resources.

It is therefore desirable to provide a similarity search technique that reduces the required hardware resources and shortens the search time.

The present invention is a similarity search device that searches for data similar to multimedia data, comprising: a storage unit that stores a plurality of registration feature quantity data; means for acquiring a feature quantity extracted from multimedia data; a registration feature quantity selection unit that selects one of the plurality of registration feature quantity data; a similarity calculation unit that calculates the similarity or distance, in feature space, between the selected registration feature quantity data and the extracted feature quantity; a determination unit that determines, based on the calculated similarity or distance, whether the selected registration feature quantity data and the extracted feature quantity are similar; and a processing unit that performs processing according to the determination result of the determination unit.

The storage unit further stores representative data information indicating which of the plurality of registration feature quantity data are pivots (representative data), and the similarity or distance in feature space between each pivot and each registration feature quantity data. The registration feature quantity selection unit selects, from among the registration feature quantity data not yet selected, pivots and non-pivots (non-representative data) alternately, a predetermined number of each at a time, using the stored similarities or distances between each pivot and each registration feature quantity data. When the determination unit finds that the data are not similar, it causes the registration feature quantity selection unit to select registration feature quantity data, causes the similarity calculation unit to calculate the similarity or distance, and performs the determination again. This series of steps is repeated until a similarity determination is made or until all registration feature quantity data have been judged.
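The alternating selection loop can be sketched as below. This is a minimal illustration under stated assumptions: a distance at or below a hypothetical `threshold` counts as "similar", the batch sizes `n_pivot` and `n_non_pivot` stand in for the predetermined numbers, and the `choose` callback is a placeholder for the table-based selection rule, which this passage does not specify. By default the sketch simply takes the first remaining candidate.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple


def alternating_search(query: Any,
                       data: Sequence[Any],
                       pivot_ids: List[int],
                       dist: Callable[[Any, Any], float],
                       threshold: float,
                       n_pivot: int = 1,
                       n_non_pivot: int = 1,
                       choose: Optional[Callable] = None
                       ) -> Tuple[Optional[int], int]:
    """Alternate between batches of pivots and non-pivots until a match.

    Returns (index of the matching item or None, number of dist() calls).
    Batch sizes are assumed to be at least 1.
    """
    if choose is None:
        # Placeholder rule: take the first remaining candidate.
        choose = lambda candidates, searched: candidates[0]
    pivot_set = set(pivot_ids)
    remaining = {
        True: list(pivot_ids),
        False: [i for i in range(len(data)) if i not in pivot_set],
    }
    batch = {True: n_pivot, False: n_non_pivot}
    searched: Dict[int, float] = {}   # item index -> distance to the query
    take_pivot = True

    while remaining[True] or remaining[False]:
        if not remaining[take_pivot]:
            take_pivot = not take_pivot       # one kind is exhausted
        for _ in range(batch[take_pivot]):
            if not remaining[take_pivot]:
                break
            i = choose(remaining[take_pivot], searched)
            remaining[take_pivot].remove(i)
            d = dist(query, data[i])
            searched[i] = d
            if d <= threshold:
                return i, len(searched)       # similar item found
        take_pivot = not take_pivot
    return None, len(searched)                # every item judged, no match


def abs_dist(a: float, b: float) -> float:
    """Toy metric on 1-D points, used only for the example."""
    return abs(a - b)


items = [0.0, 10.0, 20.0, 30.0, 40.0]
found = alternating_search(20.2, items, [0, 4], abs_dist, threshold=0.5)
missed = alternating_search(20.2, items, [0, 4], abs_dist, threshold=0.1)
```

In the example the matching non-pivot is reached after four distance calculations, before the loop would have had to finish every remaining item.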

According to the present invention, high-speed similarity search is possible using few hardware resources.

A block diagram illustrating the functional configuration of the first embodiment of the present invention.
A block diagram illustrating the hardware configuration of the first, second, and third embodiments of the present invention.
A flowchart illustrating the registration process of the first and second embodiments of the present invention.
A flowchart illustrating the pre-processing of the first embodiment of the present invention.
A flowchart illustrating the search process of the first embodiment of the present invention.
A schematic diagram illustrating the distance table when the distance satisfies symmetry.
A schematic diagram illustrating the distance table when the distance does not satisfy symmetry.
A schematic diagram illustrating the q-e distance vector and the p-e distance vector.
A schematic diagram illustrating the q-p distance vector and the n-p distance vector.
A block diagram illustrating the functional configuration of the second embodiment of the present invention.
A flowchart illustrating the pre-processing of the second and third embodiments of the present invention.
A flowchart illustrating the search process of the second embodiment of the present invention.
A block diagram illustrating the functional configuration of the third embodiment of the present invention.
A flowchart illustrating the search process of the third embodiment of the present invention.

Embodiments are described in detail below with reference to the drawings.

The first embodiment is described below with reference to the drawings. The similarity search system of this embodiment is a biometric authentication system. A user attempting authentication (hereinafter, authentication user) inputs biometric information to a client terminal, and the system searches the database in the client terminal for similar biometric data. It thereby identifies which of the users registered in the database (hereinafter, registered users) the authentication user is, or that the user is none of them, and performs authentication based on the result.

As the similarity between feature quantities, one option is to use the distance (dissimilarity) in feature space multiplied by -1. Hereinafter, "distance" means distance in feature space. The distance itself, rather than the similarity, may also be used as the criterion for judging whether two feature quantities are similar. In that case the ordering is reversed relative to using the similarity: items with a small distance are regarded as similar.
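A short example of the sign convention: taking the similarity as the distance multiplied by -1 means that the item with the greatest similarity is the item with the smallest distance. The item labels and distance values below are made up for illustration.

```python
def similarity(distance: float) -> float:
    """Similarity defined as the distance multiplied by -1."""
    return -distance


# Hypothetical distances from a query to three registered items.
distances = {"a": 0.8, "b": 0.1, "c": 0.5}

most_similar = max(distances, key=lambda k: similarity(distances[k]))
closest = min(distances, key=lambda k: distances[k])
```

Both selections name the same item, so a system can work with either quantity as long as it compares in the matching direction.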

The client terminal holds pivot information indicating which of the registration feature quantity data registered in the database are pivots, and a distance table giving the distance in feature space between each pivot and each registration feature quantity data. Using the distance table, it selects pivots and non-pivots alternately, a predetermined number of each at a time, and searches for data similar to the input biometric data.

This enables high-speed similarity search over a large number of registration feature quantity data even when memory or HDD capacity is limited.

The registration feature quantity data whose distance to the feature quantity extracted from the input biometric data (hereinafter, search feature quantity) is to be calculated is selected based on the magnitude of the correlation between feature quantities, which is calculated using the distance table.

FIG. 1 shows a configuration example of the similarity search system of this embodiment. In this embodiment, the raw data is digitized biometric information.
The system includes a registration terminal 100 that sends registration information acquired from a user to the server terminal, a server terminal 200 that stores the registration information, a client terminal 300 that performs similarity search based on raw search data input by a user, and a network 400. There may be one client terminal or several. The network 400 may be a network such as a WAN or LAN, device-to-device communication using USB, IEEE 1394, or the like, or wireless communication such as a mobile phone network or Bluetooth (registered trademark).

For example, in a corporate room entry/exit management system, the registration terminal 100 could be a PC and a biometric input sensor placed in a registration center (such as a room reserved for registration), the client terminal 300 a door controller and a biometric input sensor installed near the door of a room, the server terminal 200 a server placed in a server room, and the network 400 the corporate intranet. In a corporate attendance management system, the registration terminal 100 could be a PC and a biometric input sensor placed in a registration center, the client terminal 300 a PC and a biometric input sensor installed in a room, the server terminal 200 a server placed in a server room, and the network 400 the corporate intranet.

In these examples, the group ID 210 and group ID 310 described later may be set to a different value for each room. Grouping in this way reduces the number of registration feature quantities N per group, which in turn reduces the number of distance calculations and enables faster search. The group ID 210 and group ID 310 may also be omitted.

The registration terminal 100 includes an ID acquisition unit 101 that acquires a group ID and a registration data ID, a raw data acquisition unit 102 that acquires raw data, a feature quantity extraction unit 103 that extracts a registration feature quantity from the raw registration data, and a communication I/F 104. The ID acquisition unit 101 is implemented using a keyboard or the like. In this embodiment the raw data acquisition unit 102 is implemented using a biometric input sensor. The biometric information acquired may be, for example, a fingerprint, an iris, a voiceprint, or a vein pattern. The feature quantity extraction unit 103 may instead reside on the server terminal 200. Examples of the extracted feature quantity are, for fingerprints, minutiae (the end points and bifurcation points of fingerprint ridges) extracted from a fingerprint image and, for irises, a bit string called an iris code created from an iris image.

The server terminal 200 includes a pivot determination unit 201 that, for each group ID 210, determines M pivots from among the N items of registration information 220; a distance table creation unit 202 that, for each group ID 210, calculates the distance between the feature quantity of each pivot and each registration feature quantity data; a communication I/F 203; and a database 204. The number of registration feature quantities N and the number of pivots M may differ between group IDs 210. As the distance between two feature quantities, when minutiae are the feature quantity one may use, for example, the distance obtained by a minutiae matching method that compares the position of each feature point and the ridge direction. When iris codes are the feature quantity, one may use the Hamming distance between codes.
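As a concrete illustration of the distance table creation step, the sketch below builds an M x N table using the Hamming distance between bit strings. It assumes, purely for the example, that each iris code is held as a Python integer; the function names and the 4-bit sample codes are hypothetical.

```python
from typing import Callable, List, Sequence


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def build_distance_table(features: Sequence[int],
                         pivot_ids: List[int],
                         dist: Callable[[int, int], int]) -> List[List[int]]:
    """Return table[j][i] = dist(features[pivot_ids[j]], features[i])."""
    return [[dist(features[p], x) for x in features] for p in pivot_ids]


# Three toy 4-bit codes; items 0 and 2 are taken as the pivots (M = 2, N = 3).
codes = [0b0000, 0b0011, 0b1111]
table = build_distance_table(codes, [0, 2], hamming)
```

The table needs M x N entries, so its size grows linearly with the number of registered items when M is fixed.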

The database 204 holds master data 205 for each group ID 210. The group ID 210 may be, for example, the ID of the room that the users enter and leave. The master data 205 includes the group ID 210, the registration information 220 of each registered user, and auxiliary information 230. The registration information 220 includes a registration data ID 221, raw data 222, and feature quantity data 223. The registration data ID 221 may be, for example, the user's employee number.

The raw data 222 may be omitted. Keeping it has the advantage, however, that if the feature quantity extraction unit 103 is upgraded, the latest feature quantities can be extracted from the raw data 222 without re-registering the raw data. The auxiliary information 230 includes pivot information 231, which records which of the feature quantity data 223 of the registration information 220 are pivots, and a distance table 232, which records the distance between each pivot and the feature quantity data 223 of each item of registration information 220.
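The master data layout described above might be modelled as follows. This is only one possible representation: the class and field names are hypothetical, the reference numerals in the comments map them back to the description, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class RegistrationInfo:            # registration information 220
    registration_data_id: str      # 221, e.g. an employee number
    feature: Any                   # 223, feature quantity data
    raw_data: Optional[bytes] = None   # 222, optional


@dataclass
class AuxiliaryInfo:               # auxiliary information 230
    pivot_ids: List[int]           # 231, which registrations are pivots
    distance_table: List[List[float]]  # 232, pivot-to-item distances


@dataclass
class MasterData:                  # master data 205
    group_id: str                  # 210, e.g. a room ID
    registrations: List[RegistrationInfo]
    auxiliary: AuxiliaryInfo

    def is_pivot(self, index: int) -> bool:
        """True if the registration at this index is recorded as a pivot."""
        return index in self.auxiliary.pivot_ids


master = MasterData(
    group_id="room-1",
    registrations=[
        RegistrationInfo("E001", 0b0000),
        RegistrationInfo("E002", 0b0011),
        RegistrationInfo("E003", 0b1111),
    ],
    auxiliary=AuxiliaryInfo(pivot_ids=[0, 2],
                            distance_table=[[0, 2, 4], [4, 2, 0]]),
)
```

Leaving `raw_data` as `None` corresponds to the case where the raw data is not kept.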

The client terminal 300 includes: a raw data acquisition unit 301 that acquires raw data; a feature quantity extraction unit 302 that extracts a search feature quantity from the raw search data; a search feature quantity-pivot correlation calculation unit 303 that calculates the correlation between the search feature quantity and a pivot; a search feature quantity-non-pivot correlation calculation unit 304 that calculates the correlation between the search feature quantity and a non-pivot; a registration feature quantity selection unit 305 that selects the registration information 320 whose distance to the search feature quantity is to be calculated; a similarity calculation unit 306 that calculates the distance between the search feature quantity and the feature quantity data 323 of the registration information 320; a determination unit 307 that determines, based on the calculated distance, matters such as whether the feature quantity data 323 of the selected registration information 320 is similar to the search feature quantity; a pruning unit 307a that prunes pivots and non-pivots whose distance to the search feature quantity has not yet been calculated and that have not yet been pruned; a communication I/F 308; and a database 309.

The correlation calculation units 303 and 304 calculate correlations over the following distance vectors. The "q-e distance vector" is the vector of distances between the search feature quantity and all registration feature quantity data already searched. The "p-e distance vector" is the vector of distances between a pivot and all registration feature quantity data already searched. The "q-p distance vector" is the vector of distances between the search feature quantity and all pivots already searched. The "n-p distance vector" is the vector of distances between a non-pivot and all pivots already searched. Vectors defined in other ways may also be used.

The search feature quantity-pivot correlation calculation unit 303 calculates the correlation between the q-e distance vector and the p-e distance vector, and the search feature quantity-non-pivot correlation calculation unit 304 calculates the correlation between the q-p distance vector and the n-p distance vector. The correlation calculation is described in detail later. The search feature quantity-pivot correlation calculation unit 303 and the pruning unit 307a may be omitted.
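To make the q-p and n-p distance vectors concrete, the sketch below ranks non-pivot candidates by how strongly their n-p distance vector correlates with the q-p distance vector. The correlation measure is not defined in this passage, so the Pearson correlation coefficient is used here purely as an assumption, and the function names and toy data are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either vector is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0.0 or sv == 0.0:
        return 0.0
    return cov / (su * sv)


def pick_non_pivot(qp_vector: Sequence[float],
                   table: List[List[float]],
                   searched_pivot_rows: List[int],
                   candidates: List[int]) -> int:
    """Pick the candidate whose n-p vector best correlates with qp_vector.

    table[j][i] is the stored distance between pivot j and item i, and
    qp_vector[k] is the distance from the query to pivot
    searched_pivot_rows[k].
    """
    def np_vector(i: int) -> List[float]:
        return [table[j][i] for j in searched_pivot_rows]

    return max(candidates, key=lambda i: pearson(qp_vector, np_vector(i)))


# Toy 1-D data: pivots at 0, 10, 20; non-pivots at 3 (index 3) and 17 (index 4).
values = [0.0, 10.0, 20.0, 3.0, 17.0]
table = [[abs(values[p] - x) for x in values] for p in (0, 1, 2)]
query = 16.0
qp = [abs(query - values[p]) for p in (0, 1, 2)]     # [16.0, 6.0, 4.0]
chosen = pick_non_pivot(qp, table, [0, 1, 2], [3, 4])
```

In the example the item at 17 is chosen because its distances to the three pivots rise and fall in the same pattern as the query's, which is the intuition for using such a correlation to decide which item to compare next.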

The database 309 includes a group ID 310, the registration information 320 of each registered user, and auxiliary information 330. The registration information 320 includes a registration data ID 321, raw data 322, and feature quantity data 323. The raw data 322 may be omitted. The auxiliary information 330 includes pivot information 331, which records which of the feature quantity data 323 of the registration information 320 are pivots, and a distance table 332, which records the distance between each pivot and the feature quantity data 323 of each item of registration information 320.

The registration information 320 and auxiliary information 330 are identical to the registration information 220 and auxiliary information 230 of the master data 205 on the server terminal 200 that has the same group ID. For example, in a room entry/exit management system where the client terminal 300 is installed near the door of a room, the client terminal 300 may be associated with the group ID 210 of that room and acquire from the server terminal 200 the registration information and auxiliary information with the same group ID. This configuration allows the users permitted to enter and leave to be managed room by room. In addition, because each client terminal 300 stores only the data that it is itself to authenticate, out of all the data registered in the system, it can be implemented on a device with a smaller HDD and less memory.

The client terminal 300 acquires the registration information 320 and auxiliary information 330 by, for example, downloading them from the server terminal 200 over the network via the communication I/F 308 at night, when the system is not in use.

In addition to the functional units above, the client terminal 300 includes a processing unit that performs processing appropriate to the use of the system based on the determination result of the determination unit 307. For example, in a room entry/exit management system where the client terminal 300 includes a door controller, it includes a processing unit that controls the door lock based on the determination result of the determination unit 307. In an attendance management system where the client terminal 300 includes a PC, it includes, for example, a processing unit that, according to the determination result of the determination unit 307, records in a database as a log the registration data ID associated with the feature quantity data 323 determined to be similar, together with the time of authentication.

FIG. 2 shows the hardware configuration of the registration terminal 100, the server terminal 200, and the client terminal 300 in this embodiment. Each terminal can be configured with a CPU 500, a memory 501, an HDD 502, an input device 503, an output device 504, a communication device 505, and a reading device 506 for a portable storage medium 507.

The input device 503 of the registration terminal 100 is, for example, a keyboard and a biometric information input sensor, and its output device 504 is a display. The input device 503 of the server terminal 200 is, for example, a keyboard, and its output device 504 is likewise a display. The input device 503 of the client terminal 300 is, for example, a keyboard and a biometric information input sensor, and its output device 504 is a display.

The functional units of the registration terminal 100, the server terminal 200, and the client terminal 300 described above are implemented by programs that the CPU 500 reads, for example, from the HDD 502 into the memory 501 and executes. Each program may be stored in advance in the HDD 502 of the terminal, or may be installed into the HDD 502 from another device when needed, via a medium the terminal can use. Here, a medium means, for example, the storage medium 507 that can be attached to and detached from the reading device 506, or a communication medium such as a network or a carrier wave or digital signal propagating over a network.

FIG. 3 shows the registration procedure and data flow in this embodiment. The registration process described below is implemented by a program that the CPU 500 of the registration terminal 100 reads from the HDD 502 into the memory 501 and executes. The program consists of code for performing the operations described below.

First, the ID acquisition unit 101 of the registration terminal 100 accepts a group ID and a registration data ID entered by the user (step S101).
The raw data acquisition unit 102 of the registration terminal 100 acquires raw data for registration through a user operation (step S102).
The feature extraction unit 103 of the registration terminal 100 extracts a registration feature from the raw data for registration (step S103).
The registration terminal 100 transmits the group ID, the registration data ID, the raw data for registration, and the extracted registration feature data to the server terminal 200 through the communication I/F 104 over the network 400 (step S104).
The server terminal 200 adds the received registration data ID, raw data for registration, and registration feature data as registration information 220 to the master data 205 whose group ID 210 matches the group ID received from the registration terminal 100 (step S105).
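As an illustration only, the server-side handling of step S105 might be sketched in Python as follows. The names RegistrationRecord, MasterData, and register are hypothetical and do not appear in this description, the in-memory dictionary merely stands in for the database 204, and creating missing master data on demand is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class RegistrationRecord:
    """One entry of registration information 220 (hypothetical layout)."""
    registration_data_id: str
    raw_data: bytes
    feature: Sequence[int]


@dataclass
class MasterData:
    """Master data 205 for one group ID 210 (hypothetical layout)."""
    group_id: str
    registration_info: List[RegistrationRecord] = field(default_factory=list)


def register(database: Dict[str, MasterData], group_id: str,
             record: RegistrationRecord) -> None:
    """Step S105: append the record to the master data with the same group ID."""
    # Creating master data for an unseen group ID is an assumption of this sketch.
    if group_id not in database:
        database[group_id] = MasterData(group_id)
    database[group_id].registration_info.append(record)


# Three registrations arriving from the registration terminal (step S104).
database: Dict[str, MasterData] = {}
register(database, "room-101", RegistrationRecord("user-1", b"raw-1", [0, 1, 1, 0]))
register(database, "room-101", RegistrationRecord("user-2", b"raw-2", [1, 1, 0, 0]))
register(database, "room-202", RegistrationRecord("user-3", b"raw-3", [1, 0, 0, 1]))
```

Keying the store by group ID mirrors the way each client terminal later requests only the master data for its own group.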

If the feature extraction unit 103 is provided on the server terminal 200 instead, then in step S104 the registration terminal 100 transmits only the registration ID and the raw data, and the feature extraction unit 103 of the receiving server terminal 200 performs the feature extraction of step S103.

FIG. 4 shows the pre-processing procedure and data flow in this embodiment. Pre-processing is performed after registration and before search, for example during the night following the day of registration. The pre-processing described below is implemented by a program that the CPU 500 of the server terminal 200 reads from the HDD 502 into the memory 501 and executes. The program consists of code for performing the operations described below.

In the database 204 of the server terminal 200, for each group ID 210, if registration information 220 has been added to the master data 205 since the previous pre-processing run, the pivot determination unit 201 selects M pivots from among the feature data 223 of the N pieces of registration information 220 in that master data 205 (step S201). The value of M is set in advance, for example by the system administrator. The pivot determination method is described in detail later.

The distance table creation unit 202 of the server terminal 200 calculates the distance between each pivot determined by the pivot determination unit 201 and the feature data 223 of each piece of registration information 220 in the same master data 205, and creates a distance table (step S202).

FIGS. 6(a) and 6(b) show examples of the distance table. Here X_1, X_2, ..., X_M denote the pivots and Y_1, Y_2, ..., Y_{N-M} denote the non-pivots. In FIG. 6(b), for example, the distance from X_2 to Y_1 is d(X_2, Y_1) = 25, and the distance from Y_1 to X_2 is d(Y_1, X_2) = 27. If the distance is symmetric, that is, d(A, B) = d(B, A), only one of the two values needs to be stored, as in FIG. 6(a). If it is not symmetric, both values are stored, as in FIG. 6(b). The Hamming distance is an example of a symmetric distance. A distance obtained by minutiae matching is an example of a distance that is not symmetric.
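The following Python sketch illustrates one possible way to build such a table. It is not taken from this description: the function names, the dictionary keyed by index pairs, and the choice to skip self-distances are all assumptions made for illustration, so the entry counts it produces need not match the element counts stated in this description.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Feature = Sequence[int]


def hamming(a: Feature, b: Feature) -> int:
    """Hamming distance between two equal-length sequences (a symmetric measure)."""
    return sum(x != y for x, y in zip(a, b))


def build_distance_table(
    features: List[Feature],
    pivot_ids: List[int],
    dist: Callable[[Feature, Feature], float],
    symmetric: bool,
) -> Dict[Tuple[int, int], float]:
    """Store pivot-to-item distances; both directions if the measure is asymmetric."""
    table: Dict[Tuple[int, int], float] = {}
    for p in pivot_ids:
        for j in range(len(features)):
            if j == p:
                continue  # self-distances are skipped in this sketch
            if symmetric:
                # One entry per unordered pair, keyed as (smaller, larger).
                key = (min(p, j), max(p, j))
                if key not in table:
                    table[key] = dist(features[p], features[j])
            else:
                table[(p, j)] = dist(features[p], features[j])
                table[(j, p)] = dist(features[j], features[p])
    return table


features = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
pivots = [0, 1]
sym_table = build_distance_table(features, pivots, hamming, symmetric=True)
asym_table = build_distance_table(features, pivots, hamming, symmetric=False)
```

With a symmetric measure the table holds one value per pair, as in FIG. 6(a); with `symmetric=False` it holds both directions, as in FIG. 6(b). No non-pivot-to-non-pivot distance is ever stored, which is what keeps the table linear in N.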

In the similarity search system of this embodiment, the distance table holds 2MN - M(M-1) elements when an asymmetric distance measure is used and MN - M(M-1)/2 elements when a symmetric one is used. In either case the table size is only O(N). Growth of the distance table as the number N of registration features increases is therefore limited, which reduces the required database capacity and free memory.

This embodiment is considered particularly useful in, for example, an entrance/exit management system, where the door controller serving as the client terminal 300 generally has very limited memory and HDD capacity.

The pivot determination unit 201 and the distance table creation unit 202 of the server terminal 200 save the pivot information for the M pivots determined in step S201 (which of the feature data 223 in the registration information 220 are pivots) and the distance table obtained in step S202, respectively, by overwriting the auxiliary information 230 of the master data 205 (step S203).

The client terminal 300 transmits the group ID 310 to the server terminal 200 and receives from it the registration information 220 and the auxiliary information 230 of the master data 205 whose group ID 210 matches the transmitted group ID. The client terminal stores the received data in the database 309 as the registration information 320 and the auxiliary information 330 (step S204).

FIG. 5 shows the search procedure and data flow in this embodiment. The search process described below is implemented by a program that the CPU 500 of the client terminal 300 reads from the HDD 502 into the memory 501 and executes. The program consists of code for performing the operations described below. The following is one embodiment, and the search procedure is not limited to it.
The CPU 500 of the client terminal 300 reads the registration information 320 and the auxiliary information 330 from the database 309 and loads them into the memory 501 (step S301).
The raw data acquisition unit 301 of the client terminal 300 acquires raw data for search from the user (step S302).
The feature extraction unit 302 of the client terminal 300 extracts a search feature from the raw data for search (step S303).
The determination unit 307 of the client terminal 300 initializes the distance calculation count i to 0 and the number M' of pivots that have not been pruned to M (step S304). The determination unit 307 then proceeds to step S306 if the distance calculation count i is less than 2M', and to step S310 if it is 2M' or more (step S305).

In step S306, the process proceeds to step S307 if the distance calculation count i is even, and to step S309 if it is odd.
In step S307, in order to select one of the pivots, the search-feature-to-pivot correlation calculation unit of the client terminal 300 calculates the correlation between the q-e distance vector and the p-e distance vector of each pivot whose distance to the search feature has not yet been calculated and which has not been pruned. The method of calculating the correlation between the q-e distance vector and the p-e distance vector is described in detail later. This step may be omitted.

The registration feature selection unit 305 of the client terminal 300 selects, as the next target for distance calculation, the pivot with the largest correlation calculated in step S307 from among the pivots whose distance has not yet been calculated and which have not been pruned (step S308). If step S307 is not executed, pivots are selected in an order fixed before the search. This order can be determined, for example, randomly using a pseudo-random number generator. The order may be kept in the master data 205 of the database 204, received and stored by the client terminal 300 together with the other information in step S204, and read from the database 309 in step S301.

In step S309, on the other hand, in order to select one of the non-pivots, the search-feature-to-non-pivot correlation calculation unit of the client terminal 300 calculates the correlation between the q-p distance vector and the n-p distance vector of each non-pivot whose distance has not yet been calculated and which has not been pruned. The method of calculating the correlation between the q-p distance vector and the n-p distance vector is described in detail later.

The registration feature selection unit 305 of the client terminal 300 selects, as the next target for distance calculation, the non-pivot with the largest correlation calculated in step S309 from among the non-pivots whose distance has not yet been calculated and which have not been pruned (step S310).

The determination unit 307 of the client terminal 300 increments the distance calculation count i by 1 (step S311).
The similarity calculation unit 306 of the client terminal 300 calculates the distance d(Q, Z_i) between the search feature (denoted Q) and the feature data 323 of the selected registration information 320 (denoted Z_i) (step S312).
In step S313, if the distance d(Q, Z_i) calculated in step S312 is smaller than a predetermined threshold τ, the determination unit 307 of the client terminal 300 determines that the two are similar and proceeds to step S314. If the distance is equal to or greater than τ, it determines that they are not similar and proceeds to step S315.

If they are determined to be similar, the client terminal 300 performs the processing for the case where registration information 320 whose registration feature data 323 is similar to the search feature has been found (step S314).

Depending on the purpose of the biometric authentication, the search may end here with this registration information 320 as the search result, or the process may return to step S305 and continue searching for other registration information 320. The former approach is effective when the goal is to determine as quickly as possible whether similar registration information 320 exists. The latter is effective when all registration information 320 determined to be similar should be output, or when the most similar registration information 320 is to be found.

For example, in an entrance/exit management system whose purpose is to check whether the user being authenticated is registered in the database, the search can end at this step and the locked door of the room can be unlocked. In an attendance management system whose purpose is to identify the user being authenticated as accurately as possible, the determination result can be saved in the memory 501 and the process can return to step S305 to continue searching for other registration information 320.

If the determination unit 307 determines in step S313 that they are not similar, the pruning unit 307a of the client terminal 300 performs pruning on the pivots and non-pivots whose distance has not yet been calculated and which have not been pruned (step S315). The pruning process is described in detail later. This step may be omitted.

The determination unit 307 of the client terminal 300 checks the distance calculation count i. If i equals the upper limit i_max of the distance calculation count, the process proceeds to step S317; otherwise it returns to step S305 (step S316). In this embodiment the distance calculation count never exceeds i_max, so the setting of i_max determines the worst-case (maximum) search time. The system administrator therefore sets i_max before the search so that the time allowed for a search in the given system is met.

When the process proceeds to step S317 to end the search, the client terminal 300 performs the end-of-search processing. If no registration information 320 whose registration feature data 323 is similar to the search feature has been found, it performs the processing for a failed search. In an entrance/exit management system, for example, the door can be kept locked on the grounds that the user is not permitted to enter. In an attendance management system, a message prompting the user to re-enter the biometric information can be output to the output device 504.

If one or more similar pieces of registration information 320 have been found, then depending on the purpose of the authentication system, either all of them or only the one with the smallest distance may be taken as the search result. In an attendance management system, for example, the registration data IDs 321 of the pieces of registration information 320 determined to be similar can be output to the output device 504, the user being authenticated can be asked to choose the correct one, and the chosen registration data ID 321 can be saved as a log, together with the current time, in the database 309 (or 204). Alternatively, the registration data ID 321 of the similar registration information 320 with the smallest distance may be saved as a log together with the current time. The end-of-search processing is carried out by the processing unit that performs processing suited to the purpose of the system.
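The two ways of reporting results can be illustrated with a small Python sketch. The function name select_results and the mapping from registration data ID to distance are hypothetical, and returning the matches nearest first is an assumption of the sketch rather than something this description requires.

```python
from typing import Dict, List


def select_results(match_distances: Dict[str, float], best_only: bool) -> List[str]:
    """Return the IDs judged similar, nearest first, or only the nearest one."""
    ranked = sorted(match_distances, key=lambda reg_id: match_distances[reg_id])
    return ranked[:1] if best_only else ranked


# Distances of the items that passed the threshold test during the search.
matches = {"user-7": 12.0, "user-3": 5.0, "user-9": 9.0}
all_ids = select_results(matches, best_only=False)
best_id = select_results(matches, best_only=True)
no_match = select_results({}, best_only=True)
```

An empty list corresponds to the failed-search case handled in step S317.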

Thus, in the similarity search system of this embodiment, as set out in steps S305 to S310, pivots and non-pivots are first selected alternately, and once every pivot has either had its distance calculated or been pruned, the distances of the non-pivots not yet pruned are calculated. With this approach, it is not necessary to calculate the distance of, or prune, every pivot before calculating the distance of any non-pivot, so the number of distance calculations is reduced and faster search becomes possible.
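A simplified Python sketch of this alternating loop is given below for illustration. It makes several assumptions that are not part of this description: the function and variable names are invented, the Hamming distance is used as a symmetric measure, items are taken in a fixed order instead of by correlation (steps S307 and S309 are replaced by list order), the count test of step S305 is replaced by falling back to whichever list still has items, and the search stops at the first match.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Feature = Sequence[int]


def hamming(a: Feature, b: Feature) -> int:
    """Hamming distance: symmetric and satisfies the triangle inequality."""
    return sum(x != y for x, y in zip(a, b))


def search_first_match(
    query: Feature,
    features: List[Feature],
    pivot_ids: List[int],
    tau: float,
    i_max: int,
    dist: Callable[[Feature, Feature], float] = hamming,
) -> Tuple[Optional[int], int]:
    """Return (index of the first similar item or None, distance calculations used)."""
    pivot_set = set(pivot_ids)
    pending_pivots = list(pivot_ids)
    pending_others = [j for j in range(len(features)) if j not in pivot_set]
    # Distance table from pre-processing: d(pivot, item) for every pivot and item.
    table = {(p, j): dist(features[p], features[j])
             for p in pivot_ids for j in range(len(features))}

    def table_distance(z: int, a: int) -> Optional[float]:
        # Only distances involving a pivot are stored (symmetric measure assumed).
        if z in pivot_set:
            return table[(z, a)]
        if a in pivot_set:
            return table[(a, z)]
        return None

    i = 0
    while i < i_max and (pending_pivots or pending_others):
        # Even count: pivot, odd count: non-pivot; fall back if one list is empty.
        use_pivot = bool(pending_pivots) and (i % 2 == 0 or not pending_others)
        z = (pending_pivots if use_pivot else pending_others).pop(0)
        d_qz = dist(query, features[z])
        i += 1
        if d_qz < tau:
            return z, i

        def keep(a: int) -> bool:
            # Prune A when |d(Q, Z) - d(Z, A)| >= tau; keep it if d(Z, A) is unknown.
            d_za = table_distance(z, a)
            return d_za is None or abs(d_qz - d_za) < tau

        pending_pivots = [a for a in pending_pivots if keep(a)]
        pending_others = [a for a in pending_others if keep(a)]
    return None, i


features = [(0, 0, 0, 0), (1, 1, 1, 1), (1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 0, 0)]
found, used = search_first_match((0, 0, 1, 1), features, pivot_ids=[0, 1], tau=1, i_max=10)
missing, _ = search_first_match((0, 1, 0, 1), features, pivot_ids=[0, 1], tau=1, i_max=10)
capped, capped_used = search_first_match((0, 0, 1, 1), features, pivot_ids=[0, 1], tau=1, i_max=1)
```

In the first call the match at index 3 is reached after three distance calculations, because the first pivot's distance already prunes the second pivot and one non-pivot. The third call shows the effect of the upper limit: the search gives up after a single calculation.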

In this embodiment pivots and non-pivots are selected alternately one at a time, but the selection rule is not limited to this; for example, a predetermined number of each may be selected in turn. If two of each are selected in turn, step S306 can use the remainder of i divided by 4, selecting a pivot if it is 0 or 1 and a non-pivot if it is 2 or 3. If one pivot is followed by two non-pivots, step S305 tests whether i < 3M', and step S306 uses the remainder of i divided by 3, selecting a pivot if it is 0 and a non-pivot if it is 1 or 2. Such rules may be used as appropriate.
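The remainder rule for step S306 can be written compactly. The Python function below is a sketch with hypothetical names (next_kind, pivots_per_round, nonpivots_per_round); it covers only the choice between pivot and non-pivot, not the count test of step S305.

```python
def next_kind(i: int, pivots_per_round: int = 1, nonpivots_per_round: int = 1) -> str:
    """Decide from the distance calculation count i which kind to select next."""
    remainder = i % (pivots_per_round + nonpivots_per_round)
    return "pivot" if remainder < pivots_per_round else "non-pivot"


one_by_one = [next_kind(i) for i in range(4)]
two_by_two = [next_kind(i, 2, 2) for i in range(4)]
one_then_two = [next_kind(i, 1, 2) for i in range(6)]
```

With the defaults this reduces to the even/odd test of the embodiment; the other two calls reproduce the remainder-by-4 and remainder-by-3 examples given above.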

The pivot determination method is described in detail below. Pivots are determined following the method described in Non-Patent Document 1, as set out below. The following processing is performed by the pivot determination unit of the server terminal 200; that is, it is implemented by a program that the CPU 500 of the server terminal 200 reads from the HDD 502 into the memory 501 and executes. The program consists of code for performing the operations described below.
(1) Determination of the first pivot
One item is selected at random from the feature data 223 of the N pieces of registration information 220 and taken as the first pivot.
(2) Determination of the (a+1)-th pivot (a >= 1)
Let P_1, ..., P_a be the feature data 223 of the registration information 220 already selected as pivots, and let E_1, ..., E_{N-a} be the feature data 223 of the registration information 220 not yet selected. P_{a+1} is then determined as follows.

When an asymmetric distance measure is used, the following equation may be used instead of Equation 1.

The following describes in detail how the search-feature-to-pivot correlation calculation unit 303 calculates the correlation between the q-e distance vector and the p-e distance vector in step S307, and how the search-feature-to-non-pivot correlation calculation unit 304 calculates the correlation between the q-p distance vector and the n-p distance vector in step S309.

FIG. 7 shows examples of the q-e distance vector and the p-e distance vector, and FIG. 8 shows examples of the q-p distance vector and the n-p distance vector. The hatched parts of the distance table are registration data that have already been selected and had their distance calculated, or that have been pruned.

Let the first distance vector be x = (x_1, x_2, ..., x_m) and the second be y = (y_1, y_2, ..., y_m). One way to calculate their correlation is to follow the method described in Patent Document 1. That is, the correlation R(x, y) is obtained as

R(qe, pe) = Sxy / (Sxx + Syy)

where

Alternatively, as described in Patent Document 1, it may be obtained as

or as

where

Equation 9 is equivalent to using the city block distance as D_xy, but other distance measures such as the Euclidean distance, the Chebyshev distance, or a quadratic form distance may also be used.

Alternatively, as in Non-Patent Document 1, the distance between the vectors may be calculated instead of their correlation and used directly as the criterion for selecting the feature data 323 of the registration information 320. In that case, the criterion for selecting pivots and non-pivots in steps S308 and S310 is not "the largest correlation" but "the smallest distance". Non-Patent Document 1 uses the Chebyshev distance as the distance measure, but other measures such as the city block distance, the Euclidean distance, or a quadratic form distance may also be used.
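As a sketch of this distance-based criterion, the Python code below implements three of the named measures with their standard definitions and picks the candidate whose distance vector is closest to that of the query. The function pick_closest and the dictionary of candidate vectors are hypothetical; how the vectors themselves are assembled from the distance table is not shown.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def chebyshev(x: Vector, y: Vector) -> float:
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def city_block(x: Vector, y: Vector) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x: Vector, y: Vector) -> float:
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pick_closest(query_vec: Vector, candidate_vecs: Dict[int, Vector],
                 metric: Callable[[Vector, Vector], float]) -> int:
    """Select the candidate whose distance vector is nearest to the query's."""
    return min(candidate_vecs, key=lambda k: metric(query_vec, candidate_vecs[k]))


q_vec = [2.0, 5.0]                       # distances from the query to searched items
candidates = {7: [2.0, 9.0], 8: [4.0, 6.0]}  # same distances for two pending items
chosen = pick_closest(q_vec, candidates, chebyshev)
```

Swapping `chebyshev` for `city_block` or `euclidean` changes only the metric argument, which is why the choice of measure can be left open.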

The q-e, p-e, q-p, and n-p distance vectors may also be defined differently from the above. For example, the q-e distance vector may consist of the distances between the search feature and all non-pivots searched so far, and the p-e distance vector of the distances between a pivot and all non-pivots searched so far. In that case the pivot-to-pivot distances stored in the distance table are no longer used, so the distance table can be made even smaller. If step S307 is not executed, the q-e and p-e distance vectors are not needed.

The pruning process performed by the pruning unit 307a in step S315 is described in detail below. One way to perform pruning is to follow the method described in Non-Patent Document 1, as follows.
(1) Creation of the data set subject to the pruning decision
A set A* is created of the feature data 323 of the registration information 320 for which the pruning decision is to be made. A* can be created, for example, in any of the following ways.
a) A* is the set of all non-pivots whose distance has not yet been calculated and which have not been pruned.
b) If distances have been calculated for M/2 or more pivots, A* is created as in d); otherwise as in a).
c) If distances have been calculated for M/3 or more pivots, A* is created as in d); otherwise as in a).
d) A* is the set of all pivots and non-pivots whose distance has not yet been calculated and which have not been pruned.
(2) Selection of the data for the pruning decision
One item of feature data 323 (denoted A) is selected from A* and removed from A*. If A* is empty, the pruning process ends.
(3) Calculation of the distance d'
The distance d(Z_i, A) is obtained from the distance table.
(4) Pruning decision
If |d(Q, Z_i) - d(Z_i, A)| >= τ, the feature data A is pruned, meaning it will no longer be selected as a target of the similarity search. If the pruned feature data 323 is a pivot, M' is set to M' - 1. The process returns to (2).
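Steps (1) to (4) can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the pruning unit 307a: the names candidate_set and prune, the option letters passed as strings, and the table keyed by index pairs are hypothetical, and candidates whose distance to Z_i is not in the table are simply left unpruned.

```python
from typing import Dict, Iterable, Set, Tuple


def candidate_set(option: str, pending_pivots: Iterable[int],
                  pending_nonpivots: Iterable[int],
                  pivots_measured: int, m_total: int) -> Set[int]:
    """Step (1): build A* according to options a) to d)."""
    non_pivots_only = set(pending_nonpivots)
    everything = set(pending_pivots) | non_pivots_only
    if option == "a":
        return non_pivots_only
    if option == "b":
        return everything if pivots_measured >= m_total / 2 else non_pivots_only
    if option == "c":
        return everything if pivots_measured >= m_total / 3 else non_pivots_only
    if option == "d":
        return everything
    raise ValueError(f"unknown option: {option}")


def prune(d_qz: float, z: int, candidates: Set[int],
          table: Dict[Tuple[int, int], float], tau: float,
          pivot_ids: Set[int], m_remaining: int) -> Tuple[Set[int], int]:
    """Steps (2) to (4): return the pruned items and the updated M'."""
    pruned: Set[int] = set()
    for a in sorted(candidates):
        d_za = table.get((z, a))          # step (3): look up d(Z_i, A)
        if d_za is None:
            continue                      # not stored: leave A unpruned (assumption)
        if abs(d_qz - d_za) >= tau:       # step (4): pruning decision
            pruned.add(a)
            if a in pivot_ids:
                m_remaining -= 1          # one fewer unpruned pivot
    return pruned, m_remaining


table = {(0, 1): 4.0, (0, 2): 2.0, (0, 4): 1.0}   # d(Z_i, A) for Z_i = item 0
a_star = candidate_set("d", pending_pivots=[1], pending_nonpivots=[2, 4],
                       pivots_measured=1, m_total=2)
pruned, m_remaining = prune(2.0, 0, a_star, table, tau=1.0,
                            pivot_ids={0, 1}, m_remaining=2)
```

Options b) and c) delay the pruning of pivots until enough pivot distances are known, which trades a little extra distance calculation early on for more pivots remaining available as reference points.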

If the distance measure satisfies the triangle inequality (as the Hamming distance does), the distance between the authentication feature Q and any pruned feature data 323 is guaranteed to be τ or more. The number of feature data 323 subject to distance calculation can therefore be reduced without mistakenly pruning feature data 323 whose distance is less than τ. This shortens authentication time further and so improves convenience further.
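The guarantee can be checked in a few lines. The derivation below assumes that the distance d is symmetric in addition to satisfying the triangle inequality; symmetry is an added assumption here (it holds for the Hamming distance).

```latex
\begin{align*}
d(Q, Z_i) &\le d(Q, A) + d(A, Z_i) = d(Q, A) + d(Z_i, A), \\
d(Z_i, A) &\le d(Z_i, Q) + d(Q, A) = d(Q, Z_i) + d(Q, A), \\
\text{hence}\quad \lvert d(Q, Z_i) - d(Z_i, A) \rvert &\le d(Q, A).
\end{align*}
```

So whenever the pruning condition |d(Q, Z_i) - d(Z_i, A)| >= τ holds, d(Q, A) >= τ follows, and A could not have been judged similar in step S313.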

The above describes a method that uses distance, but the similarity search may use similarity instead. For example, a similarity table giving the similarity in feature space between each pivot and each item of registration feature data may be provided in place of the distance table, and the correlation calculation units 303 and 304 may calculate the correlation using similarity vectors made up of similarities between features in place of the distance vectors. The similarity calculation unit 306 may calculate a similarity rather than a distance, or may obtain a similarity by multiplying the calculated distance by -1. The determination unit 307 then determines that the search feature and a registration feature are similar when their similarity is greater than a predetermined threshold, and the pruning unit 307a prunes A when the difference between the similarity of Q and Z_i and the similarity of Z_i and A is equal to or greater than a predetermined threshold τ.

A second embodiment is described below with reference to the drawings. The similarity search system of this embodiment is a biometric authentication system: a user to be authenticated inputs biometric information, the system searches the database in the server terminal for similar biometric information to identify which registered user, if any, that user is, and authentication is performed based on the result.

FIG. 9 shows a configuration example of the similarity search system of this embodiment. Only the differences from FIG. 1 are described here.

This system comprises a registration terminal 100 that transmits registration information acquired from a user to the server terminal, a server terminal 200 that stores the registration information and performs a similarity search based on a search feature quantity, a client terminal 300 that transmits to the server terminal 200 a feature quantity extracted from the raw search data input by a user, and a network 400. For example, in a system in which information access control, such as authentication at PC login within a company or authentication when accessing a business system, is performed on the server side (hereinafter, an information access control system), the registration terminal 100 may be a PC and biometric information input sensor placed in a registration center (such as a room set aside for registration), the client terminal 300 a PC and biometric information input sensor in an office room, the server terminal 200 a server in a server room, and the network 400 the company intranet. In this example, the group ID 210 may be set to a different value for each room, each subnet, or each client terminal 300.

The search feature quantity-pivot correlation calculation unit 303, the search feature quantity-non-pivot correlation calculation unit 304, the registration feature quantity selection unit 305, the similarity calculation unit 306, the determination unit 307, and the pruning unit 307a are provided in the server terminal 200 rather than in the client terminal 300. The pruning unit 307a may, however, be omitted. The client terminal 300 has no database.

The feature quantity extraction unit 103 and the feature quantity extraction unit 302 may also be located on the server terminal 200 side, as in the third embodiment. In that case, the registration terminal 100 and the client terminal 300 transmit raw data instead of feature quantities, and the server terminal 200 extracts the feature quantities from the received raw data.

The hardware configurations of the registration terminal 100, the server terminal 200, and the client terminal 300 in this embodiment are the same as in FIG. 2.

The registration processing procedure and data flow in this embodiment are the same as in FIG. 3.

FIG. 10 shows the pre-processing procedure and data flow in this embodiment. Steps S201 to S203 in FIG. 10 are the same as steps S201 to S203 in FIG. 4.

FIG. 11 shows the search processing procedure and data flow in this embodiment. Only the differences from FIG. 5 are described here.
The CPU 500 of the server terminal 200 reads the master data 205 from the database 204 and loads it into the memory 501 (step S301). Alternatively, this step may be skipped and, immediately after step S303a, the registration information 220 and auxiliary information 230 of the master data 205 having the received group ID may be read from the database 204 and loaded into the memory 501. The former allows a faster search; the latter requires less free memory.
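As a non-limiting illustration, the two loading options can be sketched in Python as follows. The dictionary standing in for the database 204, the record layout, and the function names load_all and load_group are all hypothetical; the embodiment does not prescribe any particular data structure.

```python
from typing import Dict, List

# Hypothetical stand-in for database 204: group ID -> registered records.
Database = Dict[str, List[dict]]


def load_all(db: Database) -> Database:
    """Option 1 (step S301): load every group's master data into memory."""
    return {group_id: list(records) for group_id, records in db.items()}


def load_group(db: Database, group_id: str) -> List[dict]:
    """Option 2: after step S303a, load only the received group's data."""
    return list(db.get(group_id, []))
```

The first option pays the memory cost for all groups once and avoids a database read per search; the second keeps only one group's data in memory at the cost of a read after each request.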

The client terminal 300 transmits the group ID and the search feature quantity to the server terminal 200 (step S303a).
Steps S304 to S317 are performed by the server terminal 200 rather than by the client terminal 300. The registration information 220 and auxiliary information 230 used here are those of the master data 205 having the group ID received in step S303a.

When the search ends at step S314, and at step S317, the server terminal 200 transmits the search result to the client terminal 300. For example, for authentication at login to the client terminal 300 whose only purpose is to check whether the user being authenticated is registered in the DB, the server terminal 200 may transmit a message permitting access to the client terminal 300. For authentication when accessing a business system, where the purpose is to determine as accurately as possible who the user is, the server terminal 200 may transmit to the business system the registration data ID 221 of the registration feature quantity data 223 that achieved the smallest distance among those searched so far, and the business system may notify the client terminal 300 that access as that registration data ID 221 is permitted. The client terminal 300 performs processing according to what it receives from the server terminal 200. For example, upon receiving permission to log in to the client terminal 300, it executes the login.

A third embodiment is described below with reference to the drawings. The similarity search system of this embodiment is a similar image search system in which a user inputs an image and the system searches the database in the server terminal for similar images.

FIG. 12 shows a configuration example of the similarity search system of this embodiment. Only the differences from FIG. 9 are described here. In this embodiment, the raw data is image data.

This system comprises a server terminal 200 that stores registration information 220 in the database 204 and performs a similarity search based on a search feature quantity, a client terminal 300 that accepts raw search data as input from a user and transmits it to the server terminal 200, and a network 400. For example, the server terminal 200 may be a server operated by a company, the client terminal 300 a PC at the user's home, and the network 400 an in-house intranet.

This system has neither the registration terminal 100 nor the group ID 210. The feature quantity extraction unit 302 is provided in the server terminal 200. The extracted feature quantity may be, for example, a color histogram, and the distance between two feature quantities may be, for example, a quadratic form distance.
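As a non-limiting illustration, a quadratic form distance between two color histograms x and y is commonly written d(x, y) = sqrt((x - y)^T A (x - y)), where A is a symmetric, positive semidefinite matrix whose entry A[i][j] expresses how similar histogram bins i and j are. The Python sketch below assumes that form; the function name and the 3-bin matrix in the usage note are hypothetical and are not taken from the embodiment.

```python
import math
from typing import Sequence


def quadratic_form_distance(x: Sequence[float], y: Sequence[float],
                            a: Sequence[Sequence[float]]) -> float:
    """Quadratic form distance sqrt((x - y)^T A (x - y)).

    `a` is assumed symmetric and positive semidefinite so that the
    quantity under the square root is non-negative.
    """
    diff = [xi - yi for xi, yi in zip(x, y)]
    n = len(diff)
    total = sum(diff[i] * a[i][j] * diff[j]
                for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))
```

With a hypothetical matrix A = [[1, 0.5, 0], [0.5, 1, 0.5], [0, 0.5, 1]], in which neighboring bins are treated as partly similar, histograms concentrated in adjacent bins come out closer to each other than histograms concentrated in the two outer bins, which a plain Euclidean distance would not distinguish.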

The registration information 220 may be prepared, for example, by having the server terminal 200 store images collected from the Internet as raw data 222, assign each a registration data ID 221, and extract feature quantity data 223 from the raw data 222 in advance using the feature quantity extraction unit 302. Alternatively, as in the first and second embodiments, users may register data using a registration terminal.

The pre-processing procedure and data flow in this embodiment are the same as in FIG. 10.

FIG. 13 shows the search processing procedure and data flow in this embodiment. Only the differences from FIG. 5 are described here.
The client terminal 300 transmits the raw search data to the server terminal 200 over the network 400 through the communication I/F 308 (step S302a).
The feature quantity extraction process (step S303) is performed by the feature quantity extraction unit 302 of the server terminal 200 rather than by the client terminal 300.

When the search ends at step S314, and at step S317, the server terminal 200 transmits the search result to the client terminal 300, and the client terminal 300 outputs the search result to the output device 504. The search result may consist of the one or more items of registered raw data 222 whose distance is smaller than the threshold, or only the item of raw data 222 with the smallest distance among them. In the former case, the raw data 222 may be ranked in ascending order of distance and transmitted together with the ranks, and the client terminal 300 may display them on the output device 504.
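As a non-limiting illustration, the two ways of forming the search result can be sketched in Python as follows. The function name rank_results, the best_only flag, and the use of a dictionary from data ID to distance are hypothetical choices made for the sketch.

```python
from typing import Dict, List, Tuple


def rank_results(distances: Dict[str, float], threshold: float,
                 best_only: bool = False) -> List[Tuple[int, str, float]]:
    """Return (rank, data ID, distance) for data closer than the threshold.

    Ranks start at 1 and follow ascending distance. With best_only=True,
    only the closest item is returned.
    """
    hits = sorted((d, data_id) for data_id, d in distances.items()
                  if d < threshold)
    ranked = [(rank, data_id, d)
              for rank, (d, data_id) in enumerate(hits, start=1)]
    return ranked[:1] if best_only else ranked
```

The server terminal would send the returned list, together with the corresponding raw data, to the client terminal for display.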

The present invention is not limited to the embodiments described above and may be modified as appropriate without departing from its gist.

The present invention is applicable to any application that performs similarity search on multimedia data such as images, video, music, and documents. For example, it can be applied to similar image search systems, similar video search systems, similar music search systems, similar document search systems, entrance/exit management systems and attendance management systems based on biometric authentication, and PC login systems.

100 Registration terminal
101 ID acquisition unit
102 Raw data acquisition unit
103 Feature quantity extraction unit
200 Server terminal
201 Pivot determination unit
202 Distance table creation unit
205 Master data
210 Group ID
220 Registration information
221 Registration data ID
222 Raw data
223 Feature quantity
231 Pivot information
232 Distance table
300 Client terminal
303 Search feature quantity-pivot correlation calculation unit
304 Search feature quantity-non-pivot correlation calculation unit
305 Registration feature quantity selection unit
306 Similarity calculation unit
307 Determination unit
307a Pruning unit

Claims

A similarity search device for retrieving data similar to multimedia data, comprising:
a storage unit that stores a plurality of registration feature quantity data;
means for acquiring a search feature quantity extracted from the multimedia data;
a registration feature quantity selection unit that selects one of the plurality of registration feature quantity data;
a similarity calculation unit that calculates a similarity or distance in the feature space between the selected registration feature quantity data and the search feature quantity;
a determination unit that determines, based on the calculated similarity or distance, whether the selected registration feature quantity data and the search feature quantity are similar; and
a processing unit that performs processing according to the determination result of the determination unit,
wherein
the storage unit further stores representative data information indicating which of the plurality of registration feature quantity data have been selected as representative data, and the similarity or distance in the feature space between each representative data and each registration feature quantity data,
the registration feature quantity selection unit alternately selects, a predetermined number at a time, the representative data and non-representative data, which is registration feature quantity data other than the representative data, from among unselected registration feature quantity data, which is registration feature quantity data not yet selected, using the similarity or distance in the feature space between each representative data and each registration feature quantity data stored in the storage unit, and
the determination unit, when the determination result is that the two are not similar, repeats a series of processes of causing the registration feature quantity selection unit to select registration feature quantity data, causing the similarity calculation unit to calculate the similarity or distance, and making the determination, until a determination of similarity is made or the determination has been made for all the registration feature quantity data.

The similarity search device according to claim 1, further comprising
a correlation calculation unit that calculates, as a first correlation, a correlation between the search feature quantity and the non-representative data, using the similarity or distance in the feature space between each representative data and each registration feature quantity data stored in the storage unit,
wherein the registration feature quantity selection unit selects the non-representative data based on the calculated first correlation.

The similarity search device according to claim 2, wherein
the correlation calculation unit
calculates the first correlation using a distance vector representing the similarities or distances between the search feature quantity and the representative data and a distance vector representing the similarities or distances between the non-representative data and the representative data.

The similarity search device according to claim 2 or 3, wherein
the correlation calculation unit calculates, as a second correlation, a correlation between the search feature quantity and the representative data, and
the registration feature quantity selection unit selects the representative data based on the calculated second correlation.

The similarity search device according to claim 4, wherein
the correlation calculation unit
calculates the second correlation using a distance vector representing the distances between the search feature quantity and the registration feature quantity data already selected by the registration feature quantity selection unit, and a distance vector representing the distances between the representative data and the already selected registration feature quantity data.

The similarity search device according to claim 4, wherein
the correlation calculation unit
calculates the second correlation using a distance vector representing the distances between the search feature quantity and the non-representative data already selected by the registration feature quantity selection unit, and a distance vector representing the distances between the representative data and the already selected non-representative data.

The similarity search device according to claim 2 or 3, wherein
the registration feature quantity selection unit selects the representative data in a predetermined order.

The similarity search device according to any one of claims 1 to 7, wherein
the determination unit
determines that the selected registration feature quantity data and the search feature quantity are similar when the similarity between them in the feature space calculated by the similarity calculation unit is greater than a predetermined threshold, or when the distance is smaller than a predetermined threshold.

The similarity search device according to claim 8, further comprising
a pruning unit that, for unselected registration feature quantity data, when the difference between
the similarity or distance in the feature space between that data and the registration feature quantity data selected by the registration feature quantity selection unit, and
the similarity or distance in the feature space between the selected registration feature quantity data and the search feature quantity calculated by the similarity calculation unit
is greater than a predetermined threshold, determines that the registration feature quantity selection unit is not to select that unselected registration feature quantity data as a target of the similarity search.

The similarity search device according to any one of claims 1 to 9, wherein
the multimedia data is biometric information of a user to be authenticated.

The similarity search device according to any one of claims 1 to 10,
comprising, as the means for acquiring the search feature quantity:
a raw data acquisition unit that receives the multimedia data as input; and
a feature quantity extraction unit that extracts the feature quantity from the multimedia data received as input.

The similarity search device according to any one of claims 1 to 10,
connected to a terminal device via a network, wherein
the terminal device comprises:
a raw data acquisition unit that receives the multimedia data as input;
a feature quantity extraction unit that extracts the search feature quantity from the multimedia data received as input; and means for transmitting the extracted feature quantity to the similarity search device,
and
the similarity search device acquires the feature quantity by receiving it from the terminal device via the network.

A similarity search system comprising the similarity search device according to any one of claims 1 to 10, a registration terminal device, and a server device including a storage unit, wherein
the registration terminal device comprises:
means for receiving the multimedia data as input;
means for extracting registration feature quantity data from the acquired multimedia data; and
means for transmitting the extracted registration feature quantity data to the server device,
and
the server device comprises:
means for selecting a plurality of representative data from the plurality of received registration feature quantity data; and
means for transmitting, to the similarity search device, representative data information indicating which data have been selected as the plurality of representative data, together with the plurality of registration feature quantity data.

The similarity search system according to claim 13, wherein
the server device comprises:
means for calculating the similarity or distance in the feature space between each representative data and each registration feature quantity data; and
means for transmitting the calculated similarity or distance to the similarity search device.

A similarity search method for retrieving data similar to multimedia data, comprising:
a step of acquiring a search feature quantity extracted from the multimedia data;
a registration feature quantity selection step of selecting one of a plurality of registration feature quantity data stored in a storage unit;
a similarity calculation step of calculating a similarity or distance in the feature space between the selected registration feature quantity data and the search feature quantity;
a determination step of determining, based on the calculated similarity or distance, whether the selected registration feature quantity data and the search feature quantity are similar; and
a processing step of performing processing according to the result of the determination,
wherein
in the registration feature quantity selection step, representative data and non-representative data, which is registration feature quantity data other than the representative data, are alternately selected, a predetermined number at a time, from among unselected registration feature quantity data, which is registration feature quantity data not yet selected, using representative data information that is further stored in the storage unit and indicates which of the plurality of registration feature quantity data have been selected as representative data, and the similarity or distance in the feature space between each representative data and each registration feature quantity data, and
when the result of the determination in the determination step is that the two are not similar, the registration feature quantity selection step, the similarity calculation step, and the determination step are repeated until a determination of similarity is made or the determination has been made for all the registration feature quantity data.

A program that causes a computer to function as each unit of the apparatus according to any one of claims 1 to 14.