JPWO2017168798A1

JPWO2017168798A1 - Encrypted search index merge server, encrypted search index merge system, and encrypted search index merge method

Info

Publication number: JPWO2017168798A1
Application number: JP2018508358A
Authority: JP
Inventors: 通冶; 稔藤本
Original assignee: Hitachi Solutions Ltd
Current assignee: Hitachi Solutions Ltd
Priority date: 2016-03-30
Filing date: 2016-10-12
Publication date: 2019-07-25
Anticipated expiration: 2036-10-12
Also published as: JP6672451B2; WO2017168798A1

Abstract

Each of a first search index and a second search index includes one or more encrypted keywords generated by a first non-deterministic encryption algorithm, and the second search index includes one or more encrypted queries generated by a second non-deterministic encryption algorithm. In the process of merging the first search index and the second search index, a search index merge server executes a comparison process that compares an encrypted keyword included in the first search index with an encrypted query included in the second search index, and determines whether the compared encrypted keyword and the compared encrypted query were generated from the same keyword.

Description

Incorporation by reference

This application claims priority to Japanese Patent Application No. 2016-067699, filed on March 30, 2016, the contents of which are incorporated into this application by reference.

The present invention relates to an encrypted search index merge server, an encrypted search index merge system, and an encrypted search index merge method.

As background art in this technical field, there is JP 2015-35072 A (Patent Document 1). This publication states: "A registration client deposits, with a server, encrypted data in which the size of the search tag created for searching is compressed by a probabilistic encryption scheme that uses a hash value and a mask based on the output value of a homomorphic function. A search client likewise probabilistically encrypts a search keyword and transmits only part of the encrypted data to a management server as the encrypted search keyword. Without letting the management server remove the random-number mask from the encrypted data or the encrypted keyword, the search client has the management server retrieve the data matching the search, detects erroneous hits in the search results, and decrypts the search results." (see Abstract).

Patent Document 1: JP 2015-35072 A

The technique described in Patent Document 1 uses a search index encrypted with non-deterministic encryption and executes search processing without decrypting either the documents or the search index. Each search index described in Patent Document 1 includes a plurality of combinations, each consisting of an encrypted keyword (a keyword encrypted with non-deterministic encryption) and the metadata corresponding to that keyword.

As the number of search indexes increases, the total number of combinations of encrypted keywords and metadata also increases, so search processing slows down. To suppress this slowdown, merge processing is executed that, for example, merges a plurality of search indexes into a single search index.

In merge processing of unencrypted search indexes, when the same keyword is included in more than one of the search indexes to be merged, that keyword is associated with all of the metadata linked to it to form a single combination, which is stored in the search index that is the merge result. Such merge processing reduces the total number of keyword-metadata combinations.
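The plaintext merge described above can be sketched as follows. This is a minimal illustration only: the representation of an index as a mapping from keyword to a list of metadata entries, and the names `merge_plain_indexes`, `index_a`, and `index_b`, are assumptions made for the example and do not come from the publication.

```python
from collections import defaultdict


def merge_plain_indexes(*indexes):
    """Merge plaintext indexes that map keyword -> list of metadata entries."""
    merged = defaultdict(list)
    for index in indexes:
        for keyword, metadata in index.items():
            # Identical keywords collapse into one entry that keeps all metadata.
            merged[keyword].extend(metadata)
    return dict(merged)


# Hypothetical sample data: two small indexes sharing the keyword "apple".
index_a = {"apple": ["doc1"], "banana": ["doc2"]}
index_b = {"apple": ["doc3"], "cherry": ["doc4"]}
merged = merge_plain_indexes(index_a, index_b)
```

Here four keyword-metadata combinations across the two inputs become three in the merged index, because plaintext keywords can be compared directly for equality.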

However, because each encrypted keyword included in the search index described in Patent Document 1 is encrypted with non-deterministic encryption, encrypted keywords differ from one another as data even when they were generated from the same keyword. Consequently, in the technique described in Patent Document 1, the encrypted keywords included in the search indexes to be merged are, in principle, all different data, so executing the merge processing described above on the data while it remains encrypted cannot reduce the total number of combinations of encrypted keywords and metadata in the search index.
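The effect of non-deterministic encryption on equality-based merging can be illustrated with a toy example. The construction below (a fresh random nonce plus a SHAKE-256 keystream) is an assumption chosen only because it runs on the Python standard library; it is not the scheme of Patent Document 1 and is not intended for real use.

```python
import hashlib
import secrets


def toy_encrypt(key: bytes, keyword: str) -> bytes:
    """Toy randomized encryption: fresh 16-byte nonce, then keyword XOR keystream."""
    data = keyword.encode("utf-8")
    nonce = secrets.token_bytes(16)
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def toy_decrypt(key: bytes, blob: bytes) -> str:
    """Inverse of toy_encrypt: regenerate the keystream from the stored nonce."""
    nonce, body = blob[:16], blob[16:]
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


key = secrets.token_bytes(32)
c1 = toy_encrypt(key, "keyword1")
c2 = toy_encrypt(key, "keyword1")
# Both values decrypt to the same keyword, yet they differ as byte strings,
# so a merge that compares ciphertexts for equality finds no duplicates.
distinct_entries = len({c1, c2})
```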

In the technique described in Patent Document 1, decrypting the encrypted keywords would allow the same merge processing as for unencrypted search indexes, but decrypting the encrypted keywords lowers the security level.

Accordingly, an object of one aspect of the present invention is to merge search indexes without decrypting the keywords included in the encrypted search indexes and, in turn, to improve search processing speed while maintaining security.

To solve the above problem, one aspect of the present invention adopts, for example, the following configuration. A search index merge server that merges encrypted search indexes includes a processor and a storage device. The storage device holds a first search index and a second search index. Each of the first search index and the second search index holds, in association with each other, encryption sets generated from respective ones of one or more keywords and metadata corresponding to the respective keywords. Each encryption set of the first search index and of the second search index includes an encrypted keyword, and each encryption set of the second search index includes an encrypted query. Each encrypted keyword includes a ciphertext representing a keyword encrypted using a random number, and a search tag representing a value obtained by applying a transformation by a homomorphic function and an irreversible transformation to that random number. Each encrypted query includes a ciphertext representing a keyword encrypted using a random number, and a search tag representing a value obtained by applying a transformation by a homomorphic function to that random number. The processor executes merge processing that merges the first search index and the second search index to generate a third search index as the merge result. In the merge processing, the processor executes comparison processing that compares encrypted keywords included in the first search index with encrypted queries included in the second search index to identify encryption sets generated from the same keyword, and stores, in the third search index, an encryption set that includes a first encrypted keyword contained in the encryption sets generated from the same keyword, in association with the metadata linked to each of the identified encryption sets. In the comparison processing, the processor calculates a function value by applying the transformation by the homomorphic function to a value calculated from part or all of the ciphertext of a second encrypted keyword to be compared and the ciphertext of a first encrypted query to be compared; calculates an irreversibly transformed value by applying the irreversible transformation to a value calculated from the function value and the value indicated by the search tag of the first encrypted query; and determines, based on the result of comparing the irreversibly transformed value with the search tag of the second encrypted keyword, whether the encryption set including the second encrypted keyword and the encryption set including the first encrypted query were generated from the same keyword.
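The comparison processing in the configuration above can be illustrated with a toy model. Everything concrete in this sketch is an assumption made for illustration: XOR is used as the operation that combines values, `F` (XOR of bit rotations) stands in for the homomorphic function, SHA-256 stands in for the irreversible transformation, and a truncated HMAC stands in for the deterministic keyword ciphertext before masking. Any additional protection of the query's search tag (for example with the function value encryption key mentioned later in the description) is omitted.

```python
import hashlib
import hmac
import secrets

BITS = 128
MASK = (1 << BITS) - 1


def rotl(x: int, n: int) -> int:
    """Rotate a 128-bit integer left by n bits."""
    return ((x << n) | (x >> (BITS - n))) & MASK


def F(x: int) -> int:
    """Toy XOR-homomorphic function: F(a ^ b) == F(a) ^ F(b)."""
    return x ^ rotl(x, 1) ^ rotl(x, 5)


def H(x: int) -> bytes:
    """Stand-in irreversible transformation (SHA-256)."""
    return hashlib.sha256(x.to_bytes(BITS // 8, "big")).digest()


def intermediate(key: bytes, keyword: str) -> int:
    """Deterministic stand-in for the keyword ciphertext before masking."""
    mac = hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(mac[:16], "big")


def make_encrypted_keyword(key: bytes, keyword: str):
    r = secrets.randbits(BITS)
    # Ciphertext masked by r; search tag = irreversible(homomorphic(r)).
    return intermediate(key, keyword) ^ r, H(F(r))


def make_encrypted_query(key: bytes, keyword: str):
    r = secrets.randbits(BITS)
    # Ciphertext masked by r; search tag = homomorphic(r), not hashed.
    return intermediate(key, keyword) ^ r, F(r)


def same_keyword(encrypted_keyword, encrypted_query) -> bool:
    kw_cipher, kw_tag = encrypted_keyword
    q_cipher, q_tag = encrypted_query
    # Equal keywords cancel out, leaving only the XOR of the two random numbers.
    function_value = F(kw_cipher ^ q_cipher)
    # Removing the query's F(r') leaves F(r); hash it and compare with the tag.
    return hmac.compare_digest(H(function_value ^ q_tag), kw_tag)


key = secrets.token_bytes(32)
match = same_keyword(make_encrypted_keyword(key, "alpha"),
                     make_encrypted_query(key, "alpha"))
mismatch = same_keyword(make_encrypted_keyword(key, "alpha"),
                        make_encrypted_query(key, "beta"))
```

In this toy model the comparison never recovers the keyword or either random number; it only checks whether the masks cancel consistently, which is the property the merge processing relies on.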

According to one aspect of the present invention, search indexes can be merged without decrypting the keywords included in the encrypted search indexes. This, in turn, makes it possible to reduce the size of the search data and improve search processing speed while maintaining security.

Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

The drawings show the following, in order:
A block diagram showing an example of the overall configuration of the full-text search system in Embodiment 1.
A block diagram showing an example of the physical configuration of the index generation server in Embodiment 1.
An explanatory diagram showing an example of search index merge processing in Embodiment 1.
A sequence diagram showing an example of search index creation processing in Embodiment 1.
An explanatory diagram showing an example of random number generation processing in Embodiment 1.
An explanatory diagram showing an example of intermediate ciphertext generation processing in Embodiment 1.
An explanatory diagram showing an example of encrypted keyword generation processing in Embodiment 1.
An explanatory diagram showing an example of encrypted query generation processing in Embodiment 1.
A sequence diagram showing an example of search index merge processing in Embodiment 1.
An explanatory diagram showing an example of encryption set comparison processing in Embodiment 1.
An explanatory diagram showing an example of search index merge processing in Embodiment 2.
A block diagram showing an example of the overall configuration of the full-text search system in Embodiment 3.

Embodiments of the present invention are described below with reference to the accompanying drawings. Note that the embodiments are merely examples for realizing the present invention and do not limit its technical scope. Components common to the drawings are given the same reference numerals.

FIG. 1 is a block diagram showing an example of the overall configuration of the full-text search system of this embodiment. The full-text search system 100 is a system that executes index-based full-text search and includes, for example, a search engine server 120 and an index generation server 110 that are connected to each other. The index generation server 110 and the search engine server 120 may be implemented on a single computer.

The full-text search system 100, a user terminal 130 used by a user, and a key server 140 that stores the user's encryption keys are connected to one another via a network 150. The user terminal 130 holds the user's encryption key information. The user's encryption key information includes information that can identify the user's keys (for example, identifiers of the user's data encryption key, function value encryption key, and function value decryption key). The data encryption key, function value encryption key, function value decryption key, and random number secret key are described later.

The key server 140 holds the user's data encryption key, function value encryption key, and function value decryption key. The network 150 is, for example, the Internet, but may be a network within a given organization (for example, an intranet).

The search engine server 120 holds search index information, encrypted with a non-deterministic encryption scheme, for documents. The search engine server 120 searches, for example, for documents containing a keyword specified by the user, using the indexes stored in the index storage unit 113 described later.

The keywords included in the search indexes stored in the index storage unit 113 are encrypted by searchable encryption processing. Searchable encryption processing is a series of processes that generates a search index containing encrypted keywords and executes document searches using that search index without decrypting the encrypted keywords it contains. The searchable encryption processing in this embodiment is assumed to use non-deterministic encryption. That is, non-deterministic encryption is used to encrypt the keywords registered in a search index. In addition, when an encrypted keyword is looked up in a search index, searchable encryption processing generates an encrypted query by encrypting the keyword that the user specified for the search, and non-deterministic encryption is also used to generate the encrypted query. In this embodiment, for example, the searchable encryption processing described in Patent Document 1 can be used.

When searching for documents containing a keyword specified by the user, the search engine server 120 generates an encrypted query corresponding to that keyword by searchable encryption processing. By comparing the generated encrypted query with the encrypted keywords included in the search index, the search engine server 120 identifies the encrypted keyword generated from the same keyword as the one underlying the encrypted query, and thereby finds the documents containing that keyword.

Details of the differences between the encrypted keyword generation method and the encrypted query generation method in searchable encryption processing, of the processing that compares an encrypted keyword with an encrypted query, and of the document search method are described later.

The index generation server 110 includes, for example, an index generation unit 111, an index merge unit 112, an index storage unit 113, and a searchable encryption unit 114. The index generation unit 111 uses a document before encryption to generate a search index for searching that document.

The index merge unit 112 merges a plurality of search indexes to generate a single search index. The index storage unit 113 stores one or more search indexes. Each search index includes encrypted keywords and encrypted queries generated from the keywords in a document. Details of the search index are described later.

The searchable encryption unit 114 performs encryption processing. The searchable encryption unit 114 includes, for example, an encrypted keyword generation unit 115, an encrypted query generation unit 116, and a match determination unit 117. The encrypted keyword generation unit 115 generates an encrypted keyword from each keyword that the index generation unit 111 extracted from a document. The encrypted query generation unit 116 generates an encrypted query from each of those keywords. The match determination unit 117 determines whether an encrypted keyword and an encrypted query were generated from the same keyword.

FIG. 2 is a block diagram showing an example of the physical configuration of the index generation server 110. Although FIG. 2 shows the configuration of the index generation server 110, the search engine server 120, the user terminal 130, and the key server 140 may have the same configuration.

The index generation server 110 of this embodiment is a computer having a processor (CPU) 1, a memory 2, an auxiliary storage device 3, and a communication interface 4.

The processor 1 executes programs stored in the memory 2. The memory 2 includes a ROM, which is a non-volatile storage element, and a RAM, which is a volatile storage element. The ROM stores immutable programs (for example, a BIOS). The RAM is a fast, volatile storage element such as DRAM (Dynamic Random Access Memory) and temporarily stores the programs executed by the processor 1 and the data used while they run.

The auxiliary storage device 3 is a large-capacity, non-volatile storage device such as a magnetic storage device (HDD) or flash memory (SSD), and stores the programs executed by the processor 1 and the data used while they run. That is, a program is read from the auxiliary storage device 3, loaded into the memory 2, and executed by the processor 1.

The communication interface 4 is a network interface device that controls communication with other devices (the search engine server 120, the user terminal 130, the key server 140, and so on) according to a predetermined protocol.

The index generation server 110 may have an input interface 5 and an output interface 8. The input interface 5 is an interface to which a keyboard 6, a mouse 7, and the like are connected and which receives input from an operator. The output interface 8 is an interface to which a display device 9, a printer, and the like are connected and which outputs program execution results in a form the operator can view.

The programs executed by the processor 1 are provided to the index generation server 110 via removable media (CD-ROM, flash memory, and the like) or a network, and are stored in the non-volatile auxiliary storage device 3, which is a non-transitory storage medium. For this reason, the index generation server 110 preferably has an interface for reading data from removable media.

The index generation server 110 is a computer system configured on one physical computer or on a plurality of logically or physically configured computers. It may run in separate threads on the same computer, or on a virtual machine built on a plurality of physical computer resources.

FIG. 3 is an explanatory diagram showing an example of merge processing of the search indexes stored in the index storage unit 113. FIG. 3 shows an example in which search index 301 and search index 302, which are stored in the index storage unit 113 and are to be merged, are merged to generate search index 303 as the merge result.

Search index 301 includes, for example, a keyword dictionary 311 and metadata 321. The keyword dictionary 311 consists of one or more combinations of an encrypted keyword and an encrypted query. Each of these combinations is hereinafter called an encryption set. Metadata 321 includes the metadata linked to each encryption set. The metadata linked to an encryption set includes, for example, information indicating the documents that contain the keyword from which the encryption set was generated, the frequency with which the keyword appears in each document, and the locations where the keyword appears in each document.
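One possible in-memory shape for such a search index is sketched below. The class and field names (`EncryptionSet`, `Metadata`, `SearchIndex`, `add`) are hypothetical and chosen for this illustration; the publication does not prescribe a data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EncryptionSet:
    """One keyword dictionary entry: an encrypted keyword plus an encrypted query."""
    encrypted_keyword: bytes
    encrypted_query: bytes


@dataclass(frozen=True)
class Metadata:
    """Where a keyword occurs: document, frequency, and positions."""
    document: str
    frequency: int
    positions: Tuple[int, ...]


@dataclass
class SearchIndex:
    keyword_dictionary: List[EncryptionSet] = field(default_factory=list)
    metadata: Dict[EncryptionSet, List[Metadata]] = field(default_factory=dict)

    def add(self, enc_set: EncryptionSet, meta: Metadata) -> None:
        # Register the encryption set once, then link metadata to it.
        if enc_set not in self.metadata:
            self.keyword_dictionary.append(enc_set)
            self.metadata[enc_set] = []
        self.metadata[enc_set].append(meta)


# Placeholder byte strings stand in for real ciphertexts.
index = SearchIndex()
enc = EncryptionSet(b"Enckeyword1", b"Encquery1")
index.add(enc, Metadata("doc1", 2, (0, 17)))
```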

Similarly, search index 302 includes a keyword dictionary 312 and metadata 322, and search index 303 includes a keyword dictionary 313 and metadata 323. For a natural number X, "EnckeywordX" in FIG. 3 is the encrypted keyword obtained by encrypting the keyword "keywordX", and "EncqueryX" is the encrypted query obtained by encrypting "keywordX".

The index generation server 110 identifies encryption sets generated from the same keyword, and stores the encryption sets of keyword dictionary 311 with the metadata of metadata 321, and the encryption sets of keyword dictionary 312 with the metadata of metadata 322, in keyword dictionary 313 and metadata 323.

When encryption sets generated from the same keyword are included in both keyword dictionary 311 and keyword dictionary 312, the index generation server 110 consolidates those encryption sets and the metadata linked to each of them, and stores the result in search index 303.

Specifically, in the example of FIG. 3, an encryption set consisting of "Enckeyword1" and "Encquery1", both generated from "keyword1", is included in both keyword dictionary 311 and keyword dictionary 312. In this case, the index generation server 110 stores in keyword dictionary 313 an encryption set consisting of "Enckeyword1" from keyword dictionary 311 or 312 and "Encquery1" from keyword dictionary 311 or 312. The index generation server 110 also stores in metadata 323 both "MetaA", the metadata linked to "Enckeyword1" in keyword dictionary 311, and "MetaD", the metadata linked to "Enckeyword1" in keyword dictionary 312, and links them to that encryption set in keyword dictionary 313.
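The consolidation in this example can be sketched as a small merge routine. This is an illustrative sketch under several assumptions: each index is a list of `(encryption_set, metadata_list)` pairs, the match determination is supplied by the caller as `same_keyword`, and the demo "ciphertexts" are tuples that carry a hidden label purely so the example can run. The names `MetaB` and `MetaE` are hypothetical; only `MetaA` and `MetaD` appear in the description.

```python
def merge_encrypted_indexes(first, second, same_keyword):
    """Merge two indexes given as lists of (encryption_set, metadata_list).

    An encryption_set is an (encrypted_keyword, encrypted_query) pair.
    same_keyword(encrypted_keyword, encrypted_query) -> bool stands in for
    the match determination and is supplied by the caller.
    """
    merged = [(enc_set, list(meta)) for enc_set, meta in first]
    first_entries = list(merged)
    for enc_set_2, meta_2 in second:
        query_2 = enc_set_2[1]
        for (keyword_1, _), meta_1 in first_entries:
            # Compare a keyword of the first index with a query of the second.
            if same_keyword(keyword_1, query_2):
                meta_1.extend(meta_2)  # keep one encryption set, pool metadata
                break
        else:
            merged.append((enc_set_2, list(meta_2)))
    return merged


# Demo stand-in: each "ciphertext" is (hidden_label, nonce); a real matcher
# would work on ciphertexts and search tags without seeing any label.
def demo_same_keyword(encrypted_keyword, encrypted_query):
    return encrypted_keyword[0] == encrypted_query[0]


index_301 = [
    ((("keyword1", 11), ("keyword1", 12)), ["MetaA"]),
    ((("keyword2", 21), ("keyword2", 22)), ["MetaB"]),
]
index_302 = [
    ((("keyword1", 31), ("keyword1", 32)), ["MetaD"]),
    ((("keyword3", 41), ("keyword3", 42)), ["MetaE"]),
]
index_303 = merge_encrypted_indexes(index_301, index_302, demo_same_keyword)
```

The sketch keeps the encryption set from the first index when a match is found, which is one of the choices the example above allows; it also assumes that neither input index contains duplicate keywords internally.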

As described above, encrypted keywords are generated with non-deterministic encryption, so, for example, "Enckeyword1" in keyword dictionary 311 and "Enckeyword1" in keyword dictionary 312 have different values. Likewise, encrypted queries are generated with non-deterministic encryption, so, for example, "Encquery1" in keyword dictionary 311 and "Encquery1" in keyword dictionary 312 have different values. Details of the processing that determines whether these encryption sets were generated from the same keyword are described later.

FIG. 4 shows an example of the search index creation processing that accompanies the addition or update of a document. The user terminal 130, for example in accordance with an instruction from the user, logs in to the search engine server 120 and transmits the user's encryption key information and a document addition/update request to the search engine server 120 (S401). The document addition/update request includes document information that can identify the text in the document (for example, the document itself or the URL of the document).

The search engine server 120 transmits the document information and the encryption key information to the index generation unit 111 (S402). The index generation unit 111 extracts pre-encryption keywords and metadata from the text in the document indicated by the document information (S403).

Specifically, the index generation unit 111 extracts one or more keywords from the text using an algorithm such as morphological analysis or the N-gram method, and then extracts the metadata corresponding to each extracted keyword (S403). The index generation unit 111 transmits the encryption key information and the extracted keywords to the searchable encryption unit 114 (S404).
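As an illustration of step S403, the sketch below extracts character bigrams (the N-gram method with N = 2) together with simple metadata. The function name `extract_bigrams` and the metadata field names are assumptions made for this example; the publication does not fix N or the metadata format.

```python
from collections import defaultdict


def extract_bigrams(doc_id: str, text: str) -> dict:
    """Return {bigram: metadata} where metadata records document, frequency, positions."""
    positions = defaultdict(list)
    for i in range(len(text) - 1):
        positions[text[i:i + 2]].append(i)
    return {
        gram: {"document": doc_id, "frequency": len(pos), "positions": pos}
        for gram, pos in positions.items()
    }


# Hypothetical input: the bigram "ab" occurs twice, at offsets 0 and 3.
entries = extract_bigrams("doc1", "abcab")
```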

The searchable encryption unit 114 transmits the encryption key information to the key server 140 (S405). The key server 140 transmits the user's data encryption key, function value encryption key, function value decryption key, and random number secret key indicated by the encryption key information to the searchable encryption unit 114 (S406). Because the function value decryption key is not used in the processing of FIG. 4 (it is used in the processing of FIG. 9, described later), the function value decryption key need not be exchanged in steps S405 and S406.

The searchable encryption unit 114 uses the received data encryption key and the extracted keywords to generate an encrypted keyword corresponding to each extracted keyword (S407). Details of the encrypted keyword generation processing in step S407 are described later.

The searchable encryption unit 114 uses the received data encryption key and function value encryption key, together with the extracted keywords, to generate an encrypted query corresponding to each extracted keyword (S408). Details of the encrypted query generation processing in step S408 are described later.

For each extracted keyword, the searchable encryption unit 114 generates an encryption set, which is the combination of the encrypted keyword and the encrypted query corresponding to that keyword, and transmits an encrypted keyword dictionary consisting of the generated encryption sets to the index generation unit 111 (S409). In step S409, the searchable encryption unit 114 also transmits to the index generation unit 111 information identifying the keyword that corresponds to each encryption set in the encrypted keyword dictionary.

The index generation unit 111 associates each encryption set in the encryption keyword dictionary with the metadata generated from the same keyword, generates a search index consisting of the encryption keyword dictionary and the metadata, and stores the generated search index in the index storage unit 113 (S410). The index generation unit 111 transmits a search index generation completion notification to the search engine server 120 (S411). The search engine server 120 reads the search index stored in the index storage unit 113 (S412).

An example of the process of generating an encryption keyword and an encryption query is described below. The example generates one encryption keyword and one encryption query from one keyword.

<Method of generating an encryption keyword>
An example of the encryption keyword generation process in step S407 is described with reference to FIG. 6 and FIG. 7.

The encryption keyword generation unit 115 divides the keyword into blocks of a predetermined size that the searchable encryption unit 114 can process. For example, when the searchable encryption unit 114 implements the AES symmetric-key cipher, the encryption keyword generation unit 115 divides the keyword into 128-bit blocks M1, M2, ..., Mn, as shown in FIG. 5B.

Using a predetermined initial vector and the data encryption key, the encryption keyword generation unit 115 encrypts each block of the divided keyword to generate the blocks C1, C2, ..., Cn of an intermediate encryption keyword.

When creating each block of the intermediate encryption keyword, the encryption keyword generation unit 115 uses the block of the intermediate encryption keyword that has already been generated. For example, as shown in FIG. 5B, the encryption keyword generation unit 115 computes the exclusive OR (XOR) of the already generated block of the intermediate encryption keyword and the next keyword block, encrypts the result, and thereby creates the next block of the intermediate encryption keyword. Consequently, the block Cn of the intermediate encryption keyword corresponding to the block Mn reflects not only the content of Mn but also the content of the other blocks M1, M2, ..., Mn-1.
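The chaining described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names are hypothetical, zero padding of the last block is an assumption, and `toy_block_encrypt` is an HMAC-based stand-in for AES block encryption (the Python standard library has no AES), so it only shows how each block Ci depends on all earlier keyword blocks.

```python
import hashlib
import hmac

BLOCK = 16  # 128-bit blocks, matching the AES block size mentioned in the text


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES block encryption (NOT a real cipher): a keyed PRF output."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def split_blocks(keyword: bytes) -> list[bytes]:
    """Zero-pad the keyword (an assumption) and split it into blocks M1..Mn."""
    padded = keyword + b"\x00" * (-len(keyword) % BLOCK)
    blocks = [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]
    return blocks or [b"\x00" * BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def intermediate_blocks(key: bytes, iv: bytes, keyword: bytes) -> list[bytes]:
    """Chain the blocks: C_i = E(key, M_i xor C_{i-1}), starting from C_0 = iv."""
    prev, out = iv, []
    for m in split_blocks(keyword):
        prev = toy_block_encrypt(key, xor(m, prev))
        out.append(prev)
    return out
```

With the same key, initial vector, and keyword the function returns the same blocks, which is the property the later comparison of Cn relies on.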

The encryption keyword generation unit 115 generates a random number for each block of the intermediate encryption keyword. Specifically, for example, the encryption keyword generation unit 115 uses a pseudo-random number generator to generate a random number for each of the n blocks of the intermediate encryption keyword. The index generation server 110 holds the pseudo-random number generator in advance, for example.

For example, as shown in FIG. 5A, the encryption keyword generation unit 115 inputs data obtained by concatenating the initial vector and a constant, together with the random number secret key K2, into the pseudo-random number generator (RNG) and generates n 128-bit random numbers R1, R2, ..., Rn.
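A possible shape for this keyed random number generation is sketched below. The text does not specify the RNG, so HMAC-SHA256 in counter mode is used here purely as an assumed stand-in; `random_blocks` and its parameters are hypothetical names.

```python
import hashlib
import hmac


def random_blocks(k2: bytes, iv: bytes, constant: bytes, n: int) -> list[bytes]:
    """Derive n 128-bit values R1..Rn from the seed (iv || constant) under key k2.

    HMAC-SHA256 in counter mode is an assumed stand-in for the unspecified RNG.
    """
    seed = iv + constant
    return [
        hmac.new(k2, seed + i.to_bytes(4, "big"), hashlib.sha256).digest()[:16]
        for i in range(1, n + 1)
    ]
```

Because the output depends only on K2, the initial vector, and the constant, anyone holding K2 can regenerate the same R1..Rn from a stored initial vector, while a fresh initial vector gives fresh random numbers.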

The encryption keyword generation unit 115 inputs the n-th random number Rn into a predetermined homomorphic function and obtains the output as the function value X. For example, as shown in FIG. 7, the encryption keyword generation unit 115 inputs the 128-bit random number into the homomorphic function and obtains a 96-bit function value.

Here, a homomorphic function F is a function for which Equation 1 below holds for input variables x and y.
(Equation 1) F(x · y) = F(x) ∘ F(y)

The symbols "·" and "∘" each denote a binary operation, such as addition (+), multiplication (*), or the bitwise exclusive OR (XOR, written xor). When both "·" and "∘" in Equation 1 are the XOR operation, Equation 2 below holds.
(Equation 2) F(x xor y) = F(x) xor F(y)
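To make Equation 2 concrete, the following sketch defines one possible XOR-homomorphic function from 128 bits to 96 bits and checks the identity on sample inputs. The folding construction is an assumption chosen for illustration (any map that is linear over GF(2) satisfies Equation 2); the text does not say which function the embodiment uses.

```python
def f_homomorphic(x: bytes) -> bytes:
    """Map a 128-bit value to a 96-bit value in a way that commutes with XOR.

    The last 4 bytes are XOR-folded into the first 4 bytes and the middle
    8 bytes are copied, which is linear over GF(2) and so XOR-homomorphic.
    """
    assert len(x) == 16
    folded = bytes(a ^ b for a, b in zip(x[:4], x[12:]))
    return folded + x[4:12]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


x = bytes(range(16))
y = bytes(range(100, 116))
# Equation 2: F(x xor y) == F(x) xor F(y)
assert f_homomorphic(xor(x, y)) == xor(f_homomorphic(x), f_homomorphic(y))
```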

The encryption keyword generation unit 115 applies a predetermined irreversible transformation to the function value X and obtains the result as the irreversible transformation value H. For example, when the irreversible transformation is the hash function SHA-256, the encryption keyword generation unit 115 converts the 96-bit function value X into a 256-bit hash value (the irreversible transformation value).

The encryption keyword generation unit 115 extracts, from the irreversible transformation value H, the number of bits indicated by a predetermined tag length and uses them as the search tag Dn+1 for the encryption keyword. For example, as shown in FIG. 6, the encryption keyword generation unit 115 extracts the least significant 32 bits of the 256-bit hash value to obtain the search tag Dn+1. The result is search data that is smaller than the original data. The bits extracted from the irreversible transformation value H need not be the least significant bits: the most significant bits, bits at predetermined positions, or randomly chosen bits may be extracted instead. The number of bits to select is also arbitrary.
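The tag derivation in the SHA-256 example can be sketched with the standard library as follows. The function name and the `tag_bits` parameter are hypothetical; only the choice of SHA-256 and the least significant 32 bits comes from the example in the text.

```python
import hashlib


def search_tag(function_value: bytes, tag_bits: int = 32) -> bytes:
    """Hash the function value with SHA-256 and keep the least significant tag_bits bits."""
    digest = hashlib.sha256(function_value).digest()
    h = int.from_bytes(digest, "big")
    tag = h & ((1 << tag_bits) - 1)
    return tag.to_bytes((tag_bits + 7) // 8, "big")
```

With the default of 32 bits, the tag equals the last four bytes of the digest, so a 96-bit function value is reduced to a 4-byte search tag.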

For each of the n blocks of the intermediate encryption keyword, the encryption keyword generation unit 115 computes the exclusive OR (XOR) of the block and the corresponding random number, as shown in Equation 3 below, and obtains the outputs D1, D2, ..., Dn as the ciphertext body (that is, the part corresponding to the encrypted keyword).
(Equation 3) Di = Ci xor Ri (i = 1, ..., n)

The encryption keyword generation unit 115 concatenates the initial vector, the ciphertext body consisting of D1, D2, ..., Dn, and the search tag Dn+1, and uses the result as the encryption keyword.

The above procedure for creating the concealed data need not follow the order described and may be carried out in a different order.

<Method of generating an encryption query>
An example of the encryption query generation process in step S408 is described with reference to FIG. 7.

The encryption query generation unit 116 acquires the keyword and divides it into blocks of a predetermined size that the searchable encryption unit 114 can process. For example, the encryption query generation unit 116 divides the keyword into 128-bit blocks M1, M2, ..., Mn, in the same way as the keyword division in the example of FIG. 5B.

Using a predetermined initial vector and the data encryption key, the encryption query generation unit 116 encrypts each block of the divided keyword to generate an intermediate encryption query consisting of n blocks C1, C2, ..., Cn.

As in the generation of the encryption keyword, the encryption query generation unit 116 uses the already created block of the intermediate encryption query to create the next block. For example, as shown in FIG. 5B, the encryption query generation unit 116 XORs the already created block of the intermediate encryption query with the next keyword block, encrypts the result, and thereby creates the next block of the intermediate encryption query.

For example, the encryption query generation unit 116 inputs the initial vector (W0) and the random number secret key (K2) into the pseudo-random number generator and generates a single random number R'n, which is XORed with the n-th block Cn of the intermediate encryption query.

The encryption query generation unit 116 inputs the random number R'n into the homomorphic function and obtains the output as the function value X. This homomorphic function must be the same as the one used to generate the encryption keyword. For example, as shown in FIG. 7, the encryption query generation unit 116 inputs the 128-bit random number R'n into the homomorphic function and obtains a 96-bit function value X.

The encryption query generation unit 116 encrypts the function value X with the function value encryption key (K3) and obtains the result as the search tag Wn+1 for the encryption query. For example, as shown in FIG. 7, the encryption query generation unit 116 encrypts the 96-bit function value X using the function value encryption key (K3) and the initial vector (W0), outputs a 128-bit ciphertext, and uses it as the search tag Wn+1 for the encryption query.

The encryption query generation unit 116 computes the exclusive OR (XOR) of the n-th block Cn of the intermediate encryption query and the random number R'n, and obtains the output Wn as the ciphertext body encrypted for the query.

The encryption query generation unit 116 concatenates the initial vector W0, the ciphertext body Wn, and the search tag Wn+1 for the encryption query, and uses the result as the encryption query. The above procedure for creating the encryption query need not follow the order described and may be carried out in a different order.
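The last two steps of the query generation, encrypting the function value into the search tag and concatenating the three parts, might look like the sketch below. All names are hypothetical. The encryption is an HMAC-derived keystream XOR used only as a stand-in for the real cipher keyed with K3, and it assumes the symmetric case in which the function value encryption key and decryption key are the same.

```python
import hashlib
import hmac


def _pad(k3: bytes, w0: bytes) -> bytes:
    """128-bit keystream derived from K3 and W0 (stand-in for a real cipher)."""
    return hmac.new(k3, w0, hashlib.sha256).digest()[:16]


def encrypt_function_value(k3: bytes, w0: bytes, x: bytes) -> bytes:
    """Encrypt the 96-bit function value X into a 128-bit search tag Wn+1."""
    assert len(x) == 12 and len(w0) == 16
    padded = x + b"\x00" * 4  # zero padding to 128 bits is an assumption
    return bytes(a ^ b for a, b in zip(padded, _pad(k3, w0)))


def decrypt_function_value(k3: bytes, w0: bytes, tag: bytes) -> bytes:
    """Recover X from Wn+1, as the match determination does later."""
    padded = bytes(a ^ b for a, b in zip(tag, _pad(k3, w0)))
    return padded[:12]


def assemble_query(w0: bytes, w_n: bytes, tag: bytes) -> bytes:
    """Encryption query = W0 || Wn || Wn+1 (three 128-bit blocks)."""
    return w0 + w_n + tag
```

Only a holder of the function value decryption key can recover X from the tag, which is why the match determination described later needs that key but not the data encryption key.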

FIG. 8 shows an example of the process of merging a plurality of search indexes. First, the search engine server 120 selects, in accordance with a predetermined policy, a plurality of search indexes to be merged from the search indexes stored in the index storage unit 113 (S801).

Specifically, the search engine server 120 starts the process of step S801 when, for example, it determines that the number of search indexes stored in the index storage unit 113 is equal to or greater than a predetermined number, a predetermined time has elapsed since the previous merge process, or an administrator of the search engine server 120 directly instructs the server to merge indexes. The search engine server 120 may also start the process of step S801 when it determines that a new search index has been generated.

The search engine server 120 selects, for example, all search indexes stored in the index storage unit 113 as merge targets. Alternatively, the search engine server 120 may select the search indexes to be merged so that the total number of encryption keywords contained in the keyword dictionaries of the selected search indexes is equal to or greater than a predetermined number.

Next, the search engine server 120 transmits information indicating the selected search indexes to be merged to the index merge unit 112 (S802). The index merge unit 112 acquires the search indexes indicated by the received information from the index storage unit 113 and transmits the keyword dictionaries of the acquired search indexes to the searchable encryption unit 114 (S803).

The match determination unit 117 identifies, among the encryption sets contained in the received keyword dictionaries, the encryption sets generated from the same keyword (S804). The match determination unit 117 compares the first encryption keyword contained in a first encryption set with the encryption query contained in a second encryption set to determine whether the pre-encryption keyword corresponding to the first encryption set matches the pre-encryption keyword corresponding to the second encryption set.

For example, the match determination unit 117 executes the process of step S804 by performing this comparison, for every encryption set contained in the received keyword dictionaries, against the encryption sets contained in the other keyword dictionaries, that is, those to which the encryption set does not belong. Details of the comparison process are described later.

The match determination unit 117 transmits the determination result of step S804 to the index merge unit 112 (S805). Based on the received determination result, the index merge unit 112 merges the search indexes to be merged into a single search index, stores the generated search index in the index storage unit 113, and deletes the source search indexes that were merged from the index storage unit 113 (S806).

The merge process of step S806 is described below. The index merge unit 112 refers to the determination result, identifies encryption set groups, each consisting of the encryption sets generated from the same keyword, and performs the following processing on each encryption set group.

For an encryption set group consisting of a single encryption set, the index merge unit 112 stores that encryption set in the keyword dictionary of the merge result, stores the metadata of the merge target associated with that encryption set in the metadata of the merge result, and associates the encryption set with the metadata in the merge result.

In the example of FIG. 3, the encryption set consisting of "Enckeyword2" and "Encquery2", generated from "keyword2", is contained only in the search index 301. That is, only one encryption set is generated from "keyword2", so this encryption set and "MetaB", the metadata associated with it, are stored in the search index 303 as they are.

For an encryption set group consisting of a plurality of encryption sets, the index merge unit 112 stores in the keyword dictionary of the merge result, for example, an encryption set consisting of an encryption keyword and an encryption query selected at random from those encryption sets. The index merge unit 112 also acquires the metadata associated with each of those encryption sets and stores the acquired metadata in the metadata of the merge result. In the merge result, the index merge unit 112 associates the stored encryption set with that metadata.

In the example of FIG. 3, the search index 301 and the search index 302 each contain an encryption set generated from "keyword1". The index merge unit 112 therefore stores an encryption set consisting of an encryption keyword and an encryption query selected at random from those encryption sets in the keyword dictionary of the search index 303. The index merge unit 112 also stores "MetaA", the metadata associated with the encryption set in the search index 301, and "MetaD", the metadata associated with the encryption set in the search index 302, in the metadata of the search index 303, and associates the stored encryption set with that metadata in the search index 303.
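The grouping and merging of step S806 can be sketched as below, using the FIG. 3 values. This is an illustrative simplification: the data layout, `merge_indexes`, and the names "Enckeyword1b" and "Encquery1b" for the encryption set of the search index 302 are hypothetical, one whole encryption set is picked at random as the representative, and `demo_same_keyword` is a lookup-table stand-in for the match determination of step S804.

```python
import random
from typing import Callable

EncryptionSet = tuple[str, str]           # (encryption keyword, encryption query)
Index = dict[EncryptionSet, list[str]]    # encryption set -> associated metadata


def merge_indexes(indexes: list[Index],
                  same_keyword: Callable[[EncryptionSet, EncryptionSet], bool],
                  rng: random.Random) -> Index:
    """Group encryption sets by keyword, keep one per group, and pool the metadata."""
    groups: list[list[tuple[EncryptionSet, list[str]]]] = []
    for index in indexes:
        for enc_set, metadata in index.items():
            for group in groups:
                if same_keyword(group[0][0], enc_set):
                    group.append((enc_set, metadata))
                    break
            else:  # no existing group matched: start a new one
                groups.append([(enc_set, metadata)])
    merged: Index = {}
    for group in groups:
        representative = rng.choice(group)[0]  # random pick within the group
        merged[representative] = [m for _, metadata in group for m in metadata]
    return merged


# Hypothetical stand-in for the match determination (S804); a real system never
# sees the plaintext keywords.
PLAINTEXT = {"Enckeyword1": "keyword1", "Enckeyword2": "keyword2",
             "Enckeyword1b": "keyword1"}


def demo_same_keyword(a: EncryptionSet, b: EncryptionSet) -> bool:
    return PLAINTEXT[a[0]] == PLAINTEXT[b[0]]


index_301 = {("Enckeyword1", "Encquery1"): ["MetaA"],
             ("Enckeyword2", "Encquery2"): ["MetaB"]}
index_302 = {("Enckeyword1b", "Encquery1b"): ["MetaD"]}
index_303 = merge_indexes([index_301, index_302], demo_same_keyword, random.Random(0))
```

In `index_303`, the "keyword2" entry keeps "MetaB" unchanged, while the single entry kept for "keyword1" carries both "MetaA" and "MetaD".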

Next, the index merge unit 112 transmits a search index merge completion notification to the search engine server 120 (S807). The search engine server 120 reads the search index stored in the index storage unit 113 (S808).

An example of the encryption set comparison process performed by the match determination unit 117 in step S804 is described below with reference to FIG. 9. Specifically, the example shows a process in which the match determination unit 117 compares the first encryption keyword contained in the first encryption set with the second encryption query contained in the second encryption set to determine whether the first encryption set and the second encryption set were generated from the same keyword.

If the index generation server 110 did not acquire the function value decryption key in steps S405 to S406, the match determination unit 117 acquires from the key server 140 the function value decryption key of the user of the search index that contains the second encryption set.

The match determination unit 117 acquires the ciphertext body of the first encryption keyword and extracts the n-th of the blocks into which it was divided at the size processed by the encryption keyword generation unit 115. For example, the match determination unit 117 regards the first encryption keyword D as a set of blocks D0, D1, D2, ..., Dn, Dn+1 and extracts the data Dn.

The match determination unit 117 acquires the ciphertext body of the second encryption query. For example, the match determination unit 117 regards the second encryption query W as a set of three blocks W0, Wn, and Wn+1 and extracts the second block, Wn.

The match determination unit 117 computes the exclusive OR (XOR) of the block Dn contained in the ciphertext body of the first encryption keyword and the ciphertext body Wn of the second encryption query in accordance with Equation 4 below. Here the first Cn is the n-th block of the intermediate encryption keyword and the second Cn is the n-th block of the intermediate encryption query.
(Equation 4) Dn xor Wn = (Cn xor Rn) xor (Cn xor R'n)

If the pre-encryption keyword of the first encryption keyword and the pre-encryption keyword of the second encryption query have the same value, the intermediate encryption keyword and the intermediate encryption query obtained by encrypting them are equal, so the two Cn terms cancel and Equation 5 below follows.
(Because ¬(A xor B) = A·B + ¬A·¬B, where ¬ denotes negation or complement.)
(Equation 5) Dn xor Wn = Rn xor R'n
That is, only the information of the random numbers (Rn and R'n) remains in Equation 5.
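The cancellation behind Equation 5 can be checked directly. The snippet below is a small illustration with randomly chosen 128-bit values; the variable names are hypothetical, with `r_q` standing for R'n.

```python
import os


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


c_n = os.urandom(16)   # shared intermediate block Cn (same keyword on both sides)
r_n = os.urandom(16)   # random number Rn used for the encryption keyword
r_q = os.urandom(16)   # random number R'n used for the encryption query

d_n = xor(c_n, r_n)    # Dn = Cn xor Rn   (Equation 3 with i = n)
w_n = xor(c_n, r_q)    # Wn = Cn xor R'n

# Equation 5: the Cn terms cancel and only the random numbers remain.
assert xor(d_n, w_n) == xor(r_n, r_q)
```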

The match determination unit 117 inputs the result of this exclusive OR into the homomorphic function and obtains the function value Y. This homomorphic function must be the same as the one used in the encryption keyword generation process of FIG. 6 and the encryption query generation process of FIG. 7.

For example, as shown in FIG. 9, the match determination unit 117 inputs the exclusive OR (XOR) of the n-th 128-bit block Dn of the ciphertext body of the first encryption keyword and the 128-bit ciphertext body of the second encryption query into the homomorphic function and obtains, for example, a 96-bit function value Y, as shown in Equation 6 below.
(Equation 6) Y = F(Dn xor Wn)
When Equation 5 holds, Equation 7 below follows from Equation 6.
(Equation 7) Y = F(Rn xor R'n)

The match determination unit 117 acquires the search tag of the second encryption query. For example, the match determination unit 117 regards the second encryption query W as a set of three blocks W0, Wn, and Wn+1 and extracts the third block, Wn+1.

The match determination unit 117 decrypts the search tag Wn+1 of the second encryption query using the function value decryption key of the user corresponding to the second encryption set and obtains the decryption result as the function value X. Because Wn+1 was generated by encrypting the output of the homomorphic function for the random number R'n, the function value X is expressed by Equation 8 below using R'n and the homomorphic function F of Equation 2.
(Equation 8) X = F(R'n)

The match determination unit 117 computes the exclusive OR (XOR) of the function value X and the function value Y and obtains the result as the function value Z. Equation 9 below holds for the function value Z.
(Because A xor (A xor B) = A·¬(A xor B) + ¬A·(A xor B) = A·B + ¬A·B = B, applying the exclusive OR with other data (A) to data (B) twice yields the original data (B).)
(Equation 9)
Z = X xor Y
= F(R'n) xor F(Rn xor R'n)
= F(R'n) xor (F(Rn) xor F(R'n)) (from Equation 2)
= F(Rn)

The match determination unit 117 applies the irreversible transformation to the function value Z and obtains the result as the irreversible transformation value H. This irreversible transformation must be the same as the one used in the encryption keyword generation process of FIG. 6. For example, as shown in FIG. 9, when the irreversible transformation is the hash function SHA-256, the 96-bit function value Z, which is the exclusive OR of the function value X and the function value Y, is converted into a 256-bit hash value (the irreversible transformation value).

From the irreversible transformation value H, the match determination unit 117 extracts the number of bits indicated by the predetermined tag length used in the encryption keyword generation process of FIG. 6 and uses them as the verification data D'n+1. For example, as shown in FIG. 9, the match determination unit 117 extracts the least significant 32 bits of the 256-bit hash value to obtain the verification data D'n+1. The bits extracted from the irreversible transformation value H need not be the least significant bits: the most significant bits, bits at predetermined positions, or randomly chosen bits may be extracted instead, as long as the same bits are selected as in the encryption keyword generation process. The number of bits to select is also arbitrary.

The match determination unit 117 acquires the search tag of the first encryption keyword. For example, the match determination unit 117 extracts the data Dn+1 from the first encryption keyword D.

The match determination unit 117 compares the verification data D'n+1 with the search tag of the first encryption keyword. If they are identical, it determines that the first encryption set and the second encryption set were generated from the same keyword; otherwise, it determines that they were generated from different keywords.

For example, as shown in FIG. 9, the match determination unit 117 compares the search tag Dn+1 of the first encryption keyword with the verification data D'n+1. If they are identical, it determines that the first encryption set and the second encryption set were generated from the same keyword; otherwise, it determines that they were generated from different keywords. The match determination unit 117 may additionally perform, for example, the detection of erroneous search hits described in Patent Document 1 to identify combinations of encryption sets that were wrongly determined to have been generated from the same keyword, and change the match determination result for the identified combinations.

Through the above process, the match determination unit 117 can determine whether the first encryption set and the second encryption set were generated from the same keyword without decrypting the encryption keywords and encryption queries contained in them. The above procedure for searching the concealed data need not follow the order described and may be carried out in a different order.
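The whole comparison can be sketched end to end as follows. This is a toy model under loudly stated assumptions, not the embodiment's implementation: HMAC-SHA256 stands in for both AES and the RNG, the chaining uses a fixed initial vector so that equal keywords give equal Cn (the per-ciphertext initial vectors D0 and W0 only seed the random numbers), the homomorphic function is a simple XOR fold, and the function value encryption and decryption keys are the same key K3. All function names are hypothetical.

```python
import hashlib
import hmac
import os

BLOCK = 16                    # 128-bit blocks
FIXED_IV = b"\x00" * BLOCK    # assumption: chaining IV shared by keywords and queries


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


def prf(key: bytes, data: bytes) -> bytes:
    """Keyed stand-in for AES and the RNG (HMAC-SHA256, NOT the real primitives)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]


def chain(k1: bytes, keyword: bytes) -> list[bytes]:
    """Intermediate blocks C_i = E(k1, M_i xor C_{i-1}) over the zero-padded keyword."""
    prev, out = FIXED_IV, []
    for i in range(0, max(len(keyword), 1), BLOCK):
        block = keyword[i:i + BLOCK].ljust(BLOCK, b"\x00")
        prev = prf(k1, xor(block, prev))
        out.append(prev)
    return out


def f_hom(x: bytes) -> bytes:
    """XOR-homomorphic map from 128 bits to 96 bits (assumed example function)."""
    return xor(x[:4], x[12:]) + x[4:12]


def tag32(value: bytes) -> bytes:
    """Least significant 32 bits of SHA-256(value)."""
    return hashlib.sha256(value).digest()[-4:]


def make_keyword(k1: bytes, k2: bytes, keyword: bytes) -> dict:
    iv = os.urandom(BLOCK)                                              # D0
    c = chain(k1, keyword)
    r = [prf(k2, iv + i.to_bytes(4, "big")) for i in range(len(c))]     # R1..Rn
    body = [xor(ci, ri) for ci, ri in zip(c, r)]                        # D1..Dn
    return {"iv": iv, "body": body, "tag": tag32(f_hom(r[-1]))}         # tag = Dn+1


def make_query(k1: bytes, k2: bytes, k3: bytes, keyword: bytes) -> dict:
    w0 = os.urandom(BLOCK)                                              # W0
    r_q = prf(k2, w0)                                                   # R'n
    enc_x = xor(f_hom(r_q) + b"\x00" * 4, prf(k3, w0))                  # Wn+1
    return {"iv": w0, "body": xor(chain(k1, keyword)[-1], r_q), "tag": enc_x}


def same_keyword(k3: bytes, enc_keyword: dict, enc_query: dict) -> bool:
    y = f_hom(xor(enc_keyword["body"][-1], enc_query["body"]))   # Y = F(Dn xor Wn)
    x = xor(enc_query["tag"], prf(k3, enc_query["iv"]))[:12]     # X: decrypted Wn+1
    z = xor(x, y)              # the query's randomness cancels out of Y
    return tag32(z) == enc_keyword["tag"]                        # compare with Dn+1


k1, k2, k3 = (os.urandom(16) for _ in range(3))
entry = make_keyword(k1, k2, b"keyword1")
assert same_keyword(k3, entry, make_query(k1, k2, k3, b"keyword1"))
```

Note that `same_keyword` uses only K3: neither the data encryption key nor the random number secret key is needed to compare an encryption keyword with an encryption query. With a 32-bit tag, unrelated keywords can still collide with a probability of roughly 2^-32, which is the kind of false match the additional detection mentioned in the text is meant to catch.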

As described above, the full-text search system 100 of this embodiment can merge a plurality of search indexes without decrypting the encryption keywords contained in them. As a result, the full-text search system 100 of this embodiment can maintain search performance, such as search processing speed, while ensuring security.

Although each encryption set in this embodiment contains an encryption keyword and an encryption query, it may contain, instead of the encryption query, another ciphertext that can be compared with an encryption keyword without decryption to determine whether the plaintexts match.

An example of the document search process of this embodiment is described below. The search engine server 120 receives a search query from the user terminal 130 and transmits the search query to the index generation server 110. The encryption query generation unit 116 generates an encryption query from the search query using the method of step S408.

The match determination unit 117 performs the match determination (S804) between the encryption query generated by the encryption query generation unit 116 and each encryption keyword of the search indexes in the index storage unit 113. In other words, the match determination unit 117 identifies the encryption keywords generated from the same keyword as the search query received by the search engine server 120.

The match determination unit 117 transmits information indicating the identified encrypted keyword to the search engine server 120. The search engine server 120 extracts, from the search index it has already loaded, the metadata associated with the encrypted keyword indicated by that information, and transmits the extracted metadata and/or the documents indicated by the extracted metadata to the user terminal 130.
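The division of roles in this search flow can be sketched as follows. This is a minimal illustration under stated assumptions: the index is modeled as a dictionary from opaque encrypted-keyword strings to metadata lists, and `toy_same_keyword` is a hypothetical stand-in for the match determination (S804) that peeks at a label embedded in the toy tokens, which a real comparison would never do.

```python
from typing import Callable, Dict, List


def toy_same_keyword(enc_keyword: str, enc_query: str) -> bool:
    # STAND-IN ONLY: toy tokens look like "K:<label>:<nonce>" and "Q:<label>:<nonce>".
    # A real comparison works on ciphertexts and search tags, not on a visible label.
    return enc_keyword.split(":")[1] == enc_query.split(":")[1]


def find_matching_keywords(
    enc_keywords: List[str],
    enc_query: str,
    same_keyword: Callable[[str, str], bool],
) -> List[str]:
    """Role of the match determination unit: identify matching encrypted keywords."""
    return [k for k in enc_keywords if same_keyword(k, enc_query)]


def lookup_metadata(index: Dict[str, List[str]], matched: List[str]) -> List[str]:
    """Role of the search engine server: extract metadata linked to the matches."""
    results: List[str] = []
    for enc_keyword in matched:
        results.extend(index.get(enc_keyword, []))
    return results


# Usage with toy tokens (hypothetical data).
index = {"K:apple:17": ["doc1", "doc3"], "K:pear:42": ["doc2"]}
matched = find_matching_keywords(list(index), "Q:apple:99", toy_same_keyword)
print(lookup_metadata(index, matched))
```

Passing the predicate as a parameter keeps the lookup logic independent of whichever comparison scheme is actually used.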

In the following embodiments, descriptions of configurations and processes that are the same as in Embodiment 1 are omitted, and the differences from Embodiment 1 are described. The index generation server 110 of this embodiment does not include encrypted queries in the keyword dictionary of the search index produced as the merge result.

FIG. 10 is an explanatory diagram showing an example of the search index merge process of this embodiment. The difference from Embodiment 1 (FIG. 3) is that the search index 301 and the search index 303 are main indexes.

A main index is a search index whose keyword dictionary does not include encrypted queries. That is, each encryption set in a main index consists only of an encrypted keyword. The search index 302 is a sub index. A sub index is a search index whose keyword dictionary includes encrypted queries. That is, the search indexes described in Embodiment 1 are sub indexes.

Because the match determination unit 117 determines whether encryption sets were generated from the same keyword by comparing an encrypted keyword with an encrypted query, the index generation server 110 can perform the merge process between a main index and a sub index in the same way as in Embodiment 1. For example, in step S806, the index merge unit 112 generates a merge result that is a main index by not including encrypted queries in the search index of the merge result.
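A merge of a main index with a sub index that yields a main index can be sketched as below. The `CipherSet` class, the `merge_to_main` function, and the toy tokens are hypothetical; `toy_same_keyword` is only a stand-in for the comparison of an encrypted keyword with an encrypted query and reads a label that real ciphertexts would not expose.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CipherSet:
    enc_keyword: str                  # opaque encrypted keyword
    enc_query: Optional[str]          # None in a main index, present in a sub index
    metadata: List[str] = field(default_factory=list)


def toy_same_keyword(enc_keyword: str, enc_query: str) -> bool:
    # STAND-IN ONLY: toy tokens look like "K:<label>:<nonce>" / "Q:<label>:<nonce>".
    return enc_keyword.split(":")[1] == enc_query.split(":")[1]


def merge_to_main(
    first: List[CipherSet],
    second: List[CipherSet],
    same_keyword: Callable[[str, str], bool],
) -> List[CipherSet]:
    """Merge `second` (a sub index) into a copy of `first`; the result keeps no encrypted queries."""
    merged = [CipherSet(cs.enc_keyword, None, list(cs.metadata)) for cs in first]
    for sub in second:
        for target in merged:
            # Encrypted keyword of the accumulated index vs. encrypted query of the sub index.
            if same_keyword(target.enc_keyword, sub.enc_query):
                target.metadata.extend(sub.metadata)
                break
        else:
            # No match: carry over the keyword and metadata, dropping the encrypted query.
            merged.append(CipherSet(sub.enc_keyword, None, list(sub.metadata)))
    return merged
```

Because the result never stores an `enc_query`, it cannot later serve as the query side of a comparison, which corresponds to the main index property described above.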

Although FIG. 10 shows an example in which a main index and a sub index are merged to generate a main index, sub indexes may also be merged with each other to generate a main index.

Because each encryption set of a main index contains only an encrypted keyword, the match determination unit 117 cannot determine whether encrypted keywords in different main indexes were generated from the same keyword. That is, the index generation server 110 cannot execute a merge process between main indexes. Therefore, in step S801, the search engine server 120 selects the plurality of search indexes to be merged so that they include exactly one main index or no main index.

In addition, for example, the search engine server 120 may select the plurality of search indexes to be merged when it determines in step S801 that a predetermined number or more of sub indexes are stored in the index storage unit 113.
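The two selection rules above (at most one main index among the merge targets, and a trigger based on the number of stored sub indexes) can be combined in a small sketch. The `IndexInfo` record, the function name, and the default threshold are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexInfo:
    name: str
    is_main: bool  # True if the keyword dictionary holds no encrypted queries


def select_merge_targets(stored: List[IndexInfo], min_sub_indexes: int = 2) -> List[IndexInfo]:
    """Pick merge targets containing at most one main index.

    Returns an empty list when fewer than `min_sub_indexes` sub indexes are
    stored, or when fewer than two indexes would take part in the merge.
    """
    subs = [ix for ix in stored if not ix.is_main]
    if len(subs) < min_sub_indexes:
        return []
    mains = [ix for ix in stored if ix.is_main]
    targets = mains[:1] + subs  # never more than one main index
    return targets if len(targets) >= 2 else []
```

Taking only the first main index is an arbitrary choice in this sketch; the text only requires that two main indexes never be selected together.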

As described above, the index generation server 110 of this embodiment can also execute the merge process between a main index and a sub index without decrypting the encrypted keywords included in either index.

Furthermore, because a main index contains no encrypted queries, it is impossible, without the decryption key, to determine whether encrypted keywords generated from the same keyword exist across a plurality of main indexes. That is, by generating a main index through the merge process, the index generation server 110 of this embodiment can ensure stronger security.

FIG. 11 is a block diagram showing an example of the overall configuration of the full-text search system of this embodiment. The differences from the overall configuration of the full-text search system of Embodiment 1 are described below. The user terminal 130 includes an index generation unit 131 and a searchable encryption unit 132. The searchable encryption unit 132 includes an encrypted keyword generation unit 133 and an encrypted query generation unit 134. The index generation unit 131, the encrypted keyword generation unit 133, and the encrypted query generation unit 134 are the same as the index generation unit 111, the encrypted keyword generation unit 115, and the encrypted query generation unit 116, respectively, so their descriptions are omitted.

This embodiment differs from Embodiment 1 in that the index generation server 110 does not include the index generation unit 111, and in that the searchable encryption unit 114 of the index generation server 110 does not include the encrypted keyword generation unit 115 and the encrypted query generation unit 116. In other words, in this embodiment the user terminal 130, not the index generation server 110, generates the index.

The differences from the process of FIG. 4 are described below. The processing performed by the index generation unit 111 in FIG. 4, as described in Embodiment 1, is executed by the index generation unit 131. The processing performed by the searchable encryption unit 114 in FIG. 4, as described in Embodiment 1, is executed by the searchable encryption unit 132. In step S401, the index generation unit 131 receives a document addition/update request and obtains the user's encryption key information held by the user terminal 130.

The process of step S402 is not executed. The index generation unit 131 transmits the index generated in step S410 to the index generation server 110, and the index generation server 110 stores the received index in the index storage unit 113. The index generation server 110 then performs the process of step S411.

As described above, in this embodiment the user terminal 130 generates the index, so the index generation server 110 does not need to obtain the user's data encryption key and function value encryption key, and stronger security can therefore be ensured.
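Index generation on the user terminal can be sketched as follows. This is an illustration only: whitespace tokenization, the dictionary layout of an encryption set, and the function name `build_sub_index` are assumptions, and the two encryption routines are passed in as parameters so that the key material stays with the caller (the user terminal in this embodiment).

```python
from typing import Callable, Dict, List


def build_sub_index(
    documents: Dict[str, str],
    encrypt_keyword: Callable[[str], str],
    encrypt_query: Callable[[str], str],
) -> List[dict]:
    """Build a sub index: one encryption set per keyword, linked to document IDs."""
    postings: Dict[str, List[str]] = {}
    for doc_id, text in documents.items():
        # Assumed keyword extraction: lower-cased, whitespace-separated, de-duplicated.
        for word in sorted(set(text.lower().split())):
            postings.setdefault(word, []).append(doc_id)
    # Only encrypted forms and metadata leave this function; plaintext keywords do not.
    return [
        {
            "enc_keyword": encrypt_keyword(word),
            "enc_query": encrypt_query(word),
            "metadata": doc_ids,
        }
        for word, doc_ids in postings.items()
    ]
```

The returned list is what would be transmitted to the server, which can then store and merge it without ever receiving the encryption keys.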

An example in which this embodiment is applied to Embodiment 2 is also described. The index that the index generation unit 131 generates in FIG. 4 is a sub index. If the index generation server 110 executes the merge process, for example, each time it receives a sub index, the time for which the main index is held can be shortened, and even stronger security can be ensured.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above are described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations can be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be implemented in hardware, for example by designing them as an integrated circuit. The configurations, functions, and the like described above may also be implemented in software, with a processor interpreting and executing a program that implements each function. Information such as the programs, tables, and files that implement each function can be stored in a memory, in a recording device such as a hard disk or SSD (Solid State Drive), or on a recording medium such as an IC card, SD card, or DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines in a product are necessarily shown. In practice, almost all of the components may be considered to be connected to one another.

Claims

A search index merge server that merges encrypted search indexes, comprising:
a processor and a storage device, wherein
the storage device holds a first search index and a second search index,
each of the first search index and the second search index holds encryption sets, each generated from one of one or more keywords, in association with metadata corresponding to that keyword,
each encryption set of the first search index and the second search index includes an encrypted keyword,
each encryption set of the second search index includes an encrypted query,
each of the encrypted keywords includes a ciphertext of a keyword encrypted using a random number, and a search tag indicating a value obtained by applying conversion by a homomorphic function and irreversible conversion to the random number,
each of the encrypted queries includes a ciphertext of a keyword encrypted using a random number, and a search tag indicating a value obtained by applying conversion by a homomorphic function to the random number,
the processor
executes a merge process that merges the first search index and the second search index to generate a third search index as a merge result,
in the merge process,
executes a comparison process that compares an encrypted keyword included in the first search index with an encrypted query included in the second search index, and identifies combinations of encryption sets generated from the same keyword, and
for each identified combination, stores in the third search index an encrypted keyword included in one of the encryption sets of the combination in association with the metadata associated with each of the encryption sets of the combination, and
in the comparison process,
calculates a function value by applying conversion by a homomorphic function to a value calculated from part or all of the ciphertext of a second encrypted keyword to be compared and the ciphertext of a first encrypted query to be compared,
calculates an irreversible conversion value by applying irreversible conversion to a value calculated from the function value and the value indicated by the search tag of the first encrypted query, and
determines, based on a result of comparing the irreversible conversion value with the search tag of the second encrypted keyword, whether the encryption set including the second encrypted keyword and the encryption set including the first encrypted query were generated from the same keyword.

The search index merge server according to claim 1, wherein
each of the encrypted queries is generated using the same encryption algorithm as an encrypted search query used to search the encrypted keywords included in the first search index and the second search index.

The search index merge server according to claim 1, wherein
each encryption set included in the third search index consists only of an encrypted keyword, and
the processor deletes the first search index and the second search index after the merge process ends.

A search index merge system that merges encrypted search indexes, comprising a user terminal and a search index merge server, wherein
the user terminal
holds a first keyword group consisting of one or more keywords and a metadata group corresponding to each keyword of the first keyword group,
for each keyword of the first keyword group,
generates a ciphertext in which the keyword is encrypted using a random number,
generates a search tag indicating a value obtained by applying conversion by a homomorphic function and irreversible conversion to the random number, and
generates an encrypted keyword including the generated ciphertext and the generated search tag,
for each keyword of the first keyword group,
generates a ciphertext in which the keyword is encrypted using a random number,
generates a search tag indicating a value obtained by applying conversion by the homomorphic function to the random number, and
generates an encrypted query including the generated ciphertext and the generated search tag,
includes the encrypted keyword and the encrypted query corresponding to the same keyword in the same encryption set,
stores each encryption set in a second search index in association with the metadata corresponding to the same keyword, and
transmits the second search index to the search index merge server,
the search index merge server holds a first search index,
the first search index holds encryption sets, each generated from a keyword of a second keyword group consisting of one or more keywords, in association with metadata corresponding to that keyword of the second keyword group,
each encryption set of the first search index includes an encrypted keyword,
each encrypted keyword of the first search index includes a ciphertext of a keyword encrypted using a random number, and a search tag indicating a value obtained by applying conversion by a homomorphic function and irreversible conversion to the random number,
the search index merge server
executes a merge process that merges the first search index and the second search index to generate a third search index as a merge result,
in the merge process,
executes a comparison process that compares an encrypted keyword included in the first search index with an encrypted query included in the second search index, and identifies combinations of encryption sets generated from the same keyword, and
for each identified combination, stores in the third search index an encrypted keyword included in one of the encryption sets of the combination in association with the metadata associated with each of the encryption sets of the combination, and
in the comparison process,
calculates a function value by applying conversion by a homomorphic function to a value calculated from part or all of the ciphertext of a second encrypted keyword to be compared and the ciphertext of a first encrypted query to be compared,
calculates an irreversible conversion value by applying irreversible conversion to a value calculated from the function value and the value indicated by the search tag of the first encrypted query, and
determines, based on a result of comparing the irreversible conversion value with the search tag of the second encrypted keyword, whether the encryption set including the second encrypted keyword and the encryption set including the first encrypted query were generated from the same keyword.

A search index merge method in which a search index merge server merges encrypted search indexes, wherein
the search index merge server holds a first search index and a second search index,
each of the first search index and the second search index holds encryption sets, each generated from one of one or more keywords, in association with metadata corresponding to that keyword,
each encryption set of the first search index and the second search index includes an encrypted keyword,
each encryption set of the second search index includes an encrypted query,
each of the encrypted keywords includes a ciphertext of a keyword encrypted using a random number, and a search tag indicating a value obtained by applying conversion by a homomorphic function and irreversible conversion to the random number,
each of the encrypted queries includes a ciphertext of a keyword encrypted using a random number, and a search tag indicating a value obtained by applying conversion by a homomorphic function to the random number,
the search index merge method comprising:
by the search index merge server,
executing a merge process that merges the first search index and the second search index to generate a third search index as a merge result;
in the merge process,
executing a comparison process that compares an encrypted keyword included in the first search index with an encrypted query included in the second search index, and identifying combinations of encryption sets generated from the same keyword; and
for each identified combination, storing in the third search index an encrypted keyword included in one of the encryption sets of the combination in association with the metadata associated with each of the encryption sets of the combination; and
in the comparison process,
calculating a function value by applying conversion by a homomorphic function to a value calculated from part or all of the ciphertext of a second encrypted keyword to be compared and the ciphertext of a first encrypted query to be compared;
calculating an irreversible conversion value by applying irreversible conversion to a value calculated from the function value and the value indicated by the search tag of the first encrypted query; and
determining, based on a result of comparing the irreversible conversion value with the search tag of the second encrypted keyword, whether the encryption set including the second encrypted keyword and the encryption set including the first encrypted query were generated from the same keyword.

The search index merge method according to claim 5, wherein
each of the encrypted queries is generated using the same encryption algorithm as an encrypted search query used to search the encrypted keywords included in the first search index and the second search index.

The search index merge method according to claim 5, wherein
each encryption set included in the third search index consists only of an encrypted keyword, and
the search index merge server deletes the first search index and the second search index after the merge process ends.