JP6632796B2

JP6632796B2 - Database evaluation device, method and program, and database division device, method and program

Info

Publication number: JP6632796B2
Application number: JP2014210218A
Authority: JP
Inventors: 披田野 清良; 清本 晋作; 三宅 優
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2014-10-14
Filing date: 2014-10-14
Publication date: 2020-01-22
Anticipated expiration: 2034-10-14
Also published as: JP2016081192A

Description

The present invention relates to a database evaluation device, method, and program, and to a database division device, method, and program.

In recent years, leaks of secret information such as passwords and personal identification numbers have become frequent, and many cases of damage from unauthorized use of systems with the leaked information have been reported. Many of these cases stem from latent vulnerabilities such as human error, equipment failure, or deliberate misconduct, and the root causes are not easy to remove. Reducing the risk of information leakage to zero is therefore expected to remain extremely difficult.

In particular, because many services are now deployed on the cloud, the following leakage scenarios are assumed.
A cloud system virtually integrates multiple servers, so incidents can occur in part of the system: some of the storage may fail or have bugs, or, when servers in data centers run by different administrators are integrated, some administrators may act dishonestly. If such an incident exposes a vulnerability, the secret information of a subset of the system's users may leak in large quantities.
Immediately after a leak is discovered, the affected users are expected to change their information promptly. The other users, however, may be reluctant to change theirs because of the inconvenience, for example when they use the same information across multiple services.
Furthermore, much of what is used as secret information, such as passwords, biometric information and other authentication information, or physical sources, is non-uniform: its values do not occur with equal probability. Modeling this non-uniformity universally is not easy, but an attacker who learns the secret information of some users may be able to infer the bias in the secret information of other users of the same cloud system. An attack that exploits this non-uniformity can be carried out more efficiently than a brute-force attack.

Prior research based on the concept of functional safety in the security field includes work on leakage-resilient cryptography, which assumes that some information about a secret leaks through, for example, a side-channel attack. To prove that leakage-resilient cryptographic primitives remain secure even when information leaks, this work models security under leakage using the leaked information f(x) about secret information x, formulated by a leakage function f. For example, Non-Patent Document 1 presents a model for partial leakage, and Non-Patent Document 2 presents a different model for partial leakage.

Non-Patent Document 1: Adi Akavia, Shafi Goldwasser, V. V.: Simultaneous Hardcore Bits and Cryptography against Memory Attacks, the 6th Theory of Cryptography (TCC 09), LNCS, Vol. 5444, pp. 474-495 (2009).
Non-Patent Document 2: Duc, A., Dziembowski, S. and Faust, S.: Unifying leakage models: from probing attacks to noisy leakage, Advances in Cryptology - EUROCRYPT 2014, LNCS, Vol. 8441, pp. 423-440 (2014).

However, the models above assume that only partial information about a particular user x leaks, for example that f(x) is a subset of the bits of x. They do not guarantee the security of information when a large amount of secret information has leaked from a system.

There is therefore a need for a device that, when the secret information of some users has leaked from a system, evaluates resistance to an attack model that takes the non-uniformity of secret information into account. There is also a need for a device that, to minimize the effect on the remaining users' secret information, divides a database in such a way that the evaluating device judges it to be highly secure.

An object of the present invention is to provide a database evaluation device, method, and program that can evaluate the security of the remaining part of a database when one part has leaked, and a database division device, method, and program that can divide a database so that, even if one of the resulting parts leaks, the other part remains safe.

Specifically, the following solutions are provided.
(1) A database evaluation device for evaluating a database, comprising: estimating means for estimating, for a first database that is a part of the database and a second database that is the part of the database other than the first database, a second distribution, which is the data distribution of the second database, based on a first distribution, which is the data distribution of the first database, on the assumption that the first database has leaked; and determination means for using as an evaluation index an average attack success probability, which is the probability that data selected according to the estimated distribution obtained by the estimating means matches data selected according to the second distribution so that an attack succeeds, and for determining, based on the evaluation index, whether the database is safe.

The database evaluation device of (1) uses, as the evaluation index for the security of the database, the average attack success probability, which is the probability that data selected according to the estimated distribution matches data selected according to the second distribution so that the attack succeeds.

Because it uses the average attack success probability as the evaluation index, the database evaluation device according to (1) can evaluate the security of the remaining part of a database when one part has leaked.

(2) The database evaluation device according to (1), wherein the determination means determines that the database is safe when the evaluation index is less than or equal to the probability that an attack succeeds when the estimated distribution is taken to be a uniform distribution.

The database evaluation device according to (2) can reliably evaluate the security of the remaining part of a database when one part has leaked.

(3) The database evaluation device according to (1), further comprising: first distance calculating means for calculating a first distance between the estimated distribution and the second distribution; and second distance calculating means for calculating a second distance between a uniform distribution and the second distribution, wherein the determination means determines that the database is safe when the second-order Rényi entropy of the estimated distribution is greater than or equal to the second-order Rényi entropy of the second distribution and the first distance is greater than or equal to a certain value based on the second distance.

The database evaluation device according to (3) can evaluate the security of the remaining part of a database even more reliably when one part has leaked.

(4) A database division device for dividing a database, comprising dividing means for dividing the database into a first database, which is a part of the database, and a second database, which is the part of the database other than the first database, under the conditions that the difference between the second-order Rényi entropy of a first distribution, which is the data distribution of the first database, and the second-order Rényi entropy of a second distribution, which is the data distribution of the second database, is kept within a predetermined range, and that the first distribution and the second distribution have the same number of classes when represented as histograms, the division being made such that the probability that an attack on the other database succeeds is less than or equal to a predetermined value, the attack succeeding when data selected from an estimated distribution, obtained by estimating the distribution of the data in the other database based on the distribution of the data in one of the first database and the second database, matches data selected from the other database.

The database division device according to (4) divides the database into the first database and the second database such that the probability that an attack on the other database succeeds is less than or equal to a predetermined value, the attack succeeding when data selected from an estimated distribution, obtained by estimating the distribution of the data in the other database from the distribution of the data in one of the two databases, matches data selected from the other database.

Therefore, the database division device according to (4) can divide a database so that, even if one of the resulting parts leaks, the other part remains safe.
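The two side conditions in (4) can be checked mechanically for a candidate split. The following is a minimal Python sketch, not the claimed dividing means itself: the function names and the `max_entropy_gap` parameter are hypothetical, each distinct value is treated as one histogram class (an assumption), and the attack-success-probability condition is not checked here.

```python
import math
from collections import Counter


def renyi2_from_counts(counts):
    """Second-order Renyi entropy (in bits) of the empirical distribution."""
    total = sum(counts.values())
    return -math.log2(sum((c / total) ** 2 for c in counts.values()))


def satisfies_split_conditions(first_db, second_db, max_entropy_gap):
    """Check the two side conditions of (4) for a candidate split.

    Assumption: one histogram class per distinct value, and the
    'predetermined range' is |H2(first) - H2(second)| <= max_entropy_gap.
    """
    h1, h2 = Counter(first_db), Counter(second_db)
    same_class_count = len(h1) == len(h2)
    entropy_gap = abs(renyi2_from_counts(h1) - renyi2_from_counts(h2))
    return same_class_count and entropy_gap <= max_entropy_gap


balanced = satisfies_split_conditions(["a", "a", "b"], ["a", "b", "b"], 0.1)
unbalanced = satisfies_split_conditions(["a", "a", "a"], ["a", "b", "c"], 0.1)
```

In the second call the parts have one class and three classes respectively, so the class-count condition already fails.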

(5) The database division device according to (4), wherein the dividing means divides the database into the first database and the second database such that the distance between the first distribution and the second distribution is greater than or equal to a certain value based on the distance between the first distribution or the second distribution and a uniform distribution.

Therefore, the database division device according to (5) can divide a database so that, even if one of the resulting parts leaks, the other part reliably remains safe.

(6) The database division device according to (4) or (5), wherein the dividing means performs the division under the further condition that the difference between the second-order Rényi entropy of the first distribution and that of the data distribution of the database, and the difference between the second-order Rényi entropy of the second distribution and that of the data distribution of the database, are each kept within a predetermined range.

The database division device according to (6) can divide a database so that, even if one of the resulting parts leaks in its entirety, the other part remains safe.

(7) The database division device according to any one of (4) to (6), wherein the dividing means performs the division under the further condition that the number of classes when the first distribution and the second distribution are each represented as a histogram is the same as the number of classes when the data distribution of the database is represented as a histogram.

The database division device according to (7) can divide a database so as to increase security when neither of the divided databases has leaked.

(8) The database division device according to any one of (4) to (7), wherein the dividing means moves data by a greedy method, giving priority to classes for which the change in the distance between the first database and the second database is large.

The database division device according to (8) can move data between the first database and the second database efficiently so as to increase the distance between the databases.

(9) A method executed by the database evaluation device according to (1), comprising: an estimating step in which the estimating means, for a first database that is a part of the database and a second database that is the part of the database other than the first database, estimates a second distribution, which is the data distribution of the second database, based on a first distribution, which is the data distribution of the first database, on the assumption that the first database has leaked; and a determination step in which the determination means uses as an evaluation index an average attack success probability, which is the probability that data selected according to the estimated distribution obtained in the estimating step matches data selected according to the second distribution so that an attack succeeds, and determines, based on the evaluation index, whether the database is safe.

Because the method according to (9) uses the average attack success probability as the evaluation index, it can evaluate the security of the remaining part of a database when one part has leaked.

(10) A method executed by the database division device according to (4), comprising a dividing step in which the dividing means divides the database into a first database, which is a part of the database, and a second database, which is the part of the database other than the first database, under the conditions that the difference between the second-order Rényi entropy of a first distribution, which is the data distribution of the first database, and the second-order Rényi entropy of a second distribution, which is the data distribution of the second database, is kept within a predetermined range, and that the first distribution and the second distribution have the same number of classes when represented as histograms, the division being made such that the probability that an attack on the other database succeeds is less than or equal to a predetermined value, the attack succeeding when data selected from an estimated distribution, obtained by estimating the data distribution of the other database based on the data distribution of one of the first database and the second database, matches data selected from the other database.

The method according to (10) can divide a database so that, even if one of the resulting parts leaks, the other part remains safe.

(11) A program for causing a computer to execute each step of the method according to (9).

The program according to (11) can cause a computer to evaluate the security of the remaining part of a database when one part has leaked.

(12) A program for causing a computer to execute each step of the method according to (10).

The program according to (12) can cause a computer to divide a database so that, even if one of the resulting parts leaks, the other part remains safe.

According to the present invention, the database evaluation device can evaluate the security of the remaining part of a database when one part has leaked. The database division device can divide a database so that, even if one of the resulting parts leaks, the other part remains safe.

FIG. 1 is a block diagram showing the configuration of a database evaluation device and a database division device according to an embodiment of the present invention.
FIG. 2 is a flowchart showing an example of evaluation processing based on the evaluation index in the database evaluation device according to an embodiment of the present invention.
FIG. 3 is a flowchart showing an example of distance-based evaluation processing in the database evaluation device according to an embodiment of the present invention.
FIG. 4 is a flowchart showing an example of the division algorithm of the database division device according to an embodiment of the present invention.
FIG. 5 is a flowchart continuing from FIG. 4.
FIG. 6 is a diagram showing an example of histograms of the first DB and the second DB divided by the database division device 20 according to an embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

The second-order Rényi entropy and the random variables used in the description are defined as follows.
Let p(x) be the probability function of a random variable X on a discrete set χ. The second-order Rényi entropy H_2(X) of X is defined by Expression (1).

Let p(x) and q(x) be the probability functions of random variables X and Y on the discrete set χ, respectively. The distance D(X, Y) between X and Y is defined by Expression (2).

When q(x) is the uniform distribution, the distance between X and Y, written E(X), is given by Expression (3).
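Expressions (1) to (3) are not reproduced in this text. The second-order Rényi entropy has a standard definition, shown first below. For the distances, the forms shown are an assumption: they take D to be the Euclidean distance between the probability functions, which is consistent with the condition D(X, Y)^2 ≥ 2E(X)^2 used later in the description, but they are not quoted from the original expressions.

```latex
% Standard definition of the second-order Renyi entropy
H_2(X) = -\log_2 \sum_{x \in \chi} p(x)^2

% Assumed form: Euclidean distance between p and q
D(X, Y) = \sqrt{\sum_{x \in \chi} \bigl(p(x) - q(x)\bigr)^2}

% Assumed form: the same distance with q uniform on \chi
E(X) = \sqrt{\sum_{x \in \chi} \Bigl(p(x) - \frac{1}{|\chi|}\Bigr)^2}
```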

[Database evaluation device 10]
First, the database evaluation device 10 is described.
If a part of the database 30, which stores users' secret information (for example, personal identification numbers, passwords, or biometric information associated with users), leaks, an attacker may be able to use the leaked information to estimate the distribution of the remaining users' secret information. The following describes how the database evaluation device 10 evaluates security against the DGA (Distribution Guessing Attack), one attack model that uses such an estimated distribution.

The DGA attack procedure is presented as a game between an attacker and a challenger.
Let U′ be the set of users whose secret information has leaked, and W′ the set of secret information data of the users in U′. Let U be the set of users under attack, and W the set of secret information data of the users in U. Each datum in W is assumed to have been generated according to a probability function p(x) on the set χ of values it can take. Here u ≠ u′ for u ∈ U and u′ ∈ U′.
(1) The attacker estimates the probability function p(x) from the obtained W′ using an arbitrary distribution estimation algorithm Κ. Let q(x) denote the estimated distribution.
(2) The challenger selects a user u ∈ U at random and extracts the corresponding secret information w ∈ W.
(3) The attacker selects secret information x ∈ χ according to q(x).
(4) If x and w match, the attacker wins.
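Steps (2) to (4) of the game can be simulated directly. The sketch below is illustrative only: the toy distributions `p` and `q`, the function names, and the modeling of the challenger's choice as a draw from p(x) are assumptions, and step (1) is skipped by supplying q(x) by hand.

```python
import random


def draw(dist, rng):
    """Draw one value from a {value: probability} mapping."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values])[0]


def estimate_win_rate(p, q, rounds=50_000, seed=0):
    """Play the game repeatedly and return the attacker's empirical win rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(rounds):
        w = draw(p, rng)  # step (2): challenger's secret, modeled as w ~ p
        x = draw(q, rng)  # step (3): attacker's guess, x ~ q
        wins += (x == w)  # step (4): attacker wins on a match
    return wins / rounds


# Hypothetical 3-value secret space with a non-uniform true distribution.
p = {"0000": 0.5, "1234": 0.3, "9999": 0.2}
q = {"0000": 0.4, "1234": 0.4, "9999": 0.2}

rate = estimate_win_rate(p, q)
exact = sum(p[v] * q[v] for v in p)  # probability that independent draws match
```

Because the two draws are independent, the win rate should approach the sum of p(x)q(x) over χ as the number of rounds grows.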

The database evaluation device 10 evaluates security using, as the evaluation index, the average attack success probability, that is, the probability that this attack succeeds.

FIG. 1 is a block diagram showing the configuration of the database evaluation device 10 and the database division device 20 according to an embodiment of the present invention. The database evaluation device 10 includes estimating means 11, determination means 12, first distance calculating means 13, and second distance calculating means 14. Each means is described in detail below.

The estimating means 11 estimates, in place of the attacker, the distribution of the remaining, unleaked information from the distribution of the leaked information. That is, for a first database (referred to as the first DB 31), which is a part of the database 30 (for example, users' secret information), and a second database (referred to as the second DB 32), which is the part of the database 30 other than the first DB 31, the estimating means 11 estimates a second distribution, which is the data distribution of the second DB 32, based on a first distribution, which is the data distribution of the first DB 31, on the assumption that the first DB 31 has leaked.
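The text does not fix a particular estimation algorithm. One simple choice, assumed here purely for illustration, is to use the empirical frequencies of the leaked first DB as the estimated distribution q(x). The function name and the sample data are hypothetical.

```python
from collections import Counter


def empirical_distribution(values):
    """Return {value: relative frequency} for a non-empty list of values."""
    if not values:
        raise ValueError("cannot estimate a distribution from no data")
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


# Hypothetical leaked first DB: one secret per user.
first_db = ["1234", "1234", "0000", "9999"]
q = empirical_distribution(first_db)
```

Other estimators (for example, smoothed frequencies) would fit the same role; the choice affects the resulting evaluation index.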

[Database evaluation device 10: example of evaluation based on the evaluation index]
The determination means 12 uses as the evaluation index the average attack success probability, which is the probability that data selected according to the estimated distribution q(x) obtained by the estimating means 11 (the distribution estimated in place of the attacker) matches data selected according to the second distribution p(x) (the distribution of the secret information of the users under attack) so that the attack succeeds, and determines, based on the evaluation index, whether the database 30 is safe. Specifically, the determination means 12 calculates the average attack success probability P_DGA of the DGA from the second distribution p(x) and the estimated distribution q(x) by Expression (4), and determines whether the database 30 is safe.

If no information at all had leaked from the system, the attacker would have no information about p(x) and could only carry out the DGA according to the uniform distribution on the discrete set χ. If |χ| = 2^n, the attack success probability in that case is 2^-n. Therefore, even when some secret information has leaked, if P_DGA ≤ 2^-n the system can be said to be as safe as when no information has leaked. This specification defines security in this sense as DGA-safe.

Accordingly, the determination means 12 determines that the database 30 is safe when the evaluation index is less than or equal to the probability (2^-n) that an attack succeeds when the estimated distribution is taken to be a uniform distribution.
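The determination above can be sketched in a few lines of Python. Expression (4) is not reproduced in this text; the sketch assumes it is the sum of p(x)q(x) over χ, which is the probability that independent draws from p and q coincide in the game described earlier. The function names and the toy distributions are hypothetical.

```python
def average_attack_success_probability(p, q):
    """Assumed form of P_DGA: sum over x of p(x) * q(x)."""
    return sum(prob * q.get(value, 0.0) for value, prob in p.items())


def is_dga_safe(p, q, n_bits):
    """DGA-safe when P_DGA <= 2^-n, where |chi| = 2^n."""
    return average_attack_success_probability(p, q) <= 2.0 ** (-n_bits)


# Hypothetical secret space with |chi| = 4 (n = 2) and a skewed true distribution.
p = {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}
q_accurate = {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}  # attacker guessed p well
q_misled = {0: 0.1, 1: 0.3, 2: 0.3, 3: 0.3}    # attacker's estimate is far off

safe_when_accurate = is_dga_safe(p, q_accurate, 2)
safe_when_misled = is_dga_safe(p, q_misled, 2)
```

With the accurate estimate P_DGA is about 0.52, above 2^-2 = 0.25, so the database is judged unsafe; with the misleading estimate P_DGA is about 0.16 and it is judged DGA-safe.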

FIG. 2 is a flowchart showing an example of evaluation processing based on the evaluation index in the database evaluation device 10 according to an embodiment of the present invention. The database evaluation device 10 is made up of the hardware of a computer and its peripheral devices, together with the software that controls that hardware. The processing below is executed by a control unit (for example, a CPU) in accordance with predetermined software.

In step S101, the CPU (estimating means 11) assumes that a part of the database 30 has leaked and estimates the data distribution of the remaining part. More specifically, for the first DB 31, which is a part of the database 30, and the second DB 32, which is the part of the database 30 other than the first DB 31, the CPU estimates the second distribution, which is the data distribution of the second DB 32, based on the first distribution, which is the data distribution of the first DB 31, on the assumption that the first DB 31 has leaked.

In step S102, the CPU (determination means 12) evaluates the security of the remaining part using, as the evaluation index, the average attack success probability based on the estimated distribution. More specifically, the CPU calculates the average attack success probability P_DGA of the DGA by Expression (4) as the evaluation index, compares it with the probability (2^-n) that an attack succeeds when the estimated distribution q(x) obtained in step S101 is a uniform distribution, and determines that the database 30 is safe when the evaluation index is less than or equal to that probability (2^-n). The CPU then ends the processing.

[Database evaluation device 10: example of evaluation based on distance]
Returning to FIG. 1, distance-based determination by the database evaluation device 10 is described.
The database evaluation device 10 further includes the first distance calculating means 13 and the second distance calculating means 14.
The first distance calculating means 13 calculates a first distance (D) between the estimated distribution and the second distribution.
The second distance calculating means 14 calculates a second distance (E) between the uniform distribution and the second distribution.
The determination means 12 determines whether the database 30 is safe based on the distance between the estimated distribution and the second distribution.

Let X and Y be random variables on the discrete set χ with probability functions p(x) and q(x), respectively, and let H_2(X) = r and H_2(Y) = s. The first distance calculating means 13 calculates the first distance (D) by Expression (2). The second distance calculating means 14 calculates the second distance (E) by Expression (3).
When r ≤ s, inequality (5) holds.

なお、式（５）が成立することを、次に示す。 The following shows that Expression (5) holds.

すなわち、式（５）が成立する。 That is, equation (5) holds.

From equations (3) and (5), the following inequality holds when P_DGA ≤ 2^−n.

Therefore, if r ≤ s and D(X, Y)^2 ≥ 2E(X)^2 can be achieved, the system can be said to be DGA-safe, just as under the definition P_DGA ≤ 2^−n.
That is, the determination means 12 determines that the database 30 is safe when the first distance is greater than or equal to a certain value based on the second distance.
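The criterion D(X, Y)^2 ≥ 2E(X)^2 can be checked with a short worked derivation. Equations (2), (3) and (5) are not reproduced in this section, so the following is only a sketch under stated assumptions: D and E are taken to be Euclidean distances between probability functions, P_DGA is taken to be the sum of p(x)q(x), the set χ has 2^n elements, and H_2 is the second-order Rényi entropy, so that the sum of p(x)^2 equals 2^{-r} and the sum of q(x)^2 equals 2^{-s}.

```latex
\begin{align*}
D(X,Y)^2 &= \sum_{x \in \chi} \bigl(p(x) - q(x)\bigr)^2
          = 2^{-r} + 2^{-s} - 2\,P_{DGA}, \\
P_{DGA}  &= \tfrac{1}{2}\bigl(2^{-r} + 2^{-s} - D(X,Y)^2\bigr)
          \le 2^{-r} - \tfrac{1}{2} D(X,Y)^2
          \qquad (r \le s \text{, equality when } r = s), \\
E(X)^2   &= \sum_{x \in \chi} \bigl(p(x) - 2^{-n}\bigr)^2
          = 2^{-r} - 2^{-n}, \\
D(X,Y)^2 \ge 2E(X)^2
         &\;\Longrightarrow\;
           P_{DGA} \le 2^{-r} - E(X)^2 = 2^{-n}.
\end{align*}
```

Under these assumptions, the two conditions r ≤ s and D(X, Y)^2 ≥ 2E(X)^2 are sufficient for P_DGA ≤ 2^−n, which agrees with the conclusion stated in the text.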

図３は、本発明の一実施形態に係るデータベース評価装置１０の距離に基づく評価処理の例を示すフローチャートである。 FIG. 3 is a flowchart illustrating an example of an evaluation process based on a distance of the database evaluation device 10 according to an embodiment of the present invention.

In step S111, the CPU (estimation means 11) assumes that a part of the database 30 has leaked and estimates the data distribution of the remaining part. More specifically, for the first DB 31, which is a part of the database 30, and the second DB 32, which is the part of the database 30 other than the first DB 31, the CPU assumes that the first DB 31 has leaked and estimates the second distribution, which is the data distribution of the second DB 32, based on the first distribution, which is the data distribution of the first DB 31.

ステップＳ１１２において、ＣＰＵ（第１距離算出手段１３）は、推測分布と残りの部分との距離を算出する。より具体的には、ＣＰＵは、式（２）により、第１距離（Ｄ）を算出する。 In step S112, the CPU (first distance calculation means 13) calculates the distance between the estimated distribution and the remaining part. More specifically, the CPU calculates the first distance (D) according to equation (2).

ステップＳ１１３において、ＣＰＵ（第２距離算出手段１４）は、一様分布と残りの部分との距離を算出する。より具体的には、ＣＰＵは、式（３）により、第２距離（Ｅ）を算出する。 In step S113, the CPU (second distance calculating means 14) calculates the distance between the uniform distribution and the remaining portion. More specifically, the CPU calculates the second distance (E) by Expression (3).

In step S114, the CPU (determination means 12) evaluates the security of the remaining portion based on the calculated distances. More specifically, the CPU determines that the database 30 is safe when the first distance is greater than or equal to a certain value based on the second distance (that is, when r ≤ s and D(X, Y)^2 ≥ 2E(X)^2 are satisfied). The CPU then ends the processing.
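Steps S112 to S114 can be sketched as follows. Equations (2) and (3) are not reproduced in this section, so the sketch assumes that both distances are Euclidean distances between probability functions and that H_2 is the second-order Rényi entropy. The function names, the example distributions, and the floating-point tolerance are illustrative assumptions.

```python
import math
from typing import Sequence


def renyi2(dist: Sequence[float]) -> float:
    """Second-order Renyi entropy in bits: -log2(sum of squared probabilities)."""
    return -math.log2(sum(v * v for v in dist))


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two probability functions (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def is_safe_by_distance(p: Sequence[float], q: Sequence[float]) -> bool:
    """p: second distribution (X); q: estimated distribution (Y).

    Safe when r <= s and D(X, Y)^2 >= 2 * E(X)^2.
    """
    uniform = [1.0 / len(p)] * len(p)
    r = renyi2(p)                      # H_2(X)
    s = renyi2(q)                      # H_2(Y)
    d2 = squared_distance(p, q)        # first distance, squared
    e2 = squared_distance(p, uniform)  # second distance, squared
    eps = 1e-12                        # rounding tolerance (assumption)
    return r <= s + eps and d2 >= 2 * e2 - eps


p_second = [0.4, 0.3, 0.2, 0.1]
print(is_safe_by_distance(p_second, [0.1, 0.2, 0.3, 0.4]))  # estimate far from p
print(is_safe_by_distance(p_second, p_second))              # estimate equal to p
```

The first call reports safe because the estimate is far from the real distribution; the second reports not safe because D = 0 while the real distribution is not uniform.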

[Database division device 20]
Next, returning to FIG. 1, the database division device 20 will be described.
The database division device 20 includes a division means 21 and divides the database 30 so that the database 30 remains DGA-safe even if the first DB 31, one of the divided parts, leaks and an attacker mounts a DGA on the other part, the second DB 32, based on the leaked first DB 31.
The database division device 20 stores the secret information in divided form so that it is resistant to the DGA regardless of whether the first DB 31 or the second DB 32 leaks.

秘密情報（例えば、ユーザに対応付けられた暗証番号やパスワードなど）は、クラウド化された２つのデータベース（例えば、第１ＤＢ３１、第２ＤＢ３２）に分割され、異なるストレージに保管されている。 The secret information (for example, a personal identification number and a password associated with the user) is divided into two databases (for example, a first DB 31 and a second DB 32) that are made into a cloud and stored in different storages.

分割手段２１は、以下の３つの条件の下でデータベースの分割をする。
条件１：攻撃者は，分布推測アルゴリズムΚとしてヒストグラムを用いる。
条件２：第１ＤＢ３１及び第２ＤＢ３２のどちらのデータベースが漏洩してもＤＧＡへの耐性は同様である。
条件３：第１ＤＢ３１及び第２ＤＢ３２のいずれのデータベースも漏洩していない場合、第１ＤＢ３１及び第２ＤＢ３２は同様に十分に安全である。 The dividing unit 21 divides the database under the following three conditions.
Condition 1: The attacker uses a histogram as the distribution estimation algorithm Κ.
Condition 2: The resistance to DGA is the same regardless of which of the first DB 31 and the second DB 32 is leaked.
Condition 3: If neither the first DB 31 nor the second DB 32 is leaked, the first DB 31 and the second DB 32 are similarly sufficiently secure.

The division means 21 divides the database 30 into the first DB 31, which is a part of the database 30, and the second DB 32, which is the part of the database 30 other than the first DB 31, under two conditions: the difference between the second-order Rényi entropy of the first distribution (the data distribution of the first DB 31) and the second-order Rényi entropy of the second distribution (the data distribution of the second DB 32) is within a predetermined range, and the first and second distributions have the same number of classes when represented as histograms. Under these conditions, the division is made so that the probability of a successful attack on the other database is at most a predetermined value, where an attack succeeds when data selected from an estimated distribution (the distribution of the other database estimated from the distribution of the data in one of the first DB 31 and the second DB 32) matches data selected from the other database.

Keeping the difference between the second-order Rényi entropy of the first distribution and that of the second distribution within a predetermined range means, for example, achieving the state r = H_2(X_1) = H_2(X_2). This improves the resistance to the DGA and makes the degree of resistance the same regardless of whether the first DB 31 or the second DB 32 leaks.

Here, let p_1(x) be the histogram of the data in the first DB 31 and p_2(x) the histogram of the data in the second DB 32.
From condition 1, the attacker's estimated distribution q(x) is p_2(x) when the second DB 32 leaks and p_1(x) when the first DB 31 leaks.
Let X_1 and X_2 be random variables following p_1(x) and p_2(x), respectively. If r = H_2(X_1) = H_2(X_2), equality holds in (5), so the DGA attack success probability P_DGA is given by equation (12) regardless of whether the first DB 31 or the second DB 32 leaks.

In the present embodiment, condition 1 assumes that the attacker uses a histogram as the distribution estimation algorithm, but in practice the attacker may use a more effective algorithm. If the attacker can estimate a distribution identical to the distribution p(x) of the target database, then D(X_1, X_2) = 0 and the average attack success probability of the DGA is 2^−r. The value of r therefore needs to be sufficiently large. However, because H_2(X_1) = H_2(X_2) is assumed, H_2(X_1) and H_2(X_2) are each maximized when the data are assigned to the first DB 31 and the second DB 32 at random. Therefore, r is set so as to satisfy equation (13).
r = H_2(X_1) = H_2(X_2) = H(X) (13)
Here, X is a random variable following the distribution of the total data of the first DB 31 and the second DB 32.
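The role of r can be illustrated by computing the second-order Rényi entropy directly from value counts. The sketch below uses the standard definition H_2(X) = −log2 Σ p(x)^2; the function name and the toy counts are hypothetical. It also shows the quantity 2^−r, which the text gives as the average attack success probability when the attacker estimates p(x) exactly.

```python
import math
from collections import Counter
from typing import Mapping


def renyi2_from_counts(counts: Mapping[str, int]) -> float:
    """Second-order Renyi entropy (bits) of the empirical distribution of counts."""
    total = sum(counts.values())
    collision = sum((c / total) ** 2 for c in counts.values())
    return -math.log2(collision)


# Hypothetical PIN counts stored in one of the divided databases.
db_counts = Counter({"1234": 4, "0000": 2, "9999": 1, "2580": 1})

r = renyi2_from_counts(db_counts)
perfect_guess_success = 2 ** (-r)  # success probability when q(x) = p(x)

print(f"r = {r:.4f} bits, 2^-r = {perfect_guess_success:.5f}")
```

A larger r (a flatter distribution) lowers 2^−r, which is why the text requires r to be sufficiently large.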

The division means 21 performs the division under the additional condition that the differences between the second-order Rényi entropy of the data distribution of the database 30 and each of the second-order Rényi entropies of the first and second distributions are within predetermined ranges.
Specifically, to increase security in the case where the attacker can estimate p(x) perfectly, the division means 21 performs the division under the additional condition that r satisfies equation (13).

In addition, considering condition 3, equation (14) is imposed to increase the resistance to the DGA when the attacker has no information about p(x).
|χ_1| = |χ_2| = |χ| (14)
Here, χ_1 and χ_2 are the sets of values that the data of the first DB 31 and the second DB 32 can take, respectively, and χ is the set of values that the total data contained in the first DB 31 and the second DB 32 can take.
From the above, to improve the resistance to the DGA, it suffices to increase D(X_1, X_2) in equation (12) under the conditions of equations (13) and (14).

すなわち、分割手段２１は、第１分布と第２分布との距離が、第１分布又は第２分布と一様分布との距離に基づく一定の値以上（上述のＤ（Ｘ，Ｙ）^２≧２Ｅ（Ｘ）^２）であるように、データベース３０を第１ＤＢ３１と第２ＤＢ３２とに分割する。 That is, the dividing unit 21 determines that the distance between the first distribution and the second distribution is equal to or more than a certain value based on the distance between the first distribution or the second distribution and the uniform distribution (the above-described D (X, Y) ² ≧ 2E (X) ² ), the database 30 is divided into a first DB 31 and a second DB 32.

分割手段２１は、第１分布及び第２分布をヒストグラムに表した場合の階級の個数と、データベース３０のデータ分布をヒストグラムに表した場合の階級の個数とをそれぞれ同一にするという、追加した条件の下で分割する。
具体的には、分割手段２１は、攻撃者がｐ（ｘ）に関して何の情報も持たないときのＤＧＡへの耐性を高くするために、式（１４）を満たすように分割する。 The dividing means 21 adds an additional condition that the number of classes when the first distribution and the second distribution are represented by a histogram is equal to the number of classes when the data distribution of the database 30 is represented by a histogram. Split under.
Specifically, the dividing unit 21 divides so as to satisfy Expression (14) in order to increase the resistance to DGA when the attacker has no information on p (x).
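Equation (14) can be checked mechanically on a candidate division. The sketch below treats χ_1 and χ_2 as the sets of values that actually occur in each database and χ as their union; this reading, the function name, and the example counts are assumptions made for illustration.

```python
from typing import Mapping


def satisfies_equation_14(c1: Mapping[str, int], c2: Mapping[str, int]) -> bool:
    """Return True when |chi_1| = |chi_2| = |chi|.

    c1 and c2 map each value x to its count in the first and second DB.
    """
    support1 = {x for x, n in c1.items() if n > 0}
    support2 = {x for x, n in c2.items() if n > 0}
    support_all = support1 | support2
    return len(support1) == len(support2) == len(support_all)


# Every value appears in both databases: the histograms have the same classes.
print(satisfies_equation_14({"a": 3, "b": 1}, {"a": 1, "b": 3}))
# Each value appears in only one database: an empty class reveals information.
print(satisfies_equation_14({"a": 4, "b": 0}, {"a": 0, "b": 4}))
```

Because χ is the union of the two supports, equal sizes imply that both databases contain every value at least once.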

［分割アルゴリズム］
データの個数が増加するにつれて、分割の組み合わせ総数は指数的に増加するため、効率的に解く分割アルゴリズムについて説明する。 [Division algorithm]
Since the total number of divisions increases exponentially as the number of data increases, a division algorithm that solves efficiently will be described.

Here, let c_1(x) and c_2(x) be the numbers of data items taking the value x ∈ χ in the first DB 31 and the second DB 32, respectively. The number of items taking the value x in the total data of the first DB 31 and the second DB 32 is then given by c(x) = c_1(x) + c_2(x).
The division algorithm finds the c_1(x) and c_2(x) that maximize the objective function, with equation (15) as the objective function and equations (16) to (18) as the constraints.

分割アルゴリズムは、初期設定からＳｔｅｐ４までの手順からなる。 The division algorithm includes procedures from initial setting to Step 4.

初期設定：Ｎ個の秘密情報データを同一サイズの第１ＤＢ３１及び第２ＤＢ３２に分割する。ただし、すべてのｘ∈χにおいてｃ_１（ｘ）＝ｃ_２（ｘ）を満たすように分割する。次いで、Δ＝｛δ（ｘ）＝ｃ（ｘ）−２｝_{（ｘ∈χ）}を作成する。 Initial setting: N pieces of secret information data are divided into a first DB 31 and a second DB 32 having the same size. However, division is performed so as to satisfy c ₁ (x) = c ₂ (x) in all x∈χ. Next, Δ = {δ (x) = c (x) −2} _(x∈χ) is created.

Ｓｔｅｐ１：最も大きいδ（ｘ）∈Δと対応するｘを選択し、［δ（ｘ）−｜ｃ_１（ｘ）−ｃ_２（ｘ）｜］／２個のデータを第２ＤＢ３２から第１ＤＢ３１に移動させる。 Step 1: x corresponding to the largest δ (x) ∈Δ is selected, and [δ (x) − | c ₁ (x) −c ₂ (x) |] / 2 data is transferred from the second DB 32 to the first DB 31. Move.

Ｓｔｅｐ２：［δ（ｘ）−｜ｃ_１（ｘ）−ｃ_２（ｘ）｜］／２個のデータを式（１６）を満たすように第１ＤＢ３１から第２ＤＢ３２に移動させる。ただし、この移動はグリーディ法により行う。 Step 2: [δ (x) − | c ₁ (x) −c ₂ (x) |] / 2 pieces of data are moved from the first DB 31 to the second DB 32 so as to satisfy Expression (16). However, this movement is performed by the greedy method.

The greedy method used in Step 2 is as follows. When e data items taking the value x are moved from the first DB 31 to the second DB 32, the change in c_2(x)^2 − c_1(x)^2 is given by:
[(c_2(x) + e)^2 − (c_1(x) − e)^2] − [c_2(x)^2 − c_1(x)^2] (19)
= 2 · [c_2(x) + c_1(x)] · e (20)
= 2 · c(x) · e (21)
Therefore, moving data of a value x with a large c(x) changes the value of expression (22) by a larger amount with fewer moved items. For this reason, the greedy method of Step 2 moves data in descending order of c(x).
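The identity in (19) to (21) and the resulting greedy ordering can be confirmed numerically. The sketch below is illustrative only; the function names and the sample counts are hypothetical.

```python
from typing import Dict, List


def gap_change(c1_x: int, c2_x: int, e: int) -> int:
    """Change in c_2(x)^2 - c_1(x)^2 when e items of value x move from DB1 to DB2."""
    before = c2_x ** 2 - c1_x ** 2
    after = (c2_x + e) ** 2 - (c1_x - e) ** 2
    return after - before


def greedy_order(c: Dict[str, int]) -> List[str]:
    """Values sorted by total count c(x), largest first (the Step 2 priority)."""
    return sorted(c, key=lambda x: c[x], reverse=True)


# The direct computation agrees with the closed form 2 * c(x) * e.
c1_x, c2_x, e = 5, 3, 2
print(gap_change(c1_x, c2_x, e), 2 * (c1_x + c2_x) * e)

# Values with larger c(x) are moved first because each item changes the gap more.
print(greedy_order({"a": 3, "b": 10, "c": 7}))
```

Since the change per moved item is 2·c(x), sorting by c(x) gives the largest effect with the fewest moves.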

Ｓｔｅｐ３：δ（ｘ）＝⊥（データ終了）としてΔを更新する。 Step 3: Δ is updated as δ (x) = ⊥ (data end).

ただし、Ｓｔｅｐ２において、条件を満たすデータの移動ができなかった場合には、Ｓｔｅｐ１及びＳｔｅｐ２を行う前の状態に戻し、δ（ｘ）＝δ（ｘ）−２としてΔを更新する。 However, if the data that satisfies the condition cannot be moved in Step 2, the state is returned to the state before performing Step 1 and Step 2, and Δ is updated as δ (x) = δ (x) −2.

Ｓｔｅｐ４：Ｓｔｅｐ１からＳｔｅｐ３までの動作をΔのすべてのδ（ｘ）が０以下もしくは⊥になるまで繰り返す。 Step 4: The operation from Step 1 to Step 3 is repeated until all δ (x) of Δ become 0 or less or ⊥.

Note that this step is performed according to the following conditions.
・In Step 1, if c_2(x) > c_1(x) holds, the data are moved from the first DB 31 to the second DB 32. In this case, Step 2 moves data from the second DB 32 to the first DB 31.
・In Step 1, if |c_1(x) − c_2(x)| ≥ δ(x) holds, no data for that x are moved; Δ is updated with δ(x) = ⊥ (data end), and the process moves on to the next x.
・In Step 2, only data of values x with δ(x) ≠ ⊥ (data end) are moved.

本アルゴリズムにおいてグリーディ法の計算量はＮ／２を超えることはない。また、Ｓｔｅｐ３の計算量はΔの更新回数に比例するため、最悪の場合でも高々Ｎ／２である。したがって、本分割アルゴリズムの最悪時の計算量はΟ（Ｎ^２）となる。 In this algorithm, the amount of calculation of the greedy method does not exceed N / 2. In addition, since the calculation amount of Step 3 is proportional to the number of updates of Δ, the worst case is N / 2 at most. Therefore, the worst case calculation amount of the present division algorithm is Ο (N ² ).
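The initial setting and Step 1 of the division algorithm can be sketched as below. This is a partial illustration, not the full algorithm: Step 2 depends on constraint (16), which is not reproduced in this section, so it is omitted. The sketch assumes every value occurs an even number of times so that c_1(x) = c_2(x) is achievable, and it represents ⊥ as `None`. All names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Delta = Dict[str, Optional[int]]


def initial_setting(data: List[str]) -> Tuple[Dict[str, int], Dict[str, int], Delta]:
    """Split the data evenly per value and build delta(x) = c(x) - 2."""
    c = Counter(data)
    if any(n % 2 for n in c.values()):
        raise ValueError("sketch assumes every value occurs an even number of times")
    c1 = {x: n // 2 for x, n in c.items()}
    c2 = {x: n // 2 for x, n in c.items()}
    delta: Delta = {x: n - 2 for x, n in c.items()}
    return c1, c2, delta


def step1(c1: Dict[str, int], c2: Dict[str, int], delta: Delta) -> Tuple[str, int]:
    """Pick x with the largest live delta(x) and move (delta - |c1 - c2|) / 2 items."""
    live = {x: d for x, d in delta.items() if d is not None}
    x = max(live, key=live.get)  # raises ValueError if no live entry remains
    gap = abs(c1[x] - c2[x])
    if gap >= live[x]:
        delta[x] = None          # no move; mark x as finished
        return x, 0
    moved = (live[x] - gap) // 2
    if c2[x] > c1[x]:            # reversed direction: DB1 -> DB2
        c1[x] -= moved
        c2[x] += moved
    else:                        # default direction: DB2 -> DB1
        c1[x] += moved
        c2[x] -= moved
    return x, moved


c1, c2, delta = initial_setting(["a"] * 6 + ["b"] * 4 + ["c"] * 2)
x, moved = step1(c1, c2, delta)
print(x, moved, c1, c2)
```

After Step 1 the chosen value is split as unevenly as possible while keeping at least one item in each database, which is consistent with equation (14).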

次に、上述の分割アルゴリズムについて、フローチャートを用いて説明する。
図４及び図５は、本発明の一実施形態に係るデータベース分割装置２０の分割アルゴリズムの例を示すフローチャートである。データベース分割装置２０は、コンピュータ及びその周辺装置が備えるハードウェア並びに該ハードウェアを制御するソフトウェアによって構成される。以下の処理は、制御部（例えば、ＣＰＵ）が、所定のソフトウェアに従い実行する処理である。なお、本処理は、データベース評価装置１０によってデータベース３０の一部が安全でないと判定された場合に、起動されるとしてもよい。 Next, the above-described division algorithm will be described with reference to a flowchart.
FIGS. 4 and 5 are flowcharts illustrating an example of a division algorithm of the database division device 20 according to an embodiment of the present invention. The database partitioning device 20 is configured by hardware included in a computer and its peripheral devices, and software that controls the hardware. The following process is a process executed by a control unit (for example, a CPU) according to predetermined software. Note that this process may be started when the database evaluation device 10 determines that a part of the database 30 is not secure.

In step S201, the CPU (division means 21) performs the initial setting. More specifically, the CPU divides the N secret information data items into the first DB 31 and the second DB 32 of the same size, and creates Δ = {δ(x) = c(x) − 2} (x ∈ χ). The CPU performs this division so that c_1(x) = c_2(x) holds for every x ∈ χ.

ステップＳ２０２において、ＣＰＵ（分割手段２１）は、最も大きいδ（ｘ）∈Δと対応するｘを選択し、｜ｃ_１（ｘ）−ｃ_２（ｘ）｜≧δ（ｘ）が成立するか否かを判断する。この判断が、ＹＥＳの場合、ＣＰＵは、処理をステップＳ２０７に移し、この判断が、ＮＯの場合、ＣＰＵは、処理をステップＳ２０３に移す。 In step S202, the CPU (dividing means 21) selects x corresponding to the largest δ (x) ∈Δ, and | c ₁ (x) −c ₂ (x) | ≧ δ (x) holds. Determine whether or not. When this determination is YES, the CPU shifts the processing to step S207, and when this determination is NO, the CPU shifts the processing to step S203.

ステップＳ２０３において、ＣＰＵ（分割手段２１）は、ｃ_２（ｘ）＞ｃ_１（ｘ）が成立するか否かを判断する。この判断が、ＹＥＳの場合、ＣＰＵは、処理をステップＳ２１０に移し、この判断が、ＮＯの場合、ＣＰＵは、処理をステップＳ２０４に移す。 In step S203, the CPU (dividing means 21) determines whether c ₂ (x)> c ₁ (x) holds. When this determination is YES, the CPU shifts the processing to step S210, and when this determination is NO, the CPU shifts the processing to step S204.

ステップＳ２０４において、ＣＰＵ（分割手段２１）は、データを第２ＤＢ３２から第１ＤＢ３１に移動させる。より具体的には、ＣＰＵは、［δ（ｘ）−｜ｃ_１（ｘ）−ｃ_２（ｘ）｜］／２個のデータを第２ＤＢ３２から第１ＤＢ３１に移動させる。 In step S204, the CPU (dividing means 21) moves the data from the second DB 32 to the first DB 31. More specifically, the CPU moves [δ (x) − | c ₁ (x) −c ₂ (x) |] / 2 data from the second DB 32 to the first DB 31.

ステップＳ２０５において、ＣＰＵ（分割手段２１）は、グリーディ法によりデータを第１ＤＢ３１から第２ＤＢ３２に移動させる。より具体的には、ＣＰＵは、［δ（ｘ）−｜ｃ_１（ｘ）−ｃ_２（ｘ）｜］／２個のデータを式（１６）を満たすように第１ＤＢ３１から第２ＤＢ３２に移動させる。ＣＰＵは、値ｘを取るデータを第１ＤＢ３１から第２ＤＢ３２に移動させたときのｃ_２（ｘ）^２−ｃ_１（ｘ）^２の変化量の大きいデータから順に移動させる（グリーディ法）。 In step S205, the CPU (dividing means 21) moves data from the first DB 31 to the second DB 32 by the greedy method. More specifically, the CPU moves [δ (x) − | c ₁ (x) −c ₂ (x) |] / 2 data from the first DB 31 to the second DB 32 so as to satisfy Expression (16). Let it. The CPU moves the data having the value x from the first DB 31 to the second DB 32 in order from the data having the largest change amount of c ₂ (x) ² −c ₁ (x) ² (greedy method).

ステップＳ２０６において、ＣＰＵ（分割手段２１）は、式（１６）を満たすように移動できたか否かを判断する。この判断が、ＹＥＳの場合、ＣＰＵは、処理をステップＳ２０７に移し、この判断が、ＮＯの場合、ＣＰＵは、処理をステップＳ２０８に移す。 In step S206, the CPU (division unit 21) determines whether or not the movement has been performed so as to satisfy Expression (16). If this determination is YES, the CPU shifts the processing to step S207. If this determination is NO, the CPU shifts the processing to step S208.

ステップＳ２０７において、ＣＰＵ（分割手段２１）は、δ（ｘ）＝⊥（データ終了）としてΔを更新する。その後、ＣＰＵは、処理をステップＳ２０９に移す。 In step S207, the CPU (division unit 21) updates Δ as δ (x) = ⊥ (data end). Thereafter, the CPU moves the processing to step S209.

ステップＳ２０８において、ＣＰＵ（分割手段２１）は、ステップＳ２０４及びステップＳ２０５におけるデータの移動を前の状態に戻し、δ（ｘ）＝δ（ｘ）−２としてΔを更新する。 In step S208, the CPU (division unit 21) returns the data movement in steps S204 and S205 to the previous state, and updates Δ as δ (x) = δ (x) −2.

ステップＳ２０９において、ＣＰＵ（分割手段２１）は、Δのすべてのδ（ｘ）が０以下又は⊥（データの終了）か否かを判断する。この判断が、ＹＥＳの場合、ＣＰＵは、処理を終了し、この判断が、ＮＯの場合、ＣＰＵは、処理をステップＳ２０２に移す。 In step S209, the CPU (division unit 21) determines whether all δ (x) of Δ are equal to or less than 0 or ⊥ (end of data). When this determination is YES, the CPU ends the process, and when this determination is NO, the CPU shifts the processing to step S202.

ステップＳ２１０において、ＣＰＵ（分割手段２１）は、データを第１ＤＢ３１から第２ＤＢ３２に移動させる。より具体的には、ＣＰＵは、［δ（ｘ）−｜ｃ_１（ｘ）−ｃ_２（ｘ）｜］／２個のデータを第１ＤＢ３１から第２ＤＢ３２に移動させる。 In step S210, the CPU (dividing means 21) moves data from the first DB 31 to the second DB 32. More specifically, the CPU moves [δ (x) − | c ₁ (x) −c ₂ (x) |] / 2 data from the first DB 31 to the second DB 32.

In step S211, the CPU (division means 21) moves data from the second DB 32 to the first DB 31 by the greedy method. More specifically, the CPU moves [δ(x) − |c_1(x) − c_2(x)|]/2 data items from the second DB 32 to the first DB 31 so as to satisfy equation (16). The CPU moves data in descending order of the change in c_2(x)^2 − c_1(x)^2 that results from moving data taking the value x from the second DB 32 to the first DB 31 (greedy method).

ステップＳ２１２において、ＣＰＵ（分割手段２１）は、式（１６）を満たすように移動できたか否かを判断する。この判断が、ＹＥＳの場合、ＣＰＵは、処理をステップＳ２１３に移し、この判断が、ＮＯの場合、ＣＰＵは、処理をステップＳ２１４に移す。 In step S212, the CPU (division unit 21) determines whether or not the movement has been performed so as to satisfy Expression (16). If this determination is YES, the CPU moves the process to step S213, and if this determination is NO, the CPU moves the process to step S214.

ステップＳ２１３において、ＣＰＵ（分割手段２１）は、δ（ｘ）＝⊥（データ終了）としてΔを更新する。その後、ＣＰＵは、処理をステップＳ２０９に移す。 In step S213, the CPU (division unit 21) updates Δ as δ (x) = ⊥ (data end). Thereafter, the CPU moves the processing to step S209.

ステップＳ２１４において、ＣＰＵ（分割手段２１）は、ステップＳ２１０及びステップＳ２１１におけるデータの移動を前の状態に戻し、δ（ｘ）＝δ（ｘ）−２としてΔを更新する。その後、ＣＰＵは、処理をステップＳ２０９に移す。 In step S214, the CPU (dividing means 21) returns the data movement in step S210 and step S211 to the previous state, and updates Δ as δ (x) = δ (x) -2. Thereafter, the CPU moves the processing to step S209.

図６は、本発明の一実施形態に係るデータベース分割装置２０により分割された第１ＤＢ３１及び第２ＤＢ３２のヒストグラムの例を示す図である。分割される前のデータベース３０は、秘密情報が３２個の異なる値を取るものとし、１０００個のデータを記憶している。ただし、それらのデータは、標準正規分布にしたがって生起されている。データについて、［−３σ，３σ］の区間が均等に３２分割され、それぞれの区間に秘密情報の各値が割り当てられている。 FIG. 6 is a diagram illustrating an example of histograms of the first DB 31 and the second DB 32 divided by the database dividing device 20 according to an embodiment of the present invention. The database 30 before being divided assumes that the secret information takes 32 different values, and stores 1000 data. However, those data are generated according to the standard normal distribution. In the data, the section [-3σ, 3σ] is equally divided into 32 sections, and each section is assigned a value of secret information.
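Test data of the kind described for FIG. 6 can be generated with a short sketch: 1000 samples from the standard normal distribution, binned into 32 equal classes over [−3σ, 3σ]. The text does not say how samples outside ±3σ are handled, so redrawing them is an assumption here, as are the function name and the fixed seed.

```python
import random
from typing import List


def sample_histogram(n_samples: int = 1000, n_bins: int = 32, seed: int = 0) -> List[int]:
    """Histogram of standard-normal samples over [-3, 3] split into equal bins."""
    rng = random.Random(seed)
    width = 6.0 / n_bins           # sigma = 1, so the range is [-3, 3]
    counts = [0] * n_bins
    accepted = 0
    while accepted < n_samples:
        z = rng.gauss(0.0, 1.0)
        if not (-3.0 <= z < 3.0):  # assumption: redraw samples outside +-3 sigma
            continue
        index = min(n_bins - 1, int((z + 3.0) / width))
        counts[index] += 1
        accepted += 1
    return counts


hist = sample_histogram()
print(len(hist), sum(hist))
```

The resulting counts play the role of c(x) for the undivided database 30 and can be fed to a division procedure.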

図６の例は、データベース３０が分割アルゴリズムによって分割され、分割された第１ＤＢ３１と、第２ＤＢ３２とのそれぞれのヒストグラムの例を示す。図６における横軸は、秘密情報の階級（取りうる値の参照番号）を示し、縦軸はその頻度を示す。図６の例では、ほぼすべての参照番号において、第１ＤＢ３１と第２ＤＢ３２との間でデータの出現頻度が大きく異なっている。したがって、図６の例は、データベース分割装置２０によって、第１ＤＢ３１又は第２ＤＢ３２のどちらのデータベースが漏洩した場合においても、ＤＧＡに対して耐性のある秘密情報の分割保管がされたことを示している。 The example of FIG. 6 shows an example of the histograms of the first DB 31 and the second DB 32 obtained by dividing the database 30 by the division algorithm. The horizontal axis in FIG. 6 indicates the class of the secret information (reference numbers of possible values), and the vertical axis indicates the frequency. In the example of FIG. 6, the appearance frequency of data is largely different between the first DB 31 and the second DB 32 for almost all reference numbers. Therefore, the example of FIG. 6 shows that the secret information resistant to DGA is divided and stored by the database dividing device 20 even when either the first DB 31 or the second DB 32 is leaked. .

According to the present embodiment, the database evaluation device 10 assumes that the first DB 31, which is a part of the database 30, has leaked, and estimates the second distribution, which is the data distribution of the second DB 32 (the part of the database 30 other than the first DB 31), based on the first distribution, which is the data distribution of the first DB 31. It uses as the evaluation index the average attack success probability, that is, the probability that data selected from the estimated distribution matches data selected from the second distribution so that the attack succeeds, and determines whether the database 30 is safe based on that index. The database evaluation device 10 determines that the database 30 is safe when the evaluation index is less than or equal to the probability that an attack succeeds when the estimated distribution is assumed to be a uniform distribution. The database evaluation device 10 also calculates the first distance between the estimated distribution and the second distribution and the second distance between the uniform distribution and the second distribution, and determines that the database 30 is safe when the second-order Rényi entropy of the estimated distribution is greater than or equal to the second-order Rényi entropy of the second distribution and the first distance is greater than or equal to a certain value based on the second distance.
Therefore, when a part of the database 30 leaks, the database evaluation device 10 can evaluate the security of the remaining part.

According to the present embodiment, the database division device 20 divides the database 30 so that it is evaluated as safe by the database evaluation device 10. The database division device 20 divides the database 30 into the first DB 31 and the second DB 32 so that the distance between the first distribution and the second distribution is greater than or equal to a certain value based on the distance between the first or second distribution and the uniform distribution, under the conditions that the difference between the second-order Rényi entropy of the first distribution (the data distribution of the first DB 31) and that of the second distribution (the data distribution of the second DB 32) is within a predetermined range and that the first and second distributions have the same number of classes when represented as histograms.
Further, the database division device 20 performs the division under the additional condition that the differences between the second-order Rényi entropy of the data distribution of the database 30 and each of the second-order Rényi entropies of the first and second distributions are within predetermined ranges.
Further, the database division device 20 performs the division under the additional condition that the numbers of classes of the first and second distributions, when represented as histograms, each equal the number of classes of the data distribution of the database 30 when represented as a histogram.
Further, using the greedy method, the database division device 20 moves data with priority given to classes that produce a large change in the distance between the first DB 31 and the second DB 32.
Therefore, when dividing the database 30, the database division device 20 can divide it so that even if one of the divided parts leaks, the other part remains safe.
As described above, the present invention can evaluate security when an information leak occurs. Further, when the secret information of some users leaks from the system, the present invention can minimize the effect on the secret information of the remaining users. The present invention can provide a cloud platform specialized for systems that handle a large amount of secret information.

The embodiments of the present invention have been described above, but the present invention is not limited to these embodiments. The effects described in the embodiments merely list the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiments.

The series of processes performed by the database evaluation device 10 or the database division device 20 can also be carried out by software. In that case, the program constituting the software is installed on a general-purpose computer or the like. The program may be recorded on a computer-readable recording medium (for example, a removable medium such as a CD-ROM) and distributed to users, or it may be distributed by being downloaded to a user's computer over a network.

DESCRIPTION OF SYMBOLS
10 Database evaluation device
11 Estimation means
12 Determination means
13 First distance calculation means
14 Second distance calculation means
20 Database division device
21 Division means
30 Database
31 First DB
32 Second DB

Claims

A database evaluation device that evaluates a database storing secret information data of users, the device comprising:
estimating means for, given a first database that is a part of the database and stores the secret information data of some of the users, and a second database that is the part of the database other than the first database and stores the secret information data of the users other than those some users, estimating, on the assumption that the first database has been leaked, a second distribution that is the data distribution of the second database from a first distribution that is the data distribution of the first database, using a distribution estimation algorithm; and
determining means for determining whether the database is safe based on an evaluation index, the evaluation index being an average attack success probability, which is the probability that data selected according to the estimated distribution obtained by the estimating means matches data selected according to the second distribution so that an attack succeeds.
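
As an illustration of the evaluation index in the preceding claim, the following is a minimal sketch, not the claimed implementation. It assumes that the "distribution estimation algorithm" is simply the empirical histogram of the leaked first database, and that the safety decision is a comparison against a caller-supplied threshold. The function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def empirical_distribution(values):
    """Return the relative frequency of each value (assumed estimator)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def average_attack_success_probability(estimated, second):
    """Probability that a value drawn from `estimated` equals an
    independently drawn value from `second`: sum over x of q(x) * p2(x)."""
    return sum(q * second.get(value, 0.0) for value, q in estimated.items())


def is_safe(first_db, second_db, threshold):
    """Hypothetical decision rule: safe if the index does not exceed `threshold`."""
    estimated = empirical_distribution(first_db)   # attacker's guess from the leak
    second = empirical_distribution(second_db)     # true distribution of the rest
    return average_attack_success_probability(estimated, second) <= threshold


if __name__ == "__main__":
    leaked = ["a", "a", "b", "c"]
    remaining = ["a", "b", "b", "d"]
    q = empirical_distribution(leaked)
    p2 = empirical_distribution(remaining)
    print(average_attack_success_probability(q, p2))
```

Values that appear only in one of the two databases contribute nothing to the sum, which is why a split that separates the frequent values lowers the index.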

The database evaluation device according to claim 1, wherein the determining means determines that the database is safe when the evaluation index is equal to or less than the probability that an attack succeeds under the assumption that the estimated distribution is a uniform distribution.
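
The uniform baseline in the preceding claim can be written out as a short worked equation. The symbols are assumptions introduced here for illustration: $q$ is the estimated distribution, $p_2$ the second distribution, $u$ the uniform distribution, and $N$ the number of possible secret values, with $p_2$ supported on those $N$ values.

```latex
S(q, p_2) = \sum_{x} q(x)\, p_2(x), \qquad
S(u, p_2) = \sum_{x} \frac{1}{N}\, p_2(x) = \frac{1}{N}, \qquad
\text{safe} \iff S(q, p_2) \le \frac{1}{N}
```

Under these assumptions, a uniform guess succeeds with probability $1/N$ regardless of $p_2$, so the claim compares the evaluation index against the success rate of an attacker who learned nothing from the leaked first database.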

The database evaluation device according to claim 1, further comprising:
first distance calculating means for calculating a first distance between the estimated distribution and the second distribution; and
second distance calculating means for calculating a second distance between a uniform distribution and the second distribution,
wherein the determining means, instead of making the determination based on the evaluation index, determines that the database is safe when the second-order Rényi entropy of the estimated distribution is equal to or greater than the second-order Rényi entropy of the second distribution and the first distance is equal to or greater than a value obtained by multiplying the second distance by a predetermined coefficient.
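
The alternative test in the preceding claim can be sketched as follows. This is an illustration under stated assumptions: the claim does not name the distance measure, so the Euclidean (L2) distance between probability vectors is used here as a stand-in, and the names `renyi2`, `l2_distance`, `is_safe_by_distance`, `support`, and `coefficient` are hypothetical.

```python
import math


def renyi2(dist):
    """Second-order Renyi entropy in bits: -log2(sum of p(x)^2)."""
    return -math.log2(sum(p * p for p in dist.values()))


def l2_distance(a, b):
    """Euclidean distance between two distributions (assumed distance measure)."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def is_safe_by_distance(estimated, second, support, coefficient):
    """Safe if the estimate is at least as spread out as the second distribution
    and lies at least `coefficient` times as far from it as the uniform one."""
    uniform = {value: 1.0 / len(support) for value in support}
    first_distance = l2_distance(estimated, second)
    second_distance = l2_distance(uniform, second)
    entropy_ok = renyi2(estimated) >= renyi2(second)
    return entropy_ok and first_distance >= coefficient * second_distance


if __name__ == "__main__":
    support = ["a", "b", "c", "d"]
    estimated = {"a": 0.5, "b": 0.5}
    second = {"c": 1.0}
    print(is_safe_by_distance(estimated, second, support, coefficient=1.0))
```

The intuition is that an estimate farther from the second distribution than the uniform distribution is gives the attacker no advantage over blind guessing.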

A database dividing device that divides a database storing secret information data of users, the device comprising:
dividing means for dividing the database into a first database, which is a part of the database and stores the secret information data of some of the users, and a second database, which is the part of the database other than the first database and stores the secret information data of the users other than those some users,
wherein the division is performed under the conditions that the difference between the second-order Rényi entropy of a first distribution, which is the data distribution of the first database, and the second-order Rényi entropy of a second distribution, which is the data distribution of the second database, is within a predetermined range, and that the first distribution and the second distribution have the same number of classes when represented as histograms, and
wherein the division is performed such that the probability that data selected from an estimated distribution, obtained by estimating the data distribution of the other database from the data distribution of one of the first database and the second database using an arbitrary distribution estimation algorithm, matches data selected from the other database so that an attack on the other database succeeds, is equal to or less than a predetermined value.
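
The two side conditions in the preceding claim (similar second-order Rényi entropy and an equal number of histogram classes) can be checked as in the sketch below. It assumes that each database is summarized as a histogram mapping class to a positive record count, and that "within a predetermined range" means an absolute difference no larger than a bound. The names `renyi2_from_counts`, `satisfies_division_conditions`, and `max_entropy_gap` are hypothetical.

```python
import math


def renyi2_from_counts(counts):
    """Second-order Renyi entropy (bits) of a histogram of positive counts."""
    total = sum(counts.values())
    return -math.log2(sum((c / total) ** 2 for c in counts.values()))


def satisfies_division_conditions(counts1, counts2, max_entropy_gap):
    """Check the equal-class-count and entropy-gap conditions for a candidate split."""
    same_number_of_classes = len(counts1) == len(counts2)
    gap = abs(renyi2_from_counts(counts1) - renyi2_from_counts(counts2))
    return same_number_of_classes and gap <= max_entropy_gap


if __name__ == "__main__":
    first = {"a": 2, "b": 2}
    second = {"x": 3, "y": 3}
    print(satisfies_division_conditions(first, second, max_entropy_gap=0.1))
```

These conditions keep the two parts statistically similar in shape, so that the leaked part does not reveal which part is the easier target.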

The database dividing device according to claim 4, wherein the dividing means divides the database into the first database and the second database such that the distance between the first distribution and the second distribution is equal to or greater than a certain value that is based on the distance between the first distribution or the second distribution and a uniform distribution.

The database dividing device according to claim 4, wherein the dividing means performs the division under the further condition that the difference between the second-order Rényi entropy of each of the first distribution and the second distribution and the second-order Rényi entropy of the data distribution of the database is within a predetermined range.

The database dividing device according to claim 4, wherein the dividing means performs the division under the further condition that the number of classes when each of the first distribution and the second distribution is represented as a histogram is the same as the number of classes when the data distribution of the database is represented as a histogram.

The database dividing device according to any one of claims 4 to 7, wherein the dividing means moves data between the first database and the second database by a greedy method, giving priority to classes for which the resulting change in the distance between the two databases is large.
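
The greedy movement of data described in the preceding claim could look like the sketch below. This is only one possible reading, not the claimed procedure: it assumes that both databases are given as histograms over the same classes, uses the Euclidean distance as a stand-in for the unspecified distance, moves one record at a time, and keeps at least one record per class so that the number of classes does not change. It does not enforce the entropy condition. All names, including `target_distance` and `max_moves`, are hypothetical.

```python
import math


def normalize(counts):
    """Convert a histogram of counts into a probability distribution."""
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def l2_distance(a, b):
    """Euclidean distance between two distributions (assumed distance measure)."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def greedy_split(counts1, counts2, target_distance, max_moves=1000):
    """Repeatedly move the single record whose move increases the distance most."""
    c1, c2 = dict(counts1), dict(counts2)
    for _ in range(max_moves):
        current = l2_distance(normalize(c1), normalize(c2))
        if current >= target_distance:
            break
        best = None  # (gain, class, source histogram, destination histogram)
        for src, dst in ((c1, c2), (c2, c1)):
            for k in src:
                # Keep every class non-empty on both sides.
                if src[k] <= 1 or k not in dst:
                    continue
                src[k] -= 1
                dst[k] += 1
                gain = l2_distance(normalize(c1), normalize(c2)) - current
                src[k] += 1  # undo the trial move
                dst[k] -= 1
                if best is None or gain > best[0]:
                    best = (gain, k, src, dst)
        if best is None or best[0] <= 0:
            break  # no move increases the distance any further
        _, k, src, dst = best
        src[k] -= 1
        dst[k] += 1
    return c1, c2


if __name__ == "__main__":
    first, second = greedy_split({"a": 5, "b": 5}, {"a": 5, "b": 5}, 0.5)
    print(first, second, l2_distance(normalize(first), normalize(second)))
```

A greedy choice is cheap to evaluate per step but is not guaranteed to reach the best possible split, which is why the sketch stops as soon as no single move helps.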

A method performed by the database evaluation device according to claim 1, the method comprising:
an estimating step in which the estimating means, given a first database that is a part of the database and stores the secret information data of some of the users, and a second database that is the part of the database other than the first database and stores the secret information data of the users other than those some users, estimates, on the assumption that the first database has been leaked, a second distribution that is the data distribution of the second database from a first distribution that is the data distribution of the first database, using an arbitrary distribution estimation algorithm; and
a determining step in which the determining means determines whether the database is safe based on an evaluation index, the evaluation index being an average attack success probability, which is the probability that data selected according to the estimated distribution obtained in the estimating step matches data selected according to the second distribution so that an attack succeeds.

A method performed by the database dividing device according to claim 4, the method comprising:
a dividing step in which the dividing means divides the database into a first database, which is a part of the database and stores the secret information data of some of the users, and a second database, which is the part of the database other than the first database and stores the secret information data of the users other than those some users,
wherein the division is performed under the conditions that the difference between the second-order Rényi entropy of a first distribution, which is the data distribution of the first database, and the second-order Rényi entropy of a second distribution, which is the data distribution of the second database, is within a predetermined range, and that the first distribution and the second distribution have the same number of classes when represented as histograms, and
wherein the division is performed such that the probability that data selected from an estimated distribution, obtained by estimating the data distribution of the other database from the data distribution of one of the first database and the second database using an arbitrary distribution estimation algorithm, matches data selected from the other database so that an attack on the other database succeeds, is equal to or less than a predetermined value.

A program for causing the database evaluation device to execute each step of the method according to claim 9.

A program for causing the database dividing device to execute each step of the method according to claim 10.