JP5651568B2 - Database disturbance device, system, method and program

Info

Publication number: JP5651568B2
Application number: JP2011223770A
Authority: JP
Inventors: 五十嵐 大; 千田 浩司; 菊池 亮
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-10-11
Filing date: 2011-10-11
Publication date: 2015-01-14
Anticipated expiration: 2031-10-11
Also published as: JP2013083801A

Description

The present invention relates to a technique for performing data mining while protecting privacy.

A database disturbance technique that satisfies so-called Pk-anonymity has been proposed (see, for example, Patent Document 1).

Pk-anonymity is the property that no record in the database can be linked to the individual corresponding to that record with a probability of 1/k or more.

Patent Document 1: JP 2011-100116 A

However, the technique of Patent Document 1 assumes that the attribute values are so-called category attribute values, and it cannot be applied when the attribute values are so-called numerical attribute values.

An object of the present invention is to provide a database disturbance device, system, method, and program that satisfy Pk-anonymity and that can be applied even when the attribute values are numerical attribute values.

In a database disturbance device according to one aspect of the present invention, the database includes a plurality of records, each record includes a record identifier and at least one attribute value, ||·||₁ is the L1 distance of ·, and σ is a predetermined value. The device includes a disturbance unit that adds, to each of some or all of the attribute values included in the database, a value according to a Laplace distribution with variance 2σ² defined by the following equation.

The present invention can be applied even when the attribute values are numerical attribute values.

FIG. 1 is a block diagram for explaining the database disturbance system of the first embodiment.
FIG. 2 is a flowchart for explaining the database disturbance system of the first embodiment.
FIG. 3 is a flowchart for explaining the database disturbance system of the second embodiment.
FIG. 4 is a block diagram for explaining a modification of the database disturbance system.
FIG. 5 is a block diagram for explaining a modification of the database disturbance system.
FIG. 6 is a block diagram for explaining a modification of the database disturbance system.
FIG. 7 is a diagram for explaining an example of the database to be disturbed in the first embodiment.
FIG. 8 is a diagram for explaining an example of the database to be disturbed in the second embodiment.

Embodiments of the present invention will be described below with reference to the drawings.

[First embodiment]
The database disturbance system of the first embodiment includes a disturbance device 1 and an aggregation device 2, as illustrated in FIG. 1.

The disturbance device 1 includes, for example, a database storage unit 11, a disturbance unit 12, and a parameter determination unit 13. In this example, the disturbance unit 12 includes a rearrangement unit 14.

The aggregation device 2 includes, for example, an aggregation unit 21.

The database storage unit 11 stores the database to be disturbed. Information about the database stored in the database storage unit 11 is sent to the disturbance unit 12.

The database is composed of a plurality of records, as illustrated in FIG. 7.

Each record is composed of a record identifier and at least one attribute value. The record identifier is an identifier that identifies an individual, a so-called record ID. The record identifier is, for example, a name or an ID number corresponding to a name.

In the first embodiment, each attribute value is a vector belonging to a subset V of the n-dimensional real vectors, that is, a so-called numerical attribute value. n is an integer of 1 or more. When n = 1 and the attribute is, for example, "midterm exam score" or "final exam score", the attribute value is an integer from 0 to 100.

The disturbance unit 12 disturbs the database by adding, to each of some or all of the attribute values included in the database read from the database storage unit 11, a value according to the Laplace distribution with mean μ and variance 2σ² defined by the following formula (step S1). The disturbed database is sent to the rearrangement unit 14. When there are several attribute values to be disturbed, they may be disturbed independently of one another or dependently.

||·||₁ is the so-called L1 distance of ·; when · is an n-dimensional real vector, it is the sum of the absolute values of the components of that vector. n is an integer of 1 or more.

For example, let μ = 0. In this case, the Laplace distribution used by the disturbance unit 12 is as follows.
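The formulas themselves are not reproduced in this text. As a hedged reconstruction, a density consistent with the stated mean μ, the variance 2σ², and the L1 distance is the product of n independent one-dimensional Laplace densities with scale σ. The normalizing constant $(2\sigma)^{-n}$ follows from that reading and is an assumption, not something taken from the original formula:

```latex
f(x) = \frac{1}{(2\sigma)^{n}} \exp\!\left(-\frac{\lVert x - \mu \rVert_{1}}{\sigma}\right),
\qquad
f(x)\Big|_{\mu = 0} = \frac{1}{(2\sigma)^{n}} \exp\!\left(-\frac{\lVert x \rVert_{1}}{\sigma}\right)
```

For n = 1 this reduces to $\frac{1}{2\sigma} e^{-|x - \mu| / \sigma}$, whose variance is 2σ².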

The "value according to the Laplace distribution" is described below. First, a value according to a general probability density function f, of which the Laplace distribution is one case, is described.

1. "Value according to the probability density function f"
(1) When the domain of the probability density function f and the attribute value are one-dimensional
(i) Obtain the cumulative distribution function $F(x) = \int_{-\infty}^{x} f(x')\,dx'$.

(ii) Obtain the inverse function $F^{-1}$ of the cumulative distribution function F(x).

(iii) Generate a uniform random number r on the interval [0, 1].

(iv) Output $F^{-1}(r)$ as the "value according to the probability density function f".

When the cumulative distribution function F(x) or the inverse function $F^{-1}$ is available as a closed-form expression, $F^{-1}(r)$ may be calculated from that expression; otherwise, $F^{-1}(r)$ may be calculated numerically.
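The one-dimensional procedure above is inverse transform sampling. The following is a minimal sketch of it, not the patent's implementation: the function names, the bisection fallback for the numerical case, and the exponential density used for illustration are all assumptions introduced here.

```python
import math
import random


def sample_by_inverse_cdf(inverse_cdf, rng=random):
    """Steps (iii)-(iv): draw a uniform r and return F^-1(r)."""
    r = rng.random()  # uniform on [0, 1)
    return inverse_cdf(r)


def numeric_inverse_cdf(cdf, r, lo, hi, iterations=100):
    """Approximate F^-1(r) by bisection when no closed form is available.

    Assumes cdf is non-decreasing and cdf(lo) <= r <= cdf(hi).
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if cdf(mid) < r:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Illustrative density f(x) = exp(-x) on x >= 0 (hypothetical example,
# not taken from the source).
def exponential_cdf(x):
    return 1.0 - math.exp(-x)


def exponential_inverse_cdf(r):
    return -math.log(1.0 - r)


if __name__ == "__main__":
    rng = random.Random(0)
    # Closed-form inverse.
    print(sample_by_inverse_cdf(exponential_inverse_cdf, rng))
    # Numerical inverse on a bracketing interval.
    print(sample_by_inverse_cdf(
        lambda r: numeric_inverse_cdf(exponential_cdf, r, 0.0, 50.0), rng))
```

The closed-form path corresponds to the case where $F^{-1}$ is available as an expression; the bisection path is one possible way to do the numerical calculation the text allows.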

(2) When the domain of the probability density function f and the attribute value are n-dimensional
For each of i = 0, …, n−1, perform the following (i) and (ii).

(i) Fix $x_0$ through $x_{i-1}$, integrate over $x_{i+1}$ through $x_{n-1}$, and obtain the probability density function $f_i$ in which only $x_i$ remains as a variable.

(ii) Since the domain of the probability density function $f_i$ is one-dimensional, calculate the "value according to the probability density function $f_i$" by the same method as in "(1) When the domain of the probability density function f and the attribute value are one-dimensional" above.

Calculating the "value according to the probability density function $f_i$" for each of i = 0, …, n−1 yields n such values.
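The n-dimensional procedure above draws one coordinate at a time from a one-dimensional density. The sketch below approximates it on a finite grid, replacing each integral by a sum over grid points. The grid discretization, the function names, and the toy densities in the usage note are assumptions for illustration, and the cost grows as the grid size to the power n, so this is only practical for small n.

```python
import bisect
import itertools
import random


def sample_on_grid(weights, grid, rng):
    """Inverse-transform sampling of a 1-D density tabulated on a grid."""
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("density is zero everywhere on the grid")
    r = rng.random() * total
    # First grid index whose cumulative weight exceeds r.
    index = min(bisect.bisect_right(cumulative, r), len(grid) - 1)
    return grid[index]


def sample_sequentially(f, n, grid, rng):
    """Draw x_0, ..., x_{n-1} one coordinate at a time.

    f takes a tuple of n coordinates and returns a non-negative density
    value; it does not need to be normalized.
    """
    x = []
    for i in range(n):
        remaining = n - i - 1
        weights = []
        for xi in grid:
            # Step (i): x_0..x_{i-1} are fixed to the values already drawn,
            # and x_{i+1}..x_{n-1} are summed out (grid stand-in for the
            # integral), leaving a function of x_i only.
            total = sum(
                f(tuple(x) + (xi,) + rest)
                for rest in itertools.product(grid, repeat=remaining)
            )
            weights.append(total)
        # Step (ii): one-dimensional sampling from f_i.
        x.append(sample_on_grid(weights, grid, rng))
    return x
```

For example, `sample_sequentially(lambda x: 1.0, 3, [0.0, 0.5, 1.0], random.Random(0))` draws a three-component point from a uniform density on the grid.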

Applying the above method to the case where the probability density function is the Laplace distribution gives the following.

2. "Value according to the Laplace distribution"
(1) When the domain of the Laplace distribution and the attribute value are one-dimensional
(i) Generate a uniform random number r on the interval [0, 1] and a uniform random number b on the interval (0, 1).

(ii) Output $(-1)^{b} \sigma \log r + \mu$ as the "value according to the Laplace distribution".

(2) When the domain of the Laplace distribution and the attribute value are n-dimensional
(i) Calculate n "values according to the Laplace distribution", $x_0, x_1, \ldots, x_{n-1}$, by the same method as in "(1) When the domain of the Laplace distribution and the attribute value are one-dimensional" above.

(ii) Output these $x_0, x_1, \ldots, x_{n-1}$ as the "value according to the Laplace distribution".
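The Laplace sampling steps above can be sketched as follows. Two points are assumptions rather than statements from the text: b is treated as a uniform random bit in {0, 1}, because it is used as the exponent of −1, and r is drawn from (0, 1] so that log r stays finite. The function names are hypothetical.

```python
import math
import random


def laplace_value(sigma, mu=0.0, rng=random):
    """One-dimensional case: return (-1)^b * sigma * log(r) + mu."""
    r = 1.0 - rng.random()   # uniform on (0, 1], so log(r) is finite
    b = rng.getrandbits(1)   # assumption: b is a uniform random bit
    return (-1) ** b * sigma * math.log(r) + mu


def laplace_vector(n, sigma, mu=None, rng=random):
    """n-dimensional case: n independent one-dimensional values."""
    if mu is None:
        mu = [0.0] * n
    return [laplace_value(sigma, mu[i], rng) for i in range(n)]


def disturb_attribute(value, sigma, rng=random):
    """Add zero-mean Laplace noise to one n-dimensional attribute value."""
    noise = laplace_vector(len(value), sigma, rng=rng)
    return [v + e for v, e in zip(value, noise)]


if __name__ == "__main__":
    rng = random.Random(2011)
    # A hypothetical exam score of 70 disturbed with sigma = 5.
    print(disturb_attribute([70.0], 5.0, rng))
```

Since −σ log r is exponentially distributed with mean σ, attaching a random sign gives a Laplace value with scale σ and variance 2σ², which matches the variance stated in the text.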

The rearrangement unit 14 rearranges the order of the records included in the database disturbed by the disturbance unit 12 (step S2). The database whose records have been rearranged is sent to the aggregation device 2.

All or some of the records included in the database are subject to rearrangement. The records may be rearranged uniformly at random, at random, or according to a predetermined rearrangement rule such as ascending or descending order of some or all of the attribute values.
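A minimal sketch of step S2 is shown below, assuming each record is held as a list of already disturbed attribute values. The function name, its parameters, and the sample records are hypothetical.

```python
import random


def rearrange_records(records, rng=random, key=None, reverse=False):
    """Return a rearranged copy of the records (step S2)."""
    if key is not None:
        # Predetermined rule: ascending or descending order of an attribute.
        return sorted(records, key=key, reverse=reverse)
    shuffled = list(records)
    rng.shuffle(shuffled)  # uniformly random permutation
    return shuffled


if __name__ == "__main__":
    disturbed = [[71.3], [48.9], [90.2], [63.0]]
    print(rearrange_records(disturbed, random.Random(0)))
    print(rearrange_records(disturbed, key=lambda record: record[0]))
```

Returning a copy leaves the input list untouched, so the same disturbed database can be rearranged under different rules.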

When the number of attribute types is 1, let u and v be elements of the subset V of the n-dimensional real vectors to which the attribute value belongs; σ is assumed to be predetermined so as to satisfy the following formula (1) or (2). |R| is the number of records in the database.

When the number of attribute types is 2 or more, let u and v be elements of the subset V_a of the n-dimensional real vectors to which the attribute value of each attribute a belongs; σ is assumed to be predetermined so as to satisfy the following formula (3) or (4).

The parameter determination unit 13 may determine, based on a predetermined k, a σ that satisfies one of the above formulas (1) to (4) (step S0). In this case, the σ determined by the parameter determination unit 13 is sent to the disturbance unit 12.

The database disturbed in this way satisfies so-called Pk-anonymity. The proof is omitted here. Pk-anonymity is the property that no record in the database can be linked to the individual corresponding to that record with a probability of 1/k or more.

Therefore, the anonymity of the database disturbed in this way is guaranteed by the clear criterion of Pk-anonymity. Moreover, anonymity can be guaranteed without referring to either the database before disturbance or the database after disturbance.

The aggregation unit 21 performs aggregation processing using the database disturbed by the disturbance device 1 (step S3). The aggregation unit 21 estimates aggregation results such as cross tabulations using, for example, the iterative Bayesian technique described in Reference 1.

[Reference 1]
Dai Igarashi and two others, "Efficient privacy-preserving cross tabulation applicable to multi-valued attributes," Computer Security Symposium 2008

[Second embodiment]
The first embodiment is a database disturbance system for the case where all attribute values of the database are so-called numerical attribute values. In contrast, the second embodiment is a database disturbance system for the case where the attribute values of the database include so-called category attribute values. An example of the database to be disturbed in the second embodiment is shown in FIG. 8.

A category attribute value is an attribute value such as gender which, unlike a numerical attribute value, can take only a limited number of values.

The following description focuses on the differences from the first embodiment. Description of the parts that are the same as in the first embodiment is omitted.

The disturbance unit 12 of the second embodiment performs steps S10, S1, and S11 of FIG. 3 in place of step S1 of FIG. 2.

The disturbance unit 12 first determines, for each of some or all of the attribute values included in the database read from the database storage unit 11, whether that attribute value is a category attribute value (step S10).

When the attribute value is not a category attribute value, that is, when it is a numerical attribute value, the disturbance unit 12 adds a value according to the Laplace distribution by the same method as in the first embodiment (step S1).

When the attribute value is a category attribute value, the disturbance unit 12 replaces it with another category attribute value with a predetermined probability (step S11). Specifically, it performs so-called maintenance-replacement disturbance with maintenance probability ρ.

Maintenance-replacement disturbance with maintenance probability ρ is a disturbance method in which, given a predetermined maintenance probability ρ, the attribute value is kept unchanged with probability ρ and replaced with another category attribute value with probability 1−ρ. Replacing with another category attribute value means, for example, that when the attribute is gender and the attribute value is "male", the attribute value "male" is replaced with the attribute value "female". See Patent Document 1 for details of maintenance-replacement disturbance with maintenance probability ρ.
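Step S11 as described above can be sketched as follows. The text defers the details to Patent Document 1, so drawing the replacement uniformly from the other category values is an assumption made here for illustration, and the function name and sample categories are hypothetical.

```python
import random


def maintenance_replacement(value, categories, rho, rng=random):
    """Keep value with probability rho; otherwise replace it.

    Assumption: the replacement is drawn uniformly from the other
    category values. Patent Document 1 defines the exact scheme.
    """
    if rng.random() < rho:
        return value
    others = [c for c in categories if c != value]
    if not others:
        return value  # only one category exists, so nothing to swap in
    return rng.choice(others)


if __name__ == "__main__":
    rng = random.Random(3)
    genders = ["male", "female"]
    print([maintenance_replacement("male", genders, 0.8, rng)
           for _ in range(10)])
```

With ρ = 1 the value is always kept, and with ρ = 0 it is always replaced, which gives two easy boundary checks for any implementation of this step.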

When the number of attribute types is 2 or more, let u and v be elements of the subset V_a of the n-dimensional real vectors to which the attribute value of each attribute a belongs; σ and the maintenance probability ρ are assumed to be predetermined so as to satisfy the following formula (5). For example, the parameter determination unit 13 determines, based on a predetermined k, a σ and a maintenance probability ρ that satisfy the following formula (5) (step S0). |V_a| is the number of values that the category attribute value of attribute a can take.

As in the first embodiment, the database disturbed in this way satisfies so-called Pk-anonymity. The proof is omitted here.

Therefore, as in the first embodiment, the anonymity of the database disturbed in this way is guaranteed by the clear criterion of Pk-anonymity. Moreover, anonymity can be guaranteed without referring to either the database before disturbance or the database after disturbance.

[Modifications, etc.]
The processing of the rearrangement unit 14 may be omitted. In this case, the records of the database are not rearranged, and the database disturbed by the disturbance unit 12 is sent to the aggregation device 2. The aggregation device 2 performs aggregation processing based on the received database, whose records have not been rearranged.

As long as the disturbance unit 12 is provided in the disturbance device 1 and the aggregation unit 21 is provided in the aggregation device 2, the other units may be provided in any of the devices constituting the database disturbance system.

For example, as illustrated in FIG. 4, the parameter determination unit 13 may be provided in the aggregation device 2. In this case, the parameter determined by the parameter determination unit 13 is sent to the disturbance device 1.

Also, for example, as shown in FIG. 5, when the database disturbance system is composed of a disturbance device 1, an aggregation device 2, and a disturbance data server device 3, the parameter determination unit 13 may be provided in the disturbance data server device 3. In this case, the parameter determined by the parameter determination unit 13 is sent to the disturbance device 1, and the database disturbed by the disturbance device 1 is sent to the aggregation device 2 via the disturbance data server device 3. Specifically, the data transmission/reception unit 31 of the disturbance data server device 3 receives the database disturbed by the disturbance device 1 and sends it to the aggregation device 2.

Also, as illustrated in FIG. 6, the database disturbance system may include a plurality of disturbance devices 1 and a plurality of aggregation devices 2.
Data may be exchanged between the units of the database disturbance device directly or via a storage unit (not shown). Data may be exchanged between the devices of the database disturbance system directly or via another device.

In addition, the present invention is not limited to the embodiments described above. For example, the various processes described above need not be executed in time series in the order described; they may be executed in parallel or individually, depending on the processing capability of the device that executes them or as required.

When the above configuration is implemented by a computer, the processing performed by each unit of each device is described by a program. Executing this program on the computer implements each unit on the computer.

The program describing this processing can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example, a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory.

Needless to say, other modifications may be made as appropriate without departing from the spirit of the present invention.

Description of symbols
1 Disturbance device
11 Database storage unit
12 Disturbance unit
13 Parameter determination unit
14 Rearrangement unit
21 Aggregation unit
2 Aggregation device

Claims

1. A database disturbance device, wherein a database includes a plurality of records, each record includes a record identifier and at least one attribute value, ||·||₁ is the L1 distance of ·, and σ is a predetermined value,
the database disturbance device comprising a disturbance unit that adds, to each of some or all of the attribute values included in the database, a value according to a Laplace distribution with variance 2σ² defined by the following equation:

2. The database disturbance device according to claim 1,
further comprising a rearrangement unit that rearranges the order of the records included in the database disturbed by the disturbance unit.

3. The database disturbance device according to claim 1 or 2,
wherein, when an attribute value is a category attribute value, the disturbance unit replaces that attribute value with another category attribute value with a predetermined probability.

4. A database disturbance system comprising:
the database disturbance device according to any one of claims 1 to 3; and
an aggregation unit that performs aggregation processing using the database disturbed by the disturbance unit and the database whose records have been rearranged by the rearrangement unit.

5. A database disturbance method, wherein a database includes a plurality of records, each record includes a record identifier and at least one attribute value, ||·||₁ is the L1 distance of ·, and σ is a predetermined value,
the database disturbance method comprising a disturbance step in which a disturbance unit adds, to each of some or all of the attribute values included in the database, a value according to a Laplace distribution with variance 2σ² defined by the following equation:

6. A program for causing a computer to function as each unit of the database disturbance device according to any one of claims 1 to 3.