CROSS-REFERENCE TO RELATED APPLICATIONS
The present application is based on PCT filing PCT/JP2019/019093, filed May 14, 2019, which claims priority to JP 2018-100626, filed May 25, 2018, the entire contents of each of which are incorporated herein by reference.
TECHNICAL FIELD
The present invention relates to secure computation techniques, and in particular to techniques for calculating an aggregate function while keeping confidentiality.
BACKGROUND ART
An aggregate function is an operation that, for a table having a key attribute and a value attribute, obtains statistical values grouped by the value of the key attribute. An aggregate function is also called a group-by operation. A key attribute is an attribute that is used for grouping table records, such as official position or gender, for example. A value attribute is an attribute that is used for calculating statistical values, such as salary or body height, for example. A group-by operation can be an operation to obtain the average body height by gender when the key attribute is gender, for example. The key attribute may also be a composite key consisting of multiple attributes: for example, it may be an operation to obtain the average body height of males in their teens, the average body height of males in their twenties, and so on when the key attributes are gender and age. Non-Patent Literature 1 describes a method that performs a group-by operation by secure computation.
Group-by operations specifically include group-by count, group-by sum, group-by maximum/minimum, group-by median, rank in group, and the like. Group-by count refers to cross tabulation, being an operation to sum up the number of records in each group when a table is grouped based on the value of a key attribute. Group-by sum is the sum of a desired value attribute in each group. Group-by maximum/minimum is the maximum/minimum of a desired value attribute in each group. Group-by median is the median of a desired value attribute in each group. Rank in group is a function for obtaining the rank of the value of a value attribute in each record within the group.
PRIOR ART LITERATURE
Non-Patent Literature
- Non-Patent Literature 1: Dai Ikarashi, Koji Chida, Koki Hamada, and Katsumi Takahashi, “Secure Database Operations Using An Improved 3-party Verifiable Secure Function Evaluation”, The 2011 Symposium on Cryptography and Information Security, 2011.
SUMMARY OF THE INVENTION
Problems to be Solved by the Invention
Group-by operation sometimes determines intermediate data in the course of calculation. Some of such intermediate data are determined in common between different kinds of group-by operations. When multiple group-by operations are calculated simultaneously or in succession while keeping confidentiality, processing for determining common intermediate data can overlap, which leads to an increased computational complexity.
In view of the technical problem described above, an object of the present invention is to provide a technique that can efficiently determine intermediate data for use in group-by operations when calculating multiple group-by operations simultaneously or in succession while keeping confidentiality.
Means to Solve the Problems
To solve the above-described problem, a secure aggregate function computation system according to one aspect of the present invention includes a plurality of secure computation apparatuses, where F is an arbitrary ring; m is an integer greater than or equal to 2; nk is an integer greater than or equal to 1; and [k0], . . . , [knk−1] are shares obtained by secret sharing of key attributes k0, . . . , knk−1∈Fm. Each of the secure computation apparatuses includes: a group sort generation unit that generates, from a share {b} which becomes a bit string b:=b0, . . . , bm−1 obtained by bit decomposition and concatenation of the key attributes k0, . . . , knk−1 when reconstructed, a share {{σ0}} that becomes a permutation σ0 for performing a stable sort of the bit string b in ascending order when reconstructed, using the shares [k0], . . . , [knk−1]; a bit string sorting unit that generates a share {b′} that becomes a sorted bit string b′:=b′0, . . . , b′m−1 which is the bit string b as sorted by the permutation σ0 when reconstructed, using the share {b} and the share {{σ0}}; a flag generation unit that generates a share {e} that becomes a flag e:=e0, . . . , em−1 when reconstructed, using the share {b′}, by setting {ei}:={b′i≠b′i+1} for each integer i greater than or equal to 0 and smaller than or equal to m−2 and also setting {em−1}:={1}; and a key aggregate sort generation unit that generates a share {{σ}} that becomes a permutation σ for performing a stable sort of negation ¬e of the flag e in ascending order when reconstructed, using the share {e}.
Effects of the Invention
The secure aggregate function techniques of the present invention can efficiently determine intermediate data for use in group-by operations while keeping confidentiality. By using the intermediate data, the overall computational complexity can be reduced when multiple group-by operations are calculated simultaneously or in succession.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a functional configuration of a secure aggregate function computation system.
FIG. 2 illustrates a functional configuration of a secure computation apparatus.
FIG. 3 illustrates a processing procedure of a secure aggregate function computation method.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Embodiments of the present invention are described below in detail. In the drawings, components having the same function are given the same numbers and overlapping description is omitted.
[x]∈[F] indicates that a certain value x is concealed by secret sharing and the like on a certain ring F. {b}∈{B} indicates that a certain one bit value b is concealed by secret sharing and the like on a ring B capable of representing one bit. {{s}}∈{{Sm}} indicates that a certain permutation s belonging to a set Sm of permutations with m elements is concealed by secret sharing and the like. In the following, a secret-shared value is also called a “share”.
For sorting (including stable sort) in secure computation used in an embodiment, the sorting described in Reference Literature 1 can be used, for example. For a share {{s}} of permutation s, the hybrid permutation {{π}} described in Reference Literature 1 may be used.
- [Reference Literature 1] Dai Ikarashi, Koki Hamada, Ryo Kikuchi, and Koji Chida, “A Design and an Implementation of Super-high-speed Multi-party Sorting: The Day When Multi-party Computation Reaches Scripting Languages”, Computer Security Symposium 2017.
Embodiment
Referring to FIG. 1, an exemplary configuration of a secure aggregate function computation system 100 according to an embodiment is described. The secure aggregate function computation system 100 includes N (≥2) secure computation apparatuses 1 1, . . . , 1 N. In the present embodiment, the secure computation apparatuses 1 1, . . . , 1 N are respectively connected to a communication network 2. The communication network 2 is a circuit-switched or packet-switched communication network configured to allow mutual communications among the connected apparatuses, and, for example, the Internet, local area network (LAN), or wide area network (WAN) may be used. The apparatuses do not necessarily have to be capable of communicating online via the communication network 2. For example, they may be configured such that information to be entered into the secure computation apparatuses 1 1, . . . , 1 N is stored in a portable recording medium such as magnetic tape or a USB memory and the information is entered offline into the secure computation apparatuses 1 1, . . . , 1 N from the portable recording medium.
Referring to FIG. 2 , an exemplary configuration of a secure computation apparatus 1 n (n=1, . . . , N) included in the secure aggregate function computation system 100 is described. The secure computation apparatus 1 n includes, for example, an input unit 10, a bit decomposition unit 11, a group sort generation unit 12, a bit string sorting unit 13, a flag generation unit 14, a key aggregate sort generation unit 15, a de-duplication unit 16, a key sorting unit 17, a value sorting unit 18, and an output unit 19, as shown in FIG. 2 . By the secure computation apparatus 1 n (1≤n≤N) performing processing at the steps described below in cooperation with other secure computation apparatus 1 n′ (n′=1, . . . , N, where n≠n′), a secure aggregate function computation method according to an embodiment is implemented.
The secure computation apparatus 1 n is a special apparatus configured by loading of a special program into a well-known or dedicated computer having a central processing unit (CPU), main storage unit (random access memory: RAM), and the like, for example. The secure computation apparatus 1 n executes various kinds of processing under control of the central processing unit, for example. Data input to the secure computation apparatus 1 n and data resulting from processing are stored in the main storage unit, for example, and the data stored in the main storage unit is read into the central processing unit as necessary to be used for other processing. The processing units of the secure computation apparatus 1 n may at least partially consist of hardware such as an integrated circuit.
Referring to FIG. 3 , the processing procedure of the secure aggregate function computation method for execution by the secure aggregate function computation system 100 according to the embodiment is described.
At step S10, the input unit 10 of each secure computation apparatus 1 n receives, as input, shares [k0], . . . , [knk−1]∈[F]m which are obtained by concealing respective ones of nk key attributes k0, . . . , knk−1∈Fm by secret sharing and shares [v0], . . . , [vna−1]∈[F]m which are obtained by concealing respective ones of na value attributes v0, . . . , vna−1∈Fm by secret sharing. Here, nk and na are integers greater than or equal to 1, and m is an integer greater than or equal to 2. In the following, each element of [kj]∈[F]m (j=0, . . . , nk−1) can also be referred to by [kj,i]∈[F] (i=0, . . . , m−1). Likewise, each element of [vh]∈[F]m (h=0, . . . , na−1) can also be referred to by [vh,i]∈[F] (i=0, . . . , m−1). The input unit 10 outputs the shares [k0], . . . , [knk−1] of the key attributes k0, . . . , knk−1 to the bit decomposition unit 11 and the de-duplication unit 16. The input unit 10 also outputs the shares [v0], . . . , [vna−1] of the value attributes v0, . . . , vna−1 to the value sorting unit 18.
At step S11, the bit decomposition unit 11 of each secure computation apparatus 1 n applies bit decomposition to the shares [k0], . . . , [knk−1] of the key attributes k0, . . . , knk−1 and concatenates them, obtaining a share {b}∈{B}λ that becomes a bit string b:=b0, . . . , bm−1∈Bλ which is a concatenation of bit representations of the key attributes k0, . . . , knk−1 when reconstructed. Here, λ is the bit length of the bit string b, being the sum of the bit length of each bi (i=0, . . . , m−1). In other words, {bi} is a bit string obtained by concatenation of bit representations of the respective ith elements [k0,i], . . . , [knk−1,i] of the shares [k0], . . . , [knk−1] of the key attributes k0, . . . , knk−1. The bit decomposition unit 11 outputs the share {b} of the bit string b to the group sort generation unit 12.
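The concatenation performed at step S11 can be illustrated on reconstructed (plaintext) values. The following is only a minimal sketch and involves no secret sharing; the function name concat_key_bits and the per-key bit widths passed in widths are assumptions introduced for illustration and do not appear in the embodiment.

```python
def concat_key_bits(keys, widths):
    """Plaintext illustration of step S11.

    keys   : list of nk key-attribute columns, each a list of m non-negative ints
    widths : hypothetical bit width assigned to each key attribute
    Returns a list b of m ints, where b[i] holds the concatenated bit
    representations of the ith elements of all key attributes.
    """
    m = len(keys[0])
    b = []
    for i in range(m):
        acc = 0
        for column, width in zip(keys, widths):
            if not 0 <= column[i] < (1 << width):
                raise ValueError("key value does not fit in the given bit width")
            # Shift the bits gathered so far and append this key's bits.
            acc = (acc << width) | column[i]
        b.append(acc)
    return b


# Two key attributes (for example gender in 1 bit and age class in 2 bits).
b = concat_key_bits([[1, 0, 1], [2, 3, 2]], [1, 2])
```

Records whose key attributes are all equal receive the same concatenated value, which is what allows the subsequent stable sort to place them next to each other.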
At step S12, the group sort generation unit 12 of each secure computation apparatus 1 n generates a share {{σ0}}∈{{Sm}} that becomes a permutation σ0 for performing a stable sort of the bit string b in ascending order when reconstructed, using the share {b} of the bit string b. A stable sort refers to a kind of sorting operation that preserves an order among elements having the same value in a case where elements of the same value exist. For example, when a stable sort by gender is performed on a table that has been sorted by employee number, it yields a sorting result in which the order of employee numbers is maintained in each gender. As the bit string b is a concatenation of bit representations of the key attributes k0, . . . , knk−1, the permutation σ0 can also be considered as an operation to rearrange records so that records for which the values of the key attributes k0, . . . , knk−1 are equal will be consecutive and group them. The group sort generation unit 12 outputs the share {b} of the bit string b and the share {{σ0}} of the permutation σ0 to the bit string sorting unit 13. The group sort generation unit 12 also outputs the share {{σ0}} of the permutation σ0 to the key sorting unit 17 and the value sorting unit 18.
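What the permutation σ0 does can be sketched on plaintext values as follows. This is an illustration only: in the embodiment the permutation is generated and applied on shares, whereas the sketch below works on reconstructed values. The helper names stable_sort_permutation and apply_permutation are assumptions, and the permutation is represented as a list of source indices.

```python
def stable_sort_permutation(b):
    """Return the stable ascending sort of b as a list of source indices.

    Python's sorted() is guaranteed to be stable, so records with equal
    keys keep their original relative order.
    """
    return sorted(range(len(b)), key=lambda i: b[i])


def apply_permutation(perm, x):
    """Rearrange x so that position i receives x[perm[i]]."""
    return [x[p] for p in perm]


b = [6, 3, 6, 3]                       # concatenated key bits per record
sigma0 = stable_sort_permutation(b)    # [1, 3, 0, 2]
b_sorted = apply_permutation(sigma0, b)  # [3, 3, 6, 6]
```

The same permutation can be applied to any value attribute column so that the records of each group become consecutive.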
At step S13, the bit string sorting unit 13 of each secure computation apparatus 1 n obtains a share {b′}∈{B}λ that becomes a sorted bit string b′:=b′0, . . . , b′m−1∈Bλ which is the bit string b as sorted by the permutation σ0 when reconstructed, using the share {b} of the bit string b and the share {{σ0}} of the permutation σ0. The bit string sorting unit 13 outputs the share {b′} of the sorted bit string b′ to the flag generation unit 14.
At step S14, the flag generation unit 14 of each secure computation apparatus 1 n generates a share {e}∈{B}m that becomes a flag e:=e0, . . . , em−1∈Bm when reconstructed, using the share {b′} of the sorted bit string b′ by setting {ei}:={b′i≠b′i+1} for each integer i greater than or equal to 0 and smaller than or equal to m−2 and setting {em−1}:={1}. Since the flag ei is set to true when the ith element b′i of the sorted bit string b′ is different from the i+1th element b′i+1, it serves as a flag that indicates the last element of each group (that is, the element immediately before the boundary between groups). The flag generation unit 14 outputs the share {e} of the flag e to the key aggregate sort generation unit 15, the de-duplication unit 16, and the output unit 19.
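The flag generation of step S14 can be illustrated on plaintext values as below. The sketch is not a secure computation; it only shows which positions of the sorted bit string receive a true flag. The function name last_of_group_flags is an assumption.

```python
def last_of_group_flags(b_sorted):
    """Plaintext illustration of step S14.

    e[i] is True when b_sorted[i] differs from b_sorted[i + 1], and the
    final element e[m - 1] is always True.
    """
    m = len(b_sorted)
    e = [b_sorted[i] != b_sorted[i + 1] for i in range(m - 1)]
    e.append(True)
    return e


# Three groups of sizes 2, 2 and 1.
e = last_of_group_flags([3, 3, 6, 6, 7])
# e == [False, True, False, True, True]
```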
At step S15, the key aggregate sort generation unit 15 of each secure computation apparatus 1 n first generates a share {e′}∈{B}m that becomes a flag e′ which is negation ¬e of the flag e when reconstructed, using the share {e} of the flag e. That is, {e′i}:={¬ei} is set for each integer i greater than or equal to 0 and smaller than or equal to m−1. Then, the key aggregate sort generation unit 15 generates a share {{σ}}∈{{Sm}} that becomes a permutation σ for performing a stable sort of the flag e′ in ascending order when reconstructed, using the share {e′} of the flag e′. The key aggregate sort generation unit 15 outputs the share {{σ}} of the permutation σ to the key sorting unit 17 and the output unit 19.
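The effect of the permutation σ can be sketched on plaintext values as follows. Because false sorts before true, a stable ascending sort of ¬e moves the last element of every group to the front while keeping the groups in order. The function name key_aggregate_permutation and the representation of the permutation as a list of source indices are assumptions for illustration.

```python
def key_aggregate_permutation(e):
    """Plaintext illustration of step S15.

    Returns the stable ascending sort of the negated flags as a list of
    source indices. Positions with e[i] == True come first.
    """
    not_e = [not flag for flag in e]
    return sorted(range(len(e)), key=lambda i: not_e[i])


e = [False, True, False, True, True]
sigma = key_aggregate_permutation(e)
# sigma == [1, 3, 4, 0, 2]: the last elements of the three groups
# (positions 1, 3, 4) are moved to the front.
```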
At step S16, the de-duplication unit 16 of each secure computation apparatus 1 n generates shares [k″0], . . . , [k″nk−1] that become de-duplicated key attributes k″0, . . . , k″nk−1 when reconstructed, using the share {e} of the flag e and the shares [k0], . . . , [knk−1] of the key attributes k0, . . . , knk−1 by setting [k″j,i]:=[ei?kj,i:null]. Here, “?” is a conditional operator (or a ternary operator). That is, [k″j,i]:=[kj,i] is set when {ei} is true (for example, {ei}={1}), and [k″j,i]:=null is set when {ei} is false (for example, {ei}={0}). The value that is set when {ei}={0} does not have to be null, but may be any value that cannot be assumed by the key attributes k0, . . . , knk−1. Since the flag e is a flag which is set to true only in the last element of each group, the de-duplicated key attributes k″0, . . . , k″nk−1 will be a vector in which only elements corresponding to the last element of the respective groups are set to values of the key attribute and the other elements are set to a predetermined value that cannot be assumed by the key attributes. The de-duplication unit 16 outputs the shares [k″0], . . . , [k″nk−1] of the de-duplicated key attributes k″0, . . . , k″nk−1 to the key sorting unit 17.
At step S17, the key sorting unit 17 of each secure computation apparatus 1 n generates shares [k′0], . . . , [k′nk−1] that become sorted key attributes k′0, . . . , k′nk−1 which are the de-duplicated key attributes k″0, . . . , k″nk−1 as sorted by the permutation σ0 and the permutation σ in sequence when reconstructed, using the shares [k″0], . . . , [k″nk−1] of the de-duplicated key attributes k″0, . . . , k″nk−1, the share {{σ0}} of the permutation σ0, and the share {{σ}} of the permutation σ. The key sorting unit 17 outputs the shares [k′0], . . . , [k′nk−1] of the sorted key attributes k′0, . . . , k′nk−1 to the output unit 19.
At step S18, the value sorting unit 18 of each secure computation apparatus 1 n generates shares [v′0], . . . , [v′na−1] that become sorted value attributes v′0, . . . , v′na−1 which are the value attributes v0, . . . , vna−1 as sorted by the permutation σ0 when reconstructed, using the shares [v0], . . . , [vna−1] of the value attributes v0, . . . , vna−1 and the share {{σ0}} of the permutation σ0. The value sorting unit 18 outputs the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1 to the output unit 19.
At step S19, the output unit 19 of each secure computation apparatus 1 n outputs at least one of the shares [k′0], . . . , [k′nk−1] of the sorted key attributes k′0, . . . , k′nk−1, the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1, the share {e} of the flag e, and the share {{σ}} of the permutation σ. Information to be output by the output unit 19 is selected so as to satisfy intermediate data which is required for one or more group-by operations that are subsequently calculated.
In the following, specific procedures to calculate various kinds of aggregate functions using the intermediate data output by the secure aggregate function computation system 100 are described.
<<Group-by Count>>
Group-by count is an operation to sum up the number of records in each group when a table is grouped based on the value of a key attribute. Group-by count can be determined as shown below using the share {e} of the flag e and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, g is the maximum number of groups, representing the number of combinations of possible values of key attributes, that is, the number of kinds of possible values of the key attributes.
Firstly, the share {e}∈{B}m of the flag e is converted to a share [e]∈[F]m by secret sharing on an arbitrary ring F.
Secondly, using the share [e] of the flag e, [xi]:=[ei?i+1:m] is set for each integer i greater than or equal to 0 and smaller than or equal to m−1, and a share [x]∈[F]m that becomes a vector x:=x0, . . . , xm−1∈F when reconstructed is generated. The vector x is such that, where records having the same value of a key attribute are placed in the same group when a stable sort is performed on a table by the key attribute, the position of the next element from beginning is set in the last element of each group, and the number of records contained in the entire table is set in the other elements. In other words, in the last element of each group, the total number of records summed up from the first group through that group will be set.
Thirdly, using the share [x] of the vector x and the share {{σ}} of the permutation σ, a share [σ(x)]∈[F]m that becomes a sorted vector σ(x) which is the vector x as sorted by the permutation σ when reconstructed is generated. In the following, each element of [σ(x)]∈[F]m can also be referred to by [σ(x)i]∈[F] (i=0, . . . , m−1).
Finally, using the share [σ(x)] of the sorted vector σ(x), [ci]:=[σ(x)i−σ(x)i−1] is set for each integer i greater than or equal to 1 and smaller than or equal to min(g,m)−1, and [c0]:=[σ(x)0] is also set, and a share [c]∈[F]min(g,m) that becomes a vector c:=c0, . . . , cmin(g,m)−1∈F representing the number of records in each group when reconstructed is generated. Since the total number of records summed up from the 0th through ith groups is set in the ith element σ(x)i of the sorted vector σ(x), the number of records in the ith group will be set in the ith element ci of the vector c. Because the key attributes are concealed, min(g,m) is the maximum that can be assumed by the number of groups, and the actual number of groups will be a value that is equal to or smaller than min(g,m) and that cannot be known to each secure computation apparatus 1 n (hereinbelow, the actual number of groups is denoted as g′). Thus, for those of min(g,m) shares [ci] that exceed the actual number of groups (that is, i≥g′), it is necessary to set an invalid value that becomes distinguishable from a valid value after reconstruction. In the present embodiment, [xi]=m is set for those shares [xi] with [ei] being false or the last share [xi] among those with [ei] being true. Thus, σ(x)i−σ(x)i−1=m−m=0 is set for cg′, . . . , cmin(g,m)−1. Since the count of groups in which records exist is one or greater, 0 is applicable as an invalid value that becomes distinguishable from a valid value.
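The group-by count procedure can be traced on plaintext values as follows. This is a minimal sketch that follows the four steps above on reconstructed data and performs no secret sharing; the function name group_by_count and the representation of σ as a list of source indices are assumptions.

```python
def group_by_count(e, sigma, g):
    """Plaintext illustration of group-by count.

    e     : last-of-group flags for the table sorted by key
    sigma : stable ascending sort of the negated flags (source indices)
    g     : maximum number of groups
    """
    m = len(e)
    # x[i] = i + 1 at the last element of each group, m elsewhere.
    x = [i + 1 if e[i] else m for i in range(m)]
    # Move the last-of-group entries to the front.
    sx = [x[p] for p in sigma]
    n = min(g, m)
    # Differences of consecutive cumulative counts give per-group counts.
    return [sx[0]] + [sx[i] - sx[i - 1] for i in range(1, n)]


e = [False, True, False, True, True]   # groups of sizes 2, 2 and 1
sigma = [1, 3, 4, 0, 2]
c = group_by_count(e, sigma, g=4)
# c == [2, 2, 1, 0]; the trailing 0 marks a group that does not exist.
```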
<<Group-by Sum>>
Group-by sum is an operation to calculate the sum of a desired value attribute per group when a table is grouped based on the value of a key attribute. Use of group-by sum also allows calculation of group-by multiply accumulate, which determines the sum of multiplications per group, or group-by sum of squares, which determines the sum of squares per group. For group-by multiply accumulate, group-by sum may be determined on a result of applying multiplication to the value attribute in each record. For group-by sum of squares, group-by sum may be similarly determined on a result of applying squaring to the value attribute in each record. Group-by sum can be determined as shown below using the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1, the share {e} of the flag e, and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, v is a desired value attribute for which group-by sum is to be determined among the sorted value attributes v′0, . . . , v′na−1.
Firstly, [v′]:=prefix-sum([v]) is calculated using the share [v] of the value attribute v, and a share [v′]∈[F]m that becomes a vector v′:=v′0, . . . , v′m−1∈F when reconstructed is generated. The prefix-sum is an operation to set the sum of the values of the 0th element v0 through the ith element vi of the input vector v into the ith element v′i of the output vector v′, for each integer i greater than or equal to 0 and smaller than or equal to m−1, where m is the length of the input vector v.
Secondly, the share {e}∈{B}m of the flag e is converted to the share [e]∈[F]m by secret sharing on an arbitrary ring F.
Thirdly, using the share [v′] of the vector v′ and the share [e] of the flag e, [ti]:=[ei?v′i:v′m−1] is set for each integer i greater than or equal to 0 and smaller than or equal to m−1, and a share [t]∈[F]m that becomes a vector t:=t0, . . . , tm−1∈F when reconstructed is generated. The vector t is such that, where records having the same value of a key attribute are placed in the same group when a stable sort is performed on a table by the key attribute, the sum of the values of the value attribute preceding the last element of each group is set in the last element, and the sum of the values of the value attribute in the entire table is set in the other elements.
Fourthly, using the share [t] of the vector t and the share {{σ}} of the permutation σ, a share [σ(t)]∈[F]m that becomes a sorted vector σ(t) which is the vector t as sorted by the permutation σ when reconstructed is generated. In the following, each element of [σ(t)]∈[F]m can also be referred to by [σ(t)i]∈[F] (i=0, . . . , m−1).
Finally, using the share [σ(t)] of the sorted vector σ(t), [si]:=[σ(t)i−σ(t)i−1] is set for each integer i greater than or equal to 1 and smaller than or equal to min(g,m)−1 and [s0]:=[σ(t)0] is also set, and a share [s]∈[F]min(g,m) that becomes the sum of value attribute v per group, s:=s0, . . . , smin(g,m)−1∈F, when reconstructed is generated. Since the ith element σ(t)i of the sorted vector σ(t) has been set to the sum of the values of the value attribute v belonging to the 0th through the ith groups, the sum of the values of the value attribute v belonging to the ith group will be set in the ith element si of the vector s.
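The group-by sum procedure can be traced on plaintext values as follows. The sketch mirrors the steps above (prefix sum, selection by flag, sorting by σ, and taking differences) on reconstructed data only; the function name group_by_sum and the list-of-source-indices representation of σ are assumptions.

```python
from itertools import accumulate


def group_by_sum(v_sorted, e, sigma, g):
    """Plaintext illustration of group-by sum.

    v_sorted : value attribute already sorted by the group sort
    e        : last-of-group flags
    sigma    : stable ascending sort of the negated flags (source indices)
    g        : maximum number of groups
    """
    m = len(v_sorted)
    prefix = list(accumulate(v_sorted))           # prefix-sum
    # Running total at the last element of each group, grand total elsewhere.
    t = [prefix[i] if e[i] else prefix[m - 1] for i in range(m)]
    st = [t[p] for p in sigma]
    n = min(g, m)
    return [st[0]] + [st[i] - st[i - 1] for i in range(1, n)]


v_sorted = [10, 20, 30, 40, 50]
e = [False, True, False, True, True]   # groups of sizes 2, 2 and 1
sigma = [1, 3, 4, 0, 2]
s = group_by_sum(v_sorted, e, sigma, g=4)
# s == [30, 70, 50, 0]
```

As with group-by count, entries beyond the actual number of groups come out as the difference of two grand totals, that is, 0.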
<<Group-by Maximum>>
Group-by maximum is an operation to obtain the maximum of a desired value attribute per group when a table is grouped based on the value of a key attribute. Group-by maximum can be determined as shown below using the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1, the share {e} of the flag e, and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, v is a desired value attribute for which group-by maximum is to be determined among the sorted value attributes v′0, . . . , v′na−1.
Firstly, the share {e}∈{B}m of the flag e is converted to the share [e]∈[F]m by secret sharing on an arbitrary ring F.
Secondly, using the share [v] of the value attribute v and the share [e] of the flag e, [fi]:=[ei?vi:0] is set for each integer i greater than or equal to 0 and smaller than or equal to m−1, and a share [f]∈[F]m that becomes a vector f:=f0, . . . , fm−1∈F when reconstructed is generated. The vector f is such that, where records having the same value of a key attribute are placed in the same group when a stable sort is performed on a table by the key attribute, the value vi of the value attribute corresponding to the last element fi of each group is set in the last element fi, and 0 is set in the other elements. That is, it is a vector having the maximums of the respective groups and 0s as its elements.
Thirdly, using the share [f] of the vector f and the share {{σ}} of the permutation σ, a share [σ(f)]∈[F]m that becomes a sorted vector σ(f) which is the vector f as sorted by the permutation σ when reconstructed is generated. In the following, each element of [σ(f)]∈[F]m can also be referred to by [σ(f)i]∈[F] (i=0, . . . , m−1). The sorted vector σ(f) will be a vector in which the value of the last element when sorted by group (that is, the maximum in each group) is set in elements as many as the number of groups from beginning and 0 is set in the subsequent elements.
Finally, from the share [σ(f)] of the sorted vector σ(f), the share [x]∈[F]min(g,m) that becomes the vector x:=σ(f)0, . . . , σ(f)min(g,m)−1 representing the maximum in each group when reconstructed is generated.
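The group-by maximum procedure can be traced on plaintext values as follows. The sketch assumes, as the description of the vector f implies, that within each group the values are arranged in ascending order so that the last element of a group holds its maximum; this ordering is an assumption of the sketch and is not produced by the code. The function name group_by_max is likewise an assumption.

```python
def group_by_max(v_sorted, e, sigma, g):
    """Plaintext illustration of group-by maximum.

    v_sorted : value attribute grouped by key and, by assumption,
               ascending within each group
    e        : last-of-group flags
    sigma    : stable ascending sort of the negated flags (source indices)
    g        : maximum number of groups
    """
    m = len(v_sorted)
    # Keep the value at the last element of each group, 0 elsewhere.
    f = [v_sorted[i] if e[i] else 0 for i in range(m)]
    # Move the kept values to the front and truncate to min(g, m) entries.
    sf = [f[p] for p in sigma]
    return sf[:min(g, m)]


v_sorted = [10, 20, 30, 40, 50]
e = [False, True, False, True, True]   # groups of sizes 2, 2 and 1
sigma = [1, 3, 4, 0, 2]
x = group_by_max(v_sorted, e, sigma, g=4)
# x == [20, 40, 50, 0]
```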
<<Group-by Minimum>>
Group-by minimum is an operation to obtain the minimum of a desired value attribute per group when a table is grouped based on the value of a key attribute. Group-by minimum can be determined as shown below using the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1, the share {e} of the flag e, and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, v is a desired value attribute for which group-by minimum is to be determined among the sorted value attributes v′0, . . . , v′na−1.
Firstly, using the share {e} of the flag e, {e′i}:={ei−1} is set for each integer i greater than or equal to 1 and smaller than or equal to m−1 and {e′0}:={1} is also set, and a share {e′}∈{B}m that becomes a flag e′:=e′0, . . . , e′m−1∈Bm when reconstructed is generated. Since the flag e′ is a flag equivalent to the flag e indicating the last element of each group as shifted backward by one, it serves as a flag that indicates the first element of each group (that is, the element immediately after the boundary between groups).
Secondly, the share {e′}∈{B}m of the flag e′ is converted to a share [e′]∈[F]m by secret sharing on an arbitrary ring F.
Thirdly, using the share [v] of the value attribute v and the share [e′] of the flag e′, [f′i]:=[e′i?vi:0] is set for each integer i greater than or equal to 0 and smaller than or equal to m−1, and a share [f′]∈[F]m that becomes a vector f′:=f′0, . . . , f′m−1∈F when reconstructed is generated. The vector f′ is such that, where records having the same value of a key attribute are placed in the same group when a stable sort is performed on a table by the key attribute, the value vi of the value attribute corresponding to the first element f′i of each group is set in the element f′i, and 0 is set in the other elements. That is, it is a vector having the minimums of the respective groups and 0s as its elements.
Fourthly, using the share [f′] of the vector f′ and the share {{σ}} of the permutation σ, a share [σ(f′)]∈[F]m that becomes a sorted vector σ(f′) which is the vector f′ as sorted by the permutation σ when reconstructed is generated. In the following, each element of [σ(f′)]∈[F]m can also be referred to by [σ(f′)i]∈[F] (i=0, . . . , m−1). The sorted vector σ(f′) will be a vector in which the value of the first element when sorted by group (that is, the minimum in each group) is set in elements as many as the number of groups from beginning and 0 is set in the subsequent elements.
Finally, from the share [σ(f′)] of the sorted vector σ(f′), a share [x′]∈[F]min(g,m) that becomes a vector x′:=σ(f′)0, . . . , σ(f′)min(g,m)−1 representing the minimum in each group when reconstructed is generated.
<<Ascending-Order Rank in Group>>
Ascending-order rank in group is an operation to obtain the rank of a value of a desired value attribute within a group when it is sorted in ascending order in the case of grouping a table based on the value of a key attribute. Ascending-order rank in group can be determined as shown below using the share {e} of the flag e and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, c is a result of group-by count (hereinafter called “cross tabulation”). The cross tabulation c can be determined with the share {e} of the flag e and the share {{σ}} of the permutation σ in accordance with the <<group-by count>> procedure described above, for example.
Firstly, using the share [c] of the cross tabulation c and the share {{σ}} of the permutation σ, a share [u]:=[σ−1(c)]∈[F]m that becomes an inverse-permuted cross tabulation u:=σ−1(c) which is obtained by inverse application of the permutation σ to the cross tabulation c when reconstructed is generated. The cross tabulation c is a vector in which the number of records in each group is set in elements as many as the number of groups from beginning, and the permutation σ is a permutation that arranges the last element of each group from beginning. Thus, the inverse-permuted cross tabulation u, obtained by inverse application of the permutation σ to the cross tabulation c, will be a vector in which the number of records in each group is set in the last element of that group. In the following, each element of [u]∈[F]m can also be referred to by [ui]∈[F] (i=0, . . . , m−1).
Secondly, [s]:=prefix-sum([u]) is calculated using the share [u] of the inverse-permuted cross tabulation u, and a share [s]∈[F]m that becomes a vector s:=s0, . . . , sm−1∈F when reconstructed is generated. The prefix-sum is an operation to set the sum of the values of the 0th element u0 through the ith element ui of the input vector u into the ith element si of the output vector s for each integer i greater than or equal to 0 and smaller than or equal to m−1, where m is the length of the input vector u.
Finally, using the share [s] of the vector s, [ai]:=[i−si−1] is set for each integer i greater than or equal to 1 and smaller than or equal to m−1 and [a0]:=[0] is also set, and a share [a]∈[F]m that becomes the ascending-order rank in group, a:=a0, . . . , am−1∈F, when reconstructed is generated. Note that the ascending-order rank in group starts at 0. For obtaining ranks starting at 1, each rank may be incremented. That is, [ai]:=[i−si−1+1] may be set for each integer i greater than or equal to 1 and smaller than or equal to m−1 and [a0]:=[1] may be also set, and then the ascending-order rank a may be generated.
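The ascending-order rank procedure can be traced on plaintext values as follows. The sketch works on reconstructed data only. It assumes that the cross tabulation c, which has min(g,m) elements, is padded with zeros to length m before the inverse application of σ; this padding, the function name ascending_rank_in_group, and the list-of-source-indices representation of σ are assumptions for illustration.

```python
from itertools import accumulate


def ascending_rank_in_group(c, sigma):
    """Plaintext illustration of ascending-order rank in group (0-based).

    c     : per-group record counts (cross tabulation)
    sigma : stable ascending sort of the negated flags (source indices)
    """
    m = len(sigma)
    c_padded = list(c) + [0] * (m - len(c))   # assumed zero padding
    # Inverse application of sigma: the count of each group is written
    # back to the position of that group's last element.
    u = [0] * m
    for i, p in enumerate(sigma):
        u[p] = c_padded[i]
    s = list(accumulate(u))                   # prefix-sum
    return [0] + [i - s[i - 1] for i in range(1, m)]


c = [2, 2, 1, 0]            # groups of sizes 2, 2 and 1
sigma = [1, 3, 4, 0, 2]
a = ascending_rank_in_group(c, sigma)
# a == [0, 1, 0, 1, 0]
```

Here s[i − 1] equals the number of records in all groups that end before position i, so subtracting it from i yields the offset of record i within its own group.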
<<Descending-Order Rank in Group>>
Descending-order rank in group is an operation to obtain the rank of a value of a desired value attribute within a group when it is sorted in descending order in the case of grouping a table based on the value of the key attribute. Descending-order rank in group can be determined as shown below using the share {e} of the flag e and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, c is a result of group-by count (hereinafter called “cross tabulation”). The cross tabulation c can be determined with the share {e} of the flag e and the share {{σ}} of the permutation σ in accordance with the <<group-by count>> procedure described above, for example.
Firstly, using the share [c] of the cross tabulation c, [c′_i] := [c_{i+1}] is set for each integer i greater than or equal to 0 and smaller than or equal to m−2 and [c′_{m−1}] := [0] is also set, and a share [c′] ∈ [F]^m that becomes a shifted cross tabulation c′ := c′_0, . . . , c′_{m−1} ∈ F^m when reconstructed is generated. The shifted cross tabulation c′ is the cross tabulation c, which is a vector representing the number of records in each group, shifted forward by one.
Secondly, using the share [c′] of the shifted cross tabulation c′ and the share {{σ}} of the permutation σ, a share [u′] := [σ^{−1}(c′)] ∈ [F]^m is generated that, when reconstructed, becomes the inverse-permuted cross tabulation u′ := σ^{−1}(c′), which is obtained by inverse application of the permutation σ to the shifted cross tabulation c′. The shifted cross tabulation c′ is a vector obtained by shifting forward by one the cross tabulation c, in which the number of records in each group is set, starting from the beginning, in as many elements as there are groups, and the permutation σ is a permutation that arranges the last element of each group in order from the beginning. Thus, the inverse-permuted cross tabulation u′ will be a vector in which the number of records of the immediately subsequent group is set in the last element of each group. In the following, each element of [u′] ∈ [F]^m can also be referred to by [u′_i] ∈ [F] (i = 0, . . . , m−1).
Thirdly, [s′] := postfix-sum([u′]) is calculated using the share [u′] of the inverse-permuted cross tabulation u′, and a share [s′] ∈ [F]^m that becomes a vector s′ := s′_0, . . . , s′_{m−1} ∈ F when reconstructed is generated. The postfix-sum is an operation that, for each integer i greater than or equal to 0 and smaller than or equal to m−1, sets the sum of the values of the ith element u′_i through the (m−1)th element u′_{m−1} of the input vector u′ into the ith element s′_i of the output vector s′, where m is the length of the input vector u′.
Finally, using the share [s′] of the vector s′, [d_i] := [m − i − s′_i − 1] is set for each integer i greater than or equal to 0 and smaller than or equal to m−1, and a share [d] ∈ [F]^m that becomes the descending-order rank in group, d := d_0, . . . , d_{m−1} ∈ F, when reconstructed is generated. Note that the descending-order rank in group starts at 0. For obtaining ranks starting at 1, each rank may be incremented. That is, [d_i] := [m − i − s′_i] may be set for each integer i greater than or equal to 0 and smaller than or equal to m−1, and then the descending-order rank d may be generated.
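The four steps above can likewise be traced on reconstructed (cleartext) values. This is only an illustrative sketch without any secret sharing. The function name `descending_rank` is hypothetical, and the sketch again assumes that the flag e is 1 exactly at the last record of each group and that σ behaves as a stable sort moving those positions to the front.

```python
def descending_rank(e, c, start=0):
    """Descending-order rank in group; ranks begin at `start` (0 or 1)."""
    m = len(c)
    # Order produced by sigma: last records of groups first, then the rest (stable).
    order = [i for i in range(m) if e[i]] + [i for i in range(m) if not e[i]]

    # Step 1: shift the cross tabulation forward by one, padding with 0.
    c_shift = c[1:] + [0]

    # Step 2: inverse application of sigma puts the size of the *next* group
    # at the last record of each group.
    u = [0] * m
    for j, pos in enumerate(order):
        u[pos] = c_shift[j]

    # Step 3: postfix-sum, s[i] = u[i] + ... + u[m-1].
    s = [0] * m
    running = 0
    for i in range(m - 1, -1, -1):
        running += u[i]
        s[i] = running

    # Step 4: d[i] = m - i - s[i] - 1 (plus 1 when ranks start at 1).
    return [m - i - s[i] - 1 + start for i in range(m)]


# Two groups of sizes 3 and 2 (m = 5).
e = [0, 0, 1, 0, 1]        # assumed: 1 marks the last record of a group
c = [3, 2, 0, 0, 0]        # cross tabulation
print(descending_rank(e, c))            # [2, 1, 0, 1, 0]
print(descending_rank(e, c, start=1))   # [3, 2, 1, 2, 1]
```

Here the shifted cross tabulation is [2, 0, 0, 0, 0], its inverse permutation is [0, 0, 2, 0, 0], and the postfix-sum is [2, 2, 2, 0, 0], so each record sees the number of records in all later groups and subtracts it from its distance to the end of the table.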
<<Group-by Median>>
Group-by median is an operation to obtain a median of a desired value attribute per group when a table is grouped based on the value of a key attribute. Group-by median can be determined as shown below using the shares [v′0], . . . , [v′na−1] of the value attributes v′0, . . . , v′na−1 and the share {e} of the flag e, and the share {{σ}} of the permutation σ which are output by the secure aggregate function computation system 100. Here, c is a result of group-by count (hereinafter called “cross tabulation”). The cross tabulation c can be determined with the share {e} of the flag e and the share {{σ}} of the permutation σ in accordance with the <<group-by count>> procedure described above, for example. v is a desired value attribute for which group-by median is to be determined among the sorted value attributes v′0, . . . , v′na−1.
Firstly, using the share [c] of the cross tabulation c and the share {{σ}} of the permutation σ, the share [a] ∈ [F]^m that becomes the vector a := a_0, . . . , a_{m−1} ∈ F representing the ascending-order rank in group when reconstructed and the share [d] ∈ [F]^m that becomes the vector d := d_0, . . . , d_{m−1} ∈ F representing the descending-order rank in group when reconstructed are generated. Here, the ascending-order rank and the descending-order rank are assumed to start at 1. The ascending-order rank in group can be determined in accordance with the <<ascending-order rank in group>> procedure described above, for example. The descending-order rank in group can be determined in accordance with the <<descending-order rank in group>> procedure described above, for example.
Secondly, using the share [a] of the ascending-order rank a and the share [d] of the descending-order rank d, [2^λ + a − d] and [2^λ + d − a] are calculated for λ satisfying 2^λ > m. Then, [2^λ + a − d] and [2^λ + d − a] are subjected to bit decomposition into λ bits, and a share {a−d} ∈ {B^λ}^m that becomes a bit string a−d when reconstructed and a share {d−a} ∈ {B^λ}^m that becomes a bit string d−a when reconstructed are generated.
Thirdly, the least significant bit is removed from the share {a−d} of a−d and the share {d−a} of d−a, and shares {a′}, {d′} ∈ {B^{λ−1}}^m that become a′, d′ when reconstructed are generated. a′ is a bit string obtained by removing the least significant bit of a−d, and d′ is a bit string obtained by removing the least significant bit of d−a.
Fourthly, {a″} := {|a′ = 0|}, {d″} := {|d′ = 0|} are calculated using the share {a′} of a′ and the share {d′} of d′, and shares {a″}, {d″} ∈ {B}^m that become flags a″, d″ ∈ B^m when reconstructed are generated. Here, |⋅| is a symbol that returns the truth value of the equation “⋅”. The flags a″, d″ respectively indicate whether a−d, d−a are greater than or equal to 0 and smaller than or equal to 1. Further, a″ indicates whether the record represents the greater median and d″ indicates whether the record represents the smaller median.
Fifthly, the shares {a″}, {d″} ∈ {B}^m of the flags a″, d″ are converted to shares [a″], [d″] ∈ [F]^m by secret sharing on an arbitrary ring F.
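Sixthly, [v_a] := [v a″], [v_d] := [v d″] are calculated using the share [v] of the value attribute v and the shares [a″], [d″] of the flags a″, d″, and shares [v_a], [v_d] ∈ [F]^m that become vectors v_a, v_d ∈ F^m when reconstructed are generated. That is, v_a and v_d are the element-wise products of the value attribute v with the flags a″ and d″, respectively.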
Seventhly, using the shares {a″}, {d″} of the flags a″, d″, shares {¬a″}, {¬d″} ∈ {B}^m that become negations ¬a″, ¬d″ of the flags a″, d″ when reconstructed are generated. Then, using the shares {¬a″}, {¬d″} of the negations ¬a″, ¬d″ of the flags a″, d″, shares {{σ_a}}, {{σ_d}} ∈ {{S_m}} that become permutations σ_a, σ_d for sorting the negations ¬a″, ¬d″ of the flags a″, d″ when reconstructed are generated.
Finally, [x] := [σ_a(v_a) + σ_d(v_d)] is calculated using the shares [v_a], [v_d] of the vectors v_a, v_d and the shares {{σ_a}}, {{σ_d}} of the permutations σ_a, σ_d, and the share [x] ∈ [F]^m that becomes the vector x representing the median of each group when reconstructed is generated.
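The median steps above can be traced on reconstructed (cleartext) values as follows. This is only an illustrative sketch without secret sharing. The function name `group_median_sums` is hypothetical, the ranks are taken as inputs starting at 1, and the sort by the negated flags is modeled as a stable sort. Under a literal reading of the final formula, each leading element of the result is the sum of the smaller and the greater median of a group, so in a group with an odd number of records it equals twice the middle value. Halving it to obtain the usual median is an assumption of this sketch and is not stated in the text.

```python
def group_median_sums(v, a, d):
    """v: value attribute sorted within groups; a, d: 1-based ranks in group."""
    m = len(v)
    lam = m.bit_length()                 # smallest lam with 2**lam > m
    mask = (1 << lam) - 1

    # Step 2: lam-bit decomposition of 2^lam + a - d and 2^lam + d - a,
    # which keeps the differences modulo 2^lam.
    a_minus_d = [((1 << lam) + a[i] - d[i]) & mask for i in range(m)]
    d_minus_a = [((1 << lam) + d[i] - a[i]) & mask for i in range(m)]

    # Steps 3-4: drop the least significant bit and test the rest for zero.
    # The flag is 1 exactly when the difference is 0 or 1.
    a_flag = [int((x >> 1) == 0) for x in a_minus_d]   # greater median
    d_flag = [int((x >> 1) == 0) for x in d_minus_a]   # smaller median

    # Steps 5-6: element-wise product of the values with the flags.
    va = [v[i] * a_flag[i] for i in range(m)]
    vd = [v[i] * d_flag[i] for i in range(m)]

    # Step 7: stable sort by the negated flags, so flagged records come first.
    order_a = sorted(range(m), key=lambda i: 1 - a_flag[i])
    order_d = sorted(range(m), key=lambda i: 1 - d_flag[i])

    # Final step: x = sigma_a(va) + sigma_d(vd).
    return [va[order_a[j]] + vd[order_d[j]] for j in range(m)]


# Two groups: values (10, 20, 30) and (5, 7), each sorted in ascending order.
v = [10, 20, 30, 5, 7]
a = [1, 2, 3, 1, 2]        # ascending-order rank in group, starting at 1
d = [3, 2, 1, 2, 1]        # descending-order rank in group, starting at 1
print(group_median_sums(v, a, d))   # [40, 12, 0, 0, 0]
```

In this example λ = 3. The first group contributes 20 + 20 = 40 because its middle record carries both flags, and the second group contributes 7 + 5 = 12. The condition 2^λ > m ensures that a negative difference wraps to a value of at least 2 and is therefore never mistaken for 0 or 1.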
<Modifications>
The embodiment above showed a case of configuring each secure computation apparatus 1 n to output at least one of the shares [k′0], . . . , [k′nk−1] of the sorted key attributes k′0, . . . , k′nk−1, the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1, the share {e} of the flag e, and the share {{σ}} of the permutation σ. However, the processing units to be included may be selected depending on the type of group-by operation to be subsequently calculated. For example, group-by count and group-by median are kinds of group-by operations that require the share {e} of the flag e and the share {{σ}} of the permutation σ. Group-by sum and group-by maximum/minimum are kinds of group-by operations that require the shares [v′0], . . . , [v′na−1] of the sorted value attributes v′0, . . . , v′na−1, the share {e} of the flag e, and the share {{σ}} of the permutation σ. Rank in group is a kind of group-by operation that requires the share {e} of the flag e and the share {{σ}} of the permutation σ. That is, in a situation where group-by count, group-by median, or rank in group is calculated but group-by sum or group-by maximum/minimum is not calculated, it is sufficient that the secure aggregate function computation system 100 is able to output at least the share {e} of the flag e and the share {{σ}} of the permutation σ. In that case, each secure computation apparatus 1 n could be configured to include the input unit 10, the bit decomposition unit 11, the group sort generation unit 12, the bit string sorting unit 13, the flag generation unit 14, the key aggregate sort generation unit 15, and the output unit 19, and not to include the de-duplication unit 16, the key sorting unit 17, and the value sorting unit 18, for example.
While the embodiments of the present invention have been described, specific configurations are not limited to these embodiments; design modifications and the like that do not depart from the spirit of the invention are, of course, encompassed in the scope of the invention. The various processes described in the embodiments need not be executed in time series in the described order; they may be executed in parallel or separately depending on the processing capability of the apparatus executing the processes or as necessary.
[Program and Recording Medium]
When the various types of processing functions of the apparatuses described in the above embodiments are implemented on a computer, the processing contents of the functions to be provided by each apparatus are written as a program. By executing this program on the computer, the various types of processing functions of the above-described apparatuses are implemented on the computer.
This program in which the contents of processing are written can be recorded in a computer-readable recording medium. The computer-readable recording medium may be any medium such as a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory.
Distribution of this program is implemented by sales, transfer, rental, and other transactions of a portable recording medium such as a DVD or a CD-ROM on which the program is recorded, for example. Furthermore, this program may be stored in a storage unit of a server computer and transferred from the server computer to other computers via a network so as to be distributed.
A computer which executes such a program first stores, in its own storage unit, the program recorded in a portable recording medium or transferred from a server computer, for example. When the processing is performed, the computer reads out the program stored in its storage unit and performs processing in accordance with the program thus read out. As another execution form of this program, the computer may directly read out the program from a portable recording medium and perform processing in accordance with the program. Furthermore, each time the program is transferred to the computer from the server computer, the computer may sequentially perform processing in accordance with the received program. Alternatively, a configuration may be adopted in which the program is not transferred from the server computer to the computer and the above-described processing is executed by a so-called application service provider (ASP)-type service, in which the processing functions are implemented only through an instruction for execution and acquisition of the result. It should be noted that a program in this form includes information which is provided for processing performed by electronic calculation equipment and which is equivalent to a program (such as data which is not a direct instruction to the computer but has a property specifying the processing performed by the computer).
In this form, the present apparatus is configured with a predetermined program executed on a computer. However, the present apparatus may be configured with at least part of these processing contents realized in a hardware manner.