WO2014185043A1

WO2014185043A1 - Information processing device, information anonymization method, and recording medium

Info

Publication number: WO2014185043A1
Application number: PCT/JP2014/002480
Authority: WO
Inventors: 隆夫竹之内
Original assignee: 日本電気株式会社
Priority date: 2013-05-15
Filing date: 2014-05-12
Publication date: 2014-11-20
Also published as: JPWO2014185043A1

Abstract

In order to allow a user device to integrate anonymized data and allow data to be appropriately anonymized for the provider as well, an information processing device according to the present invention comprises: a generalization policy coordination determination means which in coordination with another device determines a common generalization policy, which is a generalization policy for anonymization of data, shared with the other device; and an anonymization means for anonymizing the data on the basis of the common generalization policy.

Description

Information processing apparatus, information anonymization method, and recording medium

The present invention relates to information processing, and in particular to data anonymization.

In recent years, a lot of personal data has been converted to electronic data.

Demand for secondary use of personal data is expanding with the conversion of data to electronic data.

However, personal data includes data about an individual that the individual does not want disclosed (sensitive data (SD), also called a sensitive data attribute). For this reason, personal privacy must be protected when personal data is disclosed.

Anonymization is one technology that protects privacy. An information processing device of a data provider anonymizes the data and transmits it to a device that uses the data (hereinafter referred to as “user device”).

The information processing apparatus related to the present invention, for example, deletes an identifier (ID: Identifier) that uniquely identifies an individual from personal data and publishes the data.

However, personal data may include data that can identify (specify) an individual when combined with other data. Such data is called a “quasi-identifier (QID)”.

Therefore, the information processing apparatus related to the present invention anonymizes the quasi-identifier (QID) so as to satisfy a predetermined policy for protecting the personal data to be provided.

Several anonymization policies (generalization policies) have been proposed.

For example, “k-anonymity” and “l-diversity” are widely used (see, for example, Patent Document 1). “k-anonymity” is a policy that guarantees that each group of anonymized data contains “k” or more pieces of data having the same quasi-identifier or the same combination of quasi-identifiers. “l-diversity” is a policy that guarantees that each group of anonymized data contains “l” or more different kinds of sensitive data.
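As a minimal illustration of these two definitions, the following Python sketch checks a table whose quasi-identifiers have already been generalized. The record layout, the function names, and the sample values are assumptions made for illustration only and are not taken from Patent Document 1.

```python
from collections import defaultdict

def group_by_qid(records, qid_keys):
    """Group records by their (already generalized) quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[key] for key in qid_keys)].append(rec)
    return groups

def satisfies_k_anonymity(records, qid_keys, k):
    """True if every group contains at least k records."""
    return all(len(g) >= k for g in group_by_qid(records, qid_keys).values())

def satisfies_l_diversity(records, qid_keys, sensitive_key, l):
    """True if every group contains at least l distinct sensitive values."""
    return all(
        len({rec[sensitive_key] for rec in g}) >= l
        for g in group_by_qid(records, qid_keys).values()
    )

# Hypothetical sample data (not from the source document).
records = [
    {"qid1": "120-124", "sd": "flu"},
    {"qid1": "120-124", "sd": "cold"},
    {"qid1": "125-129", "sd": "flu"},
    {"qid1": "125-129", "sd": "flu"},
]

print(satisfies_k_anonymity(records, ["qid1"], 2))
print(satisfies_l_diversity(records, ["qid1"], "sd", 2))
```

In this sample, both groups hold two records, but the second group holds only one kind of sensitive value, so the table is 2-anonymous without being 2-diverse.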

Other policies, for example “t-proximity” (also known as t-closeness) and “m-invariance”, have also been proposed. “t-proximity” is a policy that guarantees that the distance between the distribution of sensitive data in each group and the distribution of sensitive data in the data as a whole is equal to or less than “t”. “m-invariance” is a policy for the sequential disclosure of data that guarantees that there are “m” or more records with the same combination of quasi-identifiers and that all of those records have different sensitive data.

Anonymization policies may also be used in combination.

Note that “k-anonymization” is anonymization satisfying “k-anonymity”. Further, “l-diversification” is anonymization satisfying “l-diversity”. Anonymization satisfying “t-proximity” and anonymization satisfying “m-invariance” are named in the same way.

Many anonymization techniques have been proposed (see, for example, Non-Patent Document 1). “Mondrian Multidimensional” described in Non-Patent Document 1 is a method of first generalizing the quasi-identifiers so that all data falls into one group, and then dividing the data into a plurality of groups so long as k-anonymity remains satisfied.
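The idea of starting from one group and repeatedly dividing it can be sketched as follows. This is a simplified, one-dimensional illustration written for this description, not the algorithm of Non-Patent Document 1. The function names and the rule of splitting at the middle of the sorted values are assumptions.

```python
def median_partition(values, k):
    """Recursively split values at the median while both halves keep >= k items.

    Simplified one-dimensional sketch; a multidimensional method would also
    choose which quasi-identifier to split at each step.
    """
    values = sorted(values)
    mid = len(values) // 2
    left, right = values[:mid], values[mid:]
    if len(left) < k or len(right) < k:
        return [values]  # a further split would break k-anonymity
    return median_partition(left, k) + median_partition(right, k)

def generalize(group):
    """Express a group as a generalization width "low-high"."""
    return f"{group[0]}-{group[-1]}"

# Hypothetical QID values.
groups = median_partition([1, 8, 13, 19, 5, 7, 14, 17], k=2)
print([generalize(g) for g in groups])
```

Each resulting group holds at least k values, and the widths depend entirely on the data held by the apparatus that runs the division.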

Patent Document 1: JP 2011-170632 A

However, the number of data providers is not limited to one, and there may be a plurality of providers.

The information processing device of each providing source anonymizes the data individually and provides it to the user device.

Therefore, when there are a plurality of data providing sources, the user device needs to receive anonymized data from a plurality of information processing devices of the providing sources and aggregate the anonymized data.

However, the data stored by each provider is not the same. Therefore, for example, when the number of pieces of data stored by each provider differs, the information processing apparatuses related to the present invention anonymize the data based on different generalization policies. Similarly, when the QIDs included in the data differ, the information processing apparatuses related to the present invention anonymize the data based on different generalization policies. When the generalization policies used by the providers for anonymization do not match, the user device cannot aggregate the anonymized data received from the information processing apparatuses of the plurality of providers.

As described above, when the information processing devices of a plurality of providers provide data, the methods described in Patent Document 1 and Non-Patent Document 1 have a problem in that the user device cannot aggregate the provided anonymized data.

An object of the present invention is to provide an information processing apparatus, an information anonymization method, and a recording medium that solve the above-described problems.

An information processing apparatus according to an aspect of the present invention includes: generalization policy cooperation determination means for determining, in cooperation with another apparatus, a common generalization policy, which is a generalization policy for anonymizing data that is used in common with the other apparatus; and anonymization means for anonymizing data based on the common generalization policy.

An information anonymization method according to an aspect of the present invention includes: determining, in cooperation with another apparatus, a common generalization policy, which is a generalization policy for anonymizing data that is used in common with the other apparatus; and anonymizing data based on the common generalization policy.

A computer-readable recording medium according to an aspect of the present invention records a program that causes a computer apparatus to execute: a process of determining, in cooperation with another apparatus, a common generalization policy, which is a generalization policy for anonymizing data that is used in common with the other apparatus; and a process of anonymizing data based on the common generalization policy.

According to the present invention, it is possible to provide anonymized data that allows a user device to aggregate data after anonymization.

FIG. 1 is a diagram showing data for explaining the operation of the information processing apparatus related to the present invention.
FIG. 2 is a diagram showing data for explaining the operation of the information processing apparatus related to the present invention.
FIG. 3 is a block diagram illustrating an example of a configuration of a system including the information processing apparatus according to the first embodiment of the present invention.
FIG. 4 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the first embodiment.
FIG. 5 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the first embodiment.
FIG. 6 is a flowchart illustrating an example of the operation of the information processing apparatus according to the first embodiment.
FIG. 7 is a diagram illustrating data for explaining the operation of the information processing apparatus according to the first embodiment.
FIG. 8 is a diagram illustrating data for explaining the operation of the information processing apparatus according to the first embodiment.
FIG. 9 is a block diagram illustrating an example of another configuration of the information processing apparatus according to the first embodiment.

Next, an embodiment of the present invention will be described with reference to the drawings.

Before describing the embodiment of the present invention, the operation of the information processing apparatus related to the present invention will be described.

FIG. 1 is a diagram showing data for explaining the operation of the information processing apparatus related to the present invention.

For convenience of explanation, the information processing apparatus of provider A is referred to as “provider A” in the following description. Similarly, the information processing apparatus of provider B is referred to as “provider B”.

Suppose that provider A anonymizes data 1000 shown on the upper left. First, provider A generalizes the quasi-identifiers (QID1 and QID2) into one group, as in data 1001 shown in the upper center. Then, provider A divides QID1 into two groups (generalization width “120-125” and generalization width “126-129”) with the median value “125” of QID1 as the boundary, and anonymizes the data as in data 1002 shown on the upper right.

Meanwhile, suppose that provider B anonymizes data 2000 shown on the lower left. First, provider B generalizes the quasi-identifiers (QID1 and QID2) into one group, as in data 2001 shown in the lower center. Then, provider B divides QID1 into two groups (generalization width “120-124” and generalization width “125-129”) with the median value “124” of QID1 as the boundary, and anonymizes the data as in data 2002 shown on the lower right.

FIG. 2 is a diagram showing data for explaining the operation of the information processing apparatus related to the present invention.

As shown in FIG. 2, the anonymized data 1002 of the provider A and the anonymized data 2002 of the provider B have different boundaries. Therefore, the user apparatus can assume a plurality of connection methods (mappings) between the group of the provider A and the group of the provider B.

For example, the group of provider B with QID1 “125-129” shares QID values with both the group with QID1 “120-125” and the group with QID1 “126-129” of provider A. For this reason, the user device cannot determine which group of provider A the group of provider B with QID1 “125-129” in the received anonymized data corresponds to.
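The ambiguity can be checked mechanically. The following Python sketch uses the generalization widths quoted above and lists, for each group of provider B, the groups of provider A whose range overlaps it. The helper names are assumptions introduced for illustration.

```python
def parse_range(text):
    """Parse a generalization width such as "120-125" into (low, high)."""
    low, high = text.split("-")
    return int(low), int(high)

def overlaps(a, b):
    """True if the two inclusive ranges share at least one value."""
    a_low, a_high = parse_range(a)
    b_low, b_high = parse_range(b)
    return a_low <= b_high and b_low <= a_high

# Generalization widths of QID1 taken from the example above.
provider_a_groups = ["120-125", "126-129"]
provider_b_groups = ["120-124", "125-129"]

candidates = {
    b: [a for a in provider_a_groups if overlaps(a, b)]
    for b in provider_b_groups
}
print(candidates)
```

The group “125-129” of provider B has two candidate groups on the side of provider A, which is why the user device cannot decide how to connect the groups.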

As described above, when the information processing devices of a plurality of providers provide data, the information processing device related to the present invention has a problem in that the user device cannot aggregate the provided anonymized data.

Therefore, the information processing apparatus related to the present invention anonymizes the data using, for example, the method described below.

The first method is as follows.

In the first method, the information processing apparatus related to the present invention stores a common generalization policy in advance. The information processing apparatus related to the present invention then anonymizes data based on the stored common generalization policy.

The second method is as follows.

In the second method, the information processing apparatuses related to the present invention disclose their QIDs to one another. Each information processing apparatus related to the present invention then determines the anonymization policy using the QIDs of all the information processing apparatuses.

However, the information processing apparatus using the first method has a problem in that it cannot anonymize the stored data optimally.

Specific explanation will be given using an example.

For example, it is assumed that the information processing apparatus stores four pieces of data with QIDs “1”, “8”, “13”, and “19”, and that the information processing apparatus is required to satisfy “2-anonymity”.

In this case, the information processing apparatus can adopt, for example, generalization policies of “0-9” and “10-19” in order to anonymize the stored data. Therefore, it is assumed that the information processing apparatus stores “0-9” and “10-19” as common generalization policies in advance.

Suppose, however, that the information processing apparatus later additionally stores data with QIDs “5”, “7”, “14”, and “17”.

In that case, the information processing apparatus could still secure 2-anonymity by anonymizing the data based on finer generalization policies, for example “0-5”, “6-9”, “10-14”, and “15-20”.

However, the information processing apparatus using the first method has determined the generalization policies (“0-9” and “10-19”) in advance. Therefore, the information processing apparatus divides the data into only two groups, “1, 5, 7, 8” and “13, 14, 17, 19”, according to those generalization policies, even though a finer division would still satisfy 2-anonymity. As described above, the information processing apparatus using the first method has a problem in that it cannot carry out optimal anonymization.
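The example can be reproduced with a short Python sketch that assigns the eight QID values to the ranges of each generalization policy. The QID values and ranges are those given in the text. The function name and the data layout are assumptions for illustration.

```python
def assign_to_ranges(values, ranges):
    """Place each value into the first inclusive (low, high) range containing it."""
    groups = {r: [] for r in ranges}
    for v in sorted(values):
        for low, high in ranges:
            if low <= v <= high:
                groups[(low, high)].append(v)
                break
    return groups

stored = [1, 8, 13, 19, 5, 7, 14, 17]  # initial data plus the added data
fixed_policy = [(0, 9), (10, 19)]  # policy stored in advance (first method)
finer_policy = [(0, 5), (6, 9), (10, 14), (15, 20)]  # policy the added data would allow

fixed_groups = assign_to_ranges(stored, fixed_policy)
finer_groups = assign_to_ranges(stored, finer_policy)
print(fixed_groups)
print(finer_groups)
```

Both divisions keep at least two values per group, but the policy fixed in advance yields two groups of four values, whereas the finer policy yields four groups of two values and therefore loses less information.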

In addition, data including QIDs is an asset of the provider. Therefore, a data provider wants to avoid disclosing data including QIDs to other providers in a non-anonymized state.

That is, the information processing apparatus using the second method has a problem that it is difficult to implement in actual operation.

Each drawing explains an embodiment of the present invention. However, the present invention is not limited to the description of each drawing. Moreover, the same reference numeral is given to the same component in each drawing, and repeated description may be omitted.

(First embodiment)
FIG. 3 is a block diagram showing an example of the configuration of the information processing system 40 including the information processing apparatus 10 and the information processing apparatus 30 according to the first embodiment of the present invention.

The information processing system 40 includes an information processing device 10, a user device 20, and an information processing device 30. The information processing apparatus 10, the user apparatus 20, and the information processing apparatus 30 are connected via a general communication path, for example, a network or a bus.

User device 20 receives anonymized data from information processing device 10 and information processing device 30. Then, the user device 20 uses the anonymized data after aggregation. The user device 20 is not particularly limited as long as it is a device that processes general data. Therefore, detailed description of the user device 20 is omitted.

The information processing apparatus 10 anonymizes the data and transmits it to the user apparatus 20 so that the user apparatus 20 can aggregate the anonymized data.

The information processing apparatus 30 is the same apparatus as the information processing apparatus 10. However, the information processing apparatus 10 cooperates with other information processing apparatuses (for example, the information processing apparatus 30) as will be described later. Therefore, in order to clarify the following description of the cooperation, the information processing apparatus 30 is assigned a reference numeral different from that of the information processing apparatus 10.

Therefore, in the description of cooperation, the information processing apparatus 10 will be described as an apparatus that is a main subject of cooperation. The information processing apparatus 30 will be described as an apparatus that responds to the information processing apparatus 10. That is, the information processing apparatus 30 corresponds to “another information processing apparatus 10” that responds to the information processing apparatus 10.

Therefore, in the following description, the configurations and operations of the information processing apparatus 10 and the information processing apparatus 30 may be interchanged.

In the following description, when there is no need to distinguish between the information processing apparatus 10 and the information processing apparatus 30, the information processing apparatus 10 will be described.

In FIG. 3, one information processing apparatus 10 and one information processing apparatus 30 are shown, but these numbers are chosen only for convenience of description. The information processing apparatus 10 according to the present embodiment may cooperate with a plurality of information processing apparatuses 30. The same applies to the information processing apparatus 30. That is, the information processing system 40 including the information processing apparatus 10 and the information processing apparatus 30 according to the present embodiment may include a plurality of information processing apparatuses 10 and a plurality of information processing apparatuses 30.

The information processing apparatus 10 will be further described with reference to the drawings.

FIG. 4 is a block diagram illustrating an example of the configuration of the information processing apparatus 10 according to the present embodiment.

In FIG. 4, one information processing apparatus 10 and one information processing apparatus 30 are shown, but the number of information processing apparatuses 10 and 30 is an example, as in FIG. 3.

The information processing apparatus 10 anonymizes data in cooperation with the information processing apparatus 30.

Therefore, as illustrated in FIG. 4, the information processing apparatus 10 includes an anonymization unit 110 and a generalization policy cooperation determination unit 120.

The generalization policy cooperation determination unit 120 cooperates (communicates) with the information processing apparatus 30 and determines a generalization policy to be shared (hereinafter referred to as “common generalization policy”). That is, the generalization policy cooperation determination unit 120 determines a common generalization policy in cooperation with the “other information processing apparatus 10”. In other words, the generalization policy cooperation determination unit 120 shares the common generalization policy with the information processing apparatus 30 through cooperation.

Here, the common generalization policy is a generalization policy used for anonymization of data in common between the information processing apparatus 10 and the information processing apparatus 30. The common generalization policy is, for example, a QID division point (boundary) or a range of data after QID division (generalization width).

The anonymization unit 110 anonymizes data based on the common generalization policy determined by the generalization policy cooperation determination unit 120.

The information processing apparatus 10 transmits the anonymized data to the user apparatus 20.

The information processing apparatus 10 and the information processing apparatus 30 use a common generalization policy for anonymization. Therefore, the user device 20 can aggregate the received anonymized data.
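The effect of a common generalization policy can be sketched as follows. In this Python illustration two providers generalize their own QID values with the same ranges, and the user side simply adds up the group counts. The policy values, the sample data, and the function name are assumptions, and the way the common policy is agreed upon is left out.

```python
from collections import Counter

# Hypothetical common generalization policy: shared generalization widths of one QID.
COMMON_POLICY = [(120, 124), (125, 129)]

def anonymize(values, policy):
    """Replace each QID value with the label of the policy range that contains it."""
    labels = []
    for v in values:
        for low, high in policy:
            if low <= v <= high:
                labels.append(f"{low}-{high}")
                break
        else:
            raise ValueError(f"{v} is outside the common policy")
    return labels

# Hypothetical data held by two providers.
provider_a = anonymize([120, 123, 125, 128], COMMON_POLICY)
provider_b = anonymize([121, 124, 126, 129], COMMON_POLICY)

# User side: groups with identical labels can be merged directly.
aggregated = Counter(provider_a) + Counter(provider_b)
print(dict(aggregated))
```

Because both providers use the same boundaries, every group of one provider corresponds to exactly one group of the other, so no mapping between groups has to be guessed.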

Note that the information processing apparatus 10 does not need to share all generalization policies.

The information processing apparatus 10 may share generalization policies only within a predetermined range. For a generalization policy that is not shared (hereinafter referred to as “individual generalization policy”), the information processing apparatus 10 may determine a generalization policy that is suitable for its own apparatus.

That is, the information processing apparatus 10 can anonymize data based on the individual generalization policy in addition to anonymization based on the common generalization policy.

In addition to determining the common generalization policy through cooperation, the generalization policy cooperation determination unit 120 may store information on data attributes.

For example, the generalization policy cooperation determination unit 120 may store information regarding the attribute type of data to be anonymized. Here, the type of attribute is not particularly limited. For example, the following attribute types can be assumed.

(1) Identifier
(2) QID that shares a generalization policy (common QID)
(3) QID that does not share a generalization policy
(4) Others

The generalization policy cooperation determination unit 120 may determine whether the generalization policy used by the anonymization unit 110 is a common generalization policy based on the stored information.

The anonymization unit 110 may use information stored by the generalization policy cooperation determination unit 120 for anonymization of data. For example, when deleting an identifier from data, the anonymization unit 110 may determine the attribute to be deleted based on information indicating that the attribute corresponds to the identifier stored by the generalization policy cooperation determination unit 120.
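One possible way to hold this attribute information is a small table that maps attribute names to attribute types. The following Python sketch is an assumption made for illustration. The enum, the attribute names, and the helper functions do not appear in the original description.

```python
from enum import Enum

class AttributeType(Enum):
    IDENTIFIER = 1       # (1) identifier
    COMMON_QID = 2       # (2) QID that shares a generalization policy
    INDIVIDUAL_QID = 3   # (3) QID that does not share a generalization policy
    OTHER = 4            # (4) others

# Hypothetical attribute table stored on the policy-determination side.
attribute_types = {
    "name": AttributeType.IDENTIFIER,
    "age": AttributeType.COMMON_QID,
    "zip": AttributeType.INDIVIDUAL_QID,
    "disease": AttributeType.OTHER,
}

def drop_identifiers(record, attribute_types):
    """Return a copy of the record without attributes marked as identifiers."""
    return {
        name: value
        for name, value in record.items()
        if attribute_types.get(name) is not AttributeType.IDENTIFIER
    }

def uses_common_policy(attribute, attribute_types):
    """True if the attribute is a common QID."""
    return attribute_types.get(attribute) is AttributeType.COMMON_QID

record = {"name": "X", "age": 34, "zip": "1000001", "disease": "flu"}
print(drop_identifiers(record, attribute_types))
```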

FIG. 5 is a block diagram illustrating an example of the configuration of the information processing apparatus 10.

In FIG. 5, the same components as those in FIG. 4 are given the same numbers.

The information processing apparatus 10 includes an anonymization unit 110, a generalization policy cooperation determination unit 120, a pre-anonymization data storage unit 160, an anonymized data storage unit 170, and a transmission unit 180.

The pre-anonymization data storage unit 160 stores pre-anonymization data. The information processing device 10 anonymizes the pre-anonymization data and then transmits it to the user device 20.

As described above, the anonymization unit 110 anonymizes the pre-anonymization data based on the common generalization policy determined by the generalization policy cooperation determination unit 120 and thereby creates anonymized data. Moreover, as already described, the anonymization unit 110 may anonymize data using an individual generalization policy in addition to a common generalization policy. Furthermore, the anonymization unit 110 may use information stored by the generalization policy cooperation determination unit 120 for anonymizing data.

The anonymization unit 110 stores the anonymized data in the anonymized data storage unit 170. In response to a request from the user device 20, the anonymization unit 110 sends the anonymized data to the transmission unit 180. Note that the anonymization unit 110 may also store data that is in the middle of anonymization in the anonymized data storage unit 170.

The anonymized data storage unit 170 stores the anonymized data anonymized by the anonymization unit 110.

The transmission unit 180 transmits the anonymized data received from the anonymization unit 110 to the user device 20. Therefore, the transmission unit 180 controls communication with the user device 20. The transmission unit 180 may receive the anonymized data from the anonymized data storage unit 170 without passing through the anonymization unit 110 and transmit the anonymized data to the user device 20.

As described above, the generalization policy cooperation determination unit 120 determines the common generalization policy used by the anonymization unit 110 in cooperation with the information processing apparatus 30. To this end, the generalization policy cooperation determination unit 120 includes an anonymity parameter storage unit 130, a common parameter setting unit 140, and a communication unit 150.

The anonymity parameter storage unit 130 stores information on the attribute types already described, for example, information on the QID whose generalization policy the generalization policy cooperation determination unit 120 shares with the information processing apparatus 30 (common QID). That is, the anonymity parameter storage unit 130 holds information (anonymity parameters) for determining whether or not the generalization policy used by the anonymization unit 110 is a common generalization policy. The anonymity parameter storage unit 130 may also store information on the other types already described, for example, information on QIDs that do not share a generalization policy, or information on other attributes.

In this description, it is assumed that the anonymity parameter storage unit 130 has information set in advance. For example, an administrator of the information processing apparatus 10 may operate the information processing apparatus 10 to store (set) information in the anonymity parameter storage unit 130.

The common parameter setting unit 140 determines a common generalization policy (common parameter) based on information stored in the anonymity parameter storage unit 130 in cooperation with the information processing apparatus 30.

The common parameter setting unit 140 will be further described under the following assumptions using a specific example.

First, the data of the information processing apparatus 10 includes QID1 and QID2 as quasi-identifiers to be anonymized. Second, information indicating that the generalization policy of QID1 is determined in cooperation is set in the anonymity parameter storage unit 130. In other words, information indicating that QID1 is shared is set in the anonymity parameter storage unit 130. Third, information indicating that the generalization policy of QID2 is not determined in cooperation is set in the anonymity parameter storage unit 130. That is, information indicating that QID2 is not shared is set in the anonymity parameter storage unit 130. Therefore, QID1 is a common QID, and the generalization policy of QID1 is a common generalization policy. QID2 is not a common QID, and the generalization policy of QID2 is an individual generalization policy.

First, the case where the information processing apparatus 10 starts cooperation will be described.

A case where data is anonymized using QID1 will be described.

First, the common parameter setting unit 140 determines whether QID1 is a common QID based on information stored by the anonymity parameter storage unit 130. In this case, QID1 is a common QID. Therefore, the common parameter setting unit 140 starts cooperation with the information processing apparatus 30 in order to share the common generalization policy (the generalization policy of QID1).

When the common generalization policy is received from the information processing device 30, the common parameter setting unit 140 determines the common generalization policy used for anonymization based on the common generalization policy of its own device and the received common generalization policy.

If the common generalization policy cannot be received from the information processing apparatus 30, the common parameter setting unit 140 gives up cooperation. In this case, the information processing apparatus 10 anonymizes data as in the individual generalization policy described below.

On the other hand, the case where QID2 is used will be described.

The common parameter setting unit 140 determines whether QID2 is a common QID based on information stored by the anonymity parameter storage unit 130. In this case, QID2 is not a common QID. Therefore, the common parameter setting unit 140 does not cooperate with the information processing apparatus 30. In this case, the information processing apparatus 10 anonymizes the data based on the individual generalization policy (QID2 generalization policy).
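The branching described for QID1 and QID2 can be sketched as follows. This Python illustration is a sketch under stated assumptions. The function names are hypothetical, the exchange with the other apparatus is reduced to a callable that returns `None` when nothing is received, and the rule for combining the two policies is a placeholder because the description above does not specify how the own policy and the received policy are combined.

```python
def decide_policy(qid, own_policy, common_qids, request_peer_policy, merge):
    """Return (kind, policy) for the QID selected for anonymization."""
    if qid not in common_qids:
        # Not a common QID: no cooperation, the individual policy is used.
        return "individual", own_policy
    received = request_peer_policy(qid)  # None models "could not be received"
    if received is None:
        # Cooperation is given up: fall back to the individual policy.
        return "individual", own_policy
    return "common", merge(own_policy, received)

def shared_boundaries(own, received):
    """Placeholder merge rule (an assumption): keep boundaries proposed by both sides."""
    return sorted(set(own) & set(received))

common_qids = {"qid1"}  # hypothetical anonymity parameters

print(decide_policy("qid1", [125], common_qids, lambda q: [125, 127], shared_boundaries))
print(decide_policy("qid1", [125], common_qids, lambda q: None, shared_boundaries))
print(decide_policy("qid2", [30], common_qids, lambda q: [40], shared_boundaries))
```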

Next, a case where the information processing apparatus 10 receives a request for cooperation, that is, a case where the information processing apparatus 30 starts cooperation will be described.

First, when a notification of cooperation is received from the information processing apparatus 30, the common parameter setting unit 140 determines whether to share the generalization policy based on information stored by the anonymity parameter storage unit 130.

When a notification of cooperation is received from the information processing apparatus 30 for a generalization policy that can be shared (common generalization policy: for example, the generalization policy of QID1), the common parameter setting unit 140 transmits the common generalization policy of its own device to the information processing apparatus 30. Then, the information processing apparatus 10 determines the common generalization policy used for anonymization based on the received common generalization policy and the common generalization policy of its own apparatus.

On the other hand, when a notification of cooperation is received from the information processing apparatus 30 for a generalization policy that cannot be shared (individual generalization policy: for example, the generalization policy of QID2), the common parameter setting unit 140 does not respond to the information processing apparatus 30. However, the information processing apparatus 10 may notify the information processing apparatus 30 that it does not cooperate.

Note that the common parameter setting unit 140 may determine whether to cooperate based on the content of the received generalization policy.
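The behavior of the apparatus that receives a cooperation notification can be sketched in the same spirit. The message layout, the function name, and the `notify_refusal` switch in the following Python sketch are assumptions made for illustration. The optional decision based on the content of the received generalization policy is not modeled.

```python
def handle_cooperation_notice(qid, common_qids, own_policies, notify_refusal=False):
    """Return the reply for a cooperation notification, or None for no reply."""
    if qid in common_qids:
        # Shareable policy: send back the own common generalization policy.
        return {"qid": qid, "cooperate": True, "policy": own_policies[qid]}
    if notify_refusal:
        # Optionally tell the requester that this apparatus does not cooperate.
        return {"qid": qid, "cooperate": False}
    return None  # otherwise stay silent

common_qids = {"qid1"}                       # hypothetical anonymity parameters
own_policies = {"qid1": [124], "qid2": [30]}  # hypothetical division points

print(handle_cooperation_notice("qid1", common_qids, own_policies))
print(handle_cooperation_notice("qid2", common_qids, own_policies))
print(handle_cooperation_notice("qid2", common_qids, own_policies, notify_refusal=True))
```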

The communication unit 150 mediates communication between the common parameter setting unit 140 and the information processing apparatus 30. To this end, the communication unit 150 controls communication with the communication unit 150 of the information processing device 30.

Next, anonymization operation based on the common generalization policy in the information processing apparatus 10 will be described with reference to the drawings.

FIG. 6 is a flowchart illustrating an example of the anonymization operation of the information processing apparatus 10 according to the first embodiment.

In the description of FIG. 6, the generalization policy is described as a QID division point (boundary) as an example. That is, the information processing apparatus 10 shares the QID division points.

Further, it is assumed that the anonymity to be secured by the information processing apparatus 10 is determined in advance. In addition, it is assumed that the QID to be shared (common QID) is stored in the anonymity parameter storage unit 130 in advance.

Further, it is assumed that the information processing apparatus 10 knows the information processing apparatus 30 that cooperates (for example, the number of apparatuses that cooperate with each other and their addresses).

First, the anonymization unit 110 of the information processing apparatus 10 determines the QID to be divided based on the data stored by the pre-anonymization data storage unit 160 (step S210). For example, the anonymization unit 110 may select a QID having the widest value range. Alternatively, the anonymization unit 110 may select QIDs in order in a round robin manner.
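The two selection rules mentioned for step S210 can be sketched as follows. The record layout, the QID names, and the function name in this Python illustration are assumptions, and other selection rules are equally possible.

```python
from itertools import cycle

def widest_range_qid(records, qid_names):
    """Select the QID whose values span the widest range (first one wins on ties)."""
    def value_range(qid):
        values = [rec[qid] for rec in records]
        return max(values) - min(values)
    return max(qid_names, key=value_range)

# Hypothetical pre-anonymization data.
records = [
    {"qid1": 120, "qid2": 10},
    {"qid1": 125, "qid2": 35},
    {"qid1": 129, "qid2": 50},
]

print(widest_range_qid(records, ["qid1", "qid2"]))

# Round-robin alternative: take the QIDs in turn on successive divisions.
round_robin = cycle(["qid1", "qid2"])
print([next(round_robin) for _ in range(3)])
```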

Next, the anonymization unit 110 sends the generalization policy for the determined QID to the common parameter setting unit 140. For example, the anonymization unit 110 determines the division point (boundary) of the QID, and sends the QID and the boundary to the common parameter setting unit 140 as a generalization policy.

The common parameter setting unit 140 determines whether or not the received QID is a common QID based on the information stored by the anonymity parameter storage unit 130 (step S220).

In the case of a common QID (“YES” in step S220), the common parameter setting unit 140 shares a generalization policy (for example, a common QID division point (boundary)) with the information processing apparatus 30 via the communication unit 150 (step S230).

For example, the common parameter setting unit 140 operates as follows.

The common parameter setting unit 140 notifies the information processing apparatus 30 that the generalization policy of the QID determined in step S210 (for example, the division of that QID) is to be shared. That is, the common parameter setting unit 140 gives notice of the sharing of the QID. Then, the common parameter setting unit 140 waits for a response regarding cooperation from the information processing apparatus 30.

When responses indicating cooperation are received from all the information processing apparatuses 30, the common parameter setting unit 140 notifies the information processing apparatuses 30 of its common generalization policy (for example, the boundary for dividing the common QID). Then, the common parameter setting unit 140 waits for notification of the common generalization policy from each information processing apparatus 30. When the common generalization policy has been received from all the information processing apparatuses 30, the common parameter setting unit 140 proceeds to step S240.

When some information processing apparatuses 30 respond that they will cooperate and the others respond that they will not, the information processing apparatus 10 may determine the common generalization policy in cooperation with only those information processing apparatuses 30 that responded that they will cooperate, in the same manner as described above. However, when only some of the information processing apparatuses 30 would cooperate, the information processing apparatus 10 may instead stop the cooperation. In that case, the information processing apparatus 10 may operate in the same manner as when it receives responses indicating that the information processing apparatuses 30 will not cooperate, as described below.

When receiving responses indicating that the information processing apparatuses 30 will not cooperate, the common parameter setting unit 140 may operate in the same manner as in the case of the individual generalization policy described later. For example, the common parameter setting unit 140 returns the generalization policy received from the anonymization unit 110 to the anonymization unit 110.

It should be noted that the common parameter setting unit 140 need not start the communication for cooperatively determining the common generalization policy by transmitting a cooperation notification. For example, instead of fixing in advance which generalization policy is to be shared, the common parameter setting unit 140 may negotiate with the information processing apparatus 30 to determine the generalization policy to be shared. Alternatively, the common parameter setting unit 140 may transmit the common generalization policy and the cooperation notification together, instead of transmitting the common generalization policy and the notification of sharing of the QID as separate notifications. Alternatively, the information processing apparatus 10 may determine in advance that the transmission of the common generalization policy also serves as the notification of cooperation.

After receiving all the common generalization policies, the common parameter setting unit 140 determines the generalization policy used for data anonymization based on the received common generalization policies (step S240). For example, when the common generalization policy is a QID boundary, the information processing apparatus 10 may use the average of its own QID boundary and the received QID boundaries as the generalization policy.

The information processing apparatus 30 also determines a generalization policy based on the received common generalization policy. Therefore, the information processing apparatus 10 and the information processing apparatus 30 calculate the same generalization policy (for example, a QID boundary) as the generalization policy used for anonymization.

In this way, the information processing apparatus 10 and the information processing apparatus 30 determine the generalization policy for anonymization in cooperation.

After determining the generalization policy, the common parameter setting unit 140 returns the determined generalization policy to the anonymization unit 110.

Note that the common parameter setting unit 140 may not receive the common generalization policy from the information processing apparatus 30 due to, for example, a network failure or a failure of the information processing apparatus 30.

Therefore, when the common generalization policy is not sent from the information processing apparatus 30, the common parameter setting unit 140 may return the boundary received from the anonymization unit 110 to the anonymization unit 110 as the generalization policy. That is, the information processing apparatus 10 may anonymize data using the generalization policy determined by the anonymization unit 110 when the generalization policy cannot be determined in cooperation.

Alternatively, the information processing apparatus 10 may notify the user apparatus 20 of the failure.

On the other hand, when the received QID is not a common QID (“NO” in step S220), the common parameter setting unit 140 returns the boundary received from the anonymization unit 110 to the anonymization unit 110 as a generalization policy.
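
The branching in steps S220 to S240, including the fallback when cooperation fails, can be summarized in a short sketch. The function name `decide_boundary`, its parameters, and the use of a floor-rounded integer average are assumptions made for this illustration; the embodiment does not prescribe a particular way of combining the boundaries.

```python
def decide_boundary(qid, own_boundary, common_qids, received_boundaries):
    """Return the boundary the anonymization unit should use for `qid`.

    received_boundaries: boundaries sent by cooperating devices, or None or
    an empty list when none could be obtained (refusal, network failure).
    """
    if qid not in common_qids:
        # "NO" in step S220: individual generalization policy.
        return own_boundary
    if not received_boundaries:
        # Cooperation failed: fall back to the locally determined boundary.
        return own_boundary
    # Step S240: combine the own boundary with all received boundaries.
    # Assumption: integer average, rounded down.
    all_boundaries = [own_boundary, *received_boundaries]
    return sum(all_boundaries) // len(all_boundaries)
```

For instance, `decide_boundary("QID1", 125, {"QID1"}, [124])` combines both boundaries, whereas passing `None` as the last argument returns the local boundary 125 unchanged.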

The anonymization unit 110 divides the QID based on the generalization policy received from the common parameter setting unit 140 (step S250).

That is, in the case of the common QID, the information processing apparatus 10 cooperates with the information processing apparatus 30 and anonymizes data based on the common generalization policy. On the other hand, if it is not the common QID, the information processing apparatus 10 does not cooperate with the information processing apparatus 30 and anonymizes the data based on the generalization policy determined by the own apparatus.

After the division, the anonymization unit 110 confirms the anonymity of the data (step S260).

If the anonymity is satisfied (“YES” in step S260), the anonymization unit 110 proceeds to the division of the next QID (step S210). The information processing apparatus 10 repeats the division as long as the anonymity is satisfied.

If the anonymity is not satisfied (“NO” in step S260), the anonymization unit 110 cancels the immediately preceding division and ends the anonymization process (step S270).

When the previous division is a common QID, the information processing apparatus 10 notifies the linked information processing apparatus 30 of the cancellation of generalization.

However, when the previous division is a common QID, the information processing apparatus 10 may change the division point in cooperation with the information processing apparatus 30.

Further, the information processing apparatus 10 may notify the information processing apparatus 30 of the end of cooperation after the anonymization process is completed.

As described above, the information processing apparatus 10 determines a generalization policy in cooperation with the information processing apparatus 30 in the case of a generalization policy to be shared in anonymization.
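
The top-down loop of steps S210, S250, S260, and S270 can be sketched as below. This is a simplified illustration, not the embodiment's implementation: it assumes k-anonymity as the anonymity criterion, integer QID values, a round-robin QID choice, the floor-rounded mean of the largest group as the boundary, and a division applied to every group at once. The cooperation steps S220 to S240 are omitted, and all function names are hypothetical.

```python
def split_groups(groups, qid, boundary):
    """Divide every group at `boundary` on `qid` (lower part: value <= boundary)."""
    result = []
    for group in groups:
        low = [r for r in group if r[qid] <= boundary]
        high = [r for r in group if r[qid] > boundary]
        result.extend(part for part in (low, high) if part)
    return result


def satisfies_k_anonymity(groups, k):
    """Every group must contain at least k records."""
    return all(len(group) >= k for group in groups)


def anonymize_top_down(records, qids, k, max_rounds=10):
    groups = [list(records)]                  # most generalized state: one group
    for round_no in range(max_rounds):
        qid = qids[round_no % len(qids)]      # S210: round-robin QID choice
        largest = max(groups, key=len)
        boundary = sum(r[qid] for r in largest) // len(largest)
        candidate = split_groups(groups, qid, boundary)           # S250
        unchanged = len(candidate) == len(groups)
        if unchanged or not satisfies_k_anonymity(candidate, k):  # S260
            break                             # S270: discard the last division
        groups = candidate                    # anonymity holds: keep dividing
    return groups


# Hypothetical data: six records with one QID, anonymized to 2-anonymity.
data = [{"QID1": v} for v in (100, 110, 120, 130, 140, 150)]
result = anonymize_top_down(data, ["QID1"], k=2)
group_sizes = sorted(len(g) for g in result)
```

With this data the first division at 125 yields two groups of three records; the second division would isolate a single record, so it is cancelled and the process ends.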

FIG. 7 is a diagram illustrating data for explaining the operation of determining the generalization policy of the information processing apparatus 10.

The upper part of FIG. 7 shows, for example, the data of the information processing apparatus 10 (apparatus A in FIG. 7). The lower part of FIG. 7 shows the data of another information processing apparatus 10 (that is, the information processing apparatus 30, apparatus B in FIG. 7).

In this description, the information processing apparatus 10 will be described assuming that the QID to be shared has been communicated in advance. Specifically, QID1 shown in FIG. 7 is a common QID.

The anonymization unit 110 first anonymizes data 3000 and data 4000 into data 3001 and data 4001 in the most anonymized state. That is, the anonymization unit 110 anonymizes each QID into one group.

The data 3001 and data 4001 shown in the center of FIG. 7 are the first anonymized states of QID1 and QID2.

Next, the anonymization unit 110 determines the dividing point (boundary) of QID1. For example, the anonymization unit 110 of the device A determines the average “125” of QID1 of the data 3001 as the boundary. Similarly, the anonymization unit 110 of the device B determines the average “124” of QID1 of the data 4001 as a boundary.

In this description, the information processing apparatus 10 calculates the average of QID1 as a boundary. However, the information processing apparatus 10 has no particular limitation on how to determine the boundary.

For example, the information processing apparatus 10 may use the average of the groups having the largest size (the number of records is large) among the groups of QID1 as a boundary. The data 3001 and the data 4001 shown in FIG. 7 are in the initial state, the number of groups is 1, and the size of the group is 5. That is, the group of the device A and the device B shown in FIG. 7 is the largest group. Then, the devices A and B calculate the average of the largest group, and determine “125” and “124” as the boundaries, respectively.

Further, the information processing apparatus 10 need not be limited to the average of the group having the largest size (the number of records is large) as a boundary. For example, the information processing apparatus 10 may use the median value of the group as the boundary. Alternatively, the information processing apparatus 10 may select another group such as a group having a wide range.
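
The alternatives above (mean or median of the largest group) can be compared with a small sketch. The function name `boundary_of_largest_group` and the sample values are hypothetical; the values are chosen only so that the mean equals 125 and are not the data of FIG. 7.

```python
from statistics import mean, median


def boundary_of_largest_group(groups, qid, method="mean"):
    """Compute a candidate boundary from the group with the most records."""
    largest = max(groups, key=len)
    values = [record[qid] for record in largest]
    return mean(values) if method == "mean" else median(values)


# Initial state: a single group of five records (hypothetical values).
groups = [[{"QID1": v} for v in (110, 120, 122, 133, 140)]]
by_mean = boundary_of_largest_group(groups, "QID1", "mean")
by_median = boundary_of_largest_group(groups, "QID1", "median")
```

The two methods can give different boundaries for the same group (here 125 and 122), which is one reason the devices need to agree on a shared boundary for a common QID.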

Next, the anonymization unit 110 sends QID1 and the boundary to the common parameter setting unit 140.

The common parameter setting unit 140 determines whether QID1 is a common QID. Here, QID1 is a common QID.

Therefore, the common parameter setting unit 140 of the device A and the common parameter setting unit 140 of the device B exchange the boundary of QID1, which is the common generalization policy, with each other via the communication unit 150. For example, apparatus A transmits its average “125” of QID1 and receives the average “124” of QID1 of apparatus B.

Then, the common parameter setting unit 140 determines a common generalization policy based on the common generalization policy of its own device and all received common generalization policies, that is, all QID1 boundaries. For example, apparatus A and apparatus B use the average of their boundaries, with the fractional part discarded (124 = (124 + 125) / 2, rounded down), as the common generalization policy.
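
The arithmetic of this example can be checked with a few lines of code. The function name `common_boundary` is hypothetical, and averaging with the fractional part discarded is only the rule used in this example, not a requirement of the embodiment.

```python
def common_boundary(boundaries):
    """Average all boundaries (own and received) and discard the fraction."""
    return sum(boundaries) // len(boundaries)


# Device A holds 125 and receives 124; device B holds 124 and receives 125.
boundary_at_a = common_boundary([125, 124])
boundary_at_b = common_boundary([124, 125])
```

Because both devices apply the same rule to the same set of boundaries, they arrive at the same value without exchanging any records.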

Then, the common parameter setting unit 140 returns the determined generalization policy (common generalization policy) to the anonymization unit 110.

The anonymization unit 110 anonymizes data based on the received generalization policy (here, the boundary “124” of QID1).

FIG. 7 also shows the data anonymized based on the common generalization policy (the boundary “124” of QID1).

FIG. 8 is a diagram illustrating data for explaining the anonymization operation of the information processing apparatus 10 according to the present embodiment. In FIG. 8, the data 3002 of the device A and the data 4002 of the device B are displayed side by side so that the data can be easily compared.

As is clear from FIG. 8, the data boundaries of the anonymized data 3002 of the device A and the anonymized data 4002 of the device B match. Therefore, the user device 20 can aggregate the data.
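
Why matching boundaries make aggregation possible can be illustrated as follows. This sketch assumes integer QID values and an overall value range (100 to 150) known to both devices; the sample values, the range, and the function name `generalize` are hypothetical.

```python
from collections import Counter


def generalize(values, boundary, lower, upper):
    """Replace each value with the range it falls into after dividing at `boundary`."""
    low_label = f"{lower}-{boundary}"
    high_label = f"{boundary + 1}-{upper}"
    return [low_label if v <= boundary else high_label for v in values]


# Both devices divide QID1 at the common boundary 124.
device_a = generalize([110, 121, 130, 138], 124, 100, 150)
device_b = generalize([105, 124, 125, 149], 124, 100, 150)

# The user device can simply count records per generalized range.
combined = Counter(device_a) + Counter(device_b)
```

If the devices had used different boundaries (for example 125 and 124), the range labels would not line up and the counts could not be merged directly.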

Thus, the information processing apparatus 10 according to the present embodiment anonymizes data.

The effect of the information processing apparatus 10 of this embodiment will be described.

The information processing apparatus 10 provides the effect that the user apparatus 20 can aggregate the anonymized data, while the data is also anonymized appropriately for the data provider.

The reason is as follows.

The generalization policy cooperation determination unit 120 of the information processing apparatus 10 determines, in cooperation with the information processing apparatus 30 (that is, another information processing apparatus 10), the common generalization policy to be shared in anonymization. Furthermore, the generalization policy cooperation determination unit 120 notifies the other apparatus of the generalization policy that the anonymization unit 110 has determined to be optimal at that time. Therefore, the generalization policy cooperation determination unit 120 can determine a more appropriate generalization policy than in the case where the generalization policy is fixed in advance. In addition, the anonymization unit 110 of the information processing apparatus 10 can anonymize data based on the common generalization policy determined in cooperation. Therefore, the user device 20 can aggregate the anonymized data.

Further, the information processing apparatus 10 can anonymize the data without transmitting the data to the information processing apparatus 30.

The reason is as follows.

The information processing apparatus 10 can determine the common generalization policy by exchanging only the common generalization policy with the information processing apparatus 30. The information processing apparatus 10 can then anonymize the data based on the common generalization policy. Thus, the information processing apparatus 10 can anonymize the data without transmitting the data itself to the information processing apparatus 30.

<Modification 1>
A modification of the operation of the information processing apparatus 10 that is not included in the above description will be described.

Since the user device 20 aggregates data, the information processing device 10 needs to generalize the data values subject to the common generalization policy to the same generalized values (global recoding).

On the other hand, the information processing apparatus 10 does not need to anonymize the data value of the individual generalization policy so as to satisfy the global recoding. The information processing apparatus 10 may set the data value of the individual generalization policy as a different generalized value (local recoding: Local Re-Coding).

Further, the information processing apparatus 10 may anonymize data that can be categorized (name, preference, etc.), in addition to numerical data for which ranges are easy to set and magnitudes are easy to compare.

In addition, when the data is data that can be classified into categories, the information processing apparatus 10 may apply a conceptual tree classification system (taxonomy) to the data and anonymize the data.
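
Generalization of categorical data along a taxonomy can be sketched as follows. The taxonomy contents, the child-to-parent dictionary representation, and the function name `generalize_category` are hypothetical examples, not part of the embodiment.

```python
# Hypothetical taxonomy: each entry maps a category to its parent category.
TAXONOMY = {
    "apple": "fruit",
    "orange": "fruit",
    "carrot": "vegetable",
    "fruit": "food",
    "vegetable": "food",
}


def generalize_category(value, levels, taxonomy=TAXONOMY):
    """Climb `levels` steps up the taxonomy; stop at the root if it is reached."""
    for _ in range(levels):
        value = taxonomy.get(value, value)
    return value


one_level = generalize_category("apple", 1)
two_levels = generalize_category("carrot", 2)
```

In this representation, the number of levels to climb plays the role that the boundary plays for numerical QIDs, so it is the item devices would need to agree on for a common QID.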

Further, the information processing apparatus 10 is not limited to the top-down anonymization method that repeats the division as illustrated in FIG. 6, and may use a bottom-up anonymization method that repeats the combination. Alternatively, the information processing apparatus 10 may combine top down and bottom up.

<Modification 2>
Cooperation requests from the information processing apparatus 10 and from the information processing apparatus 30 may overlap.

When the common QID of the information processing apparatus 10 and the common QID of the information processing apparatus 30 are the same, the information processing apparatus 10 may determine the common generalization policy in cooperation based on the operation described above.

However, when the common QID for which the information processing apparatus 10 requests cooperation differs from the common QID notified by the information processing apparatus 30, the information processing apparatus 10 and the information processing apparatus 30 need to select which common QID to use.

The information processing apparatus 10 and the information processing apparatus 30 may determine which common QID is used by arbitration. Alternatively, the information processing apparatus 10 and the information processing apparatus 30 may set in advance a priority order to be applied when cooperation requests overlap.

In addition, when requests for cooperation from three or more devices overlap, the information processing apparatus 10 may arbitrate and determine the common QID determined in cooperation.

However, arbitration takes longer to reach a decision as the number of devices increases. Therefore, the information processing apparatus 10 may determine the priority order of the common QIDs in advance. For example, the information processing apparatus 10 may adopt the QID with the largest number of cooperation requests as the common QID.
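
The "largest number of cooperation requests" rule can be sketched with a simple tally. The function name `choose_common_qid`, the device names, and the tie-breaking behavior (the QID encountered first wins) are assumptions of this illustration.

```python
from collections import Counter


def choose_common_qid(requests):
    """Pick the QID requested by the most devices.

    requests: mapping from device name to the QID it asked to share.
    Ties are resolved in favor of the QID encountered first.
    """
    counts = Counter(requests.values())
    return counts.most_common(1)[0][0]


# Hypothetical overlapping requests from three devices.
chosen = choose_common_qid({"A": "QID1", "B": "QID2", "C": "QID1"})
```

Because every device can evaluate the same rule on the same set of requests, no further rounds of arbitration are needed.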

In addition, after determining the common QID used for cooperation, the information processing apparatus 10 and the information processing apparatus 30 transmit a common generalization policy of the determined common QID. Subsequent operations may be the same as those already described.

<Modification 3>
The configuration of the information processing apparatus 10 is not limited to the above description. The information processing apparatus 10 may divide each component into a plurality of components.

Furthermore, the information processing apparatus 10 does not need to be configured by one apparatus. For example, the information processing apparatus 10 may be configured using a device including the anonymization unit 110 connected via a network and a device including the generalization policy cooperation determination unit 120.

Alternatively, the information processing apparatus 10 may configure either or both of the pre-anonymization data storage unit 160 and the anonymized data storage unit 170 as an external storage device.

In addition, the information processing apparatus 10 may combine a plurality of components into a single component.

For example, the information processing apparatus 10 may be realized as a computer apparatus including a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory). The information processing apparatus 10 may further be realized as a computer apparatus including an input / output connection circuit (IOC: Input Output Circuit) and a network interface circuit (NIC: Network Interface Circuit).

FIG. 9 is a block diagram illustrating an example of a configuration of an information processing device 60 that is a modification of the information processing device 10 of the present embodiment.

The information processing device 60 includes a CPU 610, a ROM 620, a RAM 630, an internal storage device 640, an IOC 650, and a NIC 680, and constitutes a computer.

The CPU 610 reads a program from the ROM 620. The CPU 610 controls the RAM 630, the internal storage device 640, the IOC 650, and the NIC 680 based on the read program. By controlling these components, the CPU 610 realizes the functions of the anonymization unit 110 and the generalization policy cooperation determination unit 120 of the information processing apparatus 10. The CPU 610 may use the RAM 630 as temporary storage for the program when realizing each function.

Further, the CPU 610 may use a storage medium reading device (not shown) to read the program from the storage medium 700, which stores the program in a computer-readable form. Alternatively, the CPU 610 may receive a program from an external device (not shown) via the NIC 680.

ROM 620 stores programs executed by CPU 610 and fixed data. The ROM 620 is, for example, a P-ROM (Programmable-ROM) or a flash ROM.

The RAM 630 temporarily stores programs executed by the CPU 610 and data. The RAM 630 is, for example, a D-RAM (Dynamic-RAM).

The internal storage device 640 stores data and programs that the information processing device 60 retains over the long term. Further, the internal storage device 640 may operate as a temporary storage device for the CPU 610. The internal storage device 640 is, for example, a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), or a disk array device.

The IOC 650 mediates data between the CPU 610, the input device 660, and the display device 670. The IOC 650 is, for example, an IO interface card.

The input device 660 is a device that receives an input instruction from an operator of the information processing apparatus 60. The input device 660 is, for example, a keyboard, a mouse, or a touch panel.

The display device 670 is a device that displays information to the operator of the information processing apparatus 60. The display device 670 is a liquid crystal display, for example.

NIC 680 relays data exchange with an external device via a network. The NIC 680 is, for example, a LAN (Local Area Network) card.

The information processing apparatus 60 configured as described above can obtain the same effects as the information processing apparatus 10.

The reason is as follows.

This is because the CPU 610 of the information processing apparatus 60 can realize the same function as the information processing apparatus 10 based on the program.

(Second Embodiment)
The information processing apparatus 10 anonymizes data based on the generalization policy shared with the information processing apparatus 30.

However, the common generalization policy may be different from the optimal generalization policy for the information processing apparatus 10.

Moreover, the degree of difficulty of data anonymization differs among the information processing apparatuses 10 according to, for example, the amount (size) of the data to be handled or the anonymity to be secured.

Here, the difficulty level is an index indicating how difficult it is to ensure the anonymity of data. For example, the difficulty level is an index whose value increases as ensuring data anonymity becomes more difficult. Alternatively, the difficulty level may be an index whose value decreases as ensuring data anonymity becomes more difficult.

For example, the information processing apparatus 10 that handles data having a small data size has fewer boundary candidates than the information processing apparatus 10 that handles data having a large data size. In particular, when the amount of data handled by the information processing apparatus 10 is about twice the value of “k” of “k-anonymity”, the boundaries at which the information processing apparatus 10 can divide the data are limited to the vicinity of the median.

That is, when the data size stored in the information processing apparatus 10 is different, the information processing apparatus 10 having a small data size has a higher degree of difficulty in securing anonymity than the information processing apparatus 10 having a large data size.

Alternatively, when the anonymity secured by the information processing apparatus 10 is different, the information processing apparatus 10 has different degrees of difficulty in securing anonymity even if the data size is the same.

Therefore, when determining the common generalization policy in cooperation, the information processing apparatus 10 according to the present embodiment exchanges information regarding the difficulty level of ensuring anonymity with the information processing apparatus 30.

There are no particular restrictions on the factors that determine the level of difficulty in securing anonymity. For example, the above data size and anonymity are examples of factors that determine the difficulty level of anonymity.

As the data size increases, it becomes easier to ensure anonymity. Therefore, the data size is an example of a factor that determines the difficulty level of ensuring anonymity. The data size is an example of an index whose value decreases as ensuring data anonymity becomes more difficult.

k-anonymity is more difficult to secure as the value of “k” becomes larger. Therefore, the value of “k” in k-anonymity is an example of a factor that determines the difficulty level of ensuring anonymity. Note that the value of “k” in k-anonymity is an example of an index whose value increases as ensuring data anonymity becomes more difficult.

The generalization policy cooperation determination unit 120 of the information processing apparatus 10 according to the present embodiment determines the common generalization policy in consideration of the information regarding the difficulty level of ensuring anonymity.

Note that the configuration of the information processing apparatus 10 of the present embodiment is the same as that of the first embodiment, and thus the description of the configuration is omitted. Also, description of operations similar to those in the first embodiment will be omitted, and operations unique to the present embodiment will be described.

Hereinafter, as a specific example of the operation, a case where the common generalization policy is determined in consideration of the number of data records as the data size will be described.

For example, assume that the data size (number of records) of device A is “100”. On the other hand, the data size (number of records) of device B is “10”. Assume that “5-anonymity” is secured.

When the device B divides its data optimally, the data can be divided into two groups, each with a data size (number of records) of “5”. However, if the boundary is moved so that the number of records in each divided group changes, one of the groups contains fewer than five records, and the device B can no longer satisfy “5-anonymity” for the data in the divided groups.
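
How few boundary choices a small data set leaves can be made concrete with a short sketch. It assumes that a record belongs to the lower group when its value is less than or equal to the boundary; the function name `feasible_boundaries` and the sample values are hypothetical.

```python
def feasible_boundaries(values, k):
    """List boundaries that leave at least k records on both sides.

    A record falls in the lower group when its value is <= the boundary.
    """
    candidates = []
    for boundary in sorted(set(values)):
        low = sum(1 for v in values if v <= boundary)
        high = len(values) - low
        if low >= k and high >= k:
            candidates.append(boundary)
    return candidates


# Device B: 10 records with distinct values, 5-anonymity.
small = feasible_boundaries(list(range(10)), 5)
# Device A: 100 records with distinct values, 5-anonymity.
large = feasible_boundaries(list(range(100)), 5)
```

With 10 distinct values only one boundary survives, while with 100 distinct values there are 91 acceptable boundaries, so device A can accommodate device B's boundary far more easily than the reverse.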

Therefore, the generalization policy cooperation determination unit 120 of the information processing apparatus 10 (the above-described device A and device B) of the present embodiment does not determine the common generalization policy based on the generalization policies of both devices. Instead, the generalization policy cooperation determination unit 120 adopts the generalization policy of the information processing apparatus 10 with the smaller data size (number of records), that is, the device B, as the common generalization policy.

Note that the generalization policy cooperation determination unit 120 of the information processing apparatus 10 may change the generalization policy determination method as the anonymization process progresses. That is, the information processing apparatus 10 is not limited to the data size to be stored, and may use the divided data size.

For example, in the following case, the generalization policy cooperation determination unit 120 of the information processing device 10 may determine the common generalization policy based on the generalization policies of all the information processing devices 10. This is the case where, in all the information processing apparatuses 10, the data size (number of records) after division is at least a predetermined multiple (for example, three times) of the anonymity to be secured (for example, “k” of “k-anonymity”). In the opposite case, the generalization policy cooperation determination unit 120 may give priority to the generalization policy of the information processing apparatus 10 whose data size (number of records) has become small, and adopt it as the common generalization policy. Here, the opposite case is the case where the data size (number of records) after division in any one of the information processing apparatuses 10 falls below the predetermined multiple of the anonymity to be secured.
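
The switching rule described above might look like the following sketch. It assumes that all devices secure the same “k”, that "based on the generalization policies of all devices" means a plain average, and that "giving priority" means adopting the boundary of the device with the fewest records; these choices and the function name `choose_common_boundary` are assumptions for illustration.

```python
def choose_common_boundary(devices, k, multiple=3):
    """Choose the common boundary from per-device proposals.

    devices: list of (records_after_division, proposed_boundary) pairs.
    """
    threshold = multiple * k
    if all(size >= threshold for size, _ in devices):
        # Every device has room to spare: combine all proposals equally.
        boundaries = [boundary for _, boundary in devices]
        return sum(boundaries) / len(boundaries)
    # Otherwise adopt the proposal of the device with the fewest records.
    _, boundary = min(devices, key=lambda device: device[0])
    return boundary


# Hypothetical proposals with k = 5 and a multiple of 3 (threshold 15).
roomy = choose_common_boundary([(100, 120), (200, 126)], k=5)
tight = choose_common_boundary([(100, 120), (10, 126)], k=5)
```

In the second call one device holds only 10 records, which is below the threshold of 15, so its boundary 126 is adopted as is.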

Alternatively, the generalization policy cooperation determination unit 120 of the information processing device 10 may weight the generalization policies of the information processing devices 10 according to their data sizes, instead of treating them equally.

For example, when determining the generalization policy to be shared, the information processing apparatus 10 may set a weight based on the data size of each information processing apparatus 10 (for example, a weight inversely proportional to the data size).

More specifically, for example, the information processing apparatus 10 may multiply each boundary value by a weight that is inversely proportional to the data size, as shown in the following formula (1), and thereby determine the boundary value (division point) of the generalization policy.

[Equation 1]
Boundary value = {(1 / size1) × edge1 + (1 / size2) × edge2} / {(1 / size1) + (1 / size2)}   (1)

Here, “size1” is the data size of a certain device A (for example, the information processing device 10). “size2” is the data size of the other device B (for example, the information processing device 30). “edge1” is a boundary value in the device A. “edge2” is a boundary value in the device B. Note that when there are more than two information processing apparatuses 10, the information processing apparatus 10 may use a mathematical formula in which the number of “size” and “edge” terms in the mathematical formula (1) is increased accordingly.

Here, for example, it is assumed that the data size (size1) of the device A is “100” and the data size (size2) of the device B is “200”. That is, the data size of device A is smaller than the data size of device B. Further, the boundary value (edge1) in the device A is “120”, and the boundary value (edge2) in the device B is “126”. In this case, the boundary value of Equation (1) is as follows.

Boundary value = {(1/100) × 120 + (1/200) × 126} / {(1/100) + (1/200)} = 122

The boundary value obtained using Equation (1) is closer to the boundary value of the device A, which has the smaller data size. That is, priority is given to the boundary of the device A, for which anonymity is harder to ensure. As a result, more divisions are possible in the device A having a small data size. That is, the generalization policy of apparatus A is given priority.
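
Equation (1), extended to any number of devices as the text suggests, can be written as a short function. The name `size_weighted_boundary` is hypothetical, and exact fractions are used here only to avoid floating-point rounding in the check; they are not required by the method.

```python
from fractions import Fraction


def size_weighted_boundary(sizes, edges):
    """Weighted mean of boundaries, each weighted by 1 / data size."""
    weights = [Fraction(1, size) for size in sizes]
    weighted_sum = sum(w * edge for w, edge in zip(weights, edges))
    return weighted_sum / sum(weights)


# Values from the example above: size1 = 100, size2 = 200,
# edge1 = 120, edge2 = 126.
boundary = size_weighted_boundary([100, 200], [120, 126])
```

Since the weight of device A (1/100) is twice that of device B (1/200), the result lies twice as close to 120 as to 126.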

Moreover, when considering the anonymity to be secured, the information processing apparatus 10 may operate as follows, for example.

For example, when ensuring “k-anonymity”, the information processing apparatus 10 may prioritize the generalization policy of the information processing apparatus 10 having a large “k” value.

Note that the information processing apparatus 10 may cooperate in consideration of the difficulty level of anonymity. For example, when ensuring “k-anonymity”, the information processing apparatus 10 may use the value of “k” as a weight.

For example, the information processing apparatus 10 may use the following mathematical formula (2).

[Equation 2]
Boundary value = (k1 × edge1 + k2 × edge2) / (k1 + k2)   (2)

Here, “k1” is the value of “k” of “k-anonymity” of a certain device A. “k2” is the value of “k” of “k-anonymity” of the other device B. “edge1” and “edge2” are the same as in equation (1). When there are more than two information processing apparatuses 10, the information processing apparatus 10 may use a mathematical expression in which the number of “edge” and “k” terms in mathematical expression (2) is increased accordingly.

For example, assume that “k1” of the device A is “10” and “k2” of the device B is “2”. That is, the device A must secure higher anonymity than the device B. The boundary values are the same as described above. In this case, the boundary value of Equation (2) is as follows.

Boundary value = (10 × 120 + 2 × 126) / (10 + 2) = 121

The boundary value obtained using Equation (2) is closer to the boundary value of the device A, which must secure higher anonymity (“k” is larger). That is, priority is given to the boundary of the device A, for which anonymity is harder to ensure. As a result, more divisions are possible in the device A, which is difficult to anonymize.
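
Equation (2) can be written in the same way for any number of devices. The function name `k_weighted_boundary` is hypothetical.

```python
def k_weighted_boundary(ks, edges):
    """Weighted mean of boundaries, each weighted by the device's k."""
    weighted_sum = sum(k * edge for k, edge in zip(ks, edges))
    return weighted_sum / sum(ks)


# Values from the example above: k1 = 10, k2 = 2, edge1 = 120, edge2 = 126.
boundary = k_weighted_boundary([10, 2], [120, 126])
```

The device with the larger “k” dominates the result, mirroring how the device with the smaller data size dominates in Equation (1).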

Note that the information processing apparatus 10 may combine the above.

Further, the information processing apparatus 10 may use the difficulty level of ensuring anonymity even in the selection of the common QID described in the modification of the first embodiment.

The effect of the information processing apparatus 10 of the second embodiment will be described.

In addition to the effects of the first embodiment, the information processing apparatus 10 of the present embodiment can set an appropriate generalization policy even when the difficulty level of ensuring anonymity differs among the information processing apparatuses 10.

The reason is as follows.

The information processing apparatus 10 changes the method of determining which generalization policy is given priority based on the degree of difficulty in ensuring anonymity.

Specifically, for example, the information processing apparatus 10 determines a generalization policy to be prioritized based on the data size of the data to be anonymized (data size to be stored or data size after division) or anonymity.

In particular, the information processing apparatus 10 selects the generalization policy of the information processing apparatus 10 having a small data size or high anonymity. Alternatively, the information processing apparatus 10 gives priority to the generalization policy of the information processing apparatus 10 having a small data size or high anonymity. As a result, the information processing apparatus 10 according to the present embodiment can easily ensure anonymization of the information processing apparatus 10 that is difficult to anonymize.
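
The selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the DeviceInfo record, its field names, the select_priority_device function, and the sample data sizes are all hypothetical, and the sketch assumes that difficulty is judged by a single criterion (either data size or “k”).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DeviceInfo:
    # Hypothetical record describing one information processing apparatus.
    name: str
    data_size: int   # number of records to be anonymized (assumed unit)
    k: int           # "k" of the k-anonymity required by the device
    boundary: float  # boundary value proposed by the device


def select_priority_device(devices: Sequence[DeviceInfo],
                           criterion: str = "data_size") -> DeviceInfo:
    """Pick the device whose generalization policy is prioritized.

    A small data size or a large k is treated as "difficult to anonymize".
    """
    if not devices:
        raise ValueError("at least one device is required")
    if criterion == "data_size":
        return min(devices, key=lambda d: d.data_size)
    if criterion == "k":
        return max(devices, key=lambda d: d.k)
    raise ValueError(f"unknown criterion: {criterion}")


# Sample values are invented for illustration only.
device_a = DeviceInfo("A", data_size=1000, k=10, boundary=120)
device_b = DeviceInfo("B", data_size=200, k=2, boundary=126)

print(select_priority_device([device_a, device_b], "data_size").name)  # B
print(select_priority_device([device_a, device_b], "k").name)          # A
```

The two criteria can disagree, as in this sample, so a combined difficulty index, or a weighting as in Equation (2), may be preferable when both data size and anonymity differ between devices.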

The present invention has been described above with reference to the embodiments, but the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.

This application claims priority based on Japanese Patent Application No. 2013-103192 filed on May 15, 2013, the entire disclosure of which is incorporated herein.

Some or all of the above embodiments can be described as in the following supplementary notes, but are not limited thereto.

(Appendix 1)
An information processing apparatus comprising:
generalization policy linkage determination means for determining, in cooperation with another device, a common generalization policy, which is a generalization policy for anonymization of data used in common with the other device; and
anonymization means for anonymizing data based on the common generalization policy.

(Appendix 2)
The information processing apparatus according to appendix 1, wherein the generalization policy linkage determination means determines a generalization policy of at least some attributes of the data to be anonymized as the common generalization policy.

(Appendix 3)
The information processing apparatus according to appendix 2, wherein the anonymization means anonymizes data based on, in addition to the common generalization policy, at least a part of the generalization policies of attributes not included in the common generalization policy.

(Appendix 4)
The information processing apparatus according to appendix 2 or appendix 3, wherein the generalization policy linkage determination means determines, together with the other device, the attributes used for the common generalization policy.

(Appendix 5)
The information processing apparatus according to any one of appendix 1 to appendix 4, wherein the common generalization policy is a generalization policy of a quasi-identifier and includes a generalization width and/or a boundary of the quasi-identifier.

(Appendix 6)
The information processing apparatus according to any one of appendix 1 to appendix 5, wherein the generalization policy linkage determination means determines the common generalization policy based on a difficulty level, which is an index indicating the difficulty of ensuring the anonymity required in the own device and in the other device when anonymizing data.

(Appendix 7)
The information processing apparatus according to appendix 6, wherein the difficulty level is calculated based on the size of the data to be anonymized or on anonymity.

(Appendix 8)
The information processing apparatus according to any one of appendix 1 to appendix 7, wherein the generalization policy linkage determination means comprises:
anonymity parameter storage means for holding anonymity parameters, which are information for determining whether the generalization policy used by the anonymization means is the common generalization policy;
common parameter setting means for determining the common generalization policy in cooperation with the other device; and
communication means for mediating communication between the common parameter setting means and the other device.

(Appendix 9)
The information processing apparatus according to any one of appendix 1 to appendix 8, further comprising:
pre-anonymization data storage means for storing pre-anonymization data to be anonymized by the anonymization means;
anonymized data storage means for storing anonymized data anonymized by the anonymization means; and
transmission means for transmitting the anonymized data to a user device.

(Appendix 10)
The information processing apparatus according to any one of appendix 1 to appendix 9, wherein the generalization policy linkage determination means determines in advance a device to be prioritized, or an attribute whose generalization policy is to be prioritized, when the common generalization policy is determined in cooperation among a plurality of devices.

(Appendix 11)
An information anonymization method comprising:
determining, in cooperation with another device, a common generalization policy, which is a generalization policy for anonymization of data used in common with the other device; and
anonymizing data based on the common generalization policy.

(Appendix 12)
A computer-readable recording medium storing a program that causes a computer device to execute:
a process of determining, in cooperation with another device, a common generalization policy, which is a generalization policy for anonymization of data used in common with the other device; and
a process of anonymizing data based on the common generalization policy.

DESCRIPTION OF SYMBOLS
10 Information processing apparatus
20 User apparatus
30 Information processing apparatus
40 Information processing system
60 Information processing apparatus
110 Anonymization unit
120 Generalization policy cooperation determination unit
130 Anonymity parameter storage unit
140 Common parameter setting unit
150 Communication unit
160 Pre-anonymization data storage unit
170 Anonymized data storage unit
180 Transmission unit
610 CPU
620 ROM
630 RAM
640 Internal storage device
650 IOC
660 Input device
670 Display device
680 NIC
700 Storage medium

Claims

1. An information processing apparatus comprising:
generalization policy linkage determination means for determining, in cooperation with another device, a common generalization policy, which is a generalization policy for anonymization of data used in common with the other device; and
anonymization means for anonymizing data based on the common generalization policy.
2. The information processing apparatus according to claim 1, wherein the generalization policy linkage determination means determines a generalization policy of at least some attributes of the data to be anonymized as the common generalization policy.
3. The information processing apparatus according to claim 2, wherein the anonymization means anonymizes data based on, in addition to the common generalization policy, at least a part of the generalization policies of attributes not included in the common generalization policy.
4. The information processing apparatus according to claim 2, wherein the generalization policy linkage determination means determines, together with the other device, the attributes used for the common generalization policy.
5. The information processing apparatus according to any one of claims 1 to 4, wherein the common generalization policy is a generalization policy of a quasi-identifier and includes a generalization width and/or a boundary of the quasi-identifier.
6. The information processing apparatus according to any one of claims 1 to 5, wherein the generalization policy linkage determination means determines the common generalization policy based on a difficulty level, which is an index indicating the difficulty of ensuring the anonymity required in the own device and in the other device when anonymizing data.
7. The information processing apparatus according to claim 6, wherein the difficulty level is calculated based on the size of the data to be anonymized or on anonymity.
8. The information processing apparatus according to claim 1, wherein the generalization policy linkage determination means comprises:
anonymity parameter storage means for holding anonymity parameters, which are information for determining whether the generalization policy used by the anonymization means is the common generalization policy;
common parameter setting means for determining the common generalization policy in cooperation with the other device; and
communication means for mediating communication between the common parameter setting means and the other device.
9. The information processing apparatus according to claim 1, further comprising:
pre-anonymization data storage means for storing pre-anonymization data to be anonymized by the anonymization means;
anonymized data storage means for storing anonymized data anonymized by the anonymization means; and
transmission means for transmitting the anonymized data to a user apparatus.
10. The information processing apparatus according to any one of claims 1 to 9, wherein the generalization policy linkage determination means determines in advance a device to be prioritized, or an attribute whose generalization policy is to be prioritized, when the common generalization policy is determined in cooperation among a plurality of devices.
11. An information anonymization method comprising:
determining, in cooperation with another device, a common generalization policy, which is a generalization policy for anonymization of data used in common with the other device; and
anonymizing data based on the common generalization policy.
12. A computer-readable recording medium storing a program that causes a computer device to execute:
a process of determining, in cooperation with another device, a common generalization policy, which is a generalization policy for anonymization of data used in common with the other device; and
a process of anonymizing data based on the common generalization policy.