WO2016067566A1 - Information processing device, information processing method, and recording medium - Google Patents

Information processing device, information processing method, and recording medium

Info

Publication number
WO2016067566A1
WO2016067566A1 (PCT Application No. PCT/JP2015/005289)
Authority
WO
WIPO (PCT)
Prior art keywords
information processing
specific
risk
processing apparatus
personal data
Prior art date
Application number
PCT/JP2015/005289
Other languages
French (fr)
Japanese (ja)
Inventor
Takao Takenouchi (竹之内 隆夫)
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date
Filing date
Publication date
Application filed by NEC Corporation (日本電気株式会社)
Priority to JP2016556210A (published as JPWO2016067566A1)
Publication of WO2016067566A1

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/62 - Protecting access to data via a platform, e.g. using keys or access control rules

Definitions

  • The present invention relates to the anonymity of data, and more particularly to an information processing apparatus, information processing method, and recording medium concerned with the possibility that an individual is specified on the basis of data.
  • In recent years, the use and application of data about individuals (hereinafter referred to as personal data) has been anticipated (see, for example, Non-Patent Document 1).
  • In order to promote the use of personal data, it has been proposed that “data whose individual identifiability has been reduced” may be transferred to a third party without the consent of the person concerned. For example, it is described that such data may be provided to a company other than the one that collected it.
  • Personal data generally includes one or more attribute values.
  • A database containing personal data holds the attribute values of the personal data as column values.
  • Non-Patent Document 2 explains the term “specification” in contrast with the term “identification”. According to Non-Patent Document 2, “identification” means knowing that certain information is the information of one person, whereas “specification” means knowing whose information it is. For example, suppose the personal data of a certain individual (one person) is stored in a certain record in a database. Here, “identification” means that a record is uniquely determined based on a certain attribute value (a value in a database column). “Specification”, on the other hand, means that the identified record can be tied to a particular person (the individual mentioned above). Naturally, a record cannot be “specified” unless it has been “identified”.
  • As techniques for preventing “identification”, k-anonymity and k-anonymization have been disclosed (see, for example, Non-Patent Document 3).
  • Since Non-Patent Document 1 and Non-Patent Document 2 do not disclose specific technical content for anonymization to prevent specification and identification, the following description mainly refers to the technique of Non-Patent Document 3, with reference to FIG. 16.
  • Each of the left table and the right table shown in FIG. 16 stores personal data for four persons.
  • each record represented by each row is personal data of each individual (each person).
  • Each table includes columns of record number (No.), zip code, age, and medical condition.
  • “Record number (No.)” is a number that uniquely identifies a record.
  • “Zip code”, “age”, and “medical condition” are attributes.
  • An attribute consists of an attribute name and an attribute value.
  • “Zip code”, “age”, and “medical condition” are attribute names.
  • “1230001”, “28”, and “heart disease” are attribute values.
  • For example, the attribute with the attribute name “zip code” has the attribute value “1230001”.
  • The tables shown in FIG. 16 are assumed to be hospital medical record data.
  • an identifier is an attribute such as a name that can uniquely identify an individual.
  • A quasi-identifier is an attribute that can identify an individual when combined with other attributes. A sensitive attribute is an attribute that the individual does not want others to know. Attributes other than these are classed as other attributes.
  • An attribute may be both a quasi-identifier and a sensitive attribute.
  • The table on the left side of FIG. 16 collects records from hospital medical record data from which identifiers such as names have been deleted.
  • The records in the left table do not include an identifier. It therefore appears that, based on the attributes of a record, one cannot tell whose record it is.
  • However, suppose that an analyst (hereinafter also referred to as an attacker) who obtains the left table knows, as attributes of user A, that user A's zip code is 1230001, that user A is 28 years old, and that user A attends this hospital. In this case, the attacker can tell that record No. 1 in the left table is user A's record. As a result, the attacker learns that user A's medical condition, which is a sensitive attribute, is heart disease. Thus, the attacker can specify a record based on the zip code and age in the left table.
  • In this example, the zip code and age act as quasi-identifiers.
  • Based on such quasi-identifiers, a record of personal data may be identified and specified.
  • As a result, a sensitive attribute that the user does not want known may become known to the attacker. A technique called k-anonymization is therefore used.
  • the table on the right side of FIG. 16 is a table including records obtained by processing quasi-identifiers using k-anonymization.
  • Specifically, the zip code and age attribute values have been processed so as to satisfy k-anonymity.
  • Here, 2 is used as the value of k. That is, in the table on the right side of FIG. 16, the number of records sharing any given combination of quasi-identifier values is 2 or more.
  • Non-Patent Document 3 defines that a table in which the number of records identified based on the quasi-identifier is k or more satisfies k-anonymity. Processing to satisfy k-anonymity is called k-anonymization.
  • A table satisfying k-anonymity can prevent “identification” down to fewer than k records.
  • Consequently, a table satisfying k-anonymity can also prevent “specification” among fewer than k records.
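  • As an illustration, the following is a minimal Python sketch (not part of the patent text) of the k-anonymity check described above: a table satisfies k-anonymity when every combination of quasi-identifier values appears in at least k records. The record values are illustrative stand-ins for the right-hand table of FIG. 16.

```python
from collections import Counter

def satisfies_k_anonymity(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination appears in >= k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in counts.values())

# Illustrative stand-in for the generalized (right-hand) table of FIG. 16.
table = [
    {"zip": "123****", "age": "20s", "condition": "heart disease"},
    {"zip": "123****", "age": "20s", "condition": "cold"},
    {"zip": "456****", "age": "30s", "condition": "influenza"},
    {"zip": "456****", "age": "30s", "condition": "cancer"},
]
print(satisfies_k_anonymity(table, ["zip", "age"], k=2))  # True
```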
  • The techniques of k-anonymity and k-anonymization described in Non-Patent Document 3 address the possibility of “identification” (hereinafter also referred to as the “identification risk”). However, the risks to personal data also include the “specification” risk described in Non-Patent Document 1.
  • Current anonymity technology addresses the identification risk and does not address the possibility of “specification” (hereinafter referred to as the “specific risk”). That is, the technique described in Non-Patent Document 3 cannot cope with the specific risk.
  • Anonymizing data while considering only the identification risk, and not the specific risk, may protect privacy more than necessary, that is, it may process the data more than necessary.
  • The assumption that the attacker knows the quasi-identifiers corresponds to considering the identification risk. In practice, however, the attacker may not know the quasi-identifiers. If the value of k is set on the assumption that the attacker knows the quasi-identifiers even when this is unlikely, k becomes larger than necessary. An anonymization apparatus relying only on the identification risk may therefore process the data more than necessary, and the usefulness of the anonymized data decreases more than necessary.
  • Thus, the technique of Non-Patent Document 3 has the problem of processing data more than necessary.
  • An object of the present invention is to provide an information processing apparatus, an information processing method, and a recording medium that can solve the above problems and calculate (evaluate) a specific risk.
  • An information processing apparatus according to the present invention includes identification risk calculation means for calculating an identification risk indicating the possibility that the data of a designated individual is determined to be the data of one person, and specific risk calculation means for calculating, based on the identification risk and a specific individual arrival rate indicating the possibility that the data is determined to be the data of the designated individual, a specific risk indicating the possibility that the data of the designated individual is determined to be the data of that designated individual.
  • An information processing system according to the present invention includes an information processing apparatus having the identification risk calculation means and the specific risk calculation means described above, personal information storage means for storing personal data as information on a plurality of individuals, and overall risk calculation means for calculating a risk corresponding to the personal data as a whole based on the specific risks, calculated by the information processing apparatus, corresponding to all individuals included in the personal data.
  • An information processing method according to the present invention calculates an identification risk indicating the possibility that the data of a designated individual is determined to be the data of one person, and calculates, based on the identification risk and a specific individual arrival rate indicating the possibility that the data is determined to be the data of the designated individual, a specific risk indicating the possibility that the data of the designated individual is determined to be the data of that designated individual.
  • A recording medium according to the present invention computer-readably records a program for causing a computer to execute a process of calculating an identification risk indicating the possibility that the data of a designated individual is determined to be the data of one person, and a process of calculating, based on the identification risk and a specific individual arrival rate indicating the possibility that the data is determined to be the data of the designated individual, a specific risk indicating the possibility that the data of the designated individual is determined to be the data of that designated individual.
  • FIG. 1 is a block diagram showing an example of the configuration of an information processing system including an information processing apparatus according to the first embodiment of the present invention.
  • FIG. 2 is a diagram illustrating an example of personal data included in the personal data storage unit according to the first embodiment.
  • FIG. 3 is a flowchart illustrating an example of the operation of the information processing apparatus according to the first embodiment.
  • FIG. 4 is a block diagram illustrating an example of another configuration of the information processing system including the information processing apparatus according to the first embodiment.
  • FIG. 5 is a flowchart illustrating an example of the operation of the information processing system according to the first embodiment.
  • FIG. 6 is a diagram illustrating a calculation result of a specific risk used for explaining the first embodiment.
  • FIG. 7 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the second embodiment.
  • FIG. 8 is a diagram illustrating an example of data stored by the storage unit according to the second embodiment.
  • FIG. 9 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the third embodiment.
  • FIG. 10 is a diagram illustrating an example of data stored by the storage unit according to the third embodiment.
  • FIG. 11 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the fourth embodiment.
  • FIG. 12 is a diagram illustrating an example of data stored by the storage unit according to the fourth embodiment.
  • FIG. 13 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the fifth embodiment.
  • FIG. 14 is a diagram illustrating an example of data stored by the storage unit according to the fifth embodiment.
  • FIG. 15 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the sixth embodiment.
  • FIG. 16 is a diagram for explaining k-anonymization and k-anonymity.
  • FIG. 17 is a block diagram illustrating an example of a configuration of a modification of the information processing apparatus according to the first embodiment.
  • FIG. 18 is a block diagram illustrating an example of a configuration of a modified example of the information processing apparatus according to the first embodiment.
  • “Identification” means that certain information (data) is known (determined) to be the information (data) of one person.
  • “Specification” means knowing (determining) whose information (data) certain information (data) is.
  • A “record” is the personal data of one individual.
  • A record includes a plurality of attributes.
  • An “attribute” is a type of data included in a record.
  • the attribute includes an attribute name and an attribute value.
  • A “quasi-identifier” is an attribute that can identify an individual when combined with other attributes.
  • “Sensitive attributes” are attributes (data) that individuals do not want to disclose.
  • The “specific individual arrival rate” is a value indicating the possibility that data (a record) is determined (specified) to be the data of a specific individual. More specifically, it is a value used, together with the calculated identification risk, to calculate the specific risk; in other words, it is used to correct an identification risk calculated from information for identifying a record. Concretely, the specific individual arrival rate is, for example, the possibility that an attacker acquires the information (for example, the values of the quasi-identifiers described above) needed to identify a record.
  • The specific individual arrival rate can differ from one quasi-identifier (or combination of quasi-identifiers) to another.
  • In general, the possibility of obtaining the value of a single quasi-identifier is higher than the possibility of obtaining the combination of values of a plurality of quasi-identifiers that includes it. Therefore, the specific individual arrival rate for a single quasi-identifier is higher than the specific individual arrival rate for a combination of quasi-identifiers that includes it.
  • A higher specific individual arrival rate means a higher specific risk.
  • Anonymization that takes the specific risk into account therefore anonymizes data more strongly where the specific risk is high.
  • Next, rare attribute values of quasi-identifiers are considered.
  • For a rare attribute, the ratio of the specific risk to the identification risk is higher than for a common attribute. Therefore, the specific individual arrival rate for a rare attribute is higher than the specific individual arrival rate for a non-rare attribute.
  • The value (r) of the specific individual arrival rate ranges between 0 and 1 (0 ≤ r ≤ 1).
  • FIG. 1 is a block diagram showing an example of the configuration of an information processing system 300 including an information processing apparatus 100 according to the first embodiment of the present invention.
  • the information processing system 300 includes an information processing apparatus 100 and a personal data storage unit 200.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • the personal data storage unit 200 stores personal data that is a target of specific risk evaluation processing in the information processing apparatus 100.
  • FIG. 2 is a diagram illustrating an example of personal data stored by the personal data storage unit 200.
  • the personal data storage unit 200 stores “user ID”, which is an identifier of a user (individual), in association with attributes related to the user as personal data.
  • the personal data shown in FIG. 2 includes age, sex, and disease name as attributes relating to the user. For example, “age” (attribute name) of “user1” is “20” (attribute value). Similarly, the “sex” (attribute name) of user1 is “male” (attribute value). The “disease name” (attribute name) of user1 is “cold” (attribute value).
  • In this description, the quasi-identifiers are age and sex. That is, age and sex are the attributes to be anonymized.
  • The sensitive attribute is the disease name.
  • The information processing apparatus 100 calculates (evaluates) the specific risk of a designated individual in the personal data stored by the personal data storage unit 200.
  • the information processing apparatus 100 includes a reception unit 110, an identification risk calculation unit 120, and a specific risk calculation unit 130.
  • the receiving unit 110 receives a specific individual arrival rate from a device (not shown).
  • the device that transmits the specific individual arrival rate is not particularly limited.
  • the receiving unit 110 may receive a specific individual arrival rate from a device operated by the user.
  • the receiving unit 110 may read the specific individual arrival rate from a storage device (not shown).
  • these are collectively referred to as “the receiving unit 110 receives a specific individual arrival rate”.
  • the identification risk calculation unit 120 calculates the identification risk.
  • the specific risk calculation unit 130 calculates the specific risk based on the identification risk and the specific individual arrival rate.
  • FIG. 3 is a flowchart showing an example of the operation of the information processing apparatus 100 according to the first embodiment. As illustrated in FIG. 3, the information processing apparatus 100 performs operations from steps S101 to S104 described below.
  • It is assumed that the individual (user) whose specific risk is to be calculated has been designated to the information processing apparatus 100 in advance. The receiving unit 110 then receives the specific individual arrival rate (r).
  • the identification risk calculation unit 120 acquires personal data from the personal data storage unit 200 (step S101). In this description, as already described, the identification risk calculation unit 120 acquires the personal data shown in FIG.
  • Based on the personal data, the identification risk calculation unit 120 counts how many records could be the record of the designated individual. It then calculates the identification risk of the designated individual based on that number (m) of records (step S102). The greater the value of m, the more difficult it is to identify the designated individual among those records.
  • the identification risk calculation unit 120 calculates the identification risk using a preset calculation method. There is no particular limitation on the method for calculating the identification risk. For example, when m records are identified, the identification risk calculation unit 120 may calculate “1 / m” as the identification risk. Then, the identification risk calculation unit 120 transmits the calculated identification risk to the specific risk calculation unit 130.
  • An example of the operation of the identification risk calculation unit 120 will be described with reference to the personal data shown in FIG. 2. The designated individual is “user1”.
  • The values of the two quasi-identifiers (age and sex) of “user1” are “20 (age)” and “male (sex)”, respectively. In the personal data of FIG. 2, the number of records having these quasi-identifier values is m = 2, so the identification risk is 1/2 = 0.5.
  • The calculation of the identification risk is not limited to division; the identification risk calculation unit 120 may use other operations, such as the other arithmetic operations or roots.
  • the identification risk calculation unit 120 may use a general identification risk calculation method.
  • the identification risk calculation unit 120 transmits 0.5 as the identification risk to the specific risk calculation unit 130.
  • the specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120.
  • the specific risk calculation unit 130 acquires the specific individual arrival rate (r) of the individual from the reception unit 110 (step S103).
  • the specific individual arrival rate (r) is 0.3.
  • the specific risk calculation unit 130 calculates the specific risk using a preset calculation method.
  • As described above, the identification risk (1/m) is 0.5.
  • In this description, “specific risk = (1/m) × r” is used as the calculation of the specific risk. The specific risk of user1 is therefore 0.5 × 0.3 = 0.15 (step S104).
  • However, the calculation of the specific risk is not limited to this formula.
  • the specific risk calculation unit 130 is not limited to multiplication or division as a calculation formula, and addition or subtraction may be used.
  • the specific risk calculation unit 130 is not limited to the four rules, and may use a calculation such as a power root or logarithm. Furthermore, the specific risk calculation unit 130 may combine these calculations.
  • the specific risk calculation unit 130 may transmit the specific risk to a predetermined device (for example, a user device).
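  • The following is a minimal Python sketch, not taken from the patent, of the calculation of steps S102 to S104 as described above: the identification risk is 1/m, where m is the number of records sharing the designated individual's quasi-identifier values, and the specific risk is (1/m) × r. Only user1's attribute values appear in the text; the other records are assumptions added so that m = 2.

```python
# Illustrative personal data; only user1's values are given in the text,
# the other records are assumed so that two records share (20, male).
personal_data = [
    {"user_id": "user1", "age": 20, "sex": "male",   "disease": "cold"},
    {"user_id": "user2", "age": 20, "sex": "male",   "disease": "influenza"},  # assumed
    {"user_id": "user3", "age": 35, "sex": "female", "disease": "asthma"},     # assumed
]
QUASI_IDENTIFIERS = ["age", "sex"]

def identification_risk(records, quasi_identifiers, target_id):
    """Identification risk 1/m, where m counts records matching the target's quasi-identifiers."""
    target = next(r for r in records if r["user_id"] == target_id)
    key = tuple(target[q] for q in quasi_identifiers)
    m = sum(1 for r in records if tuple(r[q] for q in quasi_identifiers) == key)
    return 1.0 / m

def specific_risk(identification_risk_value, arrival_rate):
    """Specific risk = identification risk (1/m) multiplied by the arrival rate r."""
    return identification_risk_value * arrival_rate

risk = identification_risk(personal_data, QUASI_IDENTIFIERS, "user1")  # 0.5 (m = 2)
print(specific_risk(risk, arrival_rate=0.3))                            # 0.15
```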
  • the information processing apparatus 100 calculates a specific risk for a certain individual.
  • the information processing apparatus 100 may calculate specific risks for a plurality of or all individuals stored in the personal data storage unit 200.
  • FIG. 4 is a block diagram illustrating an example of another configuration of an information processing system 310 that includes the information processing apparatus 100 according to the first embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • The information processing system 310 may include an overall risk calculation unit 240 in addition to the information processing apparatus 100. Furthermore, the information processing system 310 may include a specific risk calculation result storage unit 230.
  • the specific risk calculation result storage unit 230 stores a specific risk for each individual.
  • FIG. 6 is a diagram showing a calculation result of a specific risk for each user.
  • the formula shown on the right side of the table is a formula for calculating the specific risk of each individual.
  • The overall risk calculation unit 240 uses the information processing apparatus 100 to calculate the specific risk of every individual stored in the personal data storage unit 200, and stores the calculated specific risks in the specific risk calculation result storage unit 230. After the specific risks of all individuals have been calculated, the overall risk calculation unit 240 calculates the overall risk of the personal data stored in the personal data storage unit 200 using all of the specific risks stored in the specific risk calculation result storage unit 230.
  • The “overall risk” is a value calculated using a predetermined calculation formula based on the specific risks of the individuals.
  • the overall risk may be a total value, an arithmetic average value, a median value, or a mode value of specific risks of all individuals.
  • the overall risk may be the maximum value or the minimum value in the specific risk of all individuals.
  • The overall risk may be the sum or the average of a predetermined number of the highest specific risks among all individual specific risks.
  • The overall risk may be a value describing the distribution of the specific risks of all individuals, such as the standard deviation.
  • the overall risk calculation unit 240 may calculate not only one value but a plurality of values (for example, an average value and variance) as the overall risk.
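  • The following is a minimal Python sketch, under the assumptions above, of how the overall risk calculation unit 240 might aggregate the per-individual specific risks; which aggregate (or set of aggregates) is used is a design choice, and the input values are illustrative.

```python
import statistics

def overall_risk(specific_risks, top_n=2):
    """Return several candidate aggregates of the per-individual specific risks."""
    ordered = sorted(specific_risks, reverse=True)
    return {
        "max": ordered[0],
        "mean": statistics.mean(specific_risks),
        "median": statistics.median(specific_risks),
        "stdev": statistics.pstdev(specific_risks),
        "top_n_mean": statistics.mean(ordered[:top_n]),
    }

print(overall_risk([0.15, 0.15, 0.30, 0.05]))  # illustrative specific risks
```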
  • FIG. 5 is a flowchart showing an example of the operation of the information processing system 310.
  • In the following description, the overall risk calculation unit 240 controls the operation, but the controlling entity need not be limited to this.
  • For example, the information processing apparatus 100 may perform the control, including control of the overall risk calculation unit 240.
  • Alternatively, a control device (not shown) may control the components included in the information processing system 310.
  • the overall risk calculation unit 240 instructs the information processing apparatus 100 to acquire personal data.
  • the information processing apparatus 100 acquires personal data from the personal data storage unit 200 (step S201).
  • the overall risk calculation unit 240 instructs the information processing apparatus 100 to calculate a specific risk corresponding to each individual of the personal data (step S202).
  • the information processing apparatus 100 calculates a specific risk corresponding to the designated individual (step S203).
  • the overall risk calculation unit 240 stores the calculated specific risk in the specific risk calculation result storage unit 230 in association with the individual (step S204).
  • the overall risk calculation unit 240 calculates the overall risk based on the specific risks of all individuals (step S205).
  • As described above, the information processing apparatus 100 according to the present embodiment achieves the effect that the specific risk of a designated individual can be calculated.
  • The reason is as follows. The receiving unit 110 of the present embodiment receives the specific individual arrival rate, and the identification risk calculation unit 120 calculates the identification risk of the individual. The specific risk calculation unit 130 can therefore calculate the specific risk of the designated individual based on the identification risk and the specific individual arrival rate.
  • Furthermore, a system using the information processing apparatus 100 can determine appropriate anonymization of personal data using the specific risk calculated by the information processing apparatus 100.
  • In other words, the information processing apparatus 100 can achieve the effect of preventing unnecessary data processing.
  • This is because a system using the information processing apparatus 100 can use the specific risk, in addition to the identification risk, when determining the degree of anonymization (for example, the value of k for k-anonymity).
  • Furthermore, the information processing system 310 including the information processing apparatus 100 according to the present embodiment achieves the effect that the overall risk of the personal data as a whole can be calculated.
  • the reason is that the overall risk calculation unit 240 can calculate the overall risk of personal data based on the specific risk of all individuals.
  • each component of the information processing apparatus 100 may be configured with a hardware circuit.
  • each component may be configured using a plurality of apparatuses connected via a network.
  • FIG. 18 is a block diagram illustrating an example of the configuration of the information processing apparatus 106 according to the first modification of the present embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • the information processing apparatus 106 includes an identification risk calculation unit 120 and a specific risk calculation unit 130. Each configuration of the information processing apparatus 106 receives personal data and a specific individual arrival rate via a network (not shown) and operates in the same manner as each configuration of the information processing apparatus 100.
  • the information processing apparatus 106 configured in this manner can achieve the same effects as the information processing apparatus 100.
  • each configuration of the information processing apparatus 106 operates in the same manner as the configuration of the information processing apparatus 100 and can calculate a specific risk.
  • the information processing apparatus 106 is the minimum configuration in the embodiment of the present invention.
  • As Modification 2, a further modification of the information processing apparatus 100 and the information processing apparatus 106 will be described, using the information processing apparatus 100 as an example.
  • the plurality of components may be configured with a single piece of hardware.
  • the information processing apparatus 100 may be realized as a computer apparatus including a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory).
  • Alternatively, the information processing apparatus 100 may be realized as a computer apparatus that further includes an input/output connection circuit (IOC: Input/Output Circuit) and a network interface circuit (NIC: Network Interface Circuit).
  • FIG. 17 is a block diagram showing an example of the configuration of the information processing apparatus 600 according to this modification.
  • the information processing apparatus 600 includes a CPU 610, a ROM 620, a RAM 630, an internal storage device 640, an IOC 650, and a NIC 680, and constitutes a computer device.
  • the CPU 610 reads a program from ROM 620.
  • the CPU 610 controls the RAM 630, the internal storage device 640, the IOC 650, and the NIC 680 based on the read program.
  • the computer including the CPU 610 controls these configurations to realize the functions as the reception unit 110, the identification risk calculation unit 120, and the specific risk calculation unit 130 illustrated in FIG. Further, the computer including the CPU 610 may control these configurations to realize the function as the overall risk calculation unit 240 shown in FIG.
  • the CPU 610 may use the RAM 630 or the internal storage device 640 as a temporary storage medium for the program when realizing each function.
  • the CPU 610 may read a program included in the storage medium 700 storing the program so as to be readable by a computer by using a storage medium reading device (not shown). Alternatively, the CPU 610 may receive a program from an external device (not shown) via the NIC 680, store the program in the RAM 630, and operate based on the stored program.
  • ROM 620 stores programs executed by CPU 610 and fixed data.
  • the ROM 620 is, for example, a P-ROM (Programmable-ROM) or a flash ROM.
  • the RAM 630 temporarily stores programs executed by the CPU 610 and data.
  • the RAM 630 is, for example, a D-RAM (Dynamic-RAM).
  • the internal storage device 640 stores data and programs stored in the information processing device 600 for a long period of time. Further, the internal storage device 640 may operate as a temporary storage device for the CPU 610.
  • the internal storage device 640 is, for example, a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), or a disk array device.
  • the internal storage device 640 may operate as the personal data storage unit 200.
  • the ROM 620 and the internal storage device 640 are nonvolatile storage media.
  • the RAM 630 is a volatile storage medium.
  • the CPU 610 can operate based on a program stored in the ROM 620, the internal storage device 640, or the RAM 630. That is, the CPU 610 can operate using a nonvolatile storage medium or a volatile storage medium.
  • the IOC 650 mediates data between the CPU 610, the input device 660, and the display device 670.
  • the IOC 650 is, for example, an IO interface card or a USB (Universal Serial Bus) card.
  • the input device 660 is a device that receives an input instruction from an operator of the information processing apparatus 600.
  • the input device 660 is, for example, a keyboard, a mouse, or a touch panel.
  • the display device 670 is a device that displays information to the operator of the information processing apparatus 600.
  • the display device 670 is a liquid crystal display, for example.
  • the NIC 680 relays data exchange with an external device (not shown) via the network.
  • the NIC 680 is, for example, a LAN (Local Area Network) card.
  • the information processing apparatus 600 configured as described above can achieve the same effects as the information processing apparatus 100.
  • the information processing apparatus 101 according to the second embodiment is different from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is determined according to the attribute of the quasi-identifier.
  • This embodiment can cope with a case where a plurality of attackers have different possibilities of knowing the quasi-identifier.
  • FIG. 7 is a block diagram illustrating an example of the configuration of the information processing apparatus 101 according to the second embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • The information processing apparatus 101 includes an acquisition unit (first acquisition unit) 111 in place of the reception unit 110 of the information processing apparatus 100, which receives the specific individual arrival rate.
  • the information processing apparatus 101 includes a storage unit 211.
  • the information processing apparatus 101 may be configured using a computer shown in FIG. Further, the information processing apparatus 101 may use the storage unit 211 as an external apparatus connected via a network. Further, the acquisition unit 111 may receive information stored in the storage unit 211 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 101 may not include the storage unit 211.
  • the storage unit 211 stores the quasi-identifier or combination of quasi-identifiers and the corresponding specific individual arrival rate in association with each other.
  • FIG. 8 is a diagram illustrating an example of data stored by the storage unit 211.
  • In FIG. 8, the specific individual arrival rate corresponding to the combination of the quasi-identifiers age and sex is 0.3.
  • In contrast, the specific individual arrival rate for age alone is 0.6, higher than that value. This is because the possibility of acquiring the values of a combination of quasi-identifiers is lower than the possibility of acquiring the value of a single quasi-identifier included in that combination.
  • The following description of the present embodiment uses the data of FIG. 8.
  • the acquisition unit 111 acquires the specific individual arrival rate corresponding to the quasi-identifier or the combination of quasi-identifiers from the storage unit 211.
  • the acquiring unit 111 may use the product of the specific individual arrival rates of the quasi-identifiers included in the combination as the specific individual arrival rate of the combination of the quasi-identifiers.
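  • The following is a minimal Python sketch of the lookup performed by the storage unit 211 and the acquisition unit 111. The values 0.3 (age and sex) and 0.6 (age) follow the description of FIG. 8; the rate for sex alone and the product fallback for unstored combinations are assumptions for illustration.

```python
import math

# Specific individual arrival rates keyed by quasi-identifier combination.
ARRIVAL_RATES = {
    frozenset({"age", "sex"}): 0.3,  # from the description of FIG. 8
    frozenset({"age"}): 0.6,         # from the description of FIG. 8
    frozenset({"sex"}): 0.7,         # assumed value for illustration
}

def get_arrival_rate(quasi_identifiers):
    """Look up the rate for the combination; fall back to the product of single-identifier rates."""
    key = frozenset(quasi_identifiers)
    if key in ARRIVAL_RATES:
        return ARRIVAL_RATES[key]
    return math.prod(ARRIVAL_RATES[frozenset({q})] for q in quasi_identifiers)

print(get_arrival_rate(["age", "sex"]))  # 0.3
```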
  • the information processing apparatus 101 executes step S101 and step S102 as in the first embodiment.
  • Instead of step S103, the information processing apparatus 101 executes the following operation.
  • the specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. Then, the specific risk calculation unit 130 passes the information on the specified individual semi-identifier or combination of semi-identifiers to the acquisition unit 111 and requests acquisition of the specific individual arrival rate.
  • the acquisition unit 111 acquires the specific individual arrival rate corresponding to the requested quasi-identifier or combination of quasi-identifiers from the storage unit 211. Then, the acquisition unit 111 returns the specific individual arrival rate to the specific risk calculation unit 130.
  • In this description, the acquisition unit 111 acquires 0.3 as the specific individual arrival rate and transmits it to the specific risk calculation unit 130.
  • the specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
  • the information processing apparatus 101 can produce an effect that a specific risk can be calculated according to the attribute of the quasi-identifier. That is, the second embodiment can produce an effect that it can cope with a case where the quasi-identifiers that the attacker may know are different.
  • The reason is that the first acquisition unit 111 acquires the specific individual arrival rate corresponding to a quasi-identifier or a combination of quasi-identifiers, and the specific risk calculation unit 130 calculates the specific risk based on that specific individual arrival rate.
  • the information processing apparatus 102 according to the third embodiment differs from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is determined according to a combination of conditions for the attributes of the quasi-identifier.
  • FIG. 9 is a block diagram illustrating an example of the configuration of the information processing apparatus 102 according to the third embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • the information processing apparatus 102 includes an acquisition unit 112 instead of the reception unit 110 of the information processing apparatus 100.
  • the acquisition unit 112 may be referred to as a second acquisition unit.
  • the information processing apparatus 102 includes a storage unit 212.
  • the information processing apparatus 102 may be configured using a computer shown in FIG. Further, the information processing apparatus 102 may use the storage unit 212 as an external apparatus connected via a network. Further, the acquisition unit 112 may receive information stored in the storage unit 212 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 102 may not include the storage unit 212.
  • The storage unit 212 stores, in association with each other, a first attribute (first attribute name) used as a condition, a second attribute (second attribute name) for which the specific individual arrival rate is set, and a function (for example, a combination of a conditional expression and a calculation expression).
  • the first and second attributes may be a combination of a plurality of attributes.
  • FIG. 10 is a diagram illustrating an example of data stored by the storage unit 212.
  • In FIG. 10, the attribute used for the determination is the first attribute, and the attribute for which the specific individual arrival rate is set is the second attribute.
  • The specific individual arrival rate column in FIG. 10 holds a function for calculating the specific individual arrival rate to be set. According to the function shown in FIG. 10, for example, when the attribute name (sex) has the attribute value (male), the specific individual arrival rate for the attribute (age) is 0.2; when the attribute value is (female), the specific individual arrival rate for the attribute (age) is 0.1. The following description of the present embodiment uses the data of FIG. 10.
  • the acquisition unit 112 acquires a specific individual arrival rate corresponding to the specified attribute.
  • the acquisition unit 112 may acquire personal data from the personal data storage unit 200 as necessary when acquiring the specific individual arrival rate.
  • the acquiring unit 112 may use the product of the specific individual arrival rates of the quasi-identifiers included in the combination as the specific individual arrival rate of the combination of the quasi-identifiers.
  • Alternatively, the acquisition unit 112 may select, as the specific individual arrival rate of a combination of quasi-identifiers, the minimum value or the maximum value among the specific individual arrival rates of the quasi-identifiers included in the combination.
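  • The following is a minimal Python sketch of the conditional lookup performed by the storage unit 212 and the acquisition unit 112. The conditions and rate values are assumptions modeled on the description of FIG. 10; the actual mapping is defined by the figure.

```python
# Each rule: (attribute the rate is set for, condition on the record, arrival rate).
RULES = [
    ("age", lambda record: record["sex"] == "male",   0.2),  # assumed, following FIG. 10
    ("age", lambda record: record["sex"] == "female", 0.1),  # assumed, following FIG. 10
]

def get_arrival_rate(record, target_attribute):
    """Return the arrival rate of the first rule whose condition matches the record."""
    for attribute, condition, rate in RULES:
        if attribute == target_attribute and condition(record):
            return rate
    raise LookupError("no matching condition")

user1 = {"user_id": "user1", "age": 20, "sex": "male"}
print(get_arrival_rate(user1, "age"))  # 0.2
```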
  • the information processing apparatus 102 executes step S101 and step S102 as in the first embodiment.
  • Instead of step S103, the information processing apparatus 102 executes the following operation.
  • The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. The specific risk calculation unit 130 then passes information on the designated individual's quasi-identifier or combination of quasi-identifiers to the acquisition unit 112 and requests acquisition of the specific individual arrival rate.
  • The acquisition unit 112 refers to the storage unit 212 and obtains the specific individual arrival rate corresponding to the attribute name and attribute value of the designated individual's quasi-identifier. The acquisition unit 112 then returns the specific individual arrival rate to the specific risk calculation unit 130.
  • the specific risk calculation unit 130 may transmit the specified individual attribute name and attribute value to the acquisition unit 112.
  • the specific risk calculation unit 130 may transmit the specified individual identifier or the specified individual attribute name to the acquisition unit 112.
  • the acquisition unit 112 refers to the personal data storage unit 200 and acquires data necessary for the determination.
  • the acquisition unit 112 acquires the attribute value (20) of the attribute name (age) of user1 from the data (data shown in FIG. 2) of the personal data storage unit 200. Then, the acquisition unit 112 acquires 0.3 as the specific individual arrival rate based on the data stored in the storage unit 212 (data shown in FIG. 10).
  • the specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
  • the information processing apparatus 102 can achieve the effect that the specific individual arrival rate can be determined according to the attribute name and the attribute value. That is, the third embodiment can produce an effect that an appropriate specific risk corresponding to a finer condition can be calculated.
  • The reason is that the second acquisition unit 112 acquires the specific individual arrival rate based on the condition set for the attribute value, or combination of attribute values, of the quasi-identifiers, and the specific risk calculation unit 130 calculates the specific risk based on the specific individual arrival rate corresponding to that condition.
  • the information processing apparatus 103 according to the fourth embodiment is different from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is determined according to the individual identification risk.
  • FIG. 11 is a block diagram illustrating an example of the configuration of the information processing apparatus 103 according to the fourth embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • the information processing apparatus 103 includes an acquisition unit 113 instead of the reception unit 110 of the information processing apparatus 100.
  • the acquisition unit 113 may be referred to as a third acquisition unit in order to distinguish it from the acquisition unit 112 or the like.
  • the information processing apparatus 103 includes a storage unit 213.
  • the information processing apparatus 103 may be configured using a computer shown in FIG. Further, the information processing apparatus 103 may use the storage unit 213 as an external apparatus connected via a network. Further, the acquisition unit 113 may receive information stored in the storage unit 213 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 103 may not include the storage unit 213.
  • the storage unit 213 stores the identification risk and the specific individual arrival rate in association with each other.
  • FIG. 12 is a diagram illustrating an example of data stored by the storage unit 213.
  • Increasing the specific individual arrival rate means increasing the specific risk.
  • Anonymization that uses the specific risk anonymizes data so that anonymity becomes high, that is, so that records with a high specific risk become hard to specify. Therefore, in FIG. 12, the specific individual arrival rate is set higher as the identification risk (1/m) becomes larger. This is because, as described above, rare attributes (attributes with a high identification risk) have a high specific individual arrival rate.
  • The data in FIG. 12 is merely an example for this embodiment.
  • The present embodiment need not be limited to such data.
  • The following description of this embodiment uses the data of FIG. 12.
  • the acquisition unit 113 acquires the specific individual arrival rate based on the identification risk.
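  • The following is a minimal Python sketch of the lookup performed by the storage unit 213 and the acquisition unit 113, assuming a simple band table in which a higher identification risk maps to a higher arrival rate. Only the pair (identification risk 0.5, arrival rate 0.8) is given in the text; the other bands are assumptions.

```python
# (minimum identification risk, specific individual arrival rate), highest band first.
RISK_BANDS = [
    (1.0, 0.9),  # assumed
    (0.5, 0.8),  # matches the worked example in the text
    (0.2, 0.5),  # assumed
    (0.0, 0.3),  # assumed
]

def get_arrival_rate(identification_risk):
    """Return the arrival rate of the first band whose threshold the identification risk reaches."""
    for threshold, rate in RISK_BANDS:
        if identification_risk >= threshold:
            return rate
    return RISK_BANDS[-1][1]

print(get_arrival_rate(0.5))  # 0.8
```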
  • the information processing apparatus 103 executes step S101 and step S102 as in the first embodiment.
  • Instead of step S103, the information processing apparatus 103 performs the following operation.
  • The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. The specific risk calculation unit 130 then passes the designated individual's identification risk and quasi-identifier information to the acquisition unit 113 and requests acquisition of the specific individual arrival rate.
  • the acquisition unit 113 acquires the possibility of individual identification (identification risk) from the identification risk calculation unit 120. Thereafter, the acquisition unit 113 refers to the storage unit 213 and acquires the specific individual arrival rate.
  • the acquisition unit 113 refers to the storage unit 213 (data illustrated in FIG. 12) and acquires 0.8 as the specific individual arrival rate.
  • the acquisition unit 113 returns the specific individual arrival rate to the specific risk calculation unit 130.
  • the specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
  • the information processing apparatus 103 can produce an effect that an appropriate specific risk can be calculated according to the identification risk.
  • The reason is that the third acquisition unit 113 acquires the specific individual arrival rate in consideration of the identification risk, and the specific risk calculation unit 130 then calculates the specific risk based on that specific individual arrival rate.
  • For example, suppose the birthday is a quasi-identifier.
  • A person whose birthday is February 29 of a leap year is more easily identified based on the quasi-identifier (birthday) than a person whose birthday falls on another day. Such a person is therefore more likely than others to hide his or her birthday (quasi-identifier). That is, when the quasi-identifier is the birthday, the identification risk for a person born on February 29 is higher than the identification risk for other persons. Even in such a case, the present embodiment can acquire an appropriate specific individual arrival rate according to the identification risk and calculate the specific risk.
  • The information processing apparatus 104 according to the fifth embodiment differs from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is changed according to the organization to which the personal data is provided (an organization that may become an attacker). This is because the organization (partner) to which personal data is provided may become an attacker against the provided personal data, and different organizations pose different risks as attackers.
  • FIG. 13 is a block diagram illustrating an example of the configuration of the information processing apparatus 104 according to the fifth embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • the information processing device 104 includes an acquisition unit 114 instead of the reception unit 110 of the information processing device 100.
  • When the acquisition unit 114 is distinguished from the acquisition unit 112 and the like, it is referred to as a fourth acquisition unit.
  • the information processing apparatus 104 includes a storage unit 214.
  • the information processing apparatus 104 may be configured using a computer shown in FIG. Further, the information processing apparatus 104 may use the storage unit 214 as an external apparatus connected via a network. Further, the acquisition unit 114 may receive information stored in the storage unit 214 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 104 may not include the storage unit 214.
  • the storage unit 214 stores the information providing destination in association with the specific individual arrival rate corresponding to the providing destination.
  • FIG. 14 is a diagram illustrating an example of data stored by the storage unit 214.
  • Generally, an organization with many members is more likely to include a person who knows the target individual.
  • In this description, the number of members of organization B is larger than the number of members of organization A.
  • The specific individual arrival rate for organization B therefore needs to be larger than the specific individual arrival rate for organization A. Accordingly, in FIG. 14, the specific individual arrival rate for providing destination organization B is larger than the specific individual arrival rate for organization A.
  • The storage unit 214 may store a plurality of types of information about the providing destination (for example, the organization and the business type illustrated in FIG. 14). The following description of the present embodiment uses the data of FIG. 14.
  • the acquisition unit 114 acquires the specific individual arrival rate from the storage unit 214.
  • the acquiring unit 114 may use the product of the specific individual arrival rates of the quasi-identifiers included in the combination as the specific individual arrival rate of the combination of the quasi-identifiers.
  • The acquisition unit 114 may also acquire information about the providing destination (for example, the number of members) from the storage unit 214.
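  • The following is a minimal Python sketch of the lookup performed by the storage unit 214 and the acquisition unit 114. The numeric values are assumptions; the text states only that organization B, having more members, receives a larger rate than organization A, and that 0.5 is returned in the worked example.

```python
# Specific individual arrival rate per providing destination (potential attacker).
PROVIDER_ARRIVAL_RATES = {
    "organization A": 0.3,  # assumed
    "organization B": 0.5,  # matches the worked example in the text
}

def get_arrival_rate(providing_destination):
    """Look up the arrival rate for the organization the data will be provided to."""
    return PROVIDER_ARRIVAL_RATES[providing_destination]

print(get_arrival_rate("organization B"))  # 0.5
```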
  • the information processing apparatus 104 executes step S101 and step S102 as in the first embodiment.
  • Instead of step S103, the information processing apparatus 104 executes the following operation.
  • The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. The specific risk calculation unit 130 then requests the acquisition unit 114 to acquire the specific individual arrival rate. At this time, the specific risk calculation unit 130 transmits to the acquisition unit 114 information on the providing destination assumed as the counterpart (attacker) and on the attributes.
  • the acquisition unit 114 acquires the specific individual arrival rate corresponding to the received destination and attribute based on the data of the storage unit 214.
  • the acquisition unit 114 acquires 0.5 as the specific individual arrival rate.
  • the acquisition unit 114 returns information on the specific individual arrival rate to the specific risk calculation unit 130.
  • the specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
  • As described above, the information processing apparatus 104 achieves the effect that the specific individual arrival rate can be changed according to the organization to which the personal data is provided (an organization that may become an attacker).
  • The reason is that the fourth acquisition unit 114 acquires the specific individual arrival rate corresponding to the organization (partner) to which the personal data is provided, and the specific risk calculation unit 130 calculates the specific risk based on that specific individual arrival rate.
  • the information processing apparatus 105 according to the sixth embodiment is different from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is calculated using publicly available data (public information).
  • public information is data (public data) that is open to the public.
  • For example, public information may be the distribution of the members of the organization to which the data is provided (for example, information such as 10,000 members in their teens, 15,000 members in their 20s, and 10,000 members in their 30s).
  • Alternatively, the public information may be information published on the Internet, such as on Twitter (registered trademark) (for example, “user1 is a teenager and discloses location information”).
  • The disclosure range of public information need not be limited to information that is available without restriction, such as information on the Internet.
  • For example, the public information may be information that is disclosed only to members registered in a predetermined organization (for example, an Internet provider) and whose disclosure range is limited to some extent.
  • FIG. 15 is a block diagram illustrating an example of the configuration of the information processing apparatus 105 according to the sixth embodiment.
  • the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
  • the information processing device 105 includes a specific individual arrival rate calculation unit 115 instead of the reception unit 110 of the information processing device 100.
  • the information processing apparatus 105 includes a public distribution information storage unit 215.
  • the information processing apparatus 105 may be configured using a computer shown in FIG. Further, the information processing apparatus 105 may use the public distribution information storage unit 215 as an external apparatus connected via a network. Further, the specific individual arrival rate calculation unit 115 may receive information stored in the public distribution information storage unit 215 described below from an external device (not shown), as in the first embodiment. In this case, the information processing apparatus 105 may not include the public distribution information storage unit 215.
  • the public distribution information storage unit 215 stores public information.
  • the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate based on the public information.
  • the information processing apparatus 105 executes step S101 and step S102 as in the first embodiment.
  • instead of step S103 of the first embodiment, the information processing apparatus 105 executes the following operation.
  • the specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. Then, the specific risk calculation unit 130 requests the specific individual arrival rate calculation unit 115 to calculate the specific individual arrival rate.
  • the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate using the public information stored in the public distribution information storage unit 215 and the personal data stored in the personal data storage unit 200, and transmits the calculation result to the specific risk calculation unit 130.
  • the calculation of the specific individual arrival rate calculation unit 115 is not particularly limited and may be set according to the required risk.
  • the specific individual arrival rate calculation unit 115 may calculate the specific individual arrival rate using the distribution of personal data and the distribution of data in the organization to which the personal data is provided in the public information.
  • the first calculation example is a calculation example using the number of members of the target organization.
  • the public information is “Company A has 10 million minor members (10% of Japan's population)”.
  • the specific individual arrival rate calculation unit 115 can determine that Company A knows the quasi-identifier of a given individual with a probability of 10%. Therefore, the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate as 0.1 based on this public information (the population ratio).
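A minimal sketch of this first calculation example follows; the population figure is an assumption chosen so that the ratio matches the 10% stated above, and the function name is illustrative.

```python
def arrival_rate_from_population_ratio(member_count, population):
    """Estimate the probability that the destination already holds the
    quasi-identifiers of a given individual as members / population."""
    return member_count / population


# "Company A has 10 million members (10% of Japan's population)": with a
# population of 100 million assumed for the stated 10% ratio, the rate is 0.1.
print(arrival_rate_from_population_ratio(10_000_000, 100_000_000))
```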
  • the second calculation example is a calculation example using the distribution of members of the target organization.
  • the public information is, for example: the number of members in their teens is 10,000, of whom 1,000 disclose their location information; the number of members in their twenties is 20,000, of whom 1,000 disclose their location information.
  • in this case, for an individual in their twenties, the specific individual arrival rate is calculated as 1,000 / 20,000 = 0.05.
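A minimal sketch of this second calculation example follows, under the reading given above that the rate for an individual in their twenties is the share of location-disclosing members in that age group; the data structure and names are illustrative.

```python
# Hypothetical member distribution of the destination organization:
# age group -> (total members, members who disclose location information).
MEMBER_DISTRIBUTION = {
    "10s": (10_000, 1_000),
    "20s": (20_000, 1_000),
}


def arrival_rate_from_distribution(age_group):
    """Share of members in the age group whose location information is public."""
    total, disclosing = MEMBER_DISTRIBUTION[age_group]
    return disclosing / total


print(arrival_rate_from_distribution("20s"))  # 1,000 / 20,000 = 0.05
```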
  • the specific individual arrival rate calculation unit 115 returns the specific individual arrival rate to the specific risk calculation unit 130.
  • the specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
  • the information processing apparatus 105 has the effect of reducing the operations of receiving the specific individual arrival rate and of storing it in advance.
  • the reason is that the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate based on the public information.
  • the specific risk calculation unit 130 may calculate the specific risk using both the specific individual arrival rate corresponding to the quasi-identifier described in the second embodiment and the specific individual arrival rate corresponding to the identification risk described in the fourth embodiment.
  • the information processing system 310 may include any of the information processing apparatuses 101 to 105 according to the second to sixth embodiments, instead of the information processing apparatus 100 according to the first embodiment.
  • the present invention can be used as a tool for calculating a specific risk of an individual.
  • the present invention can also be used when processing data so that the personal data becomes “data with reduced individual specificity”.
  • Reference signs list: 100, 101, 102, 103, 104, 105, 106 Information processing apparatus; 110 Reception unit; 111, 112, 113, 114 Acquisition unit; 115 Specific individual arrival rate calculation unit; 120 Identification risk calculation unit; 130 Specific risk calculation unit; 200 Personal data storage unit; 211, 212, 213, 214 Storage unit; 215 Public distribution information storage unit; 230 Specific risk calculation result storage unit; 240 Overall risk calculation unit; 300, 310 Information processing system; 600 Information processing device; 610 CPU; 620 ROM; 630 RAM; 640 Internal storage device; 650 IOC; 660 Input device; 670 Display device; 680 NIC; 700 Storage medium

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In order to calculate a specific risk relating to personal data, this information processing device comprises: an identification risk calculating means which calculates an identification risk representing the possibility that data relating to a specified individual is determined to be the data of a single person; and a specific risk calculating means which calculates, on the basis of the identification risk and a specific individual arrival rate representing the possibility that the data of the specified individual is determined to be the data of that specified individual, a specific risk representing the possibility that the data of the specified individual is determined to be the data of that specified individual.

Description

Information processing apparatus, information processing method, and recording medium
 本発明は、データの匿名性に関し、特に、データを基に個人が特定される可能性に関連する情報処理装置、情報処理方法、及び記録媒体に関する。 The present invention relates to anonymity of data, and more particularly to an information processing apparatus, an information processing method, and a recording medium related to the possibility that an individual is specified based on data.
 個人に関するデータ(以下、パーソナルデータと呼ぶ)の利用及び活用が、期待されている(例えば、非特許文献1を参照)。非特許文献1には、パーソナルデータの利用及び活用を促進させるために、「個人の特定可能性を低減したデータ」であれば、本人の同意がなくても、そのデータを、第三者(例えば、そのデータを収集した企業以外の企業)へ提供することが、記載されている。 Use and utilization of personal data (hereinafter referred to as personal data) is expected (see, for example, Non-Patent Document 1). In Non-Patent Document 1, in order to promote the use and utilization of personal data, if it is “data with reduced individual identifiability”, the data can be transferred to a third party (without the consent of the person). For example, it is described that the data is provided to a company other than the company that collected the data.
 なお、パーソナルデータは、一般的に、1つ又は複数の属性値を含む。例えば、パーソナルデータを含むデータベースは、パーソナルデータの属性値を、そのデータのカラムの値として含む。 Note that personal data generally includes one or more attribute values. For example, a database including personal data includes personal data attribute values as column values of the data.
 ここで、以下で用いる「特定」及び「識別」という用語について説明する(非特許文献2を参照)。非特許文献2は、「特定」という用語を「識別」という用語と比較して、説明している。非特許文献2に基づくと、「識別」とは、「ある情報が、誰か一人の情報であることが分かること」を意味する。それに対し、「特定」とは、「ある情報が誰についての情報であるかが分かること」を意味する。例えば、データベースのあるレコードに、ある個人(一人の個人)のパーソナルデータが保存されているとする。このとき、「識別」とは、ある属性値(データベースのカラムの値)を基に、その属性値に対応するレコードが、一意に決定されることを意味する。一方、「特定」とは、その「識別」されたレコードが、誰(上記のある個人)のレコードであるかが分かることを意味する。なお、当然であるが、レコードは、「識別」されなければ、「特定」されない。 Here, the terms “specific” and “identification” used below will be described (see Non-Patent Document 2). Non-Patent Document 2 explains the term “specific” in comparison with the term “identification”. Based on Non-Patent Document 2, “identification” means “knowing that certain information is the information of one person”. On the other hand, “specific” means “knowing who the information is about”. For example, it is assumed that personal data of a certain individual (one individual) is stored in a certain record in the database. At this time, “identification” means that a record corresponding to the attribute value is uniquely determined based on a certain attribute value (a value in a database column). On the other hand, “specific” means that the record identified by “identified” can be known to whom (the above-mentioned individual) is a record. As a matter of course, a record is not “specified” unless it is “identified”.
 As techniques for preventing "identification", k-anonymity and k-anonymization have been disclosed (see, for example, Non-Patent Document 3). Since Non-Patent Document 1 and Non-Patent Document 2 do not disclose specific technical details of anonymization for preventing specification and identification, the following description mainly refers to the technique described in Non-Patent Document 3.
 図16を参照して、k-匿名性及びk-匿名化について説明する。 Referring to FIG. 16, k-anonymity and k-anonymization will be described.
 まず、図16に示す表(テーブル)について説明する。図16に示す左側のテーブル及び右側のテーブルは、それぞれ、4名のパーソナルデータを格納している。それぞれのテーブルにおいて、各行で表される各レコードが、それぞれ、各個人(一人一人)のパーソナルデータである。各テーブルは、レコード番号(No.)、郵便番号、年齢、及び病状というカラムを含む。「レコード番号(No.)」は、レコードを一意に識別するための番号である。「郵便番号」、「年齢」及び「病状」は、属性である。属性は、属性名と属性値とを含む。例えば、「郵便番号」「年齢」及び「病状」は、属性名である。また、「1230001」「28」及び「心臓病」は、属性値である。例えば、左側のテーブルの1行目のレコードにおいて、郵便番号という属性名で示された属性は、属性値として1230001を持つ。なお、図16に示す表は、病院のカルテデータを想定している。 First, the table shown in FIG. 16 will be described. Each of the left table and the right table shown in FIG. 16 stores personal data for four persons. In each table, each record represented by each row is personal data of each individual (each person). Each table includes columns of record number (No.), zip code, age, and medical condition. “Record number (No.)” is a number for uniquely identifying a record. “Zip code”, “age” and “disease state” are attributes. The attribute includes an attribute name and an attribute value. For example, “zip code”, “age”, and “disease state” are attribute names. “1230001” “28” and “heart disease” are attribute values. For example, in the record in the first row of the left table, the attribute indicated by the attribute name “zip code” has an attribute value of 1230001. The table shown in FIG. 16 assumes hospital chart data.
 k-匿名性及びk-匿名化の技術において、属性(Attribute)は、識別子(ID:Identifier)、準識別子(QI:Quasi-Identifier)、センシティブ属性(Sensitive Attribute)、及びその他の属性の4種類に分類される。識別子とは、氏名のように、単一で個人を識別できる属性である。準識別子とは、他の属性と組み合わせると、個人を識別できる属性である。センシティブ属性とは、他人に知られたくない属性である。そして、これら以外の属性は、その他属性である。なお、属性は、準識別子かつセンシティブ属性であってもよい。 In k-anonymity and k-anonymization techniques, there are four types of attributes: an identifier (ID), a quasi-identifier (QI), a sensitive attribute (Sensitive Attribute), and other attributes. are categorized. An identifier is an attribute such as a name that can uniquely identify an individual. The quasi-identifier is an attribute that can identify an individual when combined with other attributes. Sensitive attributes are attributes that you do not want others to know. And attributes other than these are other attributes. The attribute may be a quasi-identifier and a sensitive attribute.
 次に、k-匿名性を確保する動作(k-匿名化の動作)について説明する。図16の左側のテーブルは、病院のカルテデータのうち、氏名のような識別子を削除したレコードを集めたテーブルである。 Next, an operation for ensuring k-anonymity (k-anonymization operation) will be described. The table on the left side of FIG. 16 is a table that collects records from which identifiers such as names are deleted from hospital medical record data.
 The records in the left table do not include an identifier. Therefore, it appears that no one can tell whose record each record in the left table is based on its attributes alone. However, suppose that an analyst of the left table (hereinafter also referred to as an attacker) knows, as attributes of user A, that the postal code is 1230001 and the age is 28, and further knows that user A attends this hospital. In this case, the attacker can tell that record No. 1 in the left table is user A's record. As a result, the attacker learns that user A's medical condition, which is a sensitive attribute, is heart disease. In this way, the attacker can specify a record in the left table based on the postal code and the age. That is, the postal code and the age act as quasi-identifiers. Thus, even when identifiers have been deleted, a record may still be specified if quasi-identifiers remain in the personal data. As a result, a sensitive attribute that the user does not want to be known may become known to the attacker. Therefore, a technique called k-anonymization is used.
 図16の右側のテーブルは、k-匿名化を用いて準識別子を加工したレコードを含むテーブルである。右側のテーブルのレコードは、k-匿名性を用いて、郵便番号及び年齢の属性値が加工されている。なお、図16は、kの値として2を用いている。つまり、図16の右側のテーブルは、準識別子の属性値から識別できるレコード数が、2以上となっている。非特許文献3は、準識別子を基に識別されるレコード数がk以上となっているテーブルを、k-匿名性を満たすと定義している。また、k-匿名性を満たすように加工することは、k-匿名化と呼ばれる。例えば、右側のテーブルは、k=2のk-匿名性を満たすテーブルである。つまり、右側のテーブルは、2-匿名化されたテーブルである。なお、k-匿名性を満たすテーブルは、k未満の「識別」を防ぐことができる。そして、その結果として、k-匿名性を満たすテーブルは、k未満の「特定」を防ぐことができる。 The table on the right side of FIG. 16 is a table including records obtained by processing quasi-identifiers using k-anonymization. In the record in the right table, the postal code and age attribute values are processed using k-anonymity. In FIG. 16, 2 is used as the value of k. That is, in the table on the right side of FIG. 16, the number of records that can be identified from the attribute value of the semi-identifier is 2 or more. Non-Patent Document 3 defines that a table in which the number of records identified based on the quasi-identifier is k or more satisfies k-anonymity. Processing to satisfy k-anonymity is called k-anonymization. For example, the right table is a table satisfying k-anonymity of k = 2. That is, the table on the right is a 2-anonymized table. A table satisfying k-anonymity can prevent “identification” below k. As a result, a table satisfying k-anonymity can prevent “specification” less than k.
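To make the k-anonymity condition concrete, the following small Python check is added here as an illustration (it is not taken from Non-Patent Document 3): a table satisfies k-anonymity when every combination of quasi-identifier values is shared by at least k records. The sample rows are assumptions resembling the generalized table of FIG. 16.

```python
from collections import Counter


def satisfies_k_anonymity(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values covers >= k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in counts.values())


# Illustrative rows resembling the generalized (right-hand) table of FIG. 16;
# the exact generalized values are assumptions, not taken from the figure.
table = [
    {"zip": "123****", "age": "20-29", "condition": "heart disease"},
    {"zip": "123****", "age": "20-29", "condition": "cold"},
    {"zip": "456****", "age": "30-39", "condition": "flu"},
    {"zip": "456****", "age": "30-39", "condition": "cancer"},
]
print(satisfies_k_anonymity(table, ["zip", "age"], k=2))  # True: 2-anonymity holds
```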
 非特許文献3に記載されているk-匿名性及びk-匿名化という技術は、「識別」の可能性(以降では、「識別リスク」とも呼ぶ)についての技術である。しかし、リスクには、非特許文献1に記載された「特定」のリスクがある。 The techniques of k-anonymity and k-anonymization described in Non-Patent Document 3 are techniques regarding the possibility of “identification” (hereinafter also referred to as “identification risk”). However, the risk includes a “specific” risk described in Non-Patent Document 1.
 つまり、現状の匿名性に関する技術は、識別リスク(識別)に対応した技術であり、「特定」の可能性(以降では、「特定リスク」と呼ぶ)に対応した技術ではない。つまり、非特許文献3に記載の技術は、特定リスクに対応できないという問題点があった。 In other words, the current technology related to anonymity is a technology corresponding to identification risk (identification), and is not a technology corresponding to the possibility of “specific” (hereinafter referred to as “specific risk”). That is, the technique described in Non-Patent Document 3 has a problem that it cannot cope with a specific risk.
 さらに、特定リスクを考慮せずに、識別リスクだけを考慮したデータの匿名化は、必要以上にプライバシーを保護する、つまり、必要以上にデータを加工する場合がある。 Furthermore, anonymization of data considering only identification risk without considering specific risk may protect privacy more than necessary, that is, process data more than necessary.
 攻撃者が準識別子を知っているとの前提は、識別リスクの考慮に相当する。しかし、攻撃者は、準識別子を知らない可能性もある。そして、攻撃者が準識別子を知っている可能性が低いにもかかわらず、準識別子を知っていることを前提としてk-匿名性のkの値を設定した場合、kの値は、必要以上に大きな値となってしまう。このように、匿名化を実行する装置は、識別リスクを用いた匿名化において、必要以上にデータを加工してしまう場合がある。その結果、匿名化後のデータの有用性は、必要以上に低下することになる。 The assumption that the attacker knows the quasi-identifier corresponds to consideration of identification risk. However, the attacker may not know the quasi-identifier. If the value of k-anonymity is set on the assumption that the attacker knows the quasi-identifier even though it is unlikely that the attacker knows the quasi-identifier, the value of k is more than necessary. It will be a big value. As described above, an anonymization apparatus may process data more than necessary in anonymization using an identification risk. As a result, the usefulness of the data after anonymization decreases more than necessary.
 このように、非特許文献3に記載の技術は、必要以上にデータを加工するという問題点があった。 Thus, the technique described in Non-Patent Document 3 has a problem of processing data more than necessary.
 本発明の目的は、上記の問題点を解決し、特定リスクを計算(評価)できる情報処理装置、情報処理方法、及び記録媒体を提供することにある。 An object of the present invention is to provide an information processing apparatus, an information processing method, and a recording medium that can solve the above problems and calculate (evaluate) a specific risk.
 An information processing apparatus according to one aspect of the present invention includes: identification risk calculation means for calculating an identification risk indicating the possibility that data concerning a specified individual is determined to be the data of a single person; and specific risk calculation means for calculating, based on the identification risk and a specific individual arrival rate indicating the possibility that the data of the specified individual is determined to be the data of that specified individual, a specific risk indicating the possibility that the data of the specified individual is determined to be the data of that specified individual.
 An information processing system according to one aspect of the present invention includes: an information processing apparatus that includes identification risk calculation means for calculating an identification risk indicating the possibility that data concerning a specified individual is determined to be the data of a single person, and specific risk calculation means for calculating, based on the identification risk and a specific individual arrival rate indicating the possibility that the data of the specified individual is determined to be the data of that specified individual, a specific risk indicating the possibility that the data of the specified individual is determined to be the data of that specified individual; personal information storage means for storing personal data, which is information on a plurality of individuals; and overall risk calculation means for calculating a risk corresponding to the personal data as a whole, based on the specific risks corresponding to the data of all individuals included in the personal data that are calculated by the information processing apparatus.
 An information processing method according to one aspect of the present invention calculates an identification risk indicating the possibility that data concerning a specified individual is determined to be the data of a single person, and calculates, based on the identification risk and a specific individual arrival rate indicating the possibility that the data of the specified individual is determined to be the data of that specified individual, a specific risk indicating the possibility that the data of the specified individual is determined to be the data of that specified individual.
 A recording medium according to one aspect of the present invention records, in a computer-readable manner, a program that causes a computer to execute: a process of calculating an identification risk indicating the possibility that data concerning a specified individual is determined to be the data of a single person; and a process of calculating, based on the identification risk and a specific individual arrival rate indicating the possibility that the data of the specified individual is determined to be the data of that specified individual, a specific risk indicating the possibility that the data of the specified individual is determined to be the data of that specified individual.
 本発明に基づけば、特定リスクを計算できるとの効果を奏することができる。 Based on the present invention, it is possible to obtain an effect that a specific risk can be calculated.
FIG. 1 is a block diagram showing an example of the configuration of an information processing system including an information processing apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of personal data included in the personal data storage unit according to the first embodiment.
FIG. 3 is a flowchart illustrating an example of the operation of the information processing apparatus according to the first embodiment.
FIG. 4 is a block diagram illustrating an example of another configuration of the information processing system including the information processing apparatus according to the first embodiment.
FIG. 5 is a flowchart illustrating an example of the operation of the information processing system according to the first embodiment.
FIG. 6 is a diagram illustrating a calculation result of a specific risk used for explaining the first embodiment.
FIG. 7 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the second embodiment.
FIG. 8 is a diagram illustrating an example of data stored by the storage unit according to the second embodiment.
FIG. 9 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the third embodiment.
FIG. 10 is a diagram illustrating an example of data stored by the storage unit according to the third embodiment.
FIG. 11 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the fourth embodiment.
FIG. 12 is a diagram illustrating an example of data stored by the storage unit according to the fourth embodiment.
FIG. 13 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the fifth embodiment.
FIG. 14 is a diagram illustrating an example of data stored by the storage unit according to the fifth embodiment.
FIG. 15 is a block diagram illustrating an example of the configuration of the information processing apparatus according to the sixth embodiment.
FIG. 16 is a diagram for explaining k-anonymization and k-anonymity.
FIG. 17 is a block diagram illustrating an example of a configuration of a modification of the information processing apparatus according to the first embodiment.
FIG. 18 is a block diagram illustrating an example of a configuration of a modification of the information processing apparatus according to the first embodiment.
 次に、本発明の実施形態について図面を参照して説明する。 Next, an embodiment of the present invention will be described with reference to the drawings.
 なお、各図面は、本発明の実施形態を説明するものである。ただし、本発明は、各図面の記載に限られるわけではない。また、各図面の同様の構成には、同じ番号を付し、その繰り返しの説明を、省略する場合がある。また、以下の説明に用いる図面において、本発明の説明に関係しない部分の構成については、記載を省略し、図示しない場合もある。 Each drawing explains an embodiment of the present invention. However, the present invention is not limited to the description of each drawing. Moreover, the same number is attached | subjected to the same structure of each drawing, and the repeated description may be abbreviate | omitted. Further, in the drawings used for the following description, the description of the configuration of the part not related to the description of the present invention is omitted, and there are cases where it is not illustrated.
 まず、本実施形態の説明に用いる用語について、整理する。 First, the terms used in the description of this embodiment will be organized.
 「識別」とは、ある情報(データ)が誰か一人の情報(データ)であることが分かる(判断される)ことである。 “Identification” means that it is known (determined) that certain information (data) is information (data) of one person.
 「識別リスク」とは、識別される可能性を示す。 “Identification risk” indicates the possibility of identification.
 「特定」とは、ある情報(データ)が誰の情報(データ)であるかが分かる(判断される)ことである。 “Specific” means knowing (determining) who information (data) is certain information (data).
 「特定リスク」とは、特定される可能性を示す。 “Specified risk” indicates the possibility of being specified.
 「レコード」は、各個人のパーソナルデータである。レコードは、複数の属性を含む。 “Record” is personal data of each individual. The record includes a plurality of attributes.
 「属性」とは、レコードに含まれるデータの種別である。属性は、属性名と、属性値とを含む。 “Attribute” is the type of data included in the record. The attribute includes an attribute name and an attribute value.
 「識別子」とは、単独で個人を特定できる属性である。 “Identifier” is an attribute that can identify an individual alone.
 「準識別子」とは、他の属性との組合せを基に、個人を特定できる属性である。 “Semi-identifier” is an attribute that can identify an individual based on a combination with other attributes.
 「センシティブ属性」とは、個人が公開したくない属性(データ)である。 “Sensitive attributes” are attributes (data) that individuals do not want to disclose.
 「特定個人到達率」とは、データ(レコード)が、ある特定の個人のデータであると判断される(特定される)可能性を示す値である。より詳細には、特定個人到達率とは、識別リスクが算出された場合における特定リスクを算出するために用いられる値である。あるいは、特定個人到達率は、レコードにおけるデータを識別するための情報を基に算出された識別リスクを補正して、特定リスクを算出するときに用いられる値である。より具体的には、特定個人到達率は、例えば、攻撃者が、レコードにおけるデータを識別するための情報(例えば、上記の準識別子の値)を取得する可能性である。例えば、同じ識別リスクとなっている準識別子が二つある場合で、その一の準識別子に対する取得可能性が、他の準識別子に対する取得可能性より高い場合、一の準識別子に対する特定個人到達率は、他の準識別子に対する特定個人到達率より高くなる。あるいは、ある準識別子の値を取得する可能性は、一般的に、その準識別子を含む複数の準識別子の値の組合せを取得する可能性より高くなる。そのため、ある準識別子に対する特定個人到達率は、その準識別子を含む準識別子の組合せに対する特定個人到達率より高くなる。また、特定個人到達率を高くすることは、特定リスクを高くすることである。そして、特定リスクに対応した匿名化は、特定リスクが高いほど、匿名性を高くする。例えば、稀な属性(準識別子)は、特定されやすい。つまり、稀な属性における識別リスクに対する特定リスクの比率は、一般的な属性のおける識別リスクに対する特定リスクの比率より高くすることが望ましい。そのため、稀な属性に対する特定個人到達率は、稀でない属性に対する特定個人到達率より高くなる。なお、特定個人到達率の値(r)の範囲は、0から1の間(0≦r≦1)である。 The “specific individual arrival rate” is a value indicating the possibility that the data (record) is determined (specified) as data of a specific individual. More specifically, the specific individual arrival rate is a value used to calculate the specific risk when the identification risk is calculated. Alternatively, the specific individual arrival rate is a value used when calculating a specific risk by correcting an identification risk calculated based on information for identifying data in a record. More specifically, the specific individual arrival rate is, for example, the possibility that an attacker acquires information (for example, the value of the quasi-identifier described above) for identifying data in a record. For example, when there are two quasi-identifiers that have the same identification risk, and the probability of acquisition for one quasi-identifier is higher than the probability of acquisition for other quasi-identifiers, the specific individual arrival rate for one quasi-identifier Is higher than the specific individual arrival rate for other quasi-identifiers. Alternatively, the possibility of obtaining a value of a certain semi-identifier is generally higher than the possibility of obtaining a combination of values of a plurality of semi-identifiers including the semi-identifier. Therefore, the specific individual arrival rate for a certain quasi-identifier is higher than the specific individual arrival rate for a combination of quasi-identifiers including the quasi-identifier. In addition, increasing the specific individual arrival rate means increasing the specific risk. And the anonymization corresponding to a specific risk makes anonymity high, so that a specific risk is high. For example, rare attributes (quasi-identifiers) are easily specified. That is, it is desirable that the ratio of the specific risk to the identification risk in the rare attribute is higher than the ratio of the specific risk to the identification risk in the general attribute. Therefore, the specific person arrival rate for rare attributes is higher than the specific individual arrival rate for non-rare attributes. In addition, the range of the value (r) of the specific individual arrival rate is between 0 and 1 (0 ≦ r ≦ 1).
<First Embodiment>
A first embodiment of the present invention will be described with reference to the drawings.
[Description of configuration]
FIG. 1 is a block diagram showing an example of the configuration of an information processing system 300 including an information processing apparatus 100 according to the first embodiment of the present invention. As illustrated in FIG. 1, the information processing system 300 includes an information processing apparatus 100 and a personal data storage unit 200. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
 パーソナルデータ保存部200は、情報処理装置100における特定リスクの評価の処理の対象であるパーソナルデータを保存する。 The personal data storage unit 200 stores personal data that is a target of specific risk evaluation processing in the information processing apparatus 100.
 図2は、パーソナルデータ保存部200が保存するパーソナルデータの一例を示す図である。 FIG. 2 is a diagram illustrating an example of personal data stored by the personal data storage unit 200.
 パーソナルデータ保存部200は、図2に示すように、パーソナルデータとして、ユーザ(個人)の識別子である「ユーザID」と、そのユーザに関する属性とを関連付けて、保存する。図2に示すパーソナルデータは、ユーザに関する属性として、年齢、性別、及び病名を含む。例えば、「user1」の「年齢」(属性名)は、「20」(属性値)である。同様に、user1の「性別」(属性名)は、「男」(属性値)である。また、user1の「病名」(属性名)は、「かぜ」(属性値)である。 As shown in FIG. 2, the personal data storage unit 200 stores “user ID”, which is an identifier of a user (individual), in association with attributes related to the user as personal data. The personal data shown in FIG. 2 includes age, sex, and disease name as attributes relating to the user. For example, “age” (attribute name) of “user1” is “20” (attribute value). Similarly, the “sex” (attribute name) of user1 is “male” (attribute value). The “disease name” (attribute name) of user1 is “cold” (attribute value).
 なお、以下における各実施形態の動作の説明は、パーソナルデータの一例として、図2に示すパーソナルデータを用いる。また、準識別子は、年齢及び性別とする。つまり、年齢と性別が、匿名化対象の属性とする。そして、センシティブ属性は、病名とする。 In the following description of the operation of each embodiment, personal data shown in FIG. 2 is used as an example of personal data. The quasi-identifier shall be age and gender. That is, age and sex are attributes to be anonymized. The sensitive attribute is a disease name.
 情報処理装置100は、パーソナルデータ保存部200が保存するパーソナルデータにおける指定された個人の特定リスクを計算(評価)する。 The information processing apparatus 100 calculates (evaluates) the specified individual specific risk in the personal data stored by the personal data storage unit 200.
 そのため、情報処理装置100は、受信部110と、識別リスク計算部120と、特定リスク計算部130とを含む。 Therefore, the information processing apparatus 100 includes a reception unit 110, an identification risk calculation unit 120, and a specific risk calculation unit 130.
 受信部110は、図示しない装置から特定個人到達率を受信する。本実施形態において、特定個人到達率を送信する装置は、特に制限はない。例えば、受信部110は、ユーザが操作する装置から、特定個人到達率を受信してもよい。あるいは、受信部110は、図示しない記憶装置から特定個人到達率を読み出してもよい。以下、これらをまとめて、「受信部110が、特定個人到達率を受信する」と呼ぶ。 The receiving unit 110 receives a specific individual arrival rate from a device (not shown). In the present embodiment, the device that transmits the specific individual arrival rate is not particularly limited. For example, the receiving unit 110 may receive a specific individual arrival rate from a device operated by the user. Alternatively, the receiving unit 110 may read the specific individual arrival rate from a storage device (not shown). Hereinafter, these are collectively referred to as “the receiving unit 110 receives a specific individual arrival rate”.
 識別リスク計算部120は、識別リスクを計算する。 The identification risk calculation unit 120 calculates the identification risk.
 特定リスク計算部130は、識別リスクと特定個人到達率とを基に、特定リスクを計算する。 The specific risk calculation unit 130 calculates the specific risk based on the identification risk and the specific individual arrival rate.
 識別リスク計算部120及び特定リスク計算部130における計算の詳細は、後ほど説明する。 Details of calculation in the identification risk calculation unit 120 and the specific risk calculation unit 130 will be described later.
[Description of operation]
Next, the operation of the information processing apparatus 100 will be described with reference to the drawings.
 図3は、第1の実施形態に係る情報処理装置100の動作の一例を示すフローチャートである。図3に示すように、情報処理装置100は、以下で説明するステップS101からS104までの動作を実行する。 FIG. 3 is a flowchart showing an example of the operation of the information processing apparatus 100 according to the first embodiment. As illustrated in FIG. 3, the information processing apparatus 100 performs operations from steps S101 to S104 described below.
 情報処理装置100は、予め、特定リスクを計算する対象の個人(ユーザ)が指定されているとする。そして、受信部110は、特定個人到達率(r)を受信する。 It is assumed that the information processing apparatus 100 has previously designated an individual (user) who is to calculate a specific risk. Then, the receiving unit 110 receives the specific individual arrival rate (r).
 そして、識別リスク計算部120は、パーソナルデータ保存部200から、パーソナルデータを取得する(ステップS101)。本説明においては、既に説明したとおり、識別リスク計算部120は、図2に示したパーソナルデータを取得する。 Then, the identification risk calculation unit 120 acquires personal data from the personal data storage unit 200 (step S101). In this description, as already described, the identification risk calculation unit 120 acquires the personal data shown in FIG.
 次に、識別リスク計算部120は、パーソナルデータを基に、指定された個人に関して、何人の(何個の)レコードを識別できるかを計算する。そして、識別リスク計算部120は、識別できるレコードの数(m)を基に、指定された個人の識別リスクを、計算する(ステップS102)。mの値が大きい程、識別できたレコードから、指定された個人を特定することが、より困難になる。 Next, the identification risk calculation unit 120 calculates how many (how many) records can be identified for the specified individual based on the personal data. Then, the identification risk calculation unit 120 calculates the identification risk of the designated individual based on the number (m) of records that can be identified (step S102). The greater the value of m, the more difficult it is to identify the specified individual from the identified record.
 なお、識別リスク計算部120は、予め設定されている計算手法を用いて、識別リスクを計算する。識別リスクの計算手法は、特に制限はない。例えば、mレコードが識別される場合、識別リスク計算部120は、識別リスクとして「1/m」を計算してもよい。そして、識別リスク計算部120は、特定リスク計算部130に計算した識別リスクを送信する。 Note that the identification risk calculation unit 120 calculates the identification risk using a preset calculation method. There is no particular limitation on the method for calculating the identification risk. For example, when m records are identified, the identification risk calculation unit 120 may calculate “1 / m” as the identification risk. Then, the identification risk calculation unit 120 transmits the calculated identification risk to the specific risk calculation unit 130.
 識別リスク計算部120の動作の一例を、図2に示すパーソナルデータを参照して説明する。なお、指定された個人は、「user1」とする。 An example of the operation of the identification risk calculation unit 120 will be described with reference to personal data shown in FIG. The designated individual is “user1”.
 図2を参照すると「user1」の2つの準識別子(年齢及び性別)の値は、それぞれ「20(年齢)」及び「男(性別)」である。この2つの準識別子の値は、「user1」及び「user2」において、同じである。つまり、年齢が「20」であり、性別が「男」であるレコードとして、2名(m=2)のレコードが識別される。よって、識別リスク計算部120は、識別リスクとして、0.5(=1/2)を計算する。なお、識別リスク計算部120は、識別リスクの計算として、除算に限らず他の四則演算、又は、累乗根など他の演算を用いてもよい。また、識別リスク計算部120は、一般的な識別リスクの計算手法を用いてもよい。 Referring to FIG. 2, the values of the two quasi-identifiers (age and sex) of “user1” are “20 (age)” and “male (sex)”, respectively. The values of the two quasi-identifiers are the same in “user1” and “user2”. That is, two records (m = 2) are identified as records having an age of “20” and a gender of “male”. Therefore, the identification risk calculation unit 120 calculates 0.5 (= 1/2) as the identification risk. In addition, the identification risk calculation unit 120 may use other arithmetic operations such as other four arithmetic operations or a power root as the calculation of the identification risk, not limited to division. The identification risk calculation unit 120 may use a general identification risk calculation method.
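The record-counting step just described can be sketched in Python as follows. This is an illustrative reconstruction: only user1's row is spelled out in the text, so the other rows and the helper names are assumptions.

```python
# Personal data in the style of FIG. 2; only user1's row is given in the text,
# so the remaining rows are illustrative assumptions.
PERSONAL_DATA = {
    "user1": {"age": 20, "gender": "male", "disease": "cold"},
    "user2": {"age": 20, "gender": "male", "disease": "flu"},
    "user3": {"age": 30, "gender": "female", "disease": "cold"},
}
QUASI_IDENTIFIERS = ("age", "gender")


def identification_risk(user_id):
    """1 / m, where m is the number of records sharing the user's quasi-identifier values."""
    target = tuple(PERSONAL_DATA[user_id][q] for q in QUASI_IDENTIFIERS)
    m = sum(
        1
        for record in PERSONAL_DATA.values()
        if tuple(record[q] for q in QUASI_IDENTIFIERS) == target
    )
    return 1 / m


print(identification_risk("user1"))  # 0.5, because user1 and user2 share (20, male)
```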
 そして、識別リスク計算部120は、識別リスクとして、0.5を特定リスク計算部130に送信する。特定リスク計算部130は、識別リスク計算部120から識別リスクを受信する。 Then, the identification risk calculation unit 120 transmits 0.5 as the identification risk to the specific risk calculation unit 130. The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120.
 そして、特定リスク計算部130は、個人の特定個人到達率(r)を、受信部110から取得する(ステップS103)。ここでは、特定個人到達率(r)は、0.3とする。 Then, the specific risk calculation unit 130 acquires the specific individual arrival rate (r) of the individual from the reception unit 110 (step S103). Here, the specific individual arrival rate (r) is 0.3.
 そして、特定リスク計算部130は、識別リスク(0.5)と特定個人到達率(r=0.3)とを基に、指定された個人の特定リスクを計算する(ステップS104)。特定リスク計算部130は、予め設定されている計算手法を用いて、特定リスクを計算する。なお、特定リスクを計算する手法は、特に制限はない。以下の説明では、一例として、特定リスク計算部130は、特定リスクとして、「個人の特定リスク=(1/m)×r」を用いて計算するとする。 Then, the specific risk calculation unit 130 calculates the specific risk of the specified individual based on the identification risk (0.5) and the specific individual arrival rate (r = 0.3) (step S104). The specific risk calculation unit 130 calculates the specific risk using a preset calculation method. The method for calculating the specific risk is not particularly limited. In the following description, as an example, it is assumed that the specific risk calculation unit 130 calculates using “individual specific risk = (1 / m) × r” as the specific risk.
 ここでは、識別リスク(1/m)は、0.5である。また、特定個人到達率(r)は、0.3である。そのため、特定リスク計算部130は、特定リスクとして、「0.5×0.3=0.15」を計算する。 Here, the identification risk (1 / m) is 0.5. The specific individual arrival rate (r) is 0.3. Therefore, the specific risk calculation unit 130 calculates “0.5 × 0.3 = 0.15” as the specific risk.
 なお、上記において、個人の特定リスクの計算として、「特定リスク=(1/m)×r」を用いたが、特定リスクの計算は、これに限る必要ない。例えば、特定リスク計算部130は、特定リスクの計算として、「特定リスク=(1/m)×r×r」を用いてもよい。また、特定リスク計算部130は、計算式として乗算又は除算に限らず、加算又は減算を用いてもよい。あるいは、特定リスク計算部130は、四則に限らず、累乗根又は対数のような計算を用いてもよい。さらに、特定リスク計算部130は、これらの計算を組み合わせてもよい。 In the above, “specific risk = (1 / m) × r” is used as the calculation of the individual specific risk. However, the calculation of the specific risk is not limited to this. For example, the specific risk calculation unit 130 may use “specific risk = (1 / m) × r × r” as the calculation of the specific risk. The specific risk calculation unit 130 is not limited to multiplication or division as a calculation formula, and addition or subtraction may be used. Alternatively, the specific risk calculation unit 130 is not limited to the four rules, and may use a calculation such as a power root or logarithm. Furthermore, the specific risk calculation unit 130 may combine these calculations.
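Putting the worked numbers together, a minimal sketch of the step S104 calculation using the example formula (1/m) x r is shown below; the alternative form (1/m) x r x r mentioned above is included only for comparison, and the function names are illustrative.

```python
def specific_risk(identification_risk, arrival_rate):
    """Example formula from the text: specific risk = (1 / m) x r."""
    return identification_risk * arrival_rate


def specific_risk_squared(identification_risk, arrival_rate):
    """One of the alternative forms mentioned above: (1 / m) x r x r."""
    return identification_risk * arrival_rate ** 2


# Identification risk 1/m = 0.5 (m = 2) and specific individual arrival rate r = 0.3.
print(specific_risk(0.5, 0.3))          # 0.15
print(specific_risk_squared(0.5, 0.3))  # 0.045
```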
 なお、特定リスク計算部130は、特定リスクを所定の装置(例えば、ユーザーの装置)に送信してもよい。 It should be noted that the specific risk calculation unit 130 may transmit the specific risk to a predetermined device (for example, a user device).
 なお、ここまでの説明として、情報処理装置100は、ある個人に対する特定リスクを計算した。しかし、情報処理装置100は、パーソナルデータ保存部200に保存された複数又は全ての個人に対する特定リスクを計算してもよい。 As an explanation so far, the information processing apparatus 100 calculates a specific risk for a certain individual. However, the information processing apparatus 100 may calculate specific risks for a plurality of or all individuals stored in the personal data storage unit 200.
 図4は、第1の実施形態に係る情報処理装置100を含む別の構成の情報処理システム310の構成の一例を示す図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。 FIG. 4 is a diagram illustrating an example of a configuration of an information processing system 310 having another configuration including the information processing apparatus 100 according to the first embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
 図4に示す情報処理システム310は、情報処理システム300の構成に加え、特定リスク計算結果保存部230と、全体リスク計算部240とを含む。なお、情報処理装置100が、全体リスク計算部240を含んでもよい。さらに、情報処理装置100は、特定リスク計算結果保存部230を含んでもよい。 4 includes, in addition to the configuration of the information processing system 300, a specific risk calculation result storage unit 230 and an overall risk calculation unit 240. Note that the information processing apparatus 100 may include an overall risk calculation unit 240. Furthermore, the information processing apparatus 100 may include a specific risk calculation result storage unit 230.
 特定リスク計算結果保存部230は、各個人についての、特定リスクを保存する。 The specific risk calculation result storage unit 230 stores a specific risk for each individual.
 図6は、各ユーザに対する特定リスクの計算結果を示す図である。図6において、表の右側に示す式は、各個人の特定リスクを算出した式である。 FIG. 6 is a diagram showing a calculation result of a specific risk for each user. In FIG. 6, the formula shown on the right side of the table is a formula for calculating the specific risk of each individual.
 The overall risk calculation unit 240 uses the information processing apparatus 100 to calculate the specific risks of all individuals stored in the personal data storage unit 200, and stores the calculated specific risks in the specific risk calculation result storage unit 230. After the calculation of the specific risks for all individuals is completed, the overall risk calculation unit 240 calculates the overall risk of the personal data stored in the personal data storage unit 200, using all the individual specific risks stored in the specific risk calculation result storage unit 230. Here, the "overall risk" is a value calculated from the specific risk of each individual using a predetermined formula. For example, the overall risk may be the total, the arithmetic mean, the median, or the mode of the specific risks of all individuals. Alternatively, the overall risk may be the maximum or the minimum of the specific risks of all individuals, or the total or the average of a predetermined number of the highest specific risks. Furthermore, the overall risk may be a value related to the shape of the distribution of the specific risks of all individuals, such as the variance or the standard deviation.
 全体リスク計算部240は、全体リスクとして、1つの値に限らず、複数の値(例えば、平均値と分散)を計算してもよい。 The overall risk calculation unit 240 may calculate not only one value but a plurality of values (for example, an average value and variance) as the overall risk.
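As an illustration of how the overall risk calculation unit 240 might aggregate the per-individual specific risks, the following sketch computes several of the statistics listed above; the input values and the function name are assumptions.

```python
from statistics import mean, median, pvariance


def overall_risk(specific_risks):
    """Aggregate per-individual specific risks into several overall-risk statistics."""
    return {
        "sum": sum(specific_risks),
        "mean": mean(specific_risks),
        "median": median(specific_risks),
        "max": max(specific_risks),
        "variance": pvariance(specific_risks),
    }


# Illustrative per-user specific risks such as those kept in the storage unit 230.
print(overall_risk([0.15, 0.15, 0.3, 0.6]))
```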
 情報処理システム310の動作について、図面を参照して説明する。 The operation of the information processing system 310 will be described with reference to the drawings.
 図5は、情報処理システム310の動作の一例を示すフローチャートである。 FIG. 5 is a flowchart showing an example of the operation of the information processing system 310.
 なお、以下の動作の説明においては、全体リスク計算部240が、動作を制御するとして説明する。ただし、動作の制御主体は、これに限る必要はない。例えば、情報処理装置100が、全体リスク計算部240を含め制御してもよい。あるいは、図示しない制御装置が、情報処理システム310に含まれる構成を制御してもよい。 In the following description of the operation, it is assumed that the overall risk calculation unit 240 controls the operation. However, the operation control entity need not be limited to this. For example, the information processing apparatus 100 may control including the overall risk calculation unit 240. Alternatively, a control device (not shown) may control the configuration included in the information processing system 310.
 まず、全体リスク計算部240は、情報処理装置100にパーソナルデータの取得を指示する。情報処理装置100は、パーソナルデータ保存部200から、パーソナルデータを取得する(ステップS201)。 First, the overall risk calculation unit 240 instructs the information processing apparatus 100 to acquire personal data. The information processing apparatus 100 acquires personal data from the personal data storage unit 200 (step S201).
 次に、全体リスク計算部240は、パーソナルデータの各個人に対応する特定リスクの計算を情報処理装置100に指示する(ステップS202)。 Next, the overall risk calculation unit 240 instructs the information processing apparatus 100 to calculate a specific risk corresponding to each individual of the personal data (step S202).
 情報処理装置100は、指定された個人に対応する特定リスクを計算する(ステップS203)。 The information processing apparatus 100 calculates a specific risk corresponding to the designated individual (step S203).
 全体リスク計算部240は、算出された特定リスクを、その個人と関連付けて、特定リスク計算結果保存部230に保存する(ステップS204)。 The overall risk calculation unit 240 stores the calculated specific risk in the specific risk calculation result storage unit 230 in association with the individual (step S204).
 全ての個人についての特定リスクを計算後、全体リスク計算部240は、全ての個人の特定リスクを基に、全体リスクを計算する(ステップS205)。 After calculating the specific risk for all individuals, the overall risk calculation unit 240 calculates the overall risk based on the specific risks of all individuals (step S205).
[Description of effects]
Next, the effect of the first embodiment will be described.
 第1の実施形態に係る情報処理装置100は、所定の個人の特定リスクを計算できるとの効果を奏することができる。 The information processing apparatus 100 according to the first embodiment can achieve an effect that a specific risk of a predetermined individual can be calculated.
 その理由は、次のとおりである。 The reason is as follows.
 本実施形態の受信部110が、特定個人到達率を受信する。そして、識別リスク計算部120が、個人の識別リスクを計算する。そして、特定リスク計算部130が、識別リスクと特定個人到達率とを基に、指定された個人の特定リスクを計算できるためである。 The receiving unit 110 of the present embodiment receives the specific individual arrival rate. Then, the identification risk calculation unit 120 calculates an individual identification risk. This is because the specific risk calculation unit 130 can calculate the specific risk of the designated individual based on the identification risk and the specific individual arrival rate.
 そのため、情報処理装置100を用いるシステムは、情報処理装置100が算出した特定リスクを用いて、適切なパーソナルデータの匿名化を決定できる。 Therefore, the system using the information processing apparatus 100 can determine anonymization of appropriate personal data using the specific risk calculated by the information processing apparatus 100.
 また、情報処理装置100は、必要以上のデータの加工を防ぐという効果を奏することができる。 In addition, the information processing apparatus 100 can achieve an effect of preventing unnecessary data processing.
 その理由は、情報処理装置100が、特定リスクを算出するため、情報処理装置100を用いるシステムは、匿名化の程度(例えば、k-匿名性のkの値)を決定する場合に、識別リスクに加え、特定リスクを用いることができるためである。 The reason is that since the information processing apparatus 100 calculates a specific risk, the system using the information processing apparatus 100 determines the degree of anonymization (for example, k-value of anonymity) when identifying risk. This is because a specific risk can be used.
 また、本実施形態に係る情報処理装置100を含む情報処理システム310は、パーソナルデータの全体に対する全体リスクを計算できるとの効果を奏することができる。 In addition, the information processing system 310 including the information processing apparatus 100 according to the present embodiment can produce an effect that it is possible to calculate the overall risk for the entire personal data.
 その理由は、全体リスク計算部240が、全ての個人の特定リスクを基に、パーソナルデータの全体リスクを計算できるためである。 The reason is that the overall risk calculation unit 240 can calculate the overall risk of personal data based on the specific risk of all individuals.
[Modification 1]
The information processing apparatus 100 described above is configured as follows.
 例えば、情報処理装置100の各構成部は、ハードウェア回路で構成されてもよい。 For example, each component of the information processing apparatus 100 may be configured with a hardware circuit.
 また、情報処理装置100において、各構成部は、ネットワークを介して接続した複数の装置を用いて、構成されてもよい。 Moreover, in the information processing apparatus 100, each component may be configured using a plurality of apparatuses connected via a network.
 図18は、本実施形態の変形例1に係る情報処理装置106の構成の一例を示すブロック図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。 FIG. 18 is a block diagram illustrating an example of the configuration of the information processing apparatus 106 according to the first modification of the present embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal.
 情報処理装置106は、識別リスク計算部120と、特定リスク計算部130とを含む。情報処理装置106の各構成は、図示しないネットワークなどを介して、パーソナルデータ及び特定個人到達率を受信し、情報処理装置100の各構成と同様に動作する。 The information processing apparatus 106 includes an identification risk calculation unit 120 and a specific risk calculation unit 130. Each configuration of the information processing apparatus 106 receives personal data and a specific individual arrival rate via a network (not shown) and operates in the same manner as each configuration of the information processing apparatus 100.
 このように構成された情報処理装置106は、情報処理装置100と同様の効果を奏することができる。 The information processing apparatus 106 configured in this manner can achieve the same effects as the information processing apparatus 100.
 その理由は、上記のとおり、情報処理装置106の各構成が、情報処理装置100の構成と同様に動作し、特定リスクを計算できるためである。 The reason is that, as described above, each configuration of the information processing apparatus 106 operates in the same manner as the configuration of the information processing apparatus 100 and can calculate a specific risk.
 なお、情報処理装置106は、本発明の実施形態における最小構成である。 Note that the information processing apparatus 106 is the minimum configuration in the embodiment of the present invention.
[Modification 2]
Furthermore, modified examples of the information processing apparatus 100 and the information processing apparatus 106 will be described using the information processing apparatus 100. In the information processing apparatus 100 and the information processing apparatus, the plurality of components may be configured with a single piece of hardware.
 また、情報処理装置100は、CPU(Central Processing Unit)と、ROM(Read Only Memory)と、RAM(Random Access Memory)とを含むコンピュータ装置として実現されてもよい。情報処理装置100は、上記構成に加え、さらに、入出力接続回路(IOC:Input / Output Circuit)と、ネットワークインターフェース回路(NIC:Network Interface Circuit)とを含むコンピュータ装置として実現されてもよい。 Further, the information processing apparatus 100 may be realized as a computer apparatus including a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory). In addition to the above configuration, the information processing apparatus 100 may be realized as a computer apparatus that further includes an input / output connection circuit (IOC: Input : / Output Circuit) and a network interface circuit (NIC: Network Interface Circuit).
 図17は、本変形例に係る情報処理装置600の構成の一例を示すブロック図である。 FIG. 17 is a block diagram showing an example of the configuration of the information processing apparatus 600 according to this modification.
 情報処理装置600は、CPU610と、ROM620と、RAM630と、内部記憶装置640と、IOC650と、NIC680とを含み、コンピュータ装置を構成している。 The information processing apparatus 600 includes a CPU 610, a ROM 620, a RAM 630, an internal storage device 640, an IOC 650, and a NIC 680, and constitutes a computer device.
 CPU610は、ROM620からプログラムを読み込む。そして、CPU610は、読み込んだプログラムに基づいて、RAM630と、内部記憶装置640と、IOC650と、NIC680とを制御する。そして、CPU610を含むコンピュータは、これらの構成を制御して、図1に示す、受信部110と、識別リスク計算部120と、特定リスク計算部130としての各機能を実現する。さらに、CPU610を含むコンピュータは、これらの構成を制御して、図4に示す全体リスク計算部240としての機能を実現してもよい。 CPU 610 reads a program from ROM 620. The CPU 610 controls the RAM 630, the internal storage device 640, the IOC 650, and the NIC 680 based on the read program. The computer including the CPU 610 controls these configurations to realize the functions as the reception unit 110, the identification risk calculation unit 120, and the specific risk calculation unit 130 illustrated in FIG. Further, the computer including the CPU 610 may control these configurations to realize the function as the overall risk calculation unit 240 shown in FIG.
 CPU610は、各機能を実現する際に、RAM630又は内部記憶装置640を、プログラムの一時記憶媒体として使用してもよい。 The CPU 610 may use the RAM 630 or the internal storage device 640 as a temporary storage medium for the program when realizing each function.
 また、CPU610は、コンピュータで読み取り可能にプログラムを記憶した記憶媒体700が含むプログラムを、図示しない記憶媒体読み取り装置を用いて読み込んでもよい。あるいは、CPU610は、NIC680を介して、図示しない外部の装置からプログラムを受け取り、RAM630に保存して、保存したプログラムを基に動作してもよい。 Further, the CPU 610 may read a program included in the storage medium 700 storing the program so as to be readable by a computer by using a storage medium reading device (not shown). Alternatively, the CPU 610 may receive a program from an external device (not shown) via the NIC 680, store the program in the RAM 630, and operate based on the stored program.
 ROM620は、CPU610が実行するプログラム及び固定的なデータを記憶する。ROM620は、例えば、P-ROM(Programmable-ROM)又はフラッシュROMである。 ROM 620 stores programs executed by CPU 610 and fixed data. The ROM 620 is, for example, a P-ROM (Programmable-ROM) or a flash ROM.
 RAM630は、CPU610が実行するプログラム及びデータを一時的に記憶する。RAM630は、例えば、D-RAM(Dynamic-RAM)である。 The RAM 630 temporarily stores programs executed by the CPU 610 and data. The RAM 630 is, for example, a D-RAM (Dynamic-RAM).
 内部記憶装置640は、情報処理装置600が長期的に保存するデータ及びプログラムを記憶する。また、内部記憶装置640は、CPU610の一時記憶装置として動作してもよい。内部記憶装置640は、例えば、ハードディスク装置、光磁気ディスク装置、SSD(Solid State Drive)又はディスクアレイ装置である。内部記憶装置640は、パーソナルデータ保存部200として動作してもよい。 The internal storage device 640 stores data and programs stored in the information processing device 600 for a long period of time. Further, the internal storage device 640 may operate as a temporary storage device for the CPU 610. The internal storage device 640 is, for example, a hard disk device, a magneto-optical disk device, an SSD (Solid State Drive), or a disk array device. The internal storage device 640 may operate as the personal data storage unit 200.
 ここで、ROM620と内部記憶装置640は、不揮発性の記憶媒体である。一方、RAM630は、揮発性の記憶媒体である。そして、CPU610は、ROM620、内部記憶装置640、又は、RAM630に記憶されているプログラムを基に動作可能である。つまり、CPU610は、不揮発性記憶媒体又は揮発性記憶媒体を用いて動作可能である。 Here, the ROM 620 and the internal storage device 640 are nonvolatile storage media. On the other hand, the RAM 630 is a volatile storage medium. The CPU 610 can operate based on a program stored in the ROM 620, the internal storage device 640, or the RAM 630. That is, the CPU 610 can operate using a nonvolatile storage medium or a volatile storage medium.
 IOC650は、CPU610と、入力機器660及び表示機器670とのデータを仲介する。IOC650は、例えば、IOインターフェースカード又はUSB(Universal Serial Bus)カードである。 The IOC 650 mediates data between the CPU 610, the input device 660, and the display device 670. The IOC 650 is, for example, an IO interface card or a USB (Universal Serial Bus) card.
 入力機器660は、情報処理装置600の操作者からの入力指示を受け取る機器である。入力機器660は、例えば、キーボード、マウス又はタッチパネルである。 The input device 660 is a device that receives an input instruction from an operator of the information processing apparatus 600. The input device 660 is, for example, a keyboard, a mouse, or a touch panel.
 表示機器670は、情報処理装置600の操作者に情報を表示する機器である。表示機器670は、例えば、液晶ディスプレイである。 The display device 670 is a device that displays information to the operator of the information processing apparatus 600. The display device 670 is a liquid crystal display, for example.
 NIC680は、ネットワークを介した図示しない外部の装置とのデータのやり取りを中継する。NIC680は、例えば、LAN(Local Area Network)カードである。 The NIC 680 relays data exchange with an external device (not shown) via the network. The NIC 680 is, for example, a LAN (Local Area Network) card.
 このように構成された情報処理装置600は、情報処理装置100と同様の効果を奏することができる。 The information processing apparatus 600 configured as described above can achieve the same effects as the information processing apparatus 100.
 その理由は、情報処理装置600のCPU610が、プログラムに基づいて情報処理装置100と同様の機能を実現できるためである。 This is because the CPU 610 of the information processing apparatus 600 can realize the same function as the information processing apparatus 100 based on the program.
 <第2の実施の形態>
 第2の実施形態について図面を参照して説明する。
<Second Embodiment>
A second embodiment will be described with reference to the drawings.
 第2の実施形態に係る情報処理装置101は、準識別子の属性に応じて特定個人到達率を決定する点が、第1の実施形態の情報処理装置100と異なる。本実施形態は、複数の攻撃者において、準識別子を知っている可能性が異なる場合に、対応できる。 The information processing apparatus 101 according to the second embodiment is different from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is determined according to the attribute of the quasi-identifier. This embodiment can cope with a case where a plurality of attackers have different possibilities of knowing the quasi-identifier.
[構成の説明]
 図面を参照して、第2の実施形態に係る情報処理装置101の構成について説明する。
[Description of configuration]
The configuration of the information processing apparatus 101 according to the second embodiment will be described with reference to the drawings.
 図7は、第2の実施形態に係る情報処理装置101の構成の一例を示すブロック図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。図7に示すように、情報処理装置101は、情報処理装置100の特定個人到達率を受信する受信部110に換えて、取得部(第1の取得部)111を含む。さらに、情報処理装置101は、保存部211を含む。なお、情報処理装置101は、図17に示すコンピュータを用いて構成されてもよい。また、情報処理装置101は、保存部211を、ネットワークを介して接続する外部の装置としてもよい。また、取得部111は、以下で説明する保存部211が保存する情報を、第1の実施形態と同様に、図示しない外部の装置から受信してもよい。この場合、情報処理装置101は、保存部211を含まなくてもよい。 FIG. 7 is a block diagram illustrating an example of the configuration of the information processing apparatus 101 according to the second embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal. As illustrated in FIG. 7, the information processing apparatus 101 includes an acquisition unit (first acquisition unit) 111 instead of the reception unit 110 that receives the specific individual arrival rate of the information processing apparatus 100. Further, the information processing apparatus 101 includes a storage unit 211. The information processing apparatus 101 may be configured using a computer shown in FIG. Further, the information processing apparatus 101 may use the storage unit 211 as an external apparatus connected via a network. Further, the acquisition unit 111 may receive information stored in the storage unit 211 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 101 may not include the storage unit 211.
 保存部211は、準識別子又は準識別子の組合せと、それに対応する特定個人到達率とを、関連付けて保存する。 The storage unit 211 stores the quasi-identifier or combination of quasi-identifiers and the corresponding specific individual arrival rate in association with each other.
 図8は、保存部211が保存するデータの一例を示す図である。図8に示すデータにおいて、例えば、準識別子である年齢と性別との組合せに対応する特定個人到達率は、0.3である。一方、年齢に対する特定個人到達率は、その値より高い0.6である。これは、準識別子の組合せの値を取得できる可能性が、組合せに含まれる一つの準識別子の値を取得できる可能性より低いためである。以下の本実施形態の説明は、図8のデータを用いて説明する。 FIG. 8 is a diagram illustrating an example of data stored by the storage unit 211. In the data shown in FIG. 8, for example, the specific individual arrival rate corresponding to the combination of age and sex, which are quasi-identifiers, is 0.3. On the other hand, the specific individual arrival rate with respect to age is 0.6, which is higher than that value. This is because the possibility that the value of the combination of quasi-identifiers can be acquired is lower than the possibility that the value of one quasi-identifier included in the combination can be acquired. The following description of the present embodiment will be described using the data of FIG.
 取得部111は、保存部211から、準識別子又は準識別子の組合せに対応する特定個人到達率を取得する。なお、準識別子の組合せの特定個人到達率を取得する場合、取得部111は、組合せに含まれる準識別子の特定個人到達率の積を、その準識別子の組合せの特定個人到達率としてもよい。 The acquisition unit 111 acquires the specific individual arrival rate corresponding to the quasi-identifier or the combination of quasi-identifiers from the storage unit 211. When acquiring the specific individual arrival rate of the combination of quasi-identifiers, the acquiring unit 111 may use the product of the specific individual arrival rates of the quasi-identifiers included in the combination as the specific individual arrival rate of the combination of the quasi-identifiers.
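For illustration only, the following Python sketch shows one way the lookup described for the acquisition unit 111 could be organized. The table values mirror the example of FIG. 8 (0.3 for the combination of age and gender, 0.6 for age alone); the rate for gender alone, the function names, and the data structure are assumptions introduced here, not part of the embodiment's definition.

```python
# Sketch of the lookup performed by the acquisition unit 111 (illustrative only).
REACH_RATE_TABLE = {
    ("age",): 0.6,               # value taken from the FIG. 8 example
    ("gender",): 0.6,            # assumed value, not given in the text
    ("age", "gender"): 0.3,      # stored combinations have lower rates
}

def reach_rate(quasi_identifiers):
    """Return the specific individual arrival rate for a quasi-identifier
    or a combination of quasi-identifiers."""
    key = tuple(sorted(quasi_identifiers))
    if key in REACH_RATE_TABLE:
        return REACH_RATE_TABLE[key]
    # Fallback mentioned in the text: use the product of the rates of the
    # individual quasi-identifiers contained in the combination.
    rate = 1.0
    for qi in key:
        rate *= REACH_RATE_TABLE[(qi,)]
    return rate

print(reach_rate(["age", "gender"]))   # -> 0.3 (stored combination)
print(reach_rate(["age"]))             # -> 0.6
```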
[動作の説明]
 次に、情報処理装置101の動作について、図3を用いて説明した第1の実施形態の動作と異なる動作を中心に説明する。
[Description of operation]
Next, an operation of the information processing apparatus 101 will be described focusing on an operation different from the operation of the first embodiment described with reference to FIG.
　情報処理装置101は、第1の実施形態と同様に、ステップS101及びステップS102を実行する。 The information processing apparatus 101 executes step S101 and step S102 as in the first embodiment.
 そして、情報処理装置101は、ステップS103に換えて、下記の動作を実行する。 Then, the information processing apparatus 101 executes the following operation instead of step S103.
 特定リスク計算部130は、識別リスク計算部120から識別リスクを受信する。そして、特定リスク計算部130は、取得部111に対して、指定された個人の準識別子又は準識別子の組合せの情報を渡し、特定個人到達率の取得を依頼する。 The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. Then, the specific risk calculation unit 130 passes the information on the specified individual semi-identifier or combination of semi-identifiers to the acquisition unit 111 and requests acquisition of the specific individual arrival rate.
 取得部111は、保存部211から、依頼された準識別子又は準識別子の組合せに対応する特定個人到達率を取得する。そして、取得部111は、特定個人到達率を、特定リスク計算部130に返信する。 The acquisition unit 111 acquires the specific individual arrival rate corresponding to the requested quasi-identifier or combination of quasi-identifiers from the storage unit 211. Then, the acquisition unit 111 returns the specific individual arrival rate to the specific risk calculation unit 130.
 例えば、匿名化対象の準識別子が、年齢及び性別の場合、取得部111は、特定個人到達率として、0.3を取得し、特定リスク計算部130に送信する。 For example, when the anonymization target quasi-identifier is age and gender, the acquisition unit 111 acquires 0.3 as the specific individual reach and transmits it to the specific risk calculation unit 130.
 特定リスク計算部130は、第1の実施形態のステップS104と同様に動作して、特定リスクを計算する。 The specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
[効果の説明]
 次に、第2の実施形態の効果について説明する。
[Description of effects]
Next, effects of the second embodiment will be described.
 第2の実施形態に係る情報処理装置101は、第1の実施形態の効果に加え、準識別子の属性に応じて特定リスクを計算できるとの効果を奏することができる。すなわち、第2の実施形態は、攻撃者が知っている可能性がある準識別子が異なる場合に対応できるとの効果を奏することができる。 In addition to the effect of the first embodiment, the information processing apparatus 101 according to the second embodiment can produce an effect that a specific risk can be calculated according to the attribute of the quasi-identifier. That is, the second embodiment can produce an effect that it can cope with a case where the quasi-identifiers that the attacker may know are different.
 その理由は、次のとおりである。 The reason is as follows.
 第1の取得部111が、準識別子又は準識別子の組合せに対応した特定個人到達率を取得する。そして、特定リスク計算部130が、準識別子又は準識別子の組合せに対応した特定個人到達率を基に、特定リスクを計算するためである。 The first acquisition unit 111 acquires a specific individual arrival rate corresponding to a quasi-identifier or a combination of quasi-identifiers. This is because the specific risk calculation unit 130 calculates the specific risk based on the specific individual arrival rate corresponding to the quasi-identifier or the combination of quasi-identifiers.
 <第3の実施の形態>
 第3の実施形態について図面を参照して説明する。
<Third Embodiment>
A third embodiment will be described with reference to the drawings.
 第3の実施形態に係る情報処理装置102は、準識別子の属性に対する条件の組合せに応じて特定個人到達率を決定する点が、第1の実施形態の情報処理装置100と異なる。 The information processing apparatus 102 according to the third embodiment differs from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is determined according to a combination of conditions for the attributes of the quasi-identifier.
[構成の説明]
 図面を参照して、第3の実施形態に係る情報処理装置102の構成について説明する。
[Description of configuration]
The configuration of the information processing apparatus 102 according to the third embodiment will be described with reference to the drawings.
 図9は、第3の実施形態に係る情報処理装置102の構成の一例を示すブロック図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。図9に示すように、情報処理装置102は、情報処理装置100の受信部110に換えて取得部112を含む。以下、取得部112を、受信部110と区別するため、第2の取得部と呼ぶ場合もある。さらに、情報処理装置102は、保存部212を含む。なお、情報処理装置102は、図17に示すコンピュータを用いて構成されてもよい。また、情報処理装置102は、保存部212を、ネットワークを介して接続する外部の装置としてもよい。また、取得部112は、以下で説明する保存部212が保存する情報を、第1の実施形態と同様に、図示しない外部の装置から受信してもよい。この場合、情報処理装置102は、保存部212を含まなくてもよい。 FIG. 9 is a block diagram illustrating an example of the configuration of the information processing apparatus 102 according to the third embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal. As illustrated in FIG. 9, the information processing apparatus 102 includes an acquisition unit 112 instead of the reception unit 110 of the information processing apparatus 100. Hereinafter, in order to distinguish the acquisition unit 112 from the reception unit 110, the acquisition unit 112 may be referred to as a second acquisition unit. Further, the information processing apparatus 102 includes a storage unit 212. The information processing apparatus 102 may be configured using a computer shown in FIG. Further, the information processing apparatus 102 may use the storage unit 212 as an external apparatus connected via a network. Further, the acquisition unit 112 may receive information stored in the storage unit 212 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 102 may not include the storage unit 212.
　保存部212は、条件となる第1の属性(第1の属性名)と、特定個人到達率を設定する第2の属性(属性名)と、設定される特定個人到達率を算出する関数(例えば、条件式及び計算式の組合せ)とを関連付けて保存する。なお、第1及び第2の属性は、複数の属性の組合せでもよい。 The storage unit 212 stores, in association with one another, a first attribute (first attribute name) serving as a condition, a second attribute (attribute name) for which the specific individual arrival rate is set, and a function (for example, a combination of a conditional expression and a calculation formula) for calculating the specific individual arrival rate to be set. The first and second attributes may each be a combination of a plurality of attributes.
 図10は、保存部212が保存するデータの一例を示す図である。図10に示すデータにおいて、判断に用いる属性が、上記の第1の属性であり、特定個人到達率を設定する属性が、第2の属性である。また、図10において設定される特定個人到達率が、設定される特定個人到達率を算出する関数である。図10に示す関数によれば、例えば、属性名(性別)の属性値(男性)の場合、属性(年齢)に対する特定個人到達率は、0.2である。また、その関数によれば、属性名(性別)の属性値(女性)の場合、属性(年齢)に対する特定個人到達率は、0.1である。以下の本実施形態の説明は、図10のデータを用いて説明する。 FIG. 10 is a diagram illustrating an example of data stored by the storage unit 212. In the data shown in FIG. 10, the attribute used for the determination is the first attribute, and the attribute for setting the specific individual arrival rate is the second attribute. Further, the specific individual arrival rate set in FIG. 10 is a function for calculating the set specific individual arrival rate. According to the function shown in FIG. 10, for example, in the case of the attribute value (male) of the attribute name (gender), the specific individual arrival rate for the attribute (age) is 0.2. Further, according to the function, in the case of the attribute value (female) of the attribute name (gender), the specific individual arrival rate for the attribute (age) is 0.1. The following description of the present embodiment will be described using the data of FIG.
　取得部112は、指定された属性に対応した特定個人到達率を取得する。取得部112は、特定個人到達率を取得する際に、必要に応じて、パーソナルデータ保存部200からパーソナルデータを取得してもよい。なお、準識別子の組合せの特定個人到達率を取得する場合、取得部112は、組合せに含まれる準識別子の特定個人到達率の積を、その準識別子の組合せの特定個人到達率としてもよい。あるいは、準識別子の組合せの特定個人到達率を取得する場合、取得部112は、準識別子の組合せの特定個人到達率として、組合せに含まれる準識別子の特定個人到達率の中で、最小値又は最大値となっている特定個人到達率を選択してもよい。 The acquisition unit 112 acquires the specific individual arrival rate corresponding to the specified attribute. When doing so, the acquisition unit 112 may acquire personal data from the personal data storage unit 200 as necessary. When acquiring the specific individual arrival rate of a combination of quasi-identifiers, the acquisition unit 112 may use the product of the specific individual arrival rates of the quasi-identifiers included in the combination as the specific individual arrival rate of the combination. Alternatively, the acquisition unit 112 may select, as the specific individual arrival rate of the combination, the minimum or maximum value among the specific individual arrival rates of the quasi-identifiers included in the combination.
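As an informal illustration, the conditional lookup and the combining rules described for the acquisition unit 112 could be sketched as follows. The rule table reflects the example of FIG. 10 (gender "male": 0.2 for age; gender "female": 0.1 for age); the default rate, the record format, and the function names are assumptions made only for this sketch.

```python
# Sketch of the conditional lookup performed by the acquisition unit 112
# (illustrative only; rules follow the FIG. 10 example described in the text).
RULES = [
    # (condition attribute, condition value, target attribute, reach rate)
    ("gender", "male",   "age", 0.2),
    ("gender", "female", "age", 0.1),
]

DEFAULT_RATE = 0.5  # assumed fallback when no rule matches

def conditional_reach_rate(record, target_attribute):
    """Return the reach rate of `target_attribute` for one person's record
    (a dict of attribute name -> attribute value)."""
    for cond_attr, cond_value, target, rate in RULES:
        if target == target_attribute and record.get(cond_attr) == cond_value:
            return rate
    return DEFAULT_RATE

def combined_reach_rate(record, target_attributes, mode="product"):
    """Combine per-attribute rates for a combination of quasi-identifiers.
    The text allows the product, the minimum or the maximum."""
    rates = [conditional_reach_rate(record, a) for a in target_attributes]
    if mode == "min":
        return min(rates)
    if mode == "max":
        return max(rates)
    result = 1.0
    for r in rates:
        result *= r
    return result
```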
[動作の説明]
　次に、情報処理装置102の動作について、図3を用いて説明した第1の実施形態の動作と異なる動作を中心に説明する。
[Description of operation]
Next, an operation of the information processing apparatus 102 will be described focusing on an operation different from the operation of the first embodiment described with reference to FIG.
　情報処理装置102は、第1の実施形態と同様に、ステップS101及びステップS102を実行する。 The information processing apparatus 102 executes step S101 and step S102 as in the first embodiment.
 そして、情報処理装置102は、ステップS103に換えて、下記の動作を実行する。 Then, the information processing apparatus 102 executes the following operation instead of step S103.
 特定リスク計算部130は、識別リスク計算部120から識別リスクを受信する。そして、特定リスク計算部130は、取得部112に対して、指定された個人の準識別子又は準識別子の組合せの情報を渡し、特定個人到達率の取得を依頼する。 The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. The specific risk calculation unit 130 then passes the information on the specified individual semi-identifier or combination of semi-identifiers to the acquisition unit 112 and requests acquisition of the specific individual reach.
 取得部112は、保存部212を参照し、指定された個人の準識別子の属性名と属性値に対応する特定個人到達率を取得する。そして、取得部112は、特定個人到達率を、特定リスク計算部130に返信する。 The obtaining unit 112 refers to the storage unit 212 and obtains the specific individual arrival rate corresponding to the attribute name and attribute value of the designated individual semi-identifier. Then, the acquisition unit 112 returns the specific individual arrival rate to the specific risk calculation unit 130.
 なお、特定リスク計算部130は、取得部112に、指定された個人の属性名と属性値とを送信してもよい。あるいは、特定リスク計算部130は、取得部112に、指定された個人の識別子、又は、指定された個人の属性名を送信してもよい。この場合、取得部112は、パーソナルデータ保存部200を参照して、判断に必要なデータを取得する。 Note that the specific risk calculation unit 130 may transmit the specified individual attribute name and attribute value to the acquisition unit 112. Alternatively, the specific risk calculation unit 130 may transmit the specified individual identifier or the specified individual attribute name to the acquisition unit 112. In this case, the acquisition unit 112 refers to the personal data storage unit 200 and acquires data necessary for the determination.
　例えば、特定リスク計算部130が、user1と属性名(年齢)とを送信したとする。この場合、取得部112は、パーソナルデータ保存部200のデータ(図2に示すデータ)から、user1の属性名(年齢)の属性値(20)を取得する。そして、取得部112は、保存部212に保存されているデータ(図10に示すデータ)を基に、特定個人到達率として、0.3を取得する。 For example, it is assumed that the specific risk calculation unit 130 transmits user1 and the attribute name (age). In this case, the acquisition unit 112 acquires the attribute value (20) of the attribute name (age) of user1 from the data in the personal data storage unit 200 (the data shown in FIG. 2). Then, the acquisition unit 112 acquires 0.3 as the specific individual arrival rate based on the data stored in the storage unit 212 (the data shown in FIG. 10).
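A possible usage of the sketch above, following the flow of the user1 example: the attribute value is first retrieved from the personal data storage unit 200, then the matching rule is applied. The stored record is an assumption made here; the value 0.3 quoted in the text presumably comes from a rule of FIG. 10 that is not reproduced in the sketch.

```python
# Illustrative flow for the user1 example (record contents are assumed).
personal_data_store = {"user1": {"age": 20, "gender": "female"}}

record = personal_data_store["user1"]
print(conditional_reach_rate(record, "age"))  # -> 0.1 with the rules sketched above
```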
 特定リスク計算部130は、第1の実施形態のステップS104と同様に動作して、特定リスクを計算する。 The specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
[効果の説明]
 次に、第3の実施形態の効果について説明する。
[Description of effects]
Next, effects of the third embodiment will be described.
 第3の実施形態に係る情報処理装置102は、第1の実施形態の効果に加え、属性名及び属性値に応じて特定個人到達率を決定できるとの効果を奏することができる。すなわち、第3の実施形態は、より細かな条件に対応した適切な特定リスクを計算できるとの効果を奏することができる。 In addition to the effects of the first embodiment, the information processing apparatus 102 according to the third embodiment can achieve the effect that the specific individual arrival rate can be determined according to the attribute name and the attribute value. That is, the third embodiment can produce an effect that an appropriate specific risk corresponding to a finer condition can be calculated.
 その理由は、次のとおりである。 The reason is as follows.
 第2の取得部112が、準識別子である属性値又は属性値の組合せに対応して設定された条件を基に特定個人到達率を取得する。そして、特定リスク計算部130が、条件に対応した特定個人到達率を基に、特定リスクを計算するためである。 The second acquisition unit 112 acquires the specific individual arrival rate based on the condition set corresponding to the attribute value or the combination of attribute values that are quasi-identifiers. This is because the specific risk calculation unit 130 calculates the specific risk based on the specific individual arrival rate corresponding to the condition.
 <第4の実施の形態>
 第4の実施形態について図面を参照して説明する。
<Fourth embodiment>
A fourth embodiment will be described with reference to the drawings.
 第4の実施形態に係る情報処理装置103は、個人識別リスクに応じて特定個人到達率を決定する点が、第1の実施形態に係る情報処理装置100と異なる。 The information processing apparatus 103 according to the fourth embodiment is different from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is determined according to the individual identification risk.
[構成の説明]
 図面を参照して、第4の実施形態に係る情報処理装置103の構成について説明する。
[Description of configuration]
The configuration of the information processing apparatus 103 according to the fourth embodiment will be described with reference to the drawings.
 図11は、第4の実施形態に係る情報処理装置103の構成の一例を示すブロック図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。図11に示すように、情報処理装置103は、情報処理装置100の受信部110に換えて取得部113を含む。以下、取得部113を、取得部112などと区別するため、第3の取得部と呼ぶ場合もある。さらに、情報処理装置103は、保存部213を含む。なお、情報処理装置103は、図17に示すコンピュータを用いて構成されてもよい。また、情報処理装置103は、保存部213を、ネットワークを介して接続する外部の装置としてもよい。また、取得部113は、以下で説明する保存部213が保存する情報を、第1の実施形態と同様に、図示しない外部の装置から受信してもよい。この場合、情報処理装置103は、保存部213を含まなくてもよい。 FIG. 11 is a block diagram illustrating an example of the configuration of the information processing apparatus 103 according to the fourth embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal. As illustrated in FIG. 11, the information processing apparatus 103 includes an acquisition unit 113 instead of the reception unit 110 of the information processing apparatus 100. Hereinafter, the acquisition unit 113 may be referred to as a third acquisition unit in order to distinguish it from the acquisition unit 112 or the like. Further, the information processing apparatus 103 includes a storage unit 213. The information processing apparatus 103 may be configured using a computer shown in FIG. Further, the information processing apparatus 103 may use the storage unit 213 as an external apparatus connected via a network. Further, the acquisition unit 113 may receive information stored in the storage unit 213 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 103 may not include the storage unit 213.
 保存部213は、識別リスクと特定個人到達率とを関連付けて保存する。 The storage unit 213 stores the identification risk and the specific individual arrival rate in association with each other.
　図12は、保存部213が保存するデータの一例を示す図である。特定個人到達率を高くすることは、特定リスクを高くすることである。そして、特定リスクを用いた匿名化処理は、特定リスクが高いほど、高い匿名性となるように、つまり、特定されにくいように匿名化を実行する。そのため、図12に示す特定個人到達率は、識別リスク(1/m)が大きいほど、特定個人到達率が高くなっている。これは、上記のとおり、稀な属性(識別リスクが高い属性)ほど、特定されにくく(特定個人到達率を高く)するためである。ただし、これは、本実施形態の一例である。本実施形態は、このようなデータに限る必要はない。なお、以下の本実施形態の説明は、図12のデータを用いて説明する。 FIG. 12 is a diagram illustrating an example of data stored by the storage unit 213. Increasing the specific individual arrival rate means increasing the specific risk. Anonymization processing using the specific risk performs anonymization so that the higher the specific risk is, the higher the resulting anonymity becomes, that is, the harder the individual is to specify. Therefore, in the data shown in FIG. 12, the larger the identification risk (1/m) is, the higher the specific individual arrival rate is. This is because, as described above, a rarer attribute (an attribute with a higher identification risk) should be made harder to specify, which is achieved by setting a higher specific individual arrival rate. However, this is merely an example of this embodiment, and this embodiment is not limited to such data. The following description of this embodiment uses the data of FIG. 12.
 取得部113は、識別リスクを基に、特定個人到達率を取得する。 The acquisition unit 113 acquires the specific individual arrival rate based on the identification risk.
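The mapping from identification risk to specific individual arrival rate performed by the acquisition unit 113 could, for illustration, be held as a simple threshold table. Only the monotonic tendency of FIG. 12 and the example given later in the operation description (an identification risk of 0.6 yielding a rate of 0.8) come from the text; the remaining thresholds and the function name are assumptions.

```python
# Sketch of the lookup performed by the acquisition unit 113 (illustrative only).
RISK_TO_REACH_RATE = [
    # (identification risk is at least ..., reach rate)
    (0.8, 0.9),
    (0.6, 0.8),   # consistent with the example in the text
    (0.4, 0.6),
    (0.0, 0.4),
]

def reach_rate_from_identification_risk(identification_risk):
    """Return the specific individual arrival rate associated with an
    identification risk (1/m); higher risk maps to a higher rate."""
    for threshold, rate in RISK_TO_REACH_RATE:
        if identification_risk >= threshold:
            return rate
    return RISK_TO_REACH_RATE[-1][1]

print(reach_rate_from_identification_risk(0.6))  # -> 0.8, as in the text's example
```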
[動作の説明]
 次に、情報処理装置103の動作について、図3を用いて説明した第1の実施形態の動作と異なる動作を中心に説明する。
[Description of operation]
Next, the operation of the information processing apparatus 103 will be described focusing on the operation different from the operation of the first embodiment described with reference to FIG.
　情報処理装置103は、第1の実施形態と同様に、ステップS101及びステップS102を実行する。 The information processing apparatus 103 executes step S101 and step S102 as in the first embodiment.
 そして、情報処理装置103は、ステップS103に換えて、下記の動作を実行する。 Then, the information processing apparatus 103 performs the following operation instead of step S103.
 特定リスク計算部130は、識別リスク計算部120から識別リスクを受信する。そして、特定リスク計算部130は、取得部113に対して、指定された個人の識別リスク条件の準識別子の情報を渡し、特定個人到達率の取得を依頼する。 The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. Then, the specific risk calculation unit 130 passes the quasi-identifier information of the specified individual identification risk condition to the acquisition unit 113 and requests acquisition of the specific individual arrival rate.
 取得部113は、識別リスク計算部120から個人の識別の可能性(識別リスク)を取得する。その後、取得部113は、保存部213を参照し、特定個人到達率を取得する。 The acquisition unit 113 acquires the possibility of individual identification (identification risk) from the identification risk calculation unit 120. Thereafter, the acquisition unit 113 refers to the storage unit 213 and acquires the specific individual arrival rate.
 例えば、識別リスクが0.6の場合、取得部113は、保存部213(図12に示すデータ)を参照して、特定個人到達率として0.8を取得する。 For example, when the identification risk is 0.6, the acquisition unit 113 refers to the storage unit 213 (data illustrated in FIG. 12) and acquires 0.8 as the specific individual arrival rate.
 そして、取得部113は、特定個人到達率を、特定リスク計算部130に返信する。 Then, the acquisition unit 113 returns the specific individual arrival rate to the specific risk calculation unit 130.
 特定リスク計算部130は、第1の実施形態のステップS104と同様に動作して、特定リスクを計算する。 The specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
[効果の説明]
 次に、第4の実施形態の効果について説明する。
[Description of effects]
Next, the effect of the fourth embodiment will be described.
 第4の実施形態に係る情報処理装置103は、第1の実施形態の効果に加え、識別リスクに応じて適切な特定リスクを計算できるとの効果を奏することができる。 In addition to the effect of the first embodiment, the information processing apparatus 103 according to the fourth embodiment can produce an effect that an appropriate specific risk can be calculated according to the identification risk.
 その理由は、次の通りである。 The reason is as follows.
 第3の取得部113が、識別リスクを考慮して、特定個人到達率を取得する。そして、特定リスク計算部130が、その特定個人到達率を基に、特定リスクを計算するためである。 The third acquisition unit 113 acquires the specific individual arrival rate in consideration of the identification risk. Then, the specific risk calculation unit 130 calculates the specific risk based on the specific individual arrival rate.
　例えば、誕生日が準識別子であるとする。この場合、うるう年の2月29日が誕生日の人は、他の日が誕生日の人に比べ、準識別子(誕生日)を基に個人を識別されやすい。そのため、2月29日が誕生日の人は、他の人に比べ誕生日(準識別子)を隠す必要性が高い。つまり、準識別子が誕生日の場合、2月29日が誕生日の人における識別リスクは、他の人における識別リスクより高い。そのような場合でも、本実施形態は、識別リスクに応じて適切な特定個人到達率を取得し、特定リスクを計算できる。 For example, assume that the birthday is a quasi-identifier. In this case, a person whose birthday is February 29 of a leap year is more easily identified based on the quasi-identifier (birthday) than a person whose birthday falls on another day. Therefore, a person whose birthday is February 29 has a greater need to hide the birthday (quasi-identifier) than other people. That is, when the quasi-identifier is the birthday, the identification risk for a person whose birthday is February 29 is higher than the identification risk for other people. Even in such a case, this embodiment can acquire an appropriate specific individual arrival rate according to the identification risk and calculate the specific risk.
 <第5の実施の形態>
 第5の実施形態について図面を参照して説明する。
<Fifth embodiment>
A fifth embodiment will be described with reference to the drawings.
　第5の実施形態に係る情報処理装置104は、パーソナルデータの提供先の組織(又は、攻撃者となる可能性のある組織)に応じて特定個人到達率を変える点が、第1の実施形態の情報処理装置100と異なる。これは、パーソナルデータの提供先の組織(相手)は、提供したパーソナルデータに対する攻撃者となる可能性があるためである。そして、それぞれの組織は、攻撃者として異なるリスクを備えるためである。 The information processing apparatus 104 according to the fifth embodiment differs from the information processing apparatus 100 of the first embodiment in that the specific individual arrival rate is changed according to the organization to which the personal data is provided (an organization that may become an attacker). This is because the organization (partner) to which the personal data is provided may become an attacker against the provided personal data, and each such organization carries a different risk as an attacker.
[構成の説明]
 図面を参照して、第5の実施形態に係る情報処理装置104の構成について説明する。
[Description of configuration]
The configuration of the information processing apparatus 104 according to the fifth embodiment will be described with reference to the drawings.
 図13は、第5の実施形態に係る情報処理装置104の構成の一例を示すブロック図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。図13に示すように、情報処理装置104は、情報処理装置100の受信部110に換えて、取得部114を含む。以下、取得部114を、取得部112などと区別する場合、第4の取得部と呼ぶ。さらに、情報処理装置104は、保存部214を含む。なお、情報処理装置104は、図17に示すコンピュータを用いて構成されてもよい。また、情報処理装置104は、保存部214を、ネットワークを介して接続する外部の装置としてもよい。また、取得部114は、以下で説明する保存部214が保存する情報を、第1の実施形態と同様に、図示しない外部の装置から受信してもよい。この場合、情報処理装置104は、保存部214を含まなくてもよい。 FIG. 13 is a block diagram illustrating an example of the configuration of the information processing apparatus 104 according to the fifth embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal. As illustrated in FIG. 13, the information processing device 104 includes an acquisition unit 114 instead of the reception unit 110 of the information processing device 100. Hereinafter, when the acquisition unit 114 is distinguished from the acquisition unit 112 or the like, it is referred to as a fourth acquisition unit. Further, the information processing apparatus 104 includes a storage unit 214. The information processing apparatus 104 may be configured using a computer shown in FIG. Further, the information processing apparatus 104 may use the storage unit 214 as an external apparatus connected via a network. Further, the acquisition unit 114 may receive information stored in the storage unit 214 described below from an external device (not shown) as in the first embodiment. In this case, the information processing apparatus 104 may not include the storage unit 214.
 保存部214は、情報の提供先と、提供先に応じた特定個人到達率とを関連付けて保存する。 The storage unit 214 stores the information providing destination in association with the specific individual arrival rate corresponding to the providing destination.
 図14は、保存部214が保存するデータの一例を示す図である。例えば、人数が多い組織は、対象となる個人を知っている者が含まれる可能性が高い。ここで、例えば、組織Bの会員数が、組織Aの会員数より多いとする。その場合、組織Bにおける特定個人到達率は、組織Aにおける特定個人到達率より大きいことが必要となる。そこで、図14において、提供先の組織Bにおける特定個人到達率は、組織Aにおける特定個人到達率より大きな値となっている。なお、保存部214は、図14に示すように、提供先として保存する組織として複数の種類(例えば、図14に示す組織と業種)を含んでもよい。以下の本実施形態の説明は、図14のデータを用いて説明する。 FIG. 14 is a diagram illustrating an example of data stored by the storage unit 214. For example, an organization with a large number of people is likely to include a person who knows the target individual. Here, for example, it is assumed that the number of members of the organization B is larger than the number of members of the organization A. In that case, the specific individual arrival rate in the organization B needs to be larger than the specific individual arrival rate in the organization A. Therefore, in FIG. 14, the specific individual arrival rate in the organization B of the provision destination is a value larger than the specific individual arrival rate in the organization A. As illustrated in FIG. 14, the storage unit 214 may include a plurality of types (for example, the organization and the business type illustrated in FIG. 14) as the organization to be stored as a providing destination. The following description of the present embodiment will be described using the data of FIG.
 取得部114は、保存部214から、特定個人到達率を取得する。なお、準識別子の組合せの特定個人到達率を取得する場合、取得部114は、組合せに含まれる準識別子の特定個人到達率の積を、その準識別子の組合せの特定個人到達率としてもよい。あるいは、取得部114は、保存部214から、提供先に関する情報(例えば、会員数)を取得し、その情報を基に提供先に応じた特定個人到達率を計算してもよい。 The acquisition unit 114 acquires the specific individual arrival rate from the storage unit 214. When acquiring the specific individual arrival rate of the combination of quasi-identifiers, the acquiring unit 114 may use the product of the specific individual arrival rates of the quasi-identifiers included in the combination as the specific individual arrival rate of the combination of the quasi-identifiers. Or the acquisition part 114 may acquire the information (for example, the number of members) regarding a provision destination from the preservation | save part 214, and may calculate the specific individual arrival rate according to a provision destination based on the information.
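A minimal sketch of the destination-dependent lookup described for the acquisition unit 114. The value 0.5 for organization B and the gender attribute follows the example given later in the operation description; the value for organization A, the member-count fallback, and the function name are assumptions made only for this sketch.

```python
# Sketch of the lookup performed by the acquisition unit 114 (illustrative only).
DESTINATION_REACH_RATE = {
    ("organization A", "gender"): 0.3,   # assumed value
    ("organization B", "gender"): 0.5,   # value from the text's example
}

def destination_reach_rate(destination, attribute, member_count=None, population=None):
    """Return the reach rate for a providing destination and attribute. If no
    entry is stored, optionally derive a rate from the destination's member
    count, as the text suggests (e.g. members / population)."""
    key = (destination, attribute)
    if key in DESTINATION_REACH_RATE:
        return DESTINATION_REACH_RATE[key]
    if member_count is not None and population:
        return member_count / population
    raise KeyError(f"no reach rate known for {key}")

print(destination_reach_rate("organization B", "gender"))  # -> 0.5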
[動作の説明]
 次に、情報処理装置104の動作について、図3を用いて説明した第1の実施形態の動作と異なる動作を中心に説明する。
[Description of operation]
Next, operations of the information processing apparatus 104 will be described focusing on operations different from the operations of the first embodiment described with reference to FIG.
　情報処理装置104は、第1の実施形態と同様に、ステップS101及びステップS102を実行する。 The information processing apparatus 104 executes step S101 and step S102 as in the first embodiment.
 そして、情報処理装置104は、ステップS103に換えて、下記の動作を実行する。 Then, the information processing apparatus 104 executes the following operation instead of step S103.
 特定リスク計算部130は、識別リスク計算部120から識別リスクを受信する。そして、特定リスク計算部130は、取得部114に、特定個人到達率の取得を依頼する。このとき、特定リスク計算部130は、取得部114に、相手(攻撃者)として設定されている提供先と属性との情報を送信する。 The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. Then, the specific risk calculation unit 130 requests the acquisition unit 114 to acquire the specific individual arrival rate. At this time, the specific risk calculation unit 130 transmits to the acquisition unit 114 information on the provision destination and the attribute set as the opponent (attacker).
 取得部114は、保存部214のデータを基に、受信した提供先及び属性に対応する特定個人到達率を取得する。 The acquisition unit 114 acquires the specific individual arrival rate corresponding to the received destination and attribute based on the data of the storage unit 214.
 例えば、提供先が組織Bであり、属性が性別である場合、取得部114は、特定個人到達率として0.5を取得する。 For example, when the providing destination is the organization B and the attribute is gender, the acquisition unit 114 acquires 0.5 as the specific individual arrival rate.
 そして、取得部114は、特定リスク計算部130に特定個人到達率の情報を返信する。 Then, the acquisition unit 114 returns information on the specific individual arrival rate to the specific risk calculation unit 130.
 特定リスク計算部130は、第1の実施形態のステップS104と同様に動作して、特定リスクを計算する。 The specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
[効果の説明]
 次に、第5の実施形態の効果について説明する。
[Description of effects]
Next, effects of the fifth exemplary embodiment will be described.
　第5の実施形態に係る情報処理装置104は、第1の実施形態の効果に加え、パーソナルデータを提供する組織(攻撃者となる可能性のある組織)に応じて特定個人到達率を変えることができるとの効果を奏することができる。 In addition to the effects of the first embodiment, the information processing apparatus 104 according to the fifth embodiment can achieve the effect of being able to change the specific individual arrival rate according to the organization to which the personal data is provided (an organization that may become an attacker).
 その理由は、次のとおりである。 The reason is as follows.
　第4の取得部114が、パーソナルデータを提供する組織(相手)に対応した特定個人到達率を取得する。そして、特定リスク計算部130が、組織(相手)に対応した特定個人到達率を基に、特定リスクを計算するためである。 The fourth acquisition unit 114 acquires the specific individual arrival rate corresponding to the organization (partner) to which the personal data is provided. Then, the specific risk calculation unit 130 calculates the specific risk based on the specific individual arrival rate corresponding to that organization (partner).
 <第6の実施の形態>
 第6の実施形態について図面を参照して説明する。
<Sixth Embodiment>
A sixth embodiment will be described with reference to the drawings.
 第6の実施形態に係る情報処理装置105は、公開されているデータ(公開情報)を用いて特定個人到達率を計算するという点が、第1の実施形態の情報処理装置100と異なる。 The information processing apparatus 105 according to the sixth embodiment is different from the information processing apparatus 100 according to the first embodiment in that the specific individual arrival rate is calculated using publicly available data (public information).
　ここで、公開情報とは、一般に公開されているデータ(公知のデータ)である。例えば、公開情報とは、データの提供先の会員の分布(例えば、10代の会員が1万人、20代の会員が1.5万人、30代の会員が1万人といった情報)である。あるいは、公開情報とは、twitter(登録商標)などのインターネット上に公開されている情報(例えば、「user1は、10代で、位置情報を公開している」)である。ただし、公開情報の公開範囲は、インターネットの情報のように、公開範囲に制限がない情報に限る必要はない。例えば、公開情報は、所定の組織(例えば、インターネットプロバイダ)に登録されている会員に公開されているような、公開範囲がある程度に限られる情報でもよい。 Here, public information is data that is open to the public (publicly known data). For example, public information is the distribution of the members of the organization to which the data is provided (for example, information such as 10,000 members in their teens, 15,000 members in their twenties, and 10,000 members in their thirties). Alternatively, public information is information published on the Internet, such as on twitter (registered trademark) (for example, "user1 is in his/her teens and discloses location information"). However, the disclosure range of public information need not be limited to information with an unrestricted disclosure range, such as information on the Internet. For example, public information may be information whose disclosure range is limited to some extent, such as information disclosed only to members registered with a predetermined organization (for example, an Internet provider).
[構成の説明]
 図面を参照して、第6の実施形態に係る情報処理装置105の構成ついて説明する。
[Description of configuration]
The configuration of the information processing apparatus 105 according to the sixth embodiment will be described with reference to the drawings.
 図15は、第6の実施形態に係る情報処理装置105の構成の一例を示すブロック図である。なお、図面中の矢印の方向は、一例を示すものであり、信号の向きを限定するものではない。図15に示すように、情報処理装置105は、情報処理装置100の受信部110に換えて、特定個人到達率計算部115を含む。さらに、情報処理装置105は、公開分布情報保存部215を含む。なお、情報処理装置105は、図17に示すコンピュータを用いて構成されてもよい。また、情報処理装置105は、公開分布情報保存部215を、ネットワークを介して接続する外部の装置としてもよい。また、特定個人到達率計算部115は、以下で説明する公開分布情報保存部215が保存する情報を、第1の実施形態と同様に、図示しない外部の装置から受信してもよい。この場合、情報処理装置105は、公開分布情報保存部215を含まなくてもよい。 FIG. 15 is a block diagram illustrating an example of the configuration of the information processing apparatus 105 according to the sixth embodiment. In addition, the direction of the arrow in a drawing shows an example and does not limit the direction of a signal. As illustrated in FIG. 15, the information processing device 105 includes a specific individual arrival rate calculation unit 115 instead of the reception unit 110 of the information processing device 100. Further, the information processing apparatus 105 includes a public distribution information storage unit 215. The information processing apparatus 105 may be configured using a computer shown in FIG. Further, the information processing apparatus 105 may use the public distribution information storage unit 215 as an external apparatus connected via a network. Further, the specific individual arrival rate calculation unit 115 may receive information stored in the public distribution information storage unit 215 described below from an external device (not shown), as in the first embodiment. In this case, the information processing apparatus 105 may not include the public distribution information storage unit 215.
 公開分布情報保存部215は、公開情報を保存する。 The public distribution information storage unit 215 stores public information.
 特定個人到達率計算部115は、公開情報を基に、特定個人到達率を計算する。 The specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate based on the public information.
[動作の説明]
 次に、情報処理装置105の動作について、図3を用いて説明した第1の実施形態の動作と異なる動作を中心に説明する。
[Description of operation]
Next, the operation of the information processing apparatus 105 will be described focusing on the operation different from the operation of the first embodiment described with reference to FIG.
　情報処理装置105は、第1の実施形態と同様に、ステップS101及びステップS102を実行する。 The information processing apparatus 105 executes step S101 and step S102 as in the first embodiment.
 そして、情報処理装置105は、ステップS103に換えて、下記の動作を実行する。 Then, the information processing apparatus 105 executes the following operation instead of step S103.
 特定リスク計算部130は、識別リスク計算部120から識別リスクを受信する。そして、特定リスク計算部130は、特定個人到達率計算部115に特定個人到達率の計算を依頼する。 The specific risk calculation unit 130 receives the identification risk from the identification risk calculation unit 120. Then, the specific risk calculation unit 130 requests the specific individual arrival rate calculation unit 115 to calculate the specific individual arrival rate.
　特定個人到達率計算部115は、公開分布情報保存部215に保存された公開情報と、パーソナルデータ保存部200に保存されているパーソナルデータとを用いて、特定個人到達率を計算し、計算結果を特定リスク計算部130へ返信する。 The specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate using the public information stored in the public distribution information storage unit 215 and the personal data stored in the personal data storage unit 200, and returns the calculation result to the specific risk calculation unit 130.
 特定個人到達率計算部115の計算は、特に制限はなく、必要とされるリスクに合わせて設定されればよい。例えば、特定個人到達率計算部115は、パーソナルデータの分布と、公開情報におけるパーソナルデータの提供先の組織におけるデータの分布とを用いて、特定個人到達率を計算してもよい。 The calculation of the specific individual arrival rate calculation unit 115 is not particularly limited and may be set according to the required risk. For example, the specific individual arrival rate calculation unit 115 may calculate the specific individual arrival rate using the distribution of personal data and the distribution of data in the organization to which the personal data is provided in the public information.
 次に、より具体的な計算例として、2つの計算例を説明する。 Next, two calculation examples will be described as more specific calculation examples.
 第1の計算例は、対象となる組織の会員数を用いる計算例である。ここで、公開情報は、「A社の未成年会員数は、1000万人(日本の人口の10%)である。」とする。特定個人到達率計算部115は、この公開情報に基づいて、A社が、10%の確率で準識別子を知っていることが分かる。そこで、特定個人到達率計算部115は、公開情報(人口比率)を基に、特定個人到達率を0.1と計算する。 The first calculation example is a calculation example using the number of members of the target organization. Here, the public information is “Company A has 10 million minor members (10% of Japan's population)”. Based on this public information, the specific individual arrival rate calculation unit 115 knows that Company A knows the quasi-identifier with a probability of 10%. Therefore, the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate as 0.1 based on the public information (population ratio).
　第2の計算例は、対象となる組織の会員の分布を用いる計算例である。ここで、公開情報は、「10代の会員数が10,000人で、10代の位置情報公開会員数が1,000人で、20代の会員数が20,000人で、20代の位置情報公開会員数1,000人である。」とする。この場合、10代の会員における位置情報の公開率は、0.1(=1000/10000)である。同様に、20代の会員における位置情報の公開率は、0.05(=1000/20000)である。そこで、特定個人到達率計算部115は、「位置情報の公開率=特定個人到達率」との想定を基に、10代の会員における特定個人到達率を0.1と計算し、20代の会員における特定個人到達率を0.05と計算する。 The second calculation example uses the distribution of the members of the target organization. Here, the public information is: "The number of members in their teens is 10,000, of whom 1,000 disclose their location information; the number of members in their twenties is 20,000, of whom 1,000 disclose their location information." In this case, the disclosure rate of location information among members in their teens is 0.1 (= 1000/10000). Similarly, the disclosure rate of location information among members in their twenties is 0.05 (= 1000/20000). Therefore, based on the assumption that "the disclosure rate of location information = the specific individual arrival rate", the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate as 0.1 for members in their teens and as 0.05 for members in their twenties.
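For illustration, the two calculation examples could be expressed in Python as follows. The numeric inputs come from the text (10 million members stated to be 10% of the population, and the member and location-disclosure counts above); the function names and data structures are assumptions made for this sketch.

```python
# Sketch of the two calculations described for the specific individual
# arrival rate calculation unit 115 (illustrative only).

# First example: ratio of the destination's members to the population.
def reach_rate_from_membership(member_count, population):
    return member_count / population

# 10 million members; the text states this is 10% of the population.
print(reach_rate_from_membership(10_000_000, 100_000_000))  # -> 0.1

# Second example: per-age-group disclosure rate of location information,
# treated as the specific individual arrival rate for that group.
member_distribution = {
    "10s": {"members": 10_000, "location_public": 1_000},
    "20s": {"members": 20_000, "location_public": 1_000},
}

def reach_rate_by_group(distribution):
    return {group: d["location_public"] / d["members"]
            for group, d in distribution.items()}

print(reach_rate_by_group(member_distribution))  # -> {'10s': 0.1, '20s': 0.05}
```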
 そして、特定個人到達率計算部115は、特定個人到達率を、特定リスク計算部130に返信する。 Then, the specific individual arrival rate calculation unit 115 returns the specific individual arrival rate to the specific risk calculation unit 130.
 特定リスク計算部130は、第1の実施形態のステップS104と同様に動作して、特定リスクを計算する。 The specific risk calculation unit 130 operates in the same manner as step S104 of the first embodiment, and calculates a specific risk.
[効果の説明]
 次に、第6の実施形態の効果について説明する。
[Description of effects]
Next, the effect of the sixth embodiment will be described.
　第6の実施形態に係る情報処理装置105は、第1の実施形態の効果に加え、特定個人到達率の受信及び保存の動作を削減するという効果を奏する。 In addition to the effects of the first embodiment, the information processing apparatus 105 according to the sixth embodiment has the effect of reducing the operations of receiving and storing the specific individual arrival rate.
 その理由は、特定個人到達率計算部115が、公開情報を基に、特定個人到達率を計算するためである。 The reason is that the specific individual arrival rate calculation unit 115 calculates the specific individual arrival rate based on the public information.
 <その他の実施形態>
 上記の第1ないし第6の実施形態は、組み合わせてもよい。例えば、本発明の実施形態に係る特定リスク計算部130は、第2の実施形態において説明した準識別子に対応した特定個人到達率と、第4の実施形態において説明した識別リスクに対応した特定個人到達率とを用いて、特定リスクを計算してもよい。
<Other embodiments>
The above first to sixth embodiments may be combined. For example, the specific risk calculation unit 130 according to the embodiment of the present invention uses the specific individual arrival rate corresponding to the quasi-identifier described in the second embodiment and the specific individual corresponding to the identification risk described in the fourth embodiment. The specific risk may be calculated using the arrival rate.
　あるいは、情報処理システム310は、第1の実施形態に係る情報処理装置100に換えて、第2の実施形態に係る情報処理装置101ないし第6の実施形態に係る情報処理装置105を含んでもよい。 Alternatively, the information processing system 310 may include, instead of the information processing apparatus 100 according to the first embodiment, any one of the information processing apparatuses 101 to 105 according to the second to sixth embodiments.
 以上、実施形態を参照して本願発明を説明したが、本願発明は上記実施形態に限定されるものではない。本願発明の構成及び詳細には、本願発明のスコープ内で当業者が理解し得る様々な変更をすることができる。 The present invention has been described above with reference to the embodiments, but the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
 この出願は、2014年10月29日に出願された日本出願特願2014-219808を基礎とする優先権を主張し、その開示の全てをここに取り込む。 This application claims priority based on Japanese Patent Application No. 2014-219808 filed on October 29, 2014, the entire disclosure of which is incorporated herein.
 本発明は、個人の特定リスクを計算するツールに利用可能である。また、本発明は、パーソナルデータを「個人の特定性を低減したデータ」となるように、データを加工する際に利用可能である。 The present invention can be used as a tool for calculating a specific risk of an individual. The present invention can also be used when processing data so that the personal data becomes “data with reduced individual specificity”.
 100 情報処理装置
 101 情報処理装置
 102 情報処理装置
 103 情報処理装置
 104 情報処理装置
 105 情報処理装置
 106 情報処理装置
 110 受信部
 111 取得部
 112 取得部
 113 取得部
 114 取得部
 115 特定個人到達率計算部
 120 識別リスク計算部
 130 特定リスク計算部
 200 パーソナルデータ保存部
 211 保存部
 212 保存部
 213 保存部
 214 保存部
 215 公開分布情報保存部
 230 特定リスク計算結果保存部
 240 全体リスク計算部
 300 情報処理システム
 310 情報処理システム
 600 情報処理装置
 610 CPU
 620 ROM
 630 RAM
 640 内部記憶装置
 650 IOC
 660 入力機器
 670 表示機器
 680 NIC
 700 記憶媒体
DESCRIPTION OF SYMBOLS
100 Information processing apparatus
101 Information processing apparatus
102 Information processing apparatus
103 Information processing apparatus
104 Information processing apparatus
105 Information processing apparatus
106 Information processing apparatus
110 Reception unit
111 Acquisition unit
112 Acquisition unit
113 Acquisition unit
114 Acquisition unit
115 Specific individual arrival rate calculation unit
120 Identification risk calculation unit
130 Specific risk calculation unit
200 Personal data storage unit
211 Storage unit
212 Storage unit
213 Storage unit
214 Storage unit
215 Public distribution information storage unit
230 Specific risk calculation result storage unit
240 Overall risk calculation unit
300 Information processing system
310 Information processing system
600 Information processing apparatus
610 CPU
620 ROM
630 RAM
640 Internal storage device
650 IOC
660 Input device
670 Display device
680 NIC
700 Storage medium

Claims (10)

  1. 指定された個人に関するデータが誰か一人のデータであると判断される可能性を示す識別リスクを計算する識別リスク計算手段と、
     前記指定された個人のデータが指定された個人のデータであると判断される可能性を示す特定個人到達率と、前記識別リスクとを基に、前記指定された個人のデータが前記指定された個人のデータであると判断される可能性を示す特定リスクを計算する特定リスク計算手段と
     を含む情報処理装置。
An information processing apparatus comprising: identification risk calculation means for calculating an identification risk indicating a possibility that data on a designated individual is determined to be data of a single person; and
specific risk calculation means for calculating, based on the identification risk and a specific individual arrival rate indicating a possibility that the data of the designated individual is determined to be the data of the designated individual, a specific risk indicating a possibility that the data of the designated individual is determined to be the data of the designated individual.
  2. 前記特定個人到達率を受信する受信手段
     を含む請求項1に記載の情報処理装置。
    The information processing apparatus according to claim 1, further comprising receiving means for receiving the specific individual arrival rate.
  3. 前記指定された個人に関するデータにおける準識別子の属性に対応した前記特定個人到達率を取得する第1の取得手段
     を含む請求項1又は2に記載の情報処理装置。
The information processing apparatus according to claim 1 or 2, further comprising: a first acquisition unit configured to acquire the specific individual arrival rate corresponding to the attribute of the quasi-identifier in the data regarding the specified individual.
  4. 前記指定された個人に関するデータにおける属性に対する条件の組合せに対応した前記特定個人到達率を取得する第2の取得手段
     を含む請求項1ないし3のいずれか1項に記載の情報処理装置。
The information processing apparatus according to any one of claims 1 to 3, further comprising: a second acquisition unit configured to acquire the specific individual arrival rate corresponding to a combination of conditions for attributes in the data related to the designated individual.
  5. 前記識別リスクに対応した前記特定個人到達率を取得する第3の取得手段
     を含む請求項1ないし4のいずれか1項に記載の情報処理装置。
The information processing apparatus according to any one of claims 1 to 4, further comprising: a third acquisition unit that acquires the specific individual arrival rate corresponding to the identification risk.
  6. 前記パーソナルデータの提供先に対応して前記特定個人到達率を取得する第4の取得手段
     を含む請求項1ないし5のいずれか1項に記載の情報処理装置。
The information processing apparatus according to any one of claims 1 to 5, further comprising: a fourth acquisition unit configured to acquire the specific individual arrival rate corresponding to the providing destination of the personal data.
  7. 公開情報とパーソナルデータとを基に前記特定個人到達率を計算する特定個人到達率計算手段
     を含む請求項1ないし6のいずれか1項に記載の情報処理装置。
    The information processing apparatus according to any one of claims 1 to 6, further comprising: a specific individual arrival rate calculating unit that calculates the specific individual arrival rate based on public information and personal data.
  8. 請求項1ないし7のいずれか1項に記載の情報処理装置と、
     複数の個人に関する情報であるパーソナルデータを保存するパーソナル情報保存手段と、
　　　前記情報処理装置が計算した、前記パーソナルデータに含まれる全ての個人に関するデータに対応する特定リスクを基に、前記パーソナルデータの全体に対応するリスクを計算する全体リスク計算手段と
     を含む情報処理システム。
An information processing system comprising: the information processing apparatus according to any one of claims 1 to 7;
personal information storage means for storing personal data that is information on a plurality of individuals; and
overall risk calculation means for calculating a risk corresponding to the entire personal data, based on the specific risks calculated by the information processing apparatus for the data on all individuals included in the personal data.
  9. 指定された個人に関するデータが誰か一人のデータであると判断される可能性を示す識別リスクを計算し、
     前記指定された個人のデータが指定された個人のデータであると判断される可能性を示す特定個人到達率と、前記識別リスクとを基に、前記指定された個人のデータが前記指定された個人のデータであると判断される可能性を示す特定リスクを計算する
     情報処理方法。
An information processing method comprising: calculating an identification risk indicating a possibility that data on a designated individual is determined to be data of a single person; and
calculating, based on the identification risk and a specific individual arrival rate indicating a possibility that the data of the designated individual is determined to be the data of the designated individual, a specific risk indicating a possibility that the data of the designated individual is determined to be the data of the designated individual.
  10. 指定された個人に関するデータが誰か一人のデータであると判断される可能性を示す識別リスクを計算する処理と、
     前記指定された個人のデータが指定された個人のデータであると判断される可能性を示す特定個人到達率と、前記識別リスクとを基に、前記指定された個人のデータが前記指定された個人のデータであると判断される可能性を示す特定リスクを計算する処理と
     をコンピュータに実行させるプログラムをコンピュータに読み取り可能に記録する不揮発性記録媒体。
A non-volatile recording medium recording, in a computer-readable manner, a program that causes a computer to execute: a process of calculating an identification risk indicating a possibility that data on a designated individual is determined to be data of a single person; and
a process of calculating, based on the identification risk and a specific individual arrival rate indicating a possibility that the data of the designated individual is determined to be the data of the designated individual, a specific risk indicating a possibility that the data of the designated individual is determined to be the data of the designated individual.
PCT/JP2015/005289 2014-10-29 2015-10-20 Information processing device, information processing method, and recording medium WO2016067566A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2016556210A JPWO2016067566A1 (en) 2014-10-29 2015-10-20 Information processing apparatus, information processing method, and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014-219808 2014-10-29
JP2014219808 2014-10-29

Publications (1)

Publication Number Publication Date
WO2016067566A1 true WO2016067566A1 (en) 2016-05-06

Family

ID=55856933

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/005289 WO2016067566A1 (en) 2014-10-29 2015-10-20 Information processing device, information processing method, and recording medium

Country Status (2)

Country Link
JP (1) JPWO2016067566A1 (en)
WO (1) WO2016067566A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004287846A (en) * 2003-03-20 2004-10-14 Ntt Data Corp Individual specification preventing device, individual specification preventing method and program
US20110178943A1 (en) * 2009-12-17 2011-07-21 New Jersey Institute Of Technology Systems and Methods For Anonymity Protection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RYOKO SUZUKI ET AL.: "On the Security of Anonymised databases", CSS2012 COMPUTER SECURITY SYMPOSIUM 2012 RONBUNSHU GODO KAISAI ANTI MALWARE ENGINEERING WORKSHOP 2012, IPSJ SYMPOSIUM SERIES, vol. 2012, no. 3, 23 October 2012 (2012-10-23), pages 517 - 524 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017228255A (en) * 2016-06-24 2017-12-28 Necソリューションイノベータ株式会社 Evaluation device, evaluation method and program

Also Published As

Publication number Publication date
JPWO2016067566A1 (en) 2017-08-17

Similar Documents

Publication Publication Date Title
Bazemore et al. “Community vital signs”: incorporating geocoded social determinants into electronic records to promote patient and population health
Bannay et al. The best use of the Charlson comorbidity index with electronic health care database to predict mortality
Malin et al. Never too old for anonymity: a statistical standard for demographic data sharing via the HIPAA Privacy Rule
US9230132B2 (en) Anonymization for data having a relational part and sequential part
Gros et al. Containment efficiency and control strategies for the corona pandemic costs
Galvin et al. Developments in privacy and data ownership in mobile health technologies, 2016-2019
WO2013121739A1 (en) Anonymization device, and anonymization method
US10600506B2 (en) System and method for creation of persistent patient identification
Jean et al. Temporal trends in prevalence, incidence, and mortality for rheumatoid arthritis in Quebec, Canada: a population-based study
US9990515B2 (en) Method of re-identification risk measurement and suppression on a longitudinal dataset
US20160306999A1 (en) Systems, methods, and computer-readable media for de-identifying information
Bauer et al. Addressing disparities in the health of American Indian and Alaska Native people: the importance of improved public health data
JP5782636B2 (en) Information anonymization system, information loss determination method, and information loss determination program
Quinn et al. The validity of the Short-Term Assessment of Risk and Treatability (START) in a UK medium secure forensic mental health service
JP2017228255A (en) Evaluation device, evaluation method and program
Dybov On regular solutions of the Dirichlet problem for the Beltrami equations
AU2019293106A1 (en) Personal information analysis system and personal information analysis method
Buchanich et al. Underascertainment of deaths using social security records: a recommended solution to a little-known problem
WO2016067566A1 (en) Information processing device, information processing method, and recording medium
Hrostowski et al. The unchecked HIV/AIDS crisis in Mississippi
JP6127774B2 (en) Information processing apparatus and data processing method
WO2016203752A1 (en) Information processing device, information processing method, and storage medium
Aboumrad et al. Development and validation of a clinical risk score to predict hospitalization within 30 days of coronavirus disease 2019 diagnosis
Frimpong et al. Effect of the Ghana National Health Insurance Scheme on exit time from catastrophic healthcare expenditure
Brener et al. Association between in‐hospital supportive visits by primary care physicians and patient outcomes: A population‐based cohort study

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15853993

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2016556210

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15853993

Country of ref document: EP

Kind code of ref document: A1