WO2013128879A1 - Information processing device for implementing anonymization process, anonymization method, and program therefor - Google Patents

Publication number
WO2013128879A1
Authority
WO
WIPO (PCT)
Prior art keywords
focus
user information
anonymization
anonymization group
group
Application number
PCT/JP2013/001073
Other languages
French (fr)
Japanese (ja)
Inventor
由起 豊田
Original Assignee
日本電気株式会社
Application filed by 日本電気株式会社
Publication of WO2013128879A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263: Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/04: Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0407: Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the identity of one or more communicating identities is hidden
    • H04L63/0421: Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer

Definitions

  • the present invention relates to an information processing apparatus that performs anonymization processing that abstracts user data and improves anonymity, an anonymization method, and a program therefor.
  • Non-Patent Document 1 discloses a technique regarding k-anonymity.
  • k-anonymity is an index guaranteeing that, after the quasi-identifiers (information that may identify an individual) are anonymized, at least k records of personal information share each combination of quasi-identifier values. Specifically, disclosed data consisting of a plurality of records, each holding attribute values for a plurality of attributes (quasi-identifiers), satisfies k-anonymity when every combination of attribute values that appears in the data is shared by at least k records.
  • In other words, k-anonymity is an index guaranteeing that, once the attribute values of attributes that can identify an individual (the quasi-identifiers) are abstracted into common values, at least k records have the same combination of quasi-identifiers.
  • a set of records having the same combination of quasi-identifiers is referred to as an anonymization group.
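The definition above can be illustrated with a minimal sketch. The function name, the dictionary layout, and the sample values are assumptions made for illustration; they are not taken from the patent or from Non-Patent Document 1.

```python
from collections import Counter


def satisfies_k_anonymity(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k records."""
    # Each distinct combination of quasi-identifier values forms one anonymization group.
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(size >= k for size in counts.values())


# Hypothetical anonymized records; "age" is the only quasi-identifier.
records = [
    {"age": "20-21", "medical_condition": "heart disease"},
    {"age": "20-21", "medical_condition": "fracture"},
    {"age": "20-21", "medical_condition": "infection"},
    {"age": "22-25", "medical_condition": "hay fever"},
]

print(satisfies_k_anonymity(records[:3], ["age"], 3))
print(satisfies_k_anonymity(records, ["age"], 3))
```

The first call covers a single group of three records, so it passes for k = 3. The second call adds a group of one record, so it fails.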
  • Non-Patent Document 2 discloses an anonymization method using local recoding.
  • An anonymization method using local recoding replaces the attribute values of some records with more generalized values.
  • Specifically, the method of Non-Patent Document 2 merges an anonymization group G with part of an anonymization group G′ that has a sufficient number of records, thereby raising the level of abstraction.
  • records to be merged are selected from the anonymization group G ′ so that k-anonymity k is minimized in the anonymized group after merging. That is, the anonymization method of Non-Patent Document 2 is a method for minimizing information loss due to abstraction (anonymization) by minimizing an increase in the degree of abstraction of information in the entire disclosed data.
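As a rough illustration of the general idea of local recoding (generalizing the attribute values of only some records), the following sketch replaces the ages of a chosen subset of records with a shared range. It is not the algorithm of Non-Patent Document 2; the function name, the record layout, and the sample values are hypothetical.

```python
def local_recode(records, attribute, selected_numbers):
    """Generalize `attribute` to a shared range for the selected records only."""
    values = [r[attribute] for r in records if r["number"] in selected_numbers]
    if not values:
        # Nothing selected: return unchanged copies.
        return [dict(r) for r in records]
    lo, hi = min(values), max(values)
    generalized = str(lo) if lo == hi else f"{lo}-{hi}"
    # Only the selected records are rewritten; all others keep their original value.
    return [
        {**r, attribute: generalized} if r["number"] in selected_numbers else dict(r)
        for r in records
    ]


# Hypothetical records.
records = [
    {"number": 1, "age": 20},
    {"number": 2, "age": 21},
    {"number": 3, "age": 35},
]
recoded = local_recode(records, "age", {1, 2})
print(recoded)
```

Records 1 and 2 end up sharing the generalized age range, while record 3 keeps its exact age. That is the "local" part: the generalization is not applied to the whole data set.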
  • Patent Document 1 discloses a privacy protection device that solves such problems.
  • This privacy protection device generalizes (abstracts) attribute values (quasi-identifiers) based on priorities that are set for each attribute name (attribute type) and indicate the importance for the data user.
  • this privacy protection device abstracts an attribute value of an attribute having a lower priority order so that the original information is retained for an attribute having a higher priority order.
  • Patent Document 1 is a technique that generalizes attribute values based on the priority set in units of attribute types.
  • An object of the present invention is to provide an information processing apparatus that executes anonymization processing that can solve the above-described problems, an anonymization method, and a program therefor.
  • the information processing apparatus of the present invention includes: focus partial anonymization group creation means for acquiring a plurality of user information records including arbitrary attribute values and creating focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest that is a specific attribute value; and information loss amount calculation means for calculating an information loss amount indicating the amount of information lost in the focus partial anonymization group candidate relative to the information obtained from the user information records corresponding to that candidate. Among the created focus partial anonymization group candidates, the focus partial anonymization group creation means determines the candidate corresponding to the smallest information loss amount as the focus partial anonymization group and outputs it.
  • in the anonymization method of the present invention, a computer acquires a plurality of user information records including arbitrary attribute values; creates focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest that is a specific attribute value; calculates, relative to the information obtained from the user information records corresponding to each focus partial anonymization group candidate, an information loss amount indicating the amount of information lost in that candidate; and, among the created candidates, determines the candidate corresponding to the smallest information loss amount as the focus partial anonymization group and outputs it.
  • the non-volatile recording medium of the present invention records a program that causes a computer to execute processing of: acquiring a plurality of user information records including arbitrary attribute values; creating focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest that is a specific attribute value; calculating, relative to the information obtained from the user information records corresponding to each focus partial anonymization group candidate, an information loss amount indicating the amount of information lost in that candidate; and, among the created candidates, determining the candidate corresponding to the smallest information loss amount as the focus partial anonymization group and outputting it.
  • the present invention has an effect that it is possible to obtain anonymized data so as to preferentially lower the abstraction level of an attribute value to be focused on locally.
  • FIG. 1 is a block diagram showing the configuration of the anonymization system according to the first embodiment.
  • FIG. 2 is a diagram illustrating an example of a user information record in the first embodiment.
  • FIG. 3 is a diagram illustrating an example of the anonymized user information record in the first embodiment.
  • FIG. 4 is a diagram illustrating an example of a property record in the first embodiment.
  • FIG. 5 is a diagram illustrating an example of a focus record in the first embodiment.
  • FIG. 6 is a block diagram illustrating a hardware configuration of a computer that implements the anonymization device according to the first embodiment.
  • FIG. 7 is a flowchart illustrating the operation of the anonymization apparatus according to the first embodiment.
  • FIG. 8 is a diagram illustrating an example in which the focus partial anonymization group creation unit selects a user information record in the first embodiment.
  • FIG. 9 is a diagram illustrating an example of a focus partial anonymization group candidate in the first embodiment.
  • FIG. 10 is a diagram illustrating an example of a focus partial anonymization group candidate in the first embodiment.
  • FIG. 11 is a block diagram showing the configuration of the anonymization system according to the second embodiment.
  • FIG. 12 is a diagram illustrating an example of a user information record in the second embodiment.
  • FIG. 13 is a diagram illustrating an example of a division value record in the second embodiment.
  • FIG. 14 is a block diagram illustrating a configuration of an anonymization system according to the third embodiment.
  • FIG. 15 is a diagram illustrating an example of a focus record in the third embodiment.
  • FIG. 16 is a flowchart illustrating the operation of the anonymization device of the third exemplary embodiment.
  • FIG. 17 is a block diagram illustrating a configuration of the anonymization
  • FIG. 1 is a block diagram showing the configuration of the anonymization system according to the first embodiment of the present invention.
  • an anonymization system (also referred to as an information processing system) according to this embodiment includes an anonymization device (also referred to as an information processing device) 100, a user information storage unit 510, and an anonymized user information storage unit 520.
  • the anonymization device 100, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). Note that the user information storage unit 510 may be included in the anonymization device 100. Further, the anonymized user information storage unit 520 may be included in the anonymization device 100.
  • the anonymization device 100 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
  • FIG. 2 is a diagram illustrating an example of a user information record 511 stored in the user information storage unit 510.
  • the user information storage unit 510 includes a plurality of user information records 511 as user information. As shown in FIG. 2, the user information storage unit 510 includes one or more user information records 511.
  • the user information record 511 includes a number 519, an age 512, and a medical condition 513.
  • Age 512 is one of the quasi-identifiers.
  • the medical condition 513 is one of sensitive attributes.
  • the quasi-identifier (age 512) and the sensitive attribute (medical condition 513) are also generally called attributes.
  • the quasi-identifier is information that may make it possible to identify an individual by combining them.
  • Sensitive attributes are information that a person generally does not want others to know.
  • the number 519 is a number for identifying the user information record 511.
  • when a user information record 511 needs to be identified individually, it is written with its number 519 in parentheses; for example, the user information record 511 whose number 519 is “1” is written as the user information record 511 (1).
  • User information is, for example, receipt information held by a government agency or a medical institution.
  • the receipt information includes the date of birth, sex, illness, and the like.
  • the attribute value of the age attribute is the age 512, and the attribute value of the medical condition attribute is the medical condition 513.
  • the user information record 511 (1) indicates that the corresponding user is 20 years old and suffers from heart disease.
  • the user information record 511 may be arbitrary information regardless of the above.
  • the user information record may include age 512, medical condition 513, and other types of information (for example, gender).
  • the user information record may not include the medical condition 513, for example.
  • each attribute (quasi-identifier or sensitive attribute) may include a plurality of attribute values. For example, the medical condition 513 may include the two attribute values “hay fever” and “tooth decay”.
  • FIG. 3 is a diagram illustrating an example of the anonymized user information record 521 stored in the anonymized user information storage unit 520.
  • the anonymized user information storage unit 520 includes k or more anonymized user information records 521.
  • the anonymized user information record 521 includes a group number 529, an age 512, and a medical condition 513.
  • the group number 529 is a number for identifying the anonymized user information record 521.
  • when an anonymized user information record 521 needs to be identified individually, it is written with its group number 529 in parentheses; for example, the anonymized user information record 521 whose group number 529 is “1” is written as the anonymized user information record 521 (1).
  • the anonymized user information record 521 may not include the group number 529.
  • the anonymization apparatus 100 may specify and process the anonymized user information record 521 using the age 512, for example.
  • the anonymized user information record 521 is anonymized user information.
  • the user information is as described above.
  • the user corresponding to the anonymized user information record 521 (1) has an age of 20 to 21 and has suffered from a heart disease, a fracture, or an infection.
  • the anonymization device 100 includes a property storage unit 110, a focus value storage unit 120, an anonymization execution reception unit 130, a focus partial anonymization group creation unit 140, an information loss amount calculation unit 150, and an anonymization group creation unit 160.
  • FIG. 4 is a diagram illustrating an example of the property record 111 stored in the property storage unit 110.
  • the property storage unit 110 includes one or more property records 111.
  • the property record 111 includes a parameter name 112 and a parameter value 113.
  • at least one of the property records 111 stored in the property storage unit 110 is a set of a parameter name 112 and a parameter value 113 that specify k-anonymity k.
  • the parameter name 112 is “k” and the parameter value 113 is “3”.
  • the parameter name 112 is “quasi-identifier name” and the parameter value 113 is “age”.
  • the parameter name 112 is “sensitive attribute” and the parameter value 113 is “disease state”.
  • FIG. 5 is a diagram illustrating an example of the focus record 121 stored in the focus value storage unit 120.
  • the focus record 121 includes a quasi-identifier name 122 and a focus value 123.
  • the information included in the focus record 121 is information input to the anonymization device 100 in advance by a user of anonymized data (not shown). Note that the information included in the focus record 121 may be included in an anonymization process execution start instruction to be described later and input to the anonymization device 100.
  • the focus record 121 has a quasi-identifier name 122 of “age” and a focus value 123 of “21”. Therefore, the focus record 121 indicates that the attribute value to be noticed is, for example, the attribute value having the age 512 of “21” in the user information record 511 illustrated in FIG. 2.
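The record types described so far can be modelled roughly as follows. This is only a sketch: the class names, field names, and the `lookup` helper are hypothetical, and only the values quoted in the text (k = “3”, quasi-identifier “age”, sensitive attribute “disease state”, focus value “21”, and record (1) being 20 years old with heart disease) come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserInformationRecord:  # user information record 511
    number: int               # number 519
    age: int                  # age 512 (quasi-identifier)
    medical_condition: str    # medical condition 513 (sensitive attribute)


@dataclass(frozen=True)
class PropertyRecord:         # property record 111
    parameter_name: str       # parameter name 112
    parameter_value: str      # parameter value 113


@dataclass(frozen=True)
class FocusRecord:            # focus record 121
    quasi_identifier_name: str  # quasi-identifier name 122
    focus_value: str            # focus value 123


properties = [
    PropertyRecord("k", "3"),
    PropertyRecord("quasi-identifier name", "age"),
    PropertyRecord("sensitive attribute", "disease state"),
]
focus = FocusRecord("age", "21")
record_1 = UserInformationRecord(1, 20, "heart disease")


def lookup(property_records, name):
    """Hypothetical helper: return the parameter value stored under `name`."""
    return next(p.parameter_value for p in property_records if p.parameter_name == name)


k = int(lookup(properties, "k"))
print(k, focus, record_1)
```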
  • the focus partial anonymization group creation unit 140 acquires the user information record 511 and the property record 111 via the anonymization group creation unit 160 in response to an instruction from the anonymization group creation unit 160, and creates a focus partial anonymization group.
  • the focus partial anonymization group creation unit 140 may create a focus partial anonymization group in response to an anonymization process execution start instruction received by the anonymization execution reception unit 130.
  • the focus partial anonymization group creation unit 140 may acquire the user information record 511 directly from the user information storage unit 510.
  • the focus partial anonymization group creation unit 140 may acquire the property record 111 directly from the property storage unit 110.
  • the focus partial anonymization group creation unit 140 outputs the created focus partial anonymization group to the anonymization group creation unit 160.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group as follows.
  • the focus partial anonymization group creation unit 140 groups the user information record 511 that includes at least the focus value 123 together with other user information records 511 to create a focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 performs grouping based on the information of the property record 111 that specifies k-anonymity k of the property storage unit 110. For example, the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate by grouping k user information records 511 including at least the user information record 511 including the focus value 123.
  • the focus partial anonymization group creation unit 140 outputs the created focus partial anonymization group candidate to the information loss amount calculation unit 150. Then, the focus partial anonymization group creation unit 140 receives the information loss amount of the focus partial anonymization group candidate from the information loss amount calculation unit 150.
  • among the plurality of focus partial anonymization group candidates for which it has received an information loss amount, the focus partial anonymization group creation unit 140 determines the candidate with the smallest information loss amount as the focus partial anonymization group.
  • the focus partial anonymization group includes information corresponding to the user information record 511 including at least the focus value 123.
  • the corresponding information is, for example, information “infectious disease” of the medical condition 513 in the user information record 511 including the age 512 having the same value as “21” which is the focus value 123.
  • Information loss amount = (maximum value of the specific attribute value included in the focus value anonymization group candidate − minimum value of the specific attribute value included in the focus value anonymization group candidate + 1) × number of records.
  • the “specific attribute value included in the focus value anonymization group candidate” is, in other words, the attribute value that the quasi-identifier name 122 in the focus value storage unit 120 specifies within the user information records 511 corresponding to the focus value anonymization group candidate. That is, the specific attribute value is the attribute value corresponding to the quasi-identifier name 122 of the focus record 121, and in the case of the focus record 121 shown in FIG. 5, the specific attribute value is the age 512.
  • the maximum value of the specific attribute value included in the focus value anonymization group candidate − the minimum value of the specific attribute value included in the focus value anonymization group candidate is the focus value anonymization
  • the information loss amount calculation unit 150 may calculate the information loss amount of a categorical value (character string) as shown in Non-Patent Document 2.
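The information loss amount for a numeric attribute can be sketched as below. The sketch reads the formula as (maximum − minimum + 1) × number of records, which is consistent with the example value of “6” that the description later gives for a three-record group. The function name and the sample ages are hypothetical.

```python
def information_loss(values):
    """Information loss for one candidate: (max - min + 1) * number of records."""
    if not values:
        raise ValueError("a candidate must contain at least one record")
    return (max(values) - min(values) + 1) * len(values)


# Hypothetical ages for a three-record candidate spanning ages 20 to 21.
print(information_loss([20, 21, 21]))  # (21 - 20 + 1) * 3 = 6
```

A wider age range or a larger group both increase the value, so minimizing it favors tight groups around the focus value.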
  • the number of user information records as the source is three user information records 511 having numbers 519 of “1”, “2”, and “3”.
  • the number of specific attribute values when grouped is one in which the group number 529 is “1” and the age 512 attribute value is “20-21”.
  • the anonymization group creation unit 160 creates the anonymized user information record 521 when triggered by the anonymization process execution start instruction received by the anonymization execution reception unit 130.
  • the anonymization group creation unit 160 creates the anonymized user information record 521 as follows.
  • the anonymization group creation unit 160 passes the user information record 511 and the property record 111 to the focus partial anonymization group creation unit 140 and instructs the creation of the focus partial anonymization group. Then, the anonymization group creation unit 160 receives the focus partial anonymization group from the focus partial anonymization group creation unit 140 as a response to the creation instruction.
  • the anonymization group creation unit 160 creates one or more anonymization groups from the user information records 511 other than those corresponding to the focus partial anonymization group created by the focus partial anonymization group creation unit 140.
  • the anonymization group creation unit 160 creates anonymized user information records 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 140 and to the anonymization groups it created itself, and stores them in the anonymized user information storage unit 520.
  • the anonymized user information record 521 corresponding to each focus partial anonymization group or anonymization group is obtained by collecting the user information records 511 included in that group and assigning a group number 529.
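Collecting the records of one group into a single anonymized record might look like the following sketch. The function name and dictionary keys are hypothetical, as is the assignment of conditions to records 2 and 3; the resulting age range and set of conditions mirror the anonymized user information record 521 (1) described earlier.

```python
def to_anonymized_record(group_number, group):
    """Collapse the user information records of one group into one anonymized record."""
    ages = [r["age"] for r in group]
    lo, hi = min(ages), max(ages)
    return {
        "group_number": group_number,                        # group number 529
        "age": str(lo) if lo == hi else f"{lo}-{hi}",        # abstracted age 512
        "medical_condition": [r["medical_condition"] for r in group],
    }


# Hypothetical group; only record 1 (age 20, heart disease) is quoted in the text.
group = [
    {"number": 1, "age": 20, "medical_condition": "heart disease"},
    {"number": 2, "age": 21, "medical_condition": "fracture"},
    {"number": 3, "age": 21, "medical_condition": "infection"},
]
anonymized = to_anonymized_record(1, group)
print(anonymized)
```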
  • FIG. 6 is a diagram illustrating a hardware configuration of a computer 700 that realizes the anonymization apparatus 100 according to the present embodiment.
  • the computer 700 includes a CPU (Central Processing Unit) 701, a storage unit 702, a storage device 703, an input unit 704, an output unit 705, and a communication unit 706. Furthermore, the computer 700 includes a recording medium (or storage medium) 707 supplied from the outside.
  • the recording medium 707 may be a non-volatile recording medium that stores information non-temporarily.
  • the CPU 701 controls the overall operation of the computer 700 by running an operating system (not shown). Further, the CPU 701 reads a program (for example, a program that causes the computer 700 to execute the operation of the flowchart shown in FIG. 7, described later) and data from the recording medium 707 mounted on the storage device 703, and writes the read program and data to the storage unit 702. In accordance with the read program and based on the read data, the CPU 701 executes various processes as the anonymization execution reception unit 130, the focus partial anonymization group creation unit 140, the information loss amount calculation unit 150, and the anonymization group creation unit 160.
  • the CPU 701 may download a program or data to the storage unit 702 from an external computer (not shown) connected to a communication network (not shown).
  • the storage unit 702 stores programs and data.
  • the storage unit 702 may include a property storage unit 110 and a focus value storage unit 120. Furthermore, when the computer 700 (anonymization apparatus 100) includes the user information storage unit 510 and the anonymized user information storage unit 520, the storage unit 702 may include these.
  • the storage device 703 is, for example, an optical disk, a flexible disk, a magneto-optical disk, an external hard disk, or a semiconductor memory, and includes a recording medium 707.
  • the storage device 703 records the program so that it can be read by a computer. Further, the storage device 703 may record data so as to be readable by a computer.
  • the storage device 703 may include a property storage unit 110 and a focus value storage unit 120.
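  • furthermore, when the computer 700 (anonymization device 100) includes the user information storage unit 510 and the anonymized user information storage unit 520, the storage device 703 may include these.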
  • the input unit 704 is realized by, for example, a mouse, a keyboard, a built-in key button, and the like, and is used for an input operation.
  • the input unit 704 is not limited to a mouse, a keyboard, and a built-in key button, and may be a touch panel, an accelerometer, a gyro sensor, a camera, or the like.
  • the input unit 704 is included as part of the anonymization execution reception unit 130.
  • the output unit 705 is realized by a display, for example, and is used for confirming the output.
  • the communication unit 706 implements an interface with the user information storage unit 510 and the anonymized user information storage unit 520.
  • the communication unit 706 is included as a part of the anonymization group creation unit 160.
  • the functional unit block of the anonymization device 100 shown in FIG. 1 is realized by the computer 700 having the hardware configuration shown in FIG.
  • the means for realizing each unit included in the computer 700 is not limited to the above.
  • the computer 700 may be realized by one physically coupled device, or by two or more physically separated devices connected by wire or wirelessly.
  • the recording medium 707 in which the above-described program code is recorded may be supplied to the computer 700, and the CPU 701 may read and execute the program code stored in the recording medium 707.
  • the CPU 701 may store the code of the program stored in the recording medium 707 in the storage unit 702, the storage device 703, or both. That is, the present embodiment includes an embodiment of a recording medium 707 that stores a program (software) executed by the computer 700 (CPU 701) temporarily or non-temporarily.
  • FIG. 7 is a flowchart showing the operation of the anonymization device 100 of this embodiment. Note that the processing according to this flowchart may be executed based on the above-described program control by the CPU. Further, the step name of the process is described by a symbol as in S601.
  • the anonymization execution reception unit 130 receives an anonymization process execution start instruction from a user of anonymization data (not shown) and outputs the instruction to the anonymization group creation unit 160 (S601).
  • the anonymization group creation unit 160 obtains the user information record 511 from the user information storage unit 510 upon receiving the anonymization process execution start instruction (S602).
  • the anonymization group creation unit 160 acquires the property record 111 having the parameter name 112 of “k” from the property storage unit 110 (S603).
  • the anonymization group creation unit 160 outputs the user information record 511 acquired in S602 and the parameter value 113 (e.g., “3”) of the property record 111 having the parameter name 112 “k” acquired in S603 to the focus partial anonymization group creation unit 140, and instructs it to create the focus partial anonymization group (S604).
  • the focus partial anonymization group creation unit 140 acquires the focus record 121 (for example, the quasi-identifier name 122 is “age” and the focus value 123 is “21”) from the focus value storage unit 120 (S605).
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate based on the acquired focus record 121 (S606).
  • FIG. 8 is a diagram illustrating an example in which the focus partial anonymization group creation unit 140 selects the user information record 511 when creating a focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 creates a focus value anonymization group candidate by regarding three records, starting from the user information record 511 whose age 512 is “21” and proceeding in the direction of decreasing age, as one group.
  • similarly, the focus partial anonymization group creation unit 140 creates a focus value anonymization group candidate by regarding three records in the direction of increasing age as one group.
  • the focus partial anonymization group creation unit 140 may further create a focus value anonymization group candidate by regarding three records centered on the user information record 511 having the age 512 of “21” as one group.
  • FIG. 9 is a diagram illustrating an example in which the focus partial anonymization group creation unit 140 creates one focus partial anonymization group candidate by collecting three records in the direction of decreasing age.
  • FIG. 10 is a diagram illustrating an example in which the focus partial anonymization group creation unit 140 creates one focus partial anonymization group candidate by collecting three records in the direction of increasing age.
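The candidate creation in S606 can be sketched as enumerating every run of k age-sorted records that contains the focus record. For k = 3 this yields the decreasing-age, centered, and increasing-age candidates described above. The function name, the choice of the first matching record as the focus record, and the sample ages are assumptions for illustration.

```python
def focus_candidates(records, attribute, focus_value, k):
    """Return every run of k consecutive (sorted) records that contains the focus record."""
    ordered = sorted(records, key=lambda r: r[attribute])
    positions = [i for i, r in enumerate(ordered) if r[attribute] == focus_value]
    if not positions:
        return []
    i = positions[0]  # assumption: use the first record holding the focus value
    candidates = []
    # start = i - k + 1 is the decreasing direction, start = i the increasing direction.
    for start in range(i - k + 1, i + 1):
        if start >= 0 and start + k <= len(ordered):
            candidates.append(ordered[start:start + k])
    return candidates


# Hypothetical records; only the focus value 21 and k = 3 come from the text.
records = [{"number": n, "age": a} for n, a in enumerate([20, 20, 21, 22, 24, 27], start=1)]
candidates = focus_candidates(records, "age", 21, 3)
for c in candidates:
    print([r["age"] for r in c])
```

Windows that would run past either end of the sorted list are skipped, so a focus value near the youngest or oldest record produces fewer candidates.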
  • the focus partial anonymization group creation unit 140 transmits the created focus value anonymization group candidate to the information loss amount calculation unit 150 (S607).
  • the information loss amount calculation unit 150 calculates an information loss amount for each received focus value anonymization group candidate (S608).
  • the information loss amount calculation unit 150 calculates the information loss amount of the focus partial anonymization group candidate shown in FIGS. 9 and 10 using the above-described information loss amount calculation formula as follows.
  • the focus partial anonymization group creation unit 140 determines the focus partial anonymization group candidate with the smallest information loss amount (for example, “6”), for example the candidate shown in FIG. 9, as the focus partial anonymization group. Subsequently, the focus partial anonymization group creation unit 140 outputs the determined focus partial anonymization group to the anonymization group creation unit 160 (S609).
  • the anonymization group creation unit 160 creates one or more anonymization groups from the user information record 511 other than the user information record 511 corresponding to the focus partial anonymization group created by the focus partial anonymization group creation unit 140. (S610).
  • the anonymization group creation unit 160 creates an anonymized user information record 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 140 and the anonymization group created by itself. Is stored in the anonymized user information storage unit 520 (S611).
  • The anonymized user information records 521 created as described above are anonymized with priority given to minimizing the abstraction level of the quasi-identifier (age 512) corresponding to the focus value 123 specified by the user of the anonymized data. That is, the anonymization device 100 according to the present embodiment can minimize the abstraction level of the quasi-identifier corresponding to the designated focus value 123.
  • Using an anonymized data set (a set of anonymized user information records 521) whose abstraction level has been lowered locally, it is possible to examine in detail the vicinity of a meaningful value, for example the postal code of a disaster area or a date of birth from which the curriculum guidelines were significantly changed.
  • The effect of the present embodiment described above is that anonymized data can be obtained in which the abstraction level of the attribute value of interest is preferentially and locally lowered.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate.
  • the information loss amount calculation unit 150 calculates the information loss amount of each focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 determines a focus partial anonymization group candidate corresponding to the smallest amount of information loss as a focus partial anonymization group.
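  • The candidate creation and selection steps summarized above (S606 to S609) can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the implementation of the embodiment: the sample ages, the function names, and in particular the loss measure (the width of the age range covered by a group) are hypothetical, because the information loss formula itself is defined in another part of the description.

```python
from typing import Callable, List, Sequence, Tuple


def candidate_windows(ages: Sequence[int], focus_index: int, k: int) -> List[Tuple[int, ...]]:
    """Return every run of k consecutive records (sorted by age) containing the focus record."""
    n = len(ages)
    candidates = []
    # A window starting at `start` covers indices start .. start + k - 1,
    # so it contains focus_index when focus_index - k + 1 <= start <= focus_index.
    for start in range(max(0, focus_index - k + 1), min(focus_index, n - k) + 1):
        candidates.append(tuple(ages[start:start + k]))
    return candidates


def range_width(group: Sequence[int]) -> int:
    """Hypothetical loss measure (an assumption): width of the generalized age interval."""
    return max(group) - min(group)


def choose_focus_group(
    ages: Sequence[int],
    focus_value: int,
    k: int,
    loss: Callable[[Sequence[int]], int] = range_width,
) -> Tuple[int, ...]:
    """Create all candidates around the focus value and keep the one with the smallest loss."""
    ordered = sorted(ages)
    focus_index = ordered.index(focus_value)  # first record holding the focus value
    candidates = candidate_windows(ordered, focus_index, k)
    return min(candidates, key=loss)  # raises ValueError if fewer than k records exist


ages = [15, 19, 21, 22, 30]  # hypothetical ages, not the data of the figures
best = choose_focus_group(ages, focus_value=21, k=3)
print(best)  # (19, 21, 22)
```

  • With k = 3, the sketch produces the window toward lower ages, the centered window, and the window toward higher ages, which mirrors the three kinds of candidate described for the focus value “21”. Swapping in a different `loss` function changes only the selection step.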
  • FIG. 11 is a block diagram showing the configuration of the anonymization system according to the second embodiment of the present invention.
  • the anonymization system includes an anonymization device 200, a user information storage unit 510, and an anonymized user information storage unit 520.
  • the anonymization device 200, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). Note that the user information storage unit 510 may be included in the anonymization device 200. Further, the anonymized user information storage unit 520 may be included in the anonymization device 200.
  • the anonymization device 200 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
  • the anonymization device 200 further includes a divided value storage unit 270 as compared with the anonymization device 100 according to the first embodiment.
  • In addition, compared with the anonymization device 100 of the first embodiment, the anonymization device 200 has a focus partial anonymization group creation unit 240 instead of the focus partial anonymization group creation unit 140.
  • FIG. 12 is a diagram showing an example of the user information record 511 in the present embodiment. As shown in FIG. 12, the user information record 511 in the present embodiment further includes a consultation date 514 as a quasi-identifier as compared to the user information record 511 in FIG.
  • FIG. 13 is a diagram illustrating an example of the division value record 271 stored in the division value storage unit 270.
  • the division value storage unit 270 includes one or more division value records 271.
  • the division value record 271 includes a quasi-identifier name 272 and a division value 273.
  • the information included in the division value record 271 is information input to the anonymization device 200 in advance by a user of anonymized data (not shown). Note that the information included in the division value record 271 may be included in the anonymization process execution start instruction and input to the anonymization device 200.
  • The division value record 271 indicates that the user information is divided into user information records 511 dated on or before November 30, 2011 and user information records 511 dated on or after December 1, 2011.
  • The focus partial anonymization group creation unit 240 divides the user information records 511 shown in FIG. 12 based on the division value record 271 to create a plurality of division groups. Next, the focus partial anonymization group creation unit 240 executes steps S606 to S609 in FIG. 7 for each division group of the user information records 511, in the same manner as the focus partial anonymization group creation unit 140 of the first embodiment, and creates a focus partial anonymization group.
  • The focus partial anonymization group creation unit 240 divides the user information records 511 in FIG. 12 into the user information records 511 whose numbers 519 are “1”, “3”, “4”, and “6”, and the user information records 511 whose numbers 519 are “2”, “5”, and “7”.
  • The focus partial anonymization group creation unit 240 executes steps S606 to S609 and outputs the focus partial anonymization group created without straddling the division value 273.
  • The anonymization group creation unit 160 stores the anonymized user information records 521 corresponding to the focus partial anonymization group created without straddling the division value 273 in the anonymized user information storage unit 520.
  • In this way, an anonymized user information record 521 can be created.
  • the division value record 271 is not limited to the example described above, and may specify division of an arbitrary attribute value included in the user information record 511. Moreover, the division value record 271 may be plural.
  • The effect of the present embodiment described above is that, in addition to the effect of the first embodiment, anonymized data can be obtained in which the abstraction level of the attribute value of interest is preferentially and locally lowered for user information records 511 in a narrower range.
  • the reason is that the focus part anonymization group creation unit 240 divides the user information record 511 based on the division value record 271.
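  • The division step of this embodiment can be sketched as follows. This is a minimal Python sketch under stated assumptions: the record layout, the field names, and the sample dates are hypothetical and do not reproduce FIG. 12, and treating the division value as the first date of the later group is one possible reading of the division value record 271.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_at_division_value(
    records: List[Record], attribute: str, division_value: Any
) -> Tuple[List[Record], List[Record]]:
    """Split records into those before the division value and those on or after it."""
    before = [r for r in records if r[attribute] < division_value]
    on_or_after = [r for r in records if r[attribute] >= division_value]
    return before, on_or_after


# Hypothetical user information records (not the data of FIG. 12).
records = [
    {"number": 1, "age": 21, "consultation_date": date(2011, 10, 5)},
    {"number": 2, "age": 23, "consultation_date": date(2011, 12, 9)},
    {"number": 3, "age": 19, "consultation_date": date(2011, 11, 30)},
    {"number": 4, "age": 25, "consultation_date": date(2011, 12, 1)},
]

before, on_or_after = split_at_division_value(
    records, "consultation_date", date(2011, 12, 1)
)
print([r["number"] for r in before])       # [1, 3]
print([r["number"] for r in on_or_after])  # [2, 4]
```

  • Candidate creation and loss-based selection would then run separately on each returned list, so that no focus partial anonymization group straddles the division value.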
  • FIG. 14 is a block diagram showing the configuration of the anonymization system according to the third embodiment of the present invention.
  • the anonymization system includes an anonymization device 300, a user information storage unit 510, and an anonymized user information storage unit 520.
  • 3 is a block diagram showing a configuration of an anonymization device 300.
  • the anonymization device 300, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). Note that the user information storage unit 510 may be included in the anonymization device 300. Further, the anonymized user information storage unit 520 may be included in the anonymization device 300.
  • the anonymization device 300 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
  • The anonymization device 300 in the present embodiment differs from the anonymization device 100 of the first embodiment in that it includes a focus partial anonymization group creation unit 340 instead of the focus partial anonymization group creation unit 140, and an anonymization group creation unit 360 instead of the anonymization group creation unit 160.
  • FIG. 15 is a diagram illustrating an example of the focus record 121 stored in the focus value storage unit 120 according to the present embodiment.
  • the focus value storage unit 120 of this embodiment includes a plurality of focus records 121.
  • the focus record 121 of this embodiment further includes a priority 128 in addition to the quasi-identifier name 122 and the focus value 123.
  • the priority 128 is information indicating the order of the focus records 121 when a focus partial anonymization group is created.
  • the priority 128 may be information indicating the weight of the focus record 121.
  • Upon receiving an instruction from the anonymization group creation unit 360, the focus partial anonymization group creation unit 340 creates a focus partial anonymization group using the focus records 121 in order of priority 128, and outputs it to the anonymization group creation unit 360. At this time, the focus partial anonymization group creation unit 340 adds completion information to the focus partial anonymization group before outputting it to the anonymization group creation unit 360.
  • The completion information indicates whether focus partial anonymization group candidates including each of the focus records 121 have been created for all focus records 121 (“complete”) or not (“incomplete”).
  • the anonymization group creation unit 360 receives from the focus partial anonymization group creation unit 340 a focus partial anonymization group to which information indicating whether or not an unused focus record 121 remains is added. Then, the anonymization group creation unit 360 creates the anonymized user information record 521 in the same manner as the anonymization group creation unit 160.
  • The anonymization group creation unit 360 then checks the completion information.
  • If the completion information is “incomplete”, the anonymization group creation unit 360 passes the created anonymized user information record 521 to the focus partial anonymization group creation unit 340 and instructs it to create a focus partial anonymization group again.
  • If the completion information is “complete”, the anonymization group creation unit 360 stores the created anonymized user information record 521 in the anonymized user information storage unit 520.
  • The anonymization device 300 can thus specify the focus value 123 for each of a plurality of quasi-identifiers (quasi-identifiers with arbitrary quasi-identifier names 122, including quasi-identifiers having the same quasi-identifier name 122).
  • FIG. 16 is a flowchart showing the operation of the present embodiment.
  • S601 to S603 are the same operations as S601 to S603 in FIG.
  • The anonymization group creation unit 360 outputs one of the anonymized user information records 521 and the parameter value 113 to the focus partial anonymization group creation unit 340, and instructs it to create a focus partial anonymization group (S634).
  • the anonymized user information record 521 is the anonymized user information record 521 created in S602 or the user information record 511 acquired in S602.
  • the parameter value 113 is the parameter value 113 of the property record 111 whose parameter name 112 acquired in S603 is “k”.
  • S605 to S608 are the same operations as S605 to S608 in FIG.
  • The focus partial anonymization group creation unit 340 determines the focus partial anonymization group candidate with the smallest information loss amount as the focus partial anonymization group. Subsequently, the focus partial anonymization group creation unit 340 adds completion information to the determined focus partial anonymization group and outputs it to the anonymization group creation unit 360 (S639).
  • S610 is the same operation as S610 in FIG.
  • the anonymization group creation unit 360 creates an anonymized user information record 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 340 and the anonymization group created by itself. (S641).
  • the anonymization group creation unit 360 confirms the completion information (S642). If the completion information is “incomplete” (NO in S642), the process returns to S634.
  • the anonymized user information record 521 created in S641 is stored in the anonymized user information storage unit 520 (S643).
  • When focus values 123 are set for quasi-identifiers such as date of birth and gender, attention can be paid to female patients at the age at which use of a certain new drug is permitted, and, for example, the incidence of cervical cancer can be examined.
  • the focus record 121 may not include the priority 128.
  • The focus partial anonymization group creation unit 340 may use the focus records 121 in ascending or descending order of their addresses in the focus value storage unit 120. The focus partial anonymization group creation unit 340 may also use the plurality of focus records 121 in a fixed order predetermined for the quasi-identifier names 122, or in an arbitrary order.
  • The effect of the present embodiment described above is that anonymized data can be obtained in which the abstraction level of the attribute values of interest is preferentially and locally lowered from a plurality of viewpoints.
  • the focus partial anonymization group creation unit 340 creates a focus partial anonymization group by sequentially using a plurality of focus records 121.
  • The anonymization group creation unit 360 instructs the focus partial anonymization group creation unit 340 to create a focus partial anonymization group for the created anonymized user information record 521 until the focus partial anonymization group creation unit 340 has used all the focus records 121.
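  • The repeated use of focus records in priority order, together with the completion information, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the `FocusRecord` class, the convention that a smaller priority number is used first, and the `run` function standing in for steps S634 to S643 are all hypothetical, and the actual grouping work is replaced by a log entry.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Tuple


@dataclass
class FocusRecord:
    quasi_identifier_name: str
    focus_value: Any
    priority: int  # assumption: a smaller number means the record is used earlier


def in_priority_order(focus_records: List[FocusRecord]) -> Iterator[Tuple[FocusRecord, str]]:
    """Yield each focus record with completion information for the iteration that uses it."""
    ordered = sorted(focus_records, key=lambda f: f.priority)
    for i, record in enumerate(ordered):
        # "complete" only once every focus record has been used.
        completion = "complete" if i == len(ordered) - 1 else "incomplete"
        yield record, completion


def run(focus_records: List[FocusRecord]) -> List[Tuple[str, str]]:
    """Stand-in for the S634 to S643 loop: log which focus record each pass used."""
    log = []
    for record, completion in in_priority_order(focus_records):
        # A real pass would create candidates, pick the smallest-loss group,
        # and re-anonymize the remaining records here.
        log.append((record.quasi_identifier_name, completion))
        if completion == "complete":
            break  # corresponds to storing the final anonymized records
    return log


focus_records = [
    FocusRecord("gender", "female", priority=2),
    FocusRecord("age", 21, priority=1),
]
print(run(focus_records))  # [('age', 'incomplete'), ('gender', 'complete')]
```

  • If the focus records carry no priority, the sort key could be replaced by storage order or any fixed order, as the description allows.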
  • FIG. 17 is a block diagram showing a configuration of an anonymization apparatus 400 according to the fourth embodiment of the present invention.
  • the anonymization device 400 includes a focus partial anonymization group creation unit 140 and an information loss amount calculation unit 150.
  • User information storage means (not shown) may be included in the anonymization device 400, or may be a user information storage unit 510 as shown in FIG.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate in which a plurality of user information records 511 are grouped.
  • the plurality of user information records 511 include at least an attention attribute value.
  • the attention attribute value is one of the above-described arbitrary attribute values, and is an attribute value specified by the quasi-identifier name 122 and the focus value 123 included in the focus record 121.
  • the focus partial anonymization group creation unit 140 determines the focus partial anonymization group candidate corresponding to the smallest amount of information loss among the created focus partial anonymization group candidates and outputs the focus partial anonymization group candidate.
  • The information loss amount calculation unit 150 calculates and outputs the information loss amount of each focus partial anonymization group candidate created by the focus partial anonymization group creation unit 140.
  • The information loss amount indicates the amount by which the information obtained from a focus partial anonymization group candidate is lost (decreased) relative to the information obtained from the user information records 511 corresponding to that focus partial anonymization group candidate.
  • the effect of the present embodiment described above is that it is possible to obtain anonymized data so as to preferentially lower the abstraction level of the attribute value to be focused on locally.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate.
  • the information loss amount calculation unit 150 calculates the information loss amount of each focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 determines a focus partial anonymization group candidate corresponding to the smallest amount of information loss as a focus partial anonymization group.
  • each component described in each of the above embodiments does not necessarily need to be an independent entity.
  • a plurality of components may be realized as a single module.
  • each component may be realized by a plurality of modules.
  • Each component may be configured such that a certain component is a part of another component.
  • Each component may be configured such that a part of a certain component overlaps a part of another component.
  • each component and a module that realizes each component may be realized by hardware if necessary. Moreover, each component and the module which implement
  • the program is provided by being recorded on a non-volatile computer-readable recording medium such as a magnetic disk or a semiconductor memory, and is read by the computer when the computer is started up.
  • the read program causes the computer to function as a component in each of the above-described embodiments by controlling the operation of the computer.
  • a plurality of operations are not limited to being executed at different timings. For example, another operation may occur during the execution of a certain operation, or the execution timing of a certain operation and another operation may partially or entirely overlap.
  • In each of the embodiments described above, a certain operation is described as a trigger for another operation, but that description does not limit all relationships between the two operations. For this reason, when each embodiment is implemented, the relationship between the plurality of operations can be changed within a range that does not impair the content.
  • The specific description of each operation of each component does not limit that operation. For this reason, each specific operation of each component may be changed, when each embodiment is implemented, within a range that does not impair functional, performance, and other characteristics.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides an information processing device for implementing anonymization in such a manner that the abstraction levels of attribute values to be focused on are preferentially locally lowered. The information processing device is provided with: a means for outputting user information codes that include attribute values to be focused on by generating grouped focus area anonymization group candidates, and determining the focus area anonymization group candidate with the smallest information loss as a focus area anonymization group; and a means for calculating the information loss in the focus area anonymization group candidates.

Description

Information processing apparatus for performing anonymization process, anonymization method, and program therefor
The present invention relates to an information processing apparatus that performs anonymization processing, which abstracts user data to improve anonymity, and to an anonymization method and a program therefor.
In recent years, various related techniques for anonymizing privacy information have become known.
Non-Patent Document 1 discloses a technique concerning k-anonymity. k-anonymity is an index that guarantees that, through anonymization of quasi-identifiers (information that may identify an individual), there are k or more sets of personal information containing the same combination of quasi-identifiers. Specifically, disclosed data containing a plurality of records, each including attribute values of a plurality of attributes (quasi-identifiers), satisfies k-anonymity when, for any of the attributes, at least k records share the same combination of attribute values. In other words, k-anonymity is an index that guarantees that, by abstracting the attribute values (also called quasi-identifiers) of attributes that can serve as information identifying an individual into common values, there are k or more records having the same combination of quasi-identifiers. Hereinafter, a set of records having the same combination of quasi-identifiers is referred to as an anonymization group.
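The k-anonymity condition described above can be checked mechanically. The following is a minimal Python sketch, assuming records are plain dictionaries; the field names and the already-generalized sample values are hypothetical and are not taken from the cited document.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence

Record = Dict[str, Any]


def anonymization_groups(records: List[Record], quasi_identifiers: Sequence[str]) -> Counter:
    """Count the records sharing each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def satisfies_k_anonymity(records: List[Record], quasi_identifiers: Sequence[str], k: int) -> bool:
    """True when every anonymization group contains at least k records."""
    groups = anonymization_groups(records, quasi_identifiers)
    return all(size >= k for size in groups.values())


# Hypothetical, already-generalized records; "condition" is the sensitive attribute.
records = [
    {"age": "20-29", "zip": "100-****", "condition": "A"},
    {"age": "20-29", "zip": "100-****", "condition": "B"},
    {"age": "20-29", "zip": "100-****", "condition": "A"},
    {"age": "30-39", "zip": "101-****", "condition": "C"},
    {"age": "30-39", "zip": "101-****", "condition": "B"},
]

print(satisfies_k_anonymity(records, ["age", "zip"], k=2))  # True
print(satisfies_k_anonymity(records, ["age", "zip"], k=3))  # False
```

Only the quasi-identifiers take part in the grouping; the sensitive attribute is carried along unchanged. Note that an empty record list trivially passes the check.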
Non-Patent Document 2 discloses an anonymization technique using Local Recoding. Anonymization using Local Recoding replaces the attribute values of some records with more generalized ones. In the anonymization method disclosed in Non-Patent Document 2, for an anonymization group G having fewer than k records, some of the records of an anonymization group G′ having a sufficient number of records are merged with the anonymization group G, raising the abstraction level. In this method, the records to be merged are selected from the anonymization group G′ so that k of k-anonymity is minimized in the anonymization group after merging. That is, the anonymization method of Non-Patent Document 2 keeps the increase in the abstraction level of information across the entire disclosed data as small as possible, minimizing the information loss caused by abstraction (anonymization).
However, anonymization by the related techniques described above treats all data equally in order to satisfy k-anonymity, and therefore has the problem that information required by the data user may be lost. Patent Document 1 discloses a privacy protection device that addresses this problem. This privacy protection device generalizes (abstracts) attribute values (quasi-identifiers) based on a priority order that is set for each attribute name (attribute type) and indicates importance to the data user. That is, by generalizing the attribute values of lower-priority attributes first, this privacy protection device performs abstraction so that the higher the priority of an attribute, the more of its original information is retained.
Japanese Unexamined Patent Application Publication No. 2011-128862 (JP 2011-128862 A)
However, the technique described in Patent Document 1 has the problem that, when data is anonymized, the abstraction level of an attribute value of interest cannot be preferentially and locally lowered.
The abstraction level cannot be lowered locally because the technique described in Patent Document 1 generalizes attribute values based on priorities set per attribute type.
An object of the present invention is to provide an information processing apparatus that executes anonymization processing, an anonymization method, and a program therefor, each of which can solve the problem described above.
The information processing apparatus of the present invention includes: focus partial anonymization group creation means for acquiring a plurality of user information records including arbitrary attribute values, and creating focus partial anonymization group candidates, each grouping a plurality of the user information records that include at least an attention attribute value, which is a specific one of the attribute values; and
information loss amount calculation means for calculating an information loss amount indicating the amount of information lost in the information obtained from a focus partial anonymization group candidate relative to the information obtained from the user information records corresponding to that focus partial anonymization group candidate,
wherein the focus partial anonymization group creation means determines, from among the created focus partial anonymization group candidates, the focus partial anonymization group candidate corresponding to the smallest information loss amount as a focus partial anonymization group, and outputs it.
In the anonymization method of the present invention, a computer:
acquires a plurality of user information records including arbitrary attribute values;
creates focus partial anonymization group candidates, each grouping a plurality of the user information records that include at least an attention attribute value, which is a specific one of the attribute values;
calculates an information loss amount indicating the amount of information lost in the information obtained from a focus partial anonymization group candidate relative to the information obtained from the user information records corresponding to that focus partial anonymization group candidate; and
determines, from among the created focus partial anonymization group candidates, the focus partial anonymization group candidate corresponding to the smallest information loss amount as a focus partial anonymization group, and outputs it.
The non-volatile recording medium of the present invention records a program that causes a computer to execute processing of:
acquiring a plurality of user information records including arbitrary attribute values;
creating focus partial anonymization group candidates, each grouping a plurality of the user information records that include at least an attention attribute value, which is a specific one of the attribute values;
calculating an information loss amount indicating the amount of information lost in the information obtained from a focus partial anonymization group candidate relative to the information obtained from the user information records corresponding to that focus partial anonymization group candidate; and
determining, from among the created focus partial anonymization group candidates, the focus partial anonymization group candidate corresponding to the smallest information loss amount as a focus partial anonymization group, and outputting it.
The present invention has the effect that anonymized data can be obtained in which the abstraction level of an attribute value of interest is preferentially and locally lowered.
FIG. 1 is a block diagram showing the configuration of the anonymization system according to the first embodiment.
FIG. 2 is a diagram illustrating an example of a user information record in the first embodiment.
FIG. 3 is a diagram illustrating an example of the anonymized user information record in the first embodiment.
FIG. 4 is a diagram illustrating an example of a property record in the first embodiment.
FIG. 5 is a diagram illustrating an example of a focus record in the first embodiment.
FIG. 6 is a block diagram illustrating a hardware configuration of a computer that implements the anonymization device according to the first embodiment.
FIG. 7 is a flowchart illustrating the operation of the anonymization device according to the first embodiment.
FIG. 8 is a diagram illustrating an example in which the focus partial anonymization group creation unit selects a user information record in the first embodiment.
FIG. 9 is a diagram illustrating an example of a focus partial anonymization group candidate in the first embodiment.
FIG. 10 is a diagram illustrating an example of a focus partial anonymization group candidate in the first embodiment.
FIG. 11 is a block diagram showing the configuration of the anonymization system according to the second embodiment.
FIG. 12 is a diagram illustrating an example of a user information record in the second embodiment.
FIG. 13 is a diagram illustrating an example of a division value record in the second embodiment.
FIG. 14 is a block diagram illustrating a configuration of an anonymization system according to the third embodiment.
FIG. 15 is a diagram illustrating an example of a focus record in the third embodiment.
FIG. 16 is a flowchart illustrating the operation of the anonymization device of the third embodiment.
FIG. 17 is a block diagram illustrating a configuration of the anonymization system according to the fourth embodiment.
 本発明を実施するための形態について図面を参照して詳細に説明する。尚、各図面及び明細書記載の各実施形態において、同様の機能を備える構成要素には同様の符号が与えられている。 Embodiments for carrying out the present invention will be described in detail with reference to the drawings. In addition, in each embodiment described in each drawing and specification, the same code | symbol is given to the component provided with the same function.
 <<第1の実施形態>>
 図1は、本発明の第1の実施形態に係る匿名化システムの構成を示すブロック図である。
<< First Embodiment >>
FIG. 1 is a block diagram showing the configuration of the anonymization system according to the first embodiment of the present invention.
Referring to FIG. 1, the anonymization system (also called an information processing system) according to this embodiment includes an anonymization device (also called an information processing device) 100, a user information storage unit 510, and an anonymized user information storage unit 520.
The anonymization device 100, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). The user information storage unit 510 may be included in the anonymization device 100. Likewise, the anonymized user information storage unit 520 may be included in the anonymization device 100.
The anonymization device 100 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
FIG. 2 is a diagram showing an example of a user information record 511 stored in the user information storage unit 510. The user information storage unit 510 holds a plurality of user information records 511 as user information. As shown in FIG. 2, the user information storage unit 510 includes one or more user information records 511. A user information record 511 includes a number 519, an age 512, and a medical condition 513.
The age 512 is a quasi-identifier. The medical condition 513 is a sensitive attribute. Quasi-identifiers (age 512) and sensitive attributes (medical condition 513) are both generally called attributes. A quasi-identifier is information that, when combined with other quasi-identifiers, may make it possible to identify an individual. A sensitive attribute is information that a person generally does not want others to know.
The number 519 identifies a user information record 511. When a user information record 511 needs to be referred to individually, the user information record 511 whose number 519 is "1", for example, is written as user information record 511(1).
The user information is, for example, receipt (medical insurance claim) information held by a government agency or a medical institution. The receipt information includes date of birth, sex, illness, and the like.
In the user information record 511 shown in FIG. 2, the age 512 holds the attribute value of the age attribute, and the medical condition 513 holds the attribute value of the medical condition attribute. For example, user information record 511(1) indicates that the corresponding user is 20 years old and suffers from heart disease.
The user information record 511 is not limited to the above and may hold arbitrary information. For example, a user information record may include the age 512, the medical condition 513, and several other kinds of information (for example, sex). A user information record also need not include, for example, the medical condition 513. Furthermore, each attribute (quasi-identifier or sensitive attribute) may hold a plurality of attribute values. For example, the medical condition 513 may hold the two attribute values "hay fever" and "tooth decay".
FIG. 3 is a diagram showing an example of an anonymized user information record 521 stored in the anonymized user information storage unit 520. The anonymized user information storage unit 520 includes k or more anonymized user information records 521. As shown in FIG. 3, an anonymized user information record 521 includes a group number 529, an age 512, and a medical condition 513.
The group number 529 identifies an anonymized user information record 521. When an anonymized user information record 521 needs to be referred to individually, the anonymized user information record 521 whose group number 529 is "1", for example, is written as anonymized user information record 521(1). The anonymized user information record 521 need not include the group number 529. In that case, the anonymization device 100 may identify and process an anonymized user information record 521 by using, for example, the age 512.
The anonymized user information record 521 is user information that has been anonymized. The user information is as described above. For example, anonymized user information record 521(1) indicates that the corresponding users are 20 to 21 years old and suffer from heart disease, a fracture, or an infectious disease.
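The relationship between user information records 511 and an anonymized user information record 521 can be sketched as follows. This is only an illustrative model: the class names, field names, and the `generalize` helper are hypothetical, and the ages and conditions of records 2 and 3 are assumed values chosen to agree with the group described for FIG. 3.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UserInfoRecord:
    number: int      # number 519
    age: int         # age 512 (quasi-identifier)
    condition: str   # medical condition 513 (sensitive attribute)


@dataclass(frozen=True)
class AnonymizedRecord:
    group_number: int            # group number 529
    age_range: Tuple[int, int]   # generalized age 512, e.g. (20, 21)
    conditions: Tuple[str, ...]  # medical conditions 513 found in the group


def generalize(group_number: int, records: List[UserInfoRecord]) -> AnonymizedRecord:
    """Merge the records of one group into a single anonymized record."""
    ages = [r.age for r in records]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    conditions = tuple(dict.fromkeys(r.condition for r in records))
    return AnonymizedRecord(group_number, (min(ages), max(ages)), conditions)


# Record 1 follows the text; records 2 and 3 are assumed sample values.
records = [
    UserInfoRecord(1, 20, "heart disease"),
    UserInfoRecord(2, 20, "fracture"),
    UserInfoRecord(3, 21, "infectious disease"),
]
group1 = generalize(1, records)
```

Here `group1` carries the age range 20 to 21 and the three conditions, which mirrors the description of anonymized user information record 521(1).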
Next, the components of the anonymization device 100 in the first embodiment will be described. Note that the components shown in FIG. 1 are functional units, not hardware units.
As shown in FIG. 1, the anonymization device 100 includes a property storage unit 110, a focus value storage unit 120, an anonymization execution reception unit 130, a focus partial anonymization group creation unit 140, an information loss amount calculation unit 150, and an anonymization group creation unit 160.
=== Property Storage Unit 110 ===
The property storage unit 110 stores information that serves as an anonymization index.
FIG. 4 is a diagram showing an example of a property record 111 stored in the property storage unit 110. As shown in FIG. 4, the property storage unit 110 includes one or more property records 111. A property record 111 includes a parameter name 112 and a parameter value 113. At least one of the property records 111 stored in the property storage unit 110 is a pair of a parameter name 112 and a parameter value 113 that specifies the k of k-anonymity.
In FIG. 4, the property record 111 that specifies the k of k-anonymity has the parameter name 112 "k" and the parameter value 113 "3". Also in FIG. 4, the property record 111 that indicates the quasi-identifier has the parameter name 112 "quasi-identifier name" and the parameter value 113 "age". The property record 111 that indicates the sensitive attribute has the parameter name 112 "sensitive attribute" and the parameter value 113 "medical condition".
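As a minimal sketch, the property records 111 of FIG. 4 can be treated as name/value pairs that are looked up by parameter name. The list-of-dictionaries layout and the `get_parameter` helper are assumptions made for illustration; the specification does not prescribe a storage format.

```python
# Property records 111 as (parameter name 112, parameter value 113) pairs.
property_records = [
    {"parameter_name": "k", "parameter_value": "3"},
    {"parameter_name": "quasi-identifier name", "parameter_value": "age"},
    {"parameter_name": "sensitive attribute", "parameter_value": "medical condition"},
]


def get_parameter(records, name):
    """Return the parameter value stored under the given parameter name."""
    for record in records:
        if record["parameter_name"] == name:
            return record["parameter_value"]
    raise KeyError(f"no property record named {name!r}")


# The k of k-anonymity is read as an integer before grouping starts.
k = int(get_parameter(property_records, "k"))
quasi_identifier = get_parameter(property_records, "quasi-identifier name")
```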
=== Focus Value Storage Unit 120 ===
The focus value storage unit 120 holds information indicating which of the attribute values included in the user information records 511 is the attribute value of interest (the focus attribute value).
FIG. 5 is a diagram showing an example of a focus record 121 stored in the focus value storage unit 120. As shown in FIG. 5, the focus record 121 includes a quasi-identifier name 122 and a focus value 123. The information in the focus record 121 is entered into the anonymization device 100 in advance by a user of the anonymized data (not shown). Alternatively, the information in the focus record 121 may be included in the anonymization process execution start instruction described later and entered into the anonymization device 100 with it.
In FIG. 5, the focus record 121 has the quasi-identifier name 122 "age" and the focus value 123 "21". The focus record 121 therefore indicates that the attribute value of interest is, for example, the value "21" of the age 512 among the user information records 511 shown in FIG. 2.
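The way a focus record 121 picks out user information records can be sketched as a simple filter. The dictionary keys and the `focus_records` helper are hypothetical, and the sample records other than record 1 are assumed values.

```python
# Hypothetical dictionary layout for the focus record 121 of FIG. 5.
focus_record = {"quasi_identifier_name": "age", "focus_value": 21}

# Sample user information records 511 (records 2 to 4 are assumed values).
user_records = [
    {"number": 1, "age": 20, "medical condition": "heart disease"},
    {"number": 2, "age": 20, "medical condition": "fracture"},
    {"number": 3, "age": 21, "medical condition": "infectious disease"},
    {"number": 4, "age": 22, "medical condition": "fracture"},
]


def focus_records(records, focus):
    """Return the records whose named quasi-identifier equals the focus value."""
    name = focus["quasi_identifier_name"]
    return [r for r in records if r.get(name) == focus["focus_value"]]


matched = focus_records(user_records, focus_record)
```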
=== Anonymization Execution Reception Unit 130 ===
The anonymization execution reception unit 130 receives an anonymization process execution start instruction from the outside and outputs the received instruction.
=== Focus Partial Anonymization Group Creation Unit 140 ===
The focus partial anonymization group creation unit 140 uses the user information records 511 stored in the user information storage unit 510 to create a focus partial anonymization group based on the focus record 121 stored in the focus value storage unit 120.
In this embodiment, the focus partial anonymization group creation unit 140 is triggered by an instruction from the anonymization group creation unit 160, acquires the user information records 511 and the property record 111 via the anonymization group creation unit 160, and creates the focus partial anonymization group. Alternatively, the focus partial anonymization group creation unit 140 may create the focus partial anonymization group when triggered by the anonymization process execution start instruction received by the anonymization execution reception unit 130. The focus partial anonymization group creation unit 140 may also acquire the user information records 511 directly from the user information storage unit 510, and may acquire the property record 111 directly from the property storage unit 110.
The focus partial anonymization group creation unit 140 then outputs the created focus partial anonymization group to the anonymization group creation unit 160.
Specifically, the focus partial anonymization group creation unit 140 creates the focus partial anonymization group as follows.
First, among the user information records 511 shown in FIG. 2, the focus partial anonymization group creation unit 140 groups at least a user information record 511 that includes the focus value 123 with other user information records 511 to create a focus partial anonymization group candidate.
In doing so, the focus partial anonymization group creation unit 140 performs the grouping based on the property record 111 in the property storage unit 110 that specifies the k of k-anonymity. For example, the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate by grouping a total of k user information records 511, at least one of which includes the focus value 123.
Second, the focus partial anonymization group creation unit 140 outputs the created focus partial anonymization group candidate to the information loss amount calculation unit 150. The focus partial anonymization group creation unit 140 then receives the information loss amount of that candidate from the information loss amount calculation unit 150.
Third, among the focus partial anonymization group candidates for which it has received an information loss amount, the focus partial anonymization group creation unit 140 determines the candidate with the smallest information loss amount to be the focus partial anonymization group.
The focus partial anonymization group includes at least the information corresponding to a user information record 511 that includes the focus value 123. This corresponding information is, for example, the medical condition 513 value "infectious disease" of the user information record 511 whose age 512 equals the focus value 123, "21".
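The three steps above can be sketched as a brute-force search: enumerate every group of k records that contains a focus record, score each one, and keep the cheapest. This exhaustive enumeration is an illustrative assumption, not the procedure the specification prescribes, and the function names and sample ages are hypothetical. The score used is the range-based information loss amount that the specification gives for the information loss amount calculation unit 150.

```python
from itertools import combinations


def information_loss(ages):
    """(max - min + 1) x number of records, for one candidate group."""
    return (max(ages) - min(ages) + 1) * len(ages)


def best_focus_group(ages, focus_value, k):
    """Return (loss, record indexes) of the cheapest k-record group that
    contains at least one record equal to the focus value, or None."""
    focus_indexes = {i for i, age in enumerate(ages) if age == focus_value}
    if not focus_indexes or len(ages) < k:
        return None
    best = None
    for combo in combinations(range(len(ages)), k):
        if focus_indexes.isdisjoint(combo):
            continue  # step 1: every candidate must contain a focus record
        loss = information_loss([ages[i] for i in combo])  # step 2: score it
        if best is None or loss < best[0]:                 # step 3: keep the minimum
            best = (loss, combo)
    return best


ages = [20, 20, 21, 22, 23, 25]  # assumed sample ages, sorted
result = best_focus_group(ages, focus_value=21, k=3)
```

The number of combinations grows quickly with the number of records, so this form is only suitable for illustrating the selection rule on small inputs.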
=== Information Loss Amount Calculation Unit 150 ===
The information loss amount calculation unit 150 calculates the information loss amount of a focus partial anonymization group candidate. For example, the information loss amount calculation unit 150 calculates the information loss amount using the following formula.
Information loss amount = (maximum value of the specific attribute value included in the focus value anonymization group candidate - minimum value of the specific attribute value included in the focus value anonymization group candidate + 1) × number of records.
In the above formula, the "specific attribute value included in the focus value anonymization group candidate" is, in other words, the attribute value of the user information records 511 in that candidate that is identified by the quasi-identifier name 122 in the focus value storage unit 120, namely the age 512. That is, the specific attribute value is the attribute value corresponding to the quasi-identifier name 122 of the focus record 121; for the focus record 121 shown in FIG. 5, it is the age 512. The term "maximum value of the specific attribute value included in the focus value anonymization group candidate - minimum value of the specific attribute value included in the focus value anonymization group candidate", that is, the difference in age 512, is the range of that attribute value over the user information records 511 in the candidate.
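The formula can be restated compactly in symbols. The notation is introduced here only for illustration: G denotes a focus value anonymization group candidate, a(r) the value in record r of the attribute named by the quasi-identifier name 122, and |G| the number of records in G.

```latex
\mathrm{loss}(G) \;=\; \Bigl(\max_{r \in G} a(r) \;-\; \min_{r \in G} a(r) \;+\; 1\Bigr) \times |G|
```

For a candidate of three records whose ages span 20 to 21, this gives (21 - 20 + 1) × 3 = 6.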
When an attribute value is a character string such as an address or sex, the information loss amount calculation unit 150 may calculate the information loss amount of a categorical value (character string) as described in Non-Patent Document 2. For example, the information loss amount calculation unit 150 may calculate information loss amount = (number of original user information records) / (number of specific attribute values after grouping). Referring to FIG. 2 and FIG. 3, for the anonymized user information record 521 whose group number 529 is "1", the information loss amount calculation unit 150 obtains information loss amount = 3 / 1 = 3. Here, the number of original user information records is three: the user information records 511 whose numbers 519 are "1", "2", and "3". The number of specific attribute values after grouping is one: the single age 512 value "20-21" in the group whose group number 529 is "1".
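The ratio described above can be sketched as follows. The function name and argument layout are hypothetical; the sketch simply divides the number of original records by the number of distinct values that remain after grouping, as in the 3 / 1 = 3 example.

```python
def categorical_information_loss(original_records, generalized_values):
    """Number of original records divided by the number of distinct
    attribute values that remain after grouping."""
    distinct = set(generalized_values)
    if not distinct:
        raise ValueError("at least one generalized value is required")
    return len(original_records) / len(distinct)


# Records 1 to 3 are merged into group 1, whose only age value is "20-21".
loss = categorical_information_loss([1, 2, 3], ["20-21", "20-21", "20-21"])
```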
=== Anonymization Group Creation Unit 160 ===
The anonymization group creation unit 160 creates anonymized user information records 521 and stores them in the anonymized user information storage unit 520. If the anonymization group creation unit 160 cannot create anonymized user information records 521 that guarantee anonymity, it does not store anything in the anonymized user information storage unit 520.
The anonymization group creation unit 160 creates the anonymized user information records 521 when triggered by the anonymization process execution start instruction received by the anonymization execution reception unit 130.
Specifically, the anonymization group creation unit 160 creates the anonymized user information records 521 as follows.
First, the anonymization group creation unit 160 passes the user information records 511 and the property record 111 to the focus partial anonymization group creation unit 140 and instructs it to create a focus partial anonymization group. In response to that instruction, the anonymization group creation unit 160 receives the focus partial anonymization group from the focus partial anonymization group creation unit 140.
Second, the anonymization group creation unit 160 creates one or more anonymization groups from the user information records 511 other than those belonging to the focus partial anonymization group created by the focus partial anonymization group creation unit 140.
Third, the anonymization group creation unit 160 creates an anonymized user information record 521 for the focus partial anonymization group received from the focus partial anonymization group creation unit 140 and for each anonymization group it created itself, and stores these records in the anonymized user information storage unit 520. The anonymized user information record 521 for a group is obtained by merging the user information records 511 in that group and assigning a group number 529.
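The second and third steps can be sketched as below. The specification does not say how the remaining records are grouped, so the strategy here (sort by age, cut into chunks of k, and fold a short final chunk into the previous one) is an assumption made only for illustration, as are the function name, the tuple layout, and the sample data. Returning `None` stands for the case in which anonymity cannot be guaranteed and nothing is stored.

```python
def build_anonymized_records(records, focus_group_numbers, k):
    """records: list of (number, age, condition) tuples.
    Returns a list of anonymized records, or None if some group would
    contain fewer than k records."""
    focus_group = [r for r in records if r[0] in focus_group_numbers]
    remaining = sorted(
        (r for r in records if r[0] not in focus_group_numbers),
        key=lambda r: r[1],
    )
    if len(focus_group) < k or (remaining and len(remaining) < k):
        return None  # anonymity cannot be guaranteed

    # Assumed strategy: consecutive chunks of k records by ascending age.
    chunks = [remaining[i:i + k] for i in range(0, len(remaining), k)]
    if len(chunks) > 1 and len(chunks[-1]) < k:
        tail = chunks.pop()
        chunks[-1].extend(tail)  # fold the short tail into the previous chunk

    anonymized = []
    for group_number, group in enumerate([focus_group] + chunks, start=1):
        ages = [r[1] for r in group]
        anonymized.append({
            "group_number": group_number,                # group number 529
            "age": (min(ages), max(ages)),               # generalized age 512
            "conditions": sorted({r[2] for r in group}), # medical condition 513
        })
    return anonymized


# Assumed sample data; only record 1 follows the text.
sample = [
    (1, 20, "heart disease"), (2, 20, "fracture"), (3, 21, "infectious disease"),
    (4, 22, "hay fever"), (5, 23, "fracture"), (6, 25, "heart disease"),
    (7, 26, "tooth decay"),
]
anonymized = build_anonymized_records(sample, {1, 2, 3}, k=3)
```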
This completes the description of the functional components of the anonymization device 100.
Next, the hardware components of the anonymization device 100 will be described.
FIG. 6 is a diagram showing the hardware configuration of a computer 700 that implements the anonymization device 100 in this embodiment.
As shown in FIG. 6, the computer 700 includes a CPU (Central Processing Unit) 701, a storage unit 702, a storage device 703, an input unit 704, an output unit 705, and a communication unit 706. The computer 700 further includes a recording medium (or storage medium) 707 supplied from the outside. The recording medium 707 may be a non-volatile recording medium that stores information non-transitorily.
The CPU 701 runs an operating system (not shown) and controls the overall operation of the computer 700. The CPU 701 also reads programs (for example, a program that causes the computer 700 to execute the operation of the flowchart shown in FIG. 7, described later) and data from, for example, the recording medium 707 mounted in the storage device 703, and writes the read programs and data to the storage unit 702. In accordance with the read programs and based on the read data, the CPU 701 then executes various processes as the anonymization execution reception unit 130, the focus partial anonymization group creation unit 140, the information loss amount calculation unit 150, and the anonymization group creation unit 160 shown in FIG. 1.
The CPU 701 may also download programs and data to the storage unit 702 from an external computer (not shown) connected to a communication network (not shown).
The storage unit 702 stores programs and data. The storage unit 702 may include the property storage unit 110 and the focus value storage unit 120. Furthermore, when the computer 700 (anonymization device 100) includes the user information storage unit 510 and the anonymized user information storage unit 520, the storage unit 702 may include each of them.
The storage device 703 is, for example, an optical disk, a flexible disk, a magneto-optical disk, an external hard disk, or a semiconductor memory, and includes the recording medium 707. The storage device 703 records programs in a computer-readable manner. The storage device 703 may also record data in a computer-readable manner. The storage device 703 may include the property storage unit 110 and the focus value storage unit 120. Furthermore, when the computer 700 (anonymization device 100) includes the user information storage unit 510 and the anonymized user information storage unit 520, the storage device 703 may include each of them.
The input unit 704 is implemented by, for example, a mouse, a keyboard, or built-in key buttons, and is used for input operations. The input unit 704 is not limited to these and may be, for example, a touch panel, an accelerometer, a gyro sensor, or a camera. The input unit 704 is included as part of the anonymization execution reception unit 130.
The output unit 705 is implemented by, for example, a display, and is used to check output.
The communication unit 706 provides the interface to the user information storage unit 510 and the anonymized user information storage unit 520. The communication unit 706 is included as part of the anonymization group creation unit 160.
As described above, the functional blocks of the anonymization device 100 shown in FIG. 1 are implemented by the computer 700 with the hardware configuration shown in FIG. 6. However, the means of implementing each unit of the computer 700 is not limited to the above. That is, the computer 700 may be implemented as a single physically integrated device, or as two or more physically separate devices connected by wire or wirelessly.
The recording medium 707 on which the code of the above program is recorded may be supplied to the computer 700, and the CPU 701 may read and execute the program code stored on the recording medium 707. Alternatively, the CPU 701 may store the program code held on the recording medium 707 in the storage unit 702, the storage device 703, or both. That is, this embodiment includes an embodiment of the recording medium 707 that stores, transitorily or non-transitorily, the program (software) executed by the computer 700 (CPU 701).
This completes the description of the hardware components of the computer 700 that implements the anonymization device 100 in this embodiment.
Next, the operation of this embodiment will be described in detail with reference to FIGS. 1 to 10.
FIG. 7 is a flowchart showing the operation of the anonymization device 100 of this embodiment. The processing in this flowchart may be executed under the program control of the CPU described above. Processing steps are denoted by symbols such as S601.
The anonymization execution reception unit 130 receives an anonymization process execution start instruction from a user of the anonymized data (not shown) and outputs it to the anonymization group creation unit 160 (S601).
Next, upon receiving the anonymization process execution start instruction, the anonymization group creation unit 160 acquires the user information records 511 from the user information storage unit 510 (S602).
Next, the anonymization group creation unit 160 acquires the property record 111 whose parameter name 112 is "k" from the property storage unit 110 (S603).
Next, the anonymization group creation unit 160 outputs the user information records 511 acquired in S602 and the parameter value 113 (for example, "3") of the property record 111 acquired in S603, whose parameter name 112 is "k", to the focus partial anonymization group creation unit 140, and instructs it to create a focus partial anonymization group (S604).
Next, the focus partial anonymization group creation unit 140 acquires the focus record 121 (for example, quasi-identifier name 122 "age" and focus value 123 "21") from the focus value storage unit 120 (S605).
Next, the focus partial anonymization group creation unit 140 creates focus partial anonymization group candidates based on the acquired focus record 121 (S606).
FIG. 8 is a diagram showing an example of how the focus partial anonymization group creation unit 140 selects user information records 511 when creating focus partial anonymization group candidates. As shown in FIG. 8, the focus partial anonymization group creation unit 140 starts from the user information record 511 whose age 512 is "21", treats three records taken in the direction of decreasing age as one group, and creates a focus value anonymization group candidate. The focus partial anonymization group creation unit 140 likewise treats three records taken in the direction of increasing age as one group and creates another focus value anonymization group candidate. The focus partial anonymization group creation unit 140 may additionally create a focus value anonymization group candidate from the three records centered on the user information record 511 whose age 512 is "21". FIG. 9 shows an example in which the focus partial anonymization group creation unit 140 has combined three records in the direction of decreasing age into one focus partial anonymization group candidate. FIG. 10 shows an example in which the focus partial anonymization group creation unit 140 has combined three records in the direction of increasing age into one focus partial anonymization group candidate.
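The record selection of FIG. 8 can be sketched as taking windows of k consecutive records from a list sorted by age: one window that ends at the focus record, one that starts at it, and optionally one centered on it. The function name, the window labels, and the sample ages are assumptions introduced for illustration.

```python
def focus_candidates(sorted_ages, focus_index, k):
    """Return candidate windows (lists of record indexes) of k consecutive
    records around the focus record, skipping windows that run off the list."""
    half = (k - 1) // 2
    windows = {
        "lower": range(focus_index - k + 1, focus_index + 1),      # decreasing age
        "upper": range(focus_index, focus_index + k),              # increasing age
        "centered": range(focus_index - half, focus_index - half + k),
    }
    n = len(sorted_ages)
    return {
        name: list(window)
        for name, window in windows.items()
        if window.start >= 0 and window.stop <= n
    }


sorted_ages = [20, 20, 21, 22, 23, 25]  # assumed sample ages; index 2 is the focus record
candidates = focus_candidates(sorted_ages, focus_index=2, k=3)
candidate_ages = {name: [sorted_ages[i] for i in idx] for name, idx in candidates.items()}
```

With these sample ages the "lower" window spans ages 20 to 21 and the "upper" window spans ages 21 to 23, which matches the ranges used for the candidates of FIG. 9 and FIG. 10.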
Next, the focus partial anonymization group creation unit 140 transmits the created focus partial anonymization group candidates to the information loss amount calculation unit 150 (S607).
Next, the information loss amount calculation unit 150 calculates the information loss amount for each of the received focus partial anonymization group candidates (S608).
For example, the information loss amount calculation unit 150 calculates the information loss amounts of the focus partial anonymization group candidates shown in FIGS. 9 and 10 as follows, using the information loss amount formula described above.
The information loss amount of the focus partial anonymization group candidate shown in FIG. 9 is (21 - 20 + 1) × 3 = 6.
The information loss amount of the focus partial anonymization group candidate shown in FIG. 10 is (23 - 21 + 1) × 3 = 9.
Next, based on the information loss amounts received from the information loss amount calculation unit 150, the focus partial anonymization group creation unit 140 determines the focus partial anonymization group candidate with the smallest information loss amount (for example, "6", which here is the candidate shown in FIG. 9) to be the focus partial anonymization group. The focus partial anonymization group creation unit 140 then outputs the determined focus partial anonymization group to the anonymization group creation unit 160 (S609).
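The arithmetic of S608 and the selection of S609 can be reproduced with a short Python sketch. It assumes that the information loss amount is (maximum - minimum + 1) × number of records, which is what the worked figures for FIGS. 9 and 10 imply; the formula itself is defined earlier in the description. The individual ages below are hypothetical values chosen only to match the ranges 20 to 21 and 21 to 23.

```python
from typing import Dict, List


def information_loss(ages: List[int]) -> int:
    """(max - min + 1) x number of records, as in the worked example."""
    return (max(ages) - min(ages) + 1) * len(ages)


# Hypothetical ages chosen only to match the ranges quoted for FIGS. 9 and 10.
candidates: Dict[str, List[int]] = {
    "fig9": [20, 20, 21],
    "fig10": [21, 22, 23],
}

# S608: one information loss amount per candidate.
losses = {name: information_loss(ages) for name, ages in candidates.items()}

# S609: the candidate with the smallest loss becomes the focus group.
best = min(losses, key=losses.get)
print(losses, best)
```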
Next, the anonymization group creation unit 160 creates one or more anonymization groups from the user information records 511 other than those corresponding to the focus partial anonymization group created by the focus partial anonymization group creation unit 140 (S610).
Next, the anonymization group creation unit 160 creates anonymized user information records 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 140 and to each anonymization group it created itself, and stores them in the anonymized user information storage unit 520 (S611).
This concludes the description of the operation of the present embodiment.
The anonymized user information records 521 created as described above are anonymized with priority given to minimizing the abstraction level of the quasi-identifier (age 512) corresponding to the focus value 123 specified by the user of the anonymized data. That is, the anonymization device 100 of the present embodiment can minimize the abstraction level of the quasi-identifier corresponding to the specified focus value 123.
Using an anonymized data set (a set of anonymized user information records 521) whose abstraction level has been lowered locally makes it possible to examine in detail the neighborhood of a meaningful value, such as the postal code of a disaster area or the dates of birth of people affected by a major revision of the curriculum guidelines.
The effect of the present embodiment described above is that anonymized data can be obtained in which the abstraction level of the attribute value of interest is preferentially and locally lowered.
This is because the embodiment includes the following configuration. First, the focus partial anonymization group creation unit 140 creates focus partial anonymization group candidates. Second, the information loss amount calculation unit 150 calculates the information loss amount of each focus partial anonymization group candidate. Third, the focus partial anonymization group creation unit 140 determines the focus partial anonymization group candidate with the smallest information loss amount to be the focus partial anonymization group.
<<Second Embodiment>>
Next, a second embodiment of the present invention will be described in detail with reference to the drawings. In the following, descriptions that overlap with the foregoing are omitted to the extent that the description of the present embodiment does not become unclear.
FIG. 11 is a block diagram showing the configuration of an anonymization system according to the second embodiment of the present invention.
Referring to FIG. 11, the anonymization system according to the present embodiment includes an anonymization device 200, a user information storage unit 510, and an anonymized user information storage unit 520.
The anonymization device 200, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). The user information storage unit 510 may be included in the anonymization device 200, as may the anonymized user information storage unit 520.
The anonymization device 200 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
Referring to FIG. 11, the anonymization device 200 of the present embodiment differs from the anonymization device 100 of the first embodiment in that it further includes a division value storage unit 270 and has a focus partial anonymization group creation unit 240 in place of the focus partial anonymization group creation unit 140.
FIG. 12 shows an example of the user information records 511 in the present embodiment. As shown in FIG. 12, the user information record 511 of the present embodiment further includes a consultation date 514 as a quasi-identifier, compared with the user information record 511 of FIG. 2.
===Division value storage unit 270===
The division value storage unit 270 holds information indicating the attribute values at which the user information is to be divided.
FIG. 13 shows an example of a division value record 271 stored in the division value storage unit 270. As shown in FIG. 13, the division value storage unit 270 contains one or more division value records 271. Each division value record 271 includes a quasi-identifier name 272 and a division value 273. The information in the division value record 271 is entered into the anonymization device 200 in advance by a user of the anonymized data (not shown). Alternatively, this information may be supplied to the anonymization device 200 as part of the instruction to start the anonymization process described above.
In FIG. 13, the division value record 271 has the quasi-identifier name 272 "consultation date" and the division value 273 "November 30, 2011, December 1, 2011". The division value record 271 therefore indicates that the user information is to be divided into user information records 511 dated on or before November 30, 2011 and user information records 511 dated on or after December 1, 2011.
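A division of this kind might look as follows in Python. This is only a sketch: the function name `split_by_division`, the dictionary layout of a record, and the four sample records are assumptions for illustration and do not reproduce the contents of FIG. 12.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_by_division(
    records: List[Record], last_before: date, first_after: date
) -> Tuple[List[Record], List[Record]]:
    """Split records at a division value given as a pair of boundary dates."""
    before = [r for r in records if r["consultation_date"] <= last_before]
    after = [r for r in records if r["consultation_date"] >= first_after]
    return before, after


# Hypothetical records; only the boundary dates come from the description.
records = [
    {"number": 1, "consultation_date": date(2011, 11, 15)},
    {"number": 2, "consultation_date": date(2011, 12, 5)},
    {"number": 3, "consultation_date": date(2011, 11, 30)},
    {"number": 4, "consultation_date": date(2011, 12, 1)},
]
before, after = split_by_division(records, date(2011, 11, 30), date(2011, 12, 1))
```

Keeping both boundary dates, as the division value 273 does, makes the inclusive ends of each side explicit.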
===Focus partial anonymization group creation unit 240===
Using the user information records 511 stored in the user information storage unit 510, the focus partial anonymization group creation unit 240 creates a focus partial anonymization group based on the focus record 121 stored in the focus value storage unit 120 and the division value record 271 stored in the division value storage unit 270.
Specifically, the focus partial anonymization group creation unit 240 divides the user information records 511 shown in FIG. 12 based on the division value record 271 to create a plurality of division groups. Then, in the same manner as the focus partial anonymization group creation unit 140 of the first embodiment, it executes steps S606 to S609 of FIG. 7 for each division group of user information records 511 and creates a focus partial anonymization group.
For example, the focus partial anonymization group creation unit 240 divides the user information records 511 of FIG. 12 into the user information records 511 whose number 519 is "1", "3", "4", or "6" and the user information records 511 whose number 519 is "2", "5", or "7".
Next, the focus partial anonymization group creation unit 240 executes steps S606 to S609 and outputs focus partial anonymization groups created without straddling the division value 273.
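The per-division processing can be sketched in Python as below. Several simplifications are assumed and are not taken from the description: the division value is reduced to a single boundary date, every run of k age-sorted records containing the focus age is treated as a candidate (a superset of the decreasing, increasing, and centered candidates of the first embodiment), the loss is (maximum - minimum + 1) × number of records, and all names and sample records are hypothetical.

```python
from datetime import date
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Any]


def best_focus_group(
    records: List[Record], focus_age: int, k: int
) -> Optional[Tuple[int, List[Record]]]:
    """Return (loss, group) for the lowest-loss run of k age-sorted records
    that contains the focus age, or None if no such run exists."""
    ordered = sorted(records, key=lambda r: r["age"])
    best = None
    for start in range(len(ordered) - k + 1):
        group = ordered[start : start + k]
        ages = [r["age"] for r in group]
        if focus_age not in ages:
            continue
        loss = (max(ages) - min(ages) + 1) * k
        if best is None or loss < best[0]:
            best = (loss, group)
    return best


def focus_groups_per_partition(
    records: List[Record], focus_age: int, k: int, last_before: date
) -> List[Optional[Tuple[int, List[Record]]]]:
    """Divide first, then pick a focus group inside each division group,
    so that no group straddles the division value."""
    early = [r for r in records if r["consultation_date"] <= last_before]
    late = [r for r in records if r["consultation_date"] > last_before]
    return [best_focus_group(part, focus_age, k) for part in (early, late)]


# Hypothetical records, not the contents of FIG. 12.
records = [
    {"age": 19, "consultation_date": date(2011, 11, 2)},
    {"age": 21, "consultation_date": date(2011, 11, 9)},
    {"age": 22, "consultation_date": date(2011, 11, 30)},
    {"age": 30, "consultation_date": date(2011, 11, 12)},
    {"age": 21, "consultation_date": date(2011, 12, 1)},
    {"age": 21, "consultation_date": date(2011, 12, 8)},
    {"age": 25, "consultation_date": date(2011, 12, 20)},
]
results = focus_groups_per_partition(
    records, focus_age=21, k=2, last_before=date(2011, 11, 30)
)
```

Because the split happens before any grouping, the constraint of not straddling the division value holds by construction rather than by a later check.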
Then, the anonymization group creation unit 160 stores, in the anonymized user information storage unit 520, the anonymized user information records 521 corresponding to the focus partial anonymization groups created without straddling the division value 273.
With this configuration, user information can be anonymized separately before and after, for example, the date on which a guideline was established or the date on which a new drug was approved for use, so that more useful anonymized user information records 521 can be created.
The division value record 271 is not limited to the above example and may specify a division on any attribute value included in the user information record 511. There may also be a plurality of division value records 271.
The effect of the present embodiment described above is that, in addition to the effect of the first embodiment, anonymized data can be obtained in which the abstraction level of the attribute value of interest is preferentially and locally lowered for a more narrowly scoped set of user information records 511.
This is because the focus partial anonymization group creation unit 240 divides the user information records 511 based on the division value record 271.
<<Third Embodiment>>
Next, a third embodiment of the present invention will be described in detail with reference to the drawings. In the following, descriptions that overlap with the foregoing are omitted to the extent that the description of the present embodiment does not become unclear.
FIG. 14 is a block diagram showing the configuration of an anonymization system according to the third embodiment of the present invention.
Referring to FIG. 14, the anonymization system according to the present embodiment includes an anonymization device 300, a user information storage unit 510, and an anonymized user information storage unit 520.
The anonymization device 300, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). The user information storage unit 510 may be included in the anonymization device 300, as may the anonymized user information storage unit 520.
The anonymization device 300 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
Referring to FIG. 14, the anonymization device 300 of the present embodiment differs from the anonymization device 100 of the first embodiment in that it has a focus partial anonymization group creation unit 340 in place of the focus partial anonymization group creation unit 140 and an anonymization group creation unit 360 in place of the anonymization group creation unit 160.
FIG. 15 shows an example of the focus records 121 stored in the focus value storage unit 120 in the present embodiment. As shown in FIG. 15, the focus value storage unit 120 of the present embodiment contains a plurality of focus records 121. Each focus record 121 of the present embodiment includes a priority 128 in addition to the quasi-identifier name 122 and the focus value 123.
The priority 128 is information indicating the order in which the focus records 121 are used when creating focus partial anonymization groups. The priority 128 may instead be information indicating the weight of the focus record 121.
===Focus partial anonymization group creation unit 340===
The focus partial anonymization group creation unit 340 differs from the focus partial anonymization group creation unit 140 in the following respects.
On receiving an instruction from the anonymization group creation unit 360, the focus partial anonymization group creation unit 340 uses the focus records 121 in order of priority 128 to create a focus partial anonymization group and outputs it to the anonymization group creation unit 360. When doing so, it attaches completion information to the focus partial anonymization group. Here, the completion information indicates whether focus partial anonymization group candidates containing each focus record 121 have been created for all of the focus records 121 ("complete") or not ("incomplete").
===Anonymization group creation unit 360===
The anonymization group creation unit 360 differs from the anonymization group creation unit 160 in the following respects.
The anonymization group creation unit 360 receives, from the focus partial anonymization group creation unit 340, a focus partial anonymization group to which information indicating whether any unused focus record 121 remains (the completion information) has been attached. It then creates anonymized user information records 521 in the same manner as the anonymization group creation unit 160.
Next, the anonymization group creation unit 360 checks the completion information. If the completion information is "incomplete", it passes the created anonymized user information records 521 to the focus partial anonymization group creation unit 340 and instructs it to create a focus partial anonymization group again. If the completion information is "complete", it stores the created anonymized user information records 521 in the anonymized user information storage unit 520.
That is, the anonymization device 300 of the present embodiment allows a focus value 123 to be specified for each of a plurality of quasi-identifiers (any combination of quasi-identifiers with different quasi-identifier names 122 and quasi-identifiers with the same quasi-identifier name 122).
Next, the operation of the present embodiment will be described with reference to the drawings.
FIG. 16 is a flowchart showing the operation of the present embodiment.
S601 to S603 are the same operations as S601 to S603 in FIG. 7.
Next, the anonymization group creation unit 360 outputs the records to be processed and the parameter value 113 to the focus partial anonymization group creation unit 340 and instructs it to create a focus partial anonymization group (S634). Here, the records to be processed are either the user information records 511 acquired in S602 or the anonymized user information records 521 created in S641. The parameter value 113 is that of the property record 111 acquired in S603 whose parameter name 112 is "k".
S605 to S608 are the same operations as S605 to S608 in FIG. 7.
Next, based on the information loss amounts received from the information loss amount calculation unit 150, the focus partial anonymization group creation unit 340 determines the focus partial anonymization group candidate with the smallest information loss amount to be the focus partial anonymization group. The focus partial anonymization group creation unit 340 then attaches the completion information to the determined focus partial anonymization group and outputs it to the anonymization group creation unit 360 (S639).
S610 is the same operation as S610 in FIG. 7.
Next, the anonymization group creation unit 360 creates anonymized user information records 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 340 and to each anonymization group it created itself (S641).
Next, the anonymization group creation unit 360 checks the completion information (S642). If the completion information is "incomplete" (NO in S642), the process returns to S634.
If the completion information is "complete" (YES in S642), the anonymization group creation unit 360 stores the anonymized user information records 521 created in S641 in the anonymized user information storage unit 520 (S643).
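The loop from S634 to S643 can be outlined in Python as follows. This is a control-flow sketch only. It assumes that a smaller priority number is used first, which the description does not state, and `FocusRecord`, `anonymize_with_priorities`, `make_focus_group`, and `store` are hypothetical names; `make_focus_group` stands in for steps S605 through S641.

```python
from typing import Any, Callable, List, NamedTuple


class FocusRecord(NamedTuple):
    quasi_identifier: str
    focus_value: Any
    priority: int


def anonymize_with_priorities(
    records: List[Any],
    focus_records: List[FocusRecord],
    make_focus_group: Callable[[List[Any], FocusRecord], List[Any]],
    store: Callable[[List[Any]], None],
) -> List[Any]:
    """Apply one pass per focus record, in priority order, then store."""
    if not focus_records:
        raise ValueError("at least one focus record is required")
    # Assumption: a smaller priority number means the record is used earlier.
    pending = sorted(focus_records, key=lambda f: f.priority)
    current = records
    while True:
        focus = pending.pop(0)                       # S634: next focus record
        current = make_focus_group(current, focus)   # stands in for S605-S641
        completion = "incomplete" if pending else "complete"  # set in S639
        if completion == "complete":                 # S642
            store(current)                           # S643
            return current
```

Each pass receives the output of the previous pass, mirroring how the anonymized user information records 521 created in S641 are fed back into S634 until no unused focus record remains.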
This concludes the description of the operation of the present embodiment.
With this configuration, for example, by setting focus values 123 for the quasi-identifiers consultation date, date of birth, and sex, it becomes possible to examine such things as the incidence of cervical cancer with attention to female patients at the age for which use of a certain new drug was approved.
The focus record 121 need not include the priority 128. In that case, the focus partial anonymization group creation unit 340 may use the focus records 121 in order starting from the one at the lowest address or the one at the highest address in the focus value storage unit 120. Alternatively, the focus partial anonymization group creation unit 340 may use the plurality of focus records 121 in a fixed order predetermined for the quasi-identifier names 122, or in an arbitrary order.
The effect of the present embodiment described above is that, in addition to the effect of the first embodiment, anonymized data can be obtained in which the abstraction levels of attribute values of interest from a plurality of viewpoints are preferentially and locally lowered.
This is because the embodiment includes the following configuration. First, the focus partial anonymization group creation unit 340 creates focus partial anonymization groups by using a plurality of focus records 121 one after another. Second, until the focus partial anonymization group creation unit 340 has used all of the focus records 121, the anonymization group creation unit 360 instructs it to create a focus partial anonymization group for the anonymized user information records 521 created so far.
<<Fourth Embodiment>>
Next, a fourth embodiment of the present invention will be described in detail with reference to the drawings. In the following, descriptions that overlap with the foregoing are omitted to the extent that the description of the present embodiment does not become unclear.
FIG. 17 is a block diagram showing the configuration of an anonymization device 400 according to the fourth embodiment of the present invention.
As shown in FIG. 17, the anonymization device 400 includes a focus partial anonymization group creation unit 140 and an information loss amount calculation unit 150.
===Focus partial anonymization group creation unit 140===
The focus partial anonymization group creation unit 140 acquires, from user information storage means (not shown), a plurality of user information records 511 containing arbitrary attribute values (for example, the age 512 and medical condition 513 shown in FIG. 6 and the consultation date 514 shown in FIG. 12). The user information storage means may be included in the anonymization device 400 or may be the user information storage unit 510 as shown in FIG. 1.
The focus partial anonymization group creation unit 140 creates focus partial anonymization group candidates, each of which groups a plurality of user information records 511. The grouped user information records 511 include at least the attribute value of interest. The attribute value of interest is one of the arbitrary attribute values mentioned above and is identified by the quasi-identifier name 122 and the focus value 123 contained in the focus record 121.
The focus partial anonymization group creation unit 140 also determines, among the created focus partial anonymization group candidates, the candidate with the smallest information loss amount to be the focus partial anonymization group, and outputs it.
===Information loss amount calculation unit 150===
The information loss amount calculation unit 150 calculates and outputs the information loss amount of each focus partial anonymization group candidate created by the focus partial anonymization group creation unit 140. The information loss amount indicates how much of the information obtainable from the user information records 511 corresponding to a focus partial anonymization group candidate is lost (reduced) in the information obtainable from that candidate.
The effect of the present embodiment described above is that anonymized data can be obtained in which the abstraction level of the attribute value of interest is preferentially and locally lowered.
This is because the embodiment includes the following configuration. First, the focus partial anonymization group creation unit 140 creates focus partial anonymization group candidates. Second, the information loss amount calculation unit 150 calculates the information loss amount of each focus partial anonymization group candidate. Third, the focus partial anonymization group creation unit 140 determines the focus partial anonymization group candidate with the smallest information loss amount to be the focus partial anonymization group.
 以上の各実施形態で説明した各構成要素は、必ずしも個々に独立した存在である必要はない。例えば、各構成要素は、複数の構成要素が1個のモジュールとして実現されてよい。また、各構成要素は、1つの構成要素が複数のモジュールで実現されてもよい。また、各構成要素は、ある構成要素が他の構成要素の一部であるような構成であってよい。また、各構成要素は、ある構成要素の一部と他の構成要素の一部とが重複するような構成であってもよい。 Each component described in each of the above embodiments does not necessarily need to be an independent entity. For example, each component may be realized as a module with a plurality of components. In addition, each component may be realized by a plurality of modules. Each component may be configured such that a certain component is a part of another component. Each component may be configured such that a part of a certain component overlaps a part of another component.
 以上説明した各実施形態における各構成要素及び各構成要素を実現するモジュールは、必要に応じ、可能であれば、ハードウェア的に実現されてよい。また、各構成要素及び各構成要素を実現するモジュールは、コンピュータ及びプログラムで実現されてよい。また、各構成要素及び各構成要素を実現するモジュールは、ハードウェア的なモジュールとコンピュータ及びプログラムとの混在により実現されてもよい。 In the embodiments described above, each component and a module that realizes each component may be realized by hardware if necessary. Moreover, each component and the module which implement | achieves each component may be implement | achieved by a computer and a program. Each component and a module that realizes each component may be realized by mixing hardware modules, computers, and programs.
 そのプログラムは、例えば、磁気ディスクや半導体メモリなど、不揮発性のコンピュータ可読記録媒体に記録されて提供され、コンピュータの立ち上げ時などにコンピュータに読み取られる。この読み取られたプログラムは、そのコンピュータの動作を制御することにより、そのコンピュータを前述した各実施形態における構成要素として機能させる。 The program is provided by being recorded on a non-volatile computer-readable recording medium such as a magnetic disk or a semiconductor memory, and is read by the computer when the computer is started up. The read program causes the computer to function as a component in each of the above-described embodiments by controlling the operation of the computer.
 また、以上説明した各実施形態では、複数の動作をフローチャートの形式で順番に記載してあるが、その記載の順番は複数の動作を実行する順番を限定するものではない。このため、各実施形態を実施する時には、その複数の動作の順番は内容的に支障しない範囲で変更することができる。 In each of the embodiments described above, a plurality of operations are described in order in the form of a flowchart. However, the order of description does not limit the order in which the plurality of operations are executed. For this reason, when each embodiment is implemented, the order of the plurality of operations can be changed within a range that does not hinder the contents.
 更に、以上説明した各実施形態では、複数の動作は個々に相違するタイミングで実行されることに限定されない。例えば、ある動作の実行中に他の動作が発生したり、ある動作と他の動作との実行タイミングが部分的に乃至全部において重複していたりしていてもよい。 Furthermore, in each embodiment described above, a plurality of operations are not limited to being executed at different timings. For example, another operation may occur during the execution of a certain operation, or the execution timing of a certain operation and another operation may partially or entirely overlap.
 更に、以上説明した各実施形態では、ある動作が他の動作の契機になるように記載しているが、その記載はある動作と他の動作との全ての関係を限定するものではない。このため、各実施形態を実施する時には、その複数の動作の関係は内容的に支障のない範囲で変更することができる。また各構成要素の各動作の具体的な記載は、各構成要素の各動作を限定するものではない。このため、各構成要素の具体的な各動作は、各実施形態を実施する上で機能的、性能的、その他の特性に対して支障をきたさない範囲内で変更されて良い。 Furthermore, in each of the embodiments described above, it is described that a certain operation becomes a trigger for another operation, but the description does not limit all relationships between the certain operation and other operations. For this reason, when each embodiment is implemented, the relationship between the plurality of operations can be changed within a range that does not hinder the contents. The specific description of each operation of each component does not limit each operation of each component. For this reason, each specific operation | movement of each component may be changed in the range which does not cause trouble with respect to a functional, performance, and other characteristic in implementing each embodiment.
 The present invention has been described above with reference to the embodiments and examples, but the present invention is not limited to those embodiments and examples. Various changes that a person skilled in the art can understand may be made to the configuration and details of the present invention within the scope of the present invention.
 This application claims priority based on Japanese Patent Application No. 2012-045548, filed on March 1, 2012, the entire disclosure of which is incorporated herein.
Description of Symbols
 100  Anonymization device
 110  Property storage unit
 111  Property record
 112  Parameter name
 113  Parameter value
 120  Focus value storage unit
 121  Focus record
 122  Quasi-identifier name
 123  Focus value
 128  Priority
 130  Anonymization execution reception unit
 140  Focus partial anonymization group creation unit
 150  Information loss amount calculation unit
 160  Anonymization group creation unit
 200  Anonymization device
 240  Focus partial anonymization group creation unit
 270  Division value storage unit
 271  Division value record
 272  Quasi-identifier name
 273  Division value
 300  Anonymization device
 340  Focus partial anonymization group creation unit
 360  Anonymization group creation unit
 400  Anonymization device
 510  User information storage unit
 511  User information record
 512  Age
 513  Medical condition
 514  Consultation date
 519  Number
 520  Anonymized user information storage unit
 521  Anonymized user information record
 529  Group number
 700  Computer
 701  CPU
 702  Storage unit
 703  Storage device
 704  Input unit
 705  Output unit
 706  Communication unit
 707  Recording medium

Claims (8)

  1.  An information processing device comprising:
     focus partial anonymization group creating means for acquiring a plurality of user information records each including arbitrary attribute values, and for creating focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest that is a specific one of the attribute values; and
     information loss amount calculating means for calculating an information loss amount indicating how much of the information obtainable from the user information records corresponding to a focus partial anonymization group candidate is lost in the information obtainable from that focus partial anonymization group candidate,
     wherein the focus partial anonymization group creating means determines, from among the created focus partial anonymization group candidates, the focus partial anonymization group candidate corresponding to the smallest information loss amount as a focus partial anonymization group, and outputs the focus partial anonymization group.
  2.  The information processing device according to claim 1, wherein the information loss amount calculating means calculates the information loss amount based on
     the range of the attribute values, having the same attribute name as the attribute value of interest, in the user information records corresponding to each of the focus partial anonymization group candidates, and
     the number of the corresponding user information records.
  3.  The information processing device according to claim 1 or 2, wherein the focus partial anonymization group creating means creates the focus partial anonymization group candidates by grouping k user information records, k being the k of k-anonymity, that include at least the user information record including the attribute value of interest.
  4.  The information processing device according to any one of claims 1 to 3, wherein the focus partial anonymization group creation unit
     divides the plurality of user information records based on division information indicating an attribute value at which the plurality of user information records are to be divided, and
     creates, within the range of the divided user information records, focus partial anonymization group candidates each of which groups a plurality of the user information records including at least the attribute value of interest.
  5.  The information processing device according to any one of claims 1 to 4, further comprising anonymization group creating means for instructing the focus partial anonymization group creating means to create the focus partial anonymization group and acquiring the focus partial anonymization group, creating anonymization groups from the user information records other than the user information records corresponding to the focus partial anonymization group, and creating anonymized user information records corresponding to each of the focus partial anonymization group and the anonymization groups,
     wherein the focus partial anonymization group creating means, each time it receives an instruction to create the focus partial anonymization group, creates, for a plurality of the attribute values of interest, the focus partial anonymization group candidates that include at least one of the plurality of attribute values of interest, taken one at a time in sequence, and
     wherein the anonymization group creating means instructs the focus partial anonymization group creating means to create the focus partial anonymization group for the created anonymized user information records until the focus partial anonymization group creating means has created, for all of the plurality of attribute values of interest, the focus partial anonymization group candidates including the attribute value of interest.
  6.  An anonymization method in which a computer:
     acquires a plurality of user information records each including arbitrary attribute values;
     creates focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest that is a specific one of the attribute values;
     calculates an information loss amount indicating how much of the information obtainable from the user information records corresponding to a focus partial anonymization group candidate is lost in the information obtainable from that focus partial anonymization group candidate; and
     determines, from among the created focus partial anonymization group candidates, the focus partial anonymization group candidate corresponding to the smallest information loss amount as a focus partial anonymization group, and outputs the focus partial anonymization group.
  7.  A non-volatile recording medium recording a program that causes a computer to execute processing of:
     acquiring a plurality of user information records each including arbitrary attribute values;
     creating focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest that is a specific one of the attribute values;
     calculating an information loss amount indicating how much of the information obtainable from the user information records corresponding to a focus partial anonymization group candidate is lost in the information obtainable from that focus partial anonymization group candidate; and
     determining, from among the created focus partial anonymization group candidates, the focus partial anonymization group candidate corresponding to the smallest information loss amount as a focus partial anonymization group, and outputting the focus partial anonymization group.
  8.  An information processing system comprising:
     the information processing device according to any one of claims 1 to 6;
     user information storage means for storing the user information records; and
     anonymized user information storage means for storing the anonymized user information records included in the focus partial anonymization group.
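
The selection procedure recited in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the record layout (`no`, `age`), the sample ages, and the loss formula (width of the attribute range multiplied by the number of records) are all assumptions made for illustration; claim 2 states only that the loss is based on the range and the record count. The sketch also assumes that the first record carrying the focus value is used as the anchor and that every combination of k records is enumerated, which is practical only for small inputs.

```python
from itertools import combinations


def information_loss(group, attr):
    """Hypothetical loss: width of the attribute range times the record count."""
    values = [record[attr] for record in group]
    return (max(values) - min(values)) * len(group)


def create_focus_group(records, attr, focus_value, k):
    """Return the k-record candidate containing focus_value with the least loss."""
    anchors = [r for r in records if r[attr] == focus_value]
    if not anchors:
        raise ValueError("no record carries the focus value")
    if len(records) < k:
        raise ValueError("fewer than k records available")
    anchor = anchors[0]  # assumption: the first matching record is the anchor
    others = [r for r in records if r is not anchor]
    # Every candidate holds the anchor plus k - 1 other records.
    candidates = [[anchor, *rest] for rest in combinations(others, k - 1)]
    # Keep the candidate whose information loss amount is smallest.
    return min(candidates, key=lambda g: information_loss(g, attr))


# Made-up sample data: seven records identified by number and age.
records = [{"no": i + 1, "age": age}
           for i, age in enumerate([23, 25, 31, 35, 36, 38, 52])]
best = create_focus_group(records, "age", focus_value=35, k=3)
print([r["age"] for r in best], information_loss(best, "age"))
```

With the sample data, focus value 35, and k = 3, the sketch selects the records with ages 35, 36, and 38, whose age range of 3 over 3 records gives a loss of 9 under the assumed formula.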
PCT/JP2013/001073 2012-03-01 2013-02-25 Information processing device for implementing anonymization process, anonymization method, and program therefor WO2013128879A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012-045548 2012-03-01
JP2012045548 2012-03-01

Publications (1)

Publication Number Publication Date
WO2013128879A1 true WO2013128879A1 (en) 2013-09-06

Family

ID=49082092

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/001073 WO2013128879A1 (en) 2012-03-01 2013-02-25 Information processing device for implementing anonymization process, anonymization method, and program therefor

Country Status (1)

Country Link
WO (1) WO2013128879A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010097336A (en) * 2008-10-15 2010-04-30 Nippon Telegr & Teleph Corp <Ntt> Device and method for monitoring invasion of privacy, and program
JP2012003440A (en) * 2010-06-16 2012-01-05 Kddi Corp Apparatus, method and program for protecting privacy of public information
JP2012022315A (en) * 2010-07-02 2012-02-02 Nec (China) Co Ltd Method and device for anonymizing data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KUNIHIKO HARADA ET AL.: "k-anonymization schemes with automatic generation of generalization trees and distortion measuring using information entropy", IPSJ SIG NOTES, vol. 2010-CSE, no. 47, 24 June 2010 (2010-06-24), pages 1 - 7, XP008179074 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13754792

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13754792

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP