WO2013128879A1 - Information processing device for implementing an anonymization process, anonymization method, and program therefor - Google Patents

Information processing device for implementing an anonymization process, anonymization method, and program therefor

Info

Publication number
WO2013128879A1
Authority
WO
WIPO (PCT)
Prior art keywords
focus
user information
anonymization
anonymization group
group
Prior art date
Application number
PCT/JP2013/001073
Other languages
English (en)
Japanese (ja)
Inventor
由起 豊田
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Publication of WO2013128879A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6263Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/04Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0407Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the identity of one or more communicating identities is hidden
    • H04L63/0421Anonymous communication, i.e. the party's identifiers are hidden from the other party or parties, e.g. using an anonymizer

Definitions

  • the present invention relates to an information processing apparatus that performs anonymization processing that abstracts user data and improves anonymity, an anonymization method, and a program therefor.
  • Non-Patent Document 1 discloses a technique regarding k-anonymity.
  • k-anonymity is an index that guarantees that, after quasi-identifiers (information that may identify an individual) are anonymized, there are k or more sets of personal information sharing the same combination of quasi-identifiers. Specifically, disclosed data consisting of a plurality of records, each including attribute values of a plurality of attributes (quasi-identifiers), satisfies k-anonymity when every combination of attribute values that appears in the data is shared by at least k records.
  • In other words, k-anonymity is an index that guarantees that, by abstracting the attribute values (also called quasi-identifiers) of attributes that could be used to identify individuals into common values, the number of records having the same combination of quasi-identifiers is k or more.
  • a set of records having the same combination of quasi-identifiers is referred to as an anonymization group.
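The k-anonymity condition described above can be sketched as a small check. This is a minimal illustration, not code from the patent: the function name `satisfies_k_anonymity`, the dictionary layout, and the sample values are all assumptions made for the example.

```python
from collections import Counter


def satisfies_k_anonymity(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination is shared by >= k records."""
    # Each distinct combination of quasi-identifier values is one anonymization group.
    group_sizes = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return all(size >= k for size in group_sizes.values())


# Hypothetical disclosed data whose ages have already been abstracted into ranges.
disclosed = [
    {"age": "20-21", "condition": "heart disease"},
    {"age": "20-21", "condition": "fracture"},
    {"age": "20-21", "condition": "infection"},
    {"age": "22-25", "condition": "hay fever"},
    {"age": "22-25", "condition": "tooth decay"},
]

two_anonymous = satisfies_k_anonymity(disclosed, ["age"], 2)
three_anonymous = satisfies_k_anonymity(disclosed, ["age"], 3)
```

In this sample the "22-25" group holds only two records, so the data would be 2-anonymous but not 3-anonymous.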
  • Non-Patent Document 2 discloses an anonymization method that uses local recoding.
  • An anonymization method using local recoding replaces the attribute values of some records with more generalized ones.
  • Specifically, the method of Non-Patent Document 2 merges an anonymization group G with part of an anonymization group G′ that has a sufficient number of records, thereby raising the level of abstraction.
  • records to be merged are selected from the anonymization group G ′ so that k-anonymity k is minimized in the anonymized group after merging. That is, the anonymization method of Non-Patent Document 2 is a method for minimizing information loss due to abstraction (anonymization) by minimizing an increase in the degree of abstraction of information in the entire disclosed data.
  • Patent Document 1 discloses a privacy protection device that solves such problems.
  • This privacy protection device generalizes (abstracts) attribute values (quasi-identifiers) based on priorities that are set for each attribute name (attribute type) and indicate the importance for the data user.
  • this privacy protection device abstracts an attribute value of an attribute having a lower priority order so that the original information is retained for an attribute having a higher priority order.
  • Patent Document 1 is a technique that generalizes attribute values based on the priority set in units of attribute types.
  • An object of the present invention is to provide an information processing apparatus that executes anonymization processing that can solve the above-described problems, an anonymization method, and a program therefor.
  • the information processing apparatus of the present invention includes focus partial anonymization group creation means for acquiring a plurality of user information records including arbitrary attribute values and creating focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest, which is a specific attribute value.
  • the apparatus further includes information loss amount calculation means for calculating an information loss amount that indicates how much of the information obtained from the user information records corresponding to a focus partial anonymization group candidate is lost in that candidate.
  • the focus partial anonymization group creation means determines, among the created focus partial anonymization group candidates, the candidate corresponding to the smallest information loss amount as the focus partial anonymization group and outputs it.
  • In the anonymization method of the present invention, a computer acquires a plurality of user information records including arbitrary attribute values; creates focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest, which is a specific attribute value; calculates, for each candidate, an information loss amount that indicates how much of the information obtained from the corresponding user information records is lost in that candidate; and determines, among the created candidates, the candidate corresponding to the smallest information loss amount as the focus partial anonymization group and outputs it.
  • the non-volatile recording medium of the present invention records a program that causes a computer to execute a process of acquiring a plurality of user information records including arbitrary attribute values; creating focus partial anonymization group candidates, each of which groups a plurality of the user information records including at least an attribute value of interest, which is a specific attribute value; calculating, for each candidate, an information loss amount that indicates how much of the information obtained from the corresponding user information records is lost in that candidate; and determining, among the created candidates, the candidate corresponding to the smallest information loss amount as the focus partial anonymization group and outputting it.
  • the present invention has an effect that it is possible to obtain anonymized data so as to preferentially lower the abstraction level of an attribute value to be focused on locally.
  • FIG. 1 is a block diagram showing the configuration of the anonymization system according to the first embodiment.
  • FIG. 2 is a diagram illustrating an example of a user information record in the first embodiment.
  • FIG. 3 is a diagram illustrating an example of the anonymized user information record in the first embodiment.
  • FIG. 4 is a diagram illustrating an example of a property record in the first embodiment.
  • FIG. 5 is a diagram illustrating an example of a focus record in the first embodiment.
  • FIG. 6 is a block diagram illustrating a hardware configuration of a computer that implements the anonymization device according to the first embodiment.
  • FIG. 7 is a flowchart illustrating the operation of the anonymization apparatus according to the first embodiment.
  • FIG. 8 is a diagram illustrating an example in which the focus partial anonymization group creation unit selects a user information record in the first embodiment.
  • FIG. 9 is a diagram illustrating an example of a focus partial anonymization group candidate in the first embodiment.
  • FIG. 10 is a diagram illustrating an example of a focus partial anonymization group candidate in the first embodiment.
  • FIG. 11 is a block diagram showing the configuration of the anonymization system according to the second embodiment.
  • FIG. 12 is a diagram illustrating an example of a user information record in the second embodiment.
  • FIG. 13 is a diagram illustrating an example of a division value record in the second embodiment.
  • FIG. 14 is a block diagram illustrating a configuration of an anonymization system according to the third embodiment.
  • FIG. 15 is a diagram illustrating an example of a focus record in the third embodiment.
  • FIG. 16 is a flowchart illustrating the operation of the anonymization device of the third exemplary embodiment.
  • FIG. 17 is a block diagram illustrating a configuration of the anonymization
  • FIG. 1 is a block diagram showing the configuration of the anonymization system according to the first embodiment of the present invention.
  • an anonymization system (also referred to as an information processing system) according to this embodiment includes an anonymization device (also referred to as an information processing device) 100, a user information storage unit 510, and an anonymized user information storage unit 520.
  • the anonymization device 100, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). Note that the user information storage unit 510 may be included in the anonymization device 100. Further, the anonymized user information storage unit 520 may be included in the anonymization device 100.
  • the anonymization device 100 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
  • FIG. 2 is a diagram illustrating an example of a user information record 511 stored in the user information storage unit 510.
  • As shown in FIG. 2, the user information storage unit 510 stores one or more user information records 511 as user information.
  • the user information record 511 includes a number 519, an age 512, and a medical condition 513.
  • Age 512 is one of the quasi-identifiers.
  • the medical condition 513 is one of sensitive attributes.
  • the quasi-identifier (age 512) and the sensitive attribute (medical condition 513) are also generally called attributes.
  • the quasi-identifier is information that may make it possible to identify an individual by combining them.
  • Sensitive attributes are information that a person generally does not want others to know.
  • the number 519 is a number for identifying the user information record 511.
  • When a user information record 511 needs to be identified individually in the description, it is written with its number; for example, the user information record 511 having the number 519 of “1” is written as the user information record 511 (1).
  • User information is, for example, receipt information held by a government agency or a medical institution.
  • the receipt information includes the date of birth, sex, illness, and the like.
  • the attribute value of the age attribute is the age 512.
  • the attribute value of the medical condition attribute is the medical condition 513.
  • For example, the user information record 511 (1) indicates that the corresponding user is 20 years old and suffers from heart disease.
  • the user information record 511 may be arbitrary information regardless of the above.
  • the user information record may include age 512, medical condition 513, and other types of information (for example, gender).
  • the user information record may not include the medical condition 513, for example.
  • each arbitrary attribute (quasi-identifier and sensitive attribute) may include a plurality of attribute values. For example, the medical condition 513 may include the two attribute values “hay fever” and “tooth decay”.
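The layout of a user information record 511 described above can be sketched as a small data type. This is only an illustration under assumptions: the class name, field names, and every sample value except record (1) (20 years old, heart disease) are hypothetical and do not reproduce FIG. 2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UserInformationRecord:
    number: int                          # identifies the record (number 519)
    age: int                             # quasi-identifier (age 512)
    medical_condition: Tuple[str, ...]   # sensitive attribute (medical condition 513)


# Record (1) follows the text; the remaining records are made-up sample values.
user_information = [
    UserInformationRecord(1, 20, ("heart disease",)),
    UserInformationRecord(2, 20, ("fracture",)),
    UserInformationRecord(3, 21, ("infection",)),
    # A sensitive attribute may hold more than one value.
    UserInformationRecord(4, 22, ("hay fever", "tooth decay")),
]
```

A tuple is used for the sensitive attribute so that a single record can carry several values while remaining immutable.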
  • FIG. 3 is a diagram illustrating an example of the anonymized user information record 521 stored in the anonymized user information storage unit 520.
  • the anonymized user information storage unit 520 includes k or more anonymized user information records 521.
  • the anonymized user information record 521 includes a group number 529, an age 512, and a medical condition 513.
  • the group number 529 is a number for identifying the anonymized user information record 521.
  • When an anonymized user information record 521 needs to be identified individually in the description, it is written with its group number; for example, the anonymized user information record 521 having the group number 529 of “1” is written as the anonymized user information record 521 (1).
  • the anonymized user information record 521 may not include the group number 529.
  • the anonymization apparatus 100 may specify and process the anonymized user information record 521 using the age 512, for example.
  • the anonymized user information record 521 is anonymized user information.
  • the user information is as described above.
  • the user corresponding to the anonymized user information record 521 (1) has an age of 20 to 21 and has suffered from a heart disease, a fracture, or an infection.
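As a rough sketch of how a group of user information records could be collapsed into one anonymized user information record 521, the snippet below abstracts the ages of a group into a range and collects the medical conditions. The function name, dictionary keys, and the sample group are assumptions for illustration; the patent text does not prescribe this representation.

```python
def to_anonymized_record(group_number, group):
    """Abstract the ages of a group into one range and keep the conditions."""
    ages = [record["age"] for record in group]
    low, high = min(ages), max(ages)
    # A single shared age needs no range; otherwise use "low-high".
    age_range = str(low) if low == high else f"{low}-{high}"
    conditions = [c for record in group for c in record["conditions"]]
    return {"group_number": group_number, "age": age_range, "conditions": conditions}


# Hypothetical group of three user information records.
group = [
    {"number": 1, "age": 20, "conditions": ["heart disease"]},
    {"number": 2, "age": 20, "conditions": ["fracture"]},
    {"number": 3, "age": 21, "conditions": ["infection"]},
]

anonymized = to_anonymized_record(1, group)
```

With this sample group the result has the age range "20-21" and the three conditions, which matches the shape of the anonymized user information record 521 (1) described above.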
  • the anonymization device 100 includes a property storage unit 110, a focus value storage unit 120, an anonymization execution reception unit 130, a focus partial anonymization group creation unit 140, an information loss amount calculation unit 150, and an anonymization group creation unit 160.
  • FIG. 4 is a diagram illustrating an example of the property record 111 stored in the property storage unit 110.
  • the property storage unit 110 includes one or more property records 111.
  • the property record 111 includes a parameter name 112 and a parameter value 113.
  • at least one of the property records 111 stored in the property storage unit 110 is a set of a parameter name 112 and a parameter value 113 that specify k-anonymity k.
  • the parameter name 112 is “k” and the parameter value 113 is “3”.
  • the parameter name 112 is “quasi-identifier name” and the parameter value 113 is “age”.
  • the parameter name 112 is “sensitive attribute” and the parameter value 113 is “disease state”.
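The property records 111 described above are name/value pairs, so they can be sketched as a small lookup. The three pairs below are the ones named in the text; the dictionary keys and the helper `get_parameter` are hypothetical names introduced for this example.

```python
# Property records 111: each pairs a parameter name 112 with a parameter value 113.
property_records = [
    {"parameter_name": "k", "parameter_value": "3"},
    {"parameter_name": "quasi-identifier name", "parameter_value": "age"},
    {"parameter_name": "sensitive attribute", "parameter_value": "disease state"},
]


def get_parameter(records, name):
    """Return the parameter value stored under the given parameter name."""
    for record in records:
        if record["parameter_name"] == name:
            return record["parameter_value"]
    raise KeyError(name)


# The k of k-anonymity is read from the property record named "k".
k = int(get_parameter(property_records, "k"))
```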
  • FIG. 5 is a diagram illustrating an example of the focus record 121 stored in the focus value storage unit 120.
  • the focus record 121 includes a quasi-identifier name 122 and a focus value 123.
  • the information included in the focus record 121 is information input to the anonymization device 100 in advance by a user of anonymized data (not shown). Note that the information included in the focus record 121 may be included in an anonymization process execution start instruction to be described later and input to the anonymization device 100.
  • the focus record 121 has a quasi-identifier name 122 of “age” and a focus value 123 of “21”. Therefore, the focus record 121 indicates that the attribute value to be focused on is, for example, the attribute value of the age 512 equal to “21” in the user information record 511 illustrated in FIG. 2.
  • In response to an instruction from the anonymization group creation unit 160, the focus partial anonymization group creation unit 140 acquires the user information record 511 and the property record 111 via the anonymization group creation unit 160 and creates a focus partial anonymization group.
  • the focus partial anonymization group creation unit 140 may create a focus partial anonymization group in response to an anonymization process execution start instruction received by the anonymization execution reception unit 130.
  • the focus partial anonymization group creation unit 140 may acquire the user information record 511 directly from the user information storage unit 510.
  • the focus partial anonymization group creation unit 140 may acquire the property record 111 directly from the property storage unit 110.
  • the focus partial anonymization group creation unit 140 outputs the created focus partial anonymization group to the anonymization group creation unit 160.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group as follows.
  • the focus partial anonymization group creation unit 140 groups, among the user information records 511, a user information record 511 that includes at least the focus value 123 together with other user information records 511 to create a focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 performs grouping based on the information of the property record 111 that specifies k-anonymity k of the property storage unit 110. For example, the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate by grouping k user information records 511 including at least the user information record 511 including the focus value 123.
  • the focus partial anonymization group creation unit 140 outputs the created focus partial anonymization group candidate to the information loss amount calculation unit 150. Then, the focus partial anonymization group creation unit 140 receives the information loss amount of the focus partial anonymization group candidate from the information loss amount calculation unit 150.
  • Among the plurality of focus partial anonymization group candidates for which it has received an information loss amount, the focus partial anonymization group creation unit 140 determines the candidate with the smallest information loss amount as the focus partial anonymization group.
  • the focus partial anonymization group includes information corresponding to the user information record 511 including at least the focus value 123.
  • the corresponding information is, for example, information “infectious disease” of the medical condition 513 in the user information record 511 including the age 512 having the same value as “21” which is the focus value 123.
  • Information loss amount = (maximum value of the specific attribute value included in the focus value anonymization group candidate − minimum value of the specific attribute value included in the focus value anonymization group candidate + 1) × number of records.
  • Here, the “specific attribute value included in the focus value anonymization group candidate” is the attribute value, in the user information records 511 corresponding to the candidate, that is specified by the quasi-identifier name 122 stored in the focus value storage unit 120. That is, the specific attribute value is the attribute value corresponding to the quasi-identifier name 122 of the focus record 121; in the case of the focus record 121 shown in FIG. 5, the specific attribute value is the age 512.
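Reading the information loss formula as (maximum − minimum + 1) × number of records, the calculation for a numeric quasi-identifier can be sketched as follows. The function name and the sample age lists are assumptions for illustration.

```python
def information_loss(values):
    """Information loss of one candidate: (max - min + 1) * number of records."""
    if not values:
        raise ValueError("a candidate must contain at least one record")
    return (max(values) - min(values) + 1) * len(values)


# Ages of a hypothetical three-record candidate spanning 20 to 21.
narrow = information_loss([20, 20, 21])   # (21 - 20 + 1) * 3 = 6

# Ages of a hypothetical three-record candidate spanning 21 to 24.
wide = information_loss([21, 22, 24])     # (24 - 21 + 1) * 3 = 12
```

A candidate whose values span a narrower range abstracts the focus value less, so it receives a smaller information loss amount and is preferred.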
  • the maximum value of the specific attribute value included in the focus value anonymization group candidate − the minimum value of the specific attribute value included in the focus value anonymization group candidate is the focus value anonymization
  • the information loss amount calculation unit 150 may calculate the information loss amount of a categorical value (character string) as described in Non-Patent Document 2.
  • the number of user information records as the source is three user information records 511 having numbers 519 of “1”, “2”, and “3”.
  • the number of specific attribute values when grouped is one in which the group number 529 is “1” and the age 512 attribute value is “20-21”.
  • the anonymization group creation unit 160 creates the anonymized user information record 521 when triggered by the anonymization process execution start instruction received by the anonymization execution reception unit 130.
  • the anonymization group creation unit 160 creates the anonymized user information record 521 as follows.
  • the anonymization group creation unit 160 passes the user information record 511 and the property record 111 to the focus partial anonymization group creation unit 140 and instructs the creation of the focus partial anonymization group. Then, the anonymization group creation unit 160 receives the focus partial anonymization group from the focus partial anonymization group creation unit 140 as a response to the creation instruction.
  • the anonymization group creation unit 160 creates one or more anonymization groups from the user information records 511 other than those corresponding to the focus partial anonymization group created by the focus partial anonymization group creation unit 140.
  • the anonymization group creation unit 160 creates anonymized user information records 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 140 and to the anonymization groups it created itself, and stores them in the anonymized user information storage unit 520.
  • The anonymized user information record 521 corresponding to each focus partial anonymization group or anonymization group is obtained by collecting the user information records 511 included in that group and assigning a group number 529.
  • FIG. 6 is a diagram illustrating a hardware configuration of a computer 700 that realizes the anonymization apparatus 100 according to the present embodiment.
  • the computer 700 includes a CPU (Central Processing Unit) 701, a storage unit 702, a storage device 703, an input unit 704, an output unit 705, and a communication unit 706. Furthermore, the computer 700 includes a recording medium (or storage medium) 707 supplied from the outside.
  • the recording medium 707 may be a non-volatile recording medium that stores information non-temporarily.
  • the CPU 701 controls the overall operation of the computer 700 by running an operating system (not shown). Further, the CPU 701 reads a program (for example, a program that causes the computer 700 to execute the operation of the flowchart shown in FIG. 7 described later) and data from a recording medium 707 mounted on the storage device 703, and writes the read program and data to the storage unit 702. In accordance with the read program and based on the read data, the CPU 701 executes various processes as the anonymization execution reception unit 130, the focus partial anonymization group creation unit 140, the information loss amount calculation unit 150, and the anonymization group creation unit 160 shown in FIG. 1.
  • the CPU 701 may download a program or data to the storage unit 702 from an external computer (not shown) connected to a communication network (not shown).
  • the storage unit 702 stores programs and data.
  • the storage unit 702 may include a property storage unit 110 and a focus value storage unit 120. Furthermore, when the computer 700 (anonymization apparatus 100) includes the user information storage unit 510 and the anonymized user information storage unit 520, the storage unit 702 may include these.
  • the storage device 703 is, for example, an optical disk, a flexible disk, a magneto-optical disk, an external hard disk, or a semiconductor memory, and includes a recording medium 707.
  • the storage device 703 records the program so that it can be read by a computer. Further, the storage device 703 may record data so as to be readable by a computer.
  • the storage device 703 may include a property storage unit 110 and a focus value storage unit 120.
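  • Furthermore, when the computer 700 (anonymization device 100) includes the user information storage unit 510 and the anonymized user information storage unit 520, the storage device 703 may include these.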
  • the input unit 704 is realized by, for example, a mouse, a keyboard, a built-in key button, and the like, and is used for an input operation.
  • the input unit 704 is not limited to a mouse, a keyboard, and a built-in key button, and may be a touch panel, an accelerometer, a gyro sensor, a camera, or the like.
  • the input unit 704 is included as part of the anonymization execution reception unit 130.
  • the output unit 705 is realized by a display, for example, and is used for confirming the output.
  • the communication unit 706 implements an interface with the user information storage unit 510 and the anonymized user information storage unit 520.
  • the communication unit 706 is included as a part of the anonymization group creation unit 160.
  • the functional unit block of the anonymization device 100 shown in FIG. 1 is realized by the computer 700 having the hardware configuration shown in FIG.
  • the means for realizing each unit included in the computer 700 is not limited to the above.
  • the computer 700 may be realized by one physically integrated device, or by two or more physically separate devices connected by wire or wirelessly.
  • the recording medium 707 in which the above-described program code is recorded may be supplied to the computer 700, and the CPU 701 may read and execute the program code stored in the recording medium 707.
  • the CPU 701 may store the code of the program stored in the recording medium 707 in the storage unit 702, the storage device 703, or both. That is, the present embodiment includes an embodiment of a recording medium 707 that stores a program (software) executed by the computer 700 (CPU 701) temporarily or non-temporarily.
  • FIG. 7 is a flowchart showing the operation of the anonymization device 100 of this embodiment. Note that the processing according to this flowchart may be executed based on the above-described program control by the CPU. Further, the step name of the process is described by a symbol as in S601.
  • the anonymization execution reception unit 130 receives an anonymization process execution start instruction from a user of anonymization data (not shown) and outputs the instruction to the anonymization group creation unit 160 (S601).
  • the anonymization group creation unit 160 obtains the user information record 511 from the user information storage unit 510 upon receiving the anonymization process execution start instruction (S602).
  • the anonymization group creation unit 160 acquires the property record 111 having the parameter name 112 of “k” from the property storage unit 110 (S603).
  • the anonymization group creation unit 160 outputs the user information record 511 acquired in S602 and the parameter value 113 (for example, “3”) of the property record 111 having the parameter name 112 of “k” acquired in S603 to the focus partial anonymization group creation unit 140, and instructs it to create the focus partial anonymization group (S604).
  • the focus partial anonymization group creation unit 140 acquires the focus record 121 (for example, the quasi-identifier name 122 is “age” and the focus value 123 is “21”) from the focus value storage unit 120 (S605).
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate based on the acquired focus record 121 (S606).
  • FIG. 8 is a diagram illustrating an example in which the focus partial anonymization group creation unit 140 selects the user information record 511 when creating a focus partial anonymization group candidate.
  • Starting from the user information record 511 whose age 512 is “21”, the focus partial anonymization group creation unit 140 regards three records taken in the direction of decreasing age as one group and creates a focus value anonymization group candidate.
  • the focus partial anonymization group creation unit 140 creates focus value anonymization group candidates by regarding three records as one group in the direction of increasing age.
  • the focus partial anonymization group creation unit 140 further creates a focus value anonymization group candidate when three records centered on the user information record 511 having the age 512 of “21” are regarded as one group. May be.
  • FIG. 9 is a diagram illustrating an example in which the focus partial anonymization group creation unit 140 creates one focus partial anonymization group candidate by collecting three records in the direction of decreasing age.
  • FIG. 10 is a diagram illustrating an example in which the focus partial anonymization group creation unit 140 creates one focus partial anonymization group candidate by collecting three records in the direction of increasing age.
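The candidate creation illustrated in FIGS. 8 to 10 can be sketched as sliding a window of k records over the records sorted by age, keeping every window that contains a record with the focus value. This is an assumption about one way to realize the step: enumerating all such windows covers the decreasing-age, increasing-age, and centered candidates mentioned above. The function name and the sample ages are hypothetical.

```python
def focus_candidates(records, focus_value, k, key="age"):
    """Return every run of k age-adjacent records that contains the focus value."""
    ordered = sorted(records, key=lambda r: r[key])
    starts = set()
    for pos, record in enumerate(ordered):
        if record[key] != focus_value:
            continue
        # A window of size k contains position pos if it starts in [pos-k+1, pos].
        first = max(0, pos - k + 1)
        last = min(pos, len(ordered) - k)
        starts.update(range(first, last + 1))
    return [ordered[s:s + k] for s in sorted(starts)]


# Hypothetical user information records (number, age only).
records = [
    {"number": 1, "age": 20},
    {"number": 2, "age": 20},
    {"number": 3, "age": 21},
    {"number": 4, "age": 22},
    {"number": 5, "age": 24},
]

candidates = focus_candidates(records, focus_value=21, k=3)
candidate_ages = [[r["age"] for r in c] for c in candidates]
```

For this sample, the first candidate extends toward lower ages, the last toward higher ages, and the middle one is centered on the record whose age is 21.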
  • the focus partial anonymization group creation unit 140 transmits the created focus value anonymization group candidate to the information loss amount calculation unit 150 (S607).
  • the information loss amount calculation unit 150 calculates an information loss amount for each received focus value anonymization group candidate (S608).
  • the information loss amount calculation unit 150 calculates the information loss amount of the focus partial anonymization group candidate shown in FIGS. 9 and 10 using the above-described information loss amount calculation formula as follows.
  • the focus partial anonymization group creation unit 140 determines the focus partial anonymization group candidate with the smallest information loss amount (for example, “6”), such as the candidate shown in FIG. 9, as the focus partial anonymization group. Subsequently, the focus partial anonymization group creation unit 140 outputs the determined focus partial anonymization group to the anonymization group creation unit 160 (S609).
  • The anonymization group creation unit 160 creates one or more anonymization groups from the user information records 511 other than those corresponding to the focus partial anonymization group created by the focus partial anonymization group creation unit 140 (S610).
  • The anonymization group creation unit 160 creates anonymized user information records 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 140 and to the anonymization groups it created itself, and stores them in the anonymized user information storage unit 520 (S611).
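  • Steps S610 and S611 can be illustrated with the following sketch. It assumes a single numeric quasi-identifier (age) and a simple chunking strategy for the remaining records; the description does not prescribe how the remaining anonymization groups are formed, so `group_remaining` and `generalize` are hypothetical helpers.

```python
from typing import List


def generalize(group: List[int]) -> str:
    """Abstract a group of ages into a single value or a closed range."""
    lo, hi = min(group), max(group)
    return str(lo) if lo == hi else f"{lo}-{hi}"


def group_remaining(ages: List[int], focus_group: List[int], k: int) -> List[List[int]]:
    """Group the records outside the focus group into sorted chunks of k (assumed strategy)."""
    remaining = sorted(ages)
    for age in focus_group:
        remaining.remove(age)  # drop one record per focus-group member
    groups = [remaining[i:i + k] for i in range(0, len(remaining), k)]
    # Merge a short tail into the previous group so that group keeps at least k records.
    # If fewer than k records remain in total, the single group stays short.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def anonymize(ages: List[int], focus_group: List[int], k: int) -> List[str]:
    """Return one generalized age per record, focus group first."""
    groups = [focus_group] + group_remaining(ages, focus_group, k)
    return [generalize(g) for g in groups for _ in g]
```

  • Because the focus group is fixed before the remaining records are grouped, the range around the focus value stays as narrow as the selection step made it.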
  • The anonymized user information records 521 created as described above result from anonymization processing that gives priority to minimizing the abstraction level of the quasi-identifier (age 512) corresponding to the focus value 123 specified by the user of the anonymized data.
  • That is, the anonymization device 100 according to the present embodiment can minimize the abstraction level of the quasi-identifier corresponding to the designated focus value 123.
  • With an anonymized data set (a set of anonymized user information records 521) whose abstraction level has been lowered locally, it is possible to examine in detail the vicinity of a meaningful value, for example the postal code of a disaster area or a date of birth at which the curriculum guidelines changed significantly.
  • the effect of the present embodiment described above is that it is possible to obtain anonymized data so as to preferentially lower the abstraction level of the attribute value to be focused on locally.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate.
  • the information loss amount calculation unit 150 calculates the information loss amount of each focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 determines a focus partial anonymization group candidate corresponding to the smallest amount of information loss as a focus partial anonymization group.
  • FIG. 11 is a block diagram showing the configuration of the anonymization system according to the second embodiment of the present invention.
  • the anonymization system includes an anonymization device 200, a user information storage unit 510, and an anonymized user information storage unit 520.
  • the anonymization device 200, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). Note that the user information storage unit 510 may be included in the anonymization device 200. Further, the anonymized user information storage unit 520 may be included in the anonymization device 200.
  • the anonymization device 200 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
  • the anonymization device 200 further includes a divided value storage unit 270 as compared with the anonymization device 100 according to the first embodiment.
  • The anonymization device 200 also has a focus partial anonymization group creation unit 240 instead of the focus partial anonymization group creation unit 140 of the anonymization device 100 of the first embodiment.
  • FIG. 12 is a diagram showing an example of the user information record 511 in the present embodiment. As shown in FIG. 12, the user information record 511 in the present embodiment further includes a consultation date 514 as a quasi-identifier as compared to the user information record 511 in FIG.
  • FIG. 13 is a diagram illustrating an example of the division value record 271 stored in the division value storage unit 270.
  • the division value storage unit 270 includes one or more division value records 271.
  • The division value record 271 includes a quasi-identifier name 272 and a division value 273.
  • the information included in the division value record 271 is information input to the anonymization device 200 in advance by a user of anonymized data (not shown). Note that the information included in the division value record 271 may be included in the anonymization process execution start instruction and input to the anonymization device 200.
  • The division value record 271 indicates that the user information is divided into user information records 511 dated up to November 30, 2011 and user information records 511 dated from December 1, 2011 onward.
  • The focus partial anonymization group creation unit 240 divides the user information records 511 shown in FIG. 12 based on the division value record 271 to create a plurality of divided groups. Next, the focus partial anonymization group creation unit 240 executes steps S606 to S609 in FIG. 7 for each divided group of the user information records 511, in the same manner as the focus partial anonymization group creation unit 140 of the first embodiment, and creates a focus partial anonymization group.
  • The focus partial anonymization group creation unit 240 divides the user information records 511 in FIG. 12 into the user information records 511 whose numbers 519 are “1”, “3”, “4”, and “6” and the user information records 511 whose numbers 519 are “2”, “5”, and “7”.
  • the focus partial anonymization group creation unit 240 executes steps S606 to S609, and outputs the focus partial anonymization group created without crossing the division value 273.
  • The anonymization group creation unit 160 stores the anonymized user information record 521 corresponding to the focus partial anonymization group created without straddling the division value 273 in the anonymized user information storage unit 520.
  • In this way, an anonymized user information record 521 can be created.
  • The division value record 271 is not limited to the example described above, and may specify division of an arbitrary attribute value included in the user information record 511. Moreover, there may be a plurality of division value records 271.
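  • The division step of this embodiment can be sketched as follows, assuming each record is a dictionary and the division value is the first date of the later group (December 1, 2011 in the example above). The field names and the `make_focus_group` callback are hypothetical; the callback stands in for steps S606 to S609 applied to one divided group.

```python
from datetime import date
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def split_by_division_value(
    records: List[Record], attribute: str, division_value: Any
) -> Tuple[List[Record], List[Record]]:
    """Split records into those below the division value and those at or above it."""
    before = [r for r in records if r[attribute] < division_value]
    after = [r for r in records if r[attribute] >= division_value]
    return before, after


def per_division(
    records: List[Record],
    attribute: str,
    division_value: Any,
    make_focus_group: Callable[[List[Record]], Any],
) -> List[Any]:
    """Run the focus-group step on each non-empty divided group, so no group straddles the value."""
    parts = split_by_division_value(records, attribute, division_value)
    return [make_focus_group(part) for part in parts if part]
```

  • For example, calling `split_by_division_value(records, "consultation_date", date(2011, 12, 1))` separates records dated up to November 30, 2011 from the later ones.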
  • the effect of the present embodiment described above is that, in addition to the effect of the first embodiment, for the user information record 511 with a narrower range, the abstraction level of the attribute value to be focused is preferentially lowered locally. It is possible to obtain anonymized data.
  • The reason is that the focus partial anonymization group creation unit 240 divides the user information records 511 based on the division value record 271.
  • FIG. 14 is a block diagram showing the configuration of the anonymization system according to the third embodiment of the present invention.
  • the anonymization system includes an anonymization device 300, a user information storage unit 510, and an anonymized user information storage unit 520.
  • 3 is a block diagram showing a configuration of an anonymization device 300.
  • the anonymization device 300, the user information storage unit 510, and the anonymized user information storage unit 520 are connected by a network (not shown). Note that the user information storage unit 510 may be included in the anonymization device 300. Further, the anonymized user information storage unit 520 may be included in the anonymization device 300.
  • the anonymization device 300 anonymizes the user information stored in the user information storage unit 510 and stores the anonymized user information in the anonymized user information storage unit 520.
  • The anonymization device 300 in the present embodiment differs from the anonymization device 100 of the first embodiment in that it has a focus partial anonymization group creation unit 340 instead of the focus partial anonymization group creation unit 140, and an anonymization group creation unit 360 instead of the anonymization group creation unit 160.
  • FIG. 15 is a diagram illustrating an example of the focus record 121 stored in the focus value storage unit 120 according to the present embodiment.
  • the focus value storage unit 120 of this embodiment includes a plurality of focus records 121.
  • the focus record 121 of this embodiment further includes a priority 128 in addition to the quasi-identifier name 122 and the focus value 123.
  • the priority 128 is information indicating the order of the focus records 121 when a focus partial anonymization group is created.
  • the priority 128 may be information indicating the weight of the focus record 121.
  • Upon receiving an instruction from the anonymization group creation unit 360, the focus partial anonymization group creation unit 340 creates a focus partial anonymization group using the focus records 121 in order of priority 128, and outputs it to the anonymization group creation unit 360. At this time, the focus partial anonymization group creation unit 340 adds completion information to the focus partial anonymization group before outputting it to the anonymization group creation unit 360.
  • The completion information indicates whether focus partial anonymization group candidates have been created for all of the focus records 121 (“complete”) or not (“incomplete”).
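  • A focus record with a priority, and the completion information attached to each use, might be modeled as in the sketch below. The class and function names are hypothetical, and the convention that a smaller priority number is used earlier is an assumption, since the description only states that the priority indicates an order.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class FocusRecord:
    quasi_identifier_name: str
    focus_value: str
    priority: int  # assumption: a smaller number is used earlier


def in_priority_order(focus_records: Iterable[FocusRecord]) -> Iterator[Tuple[FocusRecord, str]]:
    """Yield each focus record in priority order together with its completion information."""
    ordered = sorted(focus_records, key=lambda record: record.priority)
    for position, record in enumerate(ordered, start=1):
        # "complete" only once the last focus record has been handed out
        status = "complete" if position == len(ordered) else "incomplete"
        yield record, status
```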
  • the anonymization group creation unit 360 receives from the focus partial anonymization group creation unit 340 a focus partial anonymization group to which information indicating whether or not an unused focus record 121 remains is added. Then, the anonymization group creation unit 360 creates the anonymized user information record 521 in the same manner as the anonymization group creation unit 160.
  • The anonymization group creation unit 360 confirms the completion information.
  • If the completion information is “incomplete”, the anonymization group creation unit 360 passes the created anonymized user information record 521 to the focus partial anonymization group creation unit 340 and instructs it to create a focus partial anonymization group again.
  • If the completion information is “complete”, the anonymization group creation unit 360 stores the created anonymized user information record 521 in the anonymized user information storage unit 520.
  • In this way, the anonymization device 300 can specify a focus value 123 for each of a plurality of quasi-identifiers (quasi-identifiers with arbitrary quasi-identifier names 122 as well as quasi-identifiers having the same quasi-identifier name 122).
  • FIG. 16 is a flowchart showing the operation of the present embodiment.
  • S601 to S603 are the same operations as S601 to S603 in FIG.
  • The anonymization group creation unit 360 outputs one of the anonymized user information records 521 and the parameter value 113 to the focus partial anonymization group creation unit 340, and instructs it to create a focus partial anonymization group (S634).
  • the anonymized user information record 521 is the anonymized user information record 521 created in S602 or the user information record 511 acquired in S602.
  • the parameter value 113 is the parameter value 113 of the property record 111 whose parameter name 112 acquired in S603 is “k”.
  • S605 to S608 are the same operations as S605 to S608 in FIG.
  • The focus partial anonymization group creation unit 340 determines the focus partial anonymization group candidate with the smallest information loss amount as the focus partial anonymization group. Subsequently, the focus partial anonymization group creation unit 340 adds completion information to the determined focus partial anonymization group and outputs it to the anonymization group creation unit 360 (S639).
  • S610 is the same operation as S610 in FIG.
  • the anonymization group creation unit 360 creates an anonymized user information record 521 corresponding to the focus partial anonymization group received from the focus partial anonymization group creation unit 340 and the anonymization group created by itself. (S641).
  • The anonymization group creation unit 360 confirms the completion information (S642). If the completion information is “incomplete” (NO in S642), the process returns to S634.
  • If the completion information is “complete” (YES in S642), the anonymization group creation unit 360 stores the anonymized user information record 521 created in S641 in the anonymized user information storage unit 520 (S643).
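  • The loop from S634 to S643 can be summarized by the following sketch. The callbacks `apply_focus` (standing in for S605 to S641 with one focus record) and `store` (standing in for the write to the anonymized user information storage unit 520) are hypothetical placeholders, not functions defined in the description.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")  # record type
F = TypeVar("F")  # focus record type


def run_until_complete(
    records: Sequence[R],
    focus_records: Sequence[F],
    apply_focus: Callable[[List[R], F], List[R]],
    store: Callable[[List[R]], None],
) -> List[R]:
    """Re-anonymize with each focus record in turn, storing the result only at the end."""
    if not focus_records:
        raise ValueError("at least one focus record is required")
    current = list(records)
    pending = list(focus_records)
    while True:
        focus = pending.pop(0)                 # S634: hand over the next focus record
        current = apply_focus(current, focus)  # S605 to S641: regroup and re-anonymize
        completion = "incomplete" if pending else "complete"
        if completion == "complete":           # S642: check the completion information
            store(current)                     # S643: store the final records
            return current
```

  • Each pass receives the output of the previous pass, which mirrors the way the anonymized user information record 521 is handed back for the next focus record.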
  • For example, by setting focus values 123 for the quasi-identifiers of date of birth and gender, it is possible to focus on female patients at the age at which use of a certain new drug is permitted and to examine, for example, the incidence of cervical cancer.
  • The focus record 121 need not include the priority 128.
  • In that case, the focus partial anonymization group creation unit 340 may use the focus records 121 in ascending or descending order of their addresses in the focus value storage unit 120. Alternatively, the focus partial anonymization group creation unit 340 may use the plurality of focus records 121 in a fixed order predetermined for the quasi-identifier names 122, or in an arbitrary order.
  • The effect of the present embodiment described above is that it is possible to obtain anonymized data in which the abstraction level of the attribute values to be focused on is preferentially lowered locally from a plurality of viewpoints.
  • the focus partial anonymization group creation unit 340 creates a focus partial anonymization group by sequentially using a plurality of focus records 121.
  • The anonymization group creation unit 360 instructs the focus partial anonymization group creation unit 340 to create a focus partial anonymization group for the created anonymized user information record 521 until the focus partial anonymization group creation unit 340 has used all of the focus records 121.
  • FIG. 17 is a block diagram showing a configuration of an anonymization apparatus 400 according to the fourth embodiment of the present invention.
  • the anonymization device 400 includes a focus partial anonymization group creation unit 140 and an information loss amount calculation unit 150.
  • User information storage means (not shown) may be included in the anonymization device 400, or may be a user information storage unit 510 as shown in FIG.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate in which a plurality of user information records 511 are grouped.
  • the plurality of user information records 511 include at least an attention attribute value.
  • the attention attribute value is one of the above-described arbitrary attribute values, and is an attribute value specified by the quasi-identifier name 122 and the focus value 123 included in the focus record 121.
  • The focus partial anonymization group creation unit 140 determines, among the created focus partial anonymization group candidates, the candidate corresponding to the smallest amount of information loss as the focus partial anonymization group, and outputs it.
  • The information loss amount calculation unit 150 calculates and outputs the information loss amount of each focus partial anonymization group candidate created by the focus partial anonymization group creation unit 140.
  • The information loss amount indicates how much of the information obtainable from the user information records 511 corresponding to a focus partial anonymization group candidate is lost (decreased) in the information obtainable from that candidate.
  • the effect of the present embodiment described above is that it is possible to obtain anonymized data so as to preferentially lower the abstraction level of the attribute value to be focused on locally.
  • the focus partial anonymization group creation unit 140 creates a focus partial anonymization group candidate.
  • the information loss amount calculation unit 150 calculates the information loss amount of each focus partial anonymization group candidate.
  • the focus partial anonymization group creation unit 140 determines a focus partial anonymization group candidate corresponding to the smallest amount of information loss as a focus partial anonymization group.
  • each component described in each of the above embodiments does not necessarily need to be an independent entity.
  • A plurality of components may be realized as one module.
  • One component may be realized by a plurality of modules.
  • Each component may be configured such that a certain component is a part of another component.
  • Each component may be configured such that a part of a certain component overlaps a part of another component.
  • Each component and the module that realizes it may be realized by hardware if necessary. Moreover, each component and the module that realizes it may be realized by a computer and a program.
  • the program is provided by being recorded on a non-volatile computer-readable recording medium such as a magnetic disk or a semiconductor memory, and is read by the computer when the computer is started up.
  • the read program causes the computer to function as a component in each of the above-described embodiments by controlling the operation of the computer.
  • a plurality of operations are not limited to being executed at different timings. For example, another operation may occur during the execution of a certain operation, or the execution timing of a certain operation and another operation may partially or entirely overlap.
  • In each of the embodiments described above, a certain operation is described as a trigger for another operation, but that description does not limit all relationships between the certain operation and other operations. For this reason, when each embodiment is implemented, the relationship between the plurality of operations can be changed within a range that does not hinder the contents.
  • The specific description of each operation of each component does not limit each operation of each component. For this reason, the specific operation of each component may be changed within a range that does not impair the functional, performance, and other characteristics in implementing each embodiment.

Abstract

The present invention relates to an information processing device that performs anonymization such that the abstraction levels of attribute values to be focused on are preferentially lowered locally. The information processing device includes: means for generating focus partial anonymization group candidates in which user information records containing the attribute values to be focused on are grouped, and for determining and outputting the candidate having the smallest information loss as the focus partial anonymization group; and means for calculating the information loss of the focus partial anonymization group candidates.
PCT/JP2013/001073 2012-03-01 2013-02-25 Dispositif de traitement d'informations pour mettre en œuvre un processus d'anonymisation, procédé d'anonymisation et programme correspondant WO2013128879A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012-045548 2012-03-01
JP2012045548 2012-03-01

Publications (1)

Publication Number Publication Date
WO2013128879A1 true WO2013128879A1 (fr) 2013-09-06

Family

ID=49082092

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/001073 WO2013128879A1 (fr) 2012-03-01 2013-02-25 Dispositif de traitement d'informations pour mettre en œuvre un processus d'anonymisation, procédé d'anonymisation et programme correspondant

Country Status (1)

Country Link
WO (1) WO2013128879A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010097336A (ja) * 2008-10-15 2010-04-30 Nippon Telegr & Teleph Corp <Ntt> プライバシー侵害監視装置、プライバシー侵害監視方法及びプログラム
JP2012003440A (ja) * 2010-06-16 2012-01-05 Kddi Corp 公開情報のプライバシー保護装置、公開情報のプライバシー保護方法およびプログラム
JP2012022315A (ja) * 2010-07-02 2012-02-02 Nec (China) Co Ltd データ匿名化の方法と装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KUNIHIKO HARADA ET AL.: "k-anonymization schemes with automatic generation of generalization trees and distortion measuring using information entropy", IPSJ SIG NOTES, vol. 2010-CSE, no. 47, 24 June 2010 (2010-06-24), pages 1 - 7, XP008179074 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13754792

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13754792

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP