WO2013183250A1 - Information processing device for anonymization and anonymization method - Google Patents

Information processing device for anonymization and anonymization method Download PDF

Info

Publication number
WO2013183250A1
Authority
WO
WIPO (PCT)
Prior art keywords
anonymization
target
quasi
record
identifier
Prior art date
Application number
PCT/JP2013/003347
Other languages
English (en)
Japanese (ja)
Inventor
翼 高橋 (Tsubasa Takahashi)
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation
Priority to JP2014519824A priority Critical patent/JPWO2013183250A1/ja
Publication of WO2013183250A1 publication Critical patent/WO2013183250A1/fr

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Definitions

  • the present invention relates to an information processing apparatus, anonymization method, and program for processing information and anonymizing it.
  • Non-Patent Document 1 proposes k-anonymity, which is a well-known anonymity index.
  • a technique for satisfying a predetermined k-anonymity in a data set to be anonymized is called k-anonymization.
  • the attribute information to be converted included in the anonymization target data set is referred to as a quasi-identifier.
  • the quasi-identifier is not unique identification information (for example, a name) that identifies an individual.
  • the quasi-identifier is attribute information that may identify an individual when combined with other, non-unique information.
  • a process of converting the target quasi-identifier is performed so that at least k records having the same quasi-identifier exist in the data set to be anonymized.
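As an illustrative aside (not taken from the patent itself), this conversion can be sketched in Python. The 5-year bucketing scheme and the helper names are assumptions made for the sketch:

```python
# Sketch of k-anonymization by generalization: exact birth years are mapped to
# 5-year ranges until every quasi-identifier value covers at least k records.
from collections import Counter

def generalize_birth_year(year, bucket=5):
    """Map an exact birth year to a 5-year range, e.g. 1983 -> '1981-1985'."""
    lo = (year - 1) // bucket * bucket + 1
    return f"{lo}-{lo + bucket - 1}"

def is_k_anonymous(values, k):
    """True if every distinct quasi-identifier value occurs in at least k records."""
    return all(count >= k for count in Counter(values).values())

# Exact years are unique, so 2-anonymity does not hold ...
years = [1982, 1983, 1988, 1990]
print(is_k_anonymous(years, 2))        # False
# ... but after generalization each range covers at least 2 records.
generalized = [generalize_birth_year(y) for y in years]
print(generalized)                     # ['1981-1985', '1981-1985', '1986-1990', '1986-1990']
print(is_k_anonymous(generalized, 2))  # True
```

Real k-anonymizers choose the generalization level per data set rather than using one fixed bucket width, but the check is the same: no quasi-identifier combination may match fewer than k records.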
  • Patent Document 1 discloses a privacy protection device including data processing means for processing data until the above k-anonymity is satisfied.
  • The k-anonymization of Non-Patent Document 1 and the privacy protection device of Patent Document 1 do not consider the case where a plurality of records having the same unique identification information are included in the data set to be anonymized; this leads to the following problem.
  • Suppose that the data set to be anonymized includes a plurality of records having the same unique identification information.
  • Suppose further that the anonymization target data set is k-anonymized while maintaining a specific connection relationship between the plurality of records having the same unique identification information.
  • In that case, the quasi-identifiers of the record groups after k-anonymization that can be related through the connection relationship can be compared with one another.
  • The quasi-identifiers abstracted by the above-mentioned k-anonymization can then be concretized by that comparison.
  • a data set including a plurality of records having the same unique identification information as described above is stored in a predetermined recording medium.
  • the data set includes historical information accumulated by those service providers, such as purchase information and medical information.
  • purchase information and medical information are generally stored in a recording medium as a set of a plurality of records for one individual (user). For example, purchase information associated with a credit card number is generated every time a user performs a purchase action using the same credit card. Such purchase information is associated with the user and stored in a recording medium as a record. Similarly, medical information is generated every time a medical practice is received using the same insurance card. Then, medical information associated with the same insured person is accumulated in the recording medium.
  • FIG. 2 is a diagram illustrating an example of an anonymization target data set.
  • FIG. 3 is a diagram illustrating an example of a data set obtained by k-anonymizing the data set to be anonymized in FIG.
  • The fake ID is local identification information for the data set shown in FIG. 3: it only indicates the relationship between the records of that data set and does not identify a specific individual.
  • The data set shown in FIG. 3 is designed so that a person's records cannot be narrowed down to fewer than k by any combination of knowledge about an individual's "sex", "date of birth", and "date of medical treatment".
  • The data set shown in FIG. 3 differs from the data sets handled in the background art in the following respect: it is the result of anonymizing a target data set (FIG. 2) in which a plurality of records are stored for one individual. Specifically, each individual's record group in the anonymization target data set shown in FIG. 2 has a specific connection relationship, namely a common value of the name attribute (unique identification information). In the data set shown in FIG. 3, the plurality of records having the same unique identification information are stored in the recording medium with the name anonymized into a fake ID. As noted above, the k-anonymization of the related technology does not take into consideration that a plurality of records of one individual can appear at the same time.
  • For example, the target record 822 with the name "Alice" shown in FIG. 2 ("sex: female, date of birth: February 2, 1985, date of medical treatment: April 2010") is processed into the anonymized record 832 with fake ID "2" shown in FIG. 3 ("sex: Any, date of birth: 1981-1985, date of medical treatment: April 2010"). Similarly, the target record 825 with the name "Alice" shown in FIG. 2 ("sex: female, date of birth: February 2, 1985, date of medical treatment: May 2010") is processed into the anonymized record 835 with fake ID "2" ("sex: female, date of birth: 1985-1986, date of medical treatment: May 2010").
  • Each of the anonymized record 832 and the anonymized record 835 having fake ID "2" thus has 2-anonymity with respect to "sex: female, date of birth: February 2, 1985".
  • However, the attributes "sex" and "date of birth" have invariant attribute values for a given individual. It can therefore easily be inferred that anonymized records having the same fake ID share the same attribute value for such invariant attributes in the target records before anonymization.
  • A person x can accordingly combine the anonymized records based on the fake ID.
  • From the intersection of the anonymized record 832 ("sex: Any, date of birth: 1981-1985") and the anonymized record 835 ("sex: female, date of birth: 1985-1986"), both with fake ID "2", the person x obtains "sex: female, date of birth: 1985" as concretized information for fake ID "2". As a result, the 2-anonymity of fake ID "2" is broken.
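The attack described above can be sketched as follows. This is an illustrative reconstruction: the attribute names and the `intersect_range` helper are assumptions, not the patent's notation:

```python
# Linkage attack: two anonymized records sharing fake ID 2 are intersected
# attribute by attribute, shrinking the generalized ranges back toward the
# exact values of the individual behind the fake ID.
def intersect_range(a, b):
    """Intersect two generalized values; 'Any' stands for the full domain."""
    if a == "Any":
        return b
    if b == "Any":
        return a
    a_lo, a_hi = map(int, a.split("-"))
    b_lo, b_hi = map(int, b.split("-"))
    lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
    return str(lo) if lo == hi else f"{lo}-{hi}"

rec_832 = {"sex": "Any",    "birth": "1981-1985"}   # anonymized record 832, fake ID 2
rec_835 = {"sex": "female", "birth": "1985-1986"}   # anonymized record 835, fake ID 2

combined = {attr: intersect_range(rec_832[attr], rec_835[attr]) for attr in rec_832}
print(combined)  # {'sex': 'female', 'birth': '1985'}
```

The intersection recovers "sex: female, date of birth: 1985", far narrower than either published record, which is exactly how the 2-anonymity of fake ID 2 is broken.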
  • An object of the present invention is to provide an information processing apparatus, anonymization method, and program for anonymization that can solve the above-described problems.
  • The information processing apparatus for anonymization according to the present invention includes a k-anonymization unit that, for an anonymization target data set including one or more target records each containing unique identification information and one or more target quasi-identifiers corresponding to that unique identification information, converts each piece of unique identification information into fake identification information uniquely assigned to it, converts each of the target quasi-identifiers into an anonymization quasi-identifier so that the anonymization target data set satisfies k-anonymity, thereby converts the target records into anonymization records, and generates a k-anonymization data set.
  • The apparatus further includes an anonymity enhancement unit that converts the anonymization records having the same fake identification information into enhancement records, and generates and outputs an anonymity enhancement data set including the enhancement records.
  • In converting the anonymization records into the enhancement records, the anonymity enhancement unit converts each reinforcement target quasi-identifier, that is, the anonymization quasi-identifier corresponding to a target quasi-identifier that always has the same attribute value for a given piece of unique identification information in the anonymization target data set, into information that cannot be concretized by comparing the reinforcement target quasi-identifiers with one another.
  • The anonymization method of the present invention, for an anonymization target data set including one or more target records each containing unique identification information and one or more target quasi-identifiers corresponding to that unique identification information, converts each piece of unique identification information into fake identification information that is uniquely assigned to it and contains no information for restoring the unique identification information, converts each of the target quasi-identifiers into an anonymization quasi-identifier so that the anonymization target data set satisfies k-anonymity, thereby converts the target records into anonymization records and generates a k-anonymization data set including the anonymization records, and then, in the k-anonymization data set, converts the anonymization records having the same fake identification information into enhancement records, generating and outputting an anonymity enhancement data set including the enhancement records.
  • In converting the anonymization records into the enhancement records, the reinforcement target quasi-identifiers, that is, the anonymization quasi-identifiers corresponding to the target quasi-identifiers that always have the same attribute value for a given piece of unique identification information, are converted into information that cannot be concretized by comparing them with one another.
  • The program of the present invention, stored in a non-volatile recording medium, causes a computer to execute: a process of, for an anonymization target data set including one or more target records each containing unique identification information and one or more target quasi-identifiers corresponding to that unique identification information, converting each piece of unique identification information into fake identification information uniquely assigned to it; a process of converting each of the target quasi-identifiers into an anonymization quasi-identifier so that the anonymization target data set satisfies k-anonymity, thereby converting the target records into anonymization records and generating a k-anonymization data set including the anonymization records; and a process of converting the anonymization records having the same fake identification information into enhancement records and generating an anonymity enhancement data set including the enhancement records.
  • The process of converting the anonymization records into the enhancement records converts each reinforcement target quasi-identifier, that is, the anonymization quasi-identifier corresponding to a target quasi-identifier that always has the same attribute value for a given piece of unique identification information in the anonymization target data set, into information that cannot be concretized by comparing the reinforcement target quasi-identifiers with one another.
  • The present invention has the effect of making it possible to generate a data set with enhanced anonymity, in which individuals cannot be concretized even if the quasi-identifiers of the record groups after k-anonymization are compared.
  • FIG. 1 is a block diagram showing the configuration of the anonymization apparatus according to the first embodiment.
  • FIG. 2 is a diagram illustrating an example of the anonymization target data set in the first embodiment.
  • FIG. 3 is a diagram showing an example of a k-anonymization data set in the first embodiment.
  • FIG. 4 is a diagram illustrating an example of the anonymity enhancement data set in the first embodiment.
  • FIG. 5 is a block diagram illustrating a hardware configuration of a computer that implements the anonymization apparatus according to the first embodiment.
  • FIG. 6 is a flowchart showing the operation of the anonymization device according to the first embodiment.
  • FIG. 7 is a flowchart showing the operation of the anonymity enhancing unit in the first embodiment.
  • FIG. 8 is a flowchart showing the operation of the anonymity enhancing unit in the modification of the first embodiment.
  • FIG. 9 is a diagram illustrating an example of the anonymity enhancement data set in the modification of the first embodiment.
  • FIG. 10 is a block diagram illustrating a configuration of the anonymization apparatus according to the second embodiment.
  • FIG. 11 is a diagram illustrating an example of the anonymization target data set in the second embodiment.
  • FIG. 12 is a diagram illustrating an image when the target records of the anonymization target data set are distributed to groups in the second embodiment.
  • FIG. 13 shows an example of the anonymization quasi-identifier when the anonymization target data set is k-anonymized in a combination of certain target records in the second embodiment.
  • FIG. 14 shows an example of the anonymization quasi-identifier when the anonymization target data set is k-anonymized in a combination of certain target records in the second embodiment.
  • FIG. 15 is a diagram illustrating an example of the information loss amount corresponding to combinations of target records in the second embodiment.
  • FIG. 16 is a diagram illustrating an example of a k-anonymized data set according to the second embodiment.
  • FIG. 17 is a flowchart showing the operation of the anonymization apparatus according to the second embodiment.
  • FIG. 1 is a block diagram showing a configuration of an anonymization apparatus (generally also called an information processing apparatus) 100 according to the first embodiment of the present invention.
  • the anonymization device 100 includes a k-anonymization unit 110 and an anonymity enhancement unit 120.
  • the constituent elements shown in FIG. 1 may be constituent elements in hardware units or constituent elements divided into functional units of a computer.
  • the components shown in FIG. 1 will be described as components divided into functional units of a computer.
  • The k-anonymization unit 110 converts the anonymization target data set stored in a storage device (not shown) into a k-anonymization data set that satisfies k-anonymity for a predetermined k (for example, k = 2).
  • the conversion is a process for anonymizing data, and is also referred to as “processing”, but here, it is unified with “conversion”.
  • The k-anonymization unit 110 generates the k-anonymization data set by converting the records included in the anonymization target data set (hereinafter referred to as target records) into anonymization records.
  • the k-anonymization unit 110 converts the target record into an anonymization record as follows. First, the k-anonymization unit 110 converts the unique identification information included in each target record into false identification information that is uniquely assigned to the unique identification information.
  • the false identification information is identification information that does not include restoration information to the unique identification information.
  • In generating the k-anonymization data set, the k-anonymization unit 110 converts the quasi-identifiers included in the target records (also referred to as target quasi-identifiers) into anonymization quasi-identifiers so that at least k records share the same quasi-identifier attribute values.
  • the anonymization quasi-identifier is a quasi-identifier determined so that the anonymization target data set including the anonymization quasi-identifier satisfies predetermined k-anonymity.
  • FIG. 2 is a diagram illustrating an example of the anonymization target data set 820.
  • the anonymization target data set 820 shown in FIG. 2 is stored in a storage device (not shown). This storage device may be included in the k-anonymization unit 110 or may be an external storage medium connected to the k-anonymization unit 110.
  • the anonymization target data set 820 includes a plurality of target records (for example, one of them is the target record 822) including attributes of name (unique identification information), gender, date of birth, date of medical treatment, and name of injury and illness.
  • the attribute includes an attribute name (attribute element name) and a value of the attribute (attribute value). For example, regarding the first attribute of the target record 822, the element name is “name”, and “Alice” is the attribute value.
  • the name is a kind of unique identification information and is information for identifying an individual.
  • each of the attributes, sex, date of birth, and date of medical care is a quasi-identifier (target quasi-identifier).
  • A quasi-identifier that always has the same attribute value for a given piece of unique identification information is called an invariant quasi-identifier.
  • A quasi-identifier whose attribute value may differ even for the same piece of unique identification information is called a variable quasi-identifier.
  • The attributes "sex" and "date of birth" are invariant quasi-identifiers.
  • The attribute "date of medical treatment" is a variable quasi-identifier.
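As an illustrative aside (not part of the patent), whether a quasi-identifier is invariant or variable can be checked mechanically from the target records. The record layout and the helper name `invariant_quasi_identifiers` below are assumptions made for this sketch:

```python
# Sketch: classify quasi-identifiers as invariant or variable, assuming raw
# target records shaped like those of FIG. 2 (attribute names illustrative).
records = [
    {"name": "Alice", "sex": "female", "birth": "1985-02-02", "visit": "2010-04"},
    {"name": "Alice", "sex": "female", "birth": "1985-02-02", "visit": "2010-05"},
    {"name": "Bob",   "sex": "male",   "birth": "1982-07-01", "visit": "2010-04"},
]

def invariant_quasi_identifiers(records, quasi_identifiers, id_attr="name"):
    """An attribute is invariant if its value never changes for any individual."""
    invariant = set(quasi_identifiers)
    seen = {}  # (individual, attribute) -> first value observed
    for rec in records:
        for attr in quasi_identifiers:
            key = (rec[id_attr], attr)
            if key in seen and seen[key] != rec[attr]:
                invariant.discard(attr)  # value changed: variable quasi-identifier
            seen.setdefault(key, rec[attr])
    return invariant

print(sorted(invariant_quasi_identifiers(records, ["sex", "birth", "visit"])))
# ['birth', 'sex']  -- 'visit' differs between Alice's two records
```

Here "sex" and "birth" come out invariant, while "visit" (the date of medical treatment) is variable, matching the classification above.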
  • FIG. 3 is a diagram illustrating an example of the k-anonymization data set 830.
  • The k-anonymization data set 830 shown in FIG. 3 is stored in a storage device (not shown). This storage device may be included in the k-anonymization unit 110 or may be an external storage medium connected to the k-anonymization unit 110.
  • The k-anonymization data set 830 includes a plurality of anonymization records (for example, anonymization record 832) that include the attributes of fake ID (fake identification information), sex, date of birth, date of medical treatment, and name of injury and illness.
  • Each of the false IDs of the k-anonymization data set 830 corresponds to each of the names included in the anonymization target data set 820 on a one-to-one basis.
  • The fake ID only indicates the relationship between the anonymization records of the k-anonymization data set 830 shown in FIG. 3 and does not identify a specific individual; it is local identification information of the k-anonymization data set 830.
  • The other attributes included in the k-anonymization data set 830 are quasi-identifiers (anonymization quasi-identifiers), as in the anonymization target data set 820 described above.
  • The target records of the anonymization target data set 820 shown in FIG. 2 and the anonymization records of the k-anonymization data set 830 shown in FIG. 3 correspond in the order of arrangement (for example, the target record 822 corresponds to the anonymization record 832, and the target record 825 corresponds to the anonymization record 835).
  • the anonymity enhancement unit 120 executes processing for enhancing anonymity for the quasi-identifier to be strengthened included in the anonymization record having the same false ID included in the k-anonymization data set.
  • the quasi-identifier to be strengthened is an invariant quasi-identifier among the anonymization quasi-identifiers included in the k-anonymization data set.
  • The process for strengthening the quasi-identifiers converts the reinforcement target quasi-identifiers into data with enhanced anonymity such that, even when the reinforcement target quasi-identifiers are compared with one another, they cannot be concretized (that is, an individual cannot be identified or narrowed down).
  • converting this strengthening target quasi-identifier into data with enhanced anonymity is referred to as strengthening processing.
  • That is, the anonymity enhancement unit 120 strengthens the reinforcement target quasi-identifiers so as to prevent k-anonymity from being broken by the comparison of quasi-identifiers included in a plurality of anonymization records of the same user (the same fake ID).
  • Specifically, the anonymity enhancement unit 120 strengthens the reinforcement target quasi-identifiers into the same attribute value for each attribute name.
  • Here, the same attribute value is an attribute value that includes all the reinforcement target quasi-identifiers having the same fake ID for each attribute name, and that indicates the minimum such range.
  • Alternatively, the same attribute value may be an attribute value indicating an arbitrary range that includes all the reinforcement target quasi-identifiers having the same fake ID for each attribute name.
  • Hereinafter, all reinforcement target quasi-identifiers having the same fake ID for a given attribute name are abbreviated as "same-fake-ID reinforcement target quasi-identifiers".
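The minimal covering range mentioned above can be illustrated as follows. This is a sketch with assumed value formats (numeric ranges written as "lo-hi", with "Any" denoting the whole domain), not the patent's implementation:

```python
def minimal_covering_range(values):
    """Smallest range 'lo-hi' containing every generalized input range;
    'Any' (the whole domain) absorbs everything else."""
    if "Any" in values:
        return "Any"
    bounds = [tuple(map(int, v.split("-"))) for v in values]
    lo = min(b[0] for b in bounds)
    hi = max(b[1] for b in bounds)
    return f"{lo}-{hi}"

# Birth-date ranges of the two records with fake ID 2 in FIG. 3:
print(minimal_covering_range(["1981-1985", "1985-1986"]))  # 1981-1986
# The sex values 'Any' and 'female' unify to the broader 'Any':
print(minimal_covering_range(["Any", "female"]))           # Any
```

After this strengthening, both of fake ID 2's records carry the same, broader invariant values, so neither can refine the other.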
  • FIG. 4 is a diagram illustrating an example of the anonymity enhancement data set 840.
  • The anonymity enhancement data set 840 is information output from the anonymity enhancement unit 120 and stored in a storage device (not shown).
  • the anonymity enhancement data set 840 includes a plurality of enhancement records (for example, enhancement records 842) including attributes of fake ID, gender, date of birth, date of medical care, and name of injury and illness.
  • the enhancement record is obtained by strengthening the sex and date of birth, which are the quasi-identifiers to be strengthened in the k-anonymization data set 830 shown in FIG.
  • Each target record of the anonymization target data set 820 shown in FIG. 2, each anonymization record of the k-anonymization data set 830 shown in FIG. 3, and each enhancement record of the anonymity enhancement data set 840 shown in FIG. 4 correspond in the order of arrangement.
  • For example, the target record 822, the anonymization record 832, and the enhancement record 842 correspond to one another.
  • Likewise, the target record 825, the anonymization record 835, and the enhancement record 845 correspond to one another.
  • the anonymization device 100 can be realized by an information processing device such as a computer.
  • Each component (functional block) in the anonymization apparatus 100 and the anonymization apparatus in other embodiments described later is realized by hardware resources included in the information processing apparatus.
  • The information processing apparatus may include a CPU (Central Processing Unit) that executes a computer program (software program; hereinafter sometimes simply referred to as "program") stored in a recording medium.
  • the anonymization device 100 includes hardware such as a CPU of a computer, a main storage device, and an auxiliary storage device, and is realized by the cooperation of the CPU based on a program loaded from the storage device or the like to the main storage device.
  • the functions realized by the CPU are not limited to the block configuration shown in FIG. 1 (k-anonymization unit 110, anonymity enhancement unit 120), and various implementation forms that can be adopted by those skilled in the art can be applied. (The same applies to the following embodiments).
  • the anonymization device 100 and the anonymization device according to each embodiment to be described later may be realized by a dedicated device.
  • FIG. 5 is a diagram illustrating a hardware configuration of a computer 700 that realizes the anonymization apparatus 100 according to the present embodiment.
  • the computer 700 includes a CPU (Central Processing Unit) 701, a storage unit 702, a storage device 703, an input unit 704, an output unit 705, and a communication unit 706. Furthermore, the computer 700 includes a recording medium (or storage medium) 707 supplied from the outside.
  • the recording medium 707 may be a non-volatile recording medium that stores information non-temporarily.
  • the CPU 701 controls the overall operation of the computer 700 by operating an operating system (not shown).
  • the CPU 701 reads a program and data from a recording medium 707 mounted on the storage device 703, for example, and writes the read program and data to the storage unit 702.
  • The program is, for example, a program that causes the computer 700 to execute the operation of the flowchart shown in FIG. 6.
  • the CPU 701 executes various processes as the k-anonymization unit 110 and the anonymity enhancement unit 120 shown in FIG. 1 according to the read program and based on the read data.
  • the CPU 701 may download a program or data to the storage unit 702 from an external computer (not shown) connected to a communication network (not shown).
  • the storage unit 702 stores programs and data.
  • The storage unit 702 may store the anonymization target data set, the k-anonymization data set, and the anonymity enhancement data set.
  • the storage device 703 is, for example, an optical disk, a flexible disk, a magnetic optical disk, an external hard disk, and a semiconductor memory, and includes a recording medium 707.
  • the storage device 703 records the program so that it can be read by a computer. Further, the storage device 703 may record data so as to be readable by a computer.
  • The storage device 703 may store the anonymization target data set, the k-anonymization data set, and the anonymity enhancement data set.
  • the input unit 704 is realized by, for example, a mouse, a keyboard, a built-in key button, and the like, and is used for an input operation.
  • the input unit 704 is not limited to a mouse, a keyboard, and a built-in key button, and may be a touch panel, an accelerometer, a gyro sensor, a camera, or the like.
  • the output unit 705 is realized by a display, for example, and is used for confirming the output.
  • the communication unit 706 implements an interface with the outside (for example, a data server that stores an anonymization target data set).
  • the communication unit 706 is included as part of the k-anonymization unit 110, for example.
  • the functional unit block of the anonymization device 100 shown in FIG. 1 is realized by the computer 700 having the hardware configuration shown in FIG.
  • the means for realizing each unit included in the computer 700 is not limited to the above.
  • the computer 700 may be realized by one physically coupled device, or may be realized by two or more physically separated devices connected by wire or wirelessly and by a plurality of these devices. .
  • the recording medium 707 in which the above-described program code is recorded may be supplied to the computer 700, and the CPU 701 may read and execute the program code stored in the recording medium 707.
  • the CPU 701 may store the code of the program stored in the recording medium 707 in the storage unit 702, the storage device 703, or both. That is, the present embodiment includes an embodiment of a recording medium 707 that stores a program (software) executed by the computer 700 (CPU 701) temporarily or non-temporarily.
  • FIG. 6 is a flowchart showing the operation of the anonymization device 100 of this embodiment. The processing according to this flowchart may be executed under the program control by the CPU described above. Step names of the processing are denoted by symbols such as S601.
  • First, the k-anonymization unit 110 acquires the anonymization target data set 820 (S601). For example, the k-anonymization unit 110 reads the anonymization target data set held in the storage unit 702 or the storage device 703 illustrated in FIG. 5. Alternatively, the k-anonymization unit 110 may receive the anonymization target data set from the outside (not shown) via the communication unit 706, or may receive the anonymization target data set input via the input unit 704.
  • Next, the k-anonymization unit 110 k-anonymizes the anonymization target data set 820 to generate the k-anonymization data set 830, and outputs it to the storage unit 702 or the storage device 703 (S602).
  • Specifically, the k-anonymization unit 110 converts the target quasi-identifiers of each target record of the anonymization target data set 820 into anonymization quasi-identifiers so that at least k anonymization records having different fake IDs in the k-anonymization data set 830 have the same combination of anonymization quasi-identifiers.
  • the method of converting the target quasi-identifier of each target record into the anonymized quasi-identifier is, for example, abstraction by generalizing the target quasi-identifier.
  • the method for anonymizing the target quasi-identifier of each target record is not limited to a specific method, and various methods such as perturbation may be used.
  • Next, the anonymity enhancement unit 120 strengthens the reinforcement target quasi-identifiers included in the k-anonymization data set 830 to generate the anonymity enhancement data set 840, in which the anonymization records are converted into enhancement records (S603).
  • Specifically, the anonymity enhancement unit 120 strengthens the same-fake-ID reinforcement target quasi-identifiers so that they become the same value for each attribute name.
  • the anonymity enhancing unit 120 converts all anonymized records having the same fake ID into enhanced records.
  • For this strengthening, any combination of the various processes generally used in k-anonymization, such as generalization and perturbation, can be used.
  • Since the invariant quasi-identifiers are the same for each attribute name in all the enhancement records having the same fake ID, the invariant quasi-identifiers (reinforcement target quasi-identifiers) are never concretized even if those of a plurality of enhancement records are compared. Therefore, the desired k-anonymity is not broken even by such a comparison.
  • the anonymity enhancing unit 120 outputs the generated anonymity enhancing data set 840 (S604).
  • the anonymity enhancing unit 120 outputs the anonymity enhancing data set to the outside (not shown) via the communication unit 706.
  • the anonymity enhancing unit 120 may store the anonymity enhancing data set in the storage unit 702 or the storage device 703 illustrated in FIG. 5. Further, the anonymity enhancing unit 120 may output the anonymity enhancing data set to the output unit 705 shown in FIG. 5 and control it to be displayed on the display.
  • FIG. 7 is a flowchart showing an operation (S603 shown in FIG. 6) in which the anonymity enhancing unit 120 generates the anonymity enhancing data set.
  • The anonymity enhancement unit 120 performs the processing from S611 to S614 for every group of anonymization records having the same fake ID. For example, in the case of the k-anonymization data set 830 shown in FIG. 3, the anonymity enhancement unit 120 performs the processing from S611 to S614 on the anonymization records 832 and 835 with fake ID 2, and on the anonymization records 834 and 836 with fake ID 4.
  • First, the anonymity enhancement unit 120 selects a fake ID to be processed (S611).
  • Here, a fake ID to be processed is a fake ID that has not yet been processed. If there is no fake ID to be processed (YES in S612), the process ends.
  • the anonymity enhancing unit 120 strengthens the reinforcement target quasi-identifiers of all anonymized records having the selected false ID into the same attribute value for each attribute name (S613).
  • the anonymity enhancing unit 120 strengthens the reinforcement target quasi-identifier of the anonymization record to be processed into the same attribute value as the specific reinforcement quasi-identifier for each attribute name (S614).
  • here, the processing target anonymization record is an anonymization record belonging to the same reinforcement target quasi-identifier group (described later) as the anonymization record whose reinforcement target quasi-identifier was strengthened in S613.
  • the specific reinforcement target quasi-identifier is the strengthened reinforcement target quasi-identifier of the anonymization record whose reinforcement target quasi-identifier was strengthened in S613.
  • for example, the anonymization records having the same reinforcement target quasi-identifier as either the anonymization record 832 or the anonymization record 835 of the false ID: 2 are the anonymization record 831 having the false ID: 1 (gender: Any, date of birth: 1981-1985) and the anonymization record 836 having the false ID: 4 (gender: female, date of birth: 1985-1986).
  • a plurality of such anonymized records having the same reinforcement target quasi-identifier belong to the same reinforcement target quasi-identifier group.
  • the anonymization record 832 having the false ID: 2 and the anonymization record 831 having the false ID: 1 belong to the same reinforcement target quasi-identifier group.
  • the anonymization record 835 having the false ID: 2 and the anonymization record 836 having the false ID: 4 belong to the same reinforcement target quasi-identifier group.
  • the anonymization records belonging to the same reinforcement target quasi-identifier group as the anonymization record having the false ID: 2 are the anonymization record 831 of the false ID: 1 and the anonymization record 836 of the false ID: 4.
  • furthermore, the anonymity enhancement unit 120 strengthens the reinforcement target quasi-identifier of the processing target anonymization record to “the same attribute value as the specific reinforcement target quasi-identifier strengthened in S614” for each attribute name (S615).
  • the anonymization record to be processed is an anonymization record having the same fake ID as the anonymization record obtained by strengthening the reinforcement target quasi-identifier again in S614.
  • the specific reinforcement target quasi-identifier is the reinforcement target quasi-identifier strengthened in S614.
  • in other words, the anonymity enhancing unit 120 applies reinforcement processing that converts the reinforcement target quasi-identifiers of the anonymization records having the same fake ID into the same attribute value. Furthermore, the anonymity enhancement unit 120 applies the reinforcement processing to the reinforcement target quasi-identifiers of the anonymization records belonging to the same reinforcement target quasi-identifier group as any anonymization record to which the reinforcement processing has been applied. The anonymity enhancement unit 120 thus reinforces the reinforcement target quasi-identifiers recursively.
  • that is, when the anonymity strengthening unit 120 reinforces the reinforcement target quasi-identifier of a certain anonymization record, it also reinforces the reinforcement target quasi-identifiers of the anonymization records having the same false ID. Furthermore, the anonymity enhancing unit 120 strengthens the reinforcement target quasi-identifiers of the anonymization records belonging to the same reinforcement target quasi-identifier group as any anonymization record whose reinforcement target quasi-identifier has been strengthened. The anonymity enhancing unit 120 then recursively repeats this reinforcement process for the anonymization records having the same false ID as any anonymization record whose reinforcement target quasi-identifier has been strengthened again.
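  • the recursive reinforcement described above can be sketched as follows. This is an illustrative sketch, not the patent's literal implementation: two constraints must hold after reinforcement (records sharing a fake ID carry identical reinforcement target quasi-identifiers, and records whose reinforcement target quasi-identifiers were originally identical stay identical), and chasing these constraints to a fixpoint is equivalent to merging connected components with union-find and generalizing each component. The record values below are stand-ins modeled on the FIG. 3 description, with “Any” represented as the full set {"male", "female"}.

```python
# Hedged sketch of the recursive reinforcement in S613-S615 via union-find.

def reinforce(records):
    n = len(records)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    groups = {}
    for i, r in enumerate(records):
        # Constraint 1: records with the same fake ID must end up identical.
        groups.setdefault(("fake", r["fake_id"]), []).append(i)
        # Constraint 2: records whose reinforcement target quasi-identifiers
        # were originally identical (same group) must also end up identical.
        groups.setdefault(("qi", frozenset(r["sex"]), r["birth"]), []).append(i)
    for members in groups.values():
        for i in members[1:]:
            parent[find(i)] = find(members[0])

    # Generalize every connected component to the union of its members' values.
    comp = {}
    for i, r in enumerate(records):
        sex, rng = comp.setdefault(find(i), (set(), [r["birth"][0], r["birth"][1]]))
        sex |= r["sex"]
        rng[0] = min(rng[0], r["birth"][0])
        rng[1] = max(rng[1], r["birth"][1])
    return [
        {"fake_id": r["fake_id"],
         "sex": comp[find(i)][0],
         "birth": tuple(comp[find(i)][1])}
        for i, r in enumerate(records)
    ]

records = [
    {"fake_id": 1, "sex": {"male", "female"}, "birth": (1981, 1985)},  # cf. record 831
    {"fake_id": 2, "sex": {"male", "female"}, "birth": (1981, 1985)},  # cf. record 832
    {"fake_id": 3, "sex": {"female"}, "birth": (1986, 1990)},          # cf. record 833
    {"fake_id": 4, "sex": {"female"}, "birth": (1986, 1990)},          # cf. record 834
    {"fake_id": 2, "sex": {"female"}, "birth": (1985, 1986)},          # cf. record 835
    {"fake_id": 4, "sex": {"female"}, "birth": (1985, 1986)},          # cf. record 836
]
strengthened = reinforce(records)
```

  • with these values, the constraints chain all six records into one component, so every record converges to “sex: Any, date of birth: 1981-1990”, matching the recursion traced in the text.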
  • the anonymity enhancing unit 120 selects a fake ID: 2 as a fake ID to be processed (S611).
  • next, the anonymity strengthening unit 120 reinforces the reinforcement target quasi-identifiers of the anonymization record 832 and the anonymization record 835 having the selected false ID: 2 by converting them into the same attribute value for each attribute name (S613).
  • for example, the same attribute value for the reinforcement target quasi-identifier whose attribute name is “sex” is “Any”, which includes both “Any” and “female”. Similarly, the same attribute value for the reinforcement target quasi-identifier whose attribute name is “birth date” is “1981 to 1986”, which includes both “1981 to 1985” and “1985 to 1986”.
  • the anonymity enhancement unit 120 generalizes the reinforcement target quasi-identifiers of the anonymization record 832 and the anonymization record 835 having the false ID: 2 into “sex: Any, date of birth: 1981-1986”.
  • next, the anonymity enhancing unit 120 reinforces the reinforcement target quasi-identifiers of the anonymization records belonging to the same reinforcement target quasi-identifier group as the anonymization record 832 or the anonymization record 835 of false ID: 2, whose reinforcement target quasi-identifiers were reinforced in S613. That is, the anonymity enhancement unit 120 reinforces the reinforcement target quasi-identifier of the anonymization record 831 of false ID: 1, which belongs to the same reinforcement target quasi-identifier group as the anonymization record 832 of false ID: 2, into “sex: Any, date of birth: 1981-1986”, the same attribute value as the reinforcement target quasi-identifier of the anonymization record 832.
  • similarly, the anonymity enhancement unit 120 reinforces the reinforcement target quasi-identifier of the anonymization record 836 of false ID: 4, which belongs to the same reinforcement target quasi-identifier group as the anonymization record 835 of false ID: 2, into “sex: Any, date of birth: 1981-1986”, the same attribute value as the reinforcement target quasi-identifier of the anonymization record 835 (S614).
  • the anonymity enhancing unit 120 selects a fake ID: 4 as a fake ID to be processed (S611).
  • next, the anonymity enhancing unit 120 reinforces the reinforcement target quasi-identifiers of the anonymization record 834 and the anonymization record 836 having the selected false ID: 4 by converting them into the same attribute value for each attribute name (S613).
  • here, by the processing performed when the fake ID: 2 was selected in S611, the anonymization record 836 of fake ID: 4, which belongs to the same reinforcement target quasi-identifier group as the anonymization record 835 of fake ID: 2, has already been given the quasi-identifier “sex: Any, date of birth: 1981-1986”. Therefore, the anonymity strengthening unit 120 generalizes the anonymization record 834 and the anonymization record 836 of the false ID: 4 into “sex: Any, date of birth: 1981-1990”.
  • next, the anonymity enhancement unit 120 reinforces the reinforcement target quasi-identifiers of the anonymization records belonging to the same reinforcement target quasi-identifier group as the anonymization record 834 or the anonymization record 836 of fake ID: 4, whose reinforcement target quasi-identifiers were reinforced in S613.
  • that is, the anonymity strengthening unit 120 reinforces, for each attribute name, the reinforcement target quasi-identifier of the anonymization record 833 of false ID: 3, which belongs to the same reinforcement target quasi-identifier group as the anonymization record 834 of false ID: 4, into “sex: Any, date of birth: 1981-1990”, the same attribute value as the reinforcement target quasi-identifier of the anonymization record 834.
  • similarly, the anonymity strengthening unit 120 reinforces, for each attribute name, the reinforcement target quasi-identifier of the anonymization record 835 of false ID: 2, which belongs to the same reinforcement target quasi-identifier group as the anonymization record 836 of false ID: 4, into “sex: Any, date of birth: 1981-1990”, the same attribute value as the reinforcement target quasi-identifier of the anonymization record 836 (S614).
  • further, along with the re-strengthening of the reinforcement target quasi-identifier of the anonymization record 835 of the false ID: 2 in S614, the anonymity strengthening unit 120 strengthens, for each attribute name, the reinforcement target quasi-identifier of the anonymization record 832 of the false ID: 2 into “sex: Any, date of birth: 1981-1990”, the same attribute value as the reinforcement target quasi-identifier of the anonymization record 835 (S615).
  • with the strengthening of the reinforcement target quasi-identifier of the anonymization record 832 of fake ID: 2, the anonymity strengthening unit 120 also strengthens, for each attribute name, the reinforcement target quasi-identifier of the anonymization record 831 of fake ID: 1, which belongs to the same reinforcement target quasi-identifier group as the anonymization record 832 of fake ID: 2, into “sex: Any, date of birth: 1981-1990”, the same attribute value as the reinforcement target quasi-identifier of the anonymization record 832 (S614).
  • as described above, the anonymity enhancing unit 120 of the present embodiment performs the enhancement process on the reinforcement target quasi-identifier (invariant quasi-identifier), thereby preventing the breakdown of k-anonymity caused by comparing the anonymization quasi-identifiers of anonymized records having the same false ID. That is, the anonymity enhancing unit 120 of the present embodiment performs the enhancement process on the reinforcement target quasi-identifier so that k-anonymity is satisfied even if such a comparison is made.
  • the k-anonymization data set 830 is k-anonymized by the k-anonymization unit 110. Therefore, k-anonymity is satisfied in each anonymized record unit.
  • when the anonymization quasi-identifiers of anonymized records satisfying k-anonymity are further generalized, there always exist k or more strengthened records after the generalization that correspond to the anonymization quasi-identifier of each anonymized record. That is, just as for the k-anonymized records, there always exist k or more strengthened records after the generalization that correspond to the target quasi-identifier of each target record.
  • therefore, even if the anonymity enhancement data set obtained by further generalizing the anonymization quasi-identifiers falls outside the strict definition of k-anonymity, it can have the same privacy strength as the k-anonymity of the k-anonymization data set 830.
  • specifically, the anonymity strengthening unit 120 may generalize the anonymization records having the same false ID into a superset of their invariant quasi-identifiers (reinforcement target quasi-identifiers).
  • in the superset, the reinforcement target quasi-identifiers of the records having the same false ID are converted, for each attribute name, into the same attribute value (an attribute value including the range of the attribute values of all the invariant quasi-identifiers for that attribute name).
  • a super set is a set that represents a superordinate concept of a set.
  • that is, for each attribute name, the attribute value of the invariant quasi-identifier is converted into a superset (or union) that includes all the attribute values of the invariant quasi-identifiers, or all the values included in those attribute values.
  • here, the union is the smallest superset among the supersets that include all the attribute values of the invariant quasi-identifiers, or all the values included in those attribute values.
  • a superset may be expressed using a range or the like. Such a superset can maintain the same privacy strength as the k-anonymity guaranteed by the k-anonymization unit 110.
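  • the smallest superset (union) described above can be sketched as follows, under the assumption that numeric attributes are represented as ranges and categorical attributes as value sets (the function and field names are illustrative, not taken from the patent text).

```python
# Hedged sketch of merging invariant quasi-identifiers into their union.

def smallest_range(ranges):
    """Minimal range covering every (lo, hi) range in the input."""
    return (min(lo for lo, _ in ranges), max(hi for _, hi in ranges))

def value_union(value_sets):
    """Union of categorical value sets."""
    out = set()
    for s in value_sets:
        out |= s
    return out

# Values modeled on the fake ID: 2 records of FIG. 3 ("Any" = both sexes):
merged_birth = smallest_range([(1981, 1985), (1985, 1986)])   # (1981, 1986)
merged_sex = value_union([{"male", "female"}, {"female"}])    # "Any"
```

  • any wider range, e.g. (1980, 1989), is also an admissible superset; the union is simply the choice that minimizes the loss of information.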
  • FIG. 8 is a flowchart showing an operation (S603 shown in FIG. 6) in which the anonymity enhancing unit 120 generates the anonymity enhancing data set in the modification of the first embodiment.
  • FIG. 9 is a diagram illustrating an example of the anonymity enhancing data set 850 generated by the anonymity enhancing unit 120 by generalizing the k-anonymized data set 830.
  • the anonymity enhancing unit 120 selects fake ID: 2 as a fake ID to be processed (S621).
  • the anonymity reinforcing unit 120 reinforces the reinforcement target quasi-identifiers of the anonymization record 832 and the anonymization record 835 having the selected false ID: 2 into the same attribute value for each attribute name (S623).
  • for example, the same attribute value for the reinforcement target quasi-identifier whose attribute name is “sex” is “Any”, which includes both “Any” and “female”. The same attribute value for the reinforcement target quasi-identifier whose attribute name is “birth date” is, for example, “1981-1986”, an attribute value indicating the minimum range including both “1981 to 1985” and “1985 to 1986”.
  • however, the attribute value does not necessarily have to indicate the minimum range including all of them. An attribute value of an arbitrary range including all the anonymization quasi-identifiers whose attribute name is “birth date”, such as “1980 to 1989”, may be used.
  • the anonymity enhancing unit 120 selects a fake ID: 4 as a fake ID to be processed (S621).
  • the anonymity reinforcing unit 120 reinforces the reinforcement target quasi-identifiers of the anonymization record 834 and the anonymization record 836 having the selected false ID: 4 into the same attribute value for each attribute name (S623).
  • in this case, the same attribute value for the reinforcement target quasi-identifier whose attribute name is “sex” is “female”, since both attribute values are “female”. The same attribute value for the reinforcement target quasi-identifier whose attribute name is “birth date” is, for example, “1985-1990”, an attribute value indicating the minimum range including both “1986 to 1990” and “1985 to 1986”.
  • in this modification, the anonymization records belonging to the same reinforcement target quasi-identifier group need not be strengthened.
  • the first effect of the present embodiment described above is that it makes it possible to generate a data set in which individual identifiability cannot be improved even if the invariant quasi-identifiers (anonymization quasi-identifiers) of the anonymized records after anonymization are compared.
  • the reason is that the anonymity enhancement unit 120 strengthens the reinforcement target quasi-identifiers included in the anonymization records of the k-anonymization data set generated by the k-anonymization unit 110, thereby generating the anonymity enhancement data set.
  • the second effect of the present embodiment described above is that it makes it possible to generate a data set that does not improve personal identifiability while strictly maintaining the k-anonymity of the k-anonymization data set generated by the k-anonymization unit 110.
  • the anonymity strengthening unit 120 recursively executes the above-described strengthening process to generate an anonymity strengthening data set.
  • the third effect of the present embodiment described above is that anonymity is maintained so that the loss of information is kept relatively small and personal identification cannot be improved by comparing the quasi-identifiers of the k-anonymized records. It is a point that makes it possible to generate a data set enhanced.
  • the reason is that the anonymity enhancing unit 120 reinforces only the reinforcement target quasi-identifier included in the anonymized record having the same fake ID to generate the anonymity enhanced data set.
  • the anonymization apparatus of this embodiment calculates an information loss amount corresponding to the generalization (anonymity enhancement) by the anonymity enhancement unit 120 illustrated in FIG. Then, the anonymization apparatus of this embodiment determines a combination of target records for k-anonymization based on the calculated information loss amount, for example so that the information loss amount is minimized. Then, the anonymization apparatus of this embodiment converts the target quasi-identifiers of the anonymization target data set based on the determined combination of target records so as to satisfy the desired anonymity, and generates a k-anonymization data set.
  • the anonymization device of this embodiment calculates the information loss amount with the unique identification information as a unit. The reason is to cope with the case where the anonymization target data set includes a plurality of target records for one unique identification information.
  • the anonymity enhancing unit 120 performs reinforcement processing in order to prevent anonymity failure due to comparison.
  • however, when the information loss amount of each target record is calculated in record units, the information loss amount does not include the loss of information caused by the strengthening by the anonymity enhancing unit 120. Such a loss occurs for an anonymization target data set that includes a plurality of target records for one piece of unique identification information. This is because the enhancement processing by the anonymity enhancement unit 120 further generalizes the anonymization records of the k-anonymization data set based on the false ID corresponding to each piece of unique identification information. Therefore, the information loss in that case is not taken into account if only the information loss amount of each single target record is obtained.
  • therefore, the anonymization apparatus of this embodiment calculates an information loss amount in units of the unique identification information corresponding to each fake ID, that is, an information loss amount that includes the loss of information caused when the data is strengthened by the anonymity enhancement unit 120.
  • this “information loss amount corresponding to each unique identification information associated with the strengthening process of the quasi-identifier to be strengthened by the anonymity enhancing unit 120” will be referred to as a strengthened processing information loss amount.
  • thereby, the anonymization device of the present embodiment copes with the case where the anonymization target data set includes a plurality of target records for one piece of unique identification information, and generates an anonymity enhancement data set in which the loss of information due to generalization is further reduced.
  • FIG. 10 is a block diagram showing the configuration of the anonymization apparatus 200 according to the second embodiment of the present invention.
  • the anonymization device 200 includes a k-anonymization unit 210 instead of the k-anonymization unit 110 as compared to the anonymization device 100 according to the first embodiment. Further, the anonymization device 200 further includes a combination determination unit 230 and an information loss calculation unit 240 as compared with the anonymization device 100.
  • the combination determination unit 230 may include an information loss calculation unit 240.
  • the combination determination unit 230 generates one or more combination candidates.
  • a combination candidate is a candidate for a combination of target records when target records included in the anonymization target data set are distributed to one or more groups.
  • the combination determination unit 230 passes the combination candidates to the information loss calculation unit 240. Then, the combination determination unit 230 receives the reinforced processing information loss amount corresponding to each combination candidate from the information loss calculation unit 240.
  • for example, the combination determination unit 230 determines, as the combination of target records, the combination candidate having the smallest information loss calculated based on the received reinforced processing information loss amounts. That is, the combination determination unit 230 determines the combination of target records so that the total sum of the information loss amounts, after the reinforcement processing by the anonymity enhancement unit 120, of all target records included in the anonymization target data set is minimized.
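  • the selection performed by the combination determination unit 230 can be sketched as follows. This is a hedged sketch, not the patent's literal implementation: it assumes the information loss calculation unit 240 supplies, per combination candidate, the reinforced processing information loss amount for each piece of unique identification information. The numeric loss values below are hypothetical, not taken from FIG. 15.

```python
# Hedged sketch of combination determination: pick the candidate whose
# summed per-unique-ID reinforced processing information loss is smallest.

def choose_combination(candidates, loss_per_id):
    """loss_per_id(candidate) -> {unique_id: reinforced processing loss}."""
    return min(candidates, key=lambda c: sum(loss_per_id(c).values()))

# Hypothetical per-patient losses for two candidate partitionings:
losses = {
    "partition 401": {"Alice": 0.89, "Bob": 0.78, "Carol": 0.67},
    "partition 402": {"Alice": 0.56, "Bob": 0.44, "Carol": 0.67},
}
best = choose_combination(list(losses), lambda c: losses[c])  # "partition 402"
```

  • with these hypothetical figures the second candidate has the smaller summed loss, mirroring the document's conclusion that the partition 402 is adopted over the partition 401.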
  • FIG. 11 is a diagram illustrating an example of the anonymization target data set 860.
  • the anonymization target data set 860 includes a plurality of target records (for example, target records 8601) including attributes of a patient ID (also referred to as unique identification information), a birth year, a medical treatment date, and a wound name. .
  • the attribute “birth year” is an invariant canonical identifier.
  • the attribute “medical care date” is a variable quasi-identifier.
  • FIG. 12 is a diagram showing an image when the target records of the anonymization target data set 860 are distributed to one or more groups.
  • the dotted line in FIG. 12 shows an example of partitioning in which target records are combined and distributed to groups so that 3-anonymity can be guaranteed.
  • this group is referred to as an anonymous group.
  • the partition 401 and the partition 402 are partitions that divide the anonymization target data set by attribute: year of birth.
  • the partition 403 and the partition 404 are partitions that divide the anonymization target data set by the attribute: medical treatment date.
  • each anonymous group includes target records having three or more different patient IDs.
  • the patient ID is shown using a false ID shown in FIG. 13 (corresponding to the patient ID shown in FIG. 12 in the order of arrangement). Therefore, each anonymous group is a group that is partitioned so that 3-anonymity can be guaranteed.
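  • the 3-anonymity condition on an anonymous group can be sketched as follows. A minimal sketch with illustrative field names: because one patient may contribute several target records, what is counted is the number of distinct pieces of unique identification information (patient IDs), not the raw record count.

```python
# Hedged sketch of checking k-anonymity for one anonymous group.

def satisfies_k_anonymity(group, k):
    """True if the group contains records of at least k distinct patients."""
    return len({r["patient_id"] for r in group}) >= k

group = [
    {"patient_id": "Alice"}, {"patient_id": "Alice"},
    {"patient_id": "Bob"}, {"patient_id": "Carol"},
]
ok = satisfies_k_anonymity(group, 3)          # True: 3 distinct patient IDs
too_strict = satisfies_k_anonymity(group, 4)  # False: only 3 distinct IDs
```

  • note that the group above holds four records but only guarantees 3-anonymity, since two of the records belong to the same patient.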
  • for example, the combination determination unit 230 determines whether to adopt the anonymization groups divided by the partition 403 or by the partition 404, that is, determines a combination of target records. In this way, the combination determination unit 230 selects, from among the candidate combinations of target records, the candidate that satisfies the desired k-anonymity and has the smallest sum of the reinforced processing information loss amounts corresponding to the patient IDs (unique identification information), and determines it as the combination of target records.
  • the information loss calculation unit 240 receives a combination candidate from the combination determination unit 230. Next, the information loss calculation unit 240 calculates the reinforced processing information loss amount based on the received combination candidate. Next, the information loss calculation unit 240 passes the calculated reinforced processing information loss amount to the combination determination unit 230.
  • the information loss calculation unit 240 calculates the reinforced processing information loss amount by using a calculation method corresponding to the strengthening processing of the anonymity strengthening unit 120.
  • here, a case will be described in which the strengthening process of the anonymity enhancement unit 120 generalizes the target quasi-identifier (the invariant quasi-identifier whose attribute is “birth year”) included in the target records of the anonymization target data set 860 into a superset of attribute values covering at least the minimum range.
  • for example, the information loss calculation unit 240 calculates the reinforced processing information loss amount by NCP (Normalized Certainty Penalty). Various indexes for measuring the amount of information loss have been proposed.
  • the information loss calculation unit 240 may calculate the reinforced processing information loss amount by using any calculation method corresponding to the strengthening processing of the anonymity strengthening unit 120 without being limited to the NCP.
  • NCP(r.a) = (r.a_max - r.a_min) / (a.max - a.min)
  • here, NCP(r.a) is the NCP value for attribute a of a target record r, r.a_max is the maximum value of the attribute value of attribute a of the target record r, r.a_min is the minimum value of the attribute value of attribute a of the target record r, a.max is the maximum value of attribute a over all target records in the anonymization target data set 860, and a.min is the minimum value of attribute a over all target records in the anonymization target data set 860.
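  • the NCP formula can be written as a small function. The data-set span 1981-1990 for the “birth year” attribute is the assumption that reproduces the per-record values quoted below.

```python
# NCP for one record's generalized numeric attribute:
# (record range width) / (attribute range width over the whole data set).

def ncp(r_a_min, r_a_max, a_min, a_max):
    return (r_a_max - r_a_min) / (a_max - a_min)

ncp_8601 = round(ncp(1981, 1988, 1981, 1990), 2)  # 0.78
ncp_8604 = round(ncp(1983, 1989, 1981, 1990), 2)  # 0.67
ncp_8607 = round(ncp(1981, 1985, 1981, 1990), 2)  # 0.44
```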
  • an NCP for each target record having a patient ID: Alice is calculated as follows.
  • for example, the target quasi-identifier whose attribute is “birth year” included in the target record 8601 is k-anonymized into 1981-1988 by the k-anonymization unit 210 in the anonymization group divided by the partition 403. Therefore, the NCP of the target quasi-identifier whose attribute included in the target record 8601 is “birth year” is 0.78 (the third decimal place is rounded off; the same applies hereinafter).
  • similarly, the NCP of the target quasi-identifier whose attribute included in the target record 8604 is “birth year” is 0.67, and the NCP of the target quasi-identifier whose attribute included in the target record 8607 is “birth year” is 0.44.
  • therefore, the information loss calculation unit 240 of the present embodiment calculates, as the reinforced processing information loss amount, an NCP for each patient ID (denoted NCP*).
  • for example, the target quasi-identifiers whose attribute is “birth year” included in the target record 8601, the target record 8604, and the target record 8607 having the patient ID: Alice are converted by the k-anonymization unit 210 into 1981-1988, 1983-1989, and 1981-1985, respectively.
  • the minimum value of the “year of birth” attribute included in the target record 8601, the target record 8604, and the target record 8607 of the patient ID: Alice is 1981, and the maximum value is 1989. Therefore, the NCP * of the target record 8601, the target record 8604, and the target record 8607 of the patient ID: Alice is 0.89.
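  • the per-patient NCP* can be sketched in the same way: take the overall minimum and maximum over all anonymized ranges assigned to the same patient, then normalize as in NCP. The data-set span 1981-1990 is again the assumption that reproduces the quoted value.

```python
# Hedged sketch of NCP* per unique identification information (patient).

def ncp_star(ranges, a_min, a_max):
    """NCP over the union of all anonymized ranges assigned to one patient."""
    lo = min(r[0] for r in ranges)
    hi = max(r[1] for r in ranges)
    return (hi - lo) / (a_max - a_min)

alice_ranges = [(1981, 1988), (1983, 1989), (1981, 1985)]
alice_loss = round(ncp_star(alice_ranges, 1981, 1990), 2)  # 0.89
```

  • this captures the loss that would result if the anonymity enhancement unit 120 strengthened all of Alice's records to one common range, which a per-record NCP cannot see.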
  • FIG. 13 illustrates the anonymization quasi-identifiers corresponding to the target quasi-identifiers whose attribute is “birth year” included in each target record when the anonymization target data set 860 is divided by the partition 401, the partition 403, and the partition 404.
  • FIG. 14 illustrates the anonymization quasi-identifiers corresponding to the target quasi-identifiers whose attribute is “birth year” included in each target record when the anonymization target data set 860 is divided by the partition 402, the partition 403, and the partition 404. That is, FIG. 13 and FIG. 14 show examples of the anonymization quasi-identifiers when the anonymization target data set 860 is k-anonymized with a certain combination of target records.
  • FIG. 15 is a diagram showing an example of the information loss amount corresponding to each combination of target records. Specifically, FIG. 15 shows the NCP* value for each patient ID and the sum of NCP* over the entire anonymization target data set 860 when each of the partition 401 and the partition 402 is adopted. FIG. 15 shows that the loss of information due to anonymization is smaller when the partition 402 is adopted than when the partition 401 is adopted. In this case, the combination determination unit 230 adopts the partition 402.
  • the k-anonymization unit 210 converts the target quasi-identifiers included in the target records belonging to each anonymous group of the combination determined by the combination determination unit 230 into anonymization quasi-identifiers, and generates a k-anonymization data set. For example, the k-anonymization unit 210 converts the target quasi-identifiers included in the target records belonging to each anonymous group into the same attribute value for each attribute name.
  • that is, the k-anonymization unit 210 converts the target quasi-identifiers into anonymization quasi-identifiers so that the total sum of the information loss amounts, after the conversion by the anonymity enhancement unit 120, of all target records included in the anonymization target data set is minimized.
  • FIG. 16 is a diagram illustrating an example of a k-anonymization data set generated by the k-anonymization unit 210.
  • this k-anonymization data set is k-anonymized by the k-anonymization unit 210 when the combination determination unit 230 determines to divide the combination of target records by the partition 402, the partition 403, and the partition 404.
  • the hardware configuration of the anonymization apparatus 200 may be the configuration shown in FIG. 5.
  • FIG. 17 is a flowchart showing the operation of the anonymization apparatus 200 according to this embodiment.
  • the combination determination unit 230 generates one or more combination candidates and passes the generated combination candidates to the information loss calculation unit 240 (S631).
  • the information loss calculation unit 240 calculates the reinforced processing information loss amount based on the received combination candidate, and passes the calculated reinforced processing information loss amount to the combination determination unit 230 (S632).
  • the combination determination unit 230 determines a combination candidate with the smallest information loss calculated based on the received amount of strengthening processing information loss as a combination of target records (S633).
  • next, the k-anonymization unit 210 converts, for each target record belonging to each anonymous group, the target quasi-identifiers included in the target records of the combination determined by the combination determination unit 230 into anonymization quasi-identifiers, generates the k-anonymization data set, and outputs it to the storage unit 702 or the storage device 703 (S634).
  • next, the anonymity enhancement unit 120 reinforces the reinforcement target quasi-identifiers included in the received k-anonymization data set, and generates an anonymity enhancement data set in which the anonymization records are converted into enhancement records (S635).
  • the anonymization target data set may include a plurality of invariant quasi-identifiers.
  • in that case, the combination determination unit 230 totals the NCP* of the invariant quasi-identifiers in the same combination candidate in units of the same attribute name and the same fake ID, and uses the information loss calculated by summing these totals as the reinforced processing information loss amount. Note that this summation may be performed for each same attribute name.
  • the first effect of the present embodiment described above is that it makes it possible to generate an anonymity enhancement data set with a smaller loss of information than the anonymity enhancement data set generated by the anonymization device 100 of the first embodiment.
  • the information loss calculation unit 240 calculates the reinforced processing information loss amount corresponding to each unique identification information.
  • the combination determining unit 230 determines a combination of target records based on the amount of reinforced processing information loss.
  • the k-anonymization unit 210 converts the target quasi-identifier included in the determined target record into the same attribute value for each attribute name, thereby generating a k-anonymization data set.
  • the second effect of the present embodiment described above is that it is possible to generate an anonymity-enhanced data set with relatively small loss of information.
  • the reason is that the combination determining unit 230 determines the combination candidate with the smallest information loss calculated based on the received amount of strengthening processing information loss as the combination of the target records.
  • another reason is that the k-anonymization unit 210 converts the target quasi-identifiers included in the target records belonging to each anonymous group into the same attribute value for each attribute name.
  • Each component described in the above embodiments does not necessarily need to be an independent entity.
  • For example, a plurality of components may be realized as a single module.
  • One component may be realized by a plurality of modules.
  • A certain component may be a part of another component.
  • A part of a certain component may overlap a part of another component.
  • Each component, and a module that realizes each component, may be realized by hardware if necessary. Moreover, each component and the module that realizes it may be realized by a computer and a program.
  • the program is provided by being recorded on a non-volatile computer-readable recording medium such as a magnetic disk or a semiconductor memory, and is read by the computer when the computer is started up.
  • the read program causes the computer to function as a component in each of the above-described embodiments by controlling the operation of the computer.
  • A plurality of operations is not limited to execution at mutually different timings. For example, one operation may occur during the execution of another, or the execution timing of one operation may partially or entirely overlap that of another.
  • In each of the embodiments described above, a certain operation is sometimes described as triggering another operation, but such a description does not limit every relationship between those operations. For this reason, when each embodiment is implemented, the relationships among the plurality of operations may be changed within a range that does not impair the content.
  • The specific description of each operation of each component does not limit that operation. For this reason, each specific operation of each component may be changed, when implementing each embodiment, within a range that does not cause problems for its functional, performance, and other characteristics.
  • Reference signs: 100 Anonymization device; 110 k-anonymization unit; 120 Anonymity enhancement unit; 200 Anonymization device; 210 k-anonymization unit; 230 Combination determination unit; 240 Information loss calculation unit; 401 Partition; 402 Partition; 403 Partition; 404 Partition; 700 Computer; 701 CPU; 702 Storage unit; 703 Storage device; 704 Input unit; 705 Output unit; 706 Communication unit; 707 Recording medium; 820 Anonymization target data set; 822 Target record; 825 Target record; 830 k-anonymized data set (anonymized data set); 831 Anonymized record; 832 Anonymized record; 833 Anonymized record; 834 Anonymized record; 835 Anonymized record; 836 Anonymized record; 840 Anonymity-enhanced data set; 842 Enhanced record; 845 Enhanced record; 850 Anonymity-enhanced data set; 860 Anonymization target data set; 8601 Target record; 8604 Target record; 8607 Target record
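The per-attribute summation of information loss described in the embodiment above can be illustrated with a small sketch. Here NCP is taken to be the standard Normalized Certainty Penalty for numeric attributes (generalized interval width divided by the attribute's full range); the function names, the interval-based candidate layout, and the selection by minimum total are illustrative assumptions, not a reproduction of the patented NCP* procedure:

```python
# Hypothetical sketch: per-attribute NCP totals for combination candidates.
# NCP of a generalized numeric interval = interval width / attribute range.

def ncp(interval, domain):
    """NCP of one generalized interval within the attribute's full domain."""
    lo, hi = interval
    d_lo, d_hi = domain
    return 0.0 if d_hi == d_lo else (hi - lo) / (d_hi - d_lo)

def candidate_loss(candidate, domains):
    """Total information loss: sum NCP over all quasi-identifier values,
    attribute name by attribute name (cf. the summation described above)."""
    return sum(
        ncp(iv, domains[attr])
        for attr, intervals in candidate.items()
        for iv in intervals
    )

def best_candidate(candidates, domains):
    """Choose the combination candidate with the smallest information loss."""
    return min(candidates, key=lambda c: candidate_loss(c, domains))

# Two candidates that generalize the same two records differently.
domains = {"age": (0, 100), "zip": (10000, 99999)}
cand_a = {"age": [(30, 40), (30, 40)], "zip": [(11000, 12000), (11000, 12000)]}
cand_b = {"age": [(30, 60), (30, 60)], "zip": [(11000, 12000), (11000, 12000)]}
best = best_candidate([cand_a, cand_b], domains)  # cand_a loses less information
```

In this toy setting the narrower age band (30-40 instead of 30-60) gives candidate A the smaller total, so it would be chosen as the combination of target records.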

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This invention provides an anonymization device for generating an anonymity-enhanced data set in which individual identifiability cannot be increased even by comparing the quasi-identifiers of a record group after k-anonymization. The anonymization device comprises: k-anonymization means for generating a k-anonymized data set by converting the unique identification information of an anonymization target data set into fake identification information and further converting the quasi-identifiers so as to satisfy a determined k-anonymity; and anonymity enhancement means for generating and outputting an anonymity-enhanced data set by converting, for each identical piece of fake identification information in the k-anonymized data set, those quasi-identifiers that still hold the same value for given unique identification information into quasi-identifiers that cannot be singled out by comparison with one another.
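The first stage the abstract describes (replace unique IDs with fake identification information, then generalize quasi-identifiers until every value combination is shared by at least k records) can be sketched roughly as below. The ten-year age bands, the `fake-N` pseudonyms, and all function names are assumptions for illustration only, not the patented method:

```python
from collections import Counter

def is_k_anonymous(records, quasi_attrs, k):
    """True when every quasi-identifier value combination occurs in >= k records."""
    counts = Counter(tuple(r[a] for a in quasi_attrs) for r in records)
    return all(c >= k for c in counts.values())

def k_anonymize(records, quasi_attrs, k, generalize):
    """Replace unique IDs with fake IDs and generalize quasi-identifiers.

    `generalize(attr, value)` is supplied by the caller; a ValueError is
    raised if the chosen generalization is too weak for the requested k."""
    out = []
    for i, rec in enumerate(records):
        g = {a: generalize(a, rec[a]) for a in quasi_attrs}
        g["id"] = f"fake-{i}"          # fake identification information
        out.append(g)
    if not is_k_anonymous(out, quasi_attrs, k):
        raise ValueError("generalization does not satisfy k-anonymity")
    return out

# Toy generalization: ten-year age bands and a 2-digit ZIP prefix.
def generalize(attr, value):
    if attr == "age":
        lo = (value // 10) * 10
        return f"{lo}-{lo + 9}"
    return value[:2] + "***"

records = [
    {"id": "alice", "age": 34, "zip": "11234"},
    {"id": "bob",   "age": 37, "zip": "11999"},
    {"id": "carol", "age": 31, "zip": "11500"},
]
anonymized = k_anonymize(records, ["age", "zip"], 3, generalize)
```

All three records end up sharing the quasi-identifier pair ("30-39", "11***"), so 3-anonymity holds; the enhancement stage of the invention would then further perturb quasi-identifiers that remain identical for the same individual across releases.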
PCT/JP2013/003347 2012-06-04 2013-05-28 Information processing device for anonymization and anonymization method WO2013183250A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2014519824A JPWO2013183250A1 (ja) 2012-06-04 2013-05-28 Information processing device for performing anonymization and anonymization method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012-127257 2012-06-04
JP2012127257 2012-06-04

Publications (1)

Publication Number Publication Date
WO2013183250A1 true WO2013183250A1 (fr) 2013-12-12

Family

ID=49711658

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/003347 WO2013183250A1 (fr) 2012-06-04 2013-05-28 Information processing device for anonymization and anonymization method

Country Status (2)

Country Link
JP (1) JPWO2013183250A1 (fr)
WO (1) WO2013183250A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011145401A1 (fr) * 2010-05-19 2011-11-24 Hitachi, Ltd. Device for de-identifying personal identification information
WO2012067213A1 (fr) * 2010-11-16 2012-05-24 NEC Corporation Information processing system and anonymization method

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101704702B1 (ko) * 2016-04-18 2017-02-08 (주)케이사인 Tagging-based personal information de-identification system and method
KR20200026559A (ko) * 2018-09-03 2020-03-11 (주)아이알컴퍼니 Method and apparatus for de-identifying a data set using a k-anonymity model
KR102126386B1 (ko) * 2018-09-03 2020-06-24 (주)아이알컴퍼니 Method and apparatus for de-identifying a data set using a k-anonymity model
JP2021197064A (ja) 2020-06-18 2021-12-27 Hitachi, Ltd. Data providing server device and data providing method
JP7382902B2 (ja) 2020-06-18 2023-11-17 Hitachi, Ltd. Data providing server device and data providing method

Also Published As

Publication number Publication date
JPWO2013183250A1 (ja) 2016-01-28

Similar Documents

Publication Publication Date Title
Kumar et al. Blockchain utilization in healthcare: Key requirements and challenges
JP6007969B2 (ja) Anonymization device and anonymization method
Cheng et al. Validity of in-hospital mortality data among patients with acute myocardial infarction or stroke in National Health Insurance Research Database in Taiwan
US9230132B2 (en) Anonymization for data having a relational part and sequential part
JP6015658B2 (ja) Anonymization device and anonymization method
AU2019203992A1 (en) Data platform for automated data extraction, transformation, and/or loading
US11449674B2 (en) Utility-preserving text de-identification with privacy guarantees
US11093645B2 (en) Coordinated de-identification of a dataset across a network
EP3832559A1 (fr) Contrôle d'accès à des ensembles de données désidentifiées basé sur un risque de réidentification
JP2008501173A (ja) 一般アプリケーションプログラムインタフェースを実装するためのシステム及び方法
US20160306999A1 (en) Systems, methods, and computer-readable media for de-identifying information
WO2013105076A1 (fr) Automated document redaction
US11468996B1 (en) Maintaining stability of health services entities treating influenza
US9971905B2 (en) Adaptive access control in relational database management systems
JP2013190838A (ja) Information anonymization system, information loss determination method, and information loss determination program
WO2013183250A1 (fr) Information processing device for anonymization and anonymization method
US11269632B1 (en) Data conversion to/from selected data type with implied rounding mode
WO2013121738A1 (fr) Distributed anonymization device and distributed anonymization method
Khamadja et al. Designing flexible access control models for the cloud
JP2017228255A (ja) Evaluation device, evaluation method, and program
KR20110099214A (ko) 동결된 개체들을 위한 형식 설명자 관리
US20240119173A1 (en) Information transaction device, information transaction method, and program
JP6127774B2 (ja) Information processing device and data processing method
US20170235883A1 (en) Identifying Missing Medical Codes for Re-Coding in Patient Registry Records
WO2014136422A1 (fr) Information processing device for performing anonymity-preserving processing, and anonymity-preserving method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13800297

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2014519824

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13800297

Country of ref document: EP

Kind code of ref document: A1