WO2013088681A1 - Anonymization device, anonymization method, and computer program - Google Patents
Anonymization device, anonymization method, and computer program
- Publication number
- WO2013088681A1 (PCT/JP2012/007825)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- anonymization
- records
- record
- property
- unique identifier
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
Definitions
- The present invention relates to the technical field of anonymizing information (history information), such as personal information, that should preferably not be disclosed or used in its original form.
- By analyzing history information, it is possible to grasp a specific user's behavior patterns, identify tendencies unique to a certain group, predict events that may occur in the future, analyze the factors behind past events, and so on.
- Using such analysis, a service provider can review and strengthen its business. History information is therefore highly useful information.
- History information held by such service providers is also useful to third parties other than the service providers themselves.
- By using such history information, a third party can obtain information it could not have gathered on its own, and can thereby enhance its own services and marketing.
- The service provider may therefore ask a third party to analyze the history information, or may disclose the history information for research purposes.
- Such highly useful history information may, however, include information that the subject of the history information does not want others to know, or information that third parties should not be allowed to know.
- Such information is generally called sensitive information (Sensitive Attribute (SA), or Sensitive Value).
- In a purchase history, the purchased products can be sensitive information.
- In medical information, the name of an illness or of a medical procedure is sensitive information.
- History information is given a user identifier (user ID) that uniquely identifies a service user, together with a plurality of attributes (attribute information) that characterize the service user.
- User identifiers include a name, a membership number, an insured-person number, and the like. Attributes that characterize service users include gender, date of birth, occupation, residential area, and postal code.
- The service provider records a user identifier, a plurality of types of attributes, and sensitive information as one record. The service provider accumulates such records as history information each time the specific user associated with that user identifier uses the service.
- If history information with the user identifier still attached is provided to a third party, the service user can be identified via the user identifier, which may cause a privacy-infringement problem.
- Furthermore, a certain individual can sometimes be identified by combining one or more attribute values given to each record in a data set composed of a plurality of records.
- Such an attribute that can identify an individual is called a quasi-identifier. That is, even for history information from which the user identifier has been removed, a privacy infringement may occur if an individual can be identified from the quasi-identifiers.
- Anonymization is known as a method for converting such a history-information data set into a form in which privacy is protected while its original usefulness is maintained.
- For example, Patent Document 1 discloses a technique that processes data received from a user terminal and, by evaluating the privacy information included in the received data, converts the received data into information excluding identifying information.
- Non-Patent Document 1 proposes "k-anonymity", the best-known anonymity index.
- A technique that makes the data set to be anonymized satisfy k-anonymity is called "k-anonymization".
- In k-anonymization, the target quasi-identifiers are converted so that at least k records having the same quasi-identifier values exist in the data set to be anonymized.
- As conversion methods, techniques such as generalization and suppression (cutoff) are known. In generalization, the original detailed information is converted into abstracted information.
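The k-anonymity condition described above can be checked mechanically. The following is a minimal Python sketch (not part of the original disclosure; the record fields and values are assumed for illustration) that counts records per quasi-identifier combination:

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """Count records per quasi-identifier value combination and check
    that every combination occurs in at least k records."""
    counts = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return all(c >= k for c in counts.values())

# Hypothetical generalized records (gender "*", birth year range):
records = [
    {"gender": "*", "birth": "1976-1985", "disease": "A"},
    {"gender": "*", "birth": "1976-1985", "disease": "B"},
]
print(is_k_anonymous(records, ["gender", "birth"], 2))  # True
```

A data set passes only when every quasi-identifier combination is shared by at least k records.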
- Non-Patent Document 2 proposes "l-diversity", one of the anonymity indices developed from k-anonymity.
- In l-diversification, the target quasi-identifiers are converted so that the plurality of records having the same quasi-identifier values include at least l different types of sensitive information.
- k-anonymization ensures that the number of records associated with each quasi-identifier is k or more.
- l-diversification ensures that there are l or more types of sensitive information associated with each quasi-identifier. An example of a data set subjected to l-diversification will be described later with reference to FIGS. 11A to 11C.
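The l-diversity condition can be sketched analogously. The following Python fragment (illustrative only; the field names are assumed) groups records by quasi-identifier values and counts the distinct sensitive values per group:

```python
from collections import defaultdict

def is_l_diverse(records, quasi_ids, sensitive, l):
    """Group records by quasi-identifier values and check that each
    group contains at least l distinct sensitive values."""
    groups = defaultdict(set)
    for r in records:
        groups[tuple(r[q] for q in quasi_ids)].add(r[sensitive])
    return all(len(vals) >= l for vals in groups.values())

# Hypothetical anonymous group with two distinct disease values:
group = [
    {"gender": "*", "birth": "1976-1985", "disease": "E"},
    {"gender": "*", "birth": "1976-1985", "disease": "F"},
]
print(is_l_diverse(group, ["gender", "birth"], "disease", 2))  # True
```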
- In addition, an anonymization technique for movement trajectories is known.
- Non-Patent Document 3 is a paper on a technique for anonymizing a movement trajectory in which position information is associated in time series. More specifically, the anonymization technique described in Non-Patent Document 3 regards the movement trajectory from its start point to its end point as a single sequence and guarantees consistent k-anonymity over it. This technique generates a tube-like anonymous movement trajectory by bundling k or more geographically similar movement trajectories, and, within the anonymity constraint, generates the anonymous movement trajectory that maximizes geographical similarity.
- In the movement-trajectory anonymization method represented by Non-Patent Document 3, the time-series order relationship, in particular, is maintained among the properties existing between records given the same user identifier.
- However, Non-Patent Document 3 mainly aims at constructing an anonymous movement trajectory that maximizes geographical similarity, and the properties between records are not necessarily maintained. Furthermore, Non-Patent Document 3 does not support anonymity guarantees such as k-anonymity, l-diversity, and m-invariance.
- Here, the problem of anonymizing history information in which there are a plurality of records given the same user identifier will be considered with reference to the examples shown in FIGS. 11A to 11C, FIGS. 12A and 12B, and FIG. 13.
- Consider medical information collected at a medical institution acting as a service provider.
- Such medical information includes a large number of records associated with different medical-treatment dates under user identifiers assigned to the same patients.
- FIG. 11A is a diagram illustrating a table relating to history information before anonymization (medical history relating to April 2010).
- FIG. 11B is a diagram illustrating a table relating to history information before anonymization (medical history relating to May 2010).
- The history information shown in FIG. 11A is a table in which, for April 2010, the gender, date of birth, medical-treatment date, and injury/illness name are associated with the user identifier that identifies the patient.
- The history information illustrated in FIG. 11B is a table collecting similar history information for May 2010. In other words, in FIG. 11B, records of the same kinds are associated with different medical-treatment dates under the same user identifiers as in FIG. 11A.
- Gender, date of birth, and medical-treatment date correspond to the "quasi-identifiers" described above.
- The injury/illness name corresponds to the "sensitive information" described above.
- FIG. 11C is a diagram illustrating a table of the properties existing between the data sets shown in FIGS. 11A and 11B. More specifically, in FIG. 11C, focusing on the records of the patient with the user identifier 00001 for April and May 2010, it can be seen that there exists the property of having suffered from the sickness A in April and then from the sickness E in May.
- An arrow shown in FIG. 11C (expressed as ">" in the textual description of the present application) represents a transition of injury or illness.
- Similarly, for the patient with the user identifier 00002, it can be seen that there exists the property of having suffered from the sickness B in April and again from the sickness B in May.
- FIG. 12A is a diagram illustrating a result of anonymizing the history information illustrated in FIG. 11A.
- FIG. 12B is a diagram illustrating the result of anonymizing the history information illustrated in FIG. 11B.
- In FIGS. 12A and 12B, the user identifiers indicated by the broken-line frames are information that is deleted and thus not disclosed to the user of the anonymized result; they are shown here for convenience of explanation.
- FIG. 13 is a diagram showing an example of the generalization tree used when abstracting gender. That is, FIG. 13 shows an example of a concept hierarchy (abstraction tree) that defines how gender, as a quasi-identifier, is abstracted.
- In FIG. 13, "*" corresponds to both male and female; that is, "*" represents a superordinate concept of the two gender values (male and female).
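The concept hierarchy of FIG. 13 can be represented, for illustration, as a simple parent map. The encoding below is an assumption made for this sketch, not part of the original disclosure:

```python
# Hypothetical encoding of the concept hierarchy in FIG. 13:
# "*" is the superordinate concept of the two gender values.
GENDER_TREE = {"male": "*", "female": "*"}

def generalize_gender(value):
    """Replace a concrete gender value with its parent concept,
    leaving already-abstract values unchanged."""
    return GENDER_TREE.get(value, value)

print(generalize_gender("female"))  # "*"
print(generalize_gender("*"))       # "*"
```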
- FIGS. 11C, 12A, and 12B are also referred to, on the premise described above, in the embodiments described later.
- The anonymization technique described here is a general technique that anonymizes each single data set (for April and for May) individually. That is, in this general anonymization technique, the plurality of data sets with different medical-treatment dates shown in FIGS. 11A and 11B are anonymized month by month: the data set of FIG. 11A is anonymized into FIG. 12A, and that of FIG. 11B into FIG. 12B. Moreover, in general anonymization, even when the data set is not divided into explicit units such as months, anonymization is performed in the same way whenever multiple records share the same user identifier.
- In FIGS. 12A and 12B, "gender" is made ambiguous according to the abstraction tree shown in FIG. 13 and is represented by the superordinate concept "*".
- The dates of birth shown in FIGS. 12A and 12B are represented as ranges (periods). They are the result of converting (abstracting) the attribute "date of birth" shown in FIGS. 11A and 11B so that two or more records share a common value, with the day-level detail deleted.
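The conversion of dates of birth into a shared period can be sketched as follows (an illustrative assumption; the exact abstraction rule used in FIGS. 12A and 12B is defined by the figures, not by this code):

```python
from datetime import date

def generalize_birth_dates(dates):
    """Abstract the birth dates of grouped records into one shared
    year range, deleting the month/day detail."""
    years = sorted(d.year for d in dates)
    return f"{years[0]}-{years[-1]}"

# Hypothetical birth dates of two records placed in one group:
print(generalize_birth_dates([date(1985, 1, 1), date(1976, 5, 5)]))  # "1976-1985"
```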
- In this way, the quasi-identifiers are abstracted so that a plurality of records share a common quasi-identifier (the same value).
- As shown in FIG. 11C, properties originally exist between the data set shown in FIG. 11A and the data set shown in FIG. 11B.
- For example, a record showing the disease A in April 2010 has the property of showing the disease E or the disease F in May 2010.
- That is, the following time-series properties exist in the data sets before anonymization shown in FIGS. 11A and 11B.
- On the other hand, from FIGS. 12A and 12B, it can be surmised that the following time-series properties exist between the two anonymized data sets shown in these drawings.
- Among these properties, not only are properties derived for the user identifiers 00002 and 00004 whose records were deleted by anonymization in FIG. 12A; in FIG. 12B, properties are even derived for the user identifiers 00001 and 00003 (that is, B>G).
- Likewise, not only are properties derived for the user identifiers 00006 and 00008 whose records were deleted by anonymization in FIG. 12A; in FIG. 12B, properties are even derived for the user identifiers 00005 and 00007 (that is, C>F).
- That is, the time-series properties after anonymization, estimated from the tables shown in FIGS. 12A and 12B, not only blur the original time-series properties shown in FIG. 11C that held before anonymization; properties that should not be derivable are derived as well.
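The inflation of properties described above can be reproduced directly: once user identifiers are deleted, any disease in the April anonymous group may be paired with any disease in the matching May group. A minimal sketch (illustrative, using the disease names from the running example):

```python
from itertools import product

def inferred_properties(first_period, second_period):
    """All transitions an analyst can infer between two matched
    anonymous groups once user identifiers have been deleted."""
    return {f"{a}>{b}" for a, b in product(first_period, second_period)}

# The anonymous group for the 1976-1985 period contains diseases A and B
# in FIG. 12A and diseases E and G in FIG. 12B:
print(sorted(inferred_properties({"A", "B"}, {"E", "G"})))
# ['A>E', 'A>G', 'B>E', 'B>G']
```

Only "A>E" held before anonymization; the other three transitions are artifacts of the grouping.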
- The present invention has been made in view of the problems described above.
- The main object of the present invention is to provide an anonymization device and the like that, when anonymizing history information, perform anonymization optimally and sufficiently while maximally maintaining the original properties existing between a plurality of records having the same identifier.
- In order to achieve this object, an anonymization device according to the present invention has the following configuration.
- That is, the anonymization device comprises: record extraction means for extracting, from history information including a plurality of records in which at least quasi-identifiers and sensitive information are associated with unique identification information, records given unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of the properties existing between the plurality of records sharing the specific unique identifier, such that the desired anonymity can be satisfied; and anonymization means for abstracting the quasi-identifiers included in each of the plurality of records extracted by the record extraction means, by giving them commonality, so that the individual attributes of those records satisfy the desired anonymity.
- An anonymization method according to the present invention has the following configuration.
- That is, in the anonymization method, with regard to history information including a plurality of records in which at least quasi-identifiers and sensitive information are associated with unique identification information, a computer extracts, based on the smallness of the ambiguity of the properties existing between the plurality of records sharing a specific unique identifier, records given unique identifiers different from that specific unique identifier, such that the desired anonymity can be satisfied; and the computer, or a different computer, abstracts the quasi-identifiers included in each of the extracted records by giving them commonality, so that the individual attributes of those records satisfy the desired anonymity.
- This object is also achieved by a computer program that realizes, on a computer, the anonymization device having the above configuration and the corresponding method, and by a computer-readable storage medium storing that computer program.
- According to the present invention, an anonymization device and the like are provided that achieve optimal and sufficient anonymization while maximally maintaining the original properties existing between a plurality of records having the same identifier.
- FIG. 1 is a functional block diagram showing the configuration of the anonymization device 100 according to the first embodiment of the present invention.
- FIG. 2 is a functional block diagram showing the configuration of the anonymization apparatus 200 according to the second embodiment of the present invention.
- FIG. 3 is a flowchart showing a procedure of control processing in the anonymization apparatus according to the first embodiment of the present invention.
- FIG. 4A is a diagram illustrating a table of the result of anonymizing, in the first embodiment, the history information before anonymization illustrated in FIG. 11A.
- FIG. 4B is a diagram illustrating a table of the result of anonymizing, in the first embodiment, the history information before anonymization illustrated in FIG. 11B, so as to suppress degradation of the abstraction of the original properties (FIG. 11C).
- FIG. 5 is a flowchart showing a procedure of control processing in the anonymization apparatus according to the second embodiment of the present invention.
- FIG. 6 is a functional block diagram showing the configuration of the anonymization apparatus 300 according to the third embodiment of the present invention.
- FIG. 7 is a flowchart showing a procedure of control processing in the anonymization apparatus according to the third embodiment of the present invention.
- FIG. 8 is a functional block diagram showing the configuration of the anonymization apparatus 400 according to the fourth embodiment of the present invention.
- FIG. 9 is a diagram illustrating a table of the result of anonymizing, in the third embodiment, the history information before anonymization illustrated in FIG. 14B, so as to suppress degradation of the abstraction of the original properties (FIG. 14C).
- FIG. 10 is a diagram for exemplifying the hardware configuration of a computer (information processing apparatus) that can implement the first to fourth embodiments of the present invention.
- FIG. 11A is a diagram illustrating a table relating to history information before anonymization (medical history relating to April 2010).
- FIG. 11B is a diagram illustrating a table relating to history information before anonymization (medical history relating to May 2010).
- FIG. 11C is a diagram exemplifying a table representing properties existing between the data sets illustrated in FIGS. 11A and 11B.
- FIG. 12A is a diagram illustrating a result of anonymizing the history information illustrated in FIG. 11A.
- FIG. 12B is a diagram illustrating the result of anonymizing the history information illustrated in FIG. 11B.
- FIG. 13 is a diagram illustrating a generalized tree used when abstracting gender.
- FIG. 14A is a diagram illustrating a table relating to history information before anonymization (medical history relating to June 2010).
- FIG. 14B is a diagram exemplifying history information before anonymization (a table relating to medical history relating to July 2010).
- FIG. 14C is a diagram showing an example of a table that summarizes, for each user identifier, the properties existing between the history information before anonymization from April to July 2010 (the data sets shown in FIGS. 11A, 11B, 14A, and 14B).
- The relationships illustrated in FIGS. 11A, 11B, and 11C, referred to in the "Problems to be Solved by the Invention" section above, are also used in the following embodiments for convenience of explanation.
- FIG. 1 is a functional block diagram showing the configuration of the anonymization device 100 according to the first embodiment of the present invention.
- the anonymization device 100 includes a record extraction unit 102 and an anonymization unit 104.
- the anonymization device 100 performs anonymization based on the history information 110.
- In the present embodiment, the anonymization device 100 acquires the properties concerning the history information 110 of interest from, for example, an external device.
- The history information 110 includes an identifier that associates a plurality of records with sensitive information.
- The history information 110 is information, such as personal information, that should preferably not be disclosed or used in its original form. The history information 110 consists, for example, of a plurality of records that share the same insured-person number as a user identifier but have different medical-treatment dates. More specifically, in the present embodiment, which refers to the example shown in FIGS. 11A and 11B, the history information 110 includes gender, date of birth, medical-treatment date, and injury/illness name as attributes characterizing the user represented by the user identifier. Among these attributes, the user identifier is a unique identifier, and the injury/illness name is sensitive information.
- The record extraction unit 102 extracts from the history information 110 records that can suppress the abstraction of the properties existing between a plurality of records having a specific user identifier (a common user identifier) and that can satisfy the desired anonymity. In other words, the record extraction unit 102 extracts records given user identifiers different from the specific user identifier, based on the smallness of the ambiguity of the properties of the history information 110, such that the desired anonymity ("2-diversity" in the present embodiment) can be satisfied.
- The anonymization unit 104 abstracts the quasi-identifiers included in the records so that the individual attributes of the plurality of records extracted by the record extraction unit 102 satisfy the desired anonymity.
- the anonymization device 100 can be realized by an information processing device such as a computer.
- Each component (functional block) of the anonymization device 100, and of the anonymization devices in the other embodiments described later, may be realized by executing a computer program (software program; hereinafter simply "program") on the hardware resources of an information processing device.
- That is, the anonymization device 100 is realized by the cooperation of hardware, such as a computer's CPU (Central Processing Unit), main storage device, and auxiliary storage device, with a program loaded from the storage device into the main storage device.
- the implementation form of the program is not limited to the block configuration (record extraction unit 102, anonymization unit 104) shown in FIG.
- the anonymization device 100 and the anonymization device according to each embodiment to be described later may be realized by a dedicated device.
- FIG. 3 is a flowchart showing a procedure of control processing in the anonymization apparatus according to the first embodiment of the present invention.
- First, the record extraction unit 102 extracts, from the history information 110, a plurality of records necessary for satisfying the desired anonymity (step S101). Then, from the plurality of records extracted in step S101, the record extraction unit 102 selects the records with the smallest property ambiguity (step S103). The processing of these two steps is described in detail below.
- In step S101, the record extraction unit 102 extracts a plurality of records necessary for satisfying the desired anonymity from the history information 110.
- Hereinafter, the record on which the extraction in step S101 is based is referred to as the "target record", and the plurality of records extracted in step S101, i.e., the records necessary for satisfying the desired anonymity with respect to the target record, are referred to as the "anonymization candidate record group".
- Here, a case will be described in which the record set whose medical-treatment date is May 2010 (hereinafter sometimes written "2010/5"; FIG. 11B) is the history information 110.
- Consider the case in which, within the history information 110, the record associated with the user identifier 00001 (the specific user identifier of interest) is the target record.
- In this case, the anonymization candidate record group necessary for satisfying "2-diversity" as the desired anonymity consists of records having different sensitive information within the same medical-treatment date.
- Here, the sensitive information of the target record is the injury/illness name E. The anonymization candidate record group is therefore the set of records in FIG. 11B associated with injury/illness names (F, G, H) different from E. Accordingly, the record extraction unit 102 selects, as the anonymization candidate record group for the target record, the records associated with the user identifiers 00002, 00004, 00005, 00006, 00007, and 00008, each of which differs from the specific user identifier of interest.
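Step S101 can be sketched as a simple filter over the records of the same medical-treatment date. The code below is an illustrative assumption (the disease values assigned to each user identifier here are hypothetical, except that 00001 and 00003 share the name E as stated above):

```python
def anonymization_candidates(records, target, sensitive="disease"):
    """Step S101 sketch: records sharing the target's treatment date
    whose sensitive value differs from the target's, as 2-diversity
    requires."""
    return [r for r in records
            if r["uid"] != target["uid"]
            and r["date"] == target["date"]
            and r[sensitive] != target[sensitive]]

# Hypothetical subset of FIG. 11B (disease assignments assumed):
may = [
    {"uid": "00001", "date": "2010/5", "disease": "E"},
    {"uid": "00002", "date": "2010/5", "disease": "G"},
    {"uid": "00003", "date": "2010/5", "disease": "E"},
    {"uid": "00005", "date": "2010/5", "disease": "F"},
]
target = may[0]
print([r["uid"] for r in anonymization_candidates(may, target)])
# ['00002', '00005']
```

Note that 00003 is excluded because its sensitive value equals the target's, exactly as described above.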
- In step S103, the record extraction unit 102 extracts, from the anonymization candidate record group extracted in step S101, a plurality of records capable of preserving the properties of each record.
- Hereinafter, a record that can preserve the properties of each record is referred to as a "property storage candidate record".
- The procedure for extracting property storage candidate records will now be described in detail.
- As described above, the original property "A>E" that existed for the patient with the user identifier 00001 is blurred into "A>E, A>G, B>E, B>G" in the generally anonymized data sets shown in FIGS. 12A and 12B. This is explained in more detail below.
- In FIG. 12A, the records for the period (1976 to 1985) include not only the record containing the injury/illness name A but also a record containing the injury/illness name B.
- The record with the injury/illness name B can in turn be associated with the injury/illness names E and G contained in the two records for the same period in FIG. 12B (medical-treatment date 2010/5).
- Accordingly, the properties concerning the injury/illness name B are "B>E, B>G".
- In this way, the ambiguity (degree of ambiguity) of the properties that arises when anonymization is performed is obtained.
- In FIG. 12B, the record for the user identifier 00001 is grouped with the record for the user identifier 00002, which has the same period (determined from the date of birth) as its quasi-identifier.
- Hereinafter, such a group is referred to as an "anonymous group".
- Here, the ambiguity of the properties when anonymization is performed is the difference between the number of property types estimated after anonymization and the number of original property types before anonymization.
- However, the method of calculating the ambiguity is not limited to the use of this difference.
- For example, the degree of ambiguity can also be obtained by calculating the rate of increase of the number of property types estimated after anonymization, relative to the number of property types before anonymization.
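Both ways of computing the ambiguity can be sketched directly from these definitions (the property sets used below are taken from the running example):

```python
def ambiguity_difference(props_before, props_after):
    """Ambiguity as the difference between the number of property types
    inferable after anonymization and the number before it."""
    return len(props_after) - len(props_before)

def ambiguity_increase_rate(props_before, props_after):
    """Ambiguity as the rate of increase of property types relative to
    the count before anonymization."""
    return (len(props_after) - len(props_before)) / len(props_before)

before = {"A>E", "B>G"}               # original properties
after = {"A>E", "A>G", "B>E", "B>G"}  # inferable after anonymization
print(ambiguity_difference(before, after))     # 2
print(ambiguity_increase_rate(before, after))  # 1.0
```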
- In step S103, the record extraction unit 102 extracts, from the anonymization candidate record group obtained as described above, a plurality of records with small property ambiguity after anonymization. That is, in the record set shown in FIG. 11B, when the record having the user identifier 00001 is the target record, the two records having the user identifiers 00005 and 00007 in FIG. 11B are extracted as property storage candidate records in step S103. This extraction method is described further below.
- In step S101, the record extraction unit 102 first extracts, as the anonymization candidate record group for the record (target record) associated with the user identifier 00001 shown in FIG. 11B, the records associated with the user identifiers 00002, 00004, 00005, 00006, 00007, and 00008.
- This is because the record extraction unit 102 selects records whose sensitive information differs from that of the user identifier 00001 (that is, the injury/illness name E) within the same medical-treatment date (2010/5, shown in FIG. 11B). In this case, since the record for the user identifier 00003 has the same sensitive information as the target record for the user identifier 00001, it is excluded from the anonymization candidate record group.
- Next, in step S103, the record extraction unit 102 calculates, for each record in the anonymization candidate record group, the property ambiguity that would result if an anonymous group were formed with the target record.
- Considering in turn the formation of an anonymous group with the target record for each of the records having the user identifiers 00002, 00004, 00006, and 00008, the ambiguity calculated from the difference described above is 2. Considering the remaining two user identifiers, 00005 and 00007, the ambiguity calculated from that difference is 0.
- Accordingly, the record extraction unit 102 selects the two records relating to the user identifiers 00005 and 00007 as property storage candidate records. The records selected here are those with the smallest inferable ambiguity, among the records having unique identifiers different from the currently focused user identifier (specific unique identifier) 00001, when abstracted together with that record.
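The selection in step S103 can be sketched as a minimum search over the candidate records. The scoring callback below is an assumption standing in for the ambiguity computation described above:

```python
def select_property_storage_candidates(target, candidates, ambiguity_of):
    """Step S103 sketch: keep the candidates whose trial anonymous group
    with the target yields the smallest property ambiguity.
    `ambiguity_of` is an assumed callback scoring one trial group."""
    scored = [(ambiguity_of(target, c), c) for c in candidates]
    best = min(score for score, _ in scored)
    return [c for score, c in scored if score == best]

# Ambiguity values from the text: 2 for 00002/00004/00006/00008,
# 0 for 00005/00007.
scores = {"00002": 2, "00004": 2, "00005": 0, "00006": 2, "00007": 0, "00008": 2}
candidates = [{"uid": u} for u in scores]
picked = select_property_storage_candidates(
    {"uid": "00001"}, candidates, lambda t, c: scores[c["uid"]])
print(sorted(c["uid"] for c in picked))  # ['00005', '00007']
```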
- Above, the processing configuration for selecting property storage candidate records based on the smallness of the property ambiguity has been described for the case of anonymizing one target record.
- However, the present invention, described using this embodiment as an example, is not limited to this processing configuration; for example, two or more target records can be processed.
- In that case, the record extraction unit 102 calculates the ambiguity between the two or more target records and the plural types of anonymization candidate record groups obtainable from those target records. The record extraction unit 102 may then extract, from the calculated results, the records with a low abstraction level as property storage candidate records for each target record.
- Next, the anonymization unit 104 extracts, from the plurality of records (property storage candidate records) selected in step S103, the records that form an anonymous group (step S105). Then, the anonymization unit 104 anonymizes the quasi-identifiers of the plurality of records (that is, the anonymous group) extracted in step S105 (step S107). That is, in step S107, the anonymization unit 104 abstracts the quasi-identifiers included in each of the plurality of records belonging to the anonymous group of interest.
- The processing procedure of these two steps is described in detail below.
- In step S105, the anonymization unit 104 selects, from the property storage candidate record group obtained in step S103, the record that forms an anonymous group together with the currently focused target record.
- Here, attention is focused on the record relating to the user identifier 00001 as the target record.
- For this target record, the records having the user identifiers 00005 and 00007 have been chosen as property storage candidate records by the record extraction unit 102 described above (steps S101 and S103).
- Since the desired anonymity to be satisfied is 2-diversity, either of the records relating to the user identifiers 00005 and 00007 may be selected.
- the criteria, indices, and viewpoints for selecting a record are not limited to the above-described examples.
- a method of evaluating the ambiguity when the quasi-identifier after anonymization is compared with the quasi-identifier before anonymization and extracting a record having the minimum ambiguity as a result of the evaluation is assumed. In this case, in order to minimize the degree of ambiguity, when the date of birth as a quasi-identifier is converted into a period, a record with a shorter period after conversion is selected from these two records. Good.
- the anonymization unit 104 selects a record related to the user identifier 00005 so that the target record (record related to the user identifier 00001) forms an anonymous group.
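This selection step can be sketched as follows: among the property storage candidate records, pick the one that yields the shortest generalized birth-date period with the target. This is a minimal illustration, not the patent's implementation; the birth dates for 00001 and 00005 follow FIG. 11B, while the birth date for candidate 00007 is a made-up placeholder.

```python
from datetime import date

def generalized_period_days(a: date, b: date) -> int:
    """Length in days of the minimal period covering both birth dates;
    a shorter period means a less ambiguous generalized quasi-identifier."""
    return abs((b - a).days)

def pick_least_ambiguous(target_birth: date, candidates: dict) -> str:
    """Among the property storage candidate records, pick the one whose
    generalized birth-date period with the target is shortest."""
    return min(candidates,
               key=lambda rid: generalized_period_days(target_birth, candidates[rid]))

# Target record 00001 (born 1985/1/1, per FIG. 11B); candidate 00005 is
# born 1976/5/5 (FIG. 11B); the birth date for 00007 is a made-up value.
target = date(1985, 1, 1)
candidates = {"00005": date(1976, 5, 5), "00007": date(1950, 3, 3)}
print(pick_least_ambiguous(target, candidates))  # → 00005
```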
- in step S107, the anonymization unit 104 abstracts, for the anonymous group formed for the target record in step S105 as described above, the quasi-identifier associated with each of the plurality of records constituting that anonymous group.
- as a general example of quasi-identifier abstraction, a case where the abstraction level is minimized by generalization of the quasi-identifier will be described.
- FIG. 4A is a diagram showing a table illustrating information anonymized, in the first embodiment, from the history information before anonymization illustrated in FIG. 11A.
- FIG. 4B is a diagram showing a table illustrating the result of anonymizing, in the first embodiment, the history information before anonymization illustrated in FIG. 11B while keeping the abstraction of the original properties (FIG. 11C) low.
- the range indicated by the broken line is information that is not disclosed when provided to the user of the anonymized information, as in FIGS. 12A and 12B, and is shown here for convenience of explanation. Accordingly, the anonymization device 100 may store the entire data structure indicated by the broken-line and solid-line frames in FIGS. 4A and 4B, as long as it is not provided to the user.
- the records relating to the user identifiers 00001 and 00005 in the table relating to the medical year 2010/05 form an anonymous group (II-I).
- the quasi-identifier (gender, date of birth) of the user identifier 00001 is (female, 1985/1/1).
- the quasi-identifier of the user identifier 00005 is (female, 1976/5/5) in FIG. 11B.
- the anonymization unit 104 abstracts these quasi-identifiers by generalization, and assigns the abstracted quasi-identifiers to both records after anonymization.
- the abstraction in this embodiment is performed by generalization as an example.
- by generalization, detailed information (a specific category value) can be converted into more ambiguous information. That is, in this embodiment, in the generalization from the records shown in FIG. 11B to the records shown in FIG. 4B, the gender is abstracted based on the generalization tree shown in FIG., and the date of birth is converted from a specific value into a period in which the specific date is made ambiguous.
- the two records having user identifiers 00001 and 00005 are both “female” in gender, and remain “female” after abstraction based on the hierarchy represented by the generalization tree shown in FIG.
- for the date of birth, which is a specific value, a minimum range (period) including the value representing the date of birth of the patient with user identifier 00001 and the value representing the date of birth of the patient with user identifier 00005 is selected.
- information representing “month” and “day” is further truncated from the selected minimum range.
- as a result, as shown in FIG. 4B, the birth dates of the two patients in the anonymous group are converted, after abstraction, into the range “1976 to 1985” consisting only of years.
- in other words, in step S107 the anonymization unit 104 generates the quasi-identifier “(female, 1976-1985)” from the records regarding the medical date 2010/05 of the two patients with user identifiers 00001 and 00005.
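The generalization performed in step S107 can be sketched as follows. The root label "person" for the gender generalization tree is an assumption (the actual tree appears in a figure not reproduced here); the birth-date handling follows the minimal-range-then-truncate-to-years procedure described above.

```python
from datetime import date

def generalize_gender(values):
    """Climb a simple generalization tree: identical values stay as-is;
    mixed values generalize to an assumed parent node 'person'."""
    vals = set(values)
    return vals.pop() if len(vals) == 1 else "person"

def generalize_birth_dates(dates):
    """Select the minimal range covering all dates, then truncate the
    'month' and 'day' parts so only years remain."""
    years = [d.year for d in dates]
    return f"{min(years)}-{max(years)}"

# The anonymous group II-I: records 00001 and 00005 (per FIG. 11B).
group = [("female", date(1985, 1, 1)), ("female", date(1976, 5, 5))]
genders, births = zip(*group)
print((generalize_gender(genders), generalize_birth_dates(births)))
# → ('female', '1976-1985')
```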
- the anonymization device 100 executes such a series of procedures while sequentially changing the target record, thereby obtaining the anonymous groups II-I to II-IV shown in FIG. 4B. Further, by the same procedure as described above, the anonymization device 100 obtains the anonymous groups I-I to I-IV shown in FIG. 4A based on, for example, a medical history related to March 2010 (not shown) and the medical history related to April 2010 (FIG. 11A) as the history information 110.
- even when paying attention to monthly properties over the passage of time, the present invention is not limited to focusing on month n and month (n+1) as in the above-described embodiment.
- for example, the present invention may focus on non-consecutive months, such as month n and month (n+2) or month (n+3).
- moreover, history information on a desired month prior to the month to be anonymized may be referred to; the order of the passage of time is not a limitation.
- the present invention is not limited to the specific example of abstraction shown in FIG. 4B.
- the anonymization apparatus 100 performs the above-described series of procedures on the record sets of the medical treatment dates 2010/04 and 2010/05, thereby obtaining the two anonymization tables illustrated in FIGS. 4A and 4B. Generate.
- these anonymization tables are data sets in which anonymization satisfying 2-diversity has been performed on the data sets shown in FIGS. 11A and 11B while preserving the original properties shown in FIG. 11C as much as possible.
- according to the anonymization device 100, when history information is anonymized, optimal and sufficient anonymization can be performed while the original properties existing between a plurality of records having the same identifier are maintained to the maximum extent. That is, according to the present embodiment, it is possible to provide a data set that preserves many of the properties existing between a plurality of records sharing the same user identifier while satisfying the desired anonymity. Furthermore, according to this embodiment, when performing analysis or the like using the anonymized data set, many of the original properties of the original data can be preserved.
- the quasi-identifier of the record ri having a certain unique identifier is processed (ie, abstracted) so that it is difficult to distinguish from the quasi-identifier of a record having another unique identifier.
- as one abstraction method, there is a method of assigning the same quasi-identifier to the record ri and to one or more records having other unique identifiers.
- the range of the quasi-identifier included in the plurality of records to be processed may be any of the following cases.
- the number and types of quasi-identifiers of the record ri and of the other records that become common through abstraction are determined according to the anonymity (k-anonymity, l-diversity, etc.) to be satisfied.
- when abstracting a record ri having a certain unique identifier together with a plurality of records having other unique identifiers, the property pij is abstracted so as not to become ambiguous as far as possible.
- the plural types of records that should be abstracted together with the record ri having a certain unique identifier are selected based on the smallness of the ambiguity between the properties that can be inferred after the abstraction.
- the ambiguity between the multiple types of properties that can be estimated after anonymization can be measured by, for example, the number of properties estimated after anonymization, the geographical distance between the estimated properties, or the semantic distance between them.
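The simplest of these measures, counting the distinct properties that remain inferable after anonymization, can be sketched as follows (an illustration with made-up property strings, not the patent's implementation):

```python
def ambiguity_by_count(estimated_properties) -> int:
    """One measure named in the text: the number of distinct properties
    that could still be inferred after anonymization."""
    return len(set(estimated_properties))

# A grouping that leaves two plausible transitions is more ambiguous
# than one that leaves a single transition, and is therefore avoided.
print(ambiguity_by_count(["B>G>X", "B>X"]))  # → 2
print(ambiguity_by_count(["B>G>X"]))         # → 1
```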
- such ambiguity is not limited to a specific measurement method.
- as a record to be assigned a quasi-identifier common to the record ri, a record that has a property similar to pij and that reduces the ambiguity of the plurality of properties estimated after anonymization is selected.
- a plurality of records obtained by such selection are referred to as “anonymous groups”.
- abstraction is performed on the record ri and a plurality of selected records as processing targets.
- as the abstraction method, generalization, which converts a value into a more abstract concept than the original value, perturbation, which adds noise, or the like can be employed. Accordingly, abstraction may use any method as long as the desired anonymity can be satisfied, and may employ a combination of multiple methods.
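As a minimal sketch of the perturbation option (the text does not specify a noise model; bounded uniform noise and the parameter values here are assumptions):

```python
import random

def perturb(value: int, max_noise: int, rng: random.Random) -> int:
    """Perturbation: add bounded random noise to a numeric quasi-identifier."""
    return value + rng.randint(-max_noise, max_noise)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
noisy_year = perturb(1985, 2, rng)
print(1983 <= noisy_year <= 1987)  # → True
```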
- FIG. 2 is a functional block diagram showing the configuration of the anonymization apparatus 200 according to the second embodiment of the present invention.
- the anonymization apparatus 200 includes a record extraction unit 202, an anonymization unit 204, an original data storage unit 206, a property analysis unit 208, an anonymity input unit 210, and a data storage unit 212.
- FIG. 5 is a flowchart showing a procedure of control processing in the anonymization apparatus according to the second embodiment of the present invention. That is, the anonymization apparatus 200 according to the present embodiment executes step S201 prior to processing steps S203 to S209 similar to steps S101 to S107 in the flowchart shown in FIG. 3 according to the first embodiment.
- This step S201 is processing realized by the original data storage unit 206 and the property analysis unit 208 described below.
- the original data storage unit 206 can store the history information 110 acquired from the outside. It is assumed that the history information 110 includes one or more records having the same user identifier. Also in this embodiment, each history information 110 includes at least a user identifier, a quasi-identifier, and sensitive information, for example, information such as a record set shown in FIGS. 11A and 11B.
- the property analysis unit 208 reads the history information 110 stored in the original data storage unit 206 from the original data storage unit 206, and stores a plurality of records constituting the data set (FIGS. 11A and 11B) as the read history information. By analyzing, the properties existing between individual records are extracted. Examples of the analysis performed by the property analysis unit 208 include various data mining and statistical analysis methods such as co-occurrence analysis of attribute values, correlation analysis, regression analysis, and time series analysis among a plurality of records constituting the data set. is assumed. In the present embodiment, as in the first embodiment, a case of time series analysis will be described as an example.
- the property analysis unit 208 can derive the property illustrated in FIG. 11C described above when the data sets illustrated in FIGS. 11A and 11B are the analysis target.
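The time-series analysis can be sketched as follows. The data layout (month → {user identifier: disease code}) and the disease codes for user 00001 are illustrative assumptions; "B>G" for user 00002 mirrors a property appearing in FIG. 14C.

```python
def extract_transitions(monthly_records):
    """For each user, collect the chronological disease transitions
    observed between consecutive months of history information."""
    properties = {}
    months = sorted(monthly_records)
    for earlier, later in zip(months, months[1:]):
        for uid, disease in monthly_records[earlier].items():
            following = monthly_records[later].get(uid)
            if following is not None:
                properties.setdefault(uid, []).append(f"{disease}>{following}")
    return properties

records = {
    "2010/04": {"00001": "A", "00002": "B"},
    "2010/05": {"00001": "E", "00002": "G"},
}
print(extract_transitions(records))
# → {'00001': ['A>E'], '00002': ['B>G']}
```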
- the record extraction unit 202 extracts an anonymization candidate record group from the history information 110 in substantially the same manner as the record extraction unit 102 in the first embodiment described above (step S203). Then, the record extraction unit 202 extracts property storage candidate records from the extracted group (step S205).
- in the anonymity input unit 210, information on the anonymity that should be satisfied by the data set after anonymization by the anonymization unit 204 can be set from the outside.
- the anonymization unit 204 forms an anonymous group based on the property storage candidate records extracted by the record extraction unit 202, in substantially the same manner as the anonymization unit 104 in the first embodiment described above (step S207). Then, the anonymization unit 204 abstracts the quasi-identifiers included in the plurality of records forming the anonymous group, in substantially the same manner as the anonymization unit 104 in the first embodiment described above (step S209). However, during abstraction the anonymization unit 204 executes the processing so as to satisfy the anonymity set via the anonymity input unit 210. The data storage unit 212 can then store the anonymized data generated by the anonymization unit 204.
- according to the anonymization device 200, as in the first embodiment, when anonymizing history information, optimal and sufficient anonymization can be performed while the original properties existing between a plurality of records having the same identifier are maintained to the maximum extent.
- in addition, the properties of the history information 110 can be analyzed by the property analysis unit 208. For this reason, according to the present embodiment, anonymization can be realized while keeping low the abstraction level of the properties extracted by the analysis.
- FIG. 6 is a functional block diagram showing the configuration of the anonymization device 300 according to the third embodiment of the present invention.
- An anonymization apparatus 300 according to the present embodiment includes a record extraction unit 302, an anonymization unit 304, an original data storage unit 306, a property analysis unit 308, an anonymity input unit 310, a data storage unit 312, and an importance evaluation unit 314.
- that is, the anonymization device 300 according to the present embodiment differs from the second embodiment in that it further includes an importance evaluation unit 314 in addition to the configuration of the anonymization device 200 according to the second embodiment.
- an apparatus configuration that does not include at least one of the original data storage unit 306, the anonymity input unit 310, and the data storage unit 312 is also envisaged.
- the multiple types of properties handled in the above-described embodiments may differ in importance from property to property. For example, information representing the property “a user who has a certain illness will contract another illness with a high degree of certainty” or “a user who has a certain illness has a much higher possibility of contracting a specific illness than a user who does not” can be said to be more important than other properties. Therefore, in this embodiment, when it is difficult to preserve all types of properties, the importance evaluation unit 314 evaluates the importance of each property. That is, the anonymization device 300 according to the present embodiment determines the properties to be preserved based on the evaluated importance, and creates anonymous groups so that important properties can be satisfied as much as possible within the given anonymity constraint.
- FIG. 7 is a flowchart showing the procedure of the control process in the anonymization apparatus according to the third embodiment of the present invention. That is, the anonymization device 300 according to the present embodiment performs processing steps S301 and S307 to S313 that are substantially the same as steps S201 to S209 in the flowchart shown in FIG. 5 according to the second embodiment, and further includes steps Steps S303 and S305 are newly executed between S301 and S307. These steps S303 and S305 are processes realized by the importance evaluation unit 314.
- in step S301, the property analysis unit 308 extracts, based on the history information 110 stored in the original data storage unit 306, a plurality of properties possessed by each user identifier, as in step S201 in the second embodiment.
- in step S303, the importance evaluation unit 314 evaluates the importance of the plurality of properties extracted in step S301.
- in step S305, the importance evaluation unit 314 extracts highly important properties from the plurality of properties based on the evaluation result, and notifies the record extraction unit 302 of the extracted properties.
- the processing in and after step S307 by the record extraction unit 302 and the anonymization unit 304 is the same as the processing in and after step S205 shown in FIG. 5 in the second embodiment.
- when extracting records having other user identifiers, the record extraction unit 302 extracts records having other user identifiers that can preserve the highly important properties extracted by the importance evaluation unit 314 in step S305.
- An example of an important property is one that appears more frequently under certain conditions than under other conditions.
- an index representing the importance of the property for example, there is a confidence, a lift value, entropy, and the like.
- the certainty factor represents a conditional probability that a certain event occurs under a certain condition.
- the lift value represents how easily a specific event is likely to occur depending on whether a certain condition is present or not.
- Entropy represents how rare a particular event occurs.
- a property with a high certainty is treated as an important property.
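These three indices can be computed from simple counts as follows. The counts used in the example (8 records in total, the premise "B>G" appearing twice and always followed by "X") are an assumed toy data set, and "entropy" is read here as self-information, one common formalization of how rare an event is.

```python
from math import log2

def confidence(n_premise_and_conclusion: int, n_premise: int) -> float:
    """Conditional probability of the conclusion given the premise."""
    return n_premise_and_conclusion / n_premise

def lift(n_both: int, n_premise: int, n_conclusion: int, n_total: int) -> float:
    """How much more likely the conclusion is under the premise than overall."""
    return confidence(n_both, n_premise) / (n_conclusion / n_total)

def rarity(p: float) -> float:
    """Self-information -log2(p): larger for rarer events."""
    return -log2(p)

# Rule "B>G => X" over an assumed set of 8 records: the premise appears
# twice and is always followed by X, which appears twice overall.
print(confidence(2, 2))   # → 1.0
print(lift(2, 2, 2, 8))   # → 4.0
print(rarity(0.25))       # → 2.0
```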
- FIG. 14A is a diagram illustrating a table relating to history information before anonymization (medical history relating to June 2010).
- FIG. 14B is a diagram exemplifying history information before anonymization (a table related to the medical history for July 2010). These tables, like the tables shown in FIGS. 11A and 11B, compile records of different medical treatment dates for a plurality of user identifiers.
- FIG. 14C is a diagram showing an example in which the properties existing in the history information before anonymization from April to July 2010 (the data sets shown in FIGS. 11A, 11B, 14A, and 14B) are compiled in a table for each user identifier.
- the importance evaluation unit 314 evaluates the appearance frequency and the certainty factor for each of the properties illustrated in FIG. 14C in step S303.
- the data described on the rightmost side is a conclusion part, and the other parts are precondition parts.
- the premise part is “B> G” and the conclusion part is “X”.
- the certainty factor is an index representing the ratio of the appearance of the conclusion part when the premise part appears.
- the rate at which the conclusion part “X” appears when the premise part “B> G” appears is 100%. Since the property “B> G> X” appears in FIG. 14C with respect to the user identifiers 00002 and 00004, the appearance frequency is 2.
- the importance evaluation unit 314 evaluates each property shown in FIG. 14C based on the appearance frequency of the premise part.
- the threshold value is that the same property appears two or more times.
- the importance evaluation unit 314 sets a property that is equal to or higher than the threshold as a storage target.
- with reference to the threshold, the importance evaluation unit 314 extracts the three properties “B>G>X”, “B>X”, and “G>X”.
- the importance evaluation unit 314 evaluates the property based on the certainty factor, and extracts the property having the highest certainty factor as a result of the evaluation.
- the certainty factors are “B> G> X: 100%”, “B> X: 100%”, and “G> X: 100%”, respectively.
- in such a case, the importance evaluation unit 314 evaluates the plurality of properties of interest based on the length of the properties and their appearance frequency, and determines (selects) one property to be preserved.
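This tie-breaking can be sketched as follows: when certainty factors and appearance frequencies tie, the longer (more specific) property wins, which reproduces the selection of "B>G>X" in the example. The exact ordering of the criteria is an assumption.

```python
def select_property(props):
    """props maps a property string to (certainty_factor, frequency).
    Prefer higher certainty, then higher frequency, then greater length."""
    def rank(p):
        certainty, frequency = props[p]
        return (certainty, frequency, len(p.split(">")))
    return max(props, key=rank)

# The three candidates from the example all have 100% certainty and
# frequency 2, so the longest property is selected.
props = {"B>G>X": (1.0, 2), "B>X": (1.0, 2), "G>X": (1.0, 2)}
print(select_property(props))  # → B>G>X
```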
- as a result, for the record regarding July 2010 of the patient identified by the user identifier 00002 shown in FIG., the importance evaluation unit 314 selects the property “B>G>X”.
- the most important property can be determined by measuring the importance of the property in consideration of the certainty factor, the appearance frequency, the length, and the like for each user identifier.
- in step S305, the importance evaluation unit 314 extracts, for the records having the user identifiers 00002, 00004, 00006, and 00008, the properties extracted for these records as the most important properties.
- thereafter, anonymous groups are generated in the same manner as in the first and second embodiments described above while preserving this most important property. Thereby, anonymization is applied to the plurality of records whose medical treatment date is 2010/07, as shown in FIG.
- the record extraction unit 302 in the third embodiment forms an anonymous group that suppresses ambiguity of the property to be stored, as in the first and second embodiments.
- in order to satisfy 2-diversity, the record extraction unit 302 extracts, from the July records, records assigned an injury/illness name different from that of the record related to July 2010 of the patient specified by the user identifier 00001 (FIG. 14B).
- a record having all user identifiers other than the user identifier 00001 is applicable.
- the importance evaluation unit 314 evaluates the ambiguity of the property. For example, the evaluation of the ambiguity may be performed in the same manner as the evaluation of the ambiguity in the first embodiment.
- when the anonymization unit 304 forms an anonymous group (III-III) through such a procedure with the record related to the user identifier 00001 as the target record, the record having the user identifier 00003 is selected as the record with the least ambiguity.
- FIG. 9 is a diagram showing a table illustrating the result of anonymizing, in the third embodiment, the history information before anonymization illustrated in FIG. 14B while keeping the abstraction of the original properties (FIG. 14C) low.
- the anonymization device 300 uses the importance evaluation unit 314 to evaluate the importance of properties, and can thereby generate anonymous groups while preserving highly important properties from among the multiple types of properties possessed by a plurality of records given a common user identifier, and perform anonymization based on the generated anonymous groups.
- that is, appropriate anonymization can be realized while the degree of abstraction is kept low for the properties of high importance among the plurality of properties. Therefore, according to the anonymization device 300 of the present embodiment, by performing anonymization based on the anonymous groups generated as described above, anonymity can be guaranteed and important properties can be preserved at the same time.
- FIG. 8 is a functional block diagram showing the configuration of the anonymization device 400 according to the fourth embodiment of the present invention.
- An anonymization apparatus 400 according to the present embodiment includes a record extraction unit 402, an anonymization unit 404, an original data storage unit 406, a property analysis unit 408, an anonymity input unit 410, a data storage unit 412, an importance evaluation unit 414, and A property holding request receiving unit 416 is provided. That is, the anonymization device 400 according to the present embodiment is different from the third embodiment in that the anonymization device 400 further includes a property holding request reception unit 416 in addition to the configuration of the anonymization device 300 according to the third embodiment.
- a description of the configuration common to the third embodiment is omitted.
- the anonymization device 400 can accept a property to be stored as an external request using the property holding request accepting unit 416.
- the property storage request accepting unit 416 can accept information indicating the property to be preserved, input via an input interface such as a data file or a GUI (Graphical User Interface), and can store that information.
- the input method, the format, the storage method, and the communication method are not limited at all in the present invention described by taking this embodiment as an example.
- the importance evaluation unit 414 extracts the detected property as an important property to be satisfied in response to detecting the property input to the property storage request receiving unit 416. On the other hand, when the importance evaluation unit 414 does not detect the presence of such a property, the importance evaluation unit 414 performs the same operation as the importance evaluation unit 314 in the above-described third embodiment.
- FIG. 14C referred to in the third embodiment is also referred to in this embodiment.
- the importance evaluation unit 414 sets such importance high.
- when the property “A>Z” does not exist, it is not regarded as important.
- g is a coefficient that amplifies importance
- c is a coefficient that amplifies certainty
- α is a coefficient that amplifies importance.
- the importance of the property input to the property storage request accepting unit 416 is highly evaluated.
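A minimal sketch of this amplification, assuming a multiplicative form and an arbitrary coefficient value (the text names coefficients but does not fix their values or the exact formula):

```python
def amplified_importance(base: float, requested: bool, g: float = 2.0) -> float:
    """Raise the importance of a property that was explicitly entered
    into the property storage request accepting unit; g is an assumed
    amplification coefficient."""
    return base * g if requested else base

print(amplified_importance(0.5, True))   # → 1.0
print(amplified_importance(0.5, False))  # → 0.5
```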
- according to the fourth embodiment, it is possible to anonymize history information while keeping low the abstraction level of a property given from the outside, among the one or more properties of the history information.
- FIG. 10 is a diagram for exemplifying the hardware configuration of a computer (information processing apparatus) that can implement the first to fourth embodiments of the present invention.
- the hardware of the information processing apparatus (computer) 1000 shown in FIG. 10 includes a CPU 11, a communication interface (I/F) 12, an input/output user interface 13, a ROM (Read Only Memory) 14, a RAM (Random Access Memory) 15, a storage device 17, and a drive device 18 for a computer-readable storage medium 19, and these are connected via a bus 16.
- the input / output user interface 13 is a man-machine interface (user interface: UI) such as a keyboard as an example of an input device and a display as an output device.
- the communication interface 12 is a general communication means used by the anonymization devices according to the above-described embodiments (FIGS. 1, 2, 6, and 8) to communicate with an external device via the communication network 600.
- the CPU 11 governs the overall operation of the information processing apparatus 1000 as an apparatus according to each embodiment.
- the present invention described with reference to the above first to fourth embodiments is achieved by supplying a computer program capable of realizing the functions of the flowcharts (FIGS. 3, 5, and 9) referred to in the description, or of the units shown in the block diagrams, to the information processing apparatus 1000, and then reading the program into the CPU 11 and executing it.
- the storage device 17 as a hardware resource is appropriately used for various data storage units (206, 212, etc.).
- the input / output user interface 13 as a hardware resource is appropriately used for various input units and reception units (310, 410, 416).
- the program supplied to the information processing apparatus 1000 may be stored in a readable/writable temporary storage memory (15) or in a non-volatile storage device (17) such as a hard disk drive. That is, in the storage device 17, the program group 17A is, for example, a set of programs capable of realizing the functions of the units shown in the anonymization devices (100, 200, 300, 400) in the embodiments described above.
- the various stored information 17B is, for example, the history information 110 in each of the above-described embodiments, information indicating desired anonymity, or the like.
- the program can be supplied to the apparatus by a general procedure, such as installation via various computer-readable recording media (19) such as a CD-ROM or a flash memory, or downloading from the outside via a communication line (600) such as the Internet.
- the present invention can be considered to be configured by a code (program group 17A) representing the computer program or a storage medium (19) in which the code is stored.
- as a method of supplying the history information 110 to the anonymization device in each of the above-described embodiments, a method in which the user supplies it using the input/output user interface 13 or the like, or a method in which it is supplied from an external device that can communicate with the anonymization device (so-called M2M: Machine to Machine), or the like can be employed.
- a method for supplying desired anonymity (anonymity information) to the anonymity input unit (210, 310, 410) in the second to fourth embodiments described above, and property preservation in the fourth embodiment As a method for supplying the property (property information) to the request receiving unit 416, a method in which the user supplies using the UI, a method in which the user supplies from an external device that can communicate with the anonymization device, or the like can be employed.
- the anonymization device handles, as an example, the property of injury and illness transition with the passage of time, as described in the “Background Art” column for convenience of explanation.
- the present invention described by taking the above-described embodiments as an example is not limited to such a property (transition of injury and illness with the passage of time), and can be applied to various properties.
- for example, the present invention can also be applied to properties related to co-occurrence relationships between injuries and illnesses.
- in the above-described embodiments, the properties existing for each user identifier were targeted as an example.
- however, the present invention is not limited to the properties exemplarily adopted in these embodiments; for example, it may also be applied to a case where a common property is preserved (maintained) for user identifiers having a common quasi-identifier (the same quasi-identifier).
- The anonymization device according to supplementary note 1, wherein the record extraction means extracts, from among the records having the other unique identifiers that should be abstracted together with the record having the specific unique identifier, a record whose inferable ambiguity after abstraction is the smallest.
- The anonymization device according to appendix 1 or appendix 2, wherein the record extraction means groups together, as one group, the record provided with the specific unique identifier and the extracted records provided with other unique identifiers different from the specific unique identifier, and the anonymization means performs the abstraction in units of the groups.
- (Appendix 4) The anonymization device according to any one of appendices 1 to 3, further comprising property analysis means for extracting the property from the history information by analyzing a plurality of records constituting the history information.
- The anonymization device according to any one of appendices 1 to 4, further comprising evaluation means for evaluating, when there are multiple types of the properties, the importance of those properties, and thereby selecting an important property to be prioritized when the record extraction means performs extraction.
- (Appendix 6) The anonymization device according to appendix 5, further comprising request accepting means capable of accepting a request regarding a property desired to be preserved among the plurality of properties.
- The anonymization device according to appendix 6, wherein the evaluation means evaluates the importance of the properties extracted by the property analysis means after making the importance of the property entered into the request accepting means higher than that of the other properties.
- (Appendix 8) The anonymization device according to appendix 7, further comprising setting means for setting the method by which the evaluation means evaluates the importance of the properties.
- (Supplementary note 9) The anonymization method, wherein a record whose inferable ambiguity after abstraction is the smallest is extracted from among the records having the other unique identifiers that should be abstracted together with the record having the specific unique identifier.
- A computer program for causing a computer to execute: a record extraction function of extracting, from history information, a record given another unique identifier different from a specific unique identifier, based on the smallness of the ambiguity of a property existing between a plurality of records sharing the specific unique identifier in common, such that desired anonymity can be satisfied from the history information; and an anonymization function of performing abstraction by giving commonality to the quasi-identifiers respectively included in the extracted records, such that the individual attributes of the plurality of records extracted by the record extraction function satisfy the desired anonymity.
- The computer program according to Appendix 12 or Appendix 13, wherein the record extraction function groups together, as one group, the record given the specific unique identifier and the extracted records given other unique identifiers different from the specific unique identifier, and the anonymization function performs the abstraction in units of the group.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
・B>G,
・C>H,
Here, the operator ">" represents the chronological order of the individual properties; for example, X>Y means that state X is followed by state Y (the same applies throughout the following description of the present application).
・B>E,B>G,
・C>F,C>H,
That is, to explain the above example of time-series properties after anonymization concretely, first consider focusing on patients who contracted disease A in April 2010. In this case, the patients whose dates of birth fall within the four periods shown in FIG. 12A ("1976-1985", "1975-1979", "1972-1976", "1951-1963") are the targets. Focusing on the same date-of-birth periods for these patients in FIG. 12B, the following properties can be read:
・the property that diseases E and G are contracted in the period "1975-1979",
・the property that diseases F and H are contracted in the period "1972-1976", and
・the property that diseases F and H are contracted in the period "1951-1963".
・the property that diseases E and G are also contracted in the period "1975-1979".
・the property that diseases F and H are also contracted in the period "1951-1963".
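As a rough illustration (not part of the claimed invention), a time-series property denoted by the ">" operator, such as "a patient in state X later enters state Y", can be checked programmatically. The record layout, field names, and sample data below are hypothetical:

```python
from datetime import date

# Hypothetical diagnosis history: (patient_id, disease, diagnosis_date) tuples.
history = [
    ("p1", "A", date(2010, 4, 1)),
    ("p1", "E", date(2010, 6, 1)),
    ("p1", "G", date(2010, 9, 1)),
    ("p2", "A", date(2010, 4, 10)),
    ("p2", "E", date(2010, 7, 1)),
]

def holds(history, patient, x, y):
    """Return True if the property x > y holds for the patient,
    i.e. the patient has state x and later enters state y."""
    xs = [d for p, s, d in history if p == patient and s == x]
    ys = [d for p, s, d in history if p == patient and s == y]
    return any(dx < dy for dx in xs for dy in ys)

print(holds(history, "p1", "E", "G"))  # True: E precedes G for p1
print(holds(history, "p2", "E", "G"))  # False: p2 never contracts G
```

A property preserved through anonymization would keep `holds` returning the same values when evaluated over the generalized date-of-birth periods instead of individual patients.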
Record extraction means for extracting, from history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied; and
anonymization means for performing abstraction by giving commonality to the quasi-identifiers contained in the respective records, so that the individual attributes of the plurality of records extracted by the record extraction means satisfy the desired anonymity.
For history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, a computer extracts from the history information records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied, and
the computer, or a different computer, performs abstraction by giving commonality to the quasi-identifiers contained in the respective extracted records, so that the individual attributes of the plurality of extracted records satisfy the desired anonymity.
First, an anonymization device according to a first embodiment of the present invention will be described. FIG. 1 is a functional block diagram showing the configuration of an anonymization device 100 according to the first embodiment of the present invention. The anonymization device 100 includes a record extraction unit 102 and an anonymization unit 104. The anonymization device 100 performs anonymization based on history information 110. In this embodiment, the anonymization device 100 obtains the property concerning the history information 110 of interest from, for example, an external device.
First, the operation of the record extraction unit 102 will be described. The record extraction unit 102 extracts, from the history information 110, a plurality of records necessary for satisfying the desired anonymity (step S101). Then, from among the plurality of records extracted in step S101, the record extraction unit 102 selects the records whose ambiguity regarding the property is smallest (step S103). The processing procedures of these two steps are described in detail below.
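The selection in step S103 can be sketched roughly as follows. This is an illustrative simplification, not the patented algorithm: the "ambiguity" measure here (size of the symmetric difference between property sets) and all record fields are stand-ins of my own choosing:

```python
def pick_least_ambiguous(target_props, candidates):
    """Among candidate records, pick the one whose property set
    diverges least from the target's. The ambiguity measure used
    here (symmetric difference of property sets) is a hypothetical
    stand-in for the patent's notion of property ambiguity."""
    def ambiguity(rec):
        return len(target_props ^ rec["props"])
    return min(candidates, key=ambiguity)

# Properties of the record with the specific unique identifier.
target = {"A>E", "E>G"}
candidates = [
    {"id": "u2", "props": {"A>E", "E>G"}},   # identical properties
    {"id": "u3", "props": {"A>F", "F>H"}},   # entirely different
]
print(pick_least_ambiguous(target, candidates)["id"])  # u2
```

Grouping the target with `u2` rather than `u3` keeps the time-series property intact after the quasi-identifiers are generalized.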
Next, the operation of the anonymization unit 104 will be described. The anonymization unit 104 extracts, from the plurality of records selected in step S103 (the property-preservation candidate records), a plurality of records that form an anonymous group (step S105). Then, the anonymization unit 104 anonymizes the quasi-identifiers of the plurality of records extracted in step S105 (that is, of the anonymous group) (step S107). In other words, in step S107, the anonymization unit 104 abstracts the quasi-identifiers contained in the respective records belonging to the anonymous group of interest. The processing procedures of these two steps are described in detail below.
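The abstraction of step S107 amounts to giving the quasi-identifiers of one anonymous group a common, generalized value. A minimal sketch, assuming a numeric quasi-identifier (birth year) and hypothetical field names; the real device handles arbitrary quasi-identifier types:

```python
def generalize_group(group, qi_key="birth_year"):
    """Abstract the quasi-identifier of every record in one anonymous
    group by replacing each value with the group's common range
    [min, max], so the records become indistinguishable on it."""
    lo = min(r[qi_key] for r in group)
    hi = max(r[qi_key] for r in group)
    return [{**r, qi_key: (lo, hi)} for r in group]

# One hypothetical anonymous group (k = 3).
group = [
    {"id": "u1", "birth_year": 1976, "disease": "A"},
    {"id": "u2", "birth_year": 1981, "disease": "A"},
    {"id": "u3", "birth_year": 1985, "disease": "A"},
]

for rec in generalize_group(group):
    print(rec["birth_year"])  # every record now carries (1976, 1985)
```

After this step all three records share the period "1976-1985", matching the kind of generalized date-of-birth periods shown in FIG. 12A.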
For example, based on a medical history for March 2010 (not shown) and a medical history for April 2010 (FIG. 11A), the anonymous groups I-I to I-IV shown in FIG. 4A are obtained.
(2) The case where the value range of one record and the value range of the other record partially overlap.
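Case (2), partially overlapping value ranges, can be detected with a simple interval test. The closed-interval representation below is a hypothetical choice for illustration:

```python
def partially_overlap(a, b):
    """True if closed intervals a and b overlap without one
    containing the other (the 'partial overlap' case (2))."""
    overlaps = a[0] <= b[1] and b[0] <= a[1]
    contains = (a[0] <= b[0] and b[1] <= a[1]) or (b[0] <= a[0] and a[1] <= b[1])
    return overlaps and not contains

print(partially_overlap((1972, 1976), (1975, 1979)))  # True: shares 1975-1976
print(partially_overlap((1972, 1976), (1951, 1963)))  # False: disjoint
```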
Next, a second embodiment based on the above-described first embodiment will be described. The following description focuses on the characteristic parts of this embodiment; duplicate descriptions of configurations similar to those of the above-described first embodiment are omitted.
Next, a third embodiment based on the above-described first and second embodiments will be described. The following description focuses on the characteristic parts of this embodiment; duplicate descriptions of configurations similar to those of the above-described first and second embodiments are omitted.
Next, a fourth embodiment based on the above-described first to third embodiments will be described. The following description focuses on the characteristic parts of this embodiment; duplicate descriptions of configurations similar to those of the above-described first to third embodiments are omitted.
Here, an example hardware configuration capable of realizing the devices according to the above-described embodiments will be described. FIG. 10 is a diagram exemplarily illustrating the hardware configuration of a computer (information processing device) capable of realizing the first to fourth embodiments of the present invention.
(Appendix 1)
Record extraction means for extracting, from history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied; and
anonymization means for performing abstraction by giving commonality to the quasi-identifiers contained in the respective records, so that the individual attributes of the plurality of records extracted by the record extraction means satisfy the desired anonymity.
An anonymization device characterized by comprising the above.
The record extraction means
extracts, from among the records having the other unique identifiers that are to be abstracted together with the record having the specific unique identifier, the record whose inferable ambiguity after abstraction is smallest.
The anonymization device according to Appendix 1, characterized by the above.
The record extraction means
groups together, as one group, the record given the specific unique identifier and the extracted records given other unique identifiers different from the specific unique identifier, and
the anonymization means
performs the abstraction in units of the group.
The anonymization device according to Appendix 1 or Appendix 2, characterized by the above.
Further comprising property analysis means for extracting the property from the history information by analyzing the plurality of records constituting the history information.
The anonymization device according to any one of Appendices 1 to 3, characterized by the above.
Further comprising evaluation means for, when multiple types of the property exist, evaluating the importance of those properties and thereby
selecting an important property to be prioritized in the extraction by the record extraction means.
The anonymization device according to any one of Appendices 1 to 4, characterized by the above.
Further comprising request accepting means capable of receiving an entry of a request concerning a property whose preservation is desired among the multiple types of properties.
The anonymization device according to Appendix 5, characterized by the above.
The evaluation means
raises the importance of the property entered into the request accepting means above that of the other properties, and then evaluates the importance of the properties extracted by the property analysis means.
The anonymization device according to Appendix 6, characterized by the above.
Further comprising setting means for setting the method by which the evaluation means evaluates the importance of properties.
The anonymization device according to Appendix 7, characterized by the above.
For history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, a computer extracts from the history information records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied, and
the computer, or a different computer, performs abstraction by giving commonality to the quasi-identifiers contained in the respective extracted records, so that the individual attributes of the plurality of extracted records satisfy the desired anonymity.
An anonymization method characterized by the above.
In the extraction,
from among the records having the other unique identifiers that are to be abstracted together with the record having the specific unique identifier, the record whose inferable ambiguity after abstraction is smallest is extracted.
The anonymization method according to Appendix 9, characterized by the above.
In the extraction, the record given the specific unique identifier and the extracted records given other unique identifiers different from the specific unique identifier are grouped together as one group, and
in the anonymization, the abstraction is performed in units of the group.
The anonymization method according to Appendix 9, characterized by the above.
A record extraction function of extracting, from history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied; and
an anonymization function of performing abstraction by giving commonality to the quasi-identifiers contained in the respective records, so that the individual attributes of the plurality of records extracted by the record extraction means satisfy the desired anonymity;
A computer program characterized by causing a computer to execute the above functions.
The record extraction function
extracts, from among the records having the other unique identifiers that are to be abstracted together with the record having the specific unique identifier, the record whose inferable ambiguity after abstraction is smallest.
The computer program according to Appendix 12, characterized by the above.
The record extraction function
groups together, as one group, the record given the specific unique identifier and the extracted records given other unique identifiers different from the specific unique identifier, and
the anonymization function
performs the abstraction in units of the group.
The computer program according to Appendix 12 or Appendix 13, characterized by the above.
12 Communication interface (I/F)
13 Input/output user interface
14 ROM
15 RAM
16 Bus
17 Storage device
18 Drive device
19 Storage medium
100, 200, 300, 400 Anonymization device
102, 202, 302, 402 Record extraction unit
104, 204, 304, 404 Anonymization unit
110 History information
206, 306, 406 Original data storage unit
208, 308, 408 Property analysis unit
210, 310, 410 Anonymity input unit
212, 312, 412 Data storage unit
314, 414 Importance evaluation unit
416 Property preservation request accepting unit
600 Communication network
1000 Information processing device (computer)
Claims (10)
- An anonymization device characterized by comprising: record extraction means for extracting, from history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied; and
anonymization means for performing abstraction by giving commonality to the quasi-identifiers contained in the respective records, so that the individual attributes of the plurality of records extracted by the record extraction means satisfy the desired anonymity.
- The anonymization device according to claim 1, wherein the record extraction means extracts, from among the records having the other unique identifiers that are to be abstracted together with the record having the specific unique identifier, the record whose inferable ambiguity after abstraction is smallest.
- The anonymization device according to claim 1 or claim 2, wherein the record extraction means groups together, as one group, the record given the specific unique identifier and the extracted records given other unique identifiers different from the specific unique identifier, and the anonymization means performs the abstraction in units of the group.
- The anonymization device according to any one of claims 1 to 3, further comprising property analysis means for extracting the property from the history information by analyzing the plurality of records constituting the history information.
- The anonymization device according to any one of claims 1 to 4, further comprising evaluation means for, when multiple types of the property exist, evaluating the importance of those properties and thereby selecting an important property to be prioritized in the extraction by the record extraction means.
- The anonymization device according to claim 5, further comprising request accepting means capable of receiving an entry of a request concerning a property whose preservation is desired among the multiple types of properties.
- The anonymization device according to claim 6, wherein the evaluation means raises the importance of the property entered into the request accepting means above that of the other properties, and then evaluates the importance of the properties extracted by the property analysis means.
- An anonymization method characterized in that, for history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, a computer extracts from the history information records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied, and
the computer, or a different computer, performs abstraction by giving commonality to the quasi-identifiers contained in the respective extracted records, so that the individual attributes of the plurality of extracted records satisfy the desired anonymity.
- The anonymization method according to claim 8, wherein, in the extraction, from among the records having the other unique identifiers that are to be abstracted together with the record having the specific unique identifier, the record whose inferable ambiguity after abstraction is smallest is extracted.
- A computer program characterized by causing a computer to execute: a record extraction function of extracting, from history information containing a plurality of records in which at least a quasi-identifier and sensitive information are associated with unique identification information, records given other unique identifiers different from a specific unique identifier, based on the smallness of the ambiguity of a property existing among a plurality of records that share the specific unique identifier, such that the desired anonymity can be satisfied; and
an anonymization function of performing abstraction by giving commonality to the quasi-identifiers contained in the respective records, so that the individual attributes of the plurality of records extracted by the record extraction means satisfy the desired anonymity.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP12858495.0A EP2793162A4 (en) | 2011-12-15 | 2012-12-06 | ANONYMIZATION DEVICE, ANONYMIZATION METHOD AND COMPUTER PROGRAM |
US14/365,615 US20140317756A1 (en) | 2011-12-15 | 2012-12-06 | Anonymization apparatus, anonymization method, and computer program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011-274791 | 2011-12-15 | ||
JP2011274791 | 2011-12-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013088681A1 true WO2013088681A1 (ja) | 2013-06-20 |
Family
ID=48612158
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/007825 WO2013088681A1 (ja) | 2011-12-15 | 2012-12-06 | 匿名化装置、匿名化方法、並びにコンピュータ・プログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US20140317756A1 (ja) |
EP (1) | EP2793162A4 (ja) |
JP (1) | JPWO2013088681A1 (ja) |
WO (1) | WO2013088681A1 (ja) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015007885A (ja) * | 2013-06-25 | 2015-01-15 | NEC Corporation | Information processing device and data processing method |
WO2015079647A1 (ja) * | 2013-11-28 | 2015-06-04 | NEC Corporation | Information processing device and information processing method |
JP2017516194A (ja) * | 2014-03-26 | 2017-06-15 | Alcatel-Lucent | Anonymization of streaming data |
WO2019168144A1 (ja) * | 2018-03-02 | 2019-09-06 | NEC Corporation | Information processing device, information processing system, information processing method, and recording medium |
KR102058030B1 (ko) | 2017-09-20 | 2019-12-20 | TG360 Technologies Co., Ltd. | Anonymity maintenance system and method |
CN113127924A (zh) * | 2019-12-30 | 2021-07-16 | Industrial Technology Research Institute | Data anonymization method and data anonymization system |
KR20230105569A (ko) * | 2022-01-04 | 2023-07-11 | BC Card Co., Ltd. | Method and device for determining a predicted value for an anonymous value |
JP7374796B2 (ja) | 2020-02-07 | 2023-11-07 | NS Solutions Corporation | Information processing device, information processing method, and program |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160203333A1 (en) * | 2012-08-20 | 2016-07-14 | Thomson Licensing | Method and apparatus for utility-aware privacy preserving mapping against inference attacks |
US20140380489A1 (en) * | 2013-06-20 | 2014-12-25 | Alcatel-Lucent Bell Labs France | Systems and methods for data anonymization |
US9489538B2 (en) * | 2014-01-02 | 2016-11-08 | Alcatel Lucent | Role-based anonymization |
CA2852253A1 (en) * | 2014-05-23 | 2015-11-23 | University Of Ottawa | System and method for shifting dates in the de-identification of datesets |
US10354303B1 (en) * | 2015-02-27 | 2019-07-16 | Intuit Inc. | Verification of rental and mortgage payment history |
CN106529110A (zh) * | 2015-09-09 | 2017-03-22 | 阿里巴巴集团控股有限公司 | 一种用户数据分类的方法和设备 |
TWI644224B (zh) | 2017-10-18 | 2018-12-11 | 財團法人工業技術研究院 | 資料去識別化方法、資料去識別化裝置及執行資料去識別化方法的非暫態電腦可讀取儲存媒體 |
US11360972B2 (en) * | 2019-03-27 | 2022-06-14 | Sap Se | Data anonymization in database management systems |
KR20210003667A (ko) * | 2019-07-02 | 2021-01-12 | 현대자동차주식회사 | M2m 시스템에서 보호 데이터를 처리하는 방법 및 장치 |
US11641346B2 (en) * | 2019-12-30 | 2023-05-02 | Industrial Technology Research Institute | Data anonymity method and data anonymity system |
US11983278B2 (en) * | 2021-03-18 | 2024-05-14 | Tata Consultancy Services Limited | System and method for data anonymization using optimization techniques |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011180839A (ja) | 2010-03-01 | 2011-09-15 | Kddi Corp | プライバシー情報評価サーバ、データ管理方法およびプログラム |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002084531A2 (en) * | 2001-04-10 | 2002-10-24 | Univ Carnegie Mellon | Systems and methods for deidentifying entries in a data source |
US8631500B2 (en) * | 2010-06-29 | 2014-01-14 | At&T Intellectual Property I, L.P. | Generating minimality-attack-resistant data |
CA2679800A1 (en) * | 2008-09-22 | 2010-03-22 | University Of Ottawa | Re-identification risk in de-identified databases containing personal information |
US8112422B2 (en) * | 2008-10-27 | 2012-02-07 | At&T Intellectual Property I, L.P. | Computer systems, methods and computer program products for data anonymization for aggregate query answering |
WO2012063546A1 (ja) * | 2010-11-09 | 2012-05-18 | NEC Corporation | Anonymization device and anonymization method |
-
2012
- 2012-12-06 WO PCT/JP2012/007825 patent/WO2013088681A1/ja active Application Filing
- 2012-12-06 US US14/365,615 patent/US20140317756A1/en not_active Abandoned
- 2012-12-06 JP JP2013549106A patent/JPWO2013088681A1/ja active Pending
- 2012-12-06 EP EP12858495.0A patent/EP2793162A4/en not_active Withdrawn
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011180839A (ja) | 2010-03-01 | 2011-09-15 | Kddi Corp | プライバシー情報評価サーバ、データ管理方法およびプログラム |
Non-Patent Citations (6)
Title |
---|
L. SWEENEY: "k-anonymity: a model for protecting privacy", INTERNATIONAL JOURNAL ON UNCERTAINTY, FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, vol. 10, no. 5, 2002, pages 555 - 570 |
MACHANAVAJJHALA, A.; KIFER, D.; GEHRKE, J.; VENKITASUBRAMANIAM, M.: "1-diversity: Privacy beyond k-anonymity", ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA (TKDD, vol. 1, no. 1, 2007, pages 3 |
O. ABUL; F. BONCHI; M. NANNI: "Never Walk Alone: Uncertainty for Anonymity in Moving Objects Databases", PROCEEDINGS OF 24TH IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING, 2008, pages 376 - 385 |
See also references of EP2793162A4 * |
SHUNSUKE MURAMOTO: "Haikei Chishiki o Mochiita Suisoku o Konnan nishi Data Waikyokudo o Kyokushoka suru Privacy Hogo Shuho", THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS DAI 19 KAI DATA KOGAKU WORKSHOP RONBUNSHU, vol. CL-4, 7 April 2008 (2008-04-07), pages 1 - 8, XP055153042 * |
YUKI TOYODA: "Tokumeika Group-kan no Yososu no Henka o Hikaku Kano na Tokumeika Shuho no Jitsugen", CSS2011 COMPUTER SECURITY SYMPOSIUM 2011 RONBUNSHU HEISAI MALWARE TAISAKU KENKYU JINZAI IKUSEI WORKSHOP 2011 IPSJ SYMPOSIUM SERIES, vol. 2011, no. 3, 12 October 2011 (2011-10-12), pages 432 - 437, XP008173270 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015007885A (ja) * | 2013-06-25 | 2015-01-15 | NEC Corporation | Information processing device and data processing method |
WO2015079647A1 (ja) * | 2013-11-28 | 2015-06-04 | NEC Corporation | Information processing device and information processing method |
JPWO2015079647A1 (ja) * | 2013-11-28 | 2017-03-16 | NEC Corporation | Information processing device and information processing method |
JP2017516194A (ja) * | 2014-03-26 | 2017-06-15 | Alcatel-Lucent | Anonymization of streaming data |
KR102058030B1 (ko) | 2017-09-20 | 2019-12-20 | TG360 Technologies Co., Ltd. | Anonymity maintenance system and method |
WO2019168144A1 (ja) * | 2018-03-02 | 2019-09-06 | NEC Corporation | Information processing device, information processing system, information processing method, and recording medium |
JPWO2019168144A1 (ja) * | 2018-03-02 | 2020-12-10 | NEC Corporation | Information processing device, information processing system, information processing method, and program |
JP7151759B2 (ja) | 2018-03-02 | 2022-10-12 | NEC Corporation | Information processing device, information processing method, and program |
CN113127924A (zh) * | 2019-12-30 | 2021-07-16 | Industrial Technology Research Institute | Data anonymization method and data anonymization system |
JP7374796B2 (ja) | 2020-02-07 | 2023-11-07 | NS Solutions Corporation | Information processing device, information processing method, and program |
KR20230105569A (ko) * | 2022-01-04 | 2023-07-11 | BC Card Co., Ltd. | Method and device for determining a predicted value for an anonymous value |
KR102627734B1 (ko) * | 2022-01-04 | 2024-01-23 | BC Card Co., Ltd. | Method and device for determining a predicted value for an anonymous value |
Also Published As
Publication number | Publication date |
---|---|
US20140317756A1 (en) | 2014-10-23 |
JPWO2013088681A1 (ja) | 2015-04-27 |
EP2793162A4 (en) | 2015-09-16 |
EP2793162A1 (en) | 2014-10-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2013088681A1 (ja) | Anonymization device, anonymization method, and computer program | |
US9230132B2 (en) | Anonymization for data having a relational part and sequential part | |
JP6007969B2 (ja) | Anonymization device and anonymization method | |
US10242213B2 (en) | Asymmetric journalist risk model of data re-identification | |
EP3477528B1 (en) | Data anonymization in an in-memory database | |
US9990515B2 (en) | Method of re-identification risk measurement and suppression on a longitudinal dataset | |
JP6471699B2 (ja) | Information determination device, information determination method, and program | |
US20170161519A1 (en) | Information processing device, information processing method and recording medium | |
CA2852253A1 (en) | System and method for shifting dates in the de-identification of datesets | |
WO2014181541A1 (ja) | Information processing device for verifying anonymity and anonymity verification method | |
Kieseberg et al. | Protecting anonymity in data-driven biomedical science | |
JP2013190838A (ja) | Information anonymization system, information loss determination method, and information loss determination program | |
US20140089657A1 (en) | Recording medium storing data processing program, data processing apparatus and data processing system | |
Bewong et al. | A relative privacy model for effective privacy preservation in transactional data | |
JP6127774B2 (ja) | Information processing device and data processing method | |
KR102419256B1 (ko) | Method and apparatus for providing community services based on medical information | |
Yao et al. | Privacy preservation in publishing electronic health records based on perturbation | |
El Ouazzani et al. | Proximity measurement for hierarchical categorical attributes in Big Data | |
Prada et al. | Avoiding disclosure of individually identifiable health information: a literature review | |
KR102210395B1 (ko) | Method and apparatus for providing community services based on medical information | |
Chicha et al. | Exposing safe correlations in transactional datasets | |
Ina et al. | Anonymization and analysis of horizontally and vertically divided user profile databases with multiple sensitive attributes | |
Mahiou et al. | Anonymity, Utility and Risk: an Overview | |
Acharjya et al. | Improved Anonymization Algorithms for Hiding Sensitive Information in Hybrid Information System. | |
JP2016110472A (ja) | Information processing device, information processing method, and program | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12858495 Country of ref document: EP Kind code of ref document: A1 |
|
REEP | Request for entry into the european phase |
Ref document number: 2012858495 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2012858495 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2013549106 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14365615 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |