US10403392B1 - Data de-identification methodologies - Google Patents
Data de-identification methodologies
- Publication number
- US10403392B1 (application US14/102,522)
- Authority
- US
- United States
- Prior art keywords
- identified
- database
- field
- value
- patient
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active - Reinstated, expires
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
-
- G06Q50/24—
Definitions
- the present invention generally relates to data de-identification.
- De-identification of personal information is a challenge in any domain in which the use of individual identifiable information has privacy implications. De-identification of such data can be desirable, but can be a complex task.
- Various existing scripts which claim to perform “de-identification” merely scramble data, e.g. by changing “John Smith” and “Jane Doe” to “John Doe” and “Jane Smith”.
- other fields may be jumbled or may include data that isn't semantically accurate.
- such a script not only fails to meet HIPAA requirements, but also does not produce ‘useful’ data for support and testing purposes, as the de-identified data doesn't resemble actual data.
- the present invention includes many aspects and features. Moreover, while many aspects and features relate to, and are described in, the context of medical data, the present invention is not limited to use only in this context, as will become apparent from the following summaries and detailed descriptions of aspects, features, and one or more embodiments of the present invention.
- one aspect of the present invention relates to a method comprising de-identifying a data table containing protected health information by receiving, from a user via an input device associated with an electronic device, input corresponding to mapping, in a mappings file, of each of a plurality of columns of a data table to a respective data type, determining, based on the mappings file, that a first column of the data table is associated with a first data type, accessing each value of the first column, and, for each respective accessed value, automatically generating a de-identified value, based on the identification of the first data type as being associated with the first column in the mappings file, one or more computer logic instructions associated with de-identification of data of the first data type, and the respective accessed value, by applying the one or more computer logic instructions associated with de-identification of the first data type to the respective accessed value to result in a respective de-identified value, and saving the respective de-identified value back to the data table in place of the respective accessed value, determining, based on
- the first column contains first name data.
- the first column contains last name data.
- the first column contains middle initial data.
- the first column contains birthdate data.
- the first column contains date of death data.
- the first column contains social security number data.
- the first column contains driver's license number data.
- the first column contains patient ID data.
- the first column contains employer data.
- the first column contains URL data.
- the first column contains email address data.
- the first column contains address data.
- the first column contains phone number data.
- the first column contains health plan beneficiary data.
- the first column contains account number data.
- the first column contains vehicle identification number data.
- the first column contains license plate data.
- Another aspect relates to a method comprising de-identifying a data table containing protected health information by receiving, from a user via an input device associated with an electronic device, input corresponding to mapping, in a mappings file, of each of a plurality of columns of a data table to a respective data type, determining, based on the mappings file, that a first column of the data table is associated with a first data type, accessing each value of the first column, and, for each respective accessed value, automatically generating a de-identified value, based on the identification of the first data type as being associated with the first column in the mappings file, one or more computer logic instructions associated with de-identification of data of the first data type, and the respective accessed value, by applying the one or more computer logic instructions associated with de-identification of the first data type to the respective accessed value to result in a respective de-identified value, and saving the respective de-identified value back to the data table in place of the respective accessed value, determining, based on the mappings file, that
- Another aspect relates to a non-transitory computer readable medium containing computer executable instructions configured to perform a method comprising determining, based on a mappings file, that a first column of a data table is associated with a first data type, accessing each value of the first column, and, for each respective accessed value, automatically generating a de-identified value, based on the identification of the first data type as being associated with the first column in the mappings file, one or more computer logic instructions associated with de-identification of data of the first data type, and the respective accessed value, by applying the one or more computer logic instructions associated with de-identification of the first data type to the respective accessed value to result in a respective de-identified value, and saving the respective de-identified value back to the data table in place of the respective accessed value, determining, based on the mappings file, that a second column of the data table is associated with a second data type, accessing each value of the second column, and, for each respective accessed value, automatically generating a de
- Another aspect relates to a method comprising de-identifying a data table containing protected health information by receiving, from a user via an input device associated with an electronic device, input corresponding to mapping, in a mappings file, of each of a plurality of columns of a data table to a respective data type, determining, based on the mappings file, that a first column of the data table is associated with a first data type, accessing each value of the first column, and, for each respective accessed value, automatically generating a de-identified value, based on the identification of the first data type as being associated with the first column in the mappings file, one or more computer logic instructions associated with de-identification of data of the first data type, and the respective accessed value, by applying the one or more computer logic instructions associated with de-identification of the first data type to the respective accessed value to result in a respective de-identified value, and saving the respective de-identified value back to the data table in place of the respective accessed value; and utilizing the de-identified data table to test
- FIG. 1 illustrates the de-identification of original data from a database.
- FIG. 2 illustrates an exemplary sequence flow and exemplary interaction of various components of an exemplary tool in accordance with one or more preferred implementations.
- FIG. 3 illustrates an exemplary mappings file in accordance with one or more preferred implementations.
- FIG. 4 illustrates the de-identification of a database based on the mappings file of FIG. 3 .
- any embodiment may incorporate only one or a plurality of the above-disclosed aspects of the invention and may further incorporate only one or a plurality of the above-disclosed features.
- any embodiment discussed and identified as being “preferred” is considered to be part of a best mode contemplated for carrying out the present invention.
- Other embodiments also may be discussed for additional illustrative purposes in providing a full and enabling disclosure of the present invention.
- many embodiments, such as adaptations, variations, modifications, and equivalent arrangements, will be implicitly disclosed by the embodiments described herein and fall within the scope of the present invention.
- any sequence(s) and/or temporal order of steps of various processes or methods that are described herein are illustrative and not restrictive. Accordingly, it should be understood that, although steps of various processes or methods may be shown and described as being in a sequence or temporal order, the steps of any such processes or methods are not limited to being carried out in any particular sequence or order, absent an indication otherwise. Indeed, the steps in such processes or methods generally may be carried out in various different sequences and orders while still falling within the scope of the present invention. Accordingly, it is intended that the scope of patent protection afforded the present invention is to be defined by the appended claims rather than the description set forth herein.
- a picnic basket having an apple describes “a picnic basket having at least one apple” as well as “a picnic basket having apples.”
- a picnic basket having a single apple describes “a picnic basket having only one apple.”
- a de-identification utility is configured to de-identify healthcare data and replace it with realistic “fake” data.
- such replacement is in compliance with privacy requirements of the Health Insurance Portability and Accountability Act (HIPAA), which establishes minimum standards for protecting the privacy of all “individually identifiable health information”, sometimes referred to as Protected Health Information (PHI), held or transmitted by a covered entity or its business associate, in any form of media, whether electronic, paper or oral.
- a de-identification tool functions utilizing a data source and a list of fields that need to be de-identified. In one or more preferred implementations, such a list is included in a mappings file.
- the de-identification tool reads data from the data source, generates realistic “fake” data based on the read data in accordance with an appropriate one of eighteen HIPAA data types in the same format as the read data, and writes the generated “fake” data to a data store (which may or may not be the same as the original data source).
- FIG. 1 illustrates the de-identification of original data from a database.
- data from an original database is utilized to generate de-identified data, which is then used to populate a de-identified database.
- the de-identified data in the de-identified database is free of identifiable information, and can, for example, be used for support and developmental purposes free from the risks associated with storing identifiable health information. It will be appreciated that although FIG. 1 only illustrates three exemplary fields from such a database, these are merely exemplary fields, and the database may (and typically would) have more fields.
- a database would frequently additionally include information which would not be de-identified, and would simply be transferred from the original data set to the new de-identified data set.
- a data set containing patient identifying information and health information for such patients might be de-identified to result in a data set which includes the same health information, but which would simply include “fake” personal information, rather than actual PHI.
- Such a de-identified data set could subsequently be used for testing and developmental purposes without risking exposure of PHI.
- a de-identification tool is written in C# using Dependency Injection to allow for multiple configurations based on the objects being de-identified.
- a Reader object reads original data, one or more Generators create/transform data based on the data types of original data, and a Writer object writes the created/transformed data to an output stream.
- a DatabaseReader that reads data from a Database
- HipaaGenerators that create fake data according to each of the eighteen HIPAA data types
- a DatabaseWriter that writes generated data (e.g. either back to the database or to a new database).
- each of the HipaaGenerators corresponds to one of the eighteen data types identified by HIPAA.
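The Reader, Generator, and Writer arrangement described above can be sketched as follows. The text states that the tool is written in C# with dependency injection; this Python sketch only illustrates the same idea with constructor injection, and every name in it (`ListReader`, `ListWriter`, `DeIdentifier`, the column names) is a hypothetical stand-in rather than the tool's actual API.

```python
from typing import Callable, Dict, Iterable, List, Protocol

Row = Dict[str, str]
Generator = Callable[[str], str]


class Reader(Protocol):
    def read(self) -> Iterable[Row]: ...


class Writer(Protocol):
    def write(self, row: Row) -> None: ...


class ListReader:
    """Hypothetical stand-in for a DatabaseReader; yields rows from memory."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def read(self) -> Iterable[Row]:
        return iter(self._rows)


class ListWriter:
    """Hypothetical stand-in for a DatabaseWriter; collects rows in memory."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def write(self, row: Row) -> None:
        self.rows.append(row)


class DeIdentifier:
    """Receives its collaborators from outside (constructor injection)."""

    def __init__(self, reader: Reader, generators: Dict[str, Generator],
                 writer: Writer) -> None:
        self._reader = reader
        self._generators = generators  # column name -> generator
        self._writer = writer

    def run(self) -> None:
        for row in self._reader.read():
            out = dict(row)  # columns without a generator pass through unchanged
            for column, generate in self._generators.items():
                if column in out:
                    out[column] = generate(out[column])
            self._writer.write(out)


writer = ListWriter()
reader = ListReader([{"last_name": "Smith", "diagnosis": "flu"}])
DeIdentifier(reader, {"last_name": lambda value: "Doe"}, writer).run()
print(writer.rows)
```

Because the reader and writer are passed in, the same `DeIdentifier` could in principle be configured with a file-based reader or a different output target without changing its own code.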
- the tool does not read a data source looking for data that is potentially protected health information. Instead, it preferably depends on an input file, which can be characterized as a mappings file, which provides this information.
- a mappings file maps data types (e.g. the eighteen HIPAA data types) to a data set being de-identified.
- a mappings file may contain the table and column names for columns to be de-identified, along with the appropriate HIPAA data type for each column.
- Section 164.514(b)(2)(i) of the HIPAA Privacy Rule lists the identifiers that must be removed for data to be considered de-identified.
- Exemplary generators in accordance with one or more preferred implementations which are designed to comply with this HIPAA Privacy Rule will now be described.
- one or more generators are utilized for the de-identification of name data.
- these include generators for a male first name, a female first name, and a first name when a gender is unknown.
- the male first name generator randomly selects a name from a list of male names (e.g. a list of the one thousand most common male first names in the U.S.), and the female first name generator similarly randomly selects a name from a list of female names (e.g. a list of the one thousand most common female first names in the U.S.).
- a list may be maintained of gender neutral first names.
- a generator is similarly included for last name generation which randomly selects a last name from a list of last names (e.g. a list of the one thousand most common surnames in the U.S.).
- a middle name generator may be utilized similar to the first and last name generators, or a middle initial generator may be utilized which simply randomly generates a middle initial.
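A minimal sketch of the name generators described above, assuming small in-memory lists. The lists here are tiny illustrative samples (the text describes lists of about one thousand names), and the function names and the `"male"`/`"female"` gender labels are assumptions made for this example.

```python
import random
import string
from typing import Optional

# Tiny illustrative lists; not the lists used by the described tool.
MALE_FIRST_NAMES = ["James", "Robert", "Michael"]
FEMALE_FIRST_NAMES = ["Mary", "Patricia", "Linda"]
NEUTRAL_FIRST_NAMES = ["Jordan", "Taylor", "Casey"]
LAST_NAMES = ["Smith", "Johnson", "Williams"]


def fake_first_name(gender: Optional[str], rng: random.Random) -> str:
    """Pick a first name from the list matching the gender, if it is known."""
    if gender == "male":
        return rng.choice(MALE_FIRST_NAMES)
    if gender == "female":
        return rng.choice(FEMALE_FIRST_NAMES)
    return rng.choice(NEUTRAL_FIRST_NAMES)  # gender unknown


def fake_last_name(rng: random.Random) -> str:
    return rng.choice(LAST_NAMES)


def fake_middle_initial(rng: random.Random) -> str:
    return rng.choice(string.ascii_uppercase)


rng = random.Random()
print(fake_first_name("female", rng), fake_middle_initial(rng), fake_last_name(rng))
```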
- a street address generator which selects a random house number from “1” to “9999”, and a random street name from a list of street names (e.g. a list of street names which are based on Monopoly property names).
- a city-state-zip generator which selects a random city, state, and zip code from a list (e.g. a list containing valid city, state, and zip code tuples for twelve U.S. cities).
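The two address generators above might look like the following sketch. The street names and the three city, state, and zip tuples are illustrative samples chosen for this example, not the lists used by the described tool.

```python
import random
from typing import Tuple

# Illustrative samples only: Monopoly-style street names and a few
# city/state/zip tuples (the text mentions a list covering twelve U.S. cities).
STREET_NAMES = ["Park Place", "Baltic Avenue", "Marvin Gardens"]
CITY_STATE_ZIP = [
    ("Chicago", "IL", "60601"),
    ("Raleigh", "NC", "27601"),
    ("Boston", "MA", "02108"),
]


def fake_street_address(rng: random.Random) -> str:
    """Random house number from 1 to 9999 plus a random street name."""
    return f"{rng.randint(1, 9999)} {rng.choice(STREET_NAMES)}"


def fake_city_state_zip(rng: random.Random) -> Tuple[str, str, str]:
    """Pick one tuple so that city, state, and zip stay consistent."""
    return rng.choice(CITY_STATE_ZIP)


rng = random.Random()
print(fake_street_address(rng), fake_city_state_zip(rng))
```

Selecting the city, state, and zip code together as one tuple, rather than independently, keeps the generated address internally consistent.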
- a birthdate generator is utilized which generates a random month and day.
- if the birth year in the original data is more than eighty-nine years ago, the year will be set to a value corresponding to the current year minus ninety, while if the birth year in the original data is eighty-nine years ago or less, then the birth year in the original data will be utilized.
- a general date generator is utilized for some or all non-birthdate dates.
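The birthdate rule above can be sketched as follows. The function name and the explicit `today` parameter are assumptions made for this example so that the rule is easy to check; the comparison is done on years only, as in the text.

```python
import calendar
import random
from datetime import date


def deidentify_birthdate(original: date, today: date, rng: random.Random) -> date:
    # Birth years more than 89 years back collapse to (current year - 90);
    # more recent birth years are kept as they are.
    if today.year - original.year > 89:
        year = today.year - 90
    else:
        year = original.year
    # Month and day are always replaced with random values.
    month = rng.randint(1, 12)
    last_day = calendar.monthrange(year, month)[1]  # accounts for leap years
    return date(year, month, rng.randint(1, last_day))


rng = random.Random()
print(deidentify_birthdate(date(1920, 5, 1), date(2013, 12, 11), rng).year)  # 1923
print(deidentify_birthdate(date(1980, 7, 4), date(2013, 12, 11), rng).year)  # 1980
```

This sketch does not guard against a random month and day landing in the future when the birth year equals the current year; a fuller version would need to handle that case.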
- a random value between one and fourteen is generated, and all dates are shifted backward in time by that many days, updating month and year where appropriate.
- this days offset value is stored in memory and is not persisted after the tool finishes running.
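A minimal sketch of the date shifting described above, assuming one offset is drawn per run and held only in memory. The class name `DateShifter` is hypothetical.

```python
import random
from datetime import date, timedelta


class DateShifter:
    """Draws one offset of 1 to 14 days and applies it to every date."""

    def __init__(self, rng: random.Random) -> None:
        # Kept only on the instance; it is never written to disk.
        self._offset = timedelta(days=rng.randint(1, 14))

    def shift(self, original: date) -> date:
        # date arithmetic rolls the month and year back where needed.
        return original - self._offset


shifter = DateShifter(random.Random())
print(shifter.shift(date(2013, 1, 3)), shifter.shift(date(2013, 6, 15)))
```

Because the same offset is applied to every date in a run, intervals between shifted dates (for example, the gap between two visits) are preserved.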
- a phone number generator is utilized which generates a random phone number.
- this phone number is of the form xxx555yyyy.
- an email address generator which generates a random email address. For example, this might generate a random “@example.com” email address which comprises ten random characters.
- a social security number generator which generates a random social security number (SSN). For example, this might generate a random SSN in the format XXX-YY-ZZZZ where XXX, YY, or ZZZZ are all zeroes, as illustrated in FIG. 1 .
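The phone, email, and SSN generators described above might be sketched as follows. The function names are hypothetical, the choice of lowercase letters and digits for the email local part is an assumption, and the SSN sketch zeroes exactly one randomly chosen group, which is one way to satisfy the "all zeroes" rule in the text.

```python
import random
import string


def fake_phone(rng: random.Random) -> str:
    """Ten digits of the form xxx555yyyy."""
    area = "".join(rng.choice(string.digits) for _ in range(3))
    line = "".join(rng.choice(string.digits) for _ in range(4))
    return f"{area}555{line}"


def fake_email(rng: random.Random) -> str:
    """Ten random characters followed by @example.com."""
    alphabet = string.ascii_lowercase + string.digits  # assumed character set
    local = "".join(rng.choice(alphabet) for _ in range(10))
    return f"{local}@example.com"


def fake_ssn(rng: random.Random) -> str:
    """XXX-YY-ZZZZ with one group forced to all zeroes."""
    parts = [
        f"{rng.randint(1, 999):03d}",
        f"{rng.randint(1, 99):02d}",
        f"{rng.randint(1, 9999):04d}",
    ]
    zeroed = rng.randrange(3)
    parts[zeroed] = "0" * len(parts[zeroed])
    return "-".join(parts)


rng = random.Random()
print(fake_phone(rng), fake_email(rng), fake_ssn(rng))
```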
- a format preserving generator generates a random value based on original data which respects the format of the original data. For example, if the original data contains six characters, the first three being digits 0-9, and the last three being capital letters A-Z, the format preserving generator will preferably follow these rules while generating fake data.
- This type of format preserving generator might be used, for example, for medical record numbers, health plan beneficiary numbers, account numbers, certificate/license numbers, vehicle identifiers and serial numbers, license plate numbers, and device identifiers and serial numbers.
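A minimal sketch of a format preserving generator as described above. The function name is hypothetical, and the treatment of lowercase letters and punctuation (replace with a random lowercase letter, keep punctuation as-is) is an assumption that goes slightly beyond the digits and capital letters example in the text.

```python
import random
import string


def format_preserving(original: str, rng: random.Random) -> str:
    """Replace each character with a random one of the same class."""
    out = []
    for ch in original:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # separators such as "-" are kept in place
    return "".join(out)


rng = random.Random()
print(format_preserving("123ABC", rng))    # three digits, then three capitals
print(format_preserving("852-AABC", rng))  # hyphen stays in position
```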
- a driver's license generator generates a random driver's license number.
- if a state column is specified in a mappings file (e.g. a state field for an address), the generated driver's license number will follow the format for that state.
- a default state may be utilized, or a state may be randomly selected.
- a URL generator generates a random “.com” web address comprising a string of ten random characters selected from A-Z, a-z, and 0-9, followed by “.com”.
- a suffix (e.g. “.com”, “.net”, or “.biz”) may be randomly selected.
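The URL generator described above can be sketched in a few lines. The function name and the `suffixes` parameter are hypothetical; passing only `".com"` reproduces the first variant, and passing several suffixes reproduces the second.

```python
import random
import string
from typing import Sequence

URL_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits


def fake_url(rng: random.Random, suffixes: Sequence[str] = (".com",)) -> str:
    """Ten random characters from A-Z, a-z, 0-9, followed by a suffix."""
    name = "".join(rng.choice(URL_ALPHABET) for _ in range(10))
    return name + rng.choice(list(suffixes))


rng = random.Random()
print(fake_url(rng))
print(fake_url(rng, suffixes=(".com", ".net", ".biz")))
```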
- an image link generator generates a path to a randomly selected one of a default or user configured set of images and image blobs. For example, image data corresponding to a patient photo may be replaced with image data corresponding to a smiley face.
- an employer generator is utilized which selects a random employer from a list (e.g. a list containing fictitious business entities).
- a free text generator is utilized which generates text having the same length as original data.
- “Lorem Ipsum” text is generated having the same length as original data.
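A minimal sketch of the length-matching free text generator described above. Repeating and truncating a fixed passage is an assumption about how the placeholder text could be produced; the text only states that the output has the same length as the original.

```python
LOREM = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do "
    "eiusmod tempor incididunt ut labore et dolore magna aliqua. "
)


def fake_free_text(original: str) -> str:
    """Return placeholder text with exactly the same length as the input."""
    length = len(original)
    repeats = length // len(LOREM) + 1  # enough copies to cover the length
    return (LOREM * repeats)[:length]


note = "Patient reports mild headache since Tuesday."
print(fake_free_text(note))
print(len(fake_free_text(note)) == len(note))  # True
```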
- a resource list (e.g. a list of common first names) is utilized in generating de-identified data.
- this is a list included at compile time, while in at least some preferred implementations, such a resource list can be specified at run time.
- a default list is utilized unless a different resource list is specified at run time.
- the tool preferably utilizes a mappings file which maps data types (e.g. the eighteen HIPAA data types) to a data set being de-identified.
- a mappings file may contain the table and column names in a database that needs to be de-identified, along with the appropriate HIPAA data type for each column.
- FIG. 2 illustrates an exemplary sequence flow and exemplary interaction of various components of an exemplary tool in accordance with one or more preferred implementations, for a scenario where the data source is a database.
- read operations are utilized to read data from a source database informed by a mappings file, and a data generation factory component utilizes this data to generate de-identified data in accordance with a configuration file. This generated de-identified data is then written to a database (which may or may not be the same database) utilizing write operations.
- a mappings file would be utilized which would specify that values in the “Last Name” column correspond to a “Hipaa Last Name” data type, that values in the “SSN” column correspond to a “Hipaa Social Security Number” data type, and that values in the “Patient ID” column correspond to a “Hipaa Formatted Value” data type. Based on these specifications in the mappings file, data from each column would be read from the original database and de-identified using an appropriate routine selected based on the specified HIPAA data type.
- new values would be generated respecting the format of the original data.
- new values would be generated in the format AAA-XXXXX, where AAA is a three character string of alphabetic values A-Z, and XXXXX is a five character string of numeric values 0-9.
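The mapping-driven selection of routines described above can be sketched as a lookup from column name to data type to routine. The dictionary layout, the routine bodies, and the sample row are assumptions made for this example; the actual mappings file format is the one illustrated in FIG. 3.

```python
import random
import string
from typing import Callable, Dict

# Column -> data type, mirroring the example in the text.
MAPPINGS = {
    "Last Name": "Hipaa Last Name",
    "SSN": "Hipaa Social Security Number",
    "Patient ID": "Hipaa Formatted Value",
}


def last_name(value: str, rng: random.Random) -> str:
    return rng.choice(["Doe", "Roe", "Poe"])  # illustrative list


def ssn(value: str, rng: random.Random) -> str:
    return f"000-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}"


def formatted(value: str, rng: random.Random) -> str:
    # Keep the shape of the original: digit -> digit, capital -> capital.
    return "".join(
        rng.choice(string.digits) if c in string.digits
        else rng.choice(string.ascii_uppercase) if c in string.ascii_uppercase
        else c
        for c in value
    )


# Data type -> routine.
ROUTINES: Dict[str, Callable[[str, random.Random], str]] = {
    "Hipaa Last Name": last_name,
    "Hipaa Social Security Number": ssn,
    "Hipaa Formatted Value": formatted,
}


def deidentify_row(row: Dict[str, str], mappings: Dict[str, str],
                   rng: random.Random) -> Dict[str, str]:
    out = dict(row)  # unmapped columns are carried over unchanged
    for column, data_type in mappings.items():
        if column in out:
            out[column] = ROUTINES[data_type](out[column], rng)
    return out


row = {"Last Name": "Smith", "SSN": "354-52-4513",
       "Patient ID": "852-AABC", "Diagnosis": "flu"}
print(deidentify_row(row, MAPPINGS, random.Random()))
```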
- FIG. 3 illustrates an exemplary mappings file in accordance with one or more preferred implementations.
- FIG. 3 illustrates a mappings file for the original database illustrated in FIG. 4 .
- the mappings file maps columns of a specified table of the database to data types identified in the mappings file which are stored at specified web addresses. For example, the mappings file maps the d_FirstName column to a firstname data type stored at “http://www.allscripts.com/deid/datatypes/firstname”.
- the mappings file can specify an additional column to be utilized by a generator in generating de-identified data.
- the mappings file, in addition to mapping the d_FirstName column to a firstname data type, further specifies that the d_sex column is to be used to determine a gender, which is “female” if the d_sex column value is “F”, and “male” if the d_sex column value is “male”.
- data that might conventionally be viewed as being of the same data type, for example date of birth and date of death data (both of which might conventionally be viewed as being of a date data type), can be mapped to different data types so as to be given different treatment.
- the d_DoB column is mapped to a dateofbirth data type which generates a random month and date but leaves the year unchanged if it is within the last ninety years, while the d_DoD column is mapped to a date data type which changes the date to be a certain number of days (randomly selected at runtime) prior to the original date.
- a tool for databases is configured to handle scenarios where composite primary keys are defined.
- a tool has a built-in cache that holds a lookup table for original values and, for each original value, a corresponding value the respective original value was de-identified to.
- the lookup table is searched for an entry and, if found, the same value is used. In some preferred implementations, this enhances performance and also maintains integrity in the table, as the same patient will be de-identified to the same value across multiple entries.
- such a cache table is stored as an encrypted file that can be reloaded on a subsequent launch, either automatically or upon user specification.
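A minimal sketch of the lookup cache described above, assuming an in-memory dictionary keyed by original value. The class name is hypothetical, and the encrypted persistence to a file mentioned in the text is deliberately left out of this sketch.

```python
from typing import Callable, Dict


class CachingGenerator:
    """Wraps a generator so each original value maps to one fake value."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._cache: Dict[str, str] = {}  # original value -> de-identified value

    def __call__(self, original: str) -> str:
        if original not in self._cache:
            self._cache[original] = self._generate(original)
        return self._cache[original]


counter = {"calls": 0}


def numbered_id(original: str) -> str:
    counter["calls"] += 1
    return f"PATIENT-{counter['calls']}"


cached = CachingGenerator(numbered_id)
print(cached("852-AABC"), cached("852-AABC"), cached("117-ZZTQ"))
# PATIENT-1 PATIENT-1 PATIENT-2
```

Repeated occurrences of the same original value skip the generator entirely, which is where both the performance gain and the cross-row consistency described in the text come from.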
- one or more preferred implementations are configured to support de-identification of other data sources, such as XML files, HL7 files, claims files, and MOM messages.
- dependency injection is employed to support multiple configurations based on the objects to be de-identified.
- One or more preferred implementations utilizing dependency injection are extended to support de-identification of databases, XML files, HL7 files, claims files, MOM messages, and other data sources.
- a tool is configured for de-identification of XML data.
- a tool may be configured for de-identification of both XML files (e.g. a file stored on a local hard drive), as well as XML data, such as an XML document stored in columns inside of a table.
- in addition to a main mappings file for de-identification of a table as described herein, child mappings files are utilized for XML files (e.g. a list of data types and tag names). This can comprise, for example, a listing of all tag names and corresponding data types for each tag name.
- a preferred methodology involves building an internal mapping list, parsing column values, determining corresponding data types, and de-identifying to dummy values.
- a cell of a data table might include the XML snippet “<SSN>354-52-4513</SSN><patientID>852-AABC</patientID>”.
- a child mappings file might map the tag name <SSN> to a certain data type, and the tag name <patientID> to another certain data type.
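The XML handling described above (build a tag-to-type mapping, parse the cell value, look up each tag's data type, and replace the text with a dummy value) can be sketched with Python's standard `xml.etree.ElementTree` module. The data type labels, the fixed dummy values, and the wrapping `<cell>` element are assumptions made for this example; the wrapper is needed only because the sample cell holds two sibling elements rather than a single rooted document.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict


def deidentify_xml_cell(cell: str, child_mappings: Dict[str, str],
                        generators: Dict[str, Callable[[str], str]]) -> str:
    # Wrap the fragment so that it parses as one well-formed document.
    root = ET.fromstring(f"<cell>{cell}</cell>")
    for element in root.iter():
        data_type = child_mappings.get(element.tag)
        if data_type is not None and element.text:
            element.text = generators[data_type](element.text)
    # Serialize the children only, dropping the temporary wrapper
    # (any text placed directly inside the wrapper is not preserved).
    return "".join(ET.tostring(child, encoding="unicode") for child in root)


# Hypothetical child mappings: tag name -> data type label.
CHILD_MAPPINGS = {"SSN": "ssn", "patientID": "formatted"}
# Fixed dummy values keep this example deterministic.
GENERATORS = {"ssn": lambda v: "000-00-0000", "formatted": lambda v: "000-AAAA"}

cell = "<SSN>354-52-4513</SSN><patientID>852-AABC</patientID>"
print(deidentify_xml_cell(cell, CHILD_MAPPINGS, GENERATORS))
# <SSN>000-00-0000</SSN><patientID>000-AAAA</patientID>
```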
- a tool is configured for use in other contexts, such as government, finance, or other industries that handle sensitive information.
- generators, or portions of generators, may be useful in more than one context.
- an NPI generator, used to generate National Provider Identifier numbers, employs the Luhn algorithm, which is the same algorithm commonly used to validate credit card numbers.
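A minimal sketch of a Luhn check digit and an NPI-style generator built on it. The text only states that the Luhn algorithm is employed; prepending the `80840` prefix before computing the check digit and starting the number with 1 or 2 follow the published NPI check digit convention and are assumptions as far as the described tool is concerned. The function names are hypothetical.

```python
import random
import string


def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of digits."""
    total = 0
    for index, ch in enumerate(reversed(payload)):
        digit = int(ch)
        if index % 2 == 0:  # every second digit, starting from the rightmost
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def luhn_is_valid(number: str) -> bool:
    """Validate a full number whose last digit is a Luhn check digit."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


def fake_npi(rng: random.Random) -> str:
    """Ten digits: nine random identifier digits plus a check digit."""
    body = rng.choice("12") + "".join(rng.choice(string.digits) for _ in range(8))
    # Assumption: the check digit is computed over "80840" + identifier.
    return body + str(luhn_check_digit("80840" + body))


print(luhn_check_digit("80840123456789"))  # 3
print(luhn_is_valid("4111111111111111"))   # True (a common card test number)
print(fake_npi(random.Random()))
```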
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/102,522 US10403392B1 (en) | 2013-12-11 | 2013-12-11 | Data de-identification methodologies |
US16/558,291 US11366927B1 (en) | 2013-12-11 | 2019-09-02 | Computing system for de-identifying patient data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/102,522 US10403392B1 (en) | 2013-12-11 | 2013-12-11 | Data de-identification methodologies |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/558,291 Continuation-In-Part US11366927B1 (en) | 2013-12-11 | 2019-09-02 | Computing system for de-identifying patient data |
Publications (1)
Publication Number | Publication Date |
---|---|
US10403392B1 (en) | 2019-09-03 |
Family
ID=67770093
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/102,522 Active - Reinstated 2034-02-06 US10403392B1 (en) | 2013-12-11 | 2013-12-11 | Data de-identification methodologies |
Country Status (1)
Country | Link |
---|---|
US (1) | US10403392B1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190266353A1 (en) * | 2018-02-26 | 2019-08-29 | International Business Machines Corporation | Iterative execution of data de-identification processes |
US10988115B2 (en) * | 2019-02-11 | 2021-04-27 | Ford Global Technologies, Llc | Systems and methods for providing vehicle access using biometric data |
US20220222356A1 (en) * | 2021-01-14 | 2022-07-14 | Bank Of America Corporation | Generating and disseminating mock data for circumventing data security breaches |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050165623A1 (en) * | 2003-03-12 | 2005-07-28 | Landi William A. | Systems and methods for encryption-based de-identification of protected health information |
US20060074983A1 (en) * | 2004-09-30 | 2006-04-06 | Jones Paul H | Method of maintaining data confidentiality |
US20060129345A1 (en) * | 2001-08-24 | 2006-06-15 | Bio-Rad Laboratories, Inc. | Biometric quality control process |
US20060179075A1 (en) * | 2005-02-07 | 2006-08-10 | Fay Jonathan E | Method and system for obfuscating data structures by deterministic natural data substitution |
US20100306854A1 (en) * | 2009-06-01 | 2010-12-02 | Ab Initio Software Llc | Generating Obfuscated Data |
US20110123118A1 (en) * | 2008-01-24 | 2011-05-26 | Nayar Shree K | Methods, systems, and media for swapping faces in images |
US20120041791A1 (en) * | 2008-08-13 | 2012-02-16 | Gervais Thomas J | Systems and methods for de-identification of personal data |
US20120266254A1 (en) * | 2010-12-14 | 2012-10-18 | International Business Machines Corporation | De-Identification of Data |
US20130080398A1 (en) * | 2011-09-23 | 2013-03-28 | Dataguise Inc. | Method and system for de-identification of data within a database |
US20140280261A1 (en) * | 2013-03-15 | 2014-09-18 | PathAR, LLC | Method and apparatus for substitution scheme for anonymizing personally identifiable information |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190266353A1 (en) * | 2018-02-26 | 2019-08-29 | International Business Machines Corporation | Iterative execution of data de-identification processes |
US20190303618A1 (en) * | 2018-02-26 | 2019-10-03 | International Business Machines Corporation | Iterative execution of data de-identification processes |
US11036886B2 (en) * | 2018-02-26 | 2021-06-15 | International Business Machines Corporation | Iterative execution of data de-identification processes |
US11036884B2 (en) * | 2018-02-26 | 2021-06-15 | International Business Machines Corporation | Iterative execution of data de-identification processes |
US10988115B2 (en) * | 2019-02-11 | 2021-04-27 | Ford Global Technologies, Llc | Systems and methods for providing vehicle access using biometric data |
US20220222356A1 (en) * | 2021-01-14 | 2022-07-14 | Bank Of America Corporation | Generating and disseminating mock data for circumventing data security breaches |
US11880472B2 (en) * | 2021-01-14 | 2024-01-23 | Bank Of America Corporation | Generating and disseminating mock data for circumventing data security breaches |
Similar Documents
Publication | Title | |
---|---|---|
US9720943B2 (en) | Columnar table data protection | |
US11080423B1 (en) | System for simulating a de-identified healthcare data set and creating simulated personal data while retaining profile of authentic data | |
US9311369B2 (en) | Virtual masked database | |
US9460311B2 (en) | Method and system for on-the-fly anonymization on in-memory databases | |
US8924401B2 (en) | Method and system for logical data masking | |
US9792454B2 (en) | Record level data security | |
US20220100899A1 (en) | Protecting sensitive data in documents | |
US11710544B2 (en) | Performing analytics on protected health information | |
US20140317758A1 (en) | Focused personal identifying information redaction | |
US20130080398A1 (en) | Method and system for de-identification of data within a database | |
US20060074897A1 (en) | System and method for dynamic data masking | |
US11113417B2 (en) | Dynamic data anonymization using taint tracking | |
CN110289059A (en) | Medical data processing method, device, storage medium and electronic equipment | |
US20150213458A1 (en) | Analytic modeling of protected health information | |
US20200233977A1 (en) | Classification and management of personally identifiable data | |
US10403392B1 (en) | Data de-identification methodologies | |
Carrell et al. | The machine giveth and the machine taketh away: a parrot attack on clinical text deidentified with hiding in plain sight | |
Kastenhofer | The logic of archival authenticity: ISO 15489 and the varieties of forgeries in archives | |
Chen et al. | Generation of surrogates for de-identification of electronic health records | |
Trabelsi et al. | Data disclosure risk evaluation | |
US11387998B2 (en) | Processing personally identifiable information from separate sources | |
Woodman | Hackers paradise: Hackers across Latin America are taking advantage of the current crisis to access people’s personal data. If not protected it could spell disaster | |
US11436365B1 (en) | Generating a compliance report of data processing activity | |
Komarova et al. | K-anonymity: A note on the trade-off between data utility and data security | |
CN110766536B (en) | Data processing method, device, computer equipment and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20230903 |
PRDP | Patent reinstated due to the acceptance of a late maintenance fee | Effective date: 20231207 |
FEPP | Fee payment procedure | Free format text: PETITION RELATED TO MAINTENANCE FEES FILED (ORIGINAL EVENT CODE: PMFP); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PMFG); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Free format text: SURCHARGE, PETITION TO ACCEPT PYMT AFTER EXP, UNINTENTIONAL (ORIGINAL EVENT CODE: M1558); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 4 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |