EP3752948A1 - Automatic processing method for anonymizing a digital data set (Procédé de traitement automatique pour l'anonymisation d'un jeu de données numériques) - Google Patents
Info
- Publication number
- EP3752948A1 EP3752948A1 EP19710728.7A EP19710728A EP3752948A1 EP 3752948 A1 EP3752948 A1 EP 3752948A1 EP 19710728 A EP19710728 A EP 19710728A EP 3752948 A1 EP3752948 A1 EP 3752948A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- variables
- status
- attributes
- data
- sensitive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
Definitions
- the present invention relates to the field of digital data processing, and more particularly to the automatic processing of large volumes of digital data by modifying the content and/or structure of these data in order to make it very difficult or impossible to "re-identify" the data.
- anonymizing data is often the result of an ethical and legal compromise involving a desire or an obligation to protect individuals and their personal data.
- anonymization is used for the dissemination and sharing of data deemed to be of public interest, such as open data.
- a first step usually consists of removing the identifiers from the files or databases concerned, such as surnames, first names, tax identifiers, social security numbers, etc.
- the next step is to apply "filters" and "cryptographic transformations" to the files or databases (e.g. encryption and/or hashing of data by a dedicated algorithm, for example SHA, for Secure Hash Algorithm). Before this work, however, the data manager carries out or commissions a study clarifying the need for anonymization, its objectives and its requirements (e.g. must the anonymization be reversible?), prioritizing where necessary the data to be protected according to their degree of "sensitivity" and according to the purpose of the processing that the information must then undergo. The data manager can thus produce and compare several anonymization scenarios in order to choose the solution that seems most relevant (according to its own requirements and those of the law). In all cases, the anonymization must resist dictionary attacks.
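A minimal sketch of the hashing step described above. The patent only names SHA hashing; the use of a keyed HMAC (rather than a bare hash) is one standard way to obtain the dictionary-attack resistance the text requires, and the identifier value shown is purely illustrative.

```python
import hashlib
import hmac
import os

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA-256).

    The secret key is what makes the output resistant to dictionary
    attacks: an attacker who can enumerate the input space (e.g. all
    social security numbers) cannot precompute the hashes without it.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# The same identifier always maps to the same token, so records can
# still be linked across tables (reversibility only via the key holder).
key = os.urandom(32)  # kept secret by the data manager
token_a = pseudonymize("1850275123456", key)  # illustrative identifier
token_b = pseudonymize("1850275123456", key)
assert token_a == token_b
assert len(token_a) == 64  # hex digest length of SHA-256
```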
- the notion of anonymized identity and re-identification of the patient concerns the direct and indirect means of re-identification (e.g. name, address, etc.), but also encrypted data if the means of decryption is available.
- a person (e.g. a patient) is included in an anonymized database only if it is obligatory or genuinely useful, and only one anonymized database can be associated with a given project.
- Increased legal certainty is obtained if all the persons listed in the database have given their consent (in writing or via the provision of their identifier, for a medico-commercial study, for example), but this type of basis induces interpretation bias.
- Mechanisms should be provided to detect and block attempts to intrude (through the Internet or other means) and in particular malicious attempts at data inference, abuse of power, etc.
- Patent application WO 2015066523 describes an example of a computer-implemented method for providing better levels of data privacy, anonymity and security by allowing the subjects to whom the data belong to remain "dynamically anonymous", that is, anonymous for as long as they wish and to the extent desired.
- Embodiments include systems that create, access, use, store, and/or erase data with increased levels of privacy, anonymity, and security, thereby obtaining better qualified and more accurate information.
- embodiments may make possible controlled information sharing that can deliver temporally, geographically and/or usage-limited information to the receiving party.
- anonymity scores can be calculated for the shared data items, so that the level of consent/commitment required from the data subject before sharing the relevant data items with third parties can be specified.
- patent application WO2012080081 relates to a computer-implemented method of anonymizing data from a data source for a target application, the method comprising: identifying sensitive data elements in the data from the data source using a discovery tool, and generating data definitions for the data items indicating the sensitive data items, the data definitions including at least one property for the data items; specifying a set of runtime rules including at least one runtime rule, the runtime rule including a runtime anonymization protocol, the runtime rule set being specified via an interface; mapping the runtime rule set to the data definitions generated by the discovery tool for each of the sensitive data items; and consuming the generated data definitions and applying the mapped runtime anonymization protocol to the data definition of each sensitive data item, to anonymize the sensitive data item for the target application.
- Patent application EP2752786 is also known, which describes an anonymization device and an anonymization method characterized in that all the data satisfy the anonymity levels requested for each, and in that they prevent the loss of information value that results from the abstraction of the entire data collection.
- the anonymization device comprises: an anonymization means for performing anonymization processing in which a group of data is treated as a processing unit for a data collection comprising at least two data items; an anonymity level specifying means for specifying an adaptive anonymity level for each group; and an anonymity evaluation means for judging whether a group meets the specified adaptive anonymity level.
- on the basis of the evaluation result of the anonymity evaluation means, the anonymization means further performs anonymization processing of the data collection for which anonymization processing has already been carried out.
- European Patent Application EP2573699 discloses another example of an anonymization device for automatically configuring a general hierarchical tree of attribute values in identity information protection technology.
- the anonymization device quantitatively evaluates the amount of information that is lost during the generalization of an attribute value, and can thus automatically evaluate priorities between anonymized data and between data that are being anonymized.
- Information of each person includes attribute values of the person for a plurality of attributes.
- Anonymization is performed by obscuring the attribute values, and a structure in which the attribute values to be obscured are expressed in a tree according to the obscuration level is called a general hierarchical tree.
- the described identity information anonymization device performs automatic configuration by configuring a tree using frequency information of attribute values.
- a quantity of information lost between two anonymized data or between data being anonymized is quantitatively evaluated.
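The generalization described for this prior-art device can be illustrated with a minimal sketch. The hierarchy (masking trailing characters of a postal code) and the information-loss proxy (fraction of characters obscured) are assumptions for illustration, not the device's actual tree or metric.

```python
def generalize(value: str, level: int) -> str:
    """Obscure the last `level` characters of the value, i.e. climb
    `level` steps up a simple generalization hierarchy."""
    if level == 0:
        return value
    return value[:-level] + "*" * level

def information_loss(level: int, width: int = 5) -> float:
    """Crude quantitative proxy for the information lost by
    generalization: the fraction of characters obscured."""
    return level / width

# Each level of the hierarchy obscures more of the value and loses
# more information; priorities can then be compared quantitatively.
assert [generalize("75011", lvl) for lvl in (0, 1, 3, 5)] == \
    ["75011", "7501*", "75***", "*****"]
assert information_loss(1) < information_loss(5) == 1.0
```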
- US patent application 2017/0124336 describes an automated method of identifying the attributes for the anonymization exercise. This method is based on data encryption, a step prior to studying the level of sensitivity of the data and therefore their degree of requirement in terms of anonymization.
- This patent proposes three methods for choosing the values/attributes to anonymize.
- a first method consists in comparing the different values with values present in a dictionary, with which different levels of sensitivity are associated. Attributes for which the presence of sensitive values in the dataset exceeds a certain predetermined threshold will be selected for anonymization.
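This dictionary-based method can be sketched as follows. The dictionary contents, attribute values and threshold are illustrative assumptions, not the patent's actual data.

```python
from typing import Iterable

# Hypothetical sensitive-value dictionary (illustrative entries only).
SENSITIVE_DICT = {"hiv", "cancer", "tuberculosis"}

def needs_anonymization(values: Iterable[str], threshold: float = 0.05) -> bool:
    """Select an attribute for anonymization when the share of its
    values found in the sensitive dictionary exceeds a predetermined
    threshold."""
    values = [v.lower() for v in values]
    hits = sum(1 for v in values if v in SENSITIVE_DICT)
    return hits / len(values) > threshold

# A diagnosis column where 2 of 6 values are dictionary hits (~0.33)
# crosses the 5% threshold and is selected for anonymization.
diagnoses = ["flu", "cancer", "flu", "hiv", "flu", "flu"]
assert needs_anonymization(diagnoses, threshold=0.05)
assert not needs_anonymization(["flu"] * 10, threshold=0.05)
```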
- a second classification method is based on comparing the distribution of the values of an attribute in the dataset with a known distribution. This method can confirm the results of the first method of identifying the attributes to be anonymized.
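One way to compare an empirical distribution against a known one is total variation distance; the patent does not name a specific statistic, so this measure and the reference values below are assumptions for illustration.

```python
from collections import Counter

def total_variation(values, reference: dict) -> float:
    """Half the L1 distance between the empirical distribution of an
    attribute's values and a known reference distribution
    (0 = identical, 1 = disjoint). A large distance flags an attribute
    whose composition deviates from the expected one."""
    n = len(values)
    empirical = {k: c / n for k, c in Counter(values).items()}
    keys = set(empirical) | set(reference)
    return 0.5 * sum(abs(empirical.get(k, 0.0) - reference.get(k, 0.0))
                     for k in keys)

# Assumed reference: sex ratio in a general population.
reference = {"F": 0.51, "M": 0.49}
sample = ["F", "F", "M", "F"]  # empirical: F=0.75, M=0.25
assert abs(total_variation(sample, reference) - 0.24) < 1e-9
```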
- a final method is to provide the anonymizer with a portion of the dataset in its original version (before encryption) and to generate from this sample a number of expressions for one or more attributes. The rest of the dataset will be encrypted and compared to these generated expressions to identify certain attributes and their sensitivity.
- the solutions of the prior art are adapted to prepare anonymous databases when they are created. However, these solutions do not make it possible to easily update the anonymization, for example when the addition of new entries modifies the anonymization context.
- in this case, the solutions of the prior art require the reprocessing of the entire database, which may demand considerable computation time for databases that may represent several terabytes.
- sex information combined with age information can be identifying, which requires a transformation/anonymization action, especially when the data additionally contain information relating to a given pathology.
- in some configurations, the information is in fact not identifying. But if new entries change this situation, the "sex" or "age" information may require different treatment.
- anonymization requires a preliminary step of identifying the attributes/values to be anonymized. This step is left to the choice of the anonymizer/user and is therefore subject to subjectivity and imprecision of classification. Moreover, even work that focuses on the classification of attributes does not provide a clear and documented methodology for qualifying attributes.
- the present invention aims to overcome these disadvantages by proposing a method for having different levels of anonymization through a classification of the variables of a database.
- the invention relates in its most general sense to a method of automatically processing a digital data set consisting of:
- a digital file constituted by a table determining at least the identifiers/denominations of the variables, and, for each of said variables:
- an order of the power of identification of the different census variables;
- a digital file constituted by a table of variables with an established order of the degree of ease (208) with which a potential attacker can access the information on the different variables. This order can be deduced from databases tracing the history of attacks;
- a digital file consisting of a table of "sensitive" attributes, for which the values/modalities are classified in order of sensitivity;
- a first indicator of the availability of the associated value from external data sources, such as a web crawler, a repository or a history of attacks
- the data set is a subset
- a sensitivity indicator obtained by referring to a list of sensitive variables with their different modalities/values ranging from the most sensitive to the least sensitive. These indicators are calculated based on the occurrence frequency of the most sensitive values of the sensitive attribute. They are then compared to a frequency threshold.
- a fourth processing concerning the residual variables associated with a "general" sensitivity parameter, consisting of assigning some of said variables a "hidden" status to prevent their normal use in said data set. The method comprises, prior to the first classification step, a processing for assigning to each of the variables for which no correspondence with the attribute repository (201) is established a provisional status in the attribute repository (201), which can be changed to definitive status or rejected according to the opinion of an operator.
- the method further comprises a step consisting in dynamically applying, to the variables that cannot be associated with the attribute repository, a specific processing consisting in registering in said repository the pair "variable, status" awaiting validation/rejection according to the opinion of an operator.
- said processing operations are applied periodically [for example at each evolution of the data set (210) or at each evolution of the regulatory framework].
- said processing operations applied to the "hidden" variables/values consist of:
- Figure 1 shows the flow diagram of the set of processing operations.
- Figure 2 shows the set of processing modules for implementing the invention.
- Figure 3 shows a detailed view of the logic diagram of the first classification step.
- Figure 4 shows a detailed view of the logic diagram of the attribute identification power analysis.
- Figure 5 shows a detailed view of the logic diagram of the attribute sensitivity analysis.
- the present invention relates to the automatic classification of the attributes of a digital data set to better target the anonymisation and / or risk assessment of re-identification (RI) exercises.
- the aim is to automate the technical processes to ensure compliance with the regulatory framework on the protection of personal data.
- the anonymisation and assessment of the risk of disclosure of personal data generally concern certain variables in a dataset, particularly those with an identifying nature or those with a sensitive character.
- anonymization involves a loss of information about the dataset, which can affect the usefulness of the data for users such as researchers. It is therefore relevant for a user or owner of the data to target the variables on which the anonymization or the re-identification risk measurement will be carried out.
- the classification of the attributes of a dataset would be an asset in striking a balance between the obligation to respect one's private life and the guarantee of the usefulness of the data.
- in practice, the classification of the attributes is carried out "manually" by the owner of the data and remains linked to their judgment. This leaves the classification of variables subject to subjectivity and may thus result in anonymization decisions, or assessments of the re-identification risk, that do not conform to the requirements governing the handling of personal data.
- the context of dissemination of datasets, the evolution of laws and customs, as well as the characteristics of certain data sets mean that the classification of variables is not final, and an expert assessment is always desirable to ensure the ethical use of personal data. Given these elements, there is therefore a technical problem related to the preliminary analysis (manual or automatic) of the attributes of a dataset in order to target the exercises of anonymization and/or assessment of the risk of re-identification of the data by a potential attacker of the dataset.
- the present invention provides an attribute classification methodology to help data owners share their data while respecting the requirements governing personal data, automatically and dynamically, allowing the parameters to be automatically adjusted as new data are introduced into the database.
- the data owner accesses a dataset with attributes.
- Each attribute has a name to classify it.
- Each attribute can take different modalities/values and so can also be classified according to the composition of these values (distribution, frequency or other).
- the innovation of this classification methodology therefore lies particularly in the intervention of the modalities of the different attributes of a dataset in the classification process of the attributes.
- This invention has two stages of classification of the data.
- the classification begins with a first step, where the attributes of the dataset to be processed are subject to a first classification, using a created database called "Attributes Repository".
- This invention will be described according to a detailed example with reference to Figures 1 to 5 annexed showing the functional architecture and the logic of the main functional modules.
- the "attribute repository" (201) consists of applying a classification of the attributes according to two main criteria of anonymization of personal data, namely:
- the identifier character (202) results in the recording of a numerical sequence with three states: "I" when the variable is directly identifying, such as the social security number; "IQ" when the variable can become identifying when combined with other variables associated with the same state, such as the postal code; or "NP" otherwise.
- the variables associated with the numerical sequence "NP" are not treated within the scope of this invention, which can reduce computation time in the anonymization process (204).
- the sensitive character (203) results in the recording of a numerical sequence that can take two states: "S" when the variable is sensitive, in the sense that its disclosure should be avoided, and "NS" in the other cases.
- the repository (201) is translated into a file containing variables listed from the state of the art, the recommendations of privacy-protection institutes and the use cases encountered. These variables are categorized to facilitate the use of the repository when classifying the attributes of a given dataset. The categories listed include: health, education and work, addresses, numbers and dates, etc.
- Attribute classification is then based on two elements:
- Attributes belonging, according to the law, to a "particular category" are classified as sensitive variables assigned to the numerical sequence "S", for example health data, criminal record, etc.
- This repository (201) can be continuously enriched and is intended to bring together a large set of variables related to many sectors of activity, in order to increase its usefulness.
- Attribute: the name of the attribute.
- Identifier status: classifies the variable as identifier "I" (to be eliminated from the anonymized version), quasi-identifier "IQ", or neither.
- Sensitivity: includes sensitivity in the legal sense but also in the sense of ethics, custom, society, etc.
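The repository columns above can be sketched as a simple lookup table. The entries and attribute names below are illustrative assumptions; the actual repository is continuously enriched from regulations and use cases.

```python
# Minimal sketch of the attribute repository (201): each known
# attribute name carries an identifier status ("I", "IQ" or "NP")
# and a sensitivity status ("S" or "NS"). Entries are illustrative.
ATTRIBUTE_REPOSITORY = {
    "social_security_number": ("I", "NS"),
    "postal_code": ("IQ", "NS"),
    "diagnosis": ("NP", "S"),
    "favorite_color": ("NP", "NS"),
}

def classify(attribute: str):
    """Initial classification: look the attribute up in the repository;
    unknown attributes get a provisional status pending an operator's
    validation, as described for step (5)."""
    return ATTRIBUTE_REPOSITORY.get(attribute, ("provisional", "provisional"))

assert classify("social_security_number") == ("I", "NS")
assert classify("shoe_size") == ("provisional", "provisional")
```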
- the "sensitivity of attributes” repository (205) proposes to reference, according to the degree of sensitivity, the different modalities / values of an attribute classified as sensitive and therefore assigned the numerical sequence "S".
- Certain attributes classified as “sensitive” and assigned to the numerical sequence “S” take values that do not necessarily have the same degree of sensitivity and / or protection requirement, hence the interest of proposing a more refined analysis of sensitivity and sensitivity order for the different modalities of the sensitive attributes (206).
- the "attribute sensitivity repository" (205) is constituted by the list of sensitive attributes identified by the "attribute repository" (201); for each attribute, the various possible modalities (which can evolve) are classified in order of sensitivity and/or requirement in terms of privacy protection, from a socio-cultural point of view.
- the qualification of the quasi-identifier attributes assigned the numerical sequence "IQ" can be improved by passing to a finer degree of analysis (212). Indeed, the power of identification can vary from one quasi-identifying attribute to another. Thus, the level of requirement in terms of anonymization and/or anonymization evaluation could differ depending on the weight of a quasi-identifier in the re-identification of an individual.
- Dates easy to access: dates of birth, ...
- Dates less accessible: dates of hospitalization, ...
- Dates difficult to access: medical check-up dates, ...
- the goal is to have a repository of quasi-identifying attributes, assigned the numerical sequence "IQ", classified according to their ease of access by a potential attacker.
- the "reference population repository" (209) is therefore based on the distribution of the different attributes in the reference population, for example a country. For France, we refer for example to the data of the last census of the French population, of 2013, to deduce the distribution of a set of attributes.
- the data recorded concern the following variables at this level: age, socio-professional category, department of birth, department of previous residence, department of current residence, department of work, degree obtained, nationality, sector of activity, region of birth, region of previous residence, region of work, sex, marital status and type of activity.
- This list can be enriched by other data on the French population which will expand the list of attributes.
- This processing makes it possible to give an order of power of identification of the attributes.
- This reference population repository (209) can be extended by taking into account the characteristics of other reference populations, such as the United States or Canada. We will have, in fine, a database giving the main characteristics of the reference populations (the populations to which the data sets are attached).
- the two criteria may be complementary to cover the most quasi-identifying attributes, assigned the numerical sequence "IQ", of a dataset.
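One plausible way to derive an order of identification power from reference-population distributions is to rank attributes by the entropy of their distribution: the patent does not specify a formula, so this measure and the distributions below (not actual 2013 census figures) are assumptions for illustration.

```python
import math

def entropy(distribution: dict) -> float:
    """Shannon entropy (bits) of a reference distribution: the higher
    the entropy, the more knowing the attribute's value narrows down
    an individual in the reference population."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)

# Assumed census-style distributions (illustrative values only).
reference = {
    "sex": {"F": 0.51, "M": 0.49},
    "department_of_birth": {f"d{i}": 1 / 96 for i in range(96)},
}

# Rank quasi-identifiers by identification power, most identifying first.
ranked = sorted(reference, key=lambda a: entropy(reference[a]), reverse=True)
assert ranked[0] == "department_of_birth"  # ~6.6 bits vs ~1 bit for sex
```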
- Step (1) the data owner / user accesses a dataset (210) that contains attributes with different denominations.
- the data owner examines the attribute dictionary (if it exists) or attributes directly to classify them.
- Step (2) During this step, the user accesses the "attribute repository" (201).
- Step (3) In this step, the calculator processes the data set (210) to match each of its attributes with the attribute repository (201). The attributes of the dataset (210) for which a match is found are assigned a marker. This correspondence can be done manually by the user, by comparing the list of attributes of the dataset to the attribute repository, or automatically, using string-search algorithms such as the Rabin-Karp algorithm, exact or approximate string searching, or semantic matching algorithms such as the Lesk algorithm.
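The matching step can be sketched with Python's standard-library `difflib`, used here as a stand-in for the Rabin-Karp or approximate string-search algorithms named in the text; the repository entries and cutoff value are illustrative assumptions.

```python
import difflib

# Illustrative subset of attribute names known to the repository (201).
REPOSITORY_ATTRIBUTES = ["date_of_birth", "postal_code", "diagnosis", "sex"]

def match_attribute(name: str, cutoff: float = 0.7):
    """Approximate string matching between a dataset attribute name
    and the attribute repository; returns the matched repository entry,
    or None when no match is found (step (5) then registers the
    attribute with a provisional status)."""
    hits = difflib.get_close_matches(name.lower(), REPOSITORY_ATTRIBUTES,
                                     n=1, cutoff=cutoff)
    return hits[0] if hits else None

assert match_attribute("Date_of_Birth") == "date_of_birth"  # case difference
assert match_attribute("postal_codes") == "postal_code"     # near miss
assert match_attribute("xyz") is None                       # no match
```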
- Step (4) This step distinguishes the attributes of the dataset (210) for which a match has been found, on the one hand, from the attributes for which no match has been determined, on the other hand.
- Step (5) This step consists in registering in the attribute repository (201) the attributes of the dataset (210) for which no match has been found. These variables are registered with a temporary status, which can be changed to final status or rejected according to the opinion of an operator.
- Step (6) This step performs a first classification of the attributes, denoted "Initial Classification" (211), based on the attribute repository (201). This step only affects those attributes for which a match with the attribute repository (201) has been established. At the end of this step, each of the marked attributes has a status derived from the attribute repository (201), translated by a numerical sequence that can take the states "I", "IQ", "S" or "NS".
- a user / owner of the data can make a first classification, denoted "Initial Classification” (211) of the attributes of its data set in order to target the anonymisation / disclosure risk measurement exercises.
- a user accesses (301) the attribute dictionary of the dataset to be studied and the attribute repository (201). For the attributes for which a match in the attribute repository has been found (303), a determination of their identifier (304) / sensitive (305) status allows an initial classification of the attributes (306). This first classification is determined by referring to the different columns of the file of the attribute repository (201). Again, the correspondence between the attributes of the dataset (210) and their status in the attribute repository (201) can be established manually or automatically by string-search algorithms.
- the initial classification of the attributes (306) corresponds to their definitive classification. These attributes will therefore be permanently stored in the classification module (213), on which the anonymization process is based.
- Step (7) The user then chooses either to grant the attributes assigned a numerical sequence "IQ" or "S" a hidden status preventing their normal use in the final data set (215) and go directly to the anonymization process (204), or to apply the further processing of the data set (210) described below.
- Step (8) This step only applies to the attributes, assigned a numerical sequence "S", determined by a filtering module (501). This step, called "sensitivity analysis" (206), is presented in more detail in the logic diagram of Figure 5.
- the processing is based on the result of the initial classification of the attributes (306) and on the sensitivity repository (205).
- By accessing (502) the attribute sensitivity repository (205), the calculator examines the distribution of the modalities of the sensitive attribute in the data set (503). The occurrence frequencies of the most sensitive categories of the attribute are then calculated for the data set under study (504).
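The frequency computation of step (8) can be sketched as follows. The set of "most sensitive" modalities and the threshold are illustrative assumptions standing in for the sensitivity repository (205).

```python
from collections import Counter

# Assumed entry of the sensitivity repository (205) for a "diagnosis"
# attribute: the modalities ranked most sensitive (illustrative only).
MOST_SENSITIVE = {"hiv", "sexually_transmitted_disease"}

def sensitivity_indicator(values, frequency_threshold: float = 0.1) -> bool:
    """Compute the occurrence frequency of the most sensitive
    modalities in the data set (504) and compare it to the frequency
    threshold mentioned in the claims."""
    counts = Counter(values)
    freq = sum(counts[m] for m in MOST_SENSITIVE) / len(values)
    return freq > frequency_threshold

column = ["flu", "hiv", "flu", "flu", "hiv",
          "flu", "flu", "flu", "flu", "flu"]
assert sensitivity_indicator(column)             # 2/10 = 0.2 > 0.1
assert not sensitivity_indicator(["flu"] * 10)   # 0/10 below threshold
```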
- Step (9) This step only applies to the attributes assigned a numerical sequence "IQ", determined by a filtering module (401). This step, named "analysis of the power of identification" (212), is presented in more detail in the logic diagram of Figure 4.
- the processing is based on the result of the initial classification of the attributes (306) and on the identification power repository (207).
- the computer accesses (402) the "attribute access facility repository" (208) and then compares (403) the degrees of ease of access of the various attributes of the dataset (210) assigned a numerical sequence "IQ", based on the same repository (208). This comparison results in an order of "ease of access" of the different attributes.
- the calculator then accesses (404) the "reference population repository" (209) and sorts (405) the attributes assigned a numerical sequence "IQ" according to the order established in that repository (209).
- This sorting can be done manually or automatically by sorting algorithms, such as selection sort or tree sort.
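Step (9) can be sketched as a sort keyed on ease-of-access ranks. The rank values below are illustrative assumptions standing in for the repository (208); Python's built-in `sorted` replaces the selection/tree sorts named in the text.

```python
# Assumed "ease of access" ranks from the repository (208):
# a lower rank means the value is easier for an attacker to obtain.
EASE_OF_ACCESS_RANK = {
    "date_of_birth": 1,         # easy: public registries, social media
    "hospitalization_date": 2,  # less accessible
    "medical_check_date": 3,    # difficult to access
}

def order_quasi_identifiers(attributes):
    """Sort the "IQ" attributes by ease of access, the most accessible
    first; attributes unknown to the repository sort last."""
    return sorted(attributes,
                  key=lambda a: EASE_OF_ACCESS_RANK.get(a, float("inf")))

qi = ["medical_check_date", "date_of_birth", "hospitalization_date"]
assert order_quasi_identifiers(qi) == [
    "date_of_birth", "hospitalization_date", "medical_check_date"]
```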
- Step (10) This step presents the end of the classification process of the attributes of the dataset (210).
- the results of the sensitivity analysis (206) and the identification power analysis (212) are grouped in a classification module (213), on which the computer bases the data processing (204) of the data set (210).
- This processing may result in the anonymization of certain attributes, with different degrees of requirement, in order to arrive at a final version of the dataset (215). In all cases, the data processing must meet privacy needs while maintaining the usefulness of the dataset (210).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1851182A FR3077894B1 (fr) | 2018-02-13 | 2018-02-13 | Procede de traitement automatique pour l’anonymisation d’un jeu de donnees numeriques |
PCT/FR2019/050280 WO2019158840A1 (fr) | 2018-02-13 | 2019-02-08 | Procédé de traitement automatique pour l'anonymisation d'un jeu de données numériques |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3752948A1 true EP3752948A1 (fr) | 2020-12-23 |
Family
ID=62528569
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19710728.7A Pending EP3752948A1 (fr) | 2018-02-13 | 2019-02-08 | Procédé de traitement automatique pour l'anonymisation d'un jeu de données numériques |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP3752948A1 (fr) |
FR (1) | FR3077894B1 (fr) |
WO (1) | WO2019158840A1 (fr) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111298432B (zh) * | 2020-01-16 | 2021-07-06 | Tencent Technology (Shenzhen) Co., Ltd. | Virtual object information acquisition method, apparatus, server and readable storage medium |
CN113468561B (zh) * | 2021-06-18 | 2024-04-23 | Baowan Capital Management Co., Ltd. | Data protection method, apparatus and server |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130138698A1 (en) | 2010-05-19 | 2013-05-30 | Kunihiko Harada | Identity information de-identification device |
US9323948B2 (en) | 2010-12-14 | 2016-04-26 | International Business Machines Corporation | De-identification of data |
WO2013031997A1 (fr) | 2011-09-02 | 2013-03-07 | NEC Corporation | De-identification device and method |
WO2015066523A2 (fr) | 2013-11-01 | 2015-05-07 | Anonos Inc. | Dynamic de-identification and anonymity |
US10013576B2 (en) * | 2014-12-12 | 2018-07-03 | Panasonic Intellectual Property Management Co., Ltd. | History information anonymization method and history information anonymization device for anonymizing history information |
US9858426B2 (en) | 2015-11-03 | 2018-01-02 | Palo Alto Research Center Incorporated | Computer-implemented system and method for automatically identifying attributes for anonymization |
- 2018-02-13: FR application FR1851182A, patent FR3077894B1 (status: Active)
- 2019-02-08: PCT application PCT/FR2019/050280, publication WO2019158840A1 (status: unknown)
- 2019-02-08: EP application EP19710728.7A, patent EP3752948A1 (status: Pending)
Also Published As
Publication number | Publication date |
---|---|
FR3077894A1 (fr) | 2019-08-16 |
FR3077894B1 (fr) | 2021-10-29 |
WO2019158840A1 (fr) | 2019-08-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Oliveira et al. | Biogeography of Amazon birds: rivers limit species composition, but not areas of endemism | |
KR102430649B1 (ko) | Computer-implemented system and method for automatically identifying attributes for anonymization | |
Goldsteen et al. | Data minimization for GDPR compliance in machine learning models | |
Diakopoulos | Algorithmic accountability reporting: On the investigation of black boxes | |
US20220100899A1 (en) | Protecting sensitive data in documents | |
EP3908952B1 (fr) | Method for creating avatars to protect sensitive data | |
Nazah et al. | An unsupervised model for identifying and characterizing dark web forums | |
Min | Global business analytics models: Concepts and applications in predictive, healthcare, supply chain, and finance analytics | |
EP3752948A1 (fr) | Procédé de traitement automatique pour l'anonymisation d'un jeu de données numériques | |
Doss | Cyber privacy: who has your data and why you should care | |
Rizk et al. | Media coverage of online social network privacy issues in Germany: A thematic analysis | |
Luz et al. | Data preprocessing and feature extraction for phishing URL detection | |
Felmlee et al. | Can social media anti-abuse policies work? A quasi-experimental study of online sexist and racist slurs | |
Olson et al. | The Best Ends for the Best Means: Ethical Concerns in App Reviews | |
Siadaty et al. | Locating previously unknown patterns in data-mining results: a dual data-and knowledge-mining method | |
US20220382891A1 (en) | Detecting sensitive information in records using context and decoys | |
Tjikhoeri et al. | The best ends by the best means: ethical concerns in app reviews | |
Alben | When artificial intelligence and big data collide—How data aggregation and predictive machines threaten our privacy and autonomy | |
San | Predictions from data analytics: Does Malaysian data protection law apply? | |
Ricker et al. | AI-Generated Faces in the Real World: A Large-Scale Case Study of Twitter Profile Images | |
Goethals et al. | The Impact of Cloaking Digital Footprints on User Privacy and Personalization | |
Sloan et al. | When is an algorithm fair? errors, proxies, and predictions in algorithmic decision making | |
Alonso | Zero-Order Privacy Violations and Automated Decision-Making about Individuals | |
da Silveira | Democracy and invisible codes: How algorithms are modulating behaviors and political choices | |
Vogelsong et al. | Disclosive search ethics: Illuminating the gatekeepers of knowledge |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STAA | Information on the status of an ep patent application or granted ep patent | STATUS: UNKNOWN |
| STAA | Information on the status of an ep patent application or granted ep patent | STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | ORIGINAL CODE: 0009012 |
| STAA | Information on the status of an ep patent application or granted ep patent | STATUS: REQUEST FOR EXAMINATION WAS MADE |
| 17P | Request for examination filed | Effective date: 20200717 |
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| AX | Request for extension of the european patent | Extension state: BA ME |
| DAV | Request for validation of the european patent (deleted) | |
| DAX | Request for extension of the european patent (deleted) | |
| STAA | Information on the status of an ep patent application or granted ep patent | STATUS: EXAMINATION IS IN PROGRESS |
| 17Q | First examination report despatched | Effective date: 20230208 |
| GRAP | Despatch of communication of intention to grant a patent | ORIGINAL CODE: EPIDOSNIGR1 |
| STAA | Information on the status of an ep patent application or granted ep patent | STATUS: GRANT OF PATENT IS INTENDED |
| INTG | Intention to grant announced | Effective date: 20240626 |