CN112632618B - Desensitization method and device for label crowd data and computer equipment - Google Patents

Info

Publication number
CN112632618B
Authority
CN
China
Prior art keywords
desensitization
label
data
tag
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011613978.7A
Other languages
Chinese (zh)
Other versions
CN112632618A (en)
Inventor
秦思哲
陈瑶
龚健
贾西贝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huaao Data Technology Co Ltd
Original Assignee
Shenzhen Huaao Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huaao Data Technology Co Ltd filed Critical Shenzhen Huaao Data Technology Co Ltd
Priority to CN202011613978.7A priority Critical patent/CN112632618B/en
Publication of CN112632618A publication Critical patent/CN112632618A/en
Application granted granted Critical
Publication of CN112632618B publication Critical patent/CN112632618B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2107File encryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a desensitization method, a device and computer equipment for label crowd data. The method comprises the following steps: importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model; configuring a desensitization object: selecting, in the label model, the sensitive fields that need desensitization; selecting a desensitization algorithm: selecting one or more desensitization algorithms to dynamically desensitize the data corresponding to the sensitive fields; extracting tag entity attributes: selecting a plurality of fields from the label model and extracting entity attributes of the marking object; creating a data tag: setting a label layering rule for the label model, and filtering the data according to the set rule to form the corresponding label crowd layered data; displaying label crowd data after desensitization: when a user views the label crowd layered data, performing real-time dynamic desensitization in response and displaying to the user the label crowd layered data with the sensitive fields desensitized. The method can prevent sensitive data from being leaked.

Description

Desensitization method and device for label crowd data and computer equipment
Technical Field
The invention relates to the field of data management, in particular to a desensitization method, a device and computer equipment for label crowd data.
Background
Government information system and government data warehouse projects typically collect and integrate data from multiple departments. Because the data are acquired at different times, departmental data can be divided into historical data, current data and future data. Unless data quality problems are prevented beforehand, monitored during processing and remedied afterwards, neither the reliability of the data nor the safety of its use can be ensured.
Meanwhile, government department systems hold massive amounts of data. If a project does not discover sensitive data and assess risk, classify the data and define effective policies, apply real-time control, and carry out monitoring, reporting, auditing and similar operations, the safe use of the data cannot be ensured.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus and a computer device for desensitizing tag population data, which at least partially solve the problems in the prior art.
To achieve the above object, in a first aspect, there is provided a method for desensitizing tag population data, comprising:
importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
configuring a desensitization object: selecting sensitive fields needing desensitization in the label model;
selecting a desensitization algorithm: selecting one or more desensitization algorithms to dynamically desensitize data corresponding to sensitive fields;
extracting tag entity attributes: selecting a plurality of fields from the label model, and extracting entity attributes of the marking object;
creating a data tag: setting a label layering rule for the label model, and filtering to form corresponding label crowd layering data according to the set label layering rule;
displaying label crowd data after desensitization: performing real-time dynamic desensitization in response to the user viewing the label crowd layered data, and displaying to the user the label crowd layered data in which the sensitive fields have been desensitized.
In some possible embodiments, the desensitization algorithm includes one or more of hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization.
In some possible embodiments, the step of extracting tag entity attributes may specifically include:
selecting, from the tag model, a plurality of fields including a name, an identity card number, a contact address, a mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with epidemic patients, sex, the area the epidemic patient came from and the area where the epidemic patient is currently located, and extracting entity attributes of the tag model.
In some possible embodiments, the step of creating a data tag may specifically include:
setting basic information, the setting basic information including: setting the label model as a model to be analyzed in response to an input operation; setting an updating mode of the tag model to be manual updating or routine updating in response to an input operation; setting an execution period of creating the data tag to one of minutes, hours, days, months, and years in response to the input operation; responding to input operation of a user, and setting a scheduling strategy of the label model to be executed once every preset time length;
setting a tag layering rule, wherein the setting of the tag layering rule comprises: the crowd is divided into a plurality of tiers, each tier being associated with a plurality of conditions of the configuration.
In some possible embodiments, dividing the crowd into a plurality of tiers, each tier being associated with a plurality of configured conditions, may specifically include:
dividing the population into at least two tiers, wherein one tier represents the epidemic population of a first area and another tier represents the epidemic population of a second area;
each tier requires that at least two configured conditions be satisfied simultaneously, the at least two conditions comprising: a first condition that the health condition is equal to abnormal; and a second condition that the area is equal to the first area or the second area.
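The tier conditions above can be pictured as a small declarative rule structure. The following is a minimal sketch only, not the patented implementation: the class names `Condition` and `Tier`, the operator table, and the field names `health` and `area` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical operator table; the text only describes "equal to" comparisons.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda actual, expected: actual == expected,
    "ne": lambda actual, expected: actual != expected,
}


@dataclass(frozen=True)
class Condition:
    """One configured condition, e.g. health == "abnormal"."""
    field: str
    op: str
    value: Any

    def matches(self, record: Dict[str, Any]) -> bool:
        return OPERATORS[self.op](record.get(self.field), self.value)


@dataclass(frozen=True)
class Tier:
    """A tier matches only when ALL of its conditions hold at once."""
    name: str
    conditions: Tuple[Condition, ...]

    def matches(self, record: Dict[str, Any]) -> bool:
        return all(c.matches(record) for c in self.conditions)


def filter_tier(tier: Tier, records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the records that belong to the given tier."""
    return [r for r in records if tier.matches(r)]
```

A tier for the first area would then be built from two conditions, one on the health condition and one on the area, and a record is kept only if both hold.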
In a second aspect, there is provided a desensitising apparatus for tag population data, comprising:
the label model importing module is used for importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
the desensitization object configuration module is used for selecting sensitive fields needing desensitization in the label model;
the desensitization algorithm selection module is used for selecting one or more desensitization algorithms to dynamically desensitize the data corresponding to the sensitive fields;
the label entity attribute extraction module is used for selecting a plurality of fields from the label model and extracting entity attributes of the marking objects;
the data label creation module is used for setting label layering rules for the label model, and filtering to form corresponding label crowd layering data according to the set label layering rules;
the desensitized label crowd data display module is used for performing real-time dynamic desensitization in response to a user viewing the label crowd layered data, and displaying to the user the label crowd layered data in which the sensitive fields have been desensitized.
In some possible embodiments, the desensitization algorithm includes one or more of hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization.
In some possible embodiments, the tag entity attribute extraction module is specifically configured to: select, from the tag model, a plurality of fields including a name, an identity card number, a contact address, a mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with epidemic patients, sex, the area the epidemic patient came from and the area where the epidemic patient is currently located, and extract entity attributes of the tag model.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor implements any of the tag population data desensitizing methods described above.
In a fourth aspect, there is provided a computer device characterized in that it comprises:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the methods of desensitizing tag population data described above.
The beneficial technical effects are as follows:
the technical scheme of the embodiment of the invention can improve the use safety of the data and prevent sensitive data from being leaked. When data tags are created and tag populations are screened, the screened population data inevitably carries a risk of sensitive data leakage; the embodiment of the invention therefore configures a desensitization algorithm for real-time data desensitization at the time the data tag is created. The rich set of built-in desensitization algorithms comprises hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization, and this variety allows sensitive data to be protected more comprehensively and securely.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the tag desensitization method of an embodiment of the present invention;
FIG. 2 is a software interface diagram for importing a tag model, as an example, in accordance with an embodiment of the present invention;
FIG. 3 is a software interface diagram for configuring a desensitization object, as an example, in accordance with an embodiment of the present invention;
FIG. 4A is a first interface diagram for selecting a desensitization algorithm, as an example, in accordance with an embodiment of the present invention;
FIG. 4B is a second interface diagram for selecting a desensitization algorithm, as an example, in accordance with an embodiment of the present invention;
FIG. 5 is a schematic diagram of an operation interface for extracting entity attributes as an example in accordance with an embodiment of the present invention;
FIG. 6A is a schematic diagram of an interface for creating a data tag, as an example, in accordance with an embodiment of the present invention;
FIG. 6B is a second interface schematic for creating a data tag, as an example, in accordance with an embodiment of the present invention;
FIG. 7 is an interface diagram of a population of tags after desensitization, as an example, in accordance with an embodiment of the present invention;
FIG. 8 is a functional block diagram of a tag desensitizing apparatus according to an embodiment of the present invention;
FIG. 9 is a functional block diagram of a tag desensitizing computer device according to an embodiment of the invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be noted that, without conflict, the following embodiments and features in the embodiments may be combined with each other; and, based on the embodiments in this disclosure, all other embodiments that may be made by one of ordinary skill in the art without inventive effort are within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
When data tags are created and tag populations are screened, the screened population data inevitably carries a risk of sensitive data leakage; the embodiment of the invention therefore configures a desensitization algorithm for real-time data desensitization at the time the data tag is created. The rich set of built-in desensitization algorithms comprises hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization, and this variety allows sensitive data to be protected more comprehensively and securely.
FIG. 1 is a flow chart of the tag desensitization method of an embodiment of the present invention. As shown in fig. 1, the method includes the following steps:
S110, importing a label model: an entity table is imported from a data source as the marking object, and the entity table corresponds to the label model.
S120, configuring a desensitization object: in the label model, the sensitive fields that require desensitization are selected.
S130, selecting a desensitization algorithm: one or more desensitization algorithms, including hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization, are selected to dynamically desensitize the data corresponding to the sensitive fields.
S140, extracting tag entity attributes: fields are selected from the label model, and entity attributes of the marking object are extracted to serve as the dimensions and basis for marking.
S150, creating a data tag: a label layering rule is set for the label model, and the data are filtered according to the set rule to form the corresponding label crowd layered data.
Specifically, in this step the label layering and classification rules and the scheduling plan can be set, and the model and attributes can be selected and freely combined. The data are then filtered according to the label rules to form the corresponding grouped data.
S160, viewing or displaying the label crowd data after desensitization: when a user views specific label crowd layered data, the system performs real-time dynamic desensitization in response, so that what the user sees is the desensitized label crowd data, that is, the label crowd layered data with the sensitive fields desensitized.
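To make the idea of view-time (dynamic) desensitization in S160 concrete, here is a minimal sketch under stated assumptions: stored rows are left untouched, and a desensitized copy is produced each time rows are viewed. The function names `mask_middle` and `view_rows`, the rule mapping, and the number of characters kept are hypothetical choices, not details given in the patent.

```python
from typing import Callable, Dict, Iterable, List


def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Masking sketch: keep a few leading/trailing characters, star out the rest."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def view_rows(
    rows: Iterable[Dict[str, object]],
    rules: Dict[str, Callable[[str], str]],
) -> List[Dict[str, object]]:
    """Return desensitized copies of the rows; the originals are not modified."""
    shown_rows = []
    for row in rows:
        shown = dict(row)  # shallow copy so stored data stays intact
        for field, algorithm in rules.items():
            if shown.get(field) is not None:
                shown[field] = algorithm(str(shown[field]))
        shown_rows.append(shown)
    return shown_rows
```

Because the masking happens on a copy at read time, the same stored data can be shown with different rules to different viewers; that design choice is an assumption of this sketch.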
In some embodiments, the step of extracting tag entity attributes may specifically include:
selecting, from the tag model, a plurality of fields including a name, an identity card number, a contact address, a mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with epidemic patients, sex, the area the epidemic patient came from and the area where the epidemic patient is currently located, and extracting entity attributes of the tag model.
In some embodiments, the step of creating a data tag specifically includes:
setting basic information, the setting basic information including: setting the label model as a model to be analyzed in response to an input operation; setting an updating mode of the tag model to be manual updating or routine updating in response to an input operation; setting an execution period of creating the data tag to one of minutes, hours, days, months, and years in response to the input operation; responding to input operation of a user, and setting a scheduling strategy of the label model to be executed once every preset time length;
setting a tag layering rule, wherein the setting of the tag layering rule comprises: the crowd is divided into a plurality of tiers, each tier being associated with a plurality of conditions of the configuration.
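The basic information described above (update mode, execution period and scheduling policy) could be captured roughly as follows. This is a sketch under assumptions: the class `TagSchedule` and its field names are hypothetical, and only minute, hour and day periods are handled here because month and year periods need calendar arithmetic that is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Fixed-length periods only; months and years are deliberately not covered.
_UNIT_TO_DELTA = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


@dataclass(frozen=True)
class TagSchedule:
    update_mode: str   # "manual" or "routine"
    period_unit: str   # "minute", "hour" or "day" in this sketch
    every: int         # run once every `every` units

    def next_run(self, last_run: datetime) -> Optional[datetime]:
        """Return the next execution time, or None for manual updates."""
        if self.update_mode == "manual":
            return None
        if self.period_unit not in _UNIT_TO_DELTA:
            raise ValueError(f"unsupported period unit: {self.period_unit}")
        return last_run + self.every * _UNIT_TO_DELTA[self.period_unit]
```

A routine schedule with a unit of minutes and an interval of 15 corresponds to "executed once every 15 minutes".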
In some embodiments, dividing the crowd into a plurality of tiers, each tier being associated with a plurality of configured conditions, specifically includes:
dividing the population into at least two tiers, wherein one tier represents the epidemic population of a first area and another tier represents the epidemic population of a second area;
each tier requires that at least two configured conditions be satisfied simultaneously, the at least two conditions comprising: a first condition that the health condition is equal to abnormal; and a second condition that the area is equal to the first area or the second area.
The following examples are given in detail:
FIG. 2 is a software interface diagram for importing a tag model, as an example, in accordance with an embodiment of the present invention. As shown in fig. 2, in the step of importing a tag model, the user is provided with an import-table interface in which a data source name, such as "epidemic situation demonstration and management", is set and the table name of a data entity, such as "hospitalization information table", is selected.
FIG. 3 is a software interface diagram for configuring a desensitization object, as an example, in accordance with an embodiment of the present invention. As shown in fig. 3, in the interface for editing the tag model, column information is provided to the user, wherein the column names include, but are not limited to: ID, LAT, FULL_NAME, SEX, PHONE, ID_CARD, ate NO, HEALTHY, CONTACT_NAME, CONTACT_PHONE, CONTACT_REMARK. The remark information for the columns includes: primary key, dimension, name, gender, cell phone number, identification card, license plate number, health status, emergency contact phone, emergency contact remark. The data type of each column may be VARCHAR2. The column name "ID" is checked, that is, selected as the primary key, and no other column is selected as the primary key; the column names "ID", "FULL_NAME", "PHONE", "ID_CARD" and "HEALTHY" are selected for list display, and the other column names are not selected for list display; the column names "PHONE" and "ID_CARD" are selected or configured to be desensitized, and the other column names are not.
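The per-column settings described for fig. 3 (primary key, list display, desensitization) map naturally onto a small configuration record. The sketch below is illustrative only: `ColumnConfig` and the helper functions are hypothetical names, and only a subset of the columns named in the text is included.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ColumnConfig:
    name: str
    remark: str
    is_primary_key: bool = False
    show_in_list: bool = False
    desensitize: bool = False


# Subset of the columns from the example, with the flags the text describes.
COLUMNS = [
    ColumnConfig("ID", "primary key", is_primary_key=True, show_in_list=True),
    ColumnConfig("FULL_NAME", "name", show_in_list=True),
    ColumnConfig("SEX", "gender"),
    ColumnConfig("PHONE", "cell phone number", show_in_list=True, desensitize=True),
    ColumnConfig("ID_CARD", "identification card", show_in_list=True, desensitize=True),
    ColumnConfig("HEALTHY", "health status", show_in_list=True),
]


def list_columns(columns: List[ColumnConfig]) -> List[str]:
    """Names of the columns shown in the list view."""
    return [c.name for c in columns if c.show_in_list]


def sensitive_columns(columns: List[ColumnConfig]) -> List[str]:
    """Names of the columns configured for desensitization."""
    return [c.name for c in columns if c.desensitize]


def has_single_primary_key(columns: List[ColumnConfig]) -> bool:
    """Sanity check assumed by this sketch: exactly one primary-key column."""
    return sum(c.is_primary_key for c in columns) == 1
```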
FIG. 4A is a first interface diagram for selecting a desensitization algorithm, as an example, in accordance with an embodiment of the invention. As shown in fig. 4A, a plurality of desensitization algorithms are configured in the drop-down box of the desensitization algorithm, including: hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization. In one example, when a desensitization algorithm is selected or configured, a subordinate algorithm, for example SHA-256, is selected under it. FIG. 4B is a second interface diagram for selecting a desensitization algorithm, as an example, in accordance with an embodiment of the invention. As shown in fig. 4B, in the algorithm classification tree displayed on the left side of the user interface, the hash desensitization algorithm may specifically include: MD5, SHA-1, SHA-256 and HMAC; the transformation desensitization algorithm may specifically include: number rounding, date rounding and character shifting; the encryption desensitization algorithm may specifically include: the DES algorithm, the 3DES algorithm, the AES algorithm and the Chinese national cryptographic SM1 algorithm; the random desensitization algorithm may specifically include: shuffled rearrangement and random selection. When any specific algorithm in the left column is selected, algorithm information, parameter configuration and an algorithm test section are displayed in the right interface. For example, when MD5 is selected, the right interface shows that the algorithm name is MD5, the algorithm type is hash desensitization, the reversibility setting is irreversible, and the enabled setting is enabled.
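A few of the algorithm families in the classification tree can be sketched with the Python standard library alone. This is an illustrative sketch, not the product's implementation: every function name below is hypothetical, the rounding granularity and shift offset are assumed parameters, and the encryption family (DES, 3DES, AES, SM1) is omitted because the standard library ships no block ciphers.

```python
import hashlib
import hmac
import random
from datetime import date

DIGITS = "0123456789"


def hash_value(value: str, algorithm: str = "sha256") -> str:
    """Hash desensitization: irreversible digest (e.g. "md5", "sha1", "sha256")."""
    return hashlib.new(algorithm, value.encode("utf-8")).hexdigest()


def hmac_value(value: str, key: bytes) -> str:
    """Keyed hash desensitization using HMAC-SHA-256."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def round_down_number(value: int, step: int = 10) -> int:
    """Transformation: round a number down to a multiple of `step`."""
    return (value // step) * step


def round_date_to_month(value: date) -> date:
    """Transformation: coarsen a date to the first day of its month."""
    return value.replace(day=1)


def shift_digits(value: str, offset: int = 1) -> str:
    """Transformation: shift each ASCII digit by `offset`, wrapping around at 10."""
    return "".join(
        DIGITS[(DIGITS.index(ch) + offset) % 10] if ch in DIGITS else ch
        for ch in value
    )


def shuffle_characters(value: str, rng: random.Random) -> str:
    """Random desensitization: rearrange the characters of the value."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)
```

The hash and HMAC functions are irreversible, which matches the "irreversible" setting shown for MD5, whereas a digit shift with a known offset can be undone.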
FIG. 5 is a schematic diagram of an operation interface for extracting entity attributes, as an example, in accordance with an embodiment of the present invention. As shown in fig. 5, in the interface for adding entity attributes, the table name is set to the Shenzhen epidemic entry information table, and the column information includes the following fields: column name (attribute name), display name, data type, unit format and dictionary information. The selected column names include: "xm", "sfzhm", "lxdz", "sjhm", "zz", "jchz" and "sex"; the corresponding display names are, respectively: name, identification number, contact address, phone number, symptom (0 is asymptomatic, 1 is symptomatic), indication of contact with patients (0 is no contact, 1 is contact), and sex.
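The column-to-display-name mapping and the 0/1 dictionary information from this example can be expressed as a lookup table. The sketch below is an assumption-laden illustration: `Attribute`, `ENTITY_ATTRIBUTES` and `to_display` are hypothetical names, and unknown codes are simply passed through unchanged.

```python
from typing import Dict, NamedTuple, Optional


class Attribute(NamedTuple):
    display_name: str
    dictionary: Optional[Dict[str, str]] = None  # code -> human-readable meaning


ENTITY_ATTRIBUTES: Dict[str, Attribute] = {
    "xm": Attribute("name"),
    "sfzhm": Attribute("identification number"),
    "lxdz": Attribute("contact address"),
    "sjhm": Attribute("phone number"),
    "zz": Attribute("symptom", {"0": "asymptomatic", "1": "symptomatic"}),
    "jchz": Attribute("contact with patient", {"0": "no contact", "1": "contact"}),
    "sex": Attribute("sex"),
}


def to_display(record: Dict[str, object]) -> Dict[str, str]:
    """Rename columns to display names and decode dictionary-coded values."""
    shown: Dict[str, str] = {}
    for column, attribute in ENTITY_ATTRIBUTES.items():
        if column not in record:
            continue
        raw = str(record[column])
        if attribute.dictionary is not None:
            raw = attribute.dictionary.get(raw, raw)  # unknown codes pass through
        shown[attribute.display_name] = raw
    return shown
```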
FIG. 6A is a first interface schematic for creating a data tag, as an example, in accordance with an embodiment of the present invention; FIG. 6B is a second interface schematic for creating a data tag, as an example, in accordance with an embodiment of the present invention. As shown in fig. 6A, in the software interface for creating a custom tag, the first step is to set basic information, which mainly includes: setting a tag name, for example, "epidemic tag"; setting a model, for example, the Shenzhen epidemic entry information table; configuring the group as "Shenzhen epidemic"; configuring the update mode as routine update; configuring the execution period as one of minutes, hours, days and months, minutes being selected in this example; and configuring the scheduling policy to execute once every 15 minutes. As shown in fig. 6B, in the software interface for creating a custom tag, the second step is to set the tag rules. In the interface for setting tag rules, the users among all users who satisfy the configured conditions are divided into two tiers, for example a Wuhan epidemic population and a Beijing epidemic population; the number of tiers is not limited to two, and tiers may be added as needed. By way of example, when the health condition is equal to abnormal and the source area is equal to Wuhan, the people satisfying both conditions fall into the Wuhan tier. The system matches users against the custom tiers in order, so a user who satisfies several tiers is assigned to the earliest one. FIG. 7 is a schematic diagram of an interface for viewing a tag population after desensitization, as an example, in accordance with an embodiment of the present invention. As shown in fig. 7, in the display interface of the sample data table, the information in the two boxed columns "PHONE" and "ID_CARD" has been desensitized and is displayed as a run of asterisks, so that key information is kept confidential and hidden.
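The ordered matching described above, where a user who satisfies several tiers lands in the earliest one, amounts to a first-match-wins loop. The following sketch is illustrative only; `assign_tier`, `group_by_tier`, the tier names and the field names are hypothetical and not taken from the patent.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
TierRule = Tuple[str, Callable[[Record], bool]]  # (tier name, predicate)


def assign_tier(record: Record, tiers: List[TierRule]) -> Optional[str]:
    """Return the name of the first tier whose predicate matches, else None."""
    for name, predicate in tiers:
        if predicate(record):
            return name
    return None


def group_by_tier(records: List[Record], tiers: List[TierRule]) -> Dict[str, List[Record]]:
    """Place each record in at most one tier, honouring the tier order."""
    groups: Dict[str, List[Record]] = {name: [] for name, _ in tiers}
    for record in records:
        name = assign_tier(record, tiers)
        if name is not None:
            groups[name].append(record)
    return groups
```

Reordering the tier list changes which tier an overlapping user ends up in, which is why the order of the custom tiers matters.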
Example II
Fig. 8 is a functional block diagram of a tag desensitizing apparatus according to an embodiment of the present invention. As shown in fig. 8, the apparatus 200 includes:
a tag model importing module 210, configured to import an entity table from a data source as a marking object, where the entity table corresponds to a tag model;
a desensitization object configuration module 220, configured to select a sensitive field that needs to be desensitized in the tag model;
a desensitization algorithm selection module 230, configured to select one or more desensitization algorithms to dynamically desensitize data corresponding to the sensitive fields;
a tag entity attribute extraction module 240, configured to select a plurality of fields from the tag model, and extract entity attributes of the marking object;
the data tag creation module 250 is configured to set tag layering rules for the tag model, and filter and form corresponding tag crowd layering data according to the set tag layering rules;
the desensitized label crowd data display module 260 is configured to perform real-time dynamic desensitization in response to a user viewing label crowd layered data, and display the label crowd layered data after desensitizing the sensitive field to the user.
In some embodiments, the desensitization algorithm includes one or more of hash desensitization, masking desensitization, replacement desensitization, transformation desensitization, encryption desensitization and random desensitization.
In some embodiments, the tag entity attribute extraction module 240 is specifically configured to: select, from the tag model, a plurality of fields including a name, an identity card number, a contact address, a mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with epidemic patients, sex, the area the epidemic patient came from and the area where the epidemic patient is currently located, and extract entity attributes of the tag model.
In some embodiments, the data tag creation module 250 may be specifically configured to:
setting basic information, the setting basic information including: setting the label model as a model to be analyzed in response to an input operation; setting an updating mode of the tag model to be manual updating or routine updating in response to an input operation; setting an execution period of creating the data tag to one of minutes, hours, days, months, and years in response to the input operation; responding to input operation of a user, and setting a scheduling strategy of the label model to be executed once every preset time length;
setting a tag layering rule, wherein the setting of the tag layering rule comprises: the crowd is divided into a plurality of tiers, each tier being associated with a plurality of conditions of the configuration.
In some embodiments, dividing the crowd into a plurality of tiers, each tier being associated with a plurality of configured conditions, may specifically include:
dividing the population into at least two tiers, wherein one tier represents the epidemic population of a first area and another tier represents the epidemic population of a second area;
each tier requires that at least two configured conditions be satisfied simultaneously, the at least two conditions comprising: a first condition that the health condition is equal to abnormal; and a second condition that the area is equal to the first area or the second area.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Example III
An embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements any of the label crowd data desensitization methods described above.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods in the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, may implement the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. Other forms of readable storage medium, such as quantum memory and graphene memory, are of course also possible. It should be noted that the content covered by the computer-readable medium may be expanded or restricted as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
Example IV
An embodiment of the invention also provides a computer device, as shown in fig. 9, which comprises one or more processors 301, a communication interface 302, a memory 303 and a communication bus 304, wherein the processors 301, the communication interface 302 and the memory 303 communicate with one another through the communication bus 304.
The memory 303 is configured to store a computer program;
the processor 301 is configured to implement the following steps when executing the program stored in the memory 303:
importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
configuring a desensitization object: selecting sensitive fields needing desensitization in the label model;
selecting a desensitization algorithm: selecting one or more desensitization algorithms to dynamically desensitize data corresponding to sensitive fields;
extracting tag entity attributes: selecting a plurality of fields from the label model, and extracting entity attributes of the marking object;
creating a data tag: setting a label layering rule for the label model, and filtering to form corresponding label crowd layering data according to the set label layering rule;
displaying label crowd data after desensitization: in response to the user viewing the label crowd layered data, performing real-time dynamic desensitization, and displaying to the user the label crowd layered data in which the sensitive fields have been desensitized.
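The last step above applies desensitization at viewing time rather than at storage time. A minimal sketch of that idea follows, under stated assumptions: the function names `mask_middle` and `view_records`, the sample row, and the choice of a masking rule for the mobile number are hypothetical and serve only to illustrate that the stored rows stay unchanged while the displayed copies are desensitized.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Desensitizer = Callable[[str], str]


def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Mask desensitization: keep the head and tail, star out the middle."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def view_records(stored: List[Record], rules: Dict[str, Desensitizer]) -> List[Record]:
    """Return desensitized copies for display; the stored rows are not modified."""
    return [
        {key: rules[key](val) if key in rules else val for key, val in row.items()}
        for row in stored
    ]


# Hypothetical stored layer data and sensitive-field configuration.
STORED = [{"name": "Zhang San", "mobile_number": "13812345678", "region": "region_1"}]
RULES = {"mobile_number": mask_middle}

shown = view_records(STORED, RULES)
```

Because `view_records` builds new dictionaries, a different rule set can be applied on the next view without touching the underlying data, which is the practical difference between dynamic and static desensitization.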
In some embodiments, in the processing of the processor 301, the desensitization algorithm includes one or more of hash desensitization, mask desensitization, substitution desensitization, transform desensitization, encryption desensitization, and random desensitization.
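The patent names these algorithm families without specifying them, so the following is only one plausible reading of each, written with the Python standard library. All function names, the salt, the age-bucket transform and the substitution table are assumptions. Encryption desensitization is left out on purpose, since the standard library provides no symmetric cipher and a home-made one would be misleading.

```python
import hashlib
import secrets
import string
from typing import Dict


def hash_desensitize(value: str, salt: str = "demo-salt") -> str:
    """Hash desensitization: replace the value with a salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def mask_desensitize(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask desensitization: keep only the last `visible` characters."""
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def substitute_desensitize(value: str, table: Dict[str, str], default: str = "UNKNOWN") -> str:
    """Substitution desensitization: map the value through a lookup table."""
    return table.get(value, default)


def transform_desensitize(age: int, bucket: int = 10) -> str:
    """Transform desensitization (one possible form): generalize an age to a range."""
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"


def random_desensitize(value: str) -> str:
    """Random desensitization: swap digits and letters for random ones, keep the shape."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(secrets.choice(string.digits))
        elif ch.isalpha():
            out.append(secrets.choice(string.ascii_letters))
        else:
            out.append(ch)
    return "".join(out)
```

Hashing and substitution are deterministic, so the same input always yields the same output and joins across tables still work; random desensitization gives that up in exchange for stronger unlinkability.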
In some embodiments, in the processing of the processor 301, the step of extracting tag entity attributes may specifically include:
selecting, from the tag model, a plurality of fields including name, identity card number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the epidemic patient's source region, and the region where the epidemic patient is currently located, and extracting the entity attributes of the tag model.
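As a rough sketch, the field selection above amounts to projecting each row of the entity table onto a fixed list of columns. The English column names below are hypothetical stand-ins for the nine fields listed, and `extract_entity_attributes` is an illustrative helper, not a function defined by the patent.

```python
from typing import Dict, Iterable, List, Optional, Sequence

# Hypothetical column names for the nine fields named in the text.
SELECTED_FIELDS = (
    "name",
    "id_card_number",
    "contact_address",
    "mobile_number",
    "has_symptoms",
    "contacted_patient",
    "sex",
    "source_region",
    "current_region",
)


def extract_entity_attributes(
    rows: Iterable[Dict[str, str]],
    fields: Sequence[str] = SELECTED_FIELDS,
) -> List[Dict[str, Optional[str]]]:
    """Keep only the selected fields; a field missing from a row becomes None."""
    return [{f: row.get(f) for f in fields} for row in rows]


# Hypothetical entity-table row with one column that is not selected.
sample = [{"name": "A", "sex": "F", "current_region": "region_1", "internal_id": "42"}]
attributes = extract_entity_attributes(sample)
```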
In some embodiments, in the processing of the processor 301, the step of creating a data tag may specifically include:
setting basic information, which includes: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the tag model to manual updating or routine updating; in response to an input operation, setting the execution period for creating the data tag to one of minutes, hours, days, months, and years; and in response to an input operation of a user, setting the scheduling policy of the label model to execute once every preset time length;
setting a tag layering rule, wherein setting the tag layering rule comprises: dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions.
In some embodiments, in the processing of the processor 301, dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions, may specifically include:
dividing the crowd into at least two layers, one of which represents the epidemic population of a first region and another of which represents the epidemic population of a second region;
associating each layer with at least two configured conditions that must be satisfied simultaneously, the at least two conditions comprising: a first condition that the health condition equals abnormal; and a second condition that the region equals the first region or the second region.
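The basic information set in the data tag creation step (model to analyze, update mode, execution period, and a policy of running once every preset time length) could be captured in a small configuration object. The sketch below is an assumption about how those settings might fit together: `TagJobConfig`, its attribute names, and the interpretation of the schedule as "every N units" are hypothetical.

```python
import calendar
from dataclasses import dataclass
from datetime import datetime, timedelta

UPDATE_MODES = ("manual", "routine")
FIXED_UNITS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}
CALENDAR_UNITS = {"month": 1, "year": 12}  # length of the unit in months


@dataclass(frozen=True)
class TagJobConfig:
    """Basic information for creating a data tag (hypothetical structure)."""
    model: str          # label model to be analyzed
    update_mode: str    # "manual" or "routine"
    period_unit: str    # "minute", "hour", "day", "month" or "year"
    every: int = 1      # run once every `every` units

    def __post_init__(self) -> None:
        if self.update_mode not in UPDATE_MODES:
            raise ValueError(f"unknown update mode: {self.update_mode}")
        if self.period_unit not in FIXED_UNITS and self.period_unit not in CALENDAR_UNITS:
            raise ValueError(f"unknown period unit: {self.period_unit}")
        if self.every < 1:
            raise ValueError("every must be at least 1")

    def next_run(self, last_run: datetime) -> datetime:
        """Compute the next execution time after `last_run`."""
        if self.period_unit in FIXED_UNITS:
            return last_run + FIXED_UNITS[self.period_unit] * self.every
        # Months and years vary in length, so step through the calendar
        # and clamp the day to the last valid day of the target month.
        months = CALENDAR_UNITS[self.period_unit] * self.every
        index = last_run.month - 1 + months
        year = last_run.year + index // 12
        month = index % 12 + 1
        day = min(last_run.day, calendar.monthrange(year, month)[1])
        return last_run.replace(year=year, month=month, day=day)
```

For instance, a routine monthly job last run on 31 January would next run on the last day of February, because the day is clamped to the length of the target month.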
The communication bus of the electronic device mentioned above may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single bold line in the figure, but this does not mean that there is only one bus or only one type of bus. The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto; any change or substitution readily conceivable by those skilled in the art within the scope of the present invention shall be covered by the present invention. Therefore, the protection scope of the invention shall be subject to the protection scope of the claims.

Claims (6)

1. A method of desensitizing tag population data, comprising:
importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
configuring a desensitization object: selecting sensitive fields needing desensitization in the label model;
selecting a desensitization algorithm: selecting one or more desensitization algorithms to dynamically desensitize data corresponding to sensitive fields;
extracting tag entity attributes: selecting a plurality of fields from the label model, and extracting entity attributes of the marking object;
creating a data tag: setting a label layering rule for the label model, and filtering to form corresponding label crowd layering data according to the set label layering rule;
displaying label crowd data after desensitization: in response to the user checking the label crowd layered data, performing real-time dynamic desensitization, and displaying the label crowd layered data subjected to desensitization processing on the sensitive field to the user;
the creating of the data tag specifically includes:
setting basic information, which includes: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the tag model to manual updating or routine updating; in response to an input operation, setting the execution period for creating the data tag to one of minutes, hours, days, months, and years; and in response to an input operation of a user, setting the scheduling policy of the label model to execute once every preset time length;
setting a tag layering rule, wherein setting the tag layering rule comprises: dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions;
the dividing of the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions, specifically comprises:
dividing the crowd into at least two layers, one of which represents the epidemic population of a first region and another of which represents the epidemic population of a second region;
associating each layer with at least two configured conditions that must be satisfied simultaneously, the at least two conditions comprising: a first condition that the health condition equals abnormal; and a second condition that the region equals the first region or the second region;
the step of extracting the tag entity attributes specifically includes:
selecting, from the tag model, a plurality of fields including name, identity card number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the epidemic patient's source region, and the region where the epidemic patient is currently located, and extracting the entity attributes of the tag model.
2. The method of claim 1, wherein the desensitization algorithm comprises one or more of hash desensitization, mask desensitization, substitution desensitization, transform desensitization, encryption desensitization, and random desensitization.
3. A device for desensitizing tag population data, comprising:
the label model importing module is used for importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
the desensitization object configuration module is used for selecting sensitive fields needing desensitization in the label model;
the desensitization algorithm selection module is used for selecting one or more desensitization algorithms to dynamically desensitize the data corresponding to the sensitive fields;
the label entity attribute extraction module is used for selecting a plurality of fields from the label model and extracting entity attributes of the marking objects;
the data label creation module is used for setting label layering rules for the label model, and filtering to form corresponding label crowd layering data according to the set label layering rules;
the desensitized label crowd data display module is used for performing real-time dynamic desensitization in response to a user viewing the label crowd layered data, and displaying to the user the label crowd layered data in which the sensitive fields have been desensitized;
the data tag creation module is specifically configured to perform:
setting basic information, which includes: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the tag model to manual updating or routine updating; in response to an input operation, setting the execution period for creating the data tag to one of minutes, hours, days, months, and years; and in response to an input operation of a user, setting the scheduling policy of the label model to execute once every preset time length;
setting a tag layering rule, wherein setting the tag layering rule comprises: dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions;
the dividing of the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions, specifically comprises:
dividing the crowd into at least two layers, one of which represents the epidemic population of a first region and another of which represents the epidemic population of a second region;
associating each layer with at least two configured conditions that must be satisfied simultaneously, the at least two conditions comprising: a first condition that the health condition equals abnormal; and a second condition that the region equals the first region or the second region;
the label entity attribute extraction module is specifically configured to: select, from the tag model, a plurality of fields including name, identity card number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the epidemic patient's source region, and the region where the epidemic patient is currently located, and extract the entity attributes of the tag model.
4. The apparatus of claim 3, wherein the desensitization algorithm comprises one or more of hash desensitization, mask desensitization, substitution desensitization, transform desensitization, encryption desensitization, and random desensitization.
5. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a method of desensitizing tag population data according to claim 1 or 2.
6. A computer device, comprising:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of desensitizing tag population data of claim 1 or 2.
CN202011613978.7A 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment Active CN112632618B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011613978.7A CN112632618B (en) 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011613978.7A CN112632618B (en) 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment

Publications (2)

Publication Number Publication Date
CN112632618A CN112632618A (en) 2021-04-09
CN112632618B true CN112632618B (en) 2024-04-16

Family

ID=75286965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011613978.7A Active CN112632618B (en) 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment

Country Status (1)

Country Link
CN (1) CN112632618B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409121A (en) * 2018-09-07 2019-03-01 阿里巴巴集团控股有限公司 Desensitization process method, apparatus and server
CN110399733A (en) * 2019-03-18 2019-11-01 国网安徽省电力有限公司黄山供电公司 A kind of desensitization platform for structural data
CN111159763A (en) * 2019-12-26 2020-05-15 银江股份有限公司 System and method for analyzing portrait of law-related personnel group
CN111198948A (en) * 2020-01-08 2020-05-26 深圳前海微众银行股份有限公司 Text classification correction method, device and equipment and computer readable storage medium
CN111243748A (en) * 2019-12-30 2020-06-05 湖南中医药大学 Needle pushing health data standardization system
WO2020113582A1 (en) * 2018-12-07 2020-06-11 Microsoft Technology Licensing, Llc Providing images with privacy label
CN111666587A (en) * 2020-05-10 2020-09-15 武汉理工大学 Food data multi-attribute feature joint desensitization method and device based on supervised learning
CN111709052A (en) * 2020-06-01 2020-09-25 支付宝(杭州)信息技术有限公司 Private data identification and processing method, device, equipment and readable medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11062043B2 (en) * 2019-05-01 2021-07-13 Optum, Inc. Database entity sensitivity classification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
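Z. Wang; M. Song; Z. Zhang. Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning. IEEE Conference on Computer Communications. 2019-05-02, 2512-2520. *
Research and Implementation of a Data Desensitization System That Preserves the Statistical Characteristics of Sensitive Data; 孟雪; Information Science and Technology; 2020-02-15 (No. 2); 30-43 *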

Also Published As

Publication number Publication date
CN112632618A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
US10817621B2 (en) Anonymization processing device, anonymization processing method, and program
CA2906475C (en) Method and apparatus for substitution scheme for anonymizing personally identifiable information
CN111079174A (en) Power consumption data desensitization method and system based on anonymization and differential privacy technology
US20200074104A1 (en) Controlling access to data in a database based on density of sensitive data in the database
Fahrenkrog-Petersen et al. PRIPEL: privacy-preserving event log publishing including contextual information
CN109829327A (en) Sensitive information processing method, device, electronic equipment and storage medium
Caruccio et al. GDPR compliant information confidentiality preservation in big data processing
US20220222374A1 (en) Data protection
CN110289059A (en) Medical data processing method, device, storage medium and electronic equipment
CN112417443A (en) Database protection method and device, firewall and computer readable storage medium
CN113158233B (en) Data preprocessing method and device and computer storage medium
CN112765673A (en) Sensitive data statistical method and related device
Ali et al. A classification module in data masking framework for business intelligence platform in healthcare
CN112395630A (en) Data encryption method and device based on information security, terminal equipment and medium
EP3973429A1 (en) Compatible anonymization of data sets of different sources
CN111522859A (en) Alarm analysis method and device, computer equipment and storage medium
US20190278943A1 (en) Computer system of computer servers and dedicated computer clients specially programmed to generate synthetic non-reversible electronic data records based on real-time electronic querying and methods of use thereof
CN112632618B (en) Desensitization method and device for label crowd data and computer equipment
CN116910023A (en) Data management system
CN116663026A (en) Block chain-based data processing method and device, electronic equipment and medium
CN114925033A (en) Information uplink method, device, system and storage medium
CN112632607A (en) Data processing method, device and equipment
Portillo-Dominguez et al. Towards an efficient log data protection in software systems through data minimization and anonymization
US20230214518A1 (en) Information security systems and methods for early change detection and data protection
Knackmuss et al. Investigation of security relevant aspects of android ehealthapps: permissions, storage properties and data transmission

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 518000 2203/2204, Building 1, Huide Building, Beizhan Community, Minzhi Street, Longhua District, Shenzhen, Guangdong

Applicant after: SHENZHEN AUDAQUE DATA TECHNOLOGY Ltd.

Address before: 713, 7th floor, software building, No.9, Gaoxin Zhongyi Road, Nanshan District, Shenzhen, Guangdong 518000

Applicant before: SHENZHEN AUDAQUE DATA TECHNOLOGY Ltd.

Country or region before: China

GR01 Patent grant