CN112632618A - Desensitization method and device for tag crowd data and computer equipment - Google Patents

Info

Publication number
CN112632618A
Authority
CN
China
Prior art keywords
desensitization
label
data
tag
crowd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011613978.7A
Other languages
Chinese (zh)
Other versions
CN112632618B (en)
Inventor
秦思哲
陈瑶
龚健
贾西贝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huaao Data Technology Co Ltd
Original Assignee
Shenzhen Huaao Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huaao Data Technology Co Ltd filed Critical Shenzhen Huaao Data Technology Co Ltd
Priority to CN202011613978.7A priority Critical patent/CN112632618B/en
Publication of CN112632618A publication Critical patent/CN112632618A/en
Application granted granted Critical
Publication of CN112632618B publication Critical patent/CN112632618B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2107 File encryption

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a desensitization method, a desensitization device, and computer equipment for tag crowd data. The method comprises the following steps: importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model; configuring a desensitization object: selecting, in the label model, a sensitive field that needs desensitization; selecting a desensitization algorithm: selecting one or more desensitization algorithms to perform dynamic desensitization on the data corresponding to the sensitive field; extracting tag entity attributes: selecting a plurality of fields from the label model and extracting the entity attributes of the marking object; creating a data tag: setting label layering rules for the label model and filtering according to the set rules to form corresponding tag crowd hierarchical data; and displaying the desensitized tag crowd data: in response to a user viewing the tag crowd hierarchical data, performing real-time dynamic desensitization and displaying to the user the tag crowd hierarchical data with the sensitive field desensitized. The method can prevent sensitive data from leaking.

Description

Desensitization method and device for tag crowd data and computer equipment
Technical Field
The invention relates to the field of data management, in particular to a desensitization method and device for tag crowd data and computer equipment.
Background
Government affairs information systems and government data warehouse projects continuously collect and integrate data from many departments. Because the data are obtained at different times, departmental data are divided into historical data, current-date data, and future data. Unless data quality is safeguarded beforehand, monitored during processing, and remediated afterwards, neither the reliability of the data nor the safety of its use can be ensured.
Meanwhile, each government department holds massive amounts of data. If a project does not discover sensitive data and assess risk, classify the data and define effective policies, or perform real-time control, monitoring, reporting, auditing, and similar operations, the safety of data use cannot be guaranteed.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus and a computer device for desensitizing tag population data, which at least partially solve the problems in the prior art.
To achieve the above object, in a first aspect, there is provided a method for desensitizing tag population data, comprising:
importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
configuring a desensitization object: selecting, in the label model, a sensitive field that needs desensitization;
selecting a desensitization algorithm: selecting one or more desensitization algorithms to perform dynamic desensitization on the data corresponding to the sensitive field;
extracting tag entity attributes: selecting a plurality of fields from the label model and extracting the entity attributes of the marking object;
creating a data tag: setting label layering rules for the label model and filtering according to the set rules to form corresponding tag crowd hierarchical data;
displaying the desensitized tag crowd data: when the user views the tag crowd hierarchical data, performing real-time dynamic desensitization and displaying to the user the tag crowd hierarchical data with the sensitive field desensitized.
In some possible embodiments, the desensitization algorithm includes one or more of hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization.
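As a minimal sketch of two of the listed categories, mask desensitization and replacement desensitization might look as follows in Python. The function names, the number of characters kept, and the placeholder text are illustrative assumptions and are not taken from the patent.

```python
def mask_value(value: str, keep_head: int = 3, keep_tail: int = 4, fill: str = "*") -> str:
    """Mask desensitization: hide the middle of a string with a fill character."""
    if len(value) <= keep_head + keep_tail:
        # Too short to keep any characters safely: mask the whole value.
        return fill * len(value)
    hidden = len(value) - keep_head - keep_tail
    # Slice from an explicit index so that keep_tail == 0 yields an empty tail.
    tail = value[len(value) - keep_tail:]
    return value[:keep_head] + fill * hidden + tail


def replace_value(value: str, placeholder: str = "[REDACTED]") -> str:
    """Replacement desensitization: substitute a fixed placeholder for the value."""
    return placeholder


print(mask_value("13812345678"))
print(replace_value("Zhang San"))
```

Masking keeps part of the value readable for operators, whereas replacement discards it entirely; which one suits a field depends on how much of the original must remain recognizable.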
In some possible embodiments, the step of extracting the tag entity attribute may specifically include:
selecting from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extracting entity attributes of the label model.
In some possible embodiments, the step of creating the data tag may specifically include:
setting basic information, which comprises: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the label model to manual update or routine update; in response to an input operation, setting the execution cycle for creating the data tag to one of minute, hour, day, month, and year; and, in response to an input operation by the user, setting the scheduling policy of the label model to execute once every preset time length;
setting a label layering rule, which comprises: dividing the population into a plurality of tiers, each tier being associated with a plurality of configured conditions.
In some possible embodiments, the dividing the crowd into a plurality of tiers, where each tier is associated with a plurality of configured conditions may specifically include:
dividing the population into at least two tiers, wherein one tier represents the epidemic population of a first region and another tier represents the epidemic population of a second region;
each tier is associated with at least two configured conditions that must be satisfied simultaneously, the at least two conditions including: a first condition that the health status equals abnormal; and a second condition that the source area equals the first region or the second region.
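The basic information settings described above (model, update mode, execution cycle, and interval) can be sketched as a small validated configuration object. The class name, field names, and the approximation of a month as 30 days and a year as 365 days are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import timedelta

# Seconds per execution-cycle unit. Month and year are approximated (assumption).
UNIT_SECONDS = {
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "month": 30 * 86400,
    "year": 365 * 86400,
}


@dataclass(frozen=True)
class TagBasicInfo:
    model: str        # label model to be analyzed
    update_mode: str  # "manual" or "routine"
    cycle_unit: str   # one of the keys of UNIT_SECONDS
    every: int        # execute once every `every` units

    def __post_init__(self) -> None:
        if self.update_mode not in ("manual", "routine"):
            raise ValueError(f"unknown update mode: {self.update_mode}")
        if self.cycle_unit not in UNIT_SECONDS:
            raise ValueError(f"unknown cycle unit: {self.cycle_unit}")
        if self.every <= 0:
            raise ValueError("interval must be positive")

    def interval(self) -> timedelta:
        """Return the preset time length between two executions."""
        return timedelta(seconds=self.every * UNIT_SECONDS[self.cycle_unit])


info = TagBasicInfo(model="epidemic_model", update_mode="routine",
                    cycle_unit="minute", every=15)
print(info.interval())
```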
In a second aspect, there is provided a tag population data desensitizing apparatus comprising:
the label model importing module is used for importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
a desensitization object configuration module used for selecting a sensitive field needing desensitization in the label model;
the desensitization algorithm selection module is used for selecting one or more desensitization algorithms to perform dynamic desensitization on data corresponding to the sensitive fields;
the label entity attribute extraction module is used for selecting a plurality of fields from the label model and extracting entity attributes of the marking object;
the data label creating module is used for setting label layering rules for the label model and filtering to form corresponding label crowd layering data according to the set label layering rules;
and the desensitized tag crowd data display module is used for responding to the situation that a user views the tag crowd hierarchical data, carrying out real-time dynamic desensitization and displaying the tag crowd hierarchical data subjected to desensitization treatment on the sensitive field to the user.
In some possible embodiments, the desensitization algorithm includes one or more of hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization.
In some possible embodiments, the tag entity attribute extraction module is specifically configured to: select from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extract entity attributes of the label model.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the above described methods of desensitizing tag population data.
In a fourth aspect, there is provided a computer device comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the methods of desensitizing tag population data as described above.
The beneficial technical effects are as follows:
The technical solution of the embodiments of the invention can improve the safety of data use and prevent sensitive data from leaking. When a data tag is created and a tag population is screened, the screened population data inevitably carries a risk of sensitive data leakage; a desensitization algorithm is therefore configured while the data tag is created, and the data is desensitized in real time. The rich set of built-in desensitization algorithms comprises hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization, so that sensitive data can be protected more comprehensively and safely.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow diagram of the tag desensitization method of an embodiment of the present invention;
FIG. 2 is a diagram of a software interface for an exemplary import tag model according to an embodiment of the present invention;
FIG. 3 is a diagram of a software interface for setting a desensitization object, as an example, according to an embodiment of the present invention;
FIG. 4A is a first interface diagram of an exemplary selection desensitization algorithm according to embodiments of the present invention;
FIG. 4B is a second interface diagram of an exemplary selection desensitization algorithm according to embodiments of the present invention;
FIG. 5 is a schematic diagram of an exemplary operation interface for extracting entity attributes according to an embodiment of the present invention;
FIG. 6A is a schematic diagram of an exemplary first interface for creating a data tag according to an embodiment of the present invention;
FIG. 6B is a schematic diagram of an exemplary second interface for creating a data tag according to an embodiment of the present invention;
FIG. 7 is an interface diagram of a tag population after viewing desensitization, as an example, according to an embodiment of the present invention;
FIG. 8 is a functional block diagram of a tag desensitization apparatus of an embodiment of the present invention;
FIG. 9 is a functional block diagram of a computer device for tag desensitization according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be noted that, in the case of no conflict, the features in the following embodiments and examples may be combined with each other; moreover, all other embodiments that can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort fall within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
When a data tag is created and a tag population is screened, the screened population data inevitably carries a risk of sensitive data leakage; a desensitization algorithm is therefore configured while the data tag is created, and the data is desensitized in real time. The rich set of built-in desensitization algorithms comprises hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization, so that sensitive data can be protected more comprehensively and safely.
FIG. 1 is a flow diagram of the tag desensitization method of an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
S110, importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to the label model.
S120, configuring a desensitization object: selecting, in the label model, the sensitive fields that require desensitization.
S130, selecting a desensitization algorithm: selecting one or more desensitization algorithms from hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization to perform dynamic desensitization on the data corresponding to the sensitive fields.
S140, extracting tag entity attributes: selecting fields from the label model and extracting the entity attributes of the marking object as the dimensions and basis for marking.
S150, creating a data tag: setting label layering rules for the label model and filtering according to the set rules to form corresponding tag crowd hierarchical data.
Specifically, in this step the label hierarchical classification rules and the scheduling plan may be set, and models and attributes may be selected and freely combined; the data is then filtered according to the label rules to form the corresponding grouped data.
S160, viewing or displaying the desensitized tag crowd data: when a user views specific tag crowd hierarchical data, the system performs real-time dynamic desensitization in response and displays to the user the tag crowd hierarchical data with the sensitive fields desensitized.
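Steps S110 to S160 can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the sample rows are invented, SHA-256 is chosen arbitrarily as the algorithm for the sensitive field, and the names `SENSITIVE`, `create_tag`, and `view` are hypothetical. The point it shows is that stored rows stay unchanged and desensitization is applied only when the data is viewed.

```python
import hashlib
from typing import Callable, Dict, List

Row = Dict[str, str]

# S110: an entity table imported from a data source (sample rows are invented).
entity_table: List[Row] = [
    {"ID": "1", "FULL_NAME": "Zhang San", "PHONE": "13800000001", "HEALTHY": "abnormal"},
    {"ID": "2", "FULL_NAME": "Li Si", "PHONE": "13800000002", "HEALTHY": "normal"},
]

# S120 + S130: each sensitive field is paired with the algorithm chosen for it.
SENSITIVE: Dict[str, Callable[[str], str]] = {
    "PHONE": lambda v: hashlib.sha256(v.encode("utf-8")).hexdigest(),
}

# S140: the field used as the tagging dimension.
TAG_ATTRIBUTE = "HEALTHY"


def create_tag(rows: List[Row]) -> List[Row]:
    """S150: filter rows into one tag population using a single layering rule."""
    return [row for row in rows if row[TAG_ATTRIBUTE] == "abnormal"]


def view(rows: List[Row]) -> List[Row]:
    """S160: desensitize at view time and return new rows; inputs are not modified."""
    return [
        {k: (SENSITIVE[k](v) if k in SENSITIVE else v) for k, v in row.items()}
        for row in rows
    ]


tag_population = create_tag(entity_table)
for shown_row in view(tag_population):
    print(shown_row)
```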
In some embodiments, the step of extracting the tag entity attribute may specifically include:
selecting from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extracting entity attributes of the label model.
In some embodiments, the step of creating the data tag specifically includes:
setting basic information, which comprises: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the label model to manual update or routine update; in response to an input operation, setting the execution cycle for creating the data tag to one of minute, hour, day, month, and year; and, in response to an input operation by the user, setting the scheduling policy of the label model to execute once every preset time length;
setting a label layering rule, which comprises: dividing the population into a plurality of tiers, each tier being associated with a plurality of configured conditions.
In some embodiments, the dividing the crowd into a plurality of tiers, each tier being associated with a plurality of configured conditions specifically includes:
dividing the population into at least two tiers, wherein one tier represents the epidemic population of a first region and another tier represents the epidemic population of a second region;
each tier is associated with at least two configured conditions that must be satisfied simultaneously, the at least two conditions including: a first condition that the health status equals abnormal; and a second condition that the source area equals the first region or the second region.
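The layering rule can be illustrated with a short sketch in which each tier carries a set of conditions that must all hold. The example later in this description notes that a user is matched to the earliest tier in the configured order, which is modelled here by returning the first match. The class name, field names (`health`, `from_area`), and region values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    conditions: Dict[str, str]  # field -> required value; all must hold at once

    def matches(self, person: Dict[str, str]) -> bool:
        return all(person.get(field) == value
                   for field, value in self.conditions.items())


def assign_tier(person: Dict[str, str], tiers: List[Tier]) -> Optional[str]:
    """Return the name of the first tier, in configured order, that the person satisfies."""
    for tier in tiers:
        if tier.matches(person):
            return tier.name
    return None


tiers = [
    Tier("first-region epidemic population", {"health": "abnormal", "from_area": "region_1"}),
    Tier("second-region epidemic population", {"health": "abnormal", "from_area": "region_2"}),
]

print(assign_tier({"health": "abnormal", "from_area": "region_2"}, tiers))
print(assign_tier({"health": "normal", "from_area": "region_1"}, tiers))
```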
The following examples are given for illustrative purposes:
FIG. 2 is a software interface diagram of an exemplary label model import according to an embodiment of the present invention. As shown in FIG. 2, in the step of importing the label model, the user is provided with an import-table interface in which a data source name is set, such as epidemic situation demonstration and treatment, and the table name of a data entity is selected, such as the hospitalization information table.
FIG. 3 is a diagram of a software interface for setting desensitization objects, as an example, according to an embodiment of the present invention. As shown in FIG. 3, the interface for editing a label model presents column information to the user, where the column names include, but are not limited to: ID, LAT, FULL_NAME, SEX, PHONE, ID_CARD, PLATE_NO, HEALTHY, CONTACT_NAME, CONTACT_PHONE, CONTACT_REMARK. The remarks corresponding to the column names are, respectively: primary key, latitude, name, gender, mobile phone number, identification card, license plate number, health status, emergency contact, emergency contact phone, emergency contact remark. The data type of every column may be VARCHAR2. The column "ID" is selected as the primary key, and no other column is selected as the primary key. The columns "ID", "FULL_NAME", "PHONE", "ID_CARD" and "HEALTHY" are selected for list presentation, and the other columns are not. The columns "PHONE" and "ID_CARD" are selected or configured for desensitization, and the other columns are not.
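The per-column settings described for FIG. 3 (primary key, list presentation, desensitization) can be sketched as simple column metadata. The `Column` class and its field names are assumptions for illustration; the column names are those listed above, written without the stray spaces, and only a subset is shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Column:
    name: str
    remark: str
    primary_key: bool = False
    list_display: bool = False
    desensitize: bool = False


columns: List[Column] = [
    Column("ID", "primary key", primary_key=True, list_display=True),
    Column("FULL_NAME", "name", list_display=True),
    Column("SEX", "gender"),
    Column("PHONE", "mobile phone number", list_display=True, desensitize=True),
    Column("ID_CARD", "identification card", list_display=True, desensitize=True),
    Column("HEALTHY", "health status", list_display=True),
]

# Columns that must pass through a desensitization algorithm before display.
sensitive = [c.name for c in columns if c.desensitize]
# Columns shown in the list view.
displayed = [c.name for c in columns if c.list_display]

print(sensitive)
print(displayed)
```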
FIG. 4A is a first interface diagram of an exemplary desensitization algorithm selection according to an embodiment of the present invention. As shown in FIG. 4A, the drop-down box of desensitization algorithms offers a plurality of desensitization algorithms, including: hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization. In one example, when a desensitization algorithm category is selected or configured, a specific algorithm under it is then selected, for example SHA-256. FIG. 4B is a second interface diagram of an exemplary desensitization algorithm selection according to an embodiment of the present invention. As shown in FIG. 4B, in the algorithm classification tree displayed on the left side of the user interface, the hash desensitization algorithms may specifically include: MD5, SHA-1, SHA-256, HMAC; the transform desensitization algorithms may specifically include: digit rounding, date rounding, and character displacement; the encryption desensitization algorithms may specifically include: the DES algorithm, the 3DES algorithm, the AES algorithm, and the SM1 algorithm; the random desensitization algorithms may specifically include: scatter rearrangement and random selection. When any specific algorithm in the left column is selected, algorithm information, parameter configuration, and algorithm test panels are displayed on the right side of the interface. When MD5 is selected, the algorithm name is MD5, the algorithm type is hash desensitization, the reversibility setting is irreversible, and the enabled setting is enabled.
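The hash desensitization algorithms named above (MD5, SHA-1, SHA-256, HMAC) are all available in the Python standard library, so a minimal sketch is possible. The function names and the choice of SHA-256 as the HMAC digest are assumptions; the patent does not specify parameters.

```python
import hashlib
import hmac

SUPPORTED = ("md5", "sha1", "sha256")


def hash_desensitize(value: str, algorithm: str = "sha256") -> str:
    """Irreversibly replace a value with its hex digest."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return hashlib.new(algorithm, value.encode("utf-8")).hexdigest()


def hmac_desensitize(value: str, key: bytes) -> str:
    """Keyed hash: the digest cannot be recomputed without the secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


print(hash_desensitize("13812345678"))
print(hash_desensitize("13812345678", "md5"))
print(hmac_desensitize("13812345678", key=b"demo-secret"))
```

Values with a small input space, such as phone numbers, can be recovered from an unkeyed hash by enumerating candidates, so a keyed variant such as HMAC is generally the safer choice for such fields.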
FIG. 5 is a schematic diagram of an exemplary operation interface for extracting entity attributes according to an embodiment of the present invention. As shown in FIG. 5, in the interface for adding entity attributes, the table name is set as a deep information table of the Shenzhen epidemic, and the column information includes the following fields: column name (attribute name), display name, data type, unit format, and dictionary information. All columns are selected, namely: "xm", "sfzhm", "lxdz", "sjhm", "zz", "jchz", "sex". The corresponding display names are, respectively: name, identification number, contact address, mobile phone number, symptom (0 is no symptom, 1 is symptom), indication of contact with a patient (0 is not contacted, 1 is contacted), and sex.
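The mapping from column names to display names, together with the dictionary information for coded columns, can be sketched as two lookup tables. The constant and function names are hypothetical; the column names and the 0/1 codes follow the FIG. 5 description above.

```python
from typing import Dict

# Column name -> display name, as listed for FIG. 5.
DISPLAY_NAMES: Dict[str, str] = {
    "xm": "name",
    "sfzhm": "identification number",
    "lxdz": "contact address",
    "sjhm": "mobile phone number",
    "zz": "symptom",
    "jchz": "contact with a patient",
    "sex": "sex",
}

# Dictionary information: coded values and their meanings.
DICTIONARIES: Dict[str, Dict[str, str]] = {
    "zz": {"0": "no symptom", "1": "symptom"},
    "jchz": {"0": "not contacted", "1": "contacted"},
}


def to_display(record: Dict[str, str]) -> Dict[str, str]:
    """Translate raw column names and coded values into display form."""
    shown: Dict[str, str] = {}
    for column, raw in record.items():
        label = DISPLAY_NAMES.get(column, column)
        # Unknown columns and uncoded values pass through unchanged.
        shown[label] = DICTIONARIES.get(column, {}).get(raw, raw)
    return shown


print(to_display({"xm": "Zhang San", "zz": "1", "jchz": "0"}))
```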
FIG. 6A is a schematic diagram of an exemplary first interface for creating a data tag according to an embodiment of the present invention; FIG. 6B is a schematic diagram of an exemplary second interface for creating a data tag according to an embodiment of the present invention. As shown in FIG. 6A, in the software interface for creating a custom tag, the first step is to set basic information, which mainly includes: setting a tag name, for example, an epidemic tag; setting a model, for example, a deep information table of the Shenzhen epidemic situation; configuring the grouping as Shenzhen epidemic; configuring the update mode as routine update; configuring the execution period as one of minutes, hours, days, and months, minutes being selected in this example; and configuring the scheduling policy to execute every 15 minutes. As shown in FIG. 6B, in the software interface for creating a custom tag, the second step is to set the tag rules. In the interface for setting tag rules, those among all users who satisfy the configured conditions are divided into 2 tiers, such as Wuhan epidemic personnel and Beijing epidemic personnel; the number of tiers is not limited to 2, and tiers may be added as needed. By way of example, when the health status equals abnormal and the source area equals Wuhan, the population meeting both conditions falls into the Wuhan epidemic tier. The system matches users according to the order of the custom tiers, and a user who satisfies more than one tier is matched preferentially to the tier that comes first in that order.
FIG. 7 is an exemplary interface diagram for viewing a tag population after desensitization according to an embodiment of the present invention. As shown in FIG. 7, in the presentation interface of the sample data instance table, as indicated by the box, the two columns "PHONE" and "ID_CARD" are desensitized and displayed as a run of asterisks, so that the key information is kept secret and hidden.
Example two
Fig. 8 is a functional block diagram of a tag desensitization apparatus of an embodiment of the present invention. As shown in fig. 8, the apparatus 200 includes:
a label model importing module 210, configured to import an entity table from a data source as a marking object, where the entity table corresponds to a label model;
a desensitization object configuration module 220, configured to select a sensitive field requiring desensitization in the tag model;
a desensitization algorithm selecting module 230, configured to select one or more desensitization algorithms to perform dynamic desensitization on data corresponding to the sensitive field;
a tag entity attribute extraction module 240, configured to select multiple fields from the tag model, and extract entity attributes of the marking object;
the data tag creating module 250 is configured to set a tag layering rule for the tag model, and filter the tag model according to the set tag layering rule to form corresponding tag crowd layering data;
and the desensitized tag crowd data display module 260 is configured to perform real-time dynamic desensitization in response to that the user views the tag crowd hierarchical data, and display the tag crowd hierarchical data subjected to desensitization processing on the sensitive field to the user.
In some embodiments, the desensitization algorithm includes one or more of hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization.
In some embodiments, the tag entity attribute extraction module 240 is specifically configured to: select from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extract entity attributes of the label model.
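For the transform and random categories, the sub-algorithms named in the discussion of FIG. 4B (digit rounding, date rounding, character displacement, scatter rearrangement, random selection) might be sketched as below. The patent does not define their exact behaviour, so every interpretation here is an assumption: rounding down to a multiple, truncating dates to the first of the month, and treating character displacement as a left rotation.

```python
import random
from datetime import date
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def round_digits(number: int, step: int = 10) -> int:
    """Digit rounding: round down to a multiple of `step`."""
    return number - number % step


def round_date(value: date) -> date:
    """Date rounding: keep year and month, reset the day to 1."""
    return value.replace(day=1)


def shift_chars(text: str, offset: int = 1) -> str:
    """Character displacement, read here as rotating the string left by `offset`."""
    if not text:
        return text
    k = offset % len(text)
    return text[k:] + text[:k]


def scatter(values: Sequence[T], rng: random.Random) -> List[T]:
    """Scatter rearrangement: same values, shuffled across rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def random_pick(candidates: Sequence[T], rng: random.Random) -> T:
    """Random selection: substitute a value drawn from a candidate pool."""
    return rng.choice(candidates)


rng = random.Random(42)  # fixed seed only to make the demo repeatable
print(round_digits(37), round_date(date(2020, 12, 30)), shift_chars("abcd"))
print(scatter(["a", "b", "c", "d"], rng), random_pick(["x", "y", "z"], rng))
```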
In some embodiments, the data tag creation module 250 may be specifically configured to:
setting basic information, which comprises: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the label model to manual update or routine update; in response to an input operation, setting the execution cycle for creating the data tag to one of minute, hour, day, month, and year; and, in response to an input operation by the user, setting the scheduling policy of the label model to execute once every preset time length;
setting a label layering rule, which comprises: dividing the population into a plurality of tiers, each tier being associated with a plurality of configured conditions.
In some embodiments, the dividing the crowd into a plurality of tiers, where each tier is associated with a plurality of configured conditions, may specifically include:
dividing the population into at least two tiers, wherein one tier represents the epidemic population of a first region and another tier represents the epidemic population of a second region;
each tier is associated with at least two configured conditions that must be satisfied simultaneously, the at least two conditions including: a first condition that the health status equals abnormal; and a second condition that the source area equals the first region or the second region.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
EXAMPLE III
Embodiments of the present invention further provide a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements any one of the above-described desensitization methods for tag crowd data.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. Other readable storage media, such as quantum memory and graphene memory, may also be used. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals.
EXAMPLE IV
An embodiment of the present invention further provides a computer device, as shown in Fig. 9, including one or more processors 301, a communication interface 302, a memory 303, and a communication bus 304, where the processors 301, the communication interface 302, and the memory 303 communicate with each other through the communication bus 304.
The memory 303 is configured to store a computer program;
the processor 301 is configured to implement the following steps when executing the program stored in the memory 303:
importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
configuring a desensitization object: selecting a sensitive field needing desensitization in the label model;
selecting a desensitization algorithm: selecting one or more desensitization algorithms to perform dynamic desensitization on data corresponding to the sensitive fields;
extracting the attribute of the tag entity: selecting a plurality of fields from the label model, and extracting entity attributes of the marking object;
creating a data tag: setting label layering rules for the label model, and filtering according to the set label layering rules to form the corresponding tag crowd hierarchical data;
displaying the desensitized tag crowd data: when the user views the tag crowd hierarchical data, performing real-time dynamic desensitization and displaying to the user the tag crowd hierarchical data in which the sensitive fields have been desensitized.
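As a rough illustration of the last step, the sketch below keeps the stored rows unchanged and applies the configured algorithm only when a user views them. The class `TagCrowdView`, the function `mask_middle`, and the field names are assumptions made for this example and do not come from the embodiment.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Mask desensitization: keep the head and tail, star out the middle."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    tail_start = len(value) - keep_tail
    return value[:keep_head] + "*" * (tail_start - keep_head) + value[tail_start:]


class TagCrowdView:
    """Holds raw tag crowd rows and desensitizes them at view time."""

    def __init__(self, rows: List[Record], rules: Dict[str, Callable[[str], str]]):
        self._rows = rows      # raw data, never modified
        self._rules = rules    # sensitive field -> desensitization algorithm

    def view(self) -> List[Record]:
        """Return desensitized copies of the rows for display."""
        shown = []
        for row in self._rows:
            out = dict(row)  # copy so the stored row stays intact
            for field, algorithm in self._rules.items():
                if field in out:
                    out[field] = algorithm(out[field])
            shown.append(out)
        return shown


stored = [{"name": "Zhang San", "mobile_number": "13812345678"}]
view = TagCrowdView(stored, {"mobile_number": mask_middle})
print(view.view())   # masked copy shown to the user
print(stored)        # underlying data is unchanged
```

Desensitizing on read rather than on write is one way to interpret "dynamic" here: the same stored data could be shown with different rules to different users.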
In some embodiments, in the processing of the processor 301, the desensitization algorithm includes one or more of hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization.
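The text names these algorithm families without defining them, so the following Python sketch shows one plausible reading of four of them using only the standard library. The function names, the fixed demo salt, the age-bucketing transform, and the placeholder string are all assumptions; encryption desensitization is omitted because it would normally rely on a dedicated cryptography library.

```python
import hashlib
import secrets
import string


def hash_desensitize(value: str, salt: str = "demo-salt") -> str:
    """Hash desensitization: replace the value with a salted SHA-256 digest.
    The fixed salt is a placeholder; a real deployment would manage it secretly."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def replace_desensitize(value: str, placeholder: str = "[REDACTED]") -> str:
    """Replacement desensitization: substitute a constant placeholder."""
    return placeholder


def transform_desensitize(age: int, bucket: int = 10) -> str:
    """Transform desensitization (one possible form): generalize an age to a range."""
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"


def random_desensitize(value: str) -> str:
    """Random desensitization: swap each digit for a random digit, keep the layout."""
    return "".join(
        secrets.choice(string.digits) if ch.isdigit() else ch for ch in value
    )


print(hash_desensitize("440301199001011234"))
print(replace_desensitize("No. 9 Gaoxin Zhongyi Road"))
print(transform_desensitize(37))
print(random_desensitize("138-1234-5678"))
```

Hashing is deterministic, so equal inputs still match after desensitization, whereas the random variant deliberately breaks that link; which property is wanted depends on how the tag data is analyzed afterwards.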
In some embodiments, in the processing of the processor 301, the step of extracting the tag entity attribute may specifically include:
selecting from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extracting the entity attributes of the label model.
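Field selection of this kind amounts to projecting each row of the entity table onto the chosen columns. The sketch below assumes rows are plain dictionaries; the column names in `ENTITY_FIELDS` and the function `extract_entity_attributes` are hypothetical stand-ins for the fields listed above.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical column names for the nine fields named in the text.
ENTITY_FIELDS = [
    "name", "id_number", "contact_address", "mobile_number",
    "has_symptoms", "contacted_patient", "sex",
    "source_area", "current_area",
]


def extract_entity_attributes(
    rows: List[Dict[str, str]],
    fields: Sequence[str] = ENTITY_FIELDS,
) -> List[Dict[str, Optional[str]]]:
    """Keep only the selected fields; a missing field becomes None."""
    return [{field: row.get(field) for field in fields} for row in rows]


table = [{
    "name": "A", "id_number": "440301199001011234", "sex": "F",
    "source_area": "region_1", "current_area": "region_2",
    "internal_note": "not selected, so it is dropped",
}]
attributes = extract_entity_attributes(table)
```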
In some embodiments, in the processing of the processor 301, the step of creating the data tag may specifically include:
setting basic information, which comprises: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the label model to a manual update mode or a routine update mode; in response to an input operation, setting the execution cycle for creating the data tag to one of minute, hour, day, month, and year; and, in response to an input operation by the user, setting the scheduling policy of the label model to execute once every preset time length;
setting a label layering rule, which comprises: dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions.
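The basic information described above could be captured in a small validated configuration object. This is only a sketch: the class `TagBasicInfo`, its attribute names, and the string values for the update modes and cycle units are assumptions chosen to mirror the options listed in the text.

```python
from dataclasses import dataclass

UPDATE_MODES = {"manual", "routine"}
CYCLE_UNITS = {"minute", "hour", "day", "month", "year"}


@dataclass(frozen=True)
class TagBasicInfo:
    """Basic settings collected from user input when a data tag is created."""
    model_name: str    # label model to be analyzed
    update_mode: str   # "manual" or "routine"
    cycle_unit: str    # one of minute, hour, day, month, year
    every: int         # run once every `every` units

    def __post_init__(self) -> None:
        if self.update_mode not in UPDATE_MODES:
            raise ValueError(f"unknown update mode: {self.update_mode}")
        if self.cycle_unit not in CYCLE_UNITS:
            raise ValueError(f"unknown cycle unit: {self.cycle_unit}")
        if self.every < 1:
            raise ValueError("the preset time length must be at least 1")

    def describe(self) -> str:
        return f"every {self.every} {self.cycle_unit}(s)"


info = TagBasicInfo("epidemic_people", "routine", "day", 1)
print(info.describe())
```

Validating at construction time keeps an unsupported cycle such as "week" from reaching the scheduler at all.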
In some embodiments, in the processing of the processor 301, dividing the crowd into a plurality of layers, each associated with a plurality of configured conditions, may specifically include:
dividing the crowd into at least two layers, one of which represents the epidemic crowd in a first region and the other the epidemic crowd in a second region;
associating each layer with at least two configured conditions that must be satisfied simultaneously: a first condition that the health status equals abnormal, and a second condition that the source area equals the first region or the second region.
The communication bus of the computer device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus. The communication interface is used for communication between the computer device and other devices.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for desensitizing tag population data, comprising:
importing a label model: importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
configuring a desensitization object: selecting a sensitive field needing desensitization in the label model;
selecting a desensitization algorithm: selecting one or more desensitization algorithms to perform dynamic desensitization on data corresponding to the sensitive fields;
extracting the attribute of the tag entity: selecting a plurality of fields from the label model, and extracting entity attributes of the marking object;
creating a data tag: setting label layering rules for the label model, and filtering according to the set label layering rules to form the corresponding tag crowd hierarchical data;
displaying the desensitized tag crowd data: when the user views the tag crowd hierarchical data, performing real-time dynamic desensitization and displaying to the user the tag crowd hierarchical data in which the sensitive fields have been desensitized.
2. The method of claim 1, wherein the desensitization algorithm comprises one or more desensitization algorithms selected from hash desensitization, mask desensitization, substitution desensitization, transform desensitization, encryption desensitization, and random desensitization.
3. The method according to claim 1, wherein the step of extracting the tag entity attribute specifically comprises:
selecting from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extracting the entity attributes of the label model.
4. The method according to any one of claims 1 to 3, wherein the step of creating a data tag specifically comprises:
setting basic information, which comprises: in response to an input operation, setting the label model as the model to be analyzed; in response to an input operation, setting the update mode of the label model to a manual update mode or a routine update mode; in response to an input operation, setting the execution cycle for creating the data tag to one of minute, hour, day, month, and year; and, in response to an input operation by the user, setting the scheduling policy of the label model to execute once every preset time length;
setting a label layering rule, which comprises: dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions.
5. The method of claim 4, wherein dividing the crowd into a plurality of layers, each layer being associated with a plurality of configured conditions, comprises:
dividing the crowd into at least two layers, one of which represents the epidemic crowd in a first region and the other the epidemic crowd in a second region;
associating each layer with at least two configured conditions that must be satisfied simultaneously: a first condition that the health status equals abnormal, and a second condition that the source area equals the first region or the second region.
6. A device for desensitizing tag population data, comprising:
the label model importing module is used for importing an entity table from a data source as a marking object, wherein the entity table corresponds to a label model;
a desensitization object configuration module used for selecting a sensitive field needing desensitization in the label model;
the desensitization algorithm selection module is used for selecting one or more desensitization algorithms to perform dynamic desensitization on data corresponding to the sensitive fields;
the label entity attribute extraction module is used for selecting a plurality of fields from the label model and extracting entity attributes of the marking object;
the data label creating module is used for setting label layering rules for the label model and filtering to form corresponding label crowd layering data according to the set label layering rules;
and the desensitized tag crowd data display module is used for performing real-time dynamic desensitization in response to a user viewing the tag crowd hierarchical data, and displaying to the user the tag crowd hierarchical data in which the sensitive fields have been desensitized.
7. The apparatus of claim 6, wherein the desensitization algorithm comprises one or more of hash desensitization, mask desensitization, replacement desensitization, transform desensitization, encryption desensitization, and random desensitization.
8. The apparatus according to claim 6 or 7, wherein the tag entity attribute extraction module is specifically configured to: select from the label model a plurality of fields, including name, identification number, contact address, mobile phone number, an indication of whether symptoms are present, an indication of whether the person has been in contact with an epidemic patient, sex, the source area of the epidemic patient, and the area where the epidemic patient is currently located, and extract the entity attributes of the label model.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of desensitizing tag population data according to any one of claims 1 to 5.
10. A computer device, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a method of desensitizing tag population data according to any of claims 1-5.
CN202011613978.7A 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment Active CN112632618B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011613978.7A CN112632618B (en) 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011613978.7A CN112632618B (en) 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment

Publications (2)

Publication Number Publication Date
CN112632618A true CN112632618A (en) 2021-04-09
CN112632618B CN112632618B (en) 2024-04-16

Family

ID=75286965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011613978.7A Active CN112632618B (en) 2020-12-30 2020-12-30 Desensitization method and device for label crowd data and computer equipment

Country Status (1)

Country Link
CN (1) CN112632618B (en)

Also Published As

Publication number Publication date
CN112632618B (en) 2024-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 518000 2203/2204, Building 1, Huide Building, Beizhan Community, Minzhi Street, Longhua District, Shenzhen, Guangdong

Applicant after: SHENZHEN AUDAQUE DATA TECHNOLOGY Ltd.

Address before: 713, 7th floor, software building, No.9, Gaoxin Zhongyi Road, Nanshan District, Shenzhen, Guangdong 518000

Applicant before: SHENZHEN AUDAQUE DATA TECHNOLOGY Ltd.

Country or region before: China

GR01 Patent grant