CN110598442A - Sensitive data self-adaptive desensitization method and system - Google Patents

Sensitive data self-adaptive desensitization method and system

Info

Publication number
CN110598442A
Authority
CN
China
Prior art keywords
desensitization
data
sensitive data
server
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910860749.6A
Other languages
Chinese (zh)
Inventor
叶卫
黄宇腾
戚伟强
沈志豪
张景明
韦金良
董科
季超
牟黎
耿继朴
尚天婷
陈泽堃
伍星宇
陈珊
王嘉怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority to CN201910860749.6A
Publication of CN110598442A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a self-adaptive desensitization method and system for sensitive data, relating to the fields of computer technology and information security. The method comprises the following steps: adding a plurality of desensitization algorithms to a desensitization server, and setting a one-to-one quantitative relationship between each desensitization algorithm and each of a plurality of desensitization effects; the desensitization server receives a desensitization instruction sent by a user-side device and reads original data from a data source server according to the instruction; the desensitization server constructs a training set of the user's desensitization effect preferences for different sensitive data types and forms a decision tree; the desensitization server locates the sensitive data in the original data, determines its type, selects a desensitization algorithm for it using the decision tree, and generates replacement data for the sensitive data according to the selected algorithm. The user configuration flow is simple, and desensitization strategies can be configured intelligently and automatically, so that desensitization itself is automatic.

Description

Sensitive data self-adaptive desensitization method and system
Technical Field
The invention relates to the field of computer technology and information security, in particular to a sensitive data self-adaptive desensitization method and system.
Background
With the advent of the data age, the enormous value of data is being mined, but this also makes private information and key sensitive data harder to protect. How to share data efficiently while protecting sensitive information from leakage has become a key link in the intelligent development of data security.
Data desensitization changes the values of data while preserving their original characteristics. It protects sensitive data from unauthorized access while still allowing related data processing, maintains data security while preserving the meaning and validity of the data, and complies with data privacy regulations. With data desensitization, information can still be used and associated with the business without violating relevant regulations, and the risk of data leakage is avoided.
Before setting desensitization rules for sensitive data fields, a user often has to learn the data desensitization strategies preset by the system, and may even have to customize and modify them, which greatly increases the user's operation and maintenance cost. Moreover, because a desensitization strategy is generally implemented by a system-specified algorithm, the user cannot anticipate what the sensitive data will look like after desensitization. In deployed desensitization systems, users are often dissatisfied with the desensitization result for sensitive data and modify the strategy repeatedly.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing a self-adaptive desensitization method and system for sensitive data that is guided by the desired desensitization effect and simplifies the user configuration flow. In addition, for dynamic desensitization scenarios, it facilitates learning a model of the user's usage and configures the desensitization strategy intelligently and automatically, thereby achieving automatic desensitization.
According to an aspect of the invention: a method of adaptive desensitization of sensitive data, comprising the steps of:
adding a plurality of desensitization algorithms in a desensitization server, and setting a one-to-one corresponding quantitative relation between each desensitization algorithm and each desensitization effect in a plurality of desensitization effects;
the desensitization server receives a desensitization instruction sent by user side equipment, and reads original data from a data source server according to the desensitization instruction, wherein the desensitization instruction comprises a sensitive data type, at least one desensitization effect and priority sequencing of the at least one desensitization effect;
the desensitization server constructs, according to the set quantitative relationship and the priority order contained in the received desensitization instruction, a training set of the user's desensitization effect preferences for different sensitive data types, and forms a decision tree;
the desensitization server positions sensitive data existing in the original data and determines the type of the sensitive data, a desensitization algorithm is selected for the sensitive data by using the decision tree, replacement data of the sensitive data are generated according to the desensitization algorithm, and the sensitive data in the original data are replaced by the corresponding replacement data to generate desensitization data;
and the desensitization server sends the desensitization data to the user-side device.
Further, the sensitive data types include name, address, mailbox, telephone, certificate, account number, zip code, date.
Further, the desensitization algorithms include a plurality of the following: replacement algorithms, invalidation algorithms, out-of-order algorithms, average value algorithms, anti-association algorithms, offset (migration) algorithms, symmetric encryption algorithms, and dynamic environment control algorithms.
Further, the at least one desensitization effect includes at least one of validity (effectiveness), relevance (correlation), reversibility, repeatability, timeliness, and security (safety).
According to another aspect of the invention: a self-adaptive desensitization system for sensitive data, comprising: a client device, used for sending desensitization instructions and receiving desensitization data; a data source server, used for storing original data; and a desensitization server. The desensitization server is configured to add a plurality of desensitization algorithms and set a one-to-one quantitative relationship between each desensitization algorithm and each of a plurality of desensitization effects; to receive a desensitization instruction sent by the client device and read original data from the data source server according to the instruction, where the instruction includes a sensitive data type, at least one desensitization effect, and a priority order of the at least one desensitization effect; to construct, according to the set quantitative relationship and the priority order contained in the received instruction, a training set of the user's desensitization effect preferences for different sensitive data types, and form a decision tree; to locate the sensitive data in the original data, determine its type, select a desensitization algorithm for it using the decision tree, generate replacement data according to the selected algorithm, and replace the sensitive data in the original data with the corresponding replacement data to generate desensitization data; and to send the desensitization data to the client device.
Further, the desensitization server comprises a setting module, a decision tree forming module, and a desensitization processing module. The setting module is used for adding a plurality of desensitization algorithms and setting a one-to-one quantitative relationship between each desensitization algorithm and each of a plurality of desensitization effects. The decision tree forming module is used for constructing, according to the set quantitative relationship and the priority order contained in the received desensitization instruction, a training set of the user's desensitization effect preferences for different sensitive data types, and forming a decision tree. The desensitization processing module is used for locating the sensitive data in the original data, determining its type, selecting a desensitization algorithm for it using the decision tree, generating replacement data according to the selected algorithm, and replacing the sensitive data in the original data with the corresponding replacement data to generate desensitization data.
The beneficial effects of the invention are as follows: the user is freed from heavy rule configuration work and only needs to consider the desired result characteristics of the whole desensitization task and rank those requirements by priority; the system's algorithm then produces a recommended algorithm configuration for all fields, so the user configuration flow is simple. For dynamic desensitization scenarios, the invention facilitates learning a model of the user's usage and configures the desensitization strategy intelligently and automatically, thereby achieving automatic desensitization.
Drawings
In order to illustrate the technical solution of the present invention more clearly, the drawings used in the description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is an exemplary architecture diagram of a sensitive data self-adaptive desensitization system according to an embodiment of the present invention.
Fig. 2 is a block diagram of a desensitization server according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a sensitive data self-adaptive desensitization method according to an embodiment of the present invention.
Fig. 4 is a schematic block diagram of a device used in embodiments of the present invention.
Detailed Description
The technical solutions in the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are some, not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the computer system/server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
The computer system/server may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Fig. 1 shows an exemplary architecture diagram of a sensitive data self-adaptive desensitization system according to an embodiment of the present invention. As shown in Fig. 1, the desensitization system 100 includes a client device 110, a desensitization server 120, and a data source server 130. The client device 110 is connected to the desensitization server 120 through a network 140 in a wired or wireless manner, and the desensitization server 120 is connected to the data source server 130 through a network 150 in a wired or wireless manner.
The client device 110, the desensitization server 120, and the data source server 130 are physically separate from each other. Although only these three are shown in Fig. 1, the desensitization system 100 may include one or more other devices not shown, such as network elements like routers and switches.
The client device 110 may be a PC, a mobile terminal, or the like. It sends desensitization instructions, receives desensitization data, and may store and view the desensitization data. The data source server 130 stores the original data.
The desensitization server 120 may add a plurality of desensitization algorithms through an input unit, and set a one-to-one quantitative relationship between each desensitization algorithm and each desensitization effect of the plurality of desensitization effects.
The desensitization algorithms include a plurality of the following: replacement algorithms, invalidation algorithms, out-of-order algorithms, average value algorithms, anti-association algorithms, offset (migration) algorithms, symmetric encryption algorithms, and dynamic environment control algorithms. The replacement algorithm replaces real data with fictional data, for example by building a large dictionary table, generating a random factor for each real record, and substituting dictionary entries for the original content; the resulting data closely resembles real data. The invalidation algorithm replaces a real value, or part of it, with a special symbol, for example the 6th to 14th digits of an identity card number. The out-of-order algorithm randomly redistributes the values of a sensitive data column, obscuring the relationship between the original values and other fields. The average value algorithm applies to numerical data: it first calculates the mean and then distributes the desensitized values randomly around that mean so that the sum of the data remains unchanged; it is generally used for cost tables, payroll tables, and the like. The anti-association algorithm looks for mappings by which one sensitive field can be inferred from other fields and desensitizes those fields, for example the case where date of birth, gender, and region can be inferred from an identity card number. The offset algorithm changes numerical data by a random shift. The symmetric encryption algorithm is a special reversible desensitization method: the original data is encrypted with an encryption key and algorithm, the ciphertext follows the same format rules as the original data, and the original data can be recovered with the decryption key. The dynamic environment control algorithm changes only part of the response data according to predefined rules: when business data is accessed outside the specified conditions, the data content is controlled and the content of specific fields is masked. For example, important customer information is not shown to DBA (database administrator) accounts and is shown only to key users of the business module.
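As a rough illustration of four of the algorithms described above (invalidation, out-of-order, average value, and offset), a minimal Python sketch follows. The function names, the mask character, the masked digit range, the `spread` parameter, and the `max_shift` parameter are all assumptions made for illustration; the patent does not specify them.

```python
import random

def invalidate(value: str, start: int, end: int, mask: str = "*") -> str:
    """Invalidation: replace characters start..end (1-based, inclusive) with a mask symbol."""
    return value[:start - 1] + mask * (end - start + 1) + value[end:]

def shuffle_column(values: list, rng: random.Random) -> list:
    """Out-of-order: randomly redistribute a column's values across rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

def mean_preserving(values: list, rng: random.Random, spread: float = 0.1) -> list:
    """Average value: scatter values around the mean while keeping the sum unchanged."""
    mean = sum(values) / len(values)
    noise = [rng.uniform(-spread, spread) * mean for _ in values]
    shift = sum(noise) / len(noise)  # re-centre the noise so that it sums to zero
    return [mean + n - shift for n in noise]

def offset(value: int, rng: random.Random, max_shift: int = 5) -> int:
    """Offset: change a number by a random shift of at most max_shift."""
    return value + rng.randint(-max_shift, max_shift)

rng = random.Random(0)  # fixed seed only so that the demonstration is reproducible
masked_id = invalidate("330102199001011234", 6, 14)
salaries = mean_preserving([3000.0, 4500.0, 6000.0], rng)
```

In this sketch the average value variant keeps the total of a payroll column intact while hiding each individual amount, which matches the cost table and payroll table use case mentioned in the text.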
The at least one desensitization effect includes at least one of validity (effectiveness), relevance (correlation), reversibility, repeatability, timeliness, and security (safety).
Validity: depending on the specific requirements of the desensitization task, the desensitized result often has to remain valid for the business. It is not enough to generate random meaningless text or numbers, or to simply delete, truncate, or mask values; the business attributes of the original data must be preserved to a certain extent according to the scenario in which the data is used, and in some scenarios even the sampling characteristics of the data must remain unchanged. For example, a desensitized identity card number should still have the form of an identity card number and pass its validity checks, and the information it encodes should be a meaningful area code and a meaningful birthday rather than random digits.
Relevance: for structured data, and particularly for data sets with very complex relationships between data elements, a field in a data table often corresponds to another field in the same table. This correspondence should generally not be destroyed by desensitization; otherwise the field loses its use value. The requirement for relevance is typically high when the data serves as a reference quantity for statistics.
Reversibility: in most application scenarios, desensitized data can never be restored to the original business data, and most desensitization products on the market are designed this way. However, as big data analysis becomes more popular and third-party business intelligence, precision marketing services, and the like become widely accepted, business departments often need to restore desensitized data to the original business data in order to carry out subsequent work. For example, a telecommunications company may desensitize its business data, send it to a third party for user behavior analysis, and restore the desensitized data once the results are available, so as to build accurate user profiles and run precision marketing campaigns. In such cases the user requires desensitization to be reversible.
Repeatability: in some business scenarios desensitization must be a repeatable process. Desensitizing the same data several times, or in different test systems, must give the same result each time, so that desensitized results can still be correlated in special situations such as incremental loads. In other scenarios, for confidentiality, the desensitization result for the same data field (such as an identity card number or a credit card number) should not be the same every time, which prevents a hacker from restoring the original business data by reverse engineering after collecting a large amount of desensitized data. A desensitization product should therefore let the user choose, when configuring the policy, whether the desensitization result is repeatable for a particular data type and scenario.
Timeliness: in some business requirements and application scenarios, especially dynamic desensitization, users place high demands on how quickly desensitization completes, and the desensitized data may no longer be worth analyzing or mining after a certain time. Desensitization in such scenarios should avoid time-consuming algorithms, such as encryption algorithms, as far as possible.
Security: for some highly sensitive data the security requirement is high, and the user's other requirements are subordinate to it. In such cases an irreversible algorithm, or another algorithm that guarantees information is not leaked, should be selected.
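The choice between repeatable and non-repeatable desensitization can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name a technique, and the use of a keyed HMAC for the repeatable case, the random token for the non-repeatable case, the key value, and the token length are all hypothetical.

```python
import hashlib
import hmac
import secrets

def repeatable_token(value: str, key: bytes, length: int = 12) -> str:
    """Same input and key always give the same token, so desensitized
    records from different runs or systems can still be correlated."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]

def non_repeatable_token(length: int = 12) -> str:
    """A fresh random token on every call, so collecting many desensitized
    values reveals nothing stable about the original."""
    return secrets.token_hex(length // 2)

key = b"demo-key-not-for-production"  # hypothetical key, for illustration only
first = repeatable_token("330102199001011234", key)
second = repeatable_token("330102199001011234", key)
```

Here `first` and `second` are identical because the same key is reused, whereas two calls to `non_repeatable_token()` give unrelated values. Neither variant is reversible; a scenario that also requires reversibility would need something like the symmetric encryption algorithm described earlier.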
Through research, for example by mathematical or physical methods, the quantitative relationship between each desensitization algorithm and each desensitization effect can be determined. The relationship may rate each desensitization effect of each algorithm as high, medium, or low, or may assign a specific numerical value to each desensitization effect of each algorithm.
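One possible way to hold such a quantitative relationship is a table keyed by algorithm and effect. The sketch below is an assumption-heavy illustration: the ratings are invented, only three algorithms and four effects are shown, and the rank-weighted scoring in `score` is a stand-in for demonstration, not the decision tree the patent describes.

```python
LEVEL = {"low": 1, "medium": 2, "high": 3}

# Hypothetical ratings; the patent leaves the actual values to be determined.
RATINGS = {
    "replacement": {
        "validity": "high", "reversibility": "low",
        "timeliness": "medium", "security": "medium",
    },
    "invalidation": {
        "validity": "low", "reversibility": "low",
        "timeliness": "high", "security": "high",
    },
    "symmetric_encryption": {
        "validity": "medium", "reversibility": "high",
        "timeliness": "low", "security": "medium",
    },
}

def score(algorithm: str, priorities: list) -> int:
    """Weighted sum in which the effect ranked first gets the largest weight."""
    n = len(priorities)
    return sum(
        (n - rank) * LEVEL[RATINGS[algorithm][effect]]
        for rank, effect in enumerate(priorities)
    )

def best_algorithm(priorities: list) -> str:
    """Return the algorithm with the highest weighted score for this ranking."""
    return max(RATINGS, key=lambda name: score(name, priorities))
```

With these made-up ratings, a user who ranks reversibility first and validity second is steered toward symmetric encryption, while a user who ranks security first and timeliness second is steered toward invalidation.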
The desensitization server 120 is further configured to receive a desensitization instruction sent by the client device 110, read original data from the data source server 130 through the communication unit according to the instruction, and store the original data in the storage unit. The desensitization instruction includes a sensitive data type, at least one desensitization effect, and a priority order of the at least one desensitization effect. Sensitive data typically contains customers' personal privacy data as well as some key sensitive business data, for example: name (customer name, etc.); address (home address, company address, etc.); mailbox (corporate mailboxes, ordinary mailboxes, etc.); telephone (mobile phones, fixed-line phones, etc.); certificate (identity cards, passports, military officer certificates, etc.); account number (bank card, customer number, tax registration number, organization code, business license number, etc.); zip code (company zip code, home address zip code, etc.); and date (birthday, etc.). Thus, the sensitive data types may be name, address, mailbox, telephone, certificate, account number, zip code, date, and so on. The client device 110 may send the type of sensitive data that needs desensitization together with at least one of the effects of validity, relevance, reversibility, repeatability, timeliness, and security, and rank the effects it sends by priority.
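The three parts of a desensitization instruction described above (a sensitive data type, at least one effect, and a priority order) could be represented as in the following sketch. The class name, field names, and validation rules are hypothetical; the patent does not define a wire format or data structure.

```python
from dataclasses import dataclass

DATA_TYPES = {"name", "address", "mailbox", "telephone",
              "certificate", "account", "zip_code", "date"}
EFFECTS = {"validity", "relevance", "reversibility",
           "repeatability", "timeliness", "security"}

@dataclass(frozen=True)
class DesensitizationInstruction:
    data_type: str
    effects_by_priority: tuple  # highest priority first

    def __post_init__(self):
        if self.data_type not in DATA_TYPES:
            raise ValueError(f"unknown sensitive data type: {self.data_type}")
        if not self.effects_by_priority:
            raise ValueError("at least one desensitization effect is required")
        unknown = set(self.effects_by_priority) - EFFECTS
        if unknown:
            raise ValueError(f"unknown effects: {sorted(unknown)}")
        if len(set(self.effects_by_priority)) != len(self.effects_by_priority):
            raise ValueError("each effect may appear only once in the ranking")

# Example: desensitize certificates, ranking security above validity.
instruction = DesensitizationInstruction("certificate", ("security", "validity"))
```

Storing the effects as an ordered tuple means the priority order is carried by position, so no separate ranking field is needed.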
The desensitization server 120 is further configured to construct, according to the set quantitative relationship and the priority order contained in the desensitization instruction received from the client device 110, a training set of the user's desensitization effect preferences for different sensitive data types, and to form a decision tree.
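The patent does not say how the training set is encoded or which tree-building procedure is used. As one possible reading, the sketch below treats each training sample as a (sensitive data type, top-priority effect) pair labelled with a preferred algorithm, and builds the tree with a small ID3-style information-gain split. The samples, feature names, and the choice of ID3 are all assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def build_tree(rows, features):
    """rows: list of (feature_dict, label). Returns a label (leaf) or a
    (feature, {value: subtree}) tuple (internal node)."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    def gain(feature):
        groups = {}
        for feats, label in rows:
            groups.setdefault(feats[feature], []).append(label)
        remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        return entropy(labels) - remainder

    best = max(features, key=gain)
    rest = [f for f in features if f != best]
    branches = {}
    for feats, label in rows:
        branches.setdefault(feats[best], []).append((feats, label))
    return (best, {value: build_tree(sub, rest) for value, sub in branches.items()})

def predict(tree, feats):
    """Walk the tree; raises KeyError for a feature value never seen in training."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[feats[feature]]
    return tree

# Hypothetical preference training set: (data type, top-priority effect) -> algorithm.
TRAINING_SET = [
    ({"data_type": "certificate", "top_effect": "validity"}, "replacement"),
    ({"data_type": "certificate", "top_effect": "security"}, "invalidation"),
    ({"data_type": "certificate", "top_effect": "reversibility"}, "symmetric_encryption"),
    ({"data_type": "telephone", "top_effect": "validity"}, "replacement"),
    ({"data_type": "telephone", "top_effect": "security"}, "invalidation"),
    ({"data_type": "date", "top_effect": "validity"}, "offset"),
    ({"data_type": "date", "top_effect": "security"}, "invalidation"),
]
tree = build_tree(TRAINING_SET, ["data_type", "top_effect"])
```

A call such as `predict(tree, {"data_type": "date", "top_effect": "validity"})` then returns the algorithm label stored for that combination. A fuller version would presumably derive the labels from the quantitative relationship rather than listing them by hand.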
The desensitization server 120 may further locate the sensitive data in the original data using a sensitive data identification method (such as manual configuration matching, regular expression matching, or other intelligent identification algorithms), determine the type of the sensitive data, select a desensitization algorithm for it using the decision tree, generate replacement data according to the selected algorithm, replace the sensitive data in the original data with the corresponding replacement data to generate desensitization data, and send the desensitization data to the client device 110.
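The locate, classify, and replace flow described above can be sketched with regular expression matching, one of the identification methods the text mentions. The patterns, the `mask_middle` helper, and the `choose_algorithm` callback (which stands in for the decision tree lookup) are hypothetical and far looser than what a real deployment would need.

```python
import re

# Hypothetical patterns: an 18-character identity card number and an
# 11-digit mobile number starting with 1.
PATTERNS = {
    "certificate": re.compile(r"\b\d{17}[\dXx]\b"),
    "telephone": re.compile(r"\b1\d{10}\b"),
}

def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Keep the first and last few characters and mask everything in between."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[-keep_tail:]

def desensitize(text: str, choose_algorithm) -> str:
    """Locate sensitive values by type and swap each for its replacement data.

    choose_algorithm(data_type) returns a function from the original value to
    its replacement; here it stands in for the decision tree selection.
    """
    for data_type, pattern in PATTERNS.items():
        algorithm = choose_algorithm(data_type)
        text = pattern.sub(lambda match: algorithm(match.group(0)), text)
    return text

original = "ID 330102199001011234, phone 13812345678"
result = desensitize(original, lambda data_type: mask_middle)
```

In this example every type is mapped to the same masking function for brevity; passing a callback that returns a different function per data type would reproduce the per-type algorithm selection described in the text.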
Fig. 2 is a schematic block diagram of a desensitization server according to an embodiment of the present invention. As shown in Fig. 2, the desensitization server 120 includes a setting module 160, a decision tree forming module 170, and a desensitization processing module 180. The setting module 160 is configured to add a plurality of desensitization algorithms and set a one-to-one quantitative relationship between each desensitization algorithm and each of a plurality of desensitization effects. The decision tree forming module 170 is configured to construct, according to the set quantitative relationship and the priority order contained in the received desensitization instruction, a training set of the user's desensitization effect preferences for different sensitive data types, so as to form a decision tree. The desensitization processing module 180 is configured to locate the sensitive data in the original data, determine its type, select a desensitization algorithm for it using the decision tree, generate replacement data according to the selected algorithm, and replace the sensitive data in the original data with the corresponding replacement data to generate desensitization data.
Fig. 3 is a schematic flow chart of a sensitive data self-adaptive desensitization method 200 according to an embodiment of the present invention. As shown in Fig. 3, in step 201, a plurality of desensitization algorithms are added to the desensitization server 120, and a one-to-one quantitative relationship between each desensitization algorithm and each of a plurality of desensitization effects is set. In step 202, the desensitization server 120 receives a desensitization instruction sent by the client device 110 and reads original data from the data source server 130 according to the instruction, where the instruction includes a sensitive data type, at least one desensitization effect, and a priority order of the at least one desensitization effect. In step 203, the desensitization server 120 constructs, according to the set quantitative relationship and the priority order contained in the received desensitization instruction, a training set of the user's desensitization effect preferences for different sensitive data types, and forms a decision tree. In step 204, the desensitization server 120 locates the sensitive data in the original data, determines its type, selects a desensitization algorithm for it using the decision tree, generates replacement data according to the selected algorithm, and replaces the sensitive data in the original data with the corresponding replacement data to generate desensitization data. In step 205, the desensitization server 120 sends the desensitization data to the client device 110.
The client device 110, the desensitization server 120, and the data source server 130 may each be implemented using the device 300. Fig. 4 is a schematic block diagram of a device 300 for implementing an embodiment of the present invention. As shown in Fig. 4, the device 300 comprises a central processing unit (CPU) 301, which may perform various suitable actions and processes in accordance with computer program instructions stored in a read-only memory (ROM) 302 or computer program instructions loaded from a storage unit 308 into a random access memory (RAM) 303. The RAM 303 can also store various programs and data required for the operation of the device 300. The CPU 301, ROM 302, and RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Various components in device 300 are connected to I/O interface 305, including: an input unit 306 such as a keyboard, a mouse, or the like; an output unit 307 such as various types of displays, speakers, and the like; a storage unit 308 such as a magnetic disk, optical disk, or the like; and a communication unit 309 such as a network card, modem, wireless communication transceiver, etc. The communication unit 309 allows the device 300 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
In the above method 200, in step 201, a plurality of desensitization algorithms may be added through the input unit 306 of the desensitization server 120, and a one-to-one quantitative relationship between each desensitization algorithm and each desensitization effect of the plurality of desensitization effects is set and stored in the Read Only Memory (ROM)302 or the Random Access Memory (RAM) 303. In step 202, the desensitization server 120 is in communication connection with the communication unit 309 of the user end device 110 through the communication unit 309, receives the desensitization instruction sent by the user end device 110, and is in communication connection with the communication unit 309 of the data source server 130 through the communication unit 309 of the desensitization server 120, so as to obtain the original data from the data source server 130. In step 203, the Central Processing Unit (CPU)301 of the desensitization server 120 constructs a desensitization effect preference training set of the user for different sensitive data types according to the set quantitative relationship and the received desensitization order contained in the desensitization instruction, and forms a decision tree. In step 204, a Central Processing Unit (CPU)301 of the desensitization server 120 locates the sensitive data existing in the original data and determines the type of the sensitive data, selects a desensitization algorithm for the sensitive data by using the decision tree, generates replacement data of the sensitive data according to the desensitization algorithm, and replaces the sensitive data in the original data with the corresponding replacement data to generate desensitization data. 
In step 205, the desensitization server 120 connects through its communication unit 309 to the communication unit 309 of the user end device 110 and sends the desensitization data to the user end device 110. The desensitization data may be stored in the read-only memory (ROM) 302 or the random access memory (RAM) 303 of the user end device 110, and the stored desensitization data may be viewed on the user end device 110.
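As a further non-limiting illustration, the locating and replacing of sensitive data in step 204, which yields the desensitization data sent in step 205, might be sketched as follows. The regular expressions in PATTERNS, the masking rule in invalidate and the function desensitize are assumptions made for this example only. The application does not specify how sensitive data are located or how any particular algorithm generates replacement data.

```python
import re
from typing import Callable, Dict

# Hypothetical detectors for two of the sensitive data types named in the claims.
# Real detectors would be more careful; these patterns are illustrative only.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "mailbox": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "telephone": re.compile(r"\b\d{11}\b"),
}


def invalidate(value: str) -> str:
    """Illustrative invalidation: keep the first and last character, mask the rest."""
    if len(value) <= 2:
        return "*" * len(value)
    return value[0] + "*" * (len(value) - 2) + value[-1]


def desensitize(raw: str, chosen: Dict[str, Callable[[str], str]]) -> str:
    """Replace every located item of each sensitive data type with replacement
    data produced by the algorithm chosen for that type."""
    out = raw
    for data_type, algorithm in chosen.items():
        pattern = PATTERNS[data_type]
        # re.sub accepts a function that maps each match to its replacement text.
        out = pattern.sub(lambda match: algorithm(match.group(0)), out)
    return out


if __name__ == "__main__":
    original = "call 13812345678 or mail a.b@example.com"
    print(desensitize(original, {"telephone": invalidate, "mailbox": invalidate}))
```

In this sketch the mapping passed as `chosen` plays the role of the per-type algorithm selected by the decision tree, so a different algorithm can be supplied for each sensitive data type without changing the replacement loop.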
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A method of adaptive desensitization of sensitive data, comprising the steps of:
adding a plurality of desensitization algorithms in a desensitization server, and setting a one-to-one quantitative relationship between each desensitization algorithm and each desensitization effect of a plurality of desensitization effects;
receiving, by the desensitization server, a desensitization instruction sent by a user end device, and reading original data from a data source server according to the desensitization instruction, wherein the desensitization instruction comprises a sensitive data type, at least one desensitization effect, and a priority ranking of the at least one desensitization effect;
constructing, by the desensitization server, a desensitization effect preference training set of the user for different sensitive data types according to the set quantitative relationship and the priority ranking contained in the received desensitization instruction, to form a decision tree;
locating, by the desensitization server, sensitive data present in the original data and determining the type of the sensitive data, selecting a desensitization algorithm for the sensitive data by using the decision tree, generating replacement data for the sensitive data according to the desensitization algorithm, and replacing the sensitive data in the original data with the corresponding replacement data to generate desensitization data; and
sending, by the desensitization server, the desensitization data to the user end device.
2. The desensitization method according to claim 1, wherein the sensitive data types comprise name, address, mailbox, telephone, certificate, account number, zip code, and date.
3. The desensitization method according to claim 2, wherein the plurality of desensitization algorithms comprises a plurality of the following: substitution algorithms, invalidation algorithms, out-of-order algorithms, averaging algorithms, anti-correlation algorithms, migration algorithms, symmetric encryption algorithms, and dynamic environment control algorithms.
4. The desensitization method according to claim 3, wherein the at least one desensitization effect comprises at least one of effectiveness, correlation, reversibility, repeatability, timeliness, and safety.
5. An adaptive desensitization system for sensitive data, comprising:
a user end device, configured to send a desensitization instruction and receive desensitization data;
a data source server, configured to store original data; and
a desensitization server, configured to:
add a plurality of desensitization algorithms and set a one-to-one quantitative relationship between each desensitization algorithm and each desensitization effect of a plurality of desensitization effects;
receive the desensitization instruction sent by the user end device and read the original data from the data source server according to the desensitization instruction, wherein the desensitization instruction comprises a sensitive data type, at least one desensitization effect, and a priority ranking of the at least one desensitization effect;
construct a desensitization effect preference training set of the user for different sensitive data types according to the set quantitative relationship and the priority ranking contained in the received desensitization instruction, to form a decision tree;
locate sensitive data present in the original data and determine the type of the sensitive data, select a desensitization algorithm for the sensitive data by using the decision tree, generate replacement data for the sensitive data according to the desensitization algorithm, and replace the sensitive data in the original data with the corresponding replacement data to generate the desensitization data; and
send the desensitization data to the user end device.
6. The desensitization system according to claim 5, wherein the desensitization server comprises a setting module, a decision tree forming module, and a desensitization processing module;
the setting module is configured to add a plurality of desensitization algorithms and set a one-to-one quantitative relationship between each desensitization algorithm and each desensitization effect of the plurality of desensitization effects;
the decision tree forming module is configured to construct a desensitization effect preference training set of the user for different sensitive data types according to the set quantitative relationship and the priority ranking contained in the received desensitization instruction, to form a decision tree; and
the desensitization processing module is configured to locate the sensitive data present in the original data and determine the type of the sensitive data, select a desensitization algorithm for the sensitive data by using the decision tree, generate replacement data for the sensitive data according to the desensitization algorithm, and replace the sensitive data in the original data with the corresponding replacement data to generate desensitization data.
CN201910860749.6A 2019-09-11 2019-09-11 Sensitive data self-adaptive desensitization method and system Pending CN110598442A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910860749.6A CN110598442A (en) 2019-09-11 2019-09-11 Sensitive data self-adaptive desensitization method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910860749.6A CN110598442A (en) 2019-09-11 2019-09-11 Sensitive data self-adaptive desensitization method and system

Publications (1)

Publication Number Publication Date
CN110598442A (en) 2019-12-20

Family

ID=68859032

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910860749.6A Pending CN110598442A (en) 2019-09-11 2019-09-11 Sensitive data self-adaptive desensitization method and system

Country Status (1)

Country Link
CN (1) CN110598442A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145799A (en) * 2017-05-04 2017-09-08 山东浪潮云服务信息科技有限公司 A kind of data desensitization method and device
CN107766741A (en) * 2017-10-23 2018-03-06 中恒华瑞(北京)信息技术有限公司 Data desensitization system and method
US20180365610A1 (en) * 2017-06-19 2018-12-20 Verité Supply chain labor intelligence
CN109460676A (en) * 2018-10-30 2019-03-12 全球能源互联网研究院有限公司 A kind of desensitization method of blended data, desensitization device and desensitization equipment
CN109815742A (en) * 2019-02-22 2019-05-28 蔷薇智慧科技有限公司 Data desensitization method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
乔宏明 等: "《运营商面向大数据应用的数据脱敏方法探讨》", 《移动通信》 *
王鑫 等: "《基于机器学习的数据脱敏系统研究与设计》", 《电力信息与通信技术》 *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111625845A (en) * 2020-04-17 2020-09-04 沈阳派客动力科技有限公司 Security management method, device and equipment for big data
CN111752969A (en) * 2020-06-23 2020-10-09 上海观安信息技术股份有限公司 Algorithm for keeping statistical characteristics
CN112182654A (en) * 2020-09-29 2021-01-05 浙江鸿程计算机系统有限公司 Identification number desensitization method capable of keeping statistical-level characteristic attributes
CN112182654B (en) * 2020-09-29 2024-03-05 浙江鸿程计算机系统有限公司 Identity card number desensitizing method capable of retaining statistical grade characteristic attribute
CN112329053A (en) * 2020-10-28 2021-02-05 上海上讯信息技术股份有限公司 Method and apparatus for desensitization of target file data
CN112395645A (en) * 2020-11-30 2021-02-23 中国民航信息网络股份有限公司 Data desensitization processing method and device
CN112395645B (en) * 2020-11-30 2024-06-11 中国民航信息网络股份有限公司 Data desensitization processing method and device
CN112632597A (en) * 2020-12-08 2021-04-09 国家计算机网络与信息安全管理中心 Data desensitization method and device readable storage medium
CN112580339B (en) * 2020-12-18 2022-04-05 北京百度网讯科技有限公司 Model training method and device, electronic equipment and storage medium
CN112580339A (en) * 2020-12-18 2021-03-30 北京百度网讯科技有限公司 Model training method and device, electronic equipment and storage medium
CN112765673A (en) * 2021-03-16 2021-05-07 杭州数梦工场科技有限公司 Sensitive data statistical method and related device
CN113792342A (en) * 2021-09-17 2021-12-14 平安普惠企业管理有限公司 Desensitization data restoration method and device, computer equipment and storage medium
CN113792342B (en) * 2021-09-17 2023-09-08 山西数字政府建设运营有限公司 Desensitization data reduction method, device, computer equipment and storage medium
CN113742763A (en) * 2021-11-08 2021-12-03 中关村科技软件股份有限公司 Confusion encryption method and system based on government affair sensitive data
CN113988226A (en) * 2021-12-29 2022-01-28 深圳红途科技有限公司 Data desensitization validity verification method and device, computer equipment and storage medium
CN114429341B (en) * 2022-01-24 2022-12-02 吉林银行股份有限公司 Grouped payment method, device and equipment
CN114429341A (en) * 2022-01-24 2022-05-03 吉林银行股份有限公司 Grouped payment method, device and equipment
CN115033914A (en) * 2022-05-30 2022-09-09 佳缘科技股份有限公司 Distributed dynamic desensitization method, system and storage medium
CN115422594A (en) * 2022-09-20 2022-12-02 成都比特信安科技有限公司 Method for realizing data desensitization by using matrix replacement
CN116702059A (en) * 2023-06-05 2023-09-05 苏州市联佳精密机械有限公司 Intelligent production workshop management system based on Internet of things
CN116702059B (en) * 2023-06-05 2023-12-19 苏州市联佳精密机械有限公司 Intelligent production workshop management system based on Internet of things
CN116776390A (en) * 2023-08-15 2023-09-19 上海观安信息技术股份有限公司 Method, device, storage medium and equipment for monitoring data leakage behavior
CN117851751A (en) * 2023-11-30 2024-04-09 深圳市马博士网络科技有限公司 Sensitive data identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191220