CN109657496B - Zero-copy full-mirror-image big data static database desensitization system and method - Google Patents

Info

Publication number
CN109657496B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811563203.6A
Other languages
Chinese (zh)
Other versions
CN109657496A (en)
Inventor
陈天莹
李霄
李全兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Electronic Technology Cyber Security Co Ltd
CETC Big Data Research Institute Co Ltd
Original Assignee
China Electronic Technology Cyber Security Co Ltd
CETC Big Data Research Institute Co Ltd
Application filed by China Electronic Technology Cyber Security Co Ltd and CETC Big Data Research Institute Co Ltd
Priority to CN201811563203.6A
Publication of CN109657496A
Application granted
Publication of CN109657496B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a zero-copy, full-mirror-image static database desensitization system for big data. The system comprises a system management module for managing the basic functions of the system; a data source management module for managing the source database addresses and target addresses of the system; a data desensitization task execution module, the core of the whole system, for configuring, executing and monitoring desensitization tasks; and a desensitization configuration management module that provides the configuration basis for desensitization tasks. The invention also discloses a zero-copy, full-mirror-image static database desensitization method for big data.

Description

Zero-copy full-mirror-image big data static database desensitization system and method
Technical Field
The invention relates to the intersection of computer technology and information security, and in particular to a zero-copy, full-mirror-image static database desensitization method and system for big data.
Background
The spread of informatization and networking has led to explosive growth in data, and as business in every industry develops rapidly, production systems accumulate large amounts of sensitive data about individuals, enterprises and governments. When real data from a business system is used directly in development, testing, outsourcing and other non-production environments, sensitive data is easily leaked. Most users therefore rely on data desensitization technology, which transforms sensitive information according to desensitization rules, to protect sensitive private data.
The data desensitization methods currently in use fall mainly into the following two categories:
1. Data desensitization based on database commands
Command-based desensitization uses the database's own SQL commands to mask or replace the fields that need desensitization. The method is simple to operate but has many disadvantages:
1) when desensitization rules are configured through database commands, the user must know exactly where the sensitive data in the database resides; any sensitive data the user overlooks can be leaked;
2) when the database contains many tables and fields, configuring desensitization rules takes the user a great deal of time and effort;
3) desensitization rules set through database commands occupy database resources and affect database performance;
4) in development and test environments, the command-based approach cannot preserve the business relevance and consistency of the desensitized data, which interferes with normal use of that data.
2. Data desensitization based on data landing
Desensitization based on data landing (persisting extracted data to local storage) generally first extracts the user's data to a local machine with a data extraction tool, then discovers the sensitive data locally and desensitizes it with a desensitization algorithm. The method can satisfy users' needs in non-production environments for timed and incremental desensitization and for consistency and relevance of the desensitized data, but it has the following problems:
1) the data is extracted from the production environment to local storage, and when the data volume is large the extraction seriously affects database performance and can even disrupt normal use of the user's database;
2) the extracted production data is stored locally, which increases the risk of data leakage; if the storage device is stolen, the user suffers heavy losses;
3) when this approach is used for timed incremental desensitization tasks, the user's database is modified in order to improve desensitization speed, which may affect the performance of that database;
4) this approach only considers timed incremental desensitization of the whole database; it does not support table-level or view-level timed incremental desensitization and so cannot cover all the scenarios users require.
The existing data desensitization methods therefore leave the following challenges open:
1) how to support timed incremental data desensitization with minimal impact on database performance;
2) how to desensitize data in a zero-copy, full-mirror-image manner so that the data never lands on local storage and the risk of sensitive data leakage is reduced;
3) how to implement timed and incremental desensitization tasks without modifying the user's database while maintaining the desensitization rate;
4) how to implement database-level, table-level and view-level data desensitization, and retain the constraint relations and views of the database after whole-library-level desensitization, so as to meet users' diverse scenario requirements;
5) how to ensure the business relevance and consistency of the data after whole-library-level desensitization.
Disclosure of Invention
To solve the above problems, the invention provides a zero-copy, full-mirror-image big data static database desensitization system and method.
The zero-copy, full-mirror-image big data static database desensitization system comprises a system management module for managing the basic functions of the system, a data source management module for managing the source database addresses and target addresses of the system, a data desensitization task execution module for configuring, executing and monitoring desensitization tasks, and a desensitization configuration management module that provides the configuration basis for desensitization tasks.
The system management module comprises a role management module, a user management module, an equipment management module and a cluster management module. The data source management module comprises a source database address management module, which handles registering, modifying, deleting, querying, enabling and disabling source databases, and a target address management module, which handles registering, modifying, deleting, querying, enabling and disabling the storage addresses of desensitized data. The data desensitization task execution module comprises a whole-library-level static data desensitization module for configuring whole-library-level data desensitization tasks, a table-level static data desensitization module for configuring table-level data desensitization tasks, a view-level static data desensitization module for configuring view-level data desensitization tasks, and a data desensitization task monitoring module for monitoring all data desensitization tasks. The desensitization configuration management module comprises a sensitive classification system management module, a data desensitization strategy management module, a user-defined sensitive field management module, a data desensitization algorithm management module, a user-defined sensitive data management module and a log query and analysis module.
The invention also discloses a zero-copy, full-mirror-image big data static database desensitization method, which uses the above desensitization system to desensitize data and comprises the following steps:
s1, acquire the database, then go to step s2;
s2, input the database source information and register the database source, then go to step s3;
s3, configure a data desensitization task and start it, then go to step s4;
s4, synchronize the data, then go to step s5;
s5, check whether the target library is connected successfully; if so, go to step s6; if not, go to step s7;
s6, determine the desensitization task type and select the task: if the type is whole-library desensitization, execute the whole-library desensitization task; if the type is table-level desensitization, execute the table-level desensitization task; if the type is view-level desensitization, execute the view-level desensitization task;
s7, the desensitization task fails and the process ends.
Database data source registration comprises the following steps:
y1, input the data source information of the source database to be registered, then go to step y2;
y2, check whether the information input in step y1 is correct; if not, registration of the source database data source fails, go to step y3; if so, go to step y4;
y3, data source registration ends;
y4, acquire the table structure of the database, then go to step y5;
y5, acquire the constraint relation information of the database, then go to step y6;
y6, acquire the database view information, then go to step y7;
y7, extract a data sample from the database using a random sampling algorithm, then go to step y8;
y8, identify the sensitive data in the sample using an intelligent adaptive sensitive data identification method, then go to step y9;
y9, output the sensitive data identification result, then go to step y10;
y10, the source database data source is registered successfully.
Data synchronization comprises the following steps:
t1, the system receives a database synchronization event and starts the data synchronization task, then go to step t2;
t2, connect to the database automatically, then go to step t3;
t3, check whether the database connection succeeded; if so, go to step t4; if not, go to step t10;
t4, acquire the database structure, then go to step t5;
t5, compare the database structure acquired in step t4 with the table structure recorded when the data source was registered, then go to step t6;
t6, check whether the structures compared in step t5 are consistent; if so, go to step t10; if not, go to step t7;
t7, update the table structure recorded at registration, then go to step t8;
t8, discover the sensitive data in the updated database using multithreaded automatic sensitive data discovery, then go to step t9;
t9, data synchronization succeeds;
t10, end.
Executing a whole-library-level desensitization task comprises the following steps:
z11, acquire the desensitization task configuration information and execute the desensitization task, then go to step z12;
z12, execute the data synchronization flow, then go to step z13;
z13, check whether the target library is connected successfully; if so, go to step z14; if not, the desensitization task fails and the process exits;
z14, check whether the desensitization task is whole-library desensitization; if so, go to step z15; if not, go to the other desensitization types;
z15, check whether the target library has relations; if so, go to step z16; if not, go to step z17;
z16, delete the target library relations, then go to step z17;
z17, distribute the tasks for the source database, then go to step z18;
z18, acquire the data desensitization rules, then go to step z19;
z19, check whether the structure of the target library table is consistent with the source database; if so, go to step z110; if not, go to step z114;
z110, check whether the desensitization task is a timed incremental task; if so, go to step z111; if not, go to step z113;
z111, desensitize the incremental data and append it to the target library, then go to step z112;
z112, check whether the append succeeded; if so, go to step z115; if not, go to step z114;
z113, delete the existing table that matches the target table of the desensitization task, then go to step z114;
z114, perform sensitive data discovery, desensitization and import for the table again, then go to step z115;
z115, write the relations and views.
Executing a table-level desensitization task comprises the following steps:
z21, acquire the desensitization task configuration information and execute the desensitization task, then go to step z22;
z22, execute the data synchronization flow, then go to step z23;
z23, check whether the target library is connected successfully; if so, go to step z24; if not, the desensitization task fails and the process exits;
z24, check whether the desensitization task is table-level desensitization; if so, go to step z25; if not, go to the other desensitization types;
z25, distribute the tasks for the source database, then go to step z26;
z26, acquire the data desensitization rules, then go to step z27;
z27, check whether the structure of the target library table is consistent with the source database; if so, go to step z28; if not, go to step z212;
z28, check whether the desensitization task is a timed incremental task; if so, go to step z29; if not, go to step z211;
z29, desensitize the incremental data and append it to the target library, then go to step z210;
z210, check whether the append succeeded; if so, go to step z213; if not, go to step z212;
z211, delete the existing table that matches the target table of the desensitization task, then go to step z212;
z212, perform sensitive data discovery, desensitization and import for the table again, then go to step z213;
z213, write the relations and views.
Executing a view-level desensitization task comprises the following steps:
z31, acquire the desensitization task configuration information and execute the desensitization task, then go to step z32;
z32, execute the data synchronization flow, then go to step z33;
z33, check whether the target library is connected successfully; if so, go to step z34; if not, the desensitization task fails and the process exits;
z34, check whether the desensitization task is view-level desensitization; if so, go to step z35; if not, go to the other desensitization types;
z35, distribute the tasks for the source database, then go to step z36;
z36, acquire the data desensitization rules, then go to step z37;
z37, check whether the structure of the target library table is consistent with the source database; if so, go to step z38; if not, go to step z312;
z38, check whether the desensitization task is a timed incremental task; if so, go to step z39; if not, go to step z311;
z39, desensitize the incremental data and append it to the target library, then go to step z310;
z310, check whether the append succeeded; if so, go to step z313; if not, go to step z312;
z311, delete the existing table that matches the target library table of the desensitization task, then go to step z312;
z312, perform sensitive data discovery, desensitization and import for the table again, then go to step z313;
z313, write the relations and views.
The invention provides a zero-copy, full-mirror-image big data static database desensitization method and system that effectively address the current problems of static desensitization of big data. During desensitization, data is extracted in batches and processed with zero copy, so it never lands on local storage, which protects the data throughout the desensitization process. A distributed cluster mode increases the speed at which desensitization tasks are processed. Timed incremental data desensitization is supported with minimal impact on database performance, and timed and incremental desensitization tasks are implemented without modifying the user's database while maintaining the desensitization rate. Database-level, table-level and view-level data desensitization is implemented, and the constraint relations of the database can be retained, meeting users' diverse scenario requirements. A consistency desensitization algorithm ensures the consistency and business relevance of the data after whole-library-level desensitization.
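The patent does not specify the consistency desensitization algorithm or the batch extraction mechanism. The following is a minimal sketch of one common way to obtain both properties, under stated assumptions: a keyed hash (HMAC-SHA256) maps the same input to the same token in every table, and a generator processes rows batch by batch in memory without writing intermediate files. The names `SECRET_KEY`, `pseudonymize` and `desensitize_batches` are hypothetical and are not taken from the patent.

```python
from __future__ import annotations

import hashlib
import hmac
from typing import Iterable, Iterator

# Hypothetical demo key; a real system would manage its key securely.
SECRET_KEY = b"demo-key-not-for-production"


def pseudonymize(value: str, key: bytes = SECRET_KEY, length: int = 12) -> str:
    """Deterministically map a value to a token (same input, same output)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def desensitize_batches(
    rows: Iterable[dict], sensitive: set[str], batch_size: int = 2
) -> Iterator[list[dict]]:
    """Yield desensitized rows in batches, keeping everything in memory."""
    batch: list[dict] = []
    for row in rows:
        # Replace only the columns flagged as sensitive.
        batch.append(
            {k: pseudonymize(str(v)) if k in sensitive else v for k, v in row.items()}
        )
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly shorter, batch
        yield batch
```

Because the mapping is deterministic, a customer identifier that appears in both a customer table and an order table receives the same token in both, so joins on that column still work after desensitization.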
Drawings
FIG. 1 is a schematic diagram of a zero-copy full-mirror big data static database desensitization system architecture according to the present invention;
FIG. 2 is a schematic diagram of a main flow of a desensitization method of a zero-copy full-mirror big data static database according to the present invention;
FIG. 3 is a schematic diagram of a data source registration process of a zero-copy full-mirror big data static database desensitization method according to the present invention;
FIG. 4 is a schematic diagram of a data synchronization process of a zero-copy full-mirror big data static database desensitization method according to the present invention;
FIG. 5 is a schematic diagram of a whole-library-level data desensitization process of a zero-copy full-mirror big data static database desensitization method according to the present invention;
FIG. 6 is a table-level data desensitization flow diagram of a zero-copy full-mirror big data static database desensitization method according to the present invention;
FIG. 7 is a view-level data desensitization flow diagram of a zero-copy full-mirror big data static database desensitization method according to the present invention.
Detailed Description
For a better understanding of the present invention, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
The zero-copy, full-mirror-image big data static database desensitization system mainly comprises a system management module, a data source management module, a data desensitization task execution module and a desensitization configuration management module, of which the data desensitization task execution module is the core. As shown in FIG. 1, the system management module manages the basic functions of the system; the data source management module manages the source database addresses and target addresses of the system; the data desensitization task execution module configures, executes and monitors desensitization tasks; and the desensitization configuration management module provides the configuration basis for desensitization tasks. Specifically, the system management module comprises a role management module, a user management module, an equipment management module and a cluster management module. The role management module creates, deletes, queries and modifies system roles, among similar operations, and grants permissions to the roles. The user management module creates, deletes, queries and modifies the operating users of the system, among similar operations, and assigns roles and data sources to the users.
The data source management module comprises a source database address management module and a target address management module. The source database address management module handles registering, modifying, deleting, querying, enabling and disabling source databases, among other functions. The target address management module handles registering, modifying, deleting, querying, enabling and disabling the storage addresses of desensitized data, among other functions, and covers both target database addresses and folder addresses.
The data desensitization task execution module comprises a whole-library-level static data desensitization module, a table-level static data desensitization module, a view-level static data desensitization module and a data desensitization task monitoring module. It configures, executes and monitors data desensitization tasks at the whole-library level, the table level and the view level, and is the core of the whole system. The whole-library-level static data desensitization module configures whole-library-level data desensitization tasks; the table-level static data desensitization module configures table-level data desensitization tasks; and the view-level static data desensitization module configures view-level data desensitization tasks. In each case the configuration information includes timing, increment, sensitive data discovery, subset extraction, target output address and the like. The desensitization task monitoring module monitors all data desensitization tasks and supports modifying, deleting, stopping and similar operations on them.
The desensitization configuration management module comprises a sensitive classification system management module, a data desensitization strategy management module, a user-defined sensitive field management module, a data desensitization algorithm management module, a user-defined sensitive data management module and a log query and analysis module. Desensitization configuration management is the basis of data desensitization task configuration. The sensitive classification system management module lets the user view the system's default sensitive classification system and build a custom one; the data desensitization strategy management module supports creating, deleting, querying and modifying data desensitization strategies based on the sensitive classification system; the user-defined sensitive field management module lets the user define the sensitive fields in a database and their sensitive types; the user-defined sensitive data management module lets the user define sensitive data and its replacement rules; the data desensitization algorithm management module supports user-defined data desensitization algorithms and creating, deleting, querying and modifying them; the log query and analysis module records system logs and service logs and supports querying and statistical analysis of the logs.
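The configuration items listed for each task level (timing, increment, sensitive data discovery, subset extraction, target output address) can be pictured as a single record. The sketch below is illustrative only: the class and field names, the representation of timing as an optional schedule string, and the representation of subset extraction as a fraction are all assumptions, since the patent does not define a concrete configuration format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskLevel(Enum):
    WHOLE_LIBRARY = "whole_library"
    TABLE = "table"
    VIEW = "view"


@dataclass
class DesensitizationTaskConfig:
    level: TaskLevel
    source_id: str                        # registered source database (hypothetical id)
    target_address: str                   # target database or folder address
    schedule: Optional[str] = None        # timing; the string format is an assumption
    incremental: bool = False             # increment
    discover_sensitive_data: bool = True  # sensitive data discovery
    subset_fraction: float = 1.0          # subset extraction, modeled as a fraction

    def validate(self) -> None:
        """Reject obviously unusable configurations."""
        if not self.target_address:
            raise ValueError("target_address must not be empty")
        if not 0.0 < self.subset_fraction <= 1.0:
            raise ValueError("subset_fraction must be in (0, 1]")
```

Using one record type for all three levels mirrors the text, which lists the same configuration items for the whole-library, table and view modules.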
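As an illustration of what managing user-defined desensitization algorithms could look like, the sketch below keeps named algorithms in a registry that supports adding, deleting, querying and modifying. The `AlgorithmRegistry` class and the `mask_middle` algorithm are hypothetical examples; the patent names the module but does not describe its interface.

```python
from typing import Callable, Dict

Algorithm = Callable[[str], str]


class AlgorithmRegistry:
    """In-memory store of named, user-defined desensitization algorithms."""

    def __init__(self) -> None:
        self._algorithms: Dict[str, Algorithm] = {}

    def add(self, name: str, fn: Algorithm) -> None:
        if name in self._algorithms:
            raise KeyError(f"algorithm {name!r} already exists")
        self._algorithms[name] = fn

    def modify(self, name: str, fn: Algorithm) -> None:
        if name not in self._algorithms:
            raise KeyError(f"algorithm {name!r} not found")
        self._algorithms[name] = fn

    def delete(self, name: str) -> None:
        del self._algorithms[name]  # raises KeyError if absent

    def query(self, name: str) -> Algorithm:
        return self._algorithms[name]


def mask_middle(value: str, keep: int = 3) -> str:
    """Example algorithm: keep the first and last `keep` characters."""
    if len(value) <= 2 * keep:
        return "*" * len(value)
    return value[:keep] + "*" * (len(value) - 2 * keep) + value[-keep:]
```

For example, after `registry.add("mask_middle", mask_middle)`, calling `registry.query("mask_middle")("13800000001")` returns `"138*****001"`.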
As shown in FIG. 2, the zero-copy, full-mirror-image big data static database desensitization method of the present invention uses the desensitization system described above to desensitize data, and comprises the following steps:
s1, acquire the database, then go to step s2;
s2, input the database source information and register the database source, then go to step s3;
s3, configure a data desensitization task and start it, then go to step s4;
s4, synchronize the data, then go to step s5;
s5, check whether the target library is connected successfully; if so, go to step s6; if not, go to step s7;
s6, determine the desensitization task type and select the task: if the type is whole-library desensitization, execute the whole-library desensitization task; if the type is table-level desensitization, execute the table-level desensitization task; if the type is view-level desensitization, execute the view-level desensitization task;
s7, the desensitization task fails and the process ends.
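Steps s5 to s7 amount to a connection check followed by a dispatch on the task type. A minimal sketch of that control flow follows; the function name, the string task types and the placeholder handlers are assumptions made for illustration, and the handling of an unknown task type is not described in the patent.

```python
from typing import Callable, Dict


def run_desensitization(
    task_type: str,
    target_connected: bool,
    handlers: Dict[str, Callable[[], str]],
) -> str:
    """Sketch of steps s5 to s7: check the target, then dispatch by type."""
    if not target_connected:           # s5 fails, so go to s7
        return "failed: target library not reachable"
    handler = handlers.get(task_type)  # s6: select the task by type
    if handler is None:                # assumption: unknown types are rejected
        return f"failed: unknown task type {task_type!r}"
    return handler()


# Placeholder handlers standing in for the three task flows.
HANDLERS: Dict[str, Callable[[], str]] = {
    "whole_library": lambda: "whole-library desensitization executed",
    "table": lambda: "table-level desensitization executed",
    "view": lambda: "view-level desensitization executed",
}
```

A dictionary of handlers keeps the three task flows independent of the dispatcher, so a new task level could be added without changing `run_desensitization`.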
Data source registration is the core of source database management. It supports database desensitization tasks by extracting the data structure and constraint relations of the source database and discovering its sensitive data. As shown in FIG. 3, database data source registration comprises the following steps:
y1, input the data source information of the source database to be registered, then go to step y2;
y2, check whether the information input in step y1 is correct; if not, registration of the source database data source fails, go to step y3; if so, go to step y4;
y3, data source registration ends;
y4, acquire the table structure of the database, then go to step y5;
y5, acquire the constraint relation information of the database, then go to step y6;
y6, acquire the database view information, then go to step y7;
y7, extract a data sample from the database using a random sampling algorithm, then go to step y8;
y8, identify the sensitive data in the sample using an intelligent adaptive sensitive data identification method, then go to step y9;
y9, output the sensitive data identification result, then go to step y10;
y10, the source database data source is registered successfully.
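Steps y7 to y9 can be illustrated with a small sketch that samples rows at random and flags columns whose sampled values mostly match a known pattern. This is a simple rule-based stand-in, not the "intelligent adaptive" identification method of the patent, which is not specified. The regular expressions, the 0.8 threshold, the function names and the use of SQLite are all assumptions.

```python
import random
import re
import sqlite3
from typing import Dict, List, Sequence, Tuple

# Hypothetical detection rules; a real system would use its own classification.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    "phone": re.compile(r"^1\d{10}$"),  # simplified 11-digit mobile number
}


def sample_table(
    conn: sqlite3.Connection, table: str, k: int, seed: int = 0
) -> Tuple[List[str], List[tuple]]:
    """y7: draw up to k random rows (loads the table, fine for a sketch)."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    return columns, random.Random(seed).sample(rows, min(k, len(rows)))


def identify_sensitive(
    columns: Sequence[str], sample: Sequence[tuple], threshold: float = 0.8
) -> Dict[str, str]:
    """y8 and y9: label a column when enough sampled values match a rule."""
    result: Dict[str, str] = {}
    for idx, col in enumerate(columns):
        values = [str(r[idx]) for r in sample if r[idx] is not None]
        if not values:
            continue
        for label, pattern in PATTERNS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                result[col] = label
                break
    return result
```

Working on a sample rather than the full table keeps the load on the source database small, which matches the stated goal of minimal impact on database performance.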
Data synchronization is the basis of a data desensitization task and helps ensure that the task executes successfully. As shown in FIG. 4, data synchronization comprises the following steps:
t1, the system receives a database synchronization event and starts the data synchronization task, then go to step t2;
t2, connect to the database automatically, then go to step t3;
t3, check whether the database connection succeeded; if so, go to step t4; if not, go to step t10;
t4, acquire the database structure, then go to step t5;
t5, compare the database structure acquired in step t4 with the table structure recorded when the data source was registered, then go to step t6;
t6, check whether the structures compared in step t5 are consistent; if so, go to step t10; if not, go to step t7;
t7, update the table structure recorded at registration, then go to step t8;
t8, discover the sensitive data in the updated database using multithreaded automatic sensitive data discovery, then go to step t9;
t9, data synchronization succeeds;
t10, end.
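The structure comparison in steps t5 to t7 can be sketched as a comparison of two schema snapshots. The sketch assumes a schema is represented as a mapping from table name to an ordered list of column names, which is a simplification chosen for illustration; the function name and return shape are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, List[str]]  # table name -> ordered column names (assumption)


def synchronize(registered: Schema, current: Schema) -> Tuple[Schema, List[str]]:
    """Sketch of t5 to t8: return the updated schema and tables to rescan."""
    if current == registered:  # t6: consistent, nothing to do (go to t10)
        return registered, []
    # t7: tables that are new or whose columns changed since registration
    changed = sorted(t for t in current if registered.get(t) != current[t])
    updated = {t: list(cols) for t, cols in current.items()}
    # t8 would run sensitive data discovery on `changed`
    return updated, changed
```

Returning only the changed tables means sensitive data discovery in step t8 need not rescan tables whose structure is unchanged; whether the patented system limits the rescan in this way is not stated.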
Whole-database-level data desensitization configures and executes a data desensitization task with the whole database as the minimum granularity. As shown in Fig. 5, executing a whole-database-level desensitization task comprises the following steps:
z11, obtain the desensitization task configuration information and start the desensitization task; go to step z12;
z12, execute the data synchronization flow; go to step z13;
z13, check whether the target database is connected successfully; if so, go to step z14; otherwise, the desensitization task fails and the process exits;
z14, determine whether the desensitization task is whole-database desensitization; if so, go to step z15; if not, handle it as another type of desensitization;
z15, determine whether constraint relationships exist in the target database; if so, go to step z16; if not, go to step z17;
z16, delete the relationships in the target database; go to step z17;
z17, perform task distribution for the source database; go to step z18;
z18, acquire the data desensitization rules; go to step z19;
z19, determine whether the table structure of the target database is consistent with the source database; if so, go to step z110; if not, go to step z114;
z110, determine whether the desensitization task is a scheduled incremental task; if so, go to step z111; if not, go to step z113;
z111, desensitize the incremental data and append it to the target database; go to step z112;
z112, determine whether the append succeeded; if so, go to step z115; if not, go to step z114;
z113, delete the existing table that matches the target table of the desensitization task; go to step z114;
z114, perform sensitive data discovery, desensitization and import for the table again; go to step z115;
z115, write the relationships and views.
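The branching in steps z11 to z115 can be summarized as a small planning function. The sketch below is a hedged illustration only: `TaskState`, its field names, and the action labels are hypothetical, and the task is assumed to have already passed the type check in step z14.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskState:
    # All fields are hypothetical inputs standing in for real checks.
    target_connected: bool
    target_has_relationships: bool
    structure_matches: bool
    is_scheduled_incremental: bool
    append_succeeds: bool = True


def plan_whole_database_task(s: TaskState) -> List[str]:
    """Return the ordered actions the flow would take for one table."""
    steps = ["load_config", "synchronize"]              # z11, z12
    if not s.target_connected:                          # z13
        return steps + ["fail"]
    if s.target_has_relationships:                      # z15
        steps.append("drop_target_relationships")       # z16
    steps += ["distribute_task", "load_rules"]          # z17, z18

    full_reload = not s.structure_matches               # z19
    if s.structure_matches:
        if s.is_scheduled_incremental:                  # z110
            steps.append("desensitize_and_append_increment")  # z111
            full_reload = not s.append_succeeds         # z112
        else:
            steps.append("drop_target_table")           # z113
            full_reload = True
    if full_reload:
        steps.append("rediscover_desensitize_import")   # z114
    steps.append("write_relationships_and_views")       # z115
    return steps
```

Note that a failed incremental append falls back to the full rediscover, desensitize and import path, as step z112 describes.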
Table-level data desensitization configures and executes a data desensitization task with a table as the minimum granularity. As shown in Fig. 6, executing a table-level desensitization task comprises the following steps:
z21, obtain the desensitization task configuration information and start the desensitization task; go to step z22;
z22, execute the data synchronization flow; go to step z23;
z23, check whether the target database is connected successfully; if so, go to step z24; otherwise, the desensitization task fails and the process exits;
z24, determine whether the desensitization task is table-level desensitization; if so, go to step z25; if not, handle it as another type of desensitization;
z25, perform task distribution for the source database; go to step z26;
z26, acquire the data desensitization rules; go to step z27;
z27, determine whether the table structure of the target database is consistent with the source database; if so, go to step z28; if not, go to step z212;
z28, determine whether the desensitization task is a scheduled incremental task; if so, go to step z29; if not, go to step z211;
z29, desensitize the incremental data and append it to the target database; go to step z210;
z210, determine whether the append succeeded; if so, go to step z213; if not, go to step z212;
z211, delete the existing table that matches the target table of the desensitization task; go to step z212;
z212, perform sensitive data discovery, desensitization and import for the table again; go to step z213;
z213, write the relationships and views.
View-level data desensitization configures and executes a data desensitization task with a view as the minimum granularity. As shown in Fig. 7, executing a view-level desensitization task comprises the following steps:
z31, obtain the desensitization task configuration information and start the desensitization task; go to step z32;
z32, execute the data synchronization flow; go to step z33;
z33, check whether the target database is connected successfully; if so, go to step z34; otherwise, the desensitization task fails and the process exits;
z34, determine whether the desensitization task is view-level desensitization; if so, go to step z35; if not, handle it as another type of desensitization;
z35, perform task distribution for the source database; go to step z36;
z36, acquire the data desensitization rules; go to step z37;
z37, determine whether the table structure of the target database is consistent with the source database; if so, go to step z38; if not, go to step z312;
z38, determine whether the desensitization task is a scheduled incremental task; if so, go to step z39; if not, go to step z311;
z39, desensitize the incremental data and append it to the target database; go to step z310;
z310, determine whether the append succeeded; if so, go to step z313; if not, go to step z312;
z311, delete the existing table that matches the target table of the desensitization task; go to step z312;
z312, perform sensitive data discovery, desensitization and import for the table again; go to step z313;
z313, write the relationships and views.
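Steps z14, z24 and z34 each route a task to the flow that matches its granularity. One possible way to express that routing is a lookup table keyed by task level. The enum, handler names and return strings below are hypothetical placeholders, not part of the described system.

```python
from enum import Enum
from typing import Callable, Dict


class TaskLevel(Enum):
    WHOLE_DATABASE = "whole_database"
    TABLE = "table"
    VIEW = "view"


# Placeholder handlers; a real system would run the corresponding flow.
def run_whole_database(task_id: str) -> str:
    return f"{task_id}: whole-database flow (z11 to z115)"


def run_table(task_id: str) -> str:
    return f"{task_id}: table flow (z21 to z213)"


def run_view(task_id: str) -> str:
    return f"{task_id}: view flow (z31 to z313)"


HANDLERS: Dict[TaskLevel, Callable[[str], str]] = {
    TaskLevel.WHOLE_DATABASE: run_whole_database,
    TaskLevel.TABLE: run_table,
    TaskLevel.VIEW: run_view,
}


def dispatch(task_id: str, level: TaskLevel) -> str:
    """Select the flow for the task's granularity and run it."""
    return HANDLERS[level](task_id)
```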
The invention provides a zero-copy full-mirror big data static database desensitization method and system that address the current problem of static desensitization of big data. During desensitization, data is extracted in batches and processed with zero copy, and it is never written to disk, which protects the data while it is being desensitized. A distributed cluster mode increases the speed at which desensitization tasks are processed. Scheduled incremental desensitization is supported with minimal impact on database performance, and scheduled incremental tasks are carried out without modifying the user's database while maintaining the desensitization rate. Whole-database-level, table-level and view-level data desensitization are provided, and the constraint relationships of the database can be preserved, which meets the varied scenario requirements of users. A consistency desensitization algorithm ensures the consistency and business relevance of the data after whole-database-level desensitization.
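As a rough illustration of two of the ideas above, batch extraction without intermediate files and consistent desensitization, consider the following sketch. It rests on stated assumptions: the patent does not disclose its consistency algorithm, so a keyed HMAC-SHA256 pseudonym is swapped in here purely as an example, and "zero copy" is approximated by streaming batches through memory with a generator rather than by any operating-system-level mechanism. All function names and the sample key are hypothetical.

```python
import hashlib
import hmac
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Set

Row = Dict[str, str]


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield rows in fixed-size batches without materializing the whole input."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def mask_value(value: str, key: bytes) -> str:
    """Deterministic pseudonym: equal inputs map to equal outputs under one key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def desensitize_stream(rows: Iterable[Row], sensitive: Set[str],
                       key: bytes, size: int = 2) -> Iterator[List[Row]]:
    """Mask sensitive columns batch by batch; nothing is written to disk."""
    for chunk in batches(rows, size):
        yield [
            {col: mask_value(val, key) if col in sensitive else val
             for col, val in row.items()}
            for row in chunk
        ]
```

Because the mapping is deterministic for a given key, the same source value receives the same masked value in every table, so joins on a masked column still line up after desensitization.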
The above describes only preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims (7)

1. A zero-copy full-mirror big data static database desensitization system, characterized in that the desensitization system comprises a system management module for managing the basic functions of the system, a data source management module for managing the source database addresses and target addresses of the system, a data desensitization task execution module for configuring, executing and monitoring desensitization tasks, and a desensitization configuration management module for providing the configuration basis for desensitization tasks;
the data desensitization task execution module comprises a whole-database-level static data desensitization module for configuring whole-database-level data desensitization tasks, a table-level static data desensitization module for configuring table-level data desensitization tasks, a view-level static data desensitization module for configuring view-level data desensitization tasks, and a data desensitization task monitoring module for monitoring all data desensitization tasks.
2. The zero-copy full-mirror big data static database desensitization system according to claim 1, wherein the system management module comprises a role management module, a user management module, a device management module and a cluster management module.
3. The zero-copy full-mirror big data static database desensitization system according to claim 1, wherein the data source management module comprises a source database address management module that manages registration, modification, deletion, querying, enabling and disabling of source databases; the data source management module further comprises a target address management module that manages registration, modification, deletion, querying, enabling and disabling of the storage addresses of desensitized data.
4. The zero-copy full-mirror big data static database desensitization system according to claim 1, wherein the desensitization configuration management module comprises a sensitive classification system management module, a data desensitization policy management module, a custom sensitive field management module, a data desensitization algorithm management module, a custom sensitive data management module, and a log query and analysis module.
5. A zero-copy full-mirror big data static database desensitization method, characterized in that data is desensitized using the zero-copy full-mirror big data static database desensitization system according to any one of claims 1 to 4, the method comprising the following steps:
s1, acquire the database; go to step s2;
s2, enter the database source information and register the database source; go to step s3;
s3, configure a data desensitization task and start the desensitization task; go to step s4;
s4, synchronize the data; go to step s5;
s5, check whether the target database is connected successfully; if so, go to step s6; if not, go to step s7;
s6, determine the desensitization task type and select the task accordingly: if the desensitization task type is whole-database desensitization, execute the whole-database-level desensitization task; if the desensitization task type is table-level desensitization, execute the table-level desensitization task; if the desensitization task type is view-level desensitization, execute the view-level desensitization task;
executing a whole-database-level desensitization task comprises the following steps:
z11, obtain the desensitization task configuration information and start the desensitization task; go to step z12;
z12, execute the data synchronization flow; go to step z13;
z13, check whether the target database is connected successfully; if so, go to step z14; otherwise, the desensitization task fails and the process exits;
z14, determine whether the desensitization task is whole-database desensitization; if so, go to step z15; if not, handle it as another type of desensitization;
z15, determine whether constraint relationships exist in the target database; if so, go to step z16; if not, go to step z17;
z16, delete the relationships in the target database; go to step z17;
z17, perform task distribution for the source database; go to step z18;
z18, acquire the data desensitization rules; go to step z19;
z19, determine whether the table structure of the target database is consistent with the source database; if so, go to step z110; if not, go to step z114;
z110, determine whether the desensitization task is a scheduled incremental task; if so, go to step z111; if not, go to step z113;
z111, desensitize the incremental data and append it to the target database; go to step z112;
z112, determine whether the append succeeded; if so, go to step z115; if not, go to step z114;
z113, delete the existing table that matches the target table of the desensitization task; go to step z114;
z114, perform sensitive data discovery, desensitization and import for the table again; go to step z115;
z115, write the relationships and views;
executing a table-level desensitization task comprises the following steps:
z21, obtain the desensitization task configuration information and start the desensitization task; go to step z22;
z22, execute the data synchronization flow; go to step z23;
z23, check whether the target database is connected successfully; if so, go to step z24; otherwise, the desensitization task fails and the process exits;
z24, determine whether the desensitization task is table-level desensitization; if so, go to step z25; if not, handle it as another type of desensitization;
z25, perform task distribution for the source database; go to step z26;
z26, acquire the data desensitization rules; go to step z27;
z27, determine whether the table structure of the target database is consistent with the source database; if so, go to step z28; if not, go to step z212;
z28, determine whether the desensitization task is a scheduled incremental task; if so, go to step z29; if not, go to step z211;
z29, desensitize the incremental data and append it to the target database; go to step z210;
z210, determine whether the append succeeded; if so, go to step z213; if not, go to step z212;
z211, delete the existing table that matches the target table of the desensitization task; go to step z212;
z212, perform sensitive data discovery, desensitization and import for the table again; go to step z213;
z213, write the relationships and views;
executing a view-level desensitization task comprises the following steps:
z31, obtain the desensitization task configuration information and start the desensitization task; go to step z32;
z32, execute the data synchronization flow; go to step z33;
z33, check whether the target database is connected successfully; if so, go to step z34; otherwise, the desensitization task fails and the process exits;
z34, determine whether the desensitization task is view-level desensitization; if so, go to step z35; if not, handle it as another type of desensitization;
z35, perform task distribution for the source database; go to step z36;
z36, acquire the data desensitization rules; go to step z37;
z37, determine whether the table structure of the target database is consistent with the source database; if so, go to step z38; if not, go to step z312;
z38, determine whether the desensitization task is a scheduled incremental task; if so, go to step z39; if not, go to step z311;
z39, desensitize the incremental data and append it to the target database; go to step z310;
z310, determine whether the append succeeded; if so, go to step z313; if not, go to step z312;
z311, delete the existing table that matches the target table of the desensitization task; go to step z312;
z312, perform sensitive data discovery, desensitization and import for the table again; go to step z313;
z313, write the relationships and views;
s7, the desensitization task fails and the process ends.
6. The zero-copy full-mirror big data static database desensitization method according to claim 5, wherein registering the database source comprises the following steps:
y1, enter the data source information of the source database to be registered; go to step y2;
y2, determine whether the information entered in step y1 is correct; if not, registration of the source database data source fails, go to step y3; if so, go to step y4;
y3, end data source registration;
y4, acquire the table structure of the database; go to step y5;
y5, acquire the constraint relationship information of the database; go to step y6;
y6, acquire the database view information; go to step y7;
y7, extract a data sample from the database using a random sampling algorithm; go to step y8;
y8, identify the sensitive data in the sample using intelligent adaptive sensitive-data identification; go to step y9;
y9, output the sensitive data identification result; go to step y10;
y10, registration of the source database data source succeeds.
7. The zero-copy full-mirror big data static database desensitization method according to claim 6, wherein data synchronization comprises the following steps:
t1, the system receives a database synchronization event and starts the data synchronization task; go to step t2;
t2, connect to the database automatically; go to step t3;
t3, determine whether the database connection succeeded; if so, go to step t4; otherwise, go to step t10;
t4, acquire the database structure; go to step t5;
t5, compare the database structure acquired in step t4 with the table structure recorded when the data source was registered; go to step t6;
t6, determine whether the structures compared in step t5 are consistent; if so, go to step t10; if not, go to step t7;
t7, update the table structure recorded at registration; go to step t8;
t8, discover the sensitive data in the updated database using multithreaded automatic sensitive-data discovery; go to step t9;
t9, data synchronization succeeds;
t10, end.
CN201811563203.6A 2018-12-20 2018-12-20 Zero-copy full-mirror-image big data static database desensitization system and method Active CN109657496B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811563203.6A CN109657496B (en) 2018-12-20 2018-12-20 Zero-copy full-mirror-image big data static database desensitization system and method

Publications (2)

Publication Number Publication Date
CN109657496A CN109657496A (en) 2019-04-19
CN109657496B true CN109657496B (en) 2022-07-05

Family

ID=66115360

Country Status (1)

Country Link
CN (1) CN109657496B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532799B (en) * 2019-07-31 2023-03-24 平安科技(深圳)有限公司 Data desensitization control method, electronic device and computer readable storage medium
CN111177785B (en) * 2019-12-31 2023-04-11 广东鸿数科技有限公司 Desensitization processing method for private data of enterprise-based business system
CN111858546A (en) * 2020-06-22 2020-10-30 网联清算有限公司 Data processing method, device and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529329A (en) * 2016-10-11 2017-03-22 中国电子科技网络信息安全有限公司 Desensitization system and desensitization method used for big data
CN106599713A (en) * 2016-11-11 2017-04-26 中国电子科技网络信息安全有限公司 Database masking system and method based on big data
CN106778351A (en) * 2016-12-30 2017-05-31 中国民航信息网络股份有限公司 Data desensitization method and device
CN107403111A (en) * 2017-08-10 2017-11-28 中国民航信息网络股份有限公司 HIVE data desensitization method and device
CN207489017U (en) * 2017-10-23 2018-06-12 中恒华瑞(北京)信息技术有限公司 Data desensitization system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090132419A1 (en) * 2007-11-15 2009-05-21 Garland Grammer Obfuscating sensitive data while preserving data usability
CN107441317A (en) * 2016-05-30 2017-12-08 王停 It is a kind of to be used for the special Chinese medicinal formulae for reporting the treatment of constitution allergic rhinitis
CN106407843A (en) * 2016-10-17 2017-02-15 深圳中兴网信科技有限公司 Data desensitization method and data desensitization device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant