CN109766713A - Method for realizing dynamic rapid desensitization of data based on proxy - Google Patents


Info

Publication number
CN109766713A
Authority
CN
China
Prior art keywords
data
desensitization
sensitive
dynamic
agency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811536698.3A
Other languages
Chinese (zh)
Other versions
CN109766713B (en)
Inventor
杨国玉
白西让
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Datang Corp Science and Technology Research Institute Co Ltd
Original Assignee
China Datang Corp Science and Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-12-15
Filing date
2018-12-15
Publication date
2019-05-17
Application filed by China Datang Corp Science and Technology Research Institute Co Ltd
Priority to CN201811536698.3A
Publication of CN109766713A
Application granted
Publication of CN109766713B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Storage Device Security (AREA)

Abstract

The present invention relates to a proxy-based method for realizing dynamic rapid desensitization of data, comprising: step 1, splitting out the uniformly formatted data within the data to obtain a split dictionary set, where the uniformly formatted data include several of the following: 11-digit numeric data, 2-character data, 3-character data, and text data of more than 10 characters; step 2, classifying and identifying the sensitive information in the split dictionary set to obtain sensitive data, where the sensitive information includes several of the following: ID card numbers, mobile phone numbers, bank card numbers, names, and social security numbers; step 3, dynamically desensitizing the sensitive data with a desensitization algorithm, performing load balancing during dynamic desensitization according to the sensitive data categories and the amount of data under each category, so that dynamic desensitization is as efficient as possible. The invention can be used to desensitize sensitive data, performing rapid dynamic desensitization at the moment the data are accessed, and lays a solid foundation for building a safe and reliable data usage environment.

Description

Method for realizing dynamic rapid desensitization of data based on proxy
Technical field
The invention belongs to the field of information security technology, and in particular relates to a proxy-based method for realizing dynamic rapid desensitization of data.
Background art
Data desensitization refers to transforming certain sensitive information according to desensitization rules so that private and sensitive data are reliably protected. Where customer security data or commercially sensitive data are involved, real data are transformed and provided for test use without violating system rules; personal information such as ID card numbers, mobile phone numbers, card numbers, and customer numbers all requires data desensitization. Data desensitization is one of the database security technologies, which mainly include database vulnerability scanning, database encryption, database firewalls, data desensitization, and database security audit systems. Database security risks include database dumping, bulk data harvesting, and credential stuffing.
Big data environments are gradually being adopted by large enterprises. The ownership and usage rights of sensitive enterprise data are not clearly defined or managed, which may lead to the leakage of users' private information and of internal enterprise data, directly damaging both the reputation and the finances of the enterprise. Externally, data have value, and the complex, sensitive, and comprehensive data in a big data platform will undoubtedly attract more potential attackers. At the same time, because data are aggregated in large quantities, a single successful attack yields far more data, which greatly reduces the attacker's cost. Big data is therefore likely to become a prominent target of network attacks. Serious gaps in the security capabilities of big data platforms, and the resulting risks, are widespread; the platforms themselves are fragile and pose a great risk to enterprise data security, a risk that enterprises can hardly ignore.
In a big data environment, data are mostly stored in NoSQL form, and the various types of data are not desensitized before being stored. Detecting sensitive content in the accessed data and desensitizing it at the moment of access is an important guarantee of data access security in a big data environment.
Summary of the invention
The object of the present invention is to provide a proxy-based method for realizing dynamic rapid desensitization of data, for use in the field of data security and desensitization, which performs rapid dynamic desensitization at the moment the data are accessed.
The present invention provides a proxy-based method for realizing dynamic rapid desensitization of data, comprising:
Step 1, splitting out the uniformly formatted data within the data to obtain a split dictionary set; the uniformly formatted data include several of the following: 11-digit numeric data, 2-character data, 3-character data, and text data of more than 10 characters;
Step 2, classifying and identifying the sensitive information in the split dictionary set to obtain sensitive data; the sensitive information includes several of the following: ID card numbers, mobile phone numbers, bank card numbers, names, and social security numbers;
Step 3, dynamically desensitizing the sensitive data with a desensitization algorithm, performing load balancing during dynamic desensitization according to the sensitive data categories and the amount of data under each category, so that dynamic desensitization is as efficient as possible.
Further, step 1 includes:
Dividing the data as a whole, distinguishing Chinese text, digits, and English letters;
Based on this division, counting the length of each segment and combining the length with the segment type, the combined result serving as a key of the split dictionary;
Storing each piece of data under the key corresponding to its format to obtain the split dictionary set.
Further, step 3 includes:
Counting the number of sensitive fields, denoted M; counting the amount of data under each sensitive field and accumulating the counts, the total being denoted N;
Placing each sensitive field together with its data into a pending store;
Initializing M/2 asynchronous threads and configuring them as follows: each time a thread processes sensitive data it handles only N/M records, and when fewer records remain it does not take records from other categories; each thread is then set to the idle state;
When a thread is idle, it takes one sensitive field awaiting desensitization from the pending store and desensitizes it; once all data under that sensitive field have been processed, the sensitive field and its data are removed from the pending store.
With the above scheme, the proxy-based method for realizing dynamic rapid desensitization of data can be used to desensitize sensitive data, performing rapid dynamic desensitization at the moment the data are accessed, and lays a solid foundation for building a safe and reliable data usage environment.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the specification, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is the overall flowchart of the proxy-based method for realizing dynamic rapid desensitization of data according to the present invention;
Fig. 2 is the flowchart of the data splitting algorithm of the method;
Fig. 3 is the flowchart of the data classification algorithm of the method;
Fig. 4 is the flowchart of the data desensitization algorithm of the method.
Detailed description of the embodiments
Specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the present invention and are not intended to limit its scope.
This embodiment provides a proxy-based method for realizing dynamic rapid desensitization of data, comprising:
Step 1, splitting out the uniformly formatted data within the data to obtain a split dictionary set; the uniformly formatted data include several of the following: 11-digit numeric data, 2-character data, 3-character data, and text data of more than 10 characters.
Step 2, classifying and identifying the sensitive information in the split dictionary set to obtain sensitive data; the sensitive information includes several of the following: ID card numbers, mobile phone numbers, bank card numbers, names, and social security numbers.
Step 3, dynamically desensitizing the sensitive data with a desensitization algorithm, performing load balancing during dynamic desensitization according to the sensitive data categories and the amount of data under each category, so that dynamic desensitization is as efficient as possible.
This proxy-based method can be used to desensitize sensitive data, performing rapid dynamic desensitization at the moment the data are accessed, and lays a solid foundation for building a safe and reliable data usage environment.
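The text does not describe how the proxy itself is built, so the following Python sketch is only an illustration of the general idea of desensitizing at access time. The class `DesensitizingProxy`, the helper `mask_mobile_numbers`, and the assumption that the data source exposes a `get(key)` method are all hypothetical and are not taken from the patent.

```python
import re

# Hypothetical rule: an 11-digit number starting with 1[3-9] is treated as a
# mobile phone number; the middle four digits are replaced with asterisks.
_MOBILE = re.compile(r"(?<!\d)(1[3-9]\d)\d{4}(\d{4})(?!\d)")


def mask_mobile_numbers(text):
    """Mask the middle four digits of every mobile-like number in text."""
    return _MOBILE.sub(r"\1****\2", text)


class DesensitizingProxy:
    """Sits in front of a data source and masks string values on every read.

    The stored data are never modified; only the returned copy is masked.
    """

    def __init__(self, backend, desensitize):
        self._backend = backend          # assumed: any object with get(key)
        self._desensitize = desensitize  # callable taking and returning str

    def get(self, key):
        raw = self._backend.get(key)
        if isinstance(raw, str):
            return self._desensitize(raw)
        return raw                       # non-string values pass through
```

With a plain `dict` as the backend, `DesensitizingProxy({"u1": "Contact: 13800138000"}, mask_mobile_numbers).get("u1")` returns `"Contact: 138****8000"`, while the dictionary itself still holds the original value.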
The invention is described in further detail below.
As shown in the overall flowchart of Fig. 1, the method includes dynamic splitting, classification, and desensitization of data.
Referring to Fig. 2, the dynamic splitting algorithm breaks the data apart so that they can be classified quickly and desensitized in a targeted way. That is, the uniformly formatted data within the data are split out individually, including 11-digit numeric data, 2-character data, 3-character data, text data of more than 10 characters, and so on, in preparation for the subsequent targeted desensitization. The specific steps include:
(1) dividing the data as a whole, that is, distinguishing three kinds of content: Chinese text, digits, and English letters;
(2) for this division, counting the length of each segment and combining the length with the segment type, for example "3 Chinese characters", "11 digits", "10 or fewer English letters", the combined result serving as a key of the split dictionary;
(3) storing each piece of data under the key corresponding to its format to obtain the split dictionary set.
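The three splitting steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `split_to_dictionary`, the `(kind, length)` tuple used as the dictionary key, and the Unicode range used to recognize Chinese text are all choices made here. The text also mentions range-style keys such as "10 or fewer English letters"; for simplicity this sketch keys on the exact segment length.

```python
import re
from collections import defaultdict

# Assumed segment types: runs of CJK characters, digits, or English letters.
_SEGMENT = re.compile(
    r"(?P<text>[\u4e00-\u9fff]+)|(?P<digit>[0-9]+)|(?P<alpha>[A-Za-z]+)"
)


def split_to_dictionary(records):
    """Split each record into homogeneous segments and group the segments
    under a (kind, length) key, e.g. ("digit", 11) for 11-digit numbers."""
    split_dict = defaultdict(list)
    for record in records:
        for match in _SEGMENT.finditer(record):
            kind = match.lastgroup   # name of the alternative that matched
            value = match.group()
            split_dict[(kind, len(value))].append(value)
    return dict(split_dict)
```

For example, the record `"张三13800138000"` contributes `"张三"` under the key `("text", 2)` and `"13800138000"` under `("digit", 11)`; characters outside the three segment types, such as spaces, are skipped.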
Referring to Fig. 3, the data classification algorithm takes the result of splitting the data, that is, the split dictionary set, and classifies and identifies it, covering common sensitive information such as ID card numbers, mobile phone numbers, bank card numbers, names, and social security numbers, and labels each item accordingly.
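The text does not state the identification rules, so the sketch below uses simple format patterns purely as an illustration. The rule table `RULES`, the label names, and the patterns themselves are assumptions made here (for instance, treating any 2 or 3 Chinese characters as a name is deliberately crude), and social security numbers are omitted because no format is given for them.

```python
import re

# Hypothetical format rules, checked in order; the first full match wins.
RULES = [
    ("id_card_number", re.compile(r"\d{17}[\dXx]")),    # 18 characters
    ("mobile_number", re.compile(r"1[3-9]\d{9}")),      # 11 digits
    ("bank_card_number", re.compile(r"\d{16,19}")),     # 16 to 19 digits
    ("name", re.compile(r"[\u4e00-\u9fff]{2,3}")),      # 2-3 Chinese characters
]


def classify(value):
    """Return the label of the first rule that matches the whole value."""
    for label, pattern in RULES:
        if pattern.fullmatch(value):
            return label
    return None


def classify_split_dictionary(split_dict):
    """Label every value in a split dictionary and group values by label.

    Values that match no rule are treated as non-sensitive and dropped.
    """
    sensitive = {}
    for values in split_dict.values():
        for value in values:
            label = classify(value)
            if label is not None:
                sensitive.setdefault(label, []).append(value)
    return sensitive
```

A production classifier would likely add checksum validation and context (column names, neighbouring fields) to reduce false positives; pattern matching alone cannot tell an 18-digit bank card number from an ID card number.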
Referring to Fig. 4, the data desensitization algorithm dynamically desensitizes the classified sensitive data using a desensitization algorithm suited to each category. Load balancing is performed according to the sensitive data categories and the amount of data under each category, so that dynamic desensitization is as efficient as possible. The specific steps include:
(1) counting the number of sensitive fields, denoted M; counting the amount of data under each sensitive field and accumulating the counts, the total being denoted N;
(2) placing each sensitive field together with its data into a pending store;
(3) initializing M/2 asynchronous threads and configuring them as follows: each time a thread processes sensitive data it handles only N/M records, and when fewer records remain it does not take records from other categories; each thread is then set to the idle state;
(4) when a thread is idle, it takes one sensitive field awaiting desensitization from the pending store and desensitizes it; once all data under that sensitive field have been processed, the sensitive field and its data are removed from the pending store.
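The scheduling in steps (1) to (4) can be sketched with Python's standard `threading` and `queue` modules. This is one possible reading of the steps, not the patented implementation: the names `desensitize_balanced` and `mask_keep_ends` are hypothetical, the masking rule is a placeholder, M/2 and N/M are rounded down with a minimum of 1, and a field is taken out of the pending store when a thread claims it rather than when processing finishes.

```python
import queue
import threading


def mask_keep_ends(value, head=3, tail=4):
    """Placeholder masking rule: keep the first and last few characters."""
    if len(value) <= head + tail:
        return "*" * len(value)
    return value[:head] + "*" * (len(value) - head - tail) + value[-tail:]


def desensitize_balanced(sensitive, mask=mask_keep_ends):
    """Desensitize {field: [values]} using M/2 threads and N/M-sized batches."""
    m = len(sensitive)                               # M: number of fields
    n = sum(len(v) for v in sensitive.values())      # N: total record count
    if m == 0:
        return {}
    batch_size = max(1, n // m)                      # records per round
    worker_count = max(1, m // 2)                    # number of threads

    pending = queue.Queue()                          # the pending store
    for field, values in sensitive.items():
        pending.put((field, values))

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:                                     # idle: claim one field
                field, values = pending.get_nowait()
            except queue.Empty:
                return                               # nothing left to do
            masked = []
            # Work through this field only, one batch at a time; a short
            # final batch is not topped up from another field.
            for start in range(0, len(values), batch_size):
                batch = values[start:start + batch_size]
                masked.extend(mask(v) for v in batch)
            with lock:
                results[field] = masked

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every field is queued before the threads start, an empty queue reliably means that all fields have been claimed. With the placeholder rule, `mask_keep_ends("13800138000")` returns `"138****8000"`.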
The above is only a preferred embodiment of the present invention and is not intended to limit the invention. It should be noted that a person of ordinary skill in the art may make further improvements and modifications without departing from the technical principles of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims (3)

1. A proxy-based method for realizing dynamic rapid desensitization of data, characterized by comprising:
Step 1, splitting out the uniformly formatted data within the data to obtain a split dictionary set; the uniformly formatted data include several of the following: 11-digit numeric data, 2-character data, 3-character data, and text data of more than 10 characters;
Step 2, classifying and identifying the sensitive information in the split dictionary set to obtain sensitive data; the sensitive information includes several of the following: ID card numbers, mobile phone numbers, bank card numbers, names, and social security numbers;
Step 3, dynamically desensitizing the sensitive data with a desensitization algorithm, performing load balancing during dynamic desensitization according to the sensitive data categories and the amount of data under each category, so that dynamic desensitization is as efficient as possible.
2. The proxy-based method for realizing dynamic rapid desensitization of data according to claim 1, characterized in that step 1 includes:
Dividing the data as a whole, distinguishing Chinese text, digits, and English letters;
Based on this division, counting the length of each segment and combining the length with the segment type, the combined result serving as a key of the split dictionary;
Storing each piece of data under the key corresponding to its format to obtain the split dictionary set.
3. The proxy-based method for realizing dynamic rapid desensitization of data according to claim 2, characterized in that step 3 includes:
Counting the number of sensitive fields, denoted M; counting the amount of data under each sensitive field and accumulating the counts, the total being denoted N;
Placing each sensitive field together with its data into a pending store;
Initializing M/2 asynchronous threads and configuring them as follows: each time a thread processes sensitive data it handles only N/M records, and when fewer records remain it does not take records from other categories; each thread is then set to the idle state;
When a thread is idle, it takes one sensitive field awaiting desensitization from the pending store and desensitizes it; once all data under that sensitive field have been processed, the sensitive field and its data are removed from the pending store.
CN201811536698.3A 2018-12-15 2018-12-15 Method for realizing dynamic rapid desensitization of data based on proxy Active CN109766713B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811536698.3A CN109766713B (en) 2018-12-15 2018-12-15 Method for realizing dynamic rapid desensitization of data based on proxy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811536698.3A CN109766713B (en) 2018-12-15 2018-12-15 Method for realizing dynamic rapid desensitization of data based on proxy

Publications (2)

Publication Number Publication Date
CN109766713A (en) 2019-05-17
CN109766713B (en) 2021-01-12

Family

ID=66451897

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811536698.3A Active CN109766713B (en) 2018-12-15 2018-12-15 Method for realizing dynamic rapid desensitization of data based on proxy

Country Status (1)

Country Link
CN (1) CN109766713B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004043222A2 (en) * 2002-11-05 2004-05-27 Wellstat Biologics Corporation Treating carcinoid neoplasms with therapeuthic viruses
CN102638578A (en) * 2012-03-29 2012-08-15 深圳市高正软件有限公司 Data synchronization method and data synchronization system based on mobile devices
CN102724035A (en) * 2012-06-15 2012-10-10 中国电力科学研究院 Encryption and decryption method for encrypt card
CN104038314A (en) * 2014-05-09 2014-09-10 中煤电气有限公司 Novel safety-monitoring networking real-time dynamic data transmission system and method
CN104731976A (en) * 2015-04-14 2015-06-24 海量云图(北京)数据技术有限公司 Method for finding and sorting private data in data table
CN107247741A (en) * 2017-05-14 2017-10-13 四川盛世天成信息技术有限公司 A kind of concentrating type textual magnanimity sensitive data processing method and system
CN107609418A (en) * 2017-08-31 2018-01-19 深圳市牛鼎丰科技有限公司 Desensitization method, device, storage device and the computer equipment of text data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨红玉等: "重要心脏疾病专病数据库脱敏系统的设计", 《中国数字医学》 *

Also Published As

Publication number Publication date
CN109766713B (en) 2021-01-12

Similar Documents

Publication Publication Date Title
CN109815742B (en) Data desensitization method and device
Tian et al. Needle in a haystack: Tracking down elite phishing domains in the wild
Maiorca et al. R-PackDroid: API package-based characterization and detection of mobile ransomware
KR102007809B1 (en) A exploit kit detection system based on the neural net using image
CN109040103A (en) A kind of mail account is fallen detection method, device, equipment and readable storage medium storing program for executing
CN103430504A (en) System and method for protecting specified data combinations
CN107895122A (en) A kind of special sensitive information active defense method, apparatus and system
Sarno et al. Who are phishers luring?: A demographic analysis of those susceptible to fake emails
Shamiulla Role of artificial intelligence in cyber security
CN108809928B (en) Network asset risk portrait method and device
CN106845220A (en) A kind of Android malware detecting system and method
Kulkarni et al. Personally identifiable information (pii) detection in the unstructured large text corpus using natural language processing and unsupervised learning technique
CN112039874B (en) Malicious mail identification method and device
Queiroz et al. Eavesdropping hackers: Detecting software vulnerability communication on social media using text mining
Zhang et al. Data breach: analysis, countermeasures and challenges
CN116938600B (en) Threat event analysis method, electronic device and storage medium
CN110535821A (en) A kind of Host Detection method of falling based on DNS multiple features
CN107360197B (en) DNS log-based phishing analysis method and device
CN109766713A (en) A kind of data dynamic Rapid desensitization implementation method based on agency
Trabelsi Monitoring leaked confidential data
RU2763921C1 (en) System and method for creating heuristic rules for detecting fraudulent emails attributed to the category of bec attacks
CN108965350A (en) A kind of mail auditing method, device and computer readable storage medium
CN114662111A (en) Malicious code software gene homology analysis method
Wen et al. CNN based zero-day malware detection using small binary segments
CN114398887A (en) Text classification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant