CN111199048A

CN111199048A - Big data grading desensitization method and system based on container with life cycle

Info

Publication number: CN111199048A
Application number: CN202010000740.0A
Authority: CN
Inventors: 顾津; 潘竞旭; 任钦正; 孙少平; 鲁龙; 宋颖; 陈晓敏
Original assignee: Aisino Corp
Current assignee: Aisino Corp
Priority date: 2020-01-02
Filing date: 2020-01-02
Publication date: 2020-05-26
Anticipated expiration: 2040-01-02
Also published as: CN111199048B

Abstract

The invention provides a big data grading desensitization method and system based on a container with a life cycle. The method and system achieve life cycle management of data desensitization by grading sensitive data and by creating containers with life cycles, in which different graded desensitization models are established for sensitive data of different levels. Using container technology with a life cycle reduces system resource consumption and the operation and maintenance cost of data transmission services, and improves data processing and transmission efficiency. By establishing a graded desensitization model, different encryption algorithms are used to desensitize sensitive data of different levels, which greatly reduces the risk of the sensitive data being stolen or cracked without changing the characteristics of the original data, remedies the data security shortcomings of the big data platform, improves the security of the platform, and effectively reduces the risk of big data information leakage.

Description

Big data grading desensitization method and system based on container with life cycle

Technical Field

The invention relates to the technical field of data encryption, in particular to a big data grading desensitization method and system based on a container with a life cycle.

Background

With the rapid development of big data technology, large amounts of sensitive information accumulate in enterprise information systems. The normal production and operation of an enterprise depend heavily on the data security of its information system. Once the data in the information system is improperly protected, trade secrets such as business information, key personnel information, customer information and core product technology may be leaked or stolen, creating serious risks and hidden dangers for the production and operation of the enterprise. In particular, the large volume of sensitive data held in enterprise information systems, such as information on the customers the enterprise deals with and supply chain transaction details, has become a main target of attention and attack for cybercrime groups and individuals pursuing political or economic gain.

Sensitive data is at risk of leakage and attack at every stage of its life cycle, namely data generation, storage, application, exchange and so on. Strengthening data protection in enterprise information systems is therefore a necessary prerequisite and an important means of effectively safeguarding the rights and interests of the enterprise and of preserving and increasing its value.

On the one hand, traditional desensitization mainly uses static desensitization: the design flow is fixed, tool capability is limited, solutions are highly specialized, configuration rules are complex, and maintenance is difficult. Meanwhile, computer hardware and cracking software keep advancing, so traditional desensitization algorithms can be cracked easily. Once a desensitization algorithm is cracked, the real data can be recovered in bulk, and sensitive data is seriously leaked.

On the other hand, as informatization deepens, the data volume of business systems grows ever larger and data is generated ever faster; the data produced within a few days can reach the total information volume of the previous 10 years. The volume of sensitive information has likewise grown rapidly to the TB and PB scale, whereas traditional desensitization techniques and products mostly target relational databases and struggle to handle desensitization effectively when faced with sensitive data at this scale.

Disclosure of Invention

To address the technical problems in the prior art that static desensitization algorithms are easy to crack and that desensitization is difficult to handle effectively when the volume of sensitive data is large, the invention provides a big data grading desensitization method and system based on a container with a life cycle, so as to solve the problem that existing big data desensitization methods are insufficiently secure and reliable.

In a first aspect, the present invention provides a method of hierarchical desensitization of big data based on containers having a lifecycle, the method comprising:

carrying out data cleaning on pre-generated pre-desensitization data to generate sensitive data, identifying the sensitive data to determine a first type of the sensitive data, and grading the sensitive data to determine a sensitivity grade;

classifying the first type of sensitive data of each sensitivity level according to different data use objects and data content values, and determining a second type of the first type of sensitive data of each sensitivity level;

creating a container with a life cycle, allocating a network address to the container based on a virtualized network layer, and determining and storing a mapping relation between a port of the container and the network address according to port information of the container and the network address allocated to the container;

storing the sensitive data with the determined level to the container with the life cycle based on the mapping relation between the port of the container and the network address;

respectively configuring desensitization algorithms according to a second type of sensitive data stored in a container with a life cycle and then establishing respective corresponding data desensitization models;

desensitizing the sensitive data corresponding to the second type according to the established data desensitization model, and storing the desensitized data in a container with a life cycle;

and responding to a data acquisition request sent by a designated object, and transmitting the desensitized data stored in the container to the designated object, wherein when the storage time of the data in the container reaches a preset time and/or after the data stored in the container is transmitted to the designated object, the life cycle of the container is ended, the container is destroyed, and the data stored in the container is deleted.

Further, before the above steps, the method also includes generating pre-desensitization data, the generating pre-desensitization data comprising:

extracting source data in a distributed and heterogeneous data source in a business system by a big data extraction tool, wherein the source data comprises structured data and unstructured data;

and carrying out cleaning, conversion, integration and structuring operations on the source data to generate pre-desensitization data, and transmitting the pre-desensitization data to a desensitization database in a big data storage system.

Further, performing data cleaning on pre-generated pre-desensitization data to generate sensitive data, identifying the sensitive data to determine a first type of the sensitive data, and grading the sensitive data to determine a sensitivity level comprises:

carrying out data cleaning on the pre-generated pre-desensitization data, and generating sensitive data after eliminating repeated values, missing values and abnormal values in the pre-desensitization data;

dividing the sensitive data according to different data attributes, and determining a first type of the sensitive data;

and evaluating the security value of the sensitive data according to the confidentiality, integrity and availability of the sensitive data, and determining the sensitivity level of the sensitive data.
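The cleaning step above (eliminating repeated, missing and abnormal values) can be illustrated with a minimal sketch. It is not the patent's implementation: the record layout, the field names `id` and `amount`, and the median-absolute-deviation rule used to flag abnormal values are all assumptions made for this example.

```python
from statistics import median


def clean_records(records, numeric_field, max_dev=5.0):
    """Drop repeated records, records with missing values, and abnormal values.

    Abnormal values are detected with a median-absolute-deviation (MAD) rule.
    That rule and the `max_dev` threshold are assumptions of this sketch.
    All records are assumed to share the same keys.
    """
    # 1. Eliminate repeated values (exact duplicate records).
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)

    # 2. Eliminate records that contain missing values.
    complete = [
        rec for rec in unique
        if all(value is not None and value != "" for value in rec.values())
    ]
    if not complete:
        return []

    # 3. Eliminate abnormal values in the chosen numeric field.
    values = [rec[numeric_field] for rec in complete]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return complete
    return [rec for rec in complete if abs(rec[numeric_field] - med) / mad <= max_dev]


records = [
    {"id": 1, "amount": 100.0},
    {"id": 1, "amount": 100.0},        # repeated value
    {"id": 2, "amount": None},         # missing value
    {"id": 3, "amount": 110.0},
    {"id": 4, "amount": 90.0},
    {"id": 5, "amount": 105.0},
    {"id": 6, "amount": 1_000_000.0},  # abnormal value
]
cleaned = clean_records(records, "amount")
```

A production pipeline at TB or PB scale would run the same three filters in a distributed engine rather than in memory; the sketch only shows the order of the operations.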

Further, the evaluating the security value of the sensitive data according to confidentiality, integrity and availability thereof, and the determining the sensitivity level thereof comprises:

scoring the sensitive data according to a preset score interval for each security value scoring item, wherein the security value scoring items comprise: whether the sensitive data can directly identify a specific enterprise object; whether it is closely related to the actual operating state of the enterprise object; whether other related information can be obtained through it; and whether it may cause potential economic loss or pose a potential information threat to the enterprise;

summing the scores of all security value scoring items of the sensitive data to determine the security value score of the sensitive data;

and determining the sensitivity level of the sensitive data according to the correspondence between sensitivity levels and security value scores.
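The score-and-sum grading described above can be sketched as follows. The text names the scoring items but gives no concrete score intervals or thresholds, so the item keys, the ranges in `SCORE_RANGES`, the cut-offs in `LEVEL_THRESHOLDS` and the short level labels are all hypothetical values chosen for illustration.

```python
# Hypothetical scoring items and allowed score intervals (assumptions).
SCORE_RANGES = {
    "identifies_enterprise": (0, 3),
    "reflects_operating_state": (0, 3),
    "reveals_related_information": (0, 2),
    "potential_loss_or_threat": (0, 2),
}

# Hypothetical correspondence between minimum total score and level,
# listed from the most sensitive level down.
LEVEL_THRESHOLDS = [
    (8, "core"),
    (5, "ordinary"),
    (2, "internal"),
    (0, "non-sensitive"),
]


def security_value_score(item_scores):
    """Validate each item score against its interval and return the sum."""
    total = 0
    for item, (low, high) in SCORE_RANGES.items():
        score = item_scores[item]
        if not low <= score <= high:
            raise ValueError(f"{item} score {score} is outside [{low}, {high}]")
        total += score
    return total


def sensitivity_level(item_scores):
    """Map the summed security value score to a sensitivity level."""
    total = security_value_score(item_scores)
    for minimum, level in LEVEL_THRESHOLDS:
        if total >= minimum:
            return level
    raise ValueError("no sensitivity level matches the score")


example = {
    "identifies_enterprise": 3,
    "reflects_operating_state": 2,
    "reveals_related_information": 1,
    "potential_loss_or_threat": 0,
}
level = sensitivity_level(example)  # total score is 6
```

Keeping the intervals and thresholds in plain tables makes it straightforward to adjust them after consulting the data provider, without touching the scoring logic.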

Further, storing the sensitive data with the determined level to the container with the life cycle based on the mapping relationship between the port and the network address of the container comprises:

analyzing a protocol field of a received sensitive data message, and determining a destination network address of the sensitive data message;

determining a container port corresponding to a destination network address of the sensitive data message based on a mapping relation between the network address and the container port;

and distributing the sensitive data to a corresponding storage position of a container with a life cycle according to the container port corresponding to the destination network address of the sensitive data message.
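The three routing steps above (parse the protocol field, look up the container port, distribute the data) can be sketched as below. The message layout (`DST=<address>;LEVEL=<level>` on a header line, followed by the payload) and the class names `Container` and `Dispatcher` are hypothetical; the text does not specify a wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    port: int
    storage: list = field(default_factory=list)


class Dispatcher:
    """Routes sensitive data messages to containers via an address-to-port map."""

    def __init__(self):
        self.port_by_address = {}    # network address -> container port
        self.container_by_port = {}  # container port -> container

    def register(self, address, container):
        self.port_by_address[address] = container.port
        self.container_by_port[container.port] = container

    @staticmethod
    def parse_destination(message):
        """Read the destination address from the (assumed) header line."""
        header, _, payload = message.partition(b"\n")
        fields = dict(
            part.split("=", 1) for part in header.decode("ascii").split(";")
        )
        return fields["DST"], payload

    def dispatch(self, message):
        address, payload = self.parse_destination(message)
        port = self.port_by_address[address]               # address -> port
        self.container_by_port[port].storage.append(payload)  # store in container
        return port


dispatcher = Dispatcher()
core_container = Container(port=8001)
dispatcher.register("10.0.0.2", core_container)
used_port = dispatcher.dispatch(b"DST=10.0.0.2;LEVEL=core\ninvoice-row-17")
```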

Further, the establishing of the data desensitization models corresponding to the desensitization algorithms after respectively configuring the desensitization algorithms according to the second type of sensitive data stored in the container with the life cycle comprises:

respectively configuring a desensitization algorithm according to the second type of the sensitive data stored in the container with a life cycle, wherein the desensitization algorithm is irreversible, automated and repeatable;

and establishing a data desensitization model based on the desensitization algorithm configured for each second type of sensitive data, wherein the data desensitization model ensures that the desensitized data retains the characteristics of the original data, that the integrity of the data is preserved as far as possible, that all associated non-sensitive fields from which sensitive data could be derived are also desensitized, and that the desensitized data can be marked with its sensitivity level.

In a second aspect, the present invention provides a big data grading desensitization system based on containers having a lifecycle, the system comprising:

the sensitivity level unit is used for carrying out data cleaning on pre-generated pre-desensitization data to generate sensitive data, identifying the sensitive data to determine a first type of the sensitive data, and grading the sensitive data to determine a sensitivity level;

the data classification unit is used for classifying the sensitive data of the first type of each sensitivity level according to different data use objects and data content values and determining a second type of the sensitive data of the first type of each sensitivity level;

the container establishing unit is used for establishing a container with a life cycle, allocating a network address to the container based on a virtualized network layer, and determining and storing a mapping relation between a port of the container and the network address according to port information of the container and the network address allocated to the container;

the data storage unit is used for storing the sensitive data with the determined level to the container with the life cycle based on the mapping relation between the port of the container and the network address;

the desensitization model unit is used for respectively configuring desensitization algorithms according to the second type of the sensitive data stored in the container with the life cycle and then establishing data desensitization models corresponding to the desensitization algorithms;

the data desensitization unit is used for desensitizing the sensitive data corresponding to the second type according to the established data desensitization model and storing the desensitized data in a container with a life cycle;

and the data transmission unit is used for responding to a data acquisition request sent by a specified object and transmitting the desensitized data stored in the container to the specified object, wherein when the storage time length of the data in the container reaches a preset time length and/or after the data stored in the container is transmitted to the specified object, the life cycle of the container is ended, the container is destroyed, and the data stored in the container is deleted.

Further, the system further comprises a data pre-processing unit for generating pre-desensitization data, wherein the data pre-processing unit comprises:

the data extraction unit is used for extracting source data in heterogeneous data sources distributed in a business system through a big data extraction tool, wherein the source data comprises structured data and unstructured data;

and the data processing unit is used for generating pre-desensitization data after the source data are subjected to cleaning, conversion, integration and structuring operation, and transmitting the pre-desensitization data to a desensitization database in the big data storage system.

Further, the sensitivity level unit includes:

the sensitive data unit is used for carrying out data cleaning on the pre-generated pre-desensitization data and generating sensitive data after eliminating repeated values, missing values and abnormal values in the pre-desensitization data;

the data dividing unit is used for dividing the sensitive data according to different data attributes and determining a first type of the sensitive data;

and the grade determining unit is used for evaluating the security value of the sensitive data according to the confidentiality, the integrity and the availability of the sensitive data and determining the sensitivity grade of the sensitive data.

Further, the level determination unit evaluates the security value of the sensitive data according to confidentiality, integrity and availability thereof, and determining the sensitivity level thereof comprises:

Further, the data storage unit includes:

the data analysis unit is used for analyzing the protocol field of the received sensitive data message and determining the destination network address of the sensitive data message;

a port determining unit, configured to determine, based on a mapping relationship between the network address and the container port, a container port corresponding to a destination network address of the sensitive data packet;

and the data distribution unit is used for distributing the sensitive data to the corresponding storage position of the container with the life cycle according to the container port corresponding to the destination network address of the sensitive data message.

Further, the desensitization model unit comprises:

an algorithm configuration unit for respectively configuring a desensitization algorithm according to the second type of the sensitive data stored in the container with a life cycle, wherein the desensitization algorithm is irreversible, automated and repeatable;

and the model establishing unit is used for establishing a data desensitization model based on the desensitization algorithm configured for each second type of sensitive data, wherein the data desensitization model ensures that the desensitized data retains the characteristics of the original data, that the integrity of the data is preserved as far as possible, that all associated non-sensitive fields from which sensitive data could be derived are also desensitized, and that the desensitized data can be marked with its sensitivity level.

In summary, the present invention provides a big data grading desensitization method and system based on a container with a life cycle. The method and system achieve life cycle management of data desensitization by grading sensitive data and by creating containers with life cycles, in which different graded desensitization models are established for sensitive data of different levels. Using container technology with a life cycle reduces system resource consumption and the operation and maintenance cost of data transmission services, and improves data processing and transmission efficiency. By establishing a graded desensitization model, different encryption algorithms are used to desensitize sensitive data of different levels, which greatly reduces the risk of the sensitive data being stolen or cracked without changing the characteristics of the original data, remedies the data security shortcomings of the big data platform, improves the security of the platform, and effectively reduces the risk of big data information leakage.

Drawings

A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:

FIG. 1 is a schematic flow diagram of a big data grading desensitization method based on containers with life cycles according to a preferred embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a big data grading desensitization system based on containers with life cycles according to a preferred embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.

Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.

FIG. 1 is a schematic flow diagram of a big data grading desensitization method based on a container with a life cycle according to a preferred embodiment of the invention. As shown in FIG. 1, the big data graded desensitization method based on containers with a life cycle according to the preferred embodiment begins at step 101.

At step 101, pre-desensitization data is generated.

In the preferred embodiment, a big data extraction tool extracts the source data, which includes structured data and unstructured data, from the distributed, heterogeneous data sources of the business system. Unlike a relational database, which holds only structured data, the big data platform of the preferred embodiment also processes a large amount of unstructured data. Unstructured data is commonly converted into structured data by indexing or tagging, so that each item has a definite meaning and definite associations with other items. The source data is therefore cleaned, converted, integrated and structured before being loaded into a desensitization database in the big data storage system. Data for querying and use can only be obtained from this big data desensitization database.

In step 102, data cleaning is performed on the pre-desensitization data to generate sensitive data, the sensitive data is identified to determine a first type of the sensitive data, and the sensitive data is graded to determine a sensitivity grade.

After the pre-desensitization data is cleaned to generate sensitive data, the first step of performing hierarchical management on the sensitive data is to identify the sensitive data to clarify the type of the sensitive data, such as enterprise core personnel information, bank account information, customer information, and the like.

Graded management of the sensitive data further requires evaluating its security value according to its confidentiality, integrity and availability, and determining its sensitivity level. During grading, the data provider and the data management department should be fully consulted to ensure that the grading is reasonable and to prevent overuse, excessive protection, wasted resources and loss of data. Generally, sensitive data is graded into four sensitivity levels: core trade secret, ordinary trade secret, internally sensitive and non-sensitive.

In step 103, the sensitive data of the first type of each sensitivity level is classified according to different data use objects and data content values, and a second type of the sensitive data of the first type of each sensitivity level is determined.

Classification of the sensitive data should build on its grading. After grading is finished, the sensitive data of each level is classified using a machine learning model combined with practical application experience. For example, when classifying sensitive enterprise information, two factors are mainly considered, namely the nature of the data use object and the information content:

a) For data use objects divided by type, a restricted, differentiated data service strategy is adopted. On the premise of ensuring the legal compliance of the data service and safeguarding data security, the business requirements of use objects at every level are met and the value of data assets is realized, which helps improve the comprehensive data service capability of the big data platform, reduce business waste and obtain higher added value. Classification by the nature of the use object is therefore a very important and fundamental way of partitioning.

b) Because data use objects differ, the value of the data content necessarily differs, and so does the impact of a leak. Applying different management measures to the data of different use objects reduces or avoids the direct use of key data in development and testing, and can effectively reduce the hidden danger of data leakage. Dividing the customer information by information content is therefore an important prerequisite for classifying and protecting the information of data use objects.

In step 104, a container with a life cycle is created, a network address is allocated to the container based on the virtualized network layer, and a mapping relation between a port of the container and the network address is determined and stored according to the port information of the container and the network address allocated to the container.
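Step 104 can be pictured with a small sketch in which each new container receives the next free address from a private subnet and the port-to-address mapping is stored. This is only an illustration: the class names `VirtualNetwork` and `ContainerRegistry`, the subnet `10.0.0.0/29` and the starting port 9000 are assumptions, and a real deployment would obtain addresses from its container runtime's virtual network rather than from this in-memory pool.

```python
import ipaddress
import itertools


class VirtualNetwork:
    """Stand-in for the virtualized network layer: hands out subnet addresses."""

    def __init__(self, cidr="10.0.0.0/29"):
        self._hosts = iter(ipaddress.ip_network(cidr).hosts())

    def allocate(self):
        try:
            return str(next(self._hosts))
        except StopIteration:
            raise RuntimeError("address pool exhausted") from None


class ContainerRegistry:
    """Creates containers and stores the mapping between port and address."""

    def __init__(self, network, first_port=9000):
        self.network = network
        self._ports = itertools.count(first_port)
        self.mapping = {}  # container port -> allocated network address

    def create_container(self):
        port = next(self._ports)
        address = self.network.allocate()
        self.mapping[port] = address
        return port, address


registry = ContainerRegistry(VirtualNetwork())
first = registry.create_container()
second = registry.create_container()
```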

At step 105, the sensitive data with the determined level is stored in the container with the life cycle based on the mapping relation between the port and the network address of the container.

In step 106, desensitization algorithms are configured according to the second type of sensitive data stored in the containers with the life cycles, and then corresponding data desensitization models are established.

In the preferred embodiment, the enterprise information is divided, according to its use objects, into enterprise basic attribute data and enterprise transaction numerical data. Enterprise basic attribute data is basic information such as the enterprise name, various codes (taxpayer identification number, unified social credit code, invoice code/number, commodity code and the like) and the enterprise address, telephone number and bank of deposit. This data is desensitized with traditional methods, and a corresponding algorithm can be selected from the desensitization strategy configuration options according to the data attributes. For example: the commodity code 1040201240000000000 can be desensitized with a one-way hash algorithm (MD5) and converted into ef5c11c555b5e09fe75bf466b57338bcee11c40b; the purchaser QQ number 3279248039@qq.com can be desensitized with a masking algorithm and converted into × @qq.com; and the seller address "Beijing Tongzhou district Luzhou town New City Industrial district No. 9" can be desensitized with a truncation algorithm and converted into "Beijing Tongzhou district".
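The three attribute-level techniques in this paragraph (digest, masking, truncation) can be sketched with the Python standard library. The function names are hypothetical, the mask character and the whitespace-based address split are simplifications chosen for the example, and MD5 appears only because the text names it; MD5 is no longer collision resistant, and short numeric codes can be enumerated, so a real deployment would likely prefer a keyed or stronger digest.

```python
import hashlib


def hash_code(code):
    """Replace an identifier with its MD5 hex digest (32 hex characters)."""
    return hashlib.md5(code.encode("utf-8")).hexdigest()


def mask_email(address, mask_char="*"):
    """Mask everything before the '@' while keeping the domain readable."""
    local, sep, domain = address.partition("@")
    if not sep:
        raise ValueError("expected an address containing '@'")
    return mask_char * len(local) + "@" + domain


def truncate_address(address, keep_parts=3):
    """Keep only the leading components of an address.

    Splitting on whitespace is an assumption of this sketch; real addresses
    would need proper administrative-division parsing.
    """
    return " ".join(address.split()[:keep_parts])


digest = hash_code("1040201240000000000")
masked = mask_email("3279248039@qq.com")
short_address = truncate_address(
    "Beijing Tongzhou district Luzhou town New City Industrial district No. 9"
)
```

All three functions are deterministic, which matches the requirement that desensitization be repeatable: the same input always yields the same desensitized value, so joins across desensitized tables still line up.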

Enterprise transaction numerical data is transaction data such as the amount, tax amount, total of price and tax, commodity unit price and tax rate. This data is desensitized with a homomorphic encryption algorithm: a specific algebraic operation performed on the encrypted data yields a result that is still encrypted, and decrypting that result gives the same value as performing the same operation on the plaintext. The homomorphic encryption algorithm proceeds as follows:

(1) The original data is scaled to fall within a particular interval.

(2) The scaled original data is perturbed by adding noise using a scrambling technique, so that the original data is distorted and changed. The noise term is computed by first applying dimensionless normalization to the original data and then forming a weighted combination.

In practical applications, the noise term of the encryption algorithm model is computed with either the maximum and minimum value normalization method or the inverse cotangent conversion normalization method, depending on the application scenario. If the data is collected and updated over an observation period (year/quarter/month), the maximum and minimum value normalization method is used; if the data is collected and updated in real time, the inverse cotangent conversion normalization method is used. The specific formulas are as follows:

The maximum and minimum value normalization method:

[formula not preserved in the source text]

where the left-hand side is the converted value encrypted by the maximum and minimum value normalization method, ω is the weighting coefficient of the normalized interference term, σ is the random interference term, and X is the full set of sample values in the observation period.

The inverse cotangent conversion normalization method:

[formula not preserved in the source text]

where the left-hand side is the converted value encrypted by the inverse cotangent conversion normalization method, ω is the weighting coefficient of the normalized interference term, σ is the random interference term, and τ is the scaling coefficient of the original value.

The advantages of this encryption algorithm are as follows: the unit and dimension constraints of the original data are removed and the data is converted into dimensionless pure numbers, which makes it easy to compare and weight indicators of different units or orders of magnitude. After noise perturbation, the data still retains the distribution characteristics of the original data.
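Because the exact formulas are not available in this text, the following is only a rough sketch of the general idea for the periodic-update case: normalize a value against the samples of the observation period, then add a weighted random interference term. The combining rule (normalized value plus ω·σ), the Gaussian choice for σ and the function names are assumptions of this example and should not be read as the patented formula.

```python
import random


def min_max_normalize(x, samples):
    """Dimensionless min-max normalization of x against the period's samples."""
    low, high = min(samples), max(samples)
    if high == low:
        return 0.0
    return (x - low) / (high - low)


def perturb(normalized, omega, rng):
    """Add a weighted random interference term to a normalized value.

    omega plays the role of the weighting coefficient and sigma the random
    interference term; the additive form is an assumption of this sketch.
    """
    sigma = rng.gauss(0.0, 1.0)
    return normalized + omega * sigma


# Hypothetical monthly transaction amounts for one observation period.
amounts = [100.0, 150.0, 200.0, 120.0]
rng = random.Random(42)  # fixed seed so the sketch is repeatable
desensitized = [perturb(min_max_normalize(a, amounts), 0.01, rng) for a in amounts]
```

With a small ω the perturbed values stay close to the normalized ones, which is one way the ordering and overall shape of the distribution could be retained while the individual figures no longer match the originals.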

At step 107, desensitization is performed on the sensitive data corresponding to the second type according to the established data desensitization model, and the desensitized data is stored in containers having a life cycle.

In the preferred embodiment, the data desensitization model can automatically identify the level of the desensitized data in the container and mark the data with a sensitivity level identifier according to its level. Desensitized sensitive data of the three levels core trade secret, ordinary trade secret and internally sensitive is stored in the container.

After the desensitized data is stored in the container, a mirror container can be created to back up the data, which avoids the inefficiency of having to reacquire the data from the system when processing of the data in the container goes wrong.

In step 108, in response to a data acquisition request sent by a designated object, the desensitized data stored in the container is transmitted to the designated object. When the storage time of the data in the container reaches a preset duration and/or after the data stored in the container has been transmitted to the designated object, the life cycle of the container ends, the container is destroyed, and the data stored in the container is deleted.
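The life cycle rule of step 108 can be sketched as an in-memory object that deletes its data either when a preset duration has elapsed or once the data has been handed over. The class name `LifecycleContainer`, the injectable `clock` parameter and the choice to end the life cycle on the first transmission are assumptions made for this example; an actual system would tear down a real container rather than clear a dictionary.

```python
import time


class LifecycleContainer:
    """Holds desensitized data until it expires or has been transmitted."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + ttl_seconds
        self._data = {}
        self.destroyed = False

    def _expire_if_due(self):
        # End the life cycle once the preset storage duration is reached.
        if not self.destroyed and self._clock() >= self._expires_at:
            self.destroy()

    def put(self, key, value):
        self._expire_if_due()
        if self.destroyed:
            raise RuntimeError("container has been destroyed")
        self._data[key] = value

    def transmit(self, key):
        """Return the data to the requester, then end the life cycle."""
        self._expire_if_due()
        if self.destroyed:
            raise RuntimeError("container has been destroyed")
        value = self._data[key]
        self.destroy()
        return value

    def destroy(self):
        self._data.clear()  # delete the stored data
        self.destroyed = True


# A fake clock makes the expiry behaviour easy to demonstrate.
now = [0.0]
delivered = LifecycleContainer(ttl_seconds=10, clock=lambda: now[0])
delivered.put("report", "desensitized rows")
payload = delivered.transmit("report")

expired = LifecycleContainer(ttl_seconds=10, clock=lambda: now[0])
expired.put("report", "desensitized rows")
now[0] = 11.0  # the preset duration has passed
```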

Preferably, the generating pre-desensitization data comprises:

extracting structured and unstructured data in distributed and heterogeneous data sources in a business system through a big data extraction tool;

Preferably, the method comprises the steps of performing data cleaning on pre-generated pre-desensitization data to generate sensitive data, identifying the sensitive data to determine a first type of the sensitive data, and grading the sensitive data, wherein determining the sensitivity grade comprises:

Preferably, the evaluating the security value of the sensitive data according to confidentiality, integrity and availability thereof, and the determining the sensitivity level thereof comprises:

Preferably, storing the sensitive data with the determined level to the container with the life cycle based on the mapping relationship between the port and the network address of the container comprises:

Preferably, the establishing of the respective corresponding data desensitization models after respectively configuring the desensitization algorithms according to the second type of sensitive data stored in the container with the life cycle comprises:

FIG. 2 is a schematic structural diagram of a big data grading desensitization system based on containers with life cycles according to a preferred embodiment of the present invention. As shown in FIG. 2, the big data grading desensitization system 200 based on containers with life cycles according to the preferred embodiment includes:

a data preprocessing unit 201, configured to generate pre-desensitization data;

a sensitivity level unit 202, configured to perform data cleaning on the pre-generated pre-desensitization data to generate sensitive data, identify the sensitive data to determine a first type of the sensitive data, and grade the sensitive data to determine a sensitivity level;

a data classification unit 203, configured to classify the first-type sensitive data of each sensitivity level according to the different data use objects and data content values, and to determine a second type of the first-type sensitive data of each sensitivity level;

a container establishing unit 204, configured to create a container with a life cycle, assign a network address to the container based on a virtualized network layer, and determine and store a mapping relationship between a port of the container and the network address according to port information of the container and the network address assigned to the container;

a data storage unit 205, configured to store the sensitive data with the determined level to the container with the lifecycle based on a mapping relationship between a port of the container and a network address;

a desensitization model unit 206, configured to configure a desensitization algorithm for each second type of sensitive data stored in the container with the life cycle and then establish the corresponding data desensitization model;

a data desensitization unit 207 for desensitizing the sensitive data corresponding to the second type according to the established data desensitization model and storing the desensitized data in a container having a life cycle;

and a data transmission unit 208, configured to transmit, in response to a data acquisition request sent by a designated object, the desensitized data stored in the container to the designated object, wherein when the storage time of the data in the container reaches a preset duration, and/or after the data stored in the container has been transmitted to the designated object, the life cycle of the container ends, the container is destroyed, and the data stored in the container is deleted.
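The bookkeeping performed by the container establishing unit 204 (assign a network address to a new container, then record the mapping between that address and the container port) might be sketched as follows. The class name `ContainerRegistry`, the subnet `10.0.0.0/29` and the port numbers are hypothetical; the source does not specify how the virtualized network layer allocates addresses, so sequential allocation from a fixed subnet is an assumption.

```python
import ipaddress


class ContainerRegistry:
    """Allocates addresses from a virtual subnet and maps them to ports."""

    def __init__(self, subnet="10.0.0.0/29"):   # hypothetical virtual subnet
        # hosts() excludes the network and broadcast addresses.
        self._free = list(ipaddress.ip_network(subnet).hosts())
        self.address_to_port = {}

    def register(self, container_port):
        """Assign the next free address and store the address-to-port mapping."""
        if not self._free:
            raise RuntimeError("no free addresses left in the subnet")
        address = str(self._free.pop(0))
        self.address_to_port[address] = container_port
        return address

    def release(self, address):
        """Drop the mapping when the container is destroyed."""
        port = self.address_to_port.pop(address)
        self._free.append(ipaddress.ip_address(address))
        return port
```

Keeping the mapping in one place means it can be dropped in the same step that ends the container's life cycle, so no stale address keeps pointing at a destroyed container.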

Preferably, the data preprocessing unit 201 includes:

a data extraction unit 211, configured to extract source data in heterogeneous data sources distributed in a business system through a big data extraction tool, where the source data includes structured data and unstructured data;

and a data processing unit 212, configured to generate pre-desensitization data by performing cleaning, conversion, integration and structuring operations on the source data, and to transmit the pre-desensitization data to a desensitization database in the big data storage system.

Preferably, the sensitivity level unit 202 includes:

a sensitive data unit 221, configured to perform data cleaning on the pre-generated pre-desensitization data, and to generate sensitive data after eliminating duplicate values, missing values and abnormal values from the pre-desensitization data;

a data dividing unit 222, configured to divide the sensitive data according to different data attributes and to determine a first type of the sensitive data;

and a grade determining unit 223 for evaluating the security value of the sensitive data according to the confidentiality, integrity and availability of the sensitive data, and determining the sensitivity grade of the sensitive data.
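The cleaning performed by the sensitive data unit 221 and the division performed by the data dividing unit 222 can be sketched as below. The function names, the record layout (a list of dictionaries) and the rule that an "abnormal value" is one outside a caller-supplied range are assumptions; the source does not define how abnormal values are detected. The grading step of unit 223 is left out because its scoring rule is not given in this passage.

```python
from collections import defaultdict


def clean(records, required, valid_ranges):
    """Drop records with missing, abnormal or duplicate values."""
    seen = set()
    cleaned = []
    for rec in records:
        # Missing values: any required field absent, None or empty.
        if any(rec.get(name) in (None, "") for name in required):
            continue
        # Abnormal values: outside the caller-supplied (low, high) range.
        abnormal = False
        for name, (low, high) in valid_ranges.items():
            value = rec.get(name)
            if value is not None and not (low <= value <= high):
                abnormal = True
                break
        if abnormal:
            continue
        # Duplicate values: identical field/value pairs seen before.
        fingerprint = tuple(sorted(rec.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(rec)
    return cleaned


def divide_by_attribute(records, attribute):
    """Group cleaned records by one attribute to obtain their first type."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[attribute]].append(rec)
    return dict(groups)
```

For example, calling `clean` with `required=["id", "kind", "amount"]` and `valid_ranges={"amount": (0, 10000)}` and then `divide_by_attribute(cleaned, "kind")` yields one group per value of the hypothetical `kind` attribute.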

Preferably, the grade determining unit 223 evaluating the security value of the sensitive data according to its confidentiality, integrity and availability, and determining its sensitivity level, comprises:

Preferably, the data storage unit 205 includes:

a data parsing unit 251, configured to parse a protocol field of a received sensitive data packet, and determine a destination network address of the sensitive data packet;

a port determining unit 252, configured to determine, based on a mapping relationship between the network address and the container port, a container port corresponding to a destination network address of the sensitive data packet;

and a data distribution unit 253, configured to distribute the sensitive data to the corresponding storage location of the container with the life cycle according to the container port corresponding to the destination network address of the sensitive data packet.
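The three sub-units of the data storage unit 205 form a parse, look up, distribute chain, which the following sketch illustrates. The packet layout (a 4-byte IPv4 destination address followed by the payload) is purely an assumption, since the source does not describe the protocol fields; the function names and the `storage` dictionary standing in for container storage locations are likewise hypothetical.

```python
import ipaddress


def parse_destination(packet: bytes):
    """Assumed layout: 4-byte IPv4 destination address, then the payload."""
    if len(packet) < 4:
        raise ValueError("packet too short to contain a destination address")
    destination = str(ipaddress.IPv4Address(packet[:4]))
    return destination, packet[4:]


def distribute(packet: bytes, address_to_port: dict, storage: dict) -> int:
    """Route a sensitive data packet to the storage slot of its container."""
    destination, payload = parse_destination(packet)
    # A KeyError here means no container is mapped to that address.
    port = address_to_port[destination]
    storage.setdefault(port, []).append(payload)
    return port
```

A packet addressed to `10.0.0.2` with the mapping `{"10.0.0.2": 8082}` would be appended to the list stored under port `8082`.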

Preferably, the desensitization model unit 206 comprises:

an algorithm configuration unit 261, configured to configure a desensitization algorithm for each second type of sensitive data stored in the container with the life cycle, wherein the desensitization algorithm is irreversible, automated and repeatable;

and a model establishing unit 262, configured to establish a data desensitization model based on the desensitization algorithm configured for each second type of sensitive data, wherein the data desensitization model ensures that the desensitized data retains the characteristics of the original data, that the integrity of the data is preserved as far as possible, that non-sensitive fields which are correlated with sensitive data and from which sensitive data could be derived are also desensitized, and that the desensitized data can be marked with its sensitivity level.
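One way to obtain an algorithm that is irreversible and repeatable while keeping the shape of the original value is a keyed hash that replaces digits with digits and letters with letters. The sketch below uses HMAC-SHA256 for this purpose; that choice, the names `pseudonymize`, `full_mask`, `ALGORITHM_BY_TYPE` and `desensitize`, and the second-type labels `identifier` and `secret` are all assumptions for illustration and are not the algorithms the patent configures.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Replace digits and ASCII letters using bytes of a keyed hash.

    The same key and input always give the same output (repeatable),
    and the original cannot be computed back from the output.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch.isdigit():
            out.append(str(b % 10))
        elif ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + b % 26))
        else:
            out.append(ch)  # keep separators so the format survives
    return "".join(out)


def full_mask(value: str, key: bytes) -> str:
    """Hide the value entirely, keeping only its length."""
    return "*" * len(value)


# Hypothetical second types, each configured with its own algorithm.
ALGORITHM_BY_TYPE = {"identifier": pseudonymize, "secret": full_mask}


def desensitize(record: dict, policy: dict, key: bytes) -> dict:
    """Apply the model: policy maps field -> (second type, sensitivity level)."""
    result = {}
    for name, value in record.items():
        if name not in policy:
            result[name] = value          # field not covered by the policy
            continue
        second_type, level = policy[name]
        masked = ALGORITHM_BY_TYPE[second_type](str(value), key)
        # The desensitized value is marked with its sensitivity level.
        result[name] = {"value": masked, "level": level}
    return result
```

Correlated non-sensitive fields can be covered in this sketch simply by listing them in `policy`, so that they are desensitized along with the fields they relate to.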

The desensitization steps performed by the big data grading desensitization system based on the container with the life cycle in this preferred embodiment are the same as those of the big data grading desensitization method based on the container with the life cycle, and the technical effects achieved are likewise the same, so they are not described again here.

The above describes only the preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention falls within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

The invention has been described above by reference to a few embodiments. However, other embodiments of the invention than the one disclosed above are equally possible within the scope of the invention, as would be apparent to a person skilled in the art from the appended patent claims.

Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [device, component, etc.]" are to be interpreted openly as referring to at least one instance of the device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

Claims

1. A method for hierarchical desensitization of big data based on containers having a life cycle, the method comprising:

2. The method of claim 1, further comprising generating pre-desensitization data, wherein generating the pre-desensitization data comprises:

3. The method of claim 1, wherein performing data cleaning on the pre-generated pre-desensitization data to generate sensitive data, identifying the sensitive data to determine a first type of the sensitive data, and grading the sensitive data to determine a sensitivity level comprises:

4. The method of claim 3, wherein evaluating the security value of the sensitive data according to its confidentiality, integrity and availability, and determining its sensitivity level, comprises:

5. The method of claim 1, wherein storing the sensitive data with the determined level to the container with the life cycle based on the mapping relationship between the port and the network address of the container comprises:

6. The method of claim 1, wherein configuring a desensitization algorithm for each second type of sensitive data stored in the container with the life cycle, and then establishing the corresponding data desensitization model, comprises:

7. A big data grading desensitization system based on containers having a life cycle, the system comprising:

8. The system of claim 7, further comprising a data pre-processing unit for generating pre-desensitization data, wherein the data pre-processing unit comprises:

9. The system of claim 7, wherein the sensitivity level unit comprises:

10. The system of claim 9, wherein the level determining unit evaluating the security value of the sensitive data according to its confidentiality, integrity and availability, and determining its sensitivity level, comprises:

11. The system of claim 7, wherein the data storage unit comprises:

12. The system of claim 7, wherein the desensitization model unit comprises: