CN115146319A

CN115146319A - Data desensitization method, data desensitization device and storage medium

Info

Publication number: CN115146319A
Application number: CN202211075445.7A
Authority: CN
Inventors: 李秀玉
Original assignee: Beijing Century Century Science And Technology Development Co ltd
Current assignee: Beijing Century Century Science And Technology Development Co ltd
Priority date: 2022-09-05
Filing date: 2022-09-05
Publication date: 2022-10-04

Abstract

The invention belongs to the technical field of data desensitization and provides a data desensitization method, a data desensitization device and a storage medium. The data desensitization method comprises the following steps: receiving original data, and identifying the original data to obtain content to be desensitized and the category to which the content belongs; receiving a modification determination instruction for the content to be desensitized, and determining the content that needs desensitization and its category; receiving a data access instruction sent by a user account, and judging the authority of the user account; when the user is not authorized to view sensitive data, desensitizing the content that needs desensitization by using an irreversible desensitization algorithm; when the user is authorized to view sensitive data, determining the category to which the user belongs, desensitizing the content that needs desensitization in that category by using a reversible desensitization algorithm, and desensitizing the other content that needs desensitization by using an irreversible desensitization algorithm. The invention thus applies different desensitization algorithms depending on the user's authority: the irreversible desensitization algorithm ensures desensitization speed, and the reversible desensitization algorithm ensures that desensitized data can undergo secondary processing.

Description

Data desensitization method, data desensitization device and storage medium

Technical Field

The invention relates to the technical field of data desensitization, in particular to a data desensitization method, a data desensitization device and a storage medium.

Background

With the spread of modern office systems and the continuing construction of intelligent factories, enterprise databases store more and more confidential data. A leak can cause an enterprise serious economic loss, so enterprises attach great importance to data security, and desensitizing sensitive data is an effective means of preventing leaks. Common desensitization methods include numerical transformation, encryption and masking. Numerical transformation applies a controllable adjustment to numerical and date-type source data through functions, disguising specific values while preserving the relevant statistical characteristics of the original data; however, it is suitable only for numerical values and not for text characters. Encryption encrypts the data to be desensitized so that external users see only meaningless encrypted data; however, encryption itself requires a certain amount of computing power and incurs a large resource overhead on large data sets. Masking uniformly replaces part of the content of the sensitive data with a masking symbol (such as "*") while leaving the rest of the content visible; however, masked data loses the characteristics of the original data, cannot be reversed, and cannot be processed or used afterwards. Therefore, it is desirable to provide a data desensitization method, a data desensitization apparatus and a storage medium that solve or alleviate the above problems.

Disclosure of Invention

In view of the shortcomings in the prior art, it is an object of the present invention to provide a data desensitization method, a data desensitization apparatus and a storage medium, so as to solve or alleviate the above-mentioned problems in the background art.

The invention is realized in such a way that a data desensitization method comprises the following steps:

receiving original data, identifying the original data to obtain content to be desensitized and corresponding belonged categories, and displaying the content to be desensitized and the corresponding belonged categories;

receiving a modification determination instruction for the content to be desensitized, and determining the content needing desensitization and the corresponding categories;

receiving a data access instruction sent by a user account, and judging the authority of the user account;

when the user does not have the right to view the sensitive data, desensitizing the content needing desensitization by using an irreversible desensitization algorithm, and sending the desensitized data to a user account;

when a user is authorized to view sensitive data, determining the category to which the user belongs, desensitizing content needing desensitization of the category to which the user belongs by using a reversible desensitization algorithm, desensitizing other content needing desensitization by using an irreversible desensitization algorithm, and sending the desensitized data to a user account, wherein the reversible desensitization algorithm is a binary code-based calculation formula.

As a further scheme of the invention: the step of identifying the original data to obtain the content to be desensitized and the corresponding category to which the content belongs specifically includes:

identifying identity data, financial data, production data and research and development data in the original data;

and marking the identified identity data, financial data, production data and development data as contents to be desensitized, and correspondingly marking the categories to which the identity data, the financial data, the production data and the development data belong as an identity category, a financial category, a production category and a development category.

As a further scheme of the invention: the step of identifying identity data, financial data, production data and research and development data in the original data specifically comprises:

identifying identity data in the original data according to the identity card number, the telephone number, the age, the name and the address;

identifying financial data in the original data according to the financial numerical units, bills, income, expenditure and assets;

identifying production data in the original data according to the production numerical unit, the yield, the discharge yield and the qualification rate;

and identifying the research and development data in the original data according to the product parameters, the project names and the test names.

As a further scheme of the invention: the step of receiving a modification determination instruction of the content to be desensitized and determining the content to be desensitized and the corresponding category includes:

receiving a content editing instruction to be desensitized, wherein the content editing instruction to be desensitized comprises content modification information to be desensitized and belonging category modification information;

modifying the content to be desensitized and the corresponding category according to the content editing instruction to be desensitized;

and receiving a desensitization content determination instruction, and determining content needing desensitization and the corresponding category.

As a further scheme of the invention: a user account library is provided, in which all user accounts not authorized to view sensitive data and all user accounts authorized to view sensitive data are listed, and each user account corresponds to the category to which the user belongs.

It is a further object of the invention to provide a data desensitization apparatus comprising:

the original data identification module is used for receiving the original data, identifying the original data to obtain the content to be desensitized and the corresponding belonged category, and displaying the content to be desensitized and the corresponding belonged category;

a desensitization content determining module, configured to receive a modification determining instruction of content to be desensitized, and determine content to be desensitized and a corresponding category to which the content belongs;

the data access receiving module is used for receiving a data access instruction sent by a user account and judging the authority of the user account; and

the data desensitization processing module is used for desensitizing content to be desensitized by using an irreversible desensitization algorithm when a user does not have the right to view sensitive data, and sending the desensitized data to a user account; when a user is authorized to view sensitive data, determining the category to which the user belongs, desensitizing content needing desensitization of the category to which the user belongs by using a reversible desensitization algorithm, desensitizing other content needing desensitization by using an irreversible desensitization algorithm, and sending the desensitized data to a user account, wherein the reversible desensitization algorithm is a binary code-based calculation formula.

As a further scheme of the invention: the raw data identification module comprises:

the data automatic identification unit is used for identifying identity data, financial data, production data and research and development data in the original data; and

and the content to be desensitized determining unit is used for marking the identified identity data, financial data, production data and development data as the content to be desensitized, and correspondingly marking the categories to which the identity data, the financial data, the production data and the development data belong as the identity category, the financial category, the production category and the development category.

As a further scheme of the invention: the data automatic identification unit includes:

the identity data identification subunit is used for identifying the identity data in the original data according to the identity card number, the telephone number, the age, the name and the address;

the financial data identification subunit is used for identifying financial data in the original data according to financial numerical units, bills, income, expenditure and assets;

the production data identification subunit is used for identifying the production data in the original data according to the production numerical units, the yield, the discharge yield and the qualification rate; and

and the research and development data identification subunit is used for identifying the research and development data in the original data according to the product parameters, the project names and the test names.

As a further scheme of the invention: the desensitization content determination module comprises:

the editing command receiving unit is used for receiving a content editing instruction to be desensitized, wherein the content editing instruction to be desensitized comprises content modification information to be desensitized and belonging category modification information;

the content to be desensitized modifying unit is used for modifying the content to be desensitized and the corresponding category according to the content to be desensitized editing instruction; and

and the content needing desensitization determining unit is used for receiving a desensitization content determining instruction and determining the content needing desensitization and the corresponding category.

It is a further object of the invention to provide a storage medium having stored thereon a computer program which, when executed by a processor, causes the processor to perform the steps of the method of data desensitization.

Compared with the prior art, the invention has the beneficial effects that:

according to the invention, different desensitization algorithms can be adopted to desensitize the content to be desensitized according to whether the user has the right to check sensitive data, the requirement of the irreversible desensitization algorithm on the computing capacity is low, and the desensitization speed is ensured; the reversible desensitization algorithm ensures that the desensitized data can be processed for the second time, is a calculation formula based on binary codes, and is suitable for numbers, letters and Chinese characters.

Drawings

Fig. 1 is a flow chart of a method of data desensitization.

Fig. 2 is a flowchart of identifying original data in a data desensitization method to obtain content to be desensitized and corresponding belonging categories.

Fig. 3 is a flowchart of identifying identity data, financial data, production data, and development data in raw data in a data desensitization method.

Fig. 4 is a flowchart of receiving a modification determination instruction of content to be desensitized in a data desensitization method, determining content to be desensitized and corresponding belonging categories.

Fig. 5 is a schematic structural diagram of a data desensitization apparatus.

Fig. 6 is a schematic structural diagram of an original data identification module in a data desensitization apparatus.

Fig. 7 is a schematic structural diagram of an automatic data identification unit in a data desensitization apparatus.

Fig. 8 is a schematic structural diagram of a desensitization content determining module in a data desensitization apparatus.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more clear, the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Specific implementations of the present invention are described in detail below with reference to specific embodiments.

As shown in Fig. 1, an embodiment of the present invention provides a data desensitization method, including the following steps:

S100, receiving original data, identifying the original data to obtain content to be desensitized and a corresponding belonging category, and displaying the content to be desensitized and the corresponding belonging category;

S200, receiving a modification determination instruction for the content to be desensitized, and determining the content needing desensitization and the corresponding category;

S300, receiving a data access instruction sent by a user account, and judging the authority of the user account;

S400, when the user does not have the right to view sensitive data, desensitizing the content needing desensitization by using an irreversible desensitization algorithm, and sending the desensitized data to a user account;

S500, when the user is authorized to view sensitive data, determining the category to which the user belongs, desensitizing content needing desensitization of the category to which the user belongs by using a reversible desensitization algorithm, desensitizing other content needing desensitization by using an irreversible desensitization algorithm, and sending desensitized data to a user account, wherein the reversible desensitization algorithm is a binary code-based calculation formula.
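
The branching between the irreversible and reversible algorithms in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `Account` and `SensitiveItem` types, the `desensitize_for` function and both stand-in algorithms (`mask` and `shift_code_points`) are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Account:
    name: str
    authorized: bool  # may this account view sensitive data at all?
    category: str     # e.g. "identity", "financial", "production", "development"


@dataclass
class SensitiveItem:
    text: str
    category: str


def mask(text: str) -> str:
    """Irreversible stand-in (assumption): replace every character with '*'."""
    return "*" * len(text)


def shift_code_points(text: str) -> str:
    """Reversible stand-in (assumption): move each character to the next code point."""
    return "".join(chr(ord(c) + 1) for c in text)


def desensitize_for(
    account: Account,
    items: List[SensitiveItem],
    reversible: Callable[[str], str] = shift_code_points,
    irreversible: Callable[[str], str] = mask,
) -> List[str]:
    """Apply the reversible algorithm only to items of the account's own
    category, and only if the account is authorized; mask everything else."""
    result = []
    for item in items:
        if account.authorized and item.category == account.category:
            result.append(reversible(item.text))
        else:
            result.append(irreversible(item.text))
    return result


items = [SensitiveItem("12345", "financial"), SensitiveItem("Li", "identity")]
clerk = Account("clerk", authorized=False, category="financial")
manager = Account("manager", authorized=True, category="financial")
print(desensitize_for(clerk, items))    # ['*****', '**']
print(desensitize_for(manager, items))  # ['23456', '**']
```

Passing the two algorithms as parameters keeps the authority check separate from the choice of algorithm, so either stand-in can be swapped for a different one without touching the dispatch logic.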

It should be noted that, with the spread of modern office systems and the continuing construction of intelligent factories, enterprise databases store more and more confidential data. A leak can cause an enterprise serious economic loss, so enterprises attach great importance to data security, and desensitizing sensitive data is an effective means of preventing leaks. Common desensitization methods include numerical transformation, encryption and masking. Numerical transformation applies a controllable adjustment to numerical and date-type source data through functions, disguising specific values while preserving the relevant statistical characteristics of the original data; however, it is suitable only for numerical values and not for text characters. Encryption encrypts the data to be desensitized so that an external user sees only meaningless encrypted data; however, encryption itself requires a certain amount of computing power and incurs a large resource overhead on large data sets. Masking uniformly replaces part of the content of the sensitive data with a masking symbol (such as "*") while leaving the rest of the content visible; however, masked data loses the characteristics of the original data, cannot be reversed, and cannot be processed or used afterwards.

In the embodiment of the invention, when original data of any kind are received and stored, the original data are automatically identified to obtain the content to be desensitized and the corresponding categories. It should be noted that basic-level employees do not need to perform secondary processing on sensitive data, so they are not authorized to view it; middle-level and senior employees in each department need to perform subsequent processing on the sensitive data related to their department, so they are authorized to view the sensitive data related to that department. Different departments correspond to different categories, such as the identity category (personnel department), the financial category (finance department), the production category (production workshop) and the research and development category (research and development department). To ensure that data are not leaked during transmission, the sensitive data must be desensitized regardless of whether the employee is authorized to view it. The automatically identified content to be desensitized and its categories may contain errors, so the uploader of the original data is required to check and modify them; specifically, the uploader inputs a modification determination instruction for the content to be desensitized, which fixes the final content that needs desensitization and its categories. When a data access instruction sent by a user account is received, the authority of the user account is judged automatically.

A user account library is provided, in which all user accounts not authorized to view sensitive data and all user accounts authorized to view sensitive data are listed, and each user account corresponds to the category to which the user belongs; the authority of a user and the category to which the user belongs can therefore be looked up in the user account library. When the user is not authorized to view sensitive data, the content that needs desensitization is desensitized by using an irreversible desensitization algorithm and the desensitized data are sent to the user account. The irreversible desensitization algorithm masks or deletes the content that needs desensitization; for example, 12345 is replaced by masking symbols.

When the user is authorized to view sensitive data, the content that needs desensitization in the category to which the user belongs is desensitized by using a reversible desensitization algorithm; for example, if the user is an employee of the finance department, the content of the financial category is desensitized by using the reversible desensitization algorithm. The reversible desensitization algorithm is a calculation formula based on binary codes; numbers, letters and Chinese characters all correspond to binary codes. For example, the calculation formula of the embodiment of the invention is as follows: 1 is repeatedly added to the binary code of each character in the content that needs desensitization until the resulting binary code corresponds to a character, and that character is the desensitized data. For example, if the content contains the character "密" ("secret"), whose binary code is 111001011010111110000110, adding 1 gives 111001011010111110000111, which corresponds to the character "寇"; if 111001011010111110000111 had no corresponding character, 1 would be added again until a corresponding character appeared.

According to the embodiment of the invention, different desensitization algorithms are applied to the content that needs desensitization depending on whether the user is authorized to view sensitive data. The irreversible desensitization algorithm places low demands on computing power, which ensures desensitization speed; the reversible desensitization algorithm, being a calculation formula based on binary codes, ensures that desensitized data can undergo secondary processing and is suitable for numbers, letters and Chinese characters.
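
The add-1 formula can be sketched in a few lines. The 24-bit code 111001011010111110000110 in the worked example equals the three UTF-8 bytes E5 AF 86 of the character 密, so the sketch assumes that "binary code" means the UTF-8 encoding of each character; the text does not name the encoding explicitly. The inverse step (subtracting 1 until a character appears) is likewise not spelled out in the text and is an assumption here, as are the function names `shift_char`, `desensitize_reversible` and `restore`.

```python
def shift_char(ch: str, step: int) -> str:
    """Add `step` (+1 or -1) to the UTF-8 code of one character, repeating
    until the result decodes to exactly one character.

    The byte width is kept fixed, so the last (or first) character of a
    given width has no successor (or predecessor) and raises ValueError.
    """
    raw = ch.encode("utf-8")
    width = len(raw)
    value = int.from_bytes(raw, "big")
    limit = 1 << (8 * width)
    while True:
        value += step
        if not 0 <= value < limit:
            raise ValueError(f"no neighbouring {width}-byte code for {ch!r}")
        try:
            candidate = value.to_bytes(width, "big").decode("utf-8")
        except UnicodeDecodeError:
            continue  # this code has no corresponding character; keep going
        if len(candidate) == 1:
            return candidate


def desensitize_reversible(text: str) -> str:
    """Replace every character by the next character in UTF-8 code order."""
    return "".join(shift_char(c, +1) for c in text)


def restore(text: str) -> str:
    """Assumed inverse: replace every character by the previous one."""
    return "".join(shift_char(c, -1) for c in text)


print(desensitize_reversible("密"))  # 寇  (E5 AF 86 + 1 = E5 AF 87)
print(restore("寇"))                 # 密
print(desensitize_reversible("A1"))  # B2
```

Because every code skipped on the way up is also skipped on the way down, `restore` undoes `desensitize_reversible` for any character that has a successor of the same byte width. Note that a fixed, keyless shift of this kind obscures the data rather than encrypting it.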

As shown in Fig. 2, as a preferred embodiment of the present invention, the step of identifying the original data to obtain the content to be desensitized and the corresponding category specifically includes:

S101, identifying identity data, financial data, production data and research and development data in the original data;

S102, marking the identified identity data, financial data, production data and research and development data as content to be desensitized, and correspondingly marking the categories to which the identity data, the financial data, the production data and the research and development data belong as the identity category, the financial category, the production category and the research and development category.

In the embodiment of the invention, the identity data, financial data, production data and research and development data in the original data are identified and marked as content to be desensitized; the identity data are marked with the identity category, the financial data with the financial category, the production data with the production category, and the research and development data with the research and development category.

As shown in Fig. 3, as a preferred embodiment of the present invention, the step of identifying the identity data, the financial data, the production data, and the development data in the original data specifically includes:

S1011, identifying the identity data in the original data according to the identity card number, the telephone number, the age, the name and the address;

S1012, identifying financial data in the original data according to the financial numerical units, bills, income, expenditure and assets;

S1013, identifying production data in the original data according to the production numerical unit, the yield, the discharge yield and the qualification rate;

S1014, identifying the research and development data in the original data according to the product parameters, the project names and the test names.

In the embodiment of the invention, keywords and appearance characteristics can be set to identify the identity data, financial data, production data and research and development data in the original data. For example, identity data are identified according to the identity card number, the telephone number, the age, the name and the address; financial data are identified according to financial numerical units (e.g., dollars), bills, income, expenditure and assets; production data are identified according to production numerical units (e.g., bench), the yield, the discharge yield and the qualification rate; and research and development data are identified according to product parameters, project names and test names (e.g., noise tests).
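
Identification by keywords and appearance characteristics could look roughly like the sketch below. The concrete rules are assumptions made for illustration, not rules given in the text: an 18-character pattern stands in for the identity card number, an 11-digit pattern beginning with 1 stands in for the telephone number, and the English keywords stand in for whatever keywords an operator would configure. The names `RULES`, `classify` and `identify` are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple


def _keywords(*words: str) -> re.Pattern:
    """Build a case-insensitive whole-word matcher for the given keywords."""
    body = "|".join(re.escape(w) for w in words)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


# Category -> patterns; the first category with a matching pattern wins.
RULES: Dict[str, List[re.Pattern]] = {
    "identity": [
        re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),  # assumed ID-number shape
        re.compile(r"(?<!\d)1\d{10}(?!\d)"),       # assumed phone-number shape
        _keywords("name", "age", "address"),
    ],
    "financial": [_keywords("yuan", "bill", "bills", "income", "expenditure", "assets")],
    "production": [_keywords("yield", "qualification rate")],
    "development": [_keywords("product parameter", "project", "test")],
}


def classify(record: str) -> Optional[str]:
    """Return the category of a record, or None if nothing matches."""
    for category, patterns in RULES.items():
        if any(p.search(record) for p in patterns):
            return category
    return None


def identify(records: List[str]) -> List[Tuple[str, str]]:
    """Mark matching records as content to be desensitized, with category."""
    marked = []
    for record in records:
        category = classify(record)
        if category is not None:
            marked.append((record, category))
    return marked


records = [
    "Contact 13800138000",
    "Income this quarter: 120000 yuan",
    "Qualification rate: 98.5%",
    "Noise test for project Falcon",
    "Lunch menu for Friday",
]
for record, category in identify(records):
    print(category, "|", record)
```

Rules of this kind will misclassify some records, which is why the method has the uploader review and correct the automatically identified content before it is finalized.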

As shown in Fig. 4, as a preferred embodiment of the present invention, the step of receiving a content modification determination instruction to be desensitized, and determining a content to be desensitized and a corresponding category includes:

S201, receiving a content editing instruction to be desensitized, wherein the content editing instruction to be desensitized comprises content modification information to be desensitized and belonging category modification information;

S202, modifying the content to be desensitized and the corresponding category according to the content editing instruction to be desensitized;

S203, receiving a desensitization content determining instruction, and determining content to be desensitized and corresponding belonging categories.

In the embodiment of the invention, when the user sees the displayed content to be desensitized and its corresponding categories and considers that modification is needed, the user directly inputs a content editing instruction to be desensitized, which comprises content modification information and category modification information. The content to be desensitized and its corresponding categories are then modified according to that instruction. After all modifications are complete, the user inputs a desensitization content determination instruction, and the modified content is finalized as the content that needs desensitization together with its corresponding categories.
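
The edit-then-confirm flow can be sketched as two small functions. This is an illustrative sketch only: representing the candidates as a mapping from content to category, using `None` as the category to mean "remove this content", and the names `apply_edit` and `confirm` are all assumptions made here.

```python
from types import MappingProxyType
from typing import Dict, Mapping, Optional


def apply_edit(
    candidates: Mapping[str, str], content: str, category: Optional[str]
) -> Dict[str, str]:
    """Apply one editing instruction and return the updated candidates.

    A category of None removes the content from the candidates; any other
    value adds the content or reassigns its category. The input mapping is
    left unchanged.
    """
    updated = dict(candidates)
    if category is None:
        updated.pop(content, None)
    else:
        updated[content] = category
    return updated


def confirm(candidates: Mapping[str, str]) -> Mapping[str, str]:
    """Handle the determination instruction: freeze the result as read-only."""
    return MappingProxyType(dict(candidates))


# Automatically identified candidates containing two mistakes.
auto = {"13800138000": "financial", "Lunch menu": "identity"}
edited = apply_edit(auto, "13800138000", "identity")  # fix the category
edited = apply_edit(edited, "Lunch menu", None)       # not sensitive; drop it
final = confirm(edited)
print(dict(final))  # {'13800138000': 'identity'}
```

Returning a read-only view from `confirm` reflects the idea that the confirmed content and categories are final for the later access-time steps.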

As shown in Fig. 5, an embodiment of the present invention further provides a data desensitization apparatus, including:

the original data identification module 100 is configured to receive original data, identify the original data to obtain content to be desensitized and a corresponding category, and display the content to be desensitized and the corresponding category;

a desensitization content determining module 200, configured to receive a modification determining instruction of content to be desensitized, and determine content to be desensitized and a corresponding category to which the content belongs;

the data access receiving module 300 is configured to receive a data access instruction sent by a user account, and determine the authority of the user account; and

the data desensitization processing module 400 is configured to, when the user is not authorized to view sensitive data, desensitize the content that needs desensitization by using an irreversible desensitization algorithm and send the desensitized data to the user account; and, when the user is authorized to view sensitive data, determine the category to which the user belongs, desensitize the content that needs desensitization in that category by using a reversible desensitization algorithm, desensitize the other content that needs desensitization by using an irreversible desensitization algorithm, and send the desensitized data to the user account, wherein the reversible desensitization algorithm is a binary code-based calculation formula.

In the embodiment of the invention, when original data of any kind are received and stored, the original data are automatically identified to obtain the content to be desensitized and the corresponding categories. It should be noted that basic-level employees do not need to perform secondary processing on sensitive data, so they are not authorized to view it; middle-level and senior employees in each department need to perform subsequent processing on the sensitive data related to their department, so they are authorized to view the sensitive data related to that department. Different departments correspond to different categories, such as the identity category (personnel department), the financial category (finance department), the production category (production workshop) and the research and development category (research and development department). To ensure that data are not leaked during transmission, the sensitive data must be desensitized regardless of whether the employee is authorized to view it. The automatically identified content to be desensitized and its categories may contain errors, so the uploader of the original data is required to check and modify them; specifically, the uploader inputs a modification determination instruction for the content to be desensitized, which fixes the final content that needs desensitization and its categories. When a data access instruction sent by a user account is received, the authority of the user account is judged automatically.

A user account library is provided, in which all user accounts not authorized to view sensitive data and all user accounts authorized to view sensitive data are listed, and each user account corresponds to the category to which the user belongs; the authority of a user and the category to which the user belongs can therefore be looked up in the user account library. When the user is not authorized to view sensitive data, the content that needs desensitization is desensitized by using an irreversible desensitization algorithm and the desensitized data are sent to the user account. The irreversible desensitization algorithm masks or deletes the content that needs desensitization; for example, 12345 is replaced by masking symbols.

When the user is authorized to view sensitive data, the content that needs desensitization in the category to which the user belongs is desensitized by using a reversible desensitization algorithm; for example, if the user is an employee of the finance department, the content of the financial category is desensitized by using the reversible desensitization algorithm. The reversible desensitization algorithm is a calculation formula based on binary codes; numbers, letters and Chinese characters all correspond to binary codes. For example, the calculation formula of the embodiment of the invention is as follows: 1 is repeatedly added to the binary code of each character in the content that needs desensitization until the resulting binary code corresponds to a character, and that character is the desensitized data. For example, if the content contains the character "密" ("secret"), whose binary code is 111001011010111110000110, adding 1 gives 111001011010111110000111, which corresponds to the character "寇"; if 111001011010111110000111 had no corresponding character, 1 would be added again until a corresponding character appeared.

According to the embodiment of the invention, different desensitization algorithms are applied to the content that needs desensitization depending on whether the user is authorized to view sensitive data. The irreversible desensitization algorithm places low demands on computing power, which ensures desensitization speed; the reversible desensitization algorithm, being a calculation formula based on binary codes, ensures that desensitized data can undergo secondary processing and is suitable for numbers, letters and Chinese characters.

As shown in fig. 6, as a preferred embodiment of the present invention, the raw data identification module 100 includes:

an automatic data identification unit 101, configured to identify identity data, financial data, production data, and research and development data in the original data; and

a content to be desensitized determining unit 102, configured to mark the identified identity data, financial data, production data, and research and development data as content to be desensitized, and to mark the categories to which they belong as the identity category, the financial category, the production category, and the research and development category, respectively.

As shown in fig. 7, as a preferred embodiment of the present invention, the data automatic identification unit 101 includes:

an identity data identification subunit 1011 configured to identify identity data in the original data according to the identification number, the telephone number, the age, the name, and the address;

a financial data identification subunit 1012 for identifying financial data in the original data according to the financial value unit, bill, income, expense and asset;

a production data identification subunit 1013, configured to identify production data in the original data according to production numerical units, yield, scheduled yield, and qualified rate; and

a research and development data identification subunit 1014, configured to identify research and development data in the original data according to product parameters, project names, and test names.
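
The keyword-driven identification performed by the four subunits can be illustrated with the following Python sketch. It is an assumption-laden example rather than the embodiment itself: it supposes that the original data arrives as a record of named fields, that a field is recognized by whole-word keyword matching on its name, and that the most specific (longest) keyword wins when several match. The keyword lists paraphrase the cues named above; `CATEGORY_KEYWORDS` and `identify` are hypothetical names.

```python
import re

# Keywords paraphrase the cues listed above; the matching rule itself is an assumption.
CATEGORY_KEYWORDS = {
    "identity": ("identification number", "telephone number", "age", "name", "address"),
    "financial": ("bill", "income", "expense", "asset"),
    "production": ("yield", "qualified rate"),
    "research and development": ("product parameter", "project name", "test name"),
}


def identify(record):
    """Map each field that needs desensitization to the category it belongs to."""
    marked = {}
    for field_name in record:
        best = None  # (keyword length, category) of the most specific match so far
        for category, keywords in CATEGORY_KEYWORDS.items():
            for keyword in keywords:
                pattern = rf"\b{re.escape(keyword)}\b"  # whole-word match only
                if re.search(pattern, field_name.lower()):
                    if best is None or len(keyword) > best[0]:
                        best = (len(keyword), category)
        if best is not None:
            marked[field_name] = best[1]
    return marked


record = {"Name": "Li", "Income": 5000, "Project name": "X", "Notes": "n/a"}
print(identify(record))
```

Preferring the longest keyword keeps "Project name" in the research and development category even though it also contains the identity keyword "name".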

As shown in fig. 8, as a preferred embodiment of the present invention, the desensitization content determining module 200 includes:

an editing command receiving unit 201, configured to receive a content to be desensitized editing instruction, where the instruction includes modification information for the content to be desensitized and modification information for the category to which that content belongs;

the content to be desensitized modifying unit 202 is configured to modify the content to be desensitized and the corresponding category according to the content to be desensitized editing instruction; and

a desensitization required content determining unit 203, configured to receive a desensitization content determining instruction and to determine the content needing desensitization and the corresponding category to which it belongs.
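
A minimal Python sketch of how units 202 and 203 might process these instructions is given below. The source does not specify the format of the editing instruction, so the representation used here (a mapping of content to category, plus a set of content to unmark) and the names `apply_edit` and `confirm` are purely hypothetical.

```python
from types import MappingProxyType


def apply_edit(marked, set_category=None, unmark=()):
    """Return a new marking with content added, re-categorized, or unmarked.

    marked:       mapping of content to be desensitized -> category
    set_category: content -> category entries to add or overwrite
    unmark:       content that should no longer be desensitized
    """
    updated = {
        content: category
        for content, category in marked.items()
        if content not in unmark
    }
    updated.update(set_category or {})
    return updated


def confirm(marked):
    """Freeze the marking once the determining instruction has been received."""
    return MappingProxyType(dict(marked))


automatic = {"Name": "identity", "Income": "financial", "Notes": "identity"}
edited = apply_edit(automatic, set_category={"Yield": "production"}, unmark={"Notes"})
final = confirm(edited)
print(dict(final))
```

Returning a read-only view from `confirm` reflects the idea that the content needing desensitization is fixed once it has been determined; later changes would require a new editing instruction.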

An embodiment of the present invention further provides a storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the processor is caused to execute the steps in the data desensitization method.

The present invention has been described in detail with reference to the preferred embodiments thereof, and it should be understood that the present invention is not limited thereto, but includes any modifications, equivalents, and improvements within the spirit and scope of the present invention.

It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise, the steps are not restricted to the exact order shown and described, and may be performed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims

1. A method of data desensitization, the method comprising the steps of:

when the user does not have the right to view the sensitive data, desensitizing the content to be desensitized by using an irreversible desensitization algorithm, and sending the desensitized data to the user account;

2. A data desensitization method according to claim 1, wherein said step of identifying the original data to obtain the content to be desensitized and the corresponding category includes:

and marking the identified identity data, financial data, production data and development data as contents to be desensitized, and correspondingly marking the categories to which the identity data, the financial data, the production data and the development data belong as identity categories, financial categories, production categories and development categories.

3. A method of desensitizing data according to claim 2, wherein said step of identifying identity data, financial data, production data, and development data in the raw data comprises:

identifying financial data in the original data according to financial value units, bills, income, expenses, and assets;

identifying production data in the original data according to production numerical units, yield, scheduled yield, and qualified rate;

4. The data desensitization method according to claim 1, wherein said step of receiving a content modification determination instruction to be desensitized, determining content to be desensitized and corresponding category includes:

and receiving a desensitization content determining instruction, and determining the content needing desensitization and the corresponding category to which it belongs.

5. A data desensitization method according to claim 1, characterized in that a user account database is provided, in which all user accounts which are not authorized to view sensitive data and all user accounts which are authorized to view sensitive data are displayed, and each user account corresponds to a category.

6. A data desensitization apparatus, characterized in that the apparatus comprises:

the desensitization content determining module is used for receiving a modification determining instruction of content to be desensitized and determining content to be desensitized and corresponding categories;

the data desensitization processing module is used for desensitizing content needing desensitization by using an irreversible desensitization algorithm when a user does not have the right to view sensitive data, and transmitting the desensitized data to a user account; when a user is authorized to view sensitive data, determining the category to which the user belongs, desensitizing content needing desensitization of the category to which the user belongs by using a reversible desensitization algorithm, desensitizing other content needing desensitization by using an irreversible desensitization algorithm, and sending the desensitized data to a user account, wherein the reversible desensitization algorithm is a binary code-based calculation formula.

7. A data desensitization apparatus according to claim 6, wherein said raw data identification module comprises:

8. A data desensitization apparatus according to claim 7, wherein said data automatic identification unit comprises:

9. The data desensitization apparatus according to claim 6, wherein said desensitization content determining module comprises:

10. A storage medium having stored thereon a computer program which, when executed by a processor, causes the processor to carry out the steps of a method of data desensitization according to any of claims 1 to 5.