CN115017213A - Sensitive data processing method and device - Google Patents


Info

Publication number
CN115017213A
CN115017213A CN202210767088.4A CN202210767088A CN115017213A CN 115017213 A CN115017213 A CN 115017213A CN 202210767088 A CN202210767088 A CN 202210767088A CN 115017213 A CN115017213 A CN 115017213A
Authority
CN
China
Prior art keywords
data
sensitive
sensitive data
determining
data source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210767088.4A
Other languages
Chinese (zh)
Inventor
徐雪莲
张翼飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Financial Technology Co Ltd
Original Assignee
Bank of China Financial Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Financial Technology Co Ltd
Priority to CN202210767088.4A
Publication of CN115017213A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a sensitive data processing method and a device, which can be applied to the financial field or other fields, and the method comprises the following steps: responding to a sensitive data identification instruction, and determining a data source corresponding to the data identification instruction; scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source; detecting the data dictionary to determine each sensitive data in the data dictionary; determining a data type and a sensitivity level of each of the sensitive data; the sensitivity level of each of the sensitive data characterizes the importance of each of the sensitive data; generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data; and outputting the sensitive data statistical information. By applying the method provided by the embodiment of the invention, the sensitive data can be effectively identified and managed, and the leakage risk of the sensitive data is reduced.

Description

Sensitive data processing method and device
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing sensitive data.
Background
With the advent of the big data era, the great value of data is being mined, but this also makes it harder to protect private information and critical sensitive data. Sharing data efficiently while preventing sensitive information from leaking has become a key part of the intelligent development of data security. To protect sensitive data from leakage, the sensitive data must first be identified and managed.
At present, sensitive data is usually identified manually: a data analyst judges subjectively whether a given item is sensitive. This approach has two drawbacks. First, identification is inefficient: when a large amount of data is involved, manual review takes far longer than machine identification and places high demands on the professional skills of the personnel involved. Second, the judgment criteria are not uniform: because the identification process depends mainly on subjective human judgment, different people may apply different criteria to the same data, and even the same person may reach different results at different times. The resulting inconsistent and inaccurate identification means that sensitive data cannot be managed effectively and is at risk of leakage.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a sensitive data processing method, which can effectively identify and manage sensitive data and reduce the leakage risk of the sensitive data.
The invention also provides a sensitive data processing device which is used for ensuring the realization and the application of the method in practice.
A sensitive data processing method, comprising:
responding to a sensitive data identification instruction, and determining a data source corresponding to the data identification instruction;
scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source;
detecting the data dictionary to determine each sensitive data in the data dictionary;
determining a data type and a sensitivity level of each of the sensitive data; the sensitivity level of each said sensitive data characterizes the importance of each said sensitive data;
generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data;
and outputting the sensitive data statistical information.
In the above method, optionally, the scanning the data source according to the preset scanning mode to obtain the data dictionary in the data source includes:
determining a data source type of the data source;
establishing communication connection with the data source according to a communication mode corresponding to the data source type of the data source;
and under the condition that the communication connection with the data source is successfully established, scanning the data source according to a timing plan in a preset scanning mode to obtain a data dictionary in the data source.
In the above method, optionally, the detecting the data dictionary to determine each sensitive data in the data dictionary includes:
acquiring each to-be-detected data contained in the data dictionary;
detecting each data to be detected by using a preset sensitive identification technology to obtain a detection result of each data to be detected; the sensitive identification technology comprises at least one of sensitive data semantic analysis, a sensitive data identification algorithm and sensitive field regular matching;
and determining sensitive data in the data to be detected according to the detection result of each data to be detected.
In the above method, optionally, the determining the data type of each piece of sensitive data includes:
acquiring a field identifier in each sensitive data;
matching the field identification in each piece of sensitive data with each preset sensitive field;
and for each sensitive data, determining the data type corresponding to the sensitive field successfully matched with the field identification of the sensitive data as the data type of the sensitive data.
In the above method, optionally, the determining the sensitivity level of each piece of sensitive data includes:
acquiring a preset grading rule;
and matching each sensitive data with the grading rule to obtain the sensitivity level of each sensitive data.
A sensitive data processing apparatus comprising:
the first determining unit is used for responding to a sensitive data identification instruction and determining a data source corresponding to the data identification instruction;
the scanning unit is used for scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source;
the detection unit is used for detecting the data dictionary to determine each sensitive data in the data dictionary;
the second determining unit is used for determining the data type and the sensitivity level of each piece of sensitive data; the sensitivity level of each said sensitive data characterizes the importance of each said sensitive data;
the generating unit is used for generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data;
and the output unit is used for outputting the sensitive data statistical information.
In the above apparatus, optionally, the scanning unit includes:
the first determining subunit is used for determining the data source type of the data source;
the first execution subunit is used for establishing communication connection with the data source according to a communication mode corresponding to the data source type of the data source;
and the scanning subunit is used for scanning the data source according to a timing plan in a preset scanning mode under the condition that the communication connection with the data source is successfully established, so as to obtain a data dictionary in the data source.
In the above apparatus, optionally, the detection unit includes:
the first acquisition subunit is used for acquiring each to-be-detected data contained in the data dictionary;
the detection subunit is used for detecting each data to be detected by using a preset sensitive identification technology to obtain a detection result of each data to be detected; the sensitive identification technology comprises at least one of sensitive data semantic analysis, a sensitive data identification algorithm and sensitive field regular matching;
and the second determining subunit is used for determining sensitive data in the data to be detected according to the detection result of each data to be detected.
In the above apparatus, optionally, the second determining unit includes:
the second acquisition subunit is used for acquiring the field identifier in each piece of sensitive data;
the matching subunit is used for matching the field identification in each piece of sensitive data with each preset sensitive field;
and the third determining subunit is configured to determine, as the data type of the sensitive data, a data type corresponding to a sensitive field that is successfully matched with the field identifier of the sensitive data.
In the above apparatus, optionally, the second determining unit includes:
the third acquisition subunit is used for acquiring a preset grading rule;
and the second execution subunit is used for matching each sensitive data with the grading rule to obtain the sensitivity level of each sensitive data.
A storage medium comprising stored instructions, wherein the instructions, when executed, control a device on which the storage medium is located to perform the sensitive data processing method as described above.
An electronic device comprising a memory, and one or more instructions, wherein the one or more instructions are stored in the memory and configured to be executed by one or more processors to perform the sensitive data processing method as described above.
Based on the sensitive data processing method and device, the storage medium and the electronic device, the method comprises the following steps: responding to a sensitive data identification instruction, and determining a data source corresponding to the data identification instruction; scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source; detecting the data dictionary to determine each sensitive data in the data dictionary; determining a data type and a sensitivity level of each of the sensitive data; the sensitivity level of each said sensitive data characterizes the importance of each said sensitive data; generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data; and outputting the sensitive data statistical information. By applying the method provided by the embodiment of the invention, the sensitive data can be effectively identified and managed, and the leakage risk of the sensitive data is reduced.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are merely embodiments of the present invention; those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a flow chart of a method for processing sensitive data according to the present invention;
FIG. 2 is a flow chart of a process for obtaining a data dictionary in a data source according to the present invention;
FIG. 3 is a flow chart of a process for detecting a data dictionary in accordance with the present invention;
FIG. 4 is a schematic structural diagram of a sensitive data processing apparatus according to the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to the present invention;
FIG. 6 is an exemplary diagram of a sensitive data processing process provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
An embodiment of the present invention provides a sensitive data processing method, which may be applied to an electronic device. A flowchart of the method is shown in FIG. 1, and the method specifically includes:
S101: Respond to a sensitive data identification instruction and determine the data source corresponding to the instruction.
In this embodiment, the sensitive data identification instruction may be an instruction triggered when there is a sensitive data processing requirement. It may be triggered by a user clicking a preset control, or triggered automatically while an application program is running.
Optionally, the sensitive data identification instruction may be parsed to obtain instruction information, a data source identifier is obtained from the instruction information, and the data source corresponding to the instruction is selected from the preset candidate data sources according to the data source identifier.
The data source may be a structured data source, for example, a relational database, a MySQL database, an Oracle database, or the like; the data source may also be an unstructured data source, for example, a big data platform, a cloud database, or the like.
S102: Scan the data source according to a preset scanning mode to obtain a data dictionary in the data source.
In this embodiment, the data source may contain one or more data dictionaries, which are obtained by scanning the data source.
S103: Detect the data dictionary to determine each piece of sensitive data in the data dictionary.
In this embodiment, the data dictionary includes a plurality of data items, and each data item in the data dictionary may be detected to determine the sensitive data in the data dictionary.
Optionally, the data dictionary may be detected by using at least one of sensitive data semantic analysis, a sensitive data identification algorithm, and sensitive field regular-expression matching.
S104: Determine a data type and a sensitivity level of each piece of sensitive data. The sensitivity level of each piece of sensitive data represents its degree of importance; specifically, it reflects which objects are affected, and how severely, if the security of the sensitive data is compromised.
S105: Generate sensitive data statistical information of the data source according to the data type and the sensitivity level of each piece of sensitive data.
In this embodiment, the sensitive data statistical information may include the distribution of each piece of sensitive data in the data source and a classification and grading report.
S106: Output the sensitive data statistical information.
In this embodiment, the sensitive data statistical information may be transmitted to a preset terminal, or output and displayed on a preset display interface.
By applying the method provided by the embodiment of the invention, the sensitive data can be effectively identified and managed, and the leakage risk of the sensitive data is reduced.
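Steps S105 and S106 can be illustrated with a minimal sketch. It assumes each detected item has already been given a data type and a sensitivity level; the record layout (`SensitiveItem`), the type names, and the report format are hypothetical and are not specified by the source.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SensitiveItem:
    """One detected sensitive field (hypothetical record layout)."""
    table: str
    field: str
    data_type: str  # e.g. "phone_number" (illustrative type name)
    level: int      # sensitivity level, assumed to run from 1 (low) to 5 (high)


def build_statistics(items):
    """Aggregate detected items into distribution and grading counts (S105)."""
    return {
        "total": len(items),
        "by_table": dict(Counter(i.table for i in items)),
        "by_type": dict(Counter(i.data_type for i in items)),
        "by_level": dict(Counter(i.level for i in items)),
    }


def format_report(stats):
    """Render the statistics as plain text for display or transmission (S106)."""
    lines = [f"sensitive fields: {stats['total']}"]
    for key in ("by_table", "by_type", "by_level"):
        for name, count in sorted(stats[key].items(), key=lambda kv: str(kv[0])):
            lines.append(f"{key}[{name}] = {count}")
    return "\n".join(lines)


# Illustrative input only; these fields are made up for the example.
items = [
    SensitiveItem("customer", "mobile", "phone_number", 3),
    SensitiveItem("customer", "id_card_no", "national_id", 4),
    SensitiveItem("account", "card_no", "bank_card", 4),
]
stats = build_statistics(items)
print(format_report(stats))
```

Counting by table gives the distribution information, while counting by type and by level gives the raw material for a classification and grading report.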
In an embodiment provided by the present invention, based on the above implementation process, the process of scanning the data source according to a preset scanning mode to obtain the data dictionary in the data source is shown in FIG. 2 and specifically includes:
S201: Determine a data source type of the data source.
In this embodiment, the data source may be one of a relational database, a MySQL database, an Oracle database, a big data platform, a cloud database, and the like.
S202: Establish a communication connection with the data source according to the communication mode corresponding to the data source type of the data source.
In this embodiment, the communication mode corresponding to the data source type of the data source may be determined first, and a communication connection may then be established with the data source according to the communication protocol of that communication mode.
Optionally, different data source types correspond to different communication modes.
S203: If the communication connection with the data source is successfully established, scan the data source according to a timing plan in the preset scanning mode to obtain the data dictionary in the data source.
In this embodiment, the timing plan may be set according to actual requirements; for example, the data source may be scanned every 5 minutes, every hour, or every day.
The tables and data in the data source may be scanned according to an industry template and built-in data standards for the industry to which the data source belongs, combined with the characteristics of that industry and of the enterprise that owns the data source.
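Steps S201 to S203 can be sketched as follows. The sketch uses Python's built-in `sqlite3` module purely as a stand-in data source so that it runs without a server; the source names MySQL, Oracle, big data platforms, and cloud databases, each of which would need its own connector. The `CONNECTORS` registry, the function names, and the shape of the returned dictionary are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def connect_sqlite(location):
    """Connector for the hypothetical 'sqlite' data source type."""
    return sqlite3.connect(location)


# Hypothetical registry: data source type -> connection function (S201/S202).
CONNECTORS = {"sqlite": connect_sqlite}


def open_connection(source_type, location):
    """Pick the communication mode for the source type and connect."""
    connector = CONNECTORS.get(source_type)
    if connector is None:
        raise ValueError(f"unsupported data source type: {source_type}")
    return connector(location)


def scan_data_dictionary(conn):
    """Return {table: [(column, declared_type), ...]} for every table (S203)."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    dictionary = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        dictionary[table] = [(col[1], col[2]) for col in columns]
    return dictionary


def next_scan_time(last_scan, interval):
    """Timing plan: the next scan is due one interval after the last one."""
    return last_scan + interval


# Demo against an in-memory database with one illustrative table.
conn = open_connection("sqlite", ":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, mobile TEXT, id_card_no TEXT)")
dictionary = scan_data_dictionary(conn)
conn.close()
print(dictionary)
print(next_scan_time(datetime(2022, 6, 30, 9, 0), timedelta(hours=1)))
```

Keeping the connectors in a registry keyed by source type mirrors the statement that different data source types correspond to different communication modes; a failed connection raises an exception, so no scan is attempted.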
In an embodiment provided by the present invention, based on the above implementation process, the process of detecting the data dictionary to determine each piece of sensitive data in the data dictionary is shown in FIG. 3 and specifically includes:
S301: Acquire each piece of data to be detected contained in the data dictionary.
In this embodiment, the data dictionary may be parsed to obtain each data item in the data dictionary, and each data item may be used as data to be detected.
S302: Detect each piece of data to be detected by using a preset sensitive identification technology to obtain a detection result for each piece of data to be detected; the sensitive identification technology comprises at least one of sensitive data semantic analysis, a sensitive data identification algorithm, and sensitive field regular-expression matching.
In this embodiment, any of the following may be used. Sensitive data semantic analysis may be performed on the field comments of the data to be detected, and the analysis result is used as a sensitive data identification result. A sensitive data identification algorithm may scan and analyze the data to be detected as stored in the data tables, judging the content from the actual stored values to obtain a sensitive data identification result. Sensitive field regular-expression matching may also be used to determine a sensitive data identification result for the data source.
Optionally, the detection result of the data to be detected may be determined from the individual sensitive data identification results. Specifically, when the identification result of at least one sensitive identification technology indicates that the data to be detected is sensitive data, the detection result for that data is determined to indicate that it is sensitive data.
S303: Determine the sensitive data among the data to be detected according to the detection result of each piece of data to be detected.
By applying the method provided by the embodiment of the invention, the sensitive data can be quickly and accurately determined from each data to be detected.
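The combination of techniques in S301 to S303 can be sketched as three independent checks whose results are joined with a logical OR, following the statement that a single positive identification result marks the data as sensitive. The keyword list, the regular expressions, the 80% sample threshold, and the field record layout are all illustrative assumptions; the simple keyword lookup and value-pattern check only stand in for the semantic analysis and identification algorithm, which the source does not specify.

```python
import re

# All keyword lists, patterns, and thresholds below are illustrative assumptions.
COMMENT_KEYWORDS = ("phone", "mobile", "identity card", "bank card")
FIELD_NAME_PATTERN = re.compile(r"(mobile|phone|id_?card|card_?no)", re.IGNORECASE)
PHONE_VALUE_PATTERN = re.compile(r"1\d{10}")  # 11 digits starting with 1


def semantic_check(field):
    """Stand-in for semantic analysis: keyword lookup in the field comment."""
    comment = field.get("comment", "").lower()
    return any(keyword in comment for keyword in COMMENT_KEYWORDS)


def content_check(field):
    """Stand-in for an identification algorithm: inspect sampled stored values."""
    samples = field.get("samples", [])
    if not samples:
        return False
    hits = sum(1 for value in samples if PHONE_VALUE_PATTERN.fullmatch(str(value)))
    return hits / len(samples) >= 0.8


def field_name_check(field):
    """Sensitive field regular-expression matching on the field name."""
    return FIELD_NAME_PATTERN.search(field["name"]) is not None


DETECTORS = (semantic_check, content_check, field_name_check)


def detect(field):
    """Run every detector; the field is sensitive if at least one fires."""
    results = {detector.__name__: detector(field) for detector in DETECTORS}
    return any(results.values()), results


def find_sensitive(fields):
    """Return the names of the fields whose detection result is positive (S303)."""
    return [field["name"] for field in fields if detect(field)[0]]


# Made-up fields and placeholder values for the demo.
fields = [
    {"name": "mobile", "comment": "customer mobile number", "samples": ["13800000000"]},
    {"name": "col_7", "comment": "", "samples": ["13800000000", "13900000000"]},
    {"name": "remark", "comment": "free text note", "samples": ["hello"]},
]
print(find_sensitive(fields))
```

The second field shows why content inspection is useful alongside name and comment checks: its name and comment carry no hint, and only the stored values reveal it.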
In an embodiment of the present invention, based on the above implementation process, the process of determining the data type of each piece of sensitive data specifically includes:
acquiring a field identifier in each sensitive data;
matching the field identification in each piece of sensitive data with each preset sensitive field;
and for each sensitive data, determining the data type corresponding to the sensitive field successfully matched with the field identification of the sensitive data as the data type of the sensitive data.
In this embodiment, different sensitive fields correspond to different data types. The data type of each piece of sensitive data can be determined by matching it against each sensitive field, and the matching between sensitive data and sensitive fields may be regular-expression matching.
By applying the method provided by the embodiment of the invention, the data type of the sensitive data can be rapidly determined.
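The matching of field identifiers against preset sensitive fields can be sketched with an ordered list of regular expressions, each paired with a data type. The patterns and type names below are hypothetical; a real deployment would load them from its own configuration or industry template.

```python
import re

# Hypothetical preset sensitive fields: (pattern for the field identifier, data type).
# Order matters: the first pattern that matches decides the data type.
SENSITIVE_FIELDS = [
    (re.compile(r"(mobile|phone)", re.IGNORECASE), "phone_number"),
    (re.compile(r"id_?card", re.IGNORECASE), "national_id"),
    (re.compile(r"(bank_?)?card_?no", re.IGNORECASE), "bank_card"),
    (re.compile(r"e?mail", re.IGNORECASE), "email"),
]


def data_type_of(field_identifier):
    """Return the data type of the first sensitive field that matches, else None."""
    for pattern, data_type in SENSITIVE_FIELDS:
        if pattern.search(field_identifier):
            return data_type
    return None


for name in ("cust_mobile", "id_card_no", "CARD_NO", "created_at"):
    print(name, data_type_of(name))
```

Here `id_card_no` would also match the bank card pattern, so the national ID pattern is listed first; making the precedence explicit in the list order keeps such overlaps predictable.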
In an embodiment of the present invention, based on the above implementation process, the process of determining the sensitivity level of each piece of sensitive data specifically includes:
acquiring a preset grading rule;
and matching each piece of sensitive data with the grading rule to obtain the sensitivity level of each piece of sensitive data.
In this embodiment, influence information of each piece of sensitive data may be determined, where the influence information includes an influence object and an influence range. The influence information of the sensitive data is then matched against the grading rule to determine the sensitivity level of the sensitive data: the more important the influence object and the larger the influence range, the higher the sensitivity level.
Optionally, the sensitivity levels may be divided into five levels, numbered 5, 4, 3, 2, and 1 from high to low.
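One way to picture a grading rule of this kind is a pair of lookup tables whose weights are added and clamped to the five levels. The categories and weights below are assumptions chosen only to show the monotonic behavior described above (a more important object or a larger range never lowers the level); the source does not give the actual rule.

```python
# Hypothetical grading rule: weights for the influence object and influence range.
OBJECT_WEIGHT = {"individual": 1, "enterprise": 2, "public_interest": 3}
RANGE_WEIGHT = {"narrow": 0, "moderate": 1, "wide": 2}


def sensitivity_level(influence_object, influence_range):
    """Map influence information to a level from 1 (lowest) to 5 (highest)."""
    score = OBJECT_WEIGHT[influence_object] + RANGE_WEIGHT[influence_range]
    return max(1, min(5, score))  # clamp to the five levels


print(sensitivity_level("individual", "narrow"))
print(sensitivity_level("enterprise", "moderate"))
print(sensitivity_level("public_interest", "wide"))
```

An unknown object or range raises `KeyError`, which surfaces gaps in the rule instead of silently assigning a level.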
Corresponding to the method shown in FIG. 1, an embodiment of the present invention further provides a sensitive data processing apparatus for implementing that method. The apparatus may be applied to an electronic device; its schematic structural diagram is shown in FIG. 4, and it specifically includes:
a first determining unit 401, configured to determine, in response to a sensitive data identification instruction, a data source corresponding to the data identification instruction;
a scanning unit 402, configured to scan the data source according to a preset scanning manner, so as to obtain a data dictionary in the data source;
a detecting unit 403, configured to detect the data dictionary to determine each sensitive data in the data dictionary;
a second determining unit 404, configured to determine a data type and a sensitivity level of each of the sensitive data; the sensitivity level of each said sensitive data characterizes the importance of each said sensitive data;
a generating unit 405, configured to generate sensitive data statistical information of the data source according to the data type and the sensitivity level of each piece of sensitive data;
and an output unit 406, configured to output the sensitive data statistical information.
In an embodiment of the present invention, based on the above scheme, optionally, the scanning unit 402 includes:
the first determining subunit is used for determining the data source type of the data source;
the first execution subunit is used for establishing communication connection with the data source according to a communication mode corresponding to the data source type of the data source;
and the scanning subunit is used for scanning the data source according to a timing plan in a preset scanning mode under the condition that the communication connection with the data source is successfully established, so as to obtain a data dictionary in the data source.
In an embodiment provided by the present invention, based on the above scheme, optionally, the detecting unit 403 includes:
the first acquisition subunit is used for acquiring each to-be-detected data contained in the data dictionary;
the detection subunit is used for detecting each data to be detected by using a preset sensitive identification technology to obtain a detection result of each data to be detected; the sensitive identification technology comprises at least one of sensitive data semantic analysis, a sensitive data identification algorithm and sensitive field regular matching;
and the second determining subunit is used for determining sensitive data in the data to be detected according to the detection result of each data to be detected.
In an embodiment provided by the present invention, based on the above scheme, optionally, the second determining unit 404 includes:
the second acquisition subunit is used for acquiring the field identifier in each piece of sensitive data;
the matching subunit is used for matching the field identification in each piece of sensitive data with each preset sensitive field;
and the third determining subunit is configured to determine, as the data type of the sensitive data, a data type corresponding to a sensitive field that is successfully matched with the field identifier of the sensitive data.
In an embodiment provided by the present invention, based on the above scheme, optionally, the second determining unit 404 includes:
the third acquisition subunit is used for acquiring a preset grading rule;
and the second execution subunit is used for matching each sensitive data with the grading rule to obtain the sensitivity level of each sensitive data.
The specific principle and the execution process of each unit and each module in the sensitive data processing apparatus disclosed in the above embodiment of the present invention are the same as those of the sensitive data processing method disclosed in the above embodiment of the present invention, and reference may be made to corresponding parts in the sensitive data processing method provided in the above embodiment of the present invention, which are not described herein again.
The embodiment of the invention also provides a storage medium, which comprises a stored instruction, wherein when the instruction runs, the device where the storage medium is located is controlled to execute the sensitive data processing method.
An electronic device is provided in an embodiment of the present invention, and the structural diagram of the electronic device is shown in fig. 5, which specifically includes a memory 501 and one or more instructions 502, where the one or more instructions 502 are stored in the memory 501, and are configured to be executed by one or more processors 503 to perform the following operations according to the one or more instructions 502:
responding to a sensitive data identification instruction, and determining a data source corresponding to the data identification instruction;
scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source;
detecting the data dictionary to determine each sensitive data in the data dictionary;
determining a data type and a sensitivity level of each of the sensitive data; the sensitivity level of each said sensitive data characterizes the importance of each said sensitive data;
generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data;
and outputting the sensitive data statistical information.
FIG. 6 shows an exemplary diagram of a sensitive data processing process provided by an embodiment of the present invention. The solution can automatically scan structured and unstructured data sources to discover and identify the data assets of a financial enterprise. During scanning, a built-in financial industry template and built-in data standards are used, together with strategies and techniques such as semantic content analysis, a sensitive data identification algorithm, and sensitive field regular-expression matching, combined with the characteristics of the industry and of the enterprise itself, to produce a data asset discovery and classification and grading report for the financial enterprise. From the scan report, the financial enterprise obtains and clarifies the distribution, data types, and sensitivity levels of its data assets, which provides a basis for data security construction and asset inventory. Asset discovery operations can thus be executed and their results displayed, reducing the risk of data leakage.
In an embodiment provided by the present invention, a sensitive data processing system is provided, which specifically includes:
the data source management module is used for connecting a data source;
the scanning plan task module is used for making a scanning plan and scanning a data source to obtain a data dictionary;
the sensitive data identification module is used for scanning the data dictionary according to the scanning rule to obtain sensitive data distribution;
the sensitive data identification module is further used for automatically classifying the identified sensitive data;
the sensitive data identification module is further used for determining the sensitivity level of the identified sensitive data;
and the scanning report display module is used for obtaining a report on the state of the data asset inventory.
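The first two modules can be sketched as a connector registry keyed by data source type plus a scan that returns the data dictionary. This is only an illustration under stated assumptions: SQLite (through Python's standard `sqlite3` module) stands in for whatever data sources a deployment actually supports, the `CONNECTORS` registry and function names are hypothetical, and scan scheduling is omitted.

```python
import sqlite3
from typing import Callable

# Hypothetical registry: data source type -> function that opens a connection.
CONNECTORS: dict[str, Callable[[str], sqlite3.Connection]] = {
    "sqlite": sqlite3.connect,
}


def connect(source_type: str, location: str) -> sqlite3.Connection:
    """Open a connection using the communication mode for the source type."""
    try:
        connector = CONNECTORS[source_type]
    except KeyError:
        raise ValueError(f"unsupported data source type: {source_type}") from None
    return connector(location)


def scan_data_dictionary(conn: sqlite3.Connection) -> list:
    """Return one entry per column: table name, field name, declared type."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    dictionary = []
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            dictionary.append({"table": table, "field": name, "type": col_type})
    return dictionary


# Usage with an in-memory database standing in for a real data source.
conn = connect("sqlite", ":memory:")
conn.execute(
    "CREATE TABLE customer (customer_name TEXT, mobile_phone TEXT, branch_code INTEGER)"
)
data_dictionary = scan_data_dictionary(conn)
conn.close()
print(data_dictionary)
```

Supporting another data source type in this sketch would mean registering one more connector and supplying a matching way to read that source's metadata.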
A comprehensive inventory of data assets and the establishment of an appropriate data security classification are necessary prerequisites and the foundation for effective data classification management in financial institutions. Data grading management is the basic work for establishing a unified and complete data life cycle security protection framework, and it supports financial institutions in formulating targeted data security control measures.
It should be noted that the sensitive data processing method and device provided by the invention can be used in the fields of artificial intelligence, blockchain, distributed computing, cloud computing, big data, the Internet of Things, mobile internet, network security, chips, virtual reality, augmented reality, holography, quantum computing, quantum communication, quantum measurement, digital twins, and finance. The above are merely examples, and the application field of the sensitive data processing method and device provided by the present invention is not limited.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar, the embodiments may be referred to one another. Because the device embodiments are substantially similar to the method embodiments, they are described briefly; for relevant details, refer to the corresponding description of the method embodiments.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the units may be implemented in the same software and/or hardware or in a plurality of software and/or hardware when implementing the invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in the embodiments or in some parts of the embodiments.
The sensitive data processing method provided by the invention has been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of these examples is intended only to help in understanding the method and the core idea of the invention. Meanwhile, a person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method for processing sensitive data, comprising:
responding to a sensitive data identification instruction, and determining a data source corresponding to the data identification instruction;
scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source;
detecting the data dictionary to determine each sensitive data in the data dictionary;
determining a data type and a sensitivity level of each of the sensitive data;
generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data;
and outputting the sensitive data statistical information.
2. The method according to claim 1, wherein the scanning the data source according to a preset scanning mode to obtain the data dictionary in the data source comprises:
determining a data source type of the data source;
establishing communication connection with the data source according to a communication mode corresponding to the data source type of the data source;
and under the condition that the communication connection with the data source is successfully established, scanning the data source according to a timing plan in a preset scanning mode to obtain a data dictionary in the data source.
3. The method of claim 1, wherein said detecting the data dictionary to determine each sensitive data in the data dictionary comprises:
acquiring each to-be-detected data contained in the data dictionary;
detecting each data to be detected by using a preset sensitive identification technology to obtain a detection result of each data to be detected; the sensitive identification technology comprises at least one of sensitive data semantic analysis, a sensitive data identification algorithm and sensitive field regular matching;
and determining sensitive data in the data to be detected according to the detection result of each data to be detected.
4. The method of claim 1, wherein determining the data type for each of the sensitive data comprises:
acquiring a field identifier in each sensitive data;
matching the field identification in each piece of sensitive data with each preset sensitive field;
and for each sensitive data, determining the data type corresponding to the sensitive field successfully matched with the field identification of the sensitive data as the data type of the sensitive data.
5. The method of claim 1, wherein determining a sensitivity level for each of the sensitive data comprises:
acquiring a preset grading rule;
and matching each piece of sensitive data with the grading rule to obtain the sensitivity level of each piece of sensitive data.
6. A sensitive data processing apparatus, comprising:
the first determining unit is used for responding to a sensitive data identification instruction and determining a data source corresponding to the data identification instruction;
the scanning unit is used for scanning the data source according to a preset scanning mode to obtain a data dictionary in the data source;
the detection unit is used for detecting the data dictionary to determine each sensitive data in the data dictionary;
the second determining unit is used for determining the data type and the sensitivity level of each piece of sensitive data;
the generating unit is used for generating sensitive data statistical information of the data source according to the data type and the sensitive level of each piece of sensitive data;
and the output unit is used for outputting the sensitive data statistical information.
7. The apparatus of claim 6, wherein the scanning unit comprises:
the first determining subunit is used for determining the data source type of the data source;
the first execution subunit is used for establishing communication connection with the data source according to a communication mode corresponding to the data source type of the data source;
and the scanning subunit is used for scanning the data source according to a timing plan in a preset scanning mode under the condition that the communication connection with the data source is successfully established, so as to obtain a data dictionary in the data source.
8. The apparatus of claim 6, wherein the detection unit comprises:
the first acquisition subunit is used for acquiring each to-be-detected data contained in the data dictionary;
the detection subunit is used for detecting each data to be detected by using a preset sensitive identification technology to obtain a detection result of each data to be detected; the sensitive identification technology comprises at least one of sensitive data semantic analysis, a sensitive data identification algorithm and sensitive field regular matching;
and the second determining subunit is used for determining sensitive data in the data to be detected according to the detection result of each data to be detected.
9. The apparatus of claim 6, wherein the second determining unit comprises:
the second acquisition subunit is used for acquiring the field identifier in each piece of sensitive data;
the matching subunit is used for matching each preset sensitive field by using the field identifier in each piece of sensitive data;
and the third determining subunit is configured to determine, for each piece of the sensitive data, a data type corresponding to a sensitive field that is successfully matched with the field identifier of the sensitive data, as the data type of the sensitive data.
10. The apparatus of claim 6, wherein the second determining unit comprises:
the third acquisition subunit is used for acquiring a preset grading rule;
and the second execution subunit is used for matching each sensitive data with the grading rule to obtain the sensitivity level of each sensitive data.
CN202210767088.4A 2022-07-01 2022-07-01 Sensitive data processing method and device Pending CN115017213A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210767088.4A CN115017213A (en) 2022-07-01 2022-07-01 Sensitive data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210767088.4A CN115017213A (en) 2022-07-01 2022-07-01 Sensitive data processing method and device

Publications (1)

Publication Number Publication Date
CN115017213A true CN115017213A (en) 2022-09-06

Family

ID=83078519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210767088.4A Pending CN115017213A (en) 2022-07-01 2022-07-01 Sensitive data processing method and device

Country Status (1)

Country Link
CN (1) CN115017213A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116186601A (en) * 2022-12-15 2023-05-30 广州光点信息科技股份有限公司 Data classification method and device

Similar Documents

Publication Publication Date Title
WO2019100576A1 (en) Automated test management method and apparatus, terminal device, and storage medium
US11403208B2 (en) Generating a virtualized stub service using deep learning for testing a software module
CN111897806A (en) Big data offline data quality inspection method and device
CN115017213A (en) Sensitive data processing method and device
CN115204733A (en) Data auditing method and device, electronic equipment and storage medium
CN113535577B (en) Application testing method and device based on knowledge graph, electronic equipment and medium
CN112822210B (en) Vulnerability management system based on network assets
CN112950359A (en) User identification method and device
CN115080827B (en) Sensitive data processing method and device
CN114841815A (en) Transaction analysis method and device, electronic equipment and computer-readable storage medium
CN115238292A (en) Data security management and control method and device, electronic equipment and storage medium
CN114722401A (en) Equipment safety testing method, device, equipment and storage medium
CN110532158B (en) Safety evaluation method, device and equipment for operation data and readable storage medium
CN114064498A (en) Script development control method and device, computer equipment and storage medium
CN113791980A (en) Test case conversion analysis method, device, equipment and storage medium
CN117195183B (en) Data security compliance risk assessment system
KR20200123891A (en) Method and apparatus for providing quality information of application
CN116882724B (en) Method, device, equipment and medium for generating business process optimization scheme
CN110414186B (en) Data asset segmentation verification method and device
CN115511450A (en) Electric power marketing inspection method, device, medium and equipment
CN117421248A (en) Software testing method and device, electronic equipment and computer readable medium
CN115170315A (en) Monitoring report generation method and device, storage medium and electronic equipment
CN115865409A (en) Code risk detection method, device, equipment and medium
CN117951027A (en) Method and device for detecting full-quantity project codes, storage medium and electronic equipment
CN117093494A (en) Test processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination