CN114491195A - Feature data identification method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114491195A
Authority
CN
China
Prior art keywords
template
industry
data
classification
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210056829.8A
Other languages
Chinese (zh)
Inventor
于元河
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Alibaba Cloud Computing Ltd
Original Assignee
Alibaba China Co Ltd
Alibaba Cloud Computing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd, Alibaba Cloud Computing Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210056829.8A priority Critical patent/CN114491195A/en
Publication of CN114491195A publication Critical patent/CN114491195A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates

Abstract

The application provides a feature data identification method and device, an electronic device, and a storage medium. The method comprises: determining a target industry template for feature data identification, wherein the target industry template comprises a plurality of template classifications corresponding to the target industry and the template rules of each template classification; and performing feature data identification on authorized asset data according to the plurality of template classifications and the template rules of each template classification, to obtain a feature data classification and grading identification result corresponding to the target industry. In the embodiments of the application, a user only needs to grant the data security platform permission to identify feature data in the asset data; the platform can then identify, classify, and grade the feature data in the authorized asset data according to the target industry template and obtain a feature data identification result. With a simple operation, the user meets the need to classify and grade the feature data in the asset data by industry.

Description

Feature data identification method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for identifying feature data, an electronic device, and a storage medium.
Background
Each industry has its own rules and requirements for classifying and grading feature data. In the related art, vendors generally identify feature data as follows: user data is scanned with built-in identification rules; a rule engine or an open-source identification framework performs full identification and filtering of the user data and determines whether each rule is hit; the data that hits a rule is then written back to a database in a reorganized format, together with information such as the risk level of the feature data and the rule that was hit; and a sample is recorded.
Disclosure of Invention
The embodiment of the application provides a feature data identification method and device, electronic equipment and a storage medium, and aims to solve or partially solve the problem that the feature data identification method in the related technology cannot meet the requirement of a user on the feature data identification based on industry attributes.
In order to solve the above problem, an embodiment of the present application discloses a feature data identification method, including:
determining a target industry template for identifying the characteristic data; the target industry template comprises a plurality of template classifications corresponding to the target industry and template rules of each template classification;
and performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification, to obtain a feature data classification and grading identification result corresponding to the target industry.
Optionally, the determining a target industry template for identifying feature data includes:
displaying a plurality of industry templates;
and in response to an enabling operation for an industry template, determining the industry template corresponding to the enabling operation as the target industry template.
Optionally, before the determining the target industry template for identifying the feature data, the method further comprises:
acquiring a feature data classification and grading data table for each industry;
and generating, according to the feature data classification and grading data table, a corresponding industry template comprising a plurality of template classifications corresponding to the industry and the template rules of each template classification.
Optionally, each template rule is associated with an identification range; the performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification, to obtain a feature data classification and grading identification result corresponding to the target industry, comprises:
for each template rule, determining the corresponding asset data to be identified according to the identification range associated with the template rule;
and performing feature data identification on the asset data to be identified by using the template rule.
Optionally, the method further comprises:
in response to a modification operation for an industry template, the industry template is updated based on the modification operation.
Optionally, the method further comprises:
displaying the feature data classification and grading identification result.
Optionally, after the performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification to obtain a feature data classification and grading identification result corresponding to the target industry, the method further comprises:
in response to a correction operation for the feature data classification and grading identification result, modifying the feature data classification and grading identification result based on the correction operation.
The embodiment of the application also discloses a feature data identification device, the device includes:
the target template determining module is used for determining a target industry template for identifying the characteristic data; the target industry template comprises a plurality of template classifications corresponding to the target industry and template rules of each template classification;
and the feature data identification module is used for performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification, to obtain a feature data classification and grading identification result corresponding to the target industry.
Optionally, the target template determining module includes:
the industry template display module is used for displaying a plurality of industry templates;
and the industry template enabling module is used for determining, in response to an enabling operation for an industry template, the industry template corresponding to the enabling operation as the target industry template.
Optionally, the apparatus further comprises:
the classification and grading data table acquisition module is used for acquiring a feature data classification and grading data table for each industry;
and the industry template generating module is used for generating, according to the feature data classification and grading data table, a corresponding industry template comprising a plurality of template classifications corresponding to the industry and the template rules of each template classification.
Optionally, each template rule is associated with an identification range; the feature data identification module comprises:
the identification range determining module is used for determining, for each template rule, the corresponding asset data to be identified according to the identification range associated with the template rule;
and the range-based identification module is used for performing feature data identification on the asset data to be identified by using the template rule.
Optionally, the apparatus further comprises:
and the industry template modification module is used for responding to modification operation aiming at the industry template and updating the industry template based on the modification operation.
Optionally, the apparatus further comprises:
the feature data identification result display module is used for displaying the feature data classification and grading identification result.
Optionally, the apparatus further comprises:
the feature data identification result correcting module is used for modifying, in response to a correction operation for the feature data classification and grading identification result, the feature data classification and grading identification result based on the correction operation.
The embodiment of the application also discloses an electronic device, which comprises: a processor; and a memory having executable code stored thereon, which when executed, causes the processor to perform a method of feature data identification as described in one or more of the embodiments of the present application.
One or more machine-readable media having stored thereon executable code that, when executed, causes a processor to perform a method of feature data identification as described in one or more of the embodiments of the present application are also disclosed.
Compared with the prior art, the embodiment of the application has the following advantages:
In the embodiment of the application, when feature data is identified, and in particular when the identification must meet requirements based on industry attributes, a target industry template for feature data identification can be determined. The target industry template comprises a plurality of template classifications corresponding to the target industry and the template rules of each template classification. Feature data identification is performed on authorized asset data according to these template classifications and template rules to obtain a feature data classification and grading identification result corresponding to the target industry, so that the identification result meets the feature data identification requirements of the target industry. A user only needs to grant the data security platform permission to identify feature data in the asset data; the platform can then identify, classify, and grade the feature data in the authorized asset data according to the determined target industry template and obtain a feature data identification result. With a simple operation, the user meets the need to classify and grade the feature data in the asset data by industry.
Drawings
FIG. 1 is a system block diagram corresponding to a feature data identification method provided in an embodiment of the present application;
FIG. 2 is a flow chart of the steps of a method for identifying feature data provided in an embodiment of the present application;
FIG. 3 is a diagram illustrating feature classification information of an industry template in a list mode according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating feature classification information of an industry template in a topological mode according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a feature data identification result display page in the embodiment of the present application;
FIG. 6 is a schematic diagram of a feature data identification result correction page in an embodiment of the present application;
FIG. 7 is a block diagram of a feature data identification apparatus provided in an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
With the development of information technology, the basic operations, core processes, business-to-business dealings, and other activities of many enterprises now run on information systems. The information generated during production and operation in each industry is gradually being converted into digital assets of different forms, and these digital assets flow between different information networks and systems. As new technologies such as big data, artificial intelligence, and cloud computing are applied more deeply in various industries, data is gradually changing from an information asset into a factor of production, and its importance is increasingly prominent.
Feature data refers to data that meets specific requirements. For example, feature data may be data whose leakage could cause serious harm to society or individuals. It includes personal privacy data such as names, identity documents, addresses, telephone numbers, bank accounts, mailboxes, passwords, medical information, and educational background, as well as data that an enterprise or social organization should not publish, such as the enterprise's business situation, its network structure, and IP address lists.
Data classification and grading is the basis of data use, management, and security protection, a solid stronghold of data security, and the core of data governance.
Different industries define feature data by different standards and represent the classification and grading of identified feature data in different ways. In the related art, vendors generally identify feature data as follows: user data is scanned with built-in identification rules; a rule engine or an open-source identification framework performs full identification and filtering of the user data and determines whether each rule is hit; the data that hits a rule is then written back to a database in a reorganized format, together with information such as the risk level of the feature data and the rule that was hit; and a sample is recorded.
In view of the above, one of the core inventive points of the feature data identification method provided in the embodiment of the present application is the management of multiple built-in industry templates. A user only needs to grant the data security platform permission to identify feature data in the asset data, and the platform can then identify the feature data of the relevant industry in the authorized asset data according to the currently enabled industry template, which meets the user's need to classify and grade the feature data in the asset data by industry.
Referring to fig. 1, a system block diagram corresponding to a feature data identification method provided in an embodiment of the present application, that is, a system block diagram of a feature data identification system, is shown; the system (or called data security platform) comprises a management and control service module and an engine service module.
A plurality of templates (also called industry templates) are built into the management and control service module. Each industry template has a plurality of template classifications associated with the corresponding industry and the template rules of each template classification. The user can interact with the management and control service module to manage and configure templates, which includes determining the enabled industry template (the enabled industry template is the target industry template) and configuring the scanning range, identification model library, feature data risk level, and so on corresponding to the target industry template. The management and control service module can also display the industry-related multi-level classification template rules of each industry template on a front-end page in list or topology form, so that the user can make customized modifications. Through the template management page displayed at the front end, the user can interact with the management and control service module to manage, enable, disable, view the details of, and copy templates, and thereby determine the target industry template for feature data identification.
The engine service module can pull identification tasks, template rules, and the like from the management and control service module. That is, the engine service module interacts with the management and control service module to obtain the target industry template that has been enabled there, together with the multi-level template classifications of the target industry template, the template rules, and the scanning range, identification model library, feature data risk level, and so on corresponding to each template rule. Using the pulled identification tasks and template rules, the engine service module scans the asset data authorized by the user, identifies feature data through the model algorithm library according to the identification range and associated model defined by each template rule, and reports the feature data identification result to the management and control service module.
The management and control service module receives the feature data identification result reported by the engine service module through the interface layer, writes the structured and the unstructured feature data in the result into the database separately, and displays the result on the front-end page, so that the user can clearly query the feature data identification result based on the target industry template.
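The separate handling of structured and unstructured results described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `source_type` field, the set of values treated as structured, and the in-memory dictionary standing in for the database are not specified in the application.

```python
from collections import defaultdict

# Hypothetical set of source types treated as structured data.
STRUCTURED_SOURCES = {"table", "column"}

def route_results(reported):
    """Group reported hits by whether their source asset is structured."""
    stores = defaultdict(list)
    for hit in reported:
        kind = "structured" if hit["source_type"] in STRUCTURED_SOURCES else "unstructured"
        stores[kind].append(hit)
    return dict(stores)

# Hypothetical report from the engine service module.
reported_hits = [
    {"source_type": "table", "location": "crm.customers.phone", "risk_level": 2},
    {"source_type": "file", "location": "contracts/2021.pdf", "risk_level": 3},
]
routed = route_results(reported_hits)
```

In a real platform each group would presumably go to its own table or store; the sketch only shows the routing decision.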
The specific implementation is described in detail by the specific embodiments below.
Referring to FIG. 2, a flowchart of the steps of a feature data identification method provided in this embodiment of the present application is shown. The execution subject of the method may be the above-mentioned feature data identification system (also called the data security platform), any of various servers or terminal devices with data processing capability, or a device or chip integrated into such servers or terminal devices. For convenience of description, the following takes the data security platform as the execution subject by way of example. The feature data identification method specifically comprises the following steps:
Step 101, determining a target industry template for feature data identification; the target industry template comprises a plurality of template classifications corresponding to the target industry and the template rules of each template classification.
In the embodiment of the application, before identifying feature data, the data security platform needs to determine, from a plurality of industry templates, the target industry template to be used. The target industry template comprises a plurality of template classifications corresponding to the target industry and the template rules of each template classification; that is, it contains the industry's feature data classification and grading rules that are relevant to the user's need to classify and grade asset data by industry.
In this embodiment, the data security platform may provide a plurality of industry templates, each of which contains the template classifications and template rules that the corresponding industry involves in feature data identification. A template is the result of fixing and standardizing the structural pattern of an object and embodies the standardization of structural form; that is, the industry templates provided by the data security platform have a standardized structure. A template rule is a rule of a classification type. A template rule can define a risk level and an associated identification rule, and can further define the identification range of the corresponding feature data; the identification rule may identify feature data based on keywords or regular expressions. Each template rule therefore determines which identification rule and which identification range to use, so that the identification rule is applied to the data within that range and the identified feature data is graded by risk level.
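As an illustration of the structure just described, the following Python sketch models an industry template as template classifications that hold template rules, where each rule carries a risk level, a keyword or regular-expression identification rule, and an identification range. All class and field names, the numeric risk scale, and the phone-number pattern are hypothetical; the application does not prescribe a concrete data model.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TemplateRule:
    """One template rule: risk level, identification rule, identification range."""
    name: str
    risk_level: int                       # hypothetical scale: larger is riskier
    keywords: List[str] = field(default_factory=list)
    pattern: Optional[str] = None         # regular expression, if any
    scope: List[str] = field(default_factory=list)  # assets in range; empty = all
    enabled: bool = True

    def matches(self, text: str) -> bool:
        """Return True if the text hits a keyword or the regular expression."""
        if any(keyword in text for keyword in self.keywords):
            return True
        return bool(self.pattern and re.search(self.pattern, text))

@dataclass
class TemplateClassification:
    name: str
    rules: List[TemplateRule] = field(default_factory=list)

@dataclass
class IndustryTemplate:
    industry: str
    classifications: List[TemplateClassification] = field(default_factory=list)

# Hypothetical rule: an 11-digit number starting with 1.
phone_rule = TemplateRule(name="mobile_phone", risk_level=3, pattern=r"\b1\d{10}\b")
finance_template = IndustryTemplate(
    industry="finance",
    classifications=[TemplateClassification("personal contact", [phone_rule])],
)
```

Keeping the identification range on the rule rather than on the template mirrors the text, where each template rule may define its own range.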
In an example, the target industry template may be enabled by a user selection. Specifically, the determining the target industry template for identifying the feature data may include:
displaying a plurality of industry templates;
and in response to an enabling operation for an industry template, determining the industry template corresponding to the enabling operation as the target industry template.
In this example, the data security platform may display a plurality of industry templates in the front-end page, and the user may interact with the data security platform through the front-end page, select one of the industry templates from the plurality of industry templates as needed, and set the state of the selected industry template to the enabled state, where the industry template set to the enabled state is the target industry template, so that the data security platform performs feature data identification using the target industry template.
In another example, the target industry template may be an industry template that is a default for the data security platform. Specifically, the data security platform may have a default industry template, and the default industry template may be a pre-specified industry template or an industry template used when the data security platform performs the feature data identification last time. When a user uses the data security platform to identify the feature data and does not set the state of a certain industry template into an enabled state, the default industry template of the data security platform is determined as a target industry template, and then the target industry template is adopted to identify the feature data.
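The two ways of determining the target industry template described above (an explicitly enabled template, otherwise the platform default) can be sketched as follows. The function name and the dictionary of enabled states are assumptions for illustration only.

```python
from typing import Dict, Optional

def determine_target_template(states: Dict[str, bool], default: Optional[str]) -> Optional[str]:
    """Return the enabled template name, or fall back to the platform default."""
    for name, enabled in states.items():
        if enabled:
            return name
    return default

# Hypothetical template states as shown on a front-end page.
user_choice = determine_target_template({"finance": False, "medical": True}, "general")
fallback = determine_target_template({"finance": False, "medical": False}, "general")
```

Here `default` could be either a pre-specified template or the one used in the previous identification run, as the text allows both.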
Further, in an optional embodiment of the present application, before the determining the target industry template for identifying the feature data, the method further comprises:
acquiring a feature data classification and grading data table for each industry;
and generating, according to the feature data classification and grading data table, a corresponding industry template comprising a plurality of template classifications corresponding to the industry and the template rules of each template classification.
In this embodiment, the industry templates built into the data security platform may be generated by obtaining the feature data classification and grading data table of each industry and generating the corresponding industry template from that table.
For example, for each industry template, the corresponding feature data classification and grading data table may be read; the table records the feature data classifications of the industry and the risk level of each classification. After the table is read, its data can be processed by classification level to generate a piece of Structured Query Language (SQL). Executing the SQL builds the industry template into the data security platform, and the platform's other industry templates can be obtained in the same way. Different industry templates, and the feature data classifications corresponding to them, share the same data structure; only the specific field contents differ slightly, and every industry template is built in by the same method.
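The following Python sketch shows one way the table-to-SQL step could look, using the standard-library `sqlite3` module as a stand-in database. The table layout, column names, and sample rows are all assumptions; the application does not disclose the actual schema or the generated SQL.

```python
import sqlite3

# Hypothetical rows of a classification and grading data table:
# (level-1 classification, level-2 classification, risk level)
GRADING_TABLE = [
    ("personal information", "identity document number", 3),
    ("personal information", "phone number", 2),
    ("enterprise information", "IP address list", 2),
]

def sql_literal(value):
    """Quote a value as an SQL string literal by doubling single quotes."""
    return "'" + str(value).replace("'", "''") + "'"

def build_template_sql(industry, rows):
    """Generate one SQL script that inserts a template and its rules."""
    statements = [f"INSERT INTO template (industry) VALUES ({sql_literal(industry)});"]
    for level1, level2, risk in rows:
        statements.append(
            "INSERT INTO template_rule (industry, level1, level2, risk_level) VALUES ("
            f"{sql_literal(industry)}, {sql_literal(level1)}, {sql_literal(level2)}, {int(risk)});"
        )
    return "\n".join(statements)

# Hypothetical schema for the built-in templates.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE template (industry TEXT PRIMARY KEY);"
    "CREATE TABLE template_rule (industry TEXT, level1 TEXT, level2 TEXT, risk_level INTEGER);"
)
conn.executescript(build_template_sql("finance", GRADING_TABLE))
rule_count = conn.execute("SELECT COUNT(*) FROM template_rule").fetchone()[0]
```

Because every industry shares the same data structure, the same generator can be reused with a different table per industry. The script is built as text only to mirror the "generate SQL, then execute it" flow in the description; parameterized statements are generally the safer choice when inserting directly.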
Further, the user may also customize an industry template for identifying the feature data, so as to perform the feature data identification based on the customized industry template, and therefore, in an optional embodiment of the present application, the method may further include:
in response to a modification operation for an industry template, the industry template is updated based on the modification operation.
In this embodiment, the user can perform customized modification on the basis of the industry template provided by the data security platform to obtain a customized industry template more meeting the requirement.
In one example, the data security platform may display a plurality of industry templates in a front-end page, each industry template having a corresponding detail control that is used to display the feature classification information of the industry template. The user can select one industry template as needed and trigger its detail control, so that the front-end page displays the feature classification information of the selected industry template in list mode or topology mode. In list mode, as shown in FIG. 3, the front-end page may also display information such as the risk level of the template rule corresponding to the feature classification information and whether the template rule is enabled; the user may modify this information as needed in the list-mode page and then save the modification, thereby turning the selected industry template into a customized industry template. In topology mode, as shown in FIG. 4, the user may drag the feature classification information as needed and then save the modification, thereby turning the selected industry template into a customized industry template.
In another example, the data security platform may display a plurality of industry templates in a front-end page, each industry template having a corresponding copy control, the user may select one of the industry templates as needed, and copy the selected industry template by triggering the copy control of the selected industry template, and display the copied industry template in the front-end page, where the copied industry template includes a detail control; the user can trigger the detail control of the copied industry template, so that the front-end page displays the characteristic classification information of the copied industry template in a list mode or a topology mode. In the list mode, the front-end page may also display information such as a risk level of the template rule corresponding to the feature classification information, whether the template rule is enabled, and the like, as shown in fig. 3, a user may modify the information such as the risk level of the template rule, whether the template rule is enabled, and the like according to a requirement in the page in the list mode, and then store the modification, so as to modify the copied industry template into the customized industry template. In the topology mode, as shown in fig. 4, the user can drag feature classification information according to the requirement and save the feature classification information, so as to modify the copied industry template into a customized industry template.
Generally, for a built-in industry template provided by the data security platform, a user can customize the industry template by modifying the risk level of a template rule and whether the template rule is enabled. For a copied industry template, a user can not only modify the risk level of a template rule and whether the template rule is enabled, but can also rearrange the feature classification information by dragging, adding, deleting, modifying and the like. A built-in industry template can be regarded as one of the factory-set industry templates of the data security platform. A copied industry template can be regarded as a copy obtained by the user copying a built-in industry template. Generally, copied industry templates can be deleted, while built-in industry templates cannot.
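The difference between built-in and copied industry templates described above can be illustrated with a minimal sketch. The class names `TemplateRule` and `IndustryTemplate`, their fields and their methods are hypothetical and are not taken from the application; the sketch only assumes that rule-level edits are allowed on every template while structural edits and deletion are reserved for copies.

```python
from dataclasses import dataclass, field, replace


@dataclass
class TemplateRule:
    # Hypothetical rule record: a name, a risk level and an enabled flag.
    name: str
    risk_level: int
    enabled: bool = True


@dataclass
class IndustryTemplate:
    name: str
    builtin: bool
    # Maps a template classification name to its list of template rules.
    classifications: dict = field(default_factory=dict)

    def copy(self, new_name):
        """Return an editable copy; a copy is never built-in."""
        cloned = {
            cls: [replace(rule) for rule in rules]
            for cls, rules in self.classifications.items()
        }
        return IndustryTemplate(new_name, builtin=False, classifications=cloned)

    def set_risk_level(self, cls, rule_name, level):
        # Allowed for both built-in and copied templates.
        self._find(cls, rule_name).risk_level = level

    def set_enabled(self, cls, rule_name, enabled):
        # Allowed for both built-in and copied templates.
        self._find(cls, rule_name).enabled = enabled

    def add_classification(self, cls):
        # Structural edits are assumed to be reserved for copied templates.
        if self.builtin:
            raise PermissionError("built-in templates cannot be restructured")
        self.classifications.setdefault(cls, [])

    def can_delete(self):
        return not self.builtin

    def _find(self, cls, rule_name):
        for rule in self.classifications[cls]:
            if rule.name == rule_name:
                return rule
        raise KeyError(rule_name)
```

In this sketch a copy owns its own rule objects, so customizing the copy leaves the built-in template unchanged.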
Step 102, performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification, to obtain a feature data classification and grading identification result corresponding to the target industry.
After the target industry template is determined, feature data identification can be performed on authorized asset data according to the plurality of template classifications corresponding to the target industry template and the template rules of each template classification, so as to obtain the feature data classification and grading identification result corresponding to the target industry. The specific process can comprise the following steps: for each template rule of the target industry template, judging whether the authorized asset data hits the template rule according to the identification model, the scanning range and the like configured for that template rule, and, if the authorized asset data hits the template rule, obtaining a corresponding feature data classification and grading identification result. The feature data classification and grading identification result can comprise the identified feature data, the file name corresponding to the feature data and the risk level of the feature data, where the file name corresponding to the feature data can be understood as the position of the feature data in the authorized asset data.
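The hit test described above can be sketched as follows. This is only an illustration under stated assumptions: a regular expression stands in for the identification model configured for a template rule, the asset data is reduced to a mapping from file name to text, and the names `TemplateRule` and `identify` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class TemplateRule:
    name: str
    pattern: str      # assumption: a regex stands in for the identification model
    risk_level: int
    enabled: bool = True


def identify(files, rules):
    """Return one result per hit of an enabled rule.

    `files` maps a file name to its text content (a simplification of
    authorized asset data).
    """
    results = []
    for file_name, text in files.items():
        for rule in rules:
            if not rule.enabled:
                continue
            for match in re.finditer(rule.pattern, text):
                results.append({
                    "feature_data": match.group(0),   # the identified feature data
                    "file_name": file_name,           # where it was found
                    "risk_level": rule.risk_level,    # risk level of the rule hit
                    "rule": rule.name,
                })
    return results
```

Each result carries the three items named in the text: the feature data, the file name and the risk level.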
For example, the data security platform may present an asset data authorization page in the front-end page, and the user may grant the data security platform the right to access the asset data through one-key authorization or account authorization. One-key authorization refers to an authorization mode in which the data security platform automatically generates a read-only account without the user inputting an account and a password; account authorization refers to an authorization mode in which the user grants the data security platform access to the asset data by providing a username and password. The user can also preview the currently authorized, unauthorized and authorization-failed assets on the asset data authorization page. After the authorization setting is completed, that is, once the feature data identification authority of the asset data is in an enabled state, the data security platform can perform feature data classification and grading identification on the authorized asset data.
The asset data types supported by the data security platform in the embodiment of the application comprise: OSS, RDS-PASS, DRDS, PolarDB, OTS, SelfDB, Dataphin, MaxCompute, ADB-PG, ADB-MYSQL, MongoDB, OceanBase, Redis, etc. When the front-end page is displayed, classified display can be carried out according to the asset data types.
Each industry template provided by the data security platform comprises a plurality of template classifications, each template classification comprises at least one template rule, and each template rule can be associated with an identification range. Performing feature data identification on the authorized asset data by adopting the target industry template to obtain the feature data classification and grading identification result further includes:
for each template rule, determining corresponding asset data to be identified according to the identification range associated with the template rule;
and performing characteristic data identification on the asset data to be identified by adopting the template rule.
In this embodiment, after the target industry template is determined, all template rules included in the target industry template may be obtained. For each template rule, the asset data to be identified may be determined according to the identification range associated with the template rule, where the asset data to be identified is the part of the authorized asset data that falls within that identification range. Exemplarily, assume that the authorized asset data includes asset data of three types, namely OSS, RDS and RDS-PASS. When the identification range associated with template rule 1 in the target industry template is the RDS asset type with the database name test, the asset data to be identified corresponding to template rule 1 is the authorized asset data whose asset type is RDS and whose database name is test. Feature data identification is then performed on the asset data to be identified by adopting the identification rule associated with the template rule, to obtain the feature data classification and grading identification result corresponding to the template rule. The results corresponding to all template rules contained in the target industry template are summarized to obtain the feature data classification and grading identification result of identifying the authorized asset data with the target industry template. By setting the identification range associated with a template rule, the amount of data to be scanned during feature data identification can be reduced, and the efficiency of feature data identification is improved.
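The narrowing of authorized asset data by an identification range can be sketched as below, using the RDS / test example from the text. The `Asset` and `Scope` classes and the `assets_to_identify` function are hypothetical names; the sketch assumes that an identification range is expressed as an optional asset type plus an optional database name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    asset_type: str   # e.g. "OSS", "RDS", "RDS-PASS"
    database: str
    name: str


@dataclass(frozen=True)
class Scope:
    # None means "no restriction on this attribute" (an assumption of this sketch).
    asset_type: Optional[str] = None
    database: Optional[str] = None

    def covers(self, asset):
        return (self.asset_type in (None, asset.asset_type)
                and self.database in (None, asset.database))


def assets_to_identify(authorized, scope):
    """Keep only the authorized assets that fall within the rule's scope."""
    return [asset for asset in authorized if scope.covers(asset)]
```

Only the assets returned by `assets_to_identify` would then be passed to the rule, which is how the identification range reduces the amount of data scanned.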
Further, in an optional embodiment of the present application, the method further includes:
and displaying the feature data classification and grading identification result.
In this embodiment, after the feature data classification and grading identification result is obtained, it may be displayed on a front-end page. Specifically, the feature data classification and grading identification result comprises the feature data, the file name corresponding to the feature data and the risk level of the feature data, and the result can be displayed according to a preset display specification; the preset display specification may include sorting by the file name corresponding to the feature data, sorting by the feature data identification time, sorting by the risk level of the feature data, and the like.
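A preset display specification of this kind can be sketched as a lookup from a specification name to a sort key. The field names, the `SORT_SPECS` table and the choice of descending order for identification time and risk level are assumptions made for illustration only.

```python
from operator import itemgetter

# Hypothetical display specifications: (field to sort on, descending?).
SORT_SPECS = {
    "file_name": ("file_name", False),
    "identified_at": ("identified_at", True),   # newest first (assumed)
    "risk_level": ("risk_level", True),         # highest risk first (assumed)
}


def order_results(results, spec):
    """Return the identification results ordered by the chosen specification."""
    field_name, descending = SORT_SPECS[spec]
    return sorted(results, key=itemgetter(field_name), reverse=descending)
```

Because `sorted` is stable, results that compare equal under the chosen specification keep their original relative order.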
Illustratively, the authorized asset data involves both structured and unstructured data types, so the feature data identification result obtained with the target industry template includes structured data and unstructured data. The feature data is written into the corresponding database according to its type; the data related to the feature data identification result is then acquired from the database and displayed, with a display page as shown in fig. 5.
Further, in an optional embodiment of the present application, after the performing feature data identification on the authorized asset data by using the target industry template to obtain the feature data classification and grading identification result corresponding to the target industry, the method further includes:
in response to a correction operation for the feature data classification and grading identification result, correcting the feature data classification and grading identification result based on the correction operation.
In this embodiment, after the feature data classification and grading identification result is obtained, the user may further perform a correction operation on it, so as to obtain a feature data classification and grading identification result that better meets the user's requirements.
For example, a correction page may be displayed in the front-end page, as shown in fig. 6, the user may perform a correction operation on the feature data identification result in the correction page, including correcting the identification rule and the feature data level; the correction page may also show the change of the identification rule before and after correction, and the change of the feature data level before and after correction.
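A correction operation that also records the before-and-after change, as the correction page is described as showing, can be sketched as follows. The `IdentificationResult` class, its fields and its `correct` method are hypothetical; the sketch assumes that a correction may change the identification rule, the feature data level, or both.

```python
from dataclasses import dataclass, field


@dataclass
class IdentificationResult:
    feature_data: str
    file_name: str
    rule: str
    risk_level: int
    # Each entry maps a changed attribute to its (before, after) pair.
    history: list = field(default_factory=list)

    def correct(self, rule=None, risk_level=None):
        """Apply a user correction and record what changed."""
        before = {"rule": self.rule, "risk_level": self.risk_level}
        if rule is not None:
            self.rule = rule
        if risk_level is not None:
            self.risk_level = risk_level
        after = {"rule": self.rule, "risk_level": self.risk_level}
        change = {key: (before[key], after[key])
                  for key in before if before[key] != after[key]}
        self.history.append(change)
        return change
```

The returned dictionary contains only the attributes that actually changed, which is the information a before-and-after view would need.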
Furthermore, after the feature data classification and grading identification result is obtained, a corresponding desensitization rule can be set according to the risk level, so as to desensitize the feature data and further protect the security of user information.
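How a desensitization rule might depend on the risk level can be illustrated with a small sketch. The thresholds, the masking character and the function names `mask_partial`, `mask_full` and `desensitize` are assumptions; the application does not specify any particular desensitization rule.

```python
def mask_partial(value, keep=2):
    """Keep the first and last `keep` characters and mask the rest."""
    if len(value) <= 2 * keep:
        return "*" * len(value)
    return value[:keep] + "*" * (len(value) - 2 * keep) + value[-keep:]


def mask_full(value):
    """Mask every character."""
    return "*" * len(value)


def desensitize(value, risk_level):
    # Assumed mapping: level 3 and above is fully masked, level 2 is
    # partially masked, lower levels are left unchanged.
    if risk_level >= 3:
        return mask_full(value)
    if risk_level == 2:
        return mask_partial(value)
    return value
```

For instance, under these assumed thresholds a level-2 value keeps its first and last two characters, while a level-3 value is hidden entirely.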
In the embodiment of the application, in the process of identifying feature data, and in particular where the feature data must be identified according to industry attributes, a target industry template for identifying the feature data can be determined, the target industry template comprising a plurality of template classifications corresponding to the target industry and the template rules of each template classification. Feature data identification is performed on authorized asset data according to these template classifications and template rules to obtain a feature data classification and grading identification result corresponding to the target industry; the result comprises the feature data, the file names corresponding to the feature data and the risk levels of the feature data, so that the feature data identification result meets the identification requirements of the target industry. In the embodiment of the application, a user only needs to grant the data security platform the feature data identification authority over the asset data; the bottom layer of the data security platform then, in a micro-service manner and according to the determined target industry template, identifies and labels the classified and graded feature data of the target industry in the authorized asset data to obtain a feature data identification result. The feature data identification result is displayed through a front-end page, so that the user can classify and identify the feature data in the asset data by industry with only simple operations.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the application.
Referring to fig. 7, a block diagram of a feature data identification apparatus according to an embodiment of the present application is shown, where the apparatus corresponds to the above feature data identification method embodiment, and specifically may include the following modules:
a target template determination module 701, configured to determine a target industry template for identifying feature data; the target industry template comprises a plurality of template classifications corresponding to the target industry and template rules of each template classification;
the feature data identification module 702 is configured to perform feature data identification on authorized asset data according to a plurality of template classifications corresponding to the target industry and template rules of each template classification, so as to obtain a feature data classification and grading identification result corresponding to the target industry.
In an alternative embodiment, the target template determining module 701 includes:
the industry template display module is used for displaying a plurality of industry templates;
and the industry template starting module is used for responding to starting operation aiming at the industry template and determining the industry template corresponding to the starting operation as a target industry template.
In an alternative embodiment, the apparatus further comprises:
the classification and grading data table acquisition module is used for acquiring a feature data classification and grading data table of each industry;
and the industry template generating module is used for generating a corresponding industry template comprising a plurality of template classifications corresponding to the industry and template rules of each template classification according to the characteristic data classification grading data table.
In an alternative embodiment, the template rule associates an identification scope; the feature data identification module 702 includes:
the identification range determining module is used for determining corresponding asset data to be identified according to the identification range associated with each template rule;
and the identification module based on the identification range is used for identifying the characteristic data of the asset data to be identified by adopting the template rule.
In an alternative embodiment, the apparatus further comprises:
and the industry template modification module is used for responding to modification operation aiming at the industry template and updating the industry template based on the modification operation.
In an alternative embodiment, the apparatus further comprises:
and the feature data identification result display module is used for displaying the feature data classification and grading identification result.
In an alternative embodiment, the apparatus further comprises:
and the characteristic data identification result correcting module is used for responding to correcting operation aiming at the characteristic data classification and grading identification result and correcting the characteristic data classification and grading identification result based on the correcting operation.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiment of the application also discloses an electronic device, which comprises a processor, a memory and a computer program stored on the memory and capable of running on the processor, wherein when the computer program is executed by the processor, the steps of the feature data identification method are realized.
The embodiment of the application also discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the characteristic data identification method are realized.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The feature data identification method, the feature data identification apparatus, the electronic device and the storage medium provided by the application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as a limitation of the present application.

Claims (10)

1. A method for identifying feature data, the method comprising:
determining a target industry template for identifying the characteristic data; the target industry template comprises a plurality of template classifications corresponding to the target industry and template rules of each template classification;
and performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification to obtain a feature data classification and grading identification result corresponding to the target industry.
2. The method of claim 1, wherein determining the target industry template for identifying the characteristic data comprises:
displaying a plurality of industry templates;
and responding to the starting operation aiming at the industry template, and determining the industry template corresponding to the starting operation as a target industry template.
3. The method of claim 2, wherein prior to said determining a target industry template for identifying characteristic data, the method further comprises:
acquiring a feature data classification and grading data table of each industry;
and generating a corresponding industry template comprising a plurality of template classifications corresponding to the industry and template rules of each template classification according to the characteristic data classification grading data table.
4. The method of claim 1, wherein the template rule is associated with an identification range; the performing feature data identification on authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification to obtain the feature data classification and grading identification result corresponding to the target industry comprises:
for each template rule, determining corresponding asset data to be identified according to the identification range associated with the template rule;
and performing characteristic data identification on the asset data to be identified by adopting the template rule.
5. The method of claim 1, further comprising:
in response to a modification operation for an industry template, the industry template is updated based on the modification operation.
6. The method of claim 1, further comprising:
and displaying the characteristic data classification and grading identification result.
7. The method according to claim 1 or 6, wherein after the performing feature data identification on the authorized asset data according to the plurality of template classifications corresponding to the target industry and the template rules of each template classification to obtain the feature data classification and grading identification result corresponding to the target industry, the method further comprises:
and in response to the correction operation aiming at the characteristic data classification grading identification result, modifying the characteristic data classification grading identification result based on the correction operation.
8. An apparatus for identifying feature data, the apparatus comprising:
the target template determining module is used for determining a target industry template for identifying the characteristic data; the target industry template comprises a plurality of template classifications corresponding to the target industry and template rules of each template classification;
and the feature data identification module is used for performing feature data identification on authorized asset data by adopting the plurality of template classifications corresponding to the target industry and the template rules of each template classification to obtain a feature data classification and grading identification result corresponding to the target industry.
9. An electronic device, comprising a processor, a memory and a computer program stored on the memory and being executable on the processor, the computer program, when executed by the processor, implementing the steps of the method for feature data identification according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for feature data identification according to one of claims 1 to 7.
CN202210056829.8A 2022-01-18 2022-01-18 Feature data identification method and device, electronic equipment and storage medium Pending CN114491195A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210056829.8A CN114491195A (en) 2022-01-18 2022-01-18 Feature data identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114491195A true CN114491195A (en) 2022-05-13

Family

ID=81472296

Country Status (1)

Country Link
CN (1) CN114491195A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination