CN115982623A - Data processing method and device, electronic equipment and storage medium - Google Patents

Data processing method and device, electronic equipment and storage medium

Info

Publication number
CN115982623A
Authority
CN
China
Prior art keywords
target data
data
classification
rule
data according
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211737179.XA
Other languages
Chinese (zh)
Inventor
肖龙
王小伟
王建立
朱亚平
杨毅
辛北军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202211737179.XA
Publication of CN115982623A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

An embodiment of the invention provides a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring target data; obtaining a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of grading the target data according to the number of times users access the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data, and the classification rule comprises classifying the target data according to the data type of the target data; grading the target data according to the grading rule; and classifying the target data according to the classification rule. Because the grading and classification rules are written in a declarative language, data can be graded and classified along multiple dimensions.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of information technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
Against the background of big data opening and sharing, data resources are acquired and used in a complex environment. Security management of sensitive data is closely related to national security, social stability, enterprise development, and personal privacy protection. By describing the sensitivity and multi-dimensional characteristics of data, data grading and classification supports the secure opening and sharing of data resources and is an important component of a big data security management system.
Data classification refers to classifying data resources along multiple dimensions such as organization mode, type, content, and attribute relationship. For example, by organization mode, data can be classified according to the database and table hierarchy in which it is located; by type, data can be divided into text, pictures, video, voice, location, messages, and the like. A single classification dimension can be further subdivided over multiple levels; for example, pictures can be further divided into pictures of people, objects, places, and so on.
Data grading and classification management includes defining grades and classes and setting the corresponding rules. The grades and classes assigned to data are the basis for grade- and class-based data access control, security management, and user authorization and approval. When the definitions and rules are simple, grading and classification can be implemented uniformly with a label system. Data grading and classification has traditionally been implemented with labels, but labels have limited expressive power, are costly to maintain, and are hard to understand in use.
Disclosure of Invention
In view of the above problems, embodiments of the present invention are proposed to provide a data processing method, apparatus, electronic device and storage medium that overcome or at least partially solve the above problems.
In order to solve the above problem, in a first aspect, an embodiment of the present invention discloses a data processing method, where the method includes:
acquiring target data;
obtaining a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of grading the target data according to the number of times users access the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data; the classification rule comprises classifying the target data according to the data type of the target data;
grading the target data according to the grading rule;
and classifying the target data according to the classification rule.
Optionally, when the grading rule is to grade the target data according to the number of times users access the target data, the grading the target data according to the grading rule includes:
acquiring a record of the number of times users have accessed the target data and an access count requirement corresponding to each grading level, wherein the grading levels comprise: high-frequency usage data, medium-frequency usage data, and low-frequency usage data;
and determining the level of the target data according to the count record and the access count requirement corresponding to each grading level.
Optionally, the method further includes:
acquiring an update period for the target data classification level;
and updating the level of the target data according to the update period, the count record, and the access count requirement corresponding to each grading level.
Optionally, when the ranking rule is to rank the target data according to the keywords contained in the target data, the ranking the target data according to the ranking rule includes:
obtaining keyword setting corresponding to a hierarchical level, wherein the hierarchical level comprises: low-sensitivity data, medium-sensitivity data and high-sensitivity data;
identifying the target data and extracting the keywords in the target data;
and determining the grading level of the target data according to the keywords in the target data and the keyword setting corresponding to the grading level.
Optionally, when the grading rule is to grade the target data according to the source of the target data, the grading the target data according to the grading rule includes:
acquiring source information of the target data and a source requirement corresponding to each grading level, wherein the source information comprises at least one of a network source, a personal source, and an enterprise source, and the grading levels comprise: low-sensitivity data, medium-sensitivity data, and high-sensitivity data;
and determining the grading level of the target data according to the source information of the target data and the source requirement corresponding to each grading level.
Optionally, the method further includes:
if the target data is set to a plurality of hierarchical levels, designating a highest level among the plurality of hierarchical levels as a hierarchical level of the target data.
Optionally, when the classification rule is to classify the target data according to the data type of the target data, the classifying the target data according to the classification rule includes:
obtaining a type of the target data, wherein the type comprises: at least one of a text type, a picture type, an audio type, and a video type;
classifying the target data of a text type as text data, and classifying the target data of a picture type, or an audio type, or a video type as multimedia data.
In a second aspect, an embodiment of the present invention discloses a data processing apparatus, where the apparatus includes:
the data acquisition module is used for acquiring target data;
the grading and classification rule obtaining module is used for obtaining a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of grading the target data according to the number of times users access the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data; the classification rule comprises classifying the target data according to the data type of the target data;
the grading module is used for grading the target data according to the grading rule;
and the classification module is used for classifying the target data according to the classification rule.
Optionally, when the ranking rule is to rank the target data according to the number of times that the user accesses the target data, the ranking module is specifically configured to:
acquiring a record of the number of times users have accessed the target data and an access count requirement corresponding to each grading level, wherein the grading levels comprise: high-frequency usage data, medium-frequency usage data, and low-frequency usage data;
and determining the level of the target data according to the count record and the access count requirement corresponding to each grading level.
Optionally, the apparatus further includes a hierarchical level update module, where the hierarchical level update module is configured to:
acquiring an update period for the target data classification level;
and updating the level of the target data according to the update period, the count record, and the access count requirement corresponding to each grading level.
Optionally, when the ranking rule is to rank the target data according to the keywords contained in the target data, the ranking module is specifically configured to:
acquiring keyword setting corresponding to a hierarchical level, wherein the hierarchical level comprises: low-sensitivity data, medium-sensitivity data and high-sensitivity data;
identifying the target data and extracting the keywords in the target data;
and determining the grading level of the target data according to the keywords in the target data and the keyword setting corresponding to the grading level.
Optionally, when the grading rule is to grade the target data according to the source of the target data, the grading module is specifically configured to:
acquiring source information of the target data and a source requirement corresponding to each grading level, wherein the source information comprises at least one of a network source, a personal source, and an enterprise source, and the grading levels comprise: low-sensitivity data, medium-sensitivity data, and high-sensitivity data;
and determining the grading level of the target data according to the source information of the target data and the source requirement corresponding to each grading level.
Optionally, the apparatus further comprises a hierarchical level resolution module, and the hierarchical level resolution module is configured to:
if the target data is set to a plurality of hierarchical levels, designating a highest level among the plurality of hierarchical levels as a hierarchical level of the target data.
Optionally, when the classification rule is to classify the target data according to the data type of the target data, the classification module is specifically configured to:
obtaining a type of the target data, wherein the type comprises: at least one of a text type, a picture type, an audio type, and a video type;
classifying the target data of a text type as text data, and classifying the target data of a picture type, or an audio type, or a video type as multimedia data.
In a third aspect, the invention shows an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the data processing method according to the first aspect when executing the program.
In a fourth aspect, the invention shows a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the data processing method according to the first aspect.
The embodiment of the invention has the following advantages:
Target data is acquired; a grading rule and a classification rule written in a declarative language are obtained, wherein the grading rule comprises at least one of grading the target data according to the number of times users access the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data, and the classification rule comprises classifying the target data according to the data type of the target data; the target data is graded according to the grading rule; and the target data is classified according to the classification rule. Because the grading and classification rules are written in a declarative language, data can be graded and classified along multiple dimensions.
Drawings
FIG. 1 is a flow chart of steps of a data processing method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a data processing method according to an embodiment of the present invention;
FIG. 3 is a block diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Labels have limited expressive power, so grading and classifying data through labels restricts the grading and classification methods that can be used. Writing the grading and classification rules in a declarative language allows data to be graded and classified along multiple dimensions.
Referring to fig. 1, a flowchart illustrating steps of a data processing method according to an embodiment of the present invention is shown, where the method specifically includes the following steps:
Step 101: acquire target data.
The target data is user-specified data and may be a database, a table, a file, a column, a record, a picture, a video, an audio item, an element, and the like. The target data may be acquired through a network, or from an individual, an enterprise, or the like. The type and acquisition mode of the target data are not limited in the present application.
Step 102: obtain a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of grading the target data according to the number of times users access the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data; the classification rule comprises classifying the target data according to the data type of the target data.
Data grading refers to grading public data according to the degree of harm that would be caused to national security, social order, public interests, and the legitimate rights and interests of individuals, legal persons, and other organizations (the infringed objects) if the data were compromised (including attack, leakage, tampering, illegal use, and the like). It supports the formulation of security policies for full-life-cycle data management.
Data classification refers to classifying data along multiple dimensions such as organization mode, type, content, and attribute relationship. For example, by organization mode, data can be classified according to the database and table hierarchy in which it is located; by type, data can be divided into text, pictures, video, voice, location, messages, and the like. A single classification dimension can be further subdivided over multiple levels; for example, pictures can be further divided into pictures of people, objects, places, and so on.
In the present application, the grading rule may include at least one of grading the target data according to the number of times users access the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data. The classification rule includes classifying the target data according to the data type of the target data.
In the present application, the grading rule and the classification rule are written in a declarative language; Datalog is one such logic-based programming language. A Datalog program consists of facts and rules, which enable deductive reasoning over a knowledge base: new facts can be derived from known facts by applying the rules. Within the expressive power of the language, a declarative language can define arbitrarily complex grading rules based on attribute features, relationships between data resource objects, graph computations, machine learning models, and the like.
Step 103: grade the target data according to the grading rule.
After the grading rule is obtained, the target data can be graded according to it.
When the grading rule is to grade the target data according to the number of times users access the target data, grading the target data according to the grading rule may include: acquiring a record of the number of times users have accessed the target data and the access count requirement corresponding to each grading level, wherein the grading levels comprise high-frequency usage data, medium-frequency usage data, and low-frequency usage data; and determining the level of the target data according to the count record and the access count requirements.
A higher grading level is set for frequently used data so that backup or other security management can be strengthened, which supports secure use in data management. For example, three access count thresholds may be set: a first threshold, a second threshold, and a third threshold, where the first threshold is smaller than the second and the second is smaller than the third. When the number of user accesses to the target data reaches the first threshold, the target data is determined to be low-frequency usage data; when it reaches the second threshold, medium-frequency usage data; and when it reaches the third threshold, high-frequency usage data. High-frequency usage indicates that the data is important, so more backups can be made of high-frequency usage data, or access qualification can be restricted, to protect data security.
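The three-threshold scheme described above can be sketched in Python as follows. This is only an illustrative sketch: the threshold values and the function name `grade_by_access_count` are assumptions, since the description only requires that the first threshold be smaller than the second and the second smaller than the third.

```python
from typing import Optional

# Hypothetical threshold values; the description only requires
# first < second < third.
FIRST_THRESHOLD = 10
SECOND_THRESHOLD = 100
THIRD_THRESHOLD = 1000


def grade_by_access_count(access_count: int) -> Optional[str]:
    """Map an access count to a usage-frequency grade."""
    if access_count >= THIRD_THRESHOLD:
        return "high-frequency usage data"
    if access_count >= SECOND_THRESHOLD:
        return "medium-frequency usage data"
    if access_count >= FIRST_THRESHOLD:
        return "low-frequency usage data"
    # Below the first threshold the description assigns no grade;
    # returning None is a choice made for this sketch.
    return None
```

The checks run from the highest threshold down so that a count reaching several thresholds receives the highest applicable grade.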
When the grading rule is to grade the target data according to the number of times users access the target data, the grading rule written in the declarative language may be:
Rule 1: freq(X, Y, count<Y>) :- relation(X, 'access', Y, T).
Rule 2: smaller(X, Y1, Y2) :- freq(X, Y1, M), freq(X, Y2, N), M < N.
Rule 3: smaller(X, Y1, Y2) :- freq(X, Y1, M), freq(X, Y2, M), Y1 < Y2.
Rule 4: countsmaller(X, Y1, count<Y2>) :- smaller(X, Y1, Y2).
Rule 5: countsmaller(X, Y1, 0) :- freq(X, Y1, N), not smaller(X, Y1, Y2).
Rule 6: assign(X, K, Y, 'user topK data') :- countsmaller(X, Y, N), N < K.
Here X represents a user account, Y represents target data, and T represents the number of accesses. The records of account X accessing target data Y are acquired, and rule 1 uses the aggregation operation count to obtain the number of times account X has accessed each item of target data. Rules 2 and 3 define a total order on the access counts, with ties broken by comparing the data items themselves. Rules 4 and 5 use aggregation and negation to obtain, for each item of target data, the number of other items that rank above it. Rule 6 labels the K most-accessed items of target data Y for account X as "user topK data".
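To make the effect of rules 1 to 6 concrete, the following Python sketch computes the same top-K result procedurally. It is an illustration under assumptions, not the patent's implementation: the function name `top_k_data` and the representation of the access log as a list of (account, data) pairs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_k_data(access_log: List[Tuple[str, str]], k: int) -> Dict[str, List[str]]:
    """Return, per account, the k most-accessed data items."""
    # Rule 1: count accesses per (account, data) pair.
    freq = Counter(access_log)
    per_account: Dict[str, List[Tuple[str, int]]] = {}
    for (account, data), n in freq.items():
        per_account.setdefault(account, []).append((data, n))

    result: Dict[str, List[str]] = {}
    for account, items in per_account.items():
        kept = []
        for data, n in items:
            # Rules 2-5: count the items that rank above this one, i.e.
            # more accesses, or equal accesses and a larger identifier.
            ahead = sum(1 for other, m in items if (n, data) < (m, other))
            # Rule 6: keep the item if fewer than k items rank above it.
            if ahead < k:
                kept.append((ahead, data))
        result[account] = [data for _, data in sorted(kept)]
    return result
```

Comparing `(count, identifier)` tuples mirrors the total order of rules 2 and 3, so exactly K items are selected even when access counts tie.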
In one embodiment, each time a user accesses the target data, the user account and the number of times that account has accessed the target data may be recorded. An update period for the grading level of the target data is then acquired, and the level of the target data is updated according to the update period, the count record, and the access count requirement corresponding to each grading level. That is, at every preset period, the level of the target data is updated according to the latest record of accesses to the target data and the access count requirements.
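A minimal Python sketch of such periodic regrading is shown below. The class name `FrequencyGrader`, its fields, and the use of seconds as the time unit are assumptions made for illustration; the description does not specify how the update period is represented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FrequencyGrader:
    update_period: float        # time between grade refreshes (assumed seconds)
    thresholds: Dict[str, int]  # grade name -> minimum access count
    counts: Dict[str, int] = field(default_factory=dict)
    grades: Dict[str, Optional[str]] = field(default_factory=dict)
    last_update: float = 0.0

    def record_access(self, data_id: str) -> None:
        """Record one access to the given data item."""
        self.counts[data_id] = self.counts.get(data_id, 0) + 1

    def refresh(self, now: float) -> bool:
        """Regrade all items if a full update period has elapsed."""
        if now - self.last_update < self.update_period:
            return False
        for data_id, n in self.counts.items():
            # Collect every grade whose requirement is met and keep
            # the one with the largest minimum count.
            met = [(minimum, grade)
                   for grade, minimum in self.thresholds.items()
                   if n >= minimum]
            self.grades[data_id] = max(met)[1] if met else None
        self.last_update = now
        return True
```

Grades change only when `refresh` runs after the period has elapsed, so accesses recorded in between do not affect the stored level until the next update.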
In one embodiment, when the grading rule is to grade the target data according to keywords contained in the target data, grading the target data according to the grading rule includes: acquiring the keyword setting corresponding to each grading level, wherein the grading levels comprise low-sensitivity data, medium-sensitivity data, and high-sensitivity data; identifying the target data and extracting the keywords in it; and determining the grading level of the target data according to the keywords in the target data and the keyword setting corresponding to each grading level.
Grading may be based on the content of the target data, for example by setting a sensitivity level according to sensitive information contained in text, pictures, and the like. High-sensitivity data is of high importance, medium-sensitivity data is of ordinary importance, and low-sensitivity data is of low importance. Different keywords can be set for high-, medium-, and low-sensitivity data. For example, the keywords for high-sensitivity data can be set to identity card number, bank card number, contract, salary, and the like, and the keywords for medium-sensitivity data can be set to user account, home address, telephone, sales client, and the like. Data that matches neither the high-sensitivity nor the medium-sensitivity keywords can be determined to be low-sensitivity data, or corresponding keywords can be set for low-sensitivity data as well. Once keywords have been set for each level, the grading level of the target data can be determined from its keywords. For example, when the target data contains the keyword "identity card number", it is determined to be high-sensitivity data. Keywords can generally be identified and extracted by an extraction operator or a model. Because Datalog does not allow a term to be a function or a predicate, the extraction operator or model can be abstracted as the predicate extract(OID, W), where OID (Object Identifier) is an identifier of the target data generated according to a preset rule. Generating an OID for each item of target data facilitates querying and invocation.
When the grading rule is to grade the target data according to the keywords it contains, the sensitivity-level rule may be set as: assign(OID, 'senslevelhi') :- extract(OID, W), senswords(W). Here senswords(W) indicates that the extracted keyword W is a sensitive keyword. Note that extract(OID, W) can be regarded simply as the result set of data content extraction: multiple extract(OID, W) facts indicate that multiple keywords were extracted.
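As a rough illustration of keyword-based grading, the Python sketch below uses the example keywords from the description. The `extract` function is a stand-in based on plain substring matching, not the extraction operator or model the description refers to, and all names are hypothetical.

```python
from typing import Dict, List

# Keyword settings taken from the examples in the description;
# a real deployment would configure its own.
KEYWORDS_BY_GRADE: Dict[str, List[str]] = {
    "high": ["identity card number", "bank card number", "contract", "salary"],
    "medium": ["user account", "home address", "telephone", "sales client"],
}


def extract(text: str) -> List[str]:
    """Stand-in extraction operator: case-insensitive substring matching."""
    lowered = text.lower()
    return [kw for kws in KEYWORDS_BY_GRADE.values() for kw in kws if kw in lowered]


def grade_by_keywords(text: str) -> str:
    """Return the sensitivity grade implied by the keywords found in text."""
    found = set(extract(text))
    # Check the higher grade first so that it wins when both match.
    for grade in ("high", "medium"):
        if found & set(KEYWORDS_BY_GRADE[grade]):
            return grade
    # No high- or medium-sensitivity keyword: treat as low sensitivity.
    return "low"
```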
In one embodiment, when the grading rule is to grade the target data according to the source of the target data, grading the target data according to the grading rule includes: acquiring source information of the target data and the source requirement corresponding to each grading level, wherein the source information comprises at least one of a network source, a personal source, and an enterprise source, and the grading levels comprise low-sensitivity data, medium-sensitivity data, and high-sensitivity data; and determining the grading level of the target data according to the source information and the source requirements.
Grading may be based on the source of the data: a lower sensitivity level is set for public internet information, and a higher sensitivity level is set for business secrets, technical secrets, or personal privacy data. After the source information of the target data is obtained, the grading level of the target data can be determined according to the source information and the source requirement corresponding to each grading level.
When the grading rule is to grade the target data according to its source, the grading rule written in Datalog may be expressed as a conjunctive query program: assign(RID, 'senslevello') :- relation(RID, 'Source', 'Internet'); ?- assign('record 0001', L). Here RID is a variable that matches any data. Datalog is an untyped language, so relation() is used to restrict the range of RID. The rule sets all records from the internet to low-sensitivity data, and 'senslevello' is a constant of type SensLevel.
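Source-based grading can be pictured as a simple lookup, as in the Python sketch below. The specific mapping from source to level is an assumption for illustration: the description states only that public internet information receives a lower level and that business secrets, technical secrets, and personal privacy data receive a higher one.

```python
from typing import Dict

# Hypothetical source requirements for each grading level.
SOURCE_GRADES: Dict[str, str] = {
    "network": "low-sensitivity data",
    "personal": "high-sensitivity data",
    "enterprise": "high-sensitivity data",
}


def grade_by_source(source: str) -> str:
    """Return the grading level configured for a data source."""
    try:
        return SOURCE_GRADES[source]
    except KeyError:
        # Unknown sources are rejected rather than silently graded.
        raise ValueError(f"no source requirement configured for {source!r}")
```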
In one embodiment, the data source of the target data may be further subdivided, for example, the enterprise source may be subdivided into government sources, corporate sources, and the like.
It should be noted that when there are multiple grading rules, different sensitivity levels may be set for the same target data. For example, if target data is graded both by its source and by the keywords it contains, the same target data may be assigned two different sensitivity levels. If the target data is assigned a plurality of grading levels, the highest of them may be designated as the grading level of the target data. For example, if the target data comes from the internet but contains an identity card number, it is determined to be both low-sensitivity data and high-sensitivity data, and high-sensitivity data can be designated as its grading level. Grading-level conflict resolution can be expressed in Datalog with an extended max aggregation operation that specifies a unique sensitivity level, usually the highest level among the different results; it can be expressed as slevel(RID, max<L>) :- assign(RID, L).
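The conflict resolution described here amounts to taking the maximum over an ordered set of levels. A minimal Python sketch follows; the `SensLevel` enum values and the function name `resolve_level` are assumptions for illustration.

```python
from enum import IntEnum
from typing import Iterable


class SensLevel(IntEnum):
    """Sensitivity levels ordered from lowest to highest."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def resolve_level(assigned: Iterable[SensLevel]) -> SensLevel:
    """Pick the highest level among all levels assigned to one data item."""
    return max(assigned)
```

For data from the internet that contains an identity card number, `resolve_level([SensLevel.LOW, SensLevel.HIGH])` yields `SensLevel.HIGH`.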
Step 104: classify the target data according to the classification rule.
After the classification rule is obtained, the target data can be classified according to it.
In one embodiment, when the classification rule is to classify the target data according to the data type of the target data, classifying the target data according to the classification rule includes: acquiring types of target data, wherein the types comprise: at least one of a text type, a picture type, an audio type and a video type; the text type target data is classified into text data, and the picture type, or audio type, or video type target data is classified into multimedia data.
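The type-based classification can be sketched in Python as follows. The type strings and the function name `classify_by_type` are hypothetical; only the mapping of text to text data and of picture, audio, and video to multimedia data comes from the description.

```python
MULTIMEDIA_TYPES = {"picture", "audio", "video"}


def classify_by_type(data_type: str) -> str:
    """Classify target data as text data or multimedia data by its type."""
    if data_type == "text":
        return "text data"
    if data_type in MULTIMEDIA_TYPES:
        return "multimedia data"
    # Types outside the four listed in the description are not handled here.
    raise ValueError(f"unsupported data type: {data_type!r}")
```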
In one embodiment, the classification may be based on the organization of the target data, for example, the software copyright and patent tables of the enterprise are classified into intellectual property data, or the classification may be derived based on the relationship between the target data, for example, the related records are classified according to a plurality of attribute relationship combinations.
In one embodiment, the same data resource can be classified from different perspectives and different dimensions according to business needs. For example, a sharing attribute classification may be set for target data according to a hierarchical level of the target data, low-sensitive data is set as a sharable classification category, and medium-sensitive data and high-sensitive data are set as an unshared classification category; or from the data processing perspective, dividing the target data into native data and derivative data; or the target data is divided into batch data, real-time data and the like according to the data updating mode.
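As one example of deriving a classification from a grading level, the sharing-attribute case can be sketched in Python as below. The level strings and the function name `sharing_category` are assumptions for illustration.

```python
def sharing_category(grading_level: str) -> str:
    """Derive the sharing-attribute category from a sensitivity grading level."""
    if grading_level == "low":
        return "shareable"
    if grading_level in ("medium", "high"):
        return "non-shareable"
    raise ValueError(f"unknown grading level: {grading_level!r}")
```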
Referring to fig. 2, a schematic diagram of a data processing method according to an embodiment of the present invention is shown. After the target data is obtained, the target data includes databases, tables, files, columns, records, pictures, videos, audios, elements, and the like. The target data can be abstracted and subject described, and then classified according to preset classification rules and classification rules. After the target data is classified in a grading way, relevant management regulations can be preset for the target data, and the management regulations comprise safety management, task scheduling and the like. For example, encryption and multiple backups are set for highly sensitive data, or access qualification requirements are set for highly sensitive data. After the target data is classified in a hierarchical manner and management regulation setting is performed, the target data can be stored. The target data can be stored through a graph database, or can be stored through a text database and a relational database.
In the embodiments of the present invention, target data is acquired; a grading rule and a classification rule written in a declarative language are obtained, wherein the grading rule comprises at least one of: grading the target data according to the number of times a user accesses the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data, and the classification rule comprises classifying the target data according to the data type of the target data; the target data is graded according to the grading rule; and the target data is classified according to the classification rule. Because the grading and classification rules are set in a declarative language, data can be graded and classified along multiple dimensions.
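The overall flow can be sketched as follows, assuming the rules are expressed as plain data that a small interpreter evaluates. The specification does not name a particular declarative language, so the rule layout (`field`, `equals`, `in`), the mapping of sources to levels, and the functions `first_match` and `process` are all hypothetical stand-ins.

```python
from typing import Any, Optional

# Hypothetical declarative rule document; in practice it might be loaded
# from a configuration file instead of being written inline.
RULES: dict[str, list[dict[str, Any]]] = {
    "grading": [
        {"field": "source", "equals": "enterprise", "level": "high-sensitivity data"},
        {"field": "source", "equals": "network", "level": "low-sensitivity data"},
    ],
    "classification": [
        {"field": "data_type", "in": ["text"], "category": "text data"},
        {"field": "data_type", "in": ["picture", "audio", "video"],
         "category": "multimedia data"},
    ],
}


def first_match(item: dict, rules: list[dict], result_key: str) -> Optional[str]:
    """Return the result of the first rule whose condition the item satisfies."""
    for rule in rules:
        value = item.get(rule["field"])
        if "equals" in rule and value == rule["equals"]:
            return rule[result_key]
        if "in" in rule and value in rule["in"]:
            return rule[result_key]
    return None  # no rule applied


def process(item: dict, rules: dict = RULES) -> dict:
    """Grade and then classify one item of target data."""
    return {
        "level": first_match(item, rules["grading"], "level"),
        "category": first_match(item, rules["classification"], "category"),
    }
```

Because the rules are data rather than code, a new grading dimension could be added by appending an entry to `RULES` without changing `process`.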
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the embodiments of the present invention are not limited by the described order of acts, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily all required by the embodiments of the present invention.
Referring to fig. 3, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
the data acquisition module is used for acquiring target data;
the grading and classification rule obtaining module is used for obtaining a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of: grading the target data according to the number of times a user accesses the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data; the classification rule comprises classifying the target data according to the data type of the target data;
the grading module is used for grading the target data according to the grading rule;
and the classification module is used for classifying the target data according to the classification rule.
Optionally, when the grading rule is to grade the target data according to the number of times a user accesses the target data, the grading module is specifically configured to:
acquiring an access count record of user access to the target data and an access count requirement corresponding to each grading level, wherein the grading levels comprise: high-frequency usage data, medium-frequency usage data, and low-frequency usage data;
and determining the grading level of the target data according to the access count record and the access count requirement corresponding to each grading level.
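The access-count grading step can be sketched as a threshold lookup. The thresholds below (100 and 10 accesses) and the names `ACCESS_REQUIREMENTS` and `grade_by_access_count` are hypothetical; the specification does not give concrete values.

```python
# Hypothetical access count requirements: the minimum number of accesses
# needed for each level, ordered from the most demanding level down.
ACCESS_REQUIREMENTS = [
    ("high-frequency usage data", 100),
    ("medium-frequency usage data", 10),
    ("low-frequency usage data", 0),
]


def grade_by_access_count(access_count: int) -> str:
    """Return the first level whose minimum access count is met."""
    for level, minimum in ACCESS_REQUIREMENTS:
        if access_count >= minimum:
            return level
    # Only reachable for negative counts, which are not valid records.
    raise ValueError(f"invalid access count: {access_count}")
```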
Optionally, the apparatus further includes a grading level updating module, where the grading level updating module is configured to:
acquiring an update period for the grading level of the target data;
and updating the grading level of the target data according to the update period, the access count record, and the access count requirement corresponding to each grading level.
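Periodic updating of the grading level might look like the sketch below, which re-grades a record only once the update period has elapsed. The record layout (`level`, `access_count`, `graded_at`) and the function `regrade_if_due` are hypothetical, and the grading function is passed in as a parameter so that the sketch stays independent of any particular threshold scheme.

```python
from datetime import datetime, timedelta
from typing import Callable


def regrade_if_due(
    record: dict,
    now: datetime,
    period: timedelta,
    grade: Callable[[int], str],
) -> dict:
    """Return an updated copy of the record if the update period has elapsed.

    `record` is assumed to hold 'level', 'access_count' and 'graded_at'.
    """
    if now - record["graded_at"] >= period:
        return {**record, "level": grade(record["access_count"]), "graded_at": now}
    return record  # not yet due; keep the existing level
```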
Optionally, when the grading rule is to grade the target data according to the keywords contained in the target data, the grading module is specifically configured to:
acquiring the keyword settings corresponding to each grading level, wherein the grading levels comprise: low-sensitivity data, medium-sensitivity data, and high-sensitivity data;
identifying the target data and extracting the keywords in the target data;
and determining the grading level of the target data according to the keywords in the target data and the keyword settings corresponding to each grading level.
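A minimal sketch of keyword-based grading for text data follows. The keyword lists, the simple word tokenizer, and the names `KEYWORD_SETTINGS`, `extract_keywords`, and `grade_by_keywords` are hypothetical; identifying keywords in pictures, audio, or video would require a recognition step that is not shown here.

```python
import re

# Hypothetical keyword settings per grading level.
KEYWORD_SETTINGS = {
    "high-sensitivity data": {"password", "passport"},
    "medium-sensitivity data": {"salary", "address"},
}


def extract_keywords(text: str) -> set:
    """Very simple keyword extraction: lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def grade_by_keywords(text: str) -> str:
    """Check the most sensitive level first; default to low sensitivity."""
    words = extract_keywords(text)
    for level in ("high-sensitivity data", "medium-sensitivity data"):
        if words & KEYWORD_SETTINGS[level]:
            return level
    return "low-sensitivity data"
```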
Optionally, when the grading rule is to grade the target data according to the source of the target data, the grading module is specifically configured to:
obtaining source information of the target data and the source requirement corresponding to each grading level, wherein the source information comprises at least one of a network source, a personal source, and an enterprise source, and the grading levels comprise: low-sensitivity data, medium-sensitivity data, and high-sensitivity data;
and determining the grading level of the target data according to the source information of the target data and the source requirement corresponding to each grading level.
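Source-based grading reduces to a lookup once the source requirements are known. The mapping below, which pairs each source with a level, is purely an assumption for illustration, since the specification does not state which source corresponds to which level; `SOURCE_REQUIREMENTS` and `grade_by_source` are hypothetical names.

```python
# Assumed mapping from source to grading level (not given in the specification).
SOURCE_REQUIREMENTS = {
    "network": "low-sensitivity data",
    "personal": "medium-sensitivity data",
    "enterprise": "high-sensitivity data",
}


def grade_by_source(source: str) -> str:
    """Look up the grading level configured for a data source."""
    try:
        return SOURCE_REQUIREMENTS[source]
    except KeyError:
        raise ValueError(f"unknown source: {source!r}") from None
```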
Optionally, the apparatus further comprises a grading level resolution module, and the grading level resolution module is configured to:
when the target data has been assigned a plurality of grading levels, designating the highest of the plurality of grading levels as the grading level of the target data.
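When several grading rules assign different levels to the same data, taking the highest level can be sketched as a `max` over a rank table. The numeric ranks and the names `LEVEL_RANK` and `resolve_level` are hypothetical.

```python
from typing import Iterable

# Hypothetical ordering of sensitivity levels, lowest to highest.
LEVEL_RANK = {
    "low-sensitivity data": 1,
    "medium-sensitivity data": 2,
    "high-sensitivity data": 3,
}


def resolve_level(levels: Iterable[str]) -> str:
    """Return the highest of the levels assigned to one item of target data."""
    candidates = list(levels)
    if not candidates:
        raise ValueError("no grading level was assigned")
    return max(candidates, key=LEVEL_RANK.__getitem__)
```

Choosing the highest level is the conservative option: data flagged as sensitive by any one rule is handled as sensitive overall.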
Optionally, when the classification rule is to classify the target data according to the data type of the target data, the classification module is specifically configured to:
acquiring the type of the target data, wherein the type comprises at least one of a text type, a picture type, an audio type, and a video type;
classifying target data of the text type as text data, and classifying target data of the picture type, audio type, or video type as multimedia data.
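Type-based classification could be sketched as below. Inferring the data type from a file name with the Python standard library `mimetypes` module is an assumption made for this example; the specification does not say how the type is obtained, and `classify_file` is a hypothetical name.

```python
import mimetypes


def classify_file(filename: str) -> str:
    """Classify a file as text data or multimedia data from its MIME type."""
    mime, _encoding = mimetypes.guess_type(filename)
    major = mime.split("/")[0] if mime else None
    if major == "text":
        return "text data"
    # MIME uses "image" for what the description calls the picture type.
    if major in {"image", "audio", "video"}:
        return "multimedia data"
    raise ValueError(f"unsupported or unknown data type for {filename!r}")
```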
Because the apparatus embodiment is substantially similar to the method embodiment, it is described only briefly; for relevant details, refer to the corresponding description of the method embodiment.
An embodiment of the present invention further provides an electronic device, including:
a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements each process of the data processing method embodiment and can achieve the same technical effect; to avoid repetition, the details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements each process of the data processing method embodiment and can achieve the same technical effect; to avoid repetition, the details are not described here again.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of another identical element in a process, method, article, or terminal device that comprises the element.
The data processing method, data processing apparatus, electronic device, and storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is intended only to aid understanding of the method and core idea of the present invention. Meanwhile, a person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A method of data processing, comprising:
acquiring target data;
obtaining a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of: grading the target data according to the number of times a user accesses the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data; the classification rule comprises classifying the target data according to the data type of the target data;
grading the target data according to the grading rule;
and classifying the target data according to the classification rule.
2. The data processing method of claim 1, wherein when the grading rule is to grade the target data according to the number of times a user accesses the target data, the grading the target data according to the grading rule comprises:
acquiring an access count record of user access to the target data and an access count requirement corresponding to each grading level, wherein the grading levels comprise: high-frequency usage data, medium-frequency usage data, and low-frequency usage data;
and determining the grading level of the target data according to the access count record and the access count requirement corresponding to each grading level.
3. The data processing method of claim 2, wherein the method further comprises:
acquiring an update period for the grading level of the target data;
and updating the grading level of the target data according to the update period, the access count record, and the access count requirement corresponding to each grading level.
4. The data processing method according to claim 1, wherein when the grading rule is to grade the target data according to keywords contained in the target data, the grading the target data according to the grading rule comprises:
acquiring the keyword settings corresponding to each grading level, wherein the grading levels comprise: low-sensitivity data, medium-sensitivity data, and high-sensitivity data;
identifying the target data and extracting the keywords in the target data;
and determining the grading level of the target data according to the keywords in the target data and the keyword settings corresponding to each grading level.
5. The data processing method of claim 1, wherein when the grading rule is to grade the target data according to the source of the target data, the grading the target data according to the grading rule comprises:
acquiring source information of the target data and the source requirement corresponding to each grading level, wherein the source information comprises at least one of a network source, a personal source, and an enterprise source, and the grading levels comprise: low-sensitivity data, medium-sensitivity data, and high-sensitivity data;
and determining the grading level of the target data according to the source information of the target data and the source requirement corresponding to each grading level.
6. The data processing method of claim 1, wherein the method further comprises:
if the target data has been assigned a plurality of grading levels, designating the highest of the plurality of grading levels as the grading level of the target data.
7. The data processing method of claim 1, wherein when the classification rule is to classify the target data according to a data type of the target data, the classifying the target data according to the classification rule comprises:
obtaining a type of the target data, wherein the type comprises: at least one of a text type, a picture type, an audio type and a video type;
classifying the target data of a text type as text data, and classifying the target data of a picture type, or an audio type, or a video type as multimedia data.
8. A data processing apparatus, characterized in that the apparatus comprises:
the data acquisition module is used for acquiring target data;
the grading and classification rule obtaining module is used for obtaining a grading rule and a classification rule written in a declarative language, wherein the grading rule comprises at least one of: grading the target data according to the number of times a user accesses the target data, grading the target data according to keywords contained in the target data, and grading the target data according to the source of the target data; the classification rule comprises classifying the target data according to the data type of the target data;
the grading module is used for grading the target data according to the grading rule;
and the classification module is used for classifying the target data according to the classification rule.
9. An electronic device, comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the data processing method according to any one of claims 1 to 7.
CN202211737179.XA 2022-12-30 2022-12-30 Data processing method and device, electronic equipment and storage medium Pending CN115982623A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211737179.XA CN115982623A (en) 2022-12-30 2022-12-30 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211737179.XA CN115982623A (en) 2022-12-30 2022-12-30 Data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115982623A true CN115982623A (en) 2023-04-18

Family

ID=85958494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211737179.XA Pending CN115982623A (en) 2022-12-30 2022-12-30 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115982623A (en)

Similar Documents

Publication Publication Date Title
Al-Saggaf et al. Data mining and privacy of social network sites’ users: Implications of the data mining problem
US20140136941A1 (en) Focused Personal Identifying Information Redaction
US20220100899A1 (en) Protecting sensitive data in documents
Sheykhkanloo Employing neural networks for the detection of SQL injection attack
US20090259622A1 (en) Classification of Data Based on Previously Classified Data
CN107211000B (en) System and method for implementing a privacy firewall
CN112417492A (en) Service providing method based on data classification and classification
CN110263817B (en) Risk grade classification method and device based on user account
CN109492401B (en) Content carrier risk detection method, device, equipment and medium
CN108234392B (en) Website monitoring method and device
Ferwerda et al. Predicting musical sophistication from music listening behaviors: a preliminary study
Reedy Strategic leadership in digital evidence: What executives need to know
CN110532773B (en) Malicious access behavior identification method, data processing method, device and equipment
CN116340989A (en) Data desensitization method and device, electronic equipment and storage medium
CN115982623A (en) Data processing method and device, electronic equipment and storage medium
Gupta et al. Security measures in data mining
CN111428037B (en) Method for analyzing matching performance of behavior policy
Breitinger et al. Sharing datasets for Digital Forensic: A novel taxonomy and legal concerns
US20180150752A1 (en) Identifying artificial intelligence content
KumarTripathi Discrimination prevention with classification and privacy preservation in data mining
Tahir et al. Considering Context in Procedures of Personal Data Discovery
CN110969333A (en) User behavior data processing method and device
Nadler et al. Governance and Regulations Implications on Machine Learning
CN116668106B (en) Threat information processing system and method
Adinehnia et al. Effective mining on large databases for intrusion detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination