CN111782736A - Data classification management method and system

Publication number
CN111782736A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010696437.9A
Other languages
Chinese (zh)
Other versions
CN111782736B (en)
Inventor
郑敏
吴呈良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Original Assignee
Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-07-20
Filing date
2020-07-20
Publication date
2020-10-16
Application filed by Chaozhou Zhuoshu Big Data Industry Development Co Ltd
Priority to CN202010696437.9A
Publication of CN111782736A
Application granted
Publication of CN111782736B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results

Abstract

The invention relates to the field of data management, and in particular provides a method and a system for data classification management. The user labels unlabeled tables: it is checked whether an unlabeled table exists; if one exists, it is checked whether predefined labeling should be attempted; if so, the table is labeled automatically, and if not, it is labeled manually. If no unlabeled table exists, the labels or the label classification are refined as needed and the tables are displayed. Compared with the prior art, the partly automatic labeling function effectively reduces the manual effort required in data management, the three-level structure of table, label, and label classification lets the user view existing data by category along multiple dimensions, and the warning-feedback mode continuously improves existing data management.

Description

Data classification management method and system
Technical Field
The invention relates to the field of data management, and particularly provides a method and a system for data classification management.
Background
With the development of computer science and information science, enterprises pay increasing attention to building information systems. As these systems mature, they generate massive amounts of data in daily operation.
The data generated by different information systems may differ in organization and structure, and may even include tables of unclear purpose or temporary tables. Because enterprise data management lacks corresponding management means, enterprises often struggle to use their data effectively, and data that has in fact been discarded may keep occupying system resources for a long time because it was never labeled.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a data classification management method with strong practicability.
The invention further aims to provide a system for data classification management, which is reasonable in design, safe and applicable.
The technical solution adopted by the invention to solve the technical problem is as follows:
A data classification management method in which the user labels unlabeled tables: it is checked whether an unlabeled table exists; if one exists, it is checked whether predefined labeling should be attempted; if so, automatic labeling is performed, and if not, manual labeling is performed;
if no unlabeled table exists, the labels or the label classification are refined as needed, and the tables are displayed.
Further, before the user labels the unlabeled tables, a label correspondence table is first created in the database to store the correspondence between data tables and labels, and a predefined label table is created in which the correspondence between table name prefixes and labels is initialized according to specific business rules.
Preferably, the field names included in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names included in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
Further, to check whether an unlabeled table exists, the tables of the database to be managed are inspected at a period defined by the user, and the system table list is compared with the label correspondence table to determine whether any table is unlabeled;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been performed;
if so, the table is labeled using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been performed, the user is notified to apply custom labels to the tables that remain unlabeled, and the correspondence between every table and its predefined or custom labels is recorded in the label correspondence table.
Further, after the initial labeling is finished, the user refines and classifies the existing labels, merges and unifies labels through label standardization, and updates the label name field in the label correspondence table.
A data classification management system comprises an inspection module, a marking module, a warning module, and a table display module. The inspection module inspects the tables in the database to be managed at a period defined by the user;
the marking module labels the tables; the warning module notifies the user to apply custom labels to unlabeled tables; and the table display module is used to view the labeled tables by label.
Further, a label correspondence table is created in the database to store the correspondence between data tables and labels, and a predefined label table is created in the database in which the correspondence between table name prefixes and labels is initialized according to specific business rules.
Preferably, the field names included in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names included in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
Further, the inspection module inspects the tables in the database to be managed at a period defined by the user, and compares the system table list with the label correspondence table to check whether an unlabeled table exists;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been performed;
if so, the marking module labels the table using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been performed, the warning module notifies the user to apply custom labels to the unlabeled table;
and the marking module records the correspondence between every table and its predefined or custom labels in the label correspondence table.
Further, after the initial labeling is completed, the user refines and classifies the existing labels, merges and unifies labels through label standardization, and updates the label name field in the label correspondence table.
Compared with the prior art, the data classification management method and system have the following outstanding beneficial effects:
(1) The partly automatic labeling function effectively reduces the manual effort required in data management.
(2) The three-level structure of table, label, and label classification lets the user conveniently view existing data by category along multiple dimensions, and the warning-feedback mode continuously improves existing data management.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a data classification management system.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to better understand the technical solutions of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A preferred embodiment is given below:
As shown in Fig. 1, the data classification management method of this embodiment is as follows: the user labels unlabeled tables; it is checked whether an unlabeled table exists; if one exists, it is checked whether predefined labeling should be attempted; if so, automatic labeling is performed, and if not, manual labeling is performed. If no unlabeled table exists, the labels or the label classification are refined as needed, and the tables are displayed.
The specific process is as follows:
Before the user labels the unlabeled tables, a label correspondence table R_TABLE_LABEL is first created in the database to store the correspondence between data tables and labels, and a predefined label table R_PREFIX_LABEL is created in which the correspondence between table name prefixes and labels is initialized according to specific business rules.
The field names included in the label correspondence table R_TABLE_LABEL are TABLE_NAME, LABEL_NAME, LABEL_TYPE, and PREFIX_CHECK. The field names included in the predefined label table R_PREFIX_LABEL are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
To check whether an unlabeled table exists, the tables of the database to be managed are inspected at a period defined by the user, and the system table list is compared with the label correspondence table to determine whether any table is unlabeled.
If an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been performed.
If so, the table is labeled using the predefined label table R_PREFIX_LABEL.
If the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been performed, the user is notified to apply custom labels to the tables that remain unlabeled, and the correspondence between every table and its predefined or custom labels is recorded in the label correspondence table.
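The branch described above can be summarized as a small decision function. The following Python sketch is an illustration only and is not taken from the patent: the function name next_action and the returned strings are hypothetical, and it assumes that an empty or missing PREFIX_CHECK value means the predefined labeling flow has not yet run.

```python
from typing import Optional


def next_action(label_name: Optional[str], prefix_check: Optional[str]) -> str:
    """Decide the next step for one table (hypothetical helper).

    label_name   -- current LABEL_NAME of the table, or None if it has no label
    prefix_check -- current PREFIX_CHECK value, or None/"" if never set
    """
    if label_name:
        return "none"          # table is already labeled
    if not prefix_check:
        return "auto_label"    # predefined flow not yet run: try prefix rules
    return "notify_user"       # predefined flow already ran: ask for a custom label
```

For example, a table with no label and an empty PREFIX_CHECK yields "auto_label", while a table with no label and a non-empty PREFIX_CHECK yields "notify_user".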
After the initial labeling is finished, the user refines and classifies the existing labels, merges and unifies labels through label standardization, and updates the label name field in the label correspondence table.
The system that implements the method is as follows:
A data classification management system comprises an inspection module, a marking module, a warning module, and a table display module. The inspection module inspects the tables in the database to be managed at a period defined by the user.
The marking module labels the tables; the warning module notifies the user to apply custom labels to unlabeled tables; and the table display module is used to view the labeled tables by label.
The specific steps are as follows:
(1) A label correspondence table R_TABLE_LABEL is created in the database to store the correspondence between data tables and labels.
Field name   | Data type      | Note
TABLE_NAME   | Character type | Name of the data table
LABEL_NAME   | Character type | Label name
LABEL_TYPE   | Character type | Label type
PREFIX_CHECK | Character type | Whether the predefined labeling flow has been performed
(2) A predefined label table R_PREFIX_LABEL is created in the database, and the correspondence between table name prefixes and labels is initialized according to specific business rules.
Field name   | Data type      | Note
TABLE_PREFIX | Character type | Table name prefix
LABEL_NAME   | Character type | Label name
LABEL_TYPE   | Character type | Label type
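Steps (1) and (2) can be sketched as follows. This is a minimal illustration, not the patented implementation: the patent does not name a database product, so SQLite (through Python's standard sqlite3 module) is an assumption, "character type" is mapped to TEXT, and the function names create_label_tables and init_prefix_rules are hypothetical.

```python
import sqlite3


def create_label_tables(conn: sqlite3.Connection) -> None:
    """Create R_TABLE_LABEL and R_PREFIX_LABEL with the fields listed above."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS R_TABLE_LABEL (
            TABLE_NAME   TEXT NOT NULL,  -- name of the data table
            LABEL_NAME   TEXT,           -- label name
            LABEL_TYPE   TEXT,           -- label type
            PREFIX_CHECK TEXT            -- set once the predefined flow has run
        );
        CREATE TABLE IF NOT EXISTS R_PREFIX_LABEL (
            TABLE_PREFIX TEXT NOT NULL,  -- table name prefix
            LABEL_NAME   TEXT NOT NULL,  -- label name
            LABEL_TYPE   TEXT            -- label type
        );
        """
    )


def init_prefix_rules(conn: sqlite3.Connection, rules) -> None:
    """Insert (prefix, label name, label type) rows derived from business rules."""
    conn.executemany(
        "INSERT INTO R_PREFIX_LABEL (TABLE_PREFIX, LABEL_NAME, LABEL_TYPE) "
        "VALUES (?, ?, ?)",
        rules,
    )
    conn.commit()
```

A caller might open a connection with sqlite3.connect(":memory:"), call create_label_tables(conn), and then pass rules such as ("TMP_", "temporary", "lifecycle") to init_prefix_rules; these rule values are invented for illustration.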
(3) The inspection module inspects the tables in the database to be managed at a period defined by the user, and compares the system table list with the label correspondence table R_TABLE_LABEL to check whether an unlabeled table exists.
If an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been performed.
If so, the marking module labels the table using the predefined label table R_PREFIX_LABEL. If the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been performed, the warning module notifies the user to apply custom labels to the unlabeled table. The marking module records the correspondence between every table and its predefined or custom labels in the label correspondence table R_TABLE_LABEL.
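One inspection pass of step (3) might look like the sketch below. It rests on several assumptions that are not in the patent: SQLite is used and its sqlite_master catalog stands in for the "system table list"; the value 'Y' marks that the predefined flow has run; a table whose prefix matches no rule is reported to the user in the same pass; and the returned list of table names stands in for the warning module. The names patrol and record_custom_label are hypothetical.

```python
import sqlite3

# Assumed schema, following the two field tables in the description.
SCHEMA = """
CREATE TABLE IF NOT EXISTS R_TABLE_LABEL (
    TABLE_NAME TEXT NOT NULL, LABEL_NAME TEXT, LABEL_TYPE TEXT, PREFIX_CHECK TEXT);
CREATE TABLE IF NOT EXISTS R_PREFIX_LABEL (
    TABLE_PREFIX TEXT NOT NULL, LABEL_NAME TEXT NOT NULL, LABEL_TYPE TEXT);
"""
META_TABLES = {"R_TABLE_LABEL", "R_PREFIX_LABEL"}


def patrol(conn: sqlite3.Connection):
    """Run one inspection pass; return (auto_labeled, needs_user) table names."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    tables = [n for n in names
              if n not in META_TABLES and not n.startswith("sqlite_")]
    rules = conn.execute(
        "SELECT TABLE_PREFIX, LABEL_NAME, LABEL_TYPE FROM R_PREFIX_LABEL"
    ).fetchall()

    auto_labeled, needs_user = [], []
    for table in tables:
        rows = conn.execute(
            "SELECT LABEL_NAME, PREFIX_CHECK FROM R_TABLE_LABEL "
            "WHERE TABLE_NAME = ?", (table,)).fetchall()
        if any(label for label, _ in rows):
            continue  # already labeled
        if any(check for _, check in rows):
            needs_user.append(table)  # predefined flow ran before, still no label
            continue
        # PREFIX_CHECK is empty: run the predefined (prefix) labeling flow once.
        conn.execute("DELETE FROM R_TABLE_LABEL WHERE TABLE_NAME = ?", (table,))
        matches = [(table, name, ltype) for prefix, name, ltype in rules
                   if table.startswith(prefix)]
        if matches:
            conn.executemany(
                "INSERT INTO R_TABLE_LABEL VALUES (?, ?, ?, 'Y')", matches)
            auto_labeled.append(table)
        else:
            conn.execute(
                "INSERT INTO R_TABLE_LABEL VALUES (?, NULL, NULL, 'Y')", (table,))
            needs_user.append(table)
    conn.commit()
    return auto_labeled, needs_user


def record_custom_label(conn, table, label_name, label_type):
    """Store a label the user chose for a table the prefix rules did not cover."""
    conn.execute(
        "UPDATE R_TABLE_LABEL SET LABEL_NAME = ?, LABEL_TYPE = ? "
        "WHERE TABLE_NAME = ? AND LABEL_NAME IS NULL",
        (label_name, label_type, table))
    conn.commit()
```

The user-defined inspection period is not shown; in practice patrol could be invoked by whatever scheduler the deployment already uses.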
(4) After the initial labeling is completed, the user can refine and classify the existing labels. Refining labels mainly concerns label standardization: because the data may be generated by several information systems, and users in different business domains may call the same entity by different names, custom labeling can produce labels that differ in name but have the same meaning. In this step such labels are merged and unified through label standardization, and the label name field in the label correspondence table R_TABLE_LABEL is updated. Classifying labels mainly concerns dividing labels into types, which narrows the screening range and improves query efficiency in actual use.
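The label merge in step (4) amounts to rewriting the LABEL_NAME field for a set of synonymous labels. The sketch below is one possible way to do that, assuming SQLite and an existing R_TABLE_LABEL table; the function name merge_labels and the example label values are hypothetical.

```python
import sqlite3


def merge_labels(conn: sqlite3.Connection, synonyms, standard_name: str) -> int:
    """Rename every label listed in `synonyms` to `standard_name`.

    Returns the number of R_TABLE_LABEL rows that were updated.
    """
    synonyms = list(synonyms)
    if not synonyms:
        return 0
    placeholders = ", ".join("?" for _ in synonyms)
    cur = conn.execute(
        f"UPDATE R_TABLE_LABEL SET LABEL_NAME = ? "
        f"WHERE LABEL_NAME IN ({placeholders})",
        [standard_name, *synonyms],
    )
    conn.commit()
    return cur.rowcount
```

For instance, merge_labels(conn, ["client", "clientele"], "customer") would unify three spellings of the same business entity under one label name. A table that carried two of the synonyms would end up with two identical rows, so a real implementation might also remove such duplicates.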
(5) The labeled tables are viewed by label through the table display module.
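A table display of this kind can be approximated by grouping the label correspondence table by label type and label name, which mirrors the three levels of table, label, and label classification. The sketch below assumes SQLite and an existing R_TABLE_LABEL table; the function name tables_by_label and the nested-dictionary return shape are hypothetical choices, not part of the patent.

```python
import sqlite3
from collections import defaultdict


def tables_by_label(conn: sqlite3.Connection, label_type=None):
    """Return {LABEL_TYPE: {LABEL_NAME: [TABLE_NAME, ...]}} for labeled tables.

    Passing `label_type` restricts the result to one label classification,
    which narrows the screening range as described in step (4).
    """
    sql = ("SELECT LABEL_TYPE, LABEL_NAME, TABLE_NAME FROM R_TABLE_LABEL "
           "WHERE LABEL_NAME IS NOT NULL")
    params = []
    if label_type is not None:
        sql += " AND LABEL_TYPE = ?"
        params.append(label_type)
    sql += " ORDER BY LABEL_TYPE, LABEL_NAME, TABLE_NAME"

    tree = defaultdict(lambda: defaultdict(list))
    for ltype, lname, table in conn.execute(sql, params):
        tree[ltype][lname].append(table)
    # Convert to plain dicts so callers get an ordinary nested structure.
    return {ltype: dict(labels) for ltype, labels in tree.items()}
```

Rows whose LABEL_NAME is still empty are skipped, since those tables are handled by the warning flow rather than the display.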
The above embodiments are only specific examples of the present invention, and the scope of the present invention includes but is not limited to the above embodiments, and any suitable changes or substitutions that are consistent with the method and system claims for data classification management and are made by those skilled in the art should fall within the scope of the present invention.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A data classification management method, characterized in that the user labels unlabeled tables: it is checked whether an unlabeled table exists; if one exists, it is checked whether predefined labeling should be attempted; if so, automatic labeling is performed, and if not, manual labeling is performed;
if no unlabeled table exists, the labels or the label classification are refined as needed, and the tables are displayed.
2. The data classification management method according to claim 1, wherein before the user labels the untagged table, firstly, a label correspondence table is created in the database to store the correspondence between the data table and the label, and a predefined label table is created, and the correspondence between the prefix and the label is initialized and marked according to specific business rules.
3. The data classification management method according to claim 2, wherein the field names included in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names included in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
4. The data classification management method according to claim 3, characterized in that, to check whether an unlabeled table exists, the tables of the database to be managed are inspected at a period defined by the user, and the system table list is compared with the label correspondence table to determine whether any table is unlabeled;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been performed;
if so, the table is labeled using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been performed, the user is notified to apply custom labels to the tables that remain unlabeled, and the correspondence between every table and its predefined or custom labels is recorded in the label correspondence table.
5. The data classification management method as claimed in claim 4, characterized in that after the initial labeling is completed, the user completes and classifies the existing labels, merges and unifies the labels by standardizing the labels, and updates the label name field in the label correspondence table.
6. A data classification management system is characterized by comprising an inspection module, a marking module, a warning module and a table display module, wherein the inspection module is used for performing inspection operation on a table in a database to be managed according to a period defined by a user;
the marking module is used for labeling the tables; the warning module is used for notifying the user to apply custom labels to unlabeled tables; and the table display module is used for viewing the labeled tables by label.
7. The data classification management system according to claim 6, characterized in that a tag correspondence table is created in the database for subsequent storage of the data table and tag correspondence, and a predefined tag table is created in the database, and the correspondence between table name prefixes and tags is initialized according to specific business rules.
8. The data classification management system according to claim 7, wherein the field names included in the label correspondence table are TABLE_NAME, LABEL_TYPE, and PREFIX_CHECK;
the field names included in the predefined label table are TABLE_PREFIX, LABEL_NAME, and LABEL_TYPE.
9. The data classification management system according to claim 8, wherein the inspection module is configured to inspect the tables in the database to be managed at a period defined by the user, and to compare the system table list with the label correspondence table to check whether an unlabeled table exists;
if an unlabeled table exists, it is checked whether its PREFIX_CHECK field is empty, that is, whether the predefined labeling flow has not yet been performed;
if so, the marking module labels the table using the predefined label table;
if the PREFIX_CHECK field is not empty, that is, the predefined labeling flow has already been performed, the warning module notifies the user to apply custom labels to the unlabeled table;
and the marking module records the correspondence between every table and its predefined or custom labels in the label correspondence table.
10. The data classification management system according to claim 9, characterized in that after the initial labeling is completed, the user completes and classifies the existing labels, merges and unifies the labels by standardizing the labels, and updates the label name field in the label correspondence table.
CN202010696437.9A 2020-07-20 2020-07-20 Data classification management method and system Active CN111782736B (en)

Priority Applications (1)

Application Number                | Priority Date | Filing Date | Title
CN202010696437.9A (CN111782736B) | 2020-07-20    | 2020-07-20  | Data classification management method and system

Applications Claiming Priority (1)

Application Number                | Priority Date | Filing Date | Title
CN202010696437.9A (CN111782736B) | 2020-07-20    | 2020-07-20  | Data classification management method and system

Publications (2)

Publication Number | Publication Date
CN111782736A (en)  | 2020-10-16
CN111782736B (en)  | 2022-07-26

Family

ID=72763547

Family Applications (1)

Application Number                        | Title                                             | Priority Date | Filing Date
CN202010696437.9A (CN111782736B, Active) | Data classification management method and system | 2020-07-20    | 2020-07-20

Country Status (1)

Country | Link
CN (1)  | CN111782736B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number  | Priority date | Publication date | Assignee                   | Title
CN107239539A (en) * | 2017-06-02    | 2017-10-10       | 山东浪潮商用系统有限公司   | User-defined model method based on a relational database
CN110750514A (en) * | 2019-09-17    | 2020-02-04       | 福建天泉教育科技有限公司   | Method and terminal for labeling main data
CN111090656A (en) * | 2020-03-23    | 2020-05-01       | 北京大数元科技发展有限公司 | Method and system for dynamically constructing object portrait
CN111191125A (en) * | 2019-12-24    | 2020-05-22       | 长威信息科技发展股份有限公司 | Data analysis method based on tagging

Also Published As

Publication number Publication date
CN111782736B (en) 2022-07-26

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant