CN115700495A - Government affair data-based governance model and method - Google Patents

Info

Publication number
CN115700495A
Authority
CN
China
Prior art keywords
data
rule
quality
cleaning
desensitization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211382916.9A
Other languages
Chinese (zh)
Inventor
韦双梅
肖益
李宝东
穆显显
刘韶辉
张菁
贾若
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiji Computer Corp Ltd
Original Assignee
Taiji Computer Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiji Computer Corp Ltd filed Critical Taiji Computer Corp Ltd
Priority to CN202211382916.9A priority Critical patent/CN115700495A/en
Publication of CN115700495A publication Critical patent/CN115700495A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of government affairs data governance, and in particular to a governance model and a governance method based on government affairs data. The model comprises a data classification and grading module, a data standard modeling module, a data quality auditing module, a multi-source data fusion module, a data cleaning module, a data desensitization module and a data service module.

Description

Government affair data-based governance model and method
Technical Field
The invention relates to a data governance model and a data governance method, in particular to a governance model and a governance method based on government affairs data, and belongs to the technical field of data governance.
Background
With the development of information technology, it has become a common trend for government departments to apply information technology to administrative management. The data generated by government departments is widely used in many areas of society and plays an important role in activities such as government management and decision-making, government services, enterprise operation and decision-making, and personal development. As a basic national strategic resource, government affairs data is highly authoritative, large in volume and of high value for development and utilization, and it receives great attention from all sectors.
Chinese invention patent application No. CN201911038460.2 provides a method for constructing a government affairs big data management system, which effectively fuses business data with two-dimensional and three-dimensional graphic data, presents data information in multiple dimensions, and assists decision makers in formulating plans.
As the information systems and data of government departments continue to grow, these departments face a complex data environment. Problems such as numerous information systems, non-uniform data standards, inconsistent data and incomplete data are increasingly prominent. Data cannot be shared effectively, which restricts its development and use to a certain extent and limits the value it can deliver. The unified management of government affairs data is therefore in a dilemma, which poses great challenges to the effective development and efficient utilization of government affairs data and to the supply level of the data element market. The cited reference does not effectively solve these problems.
The present invention has been made in view of this situation.
Disclosure of Invention
The invention aims to solve the above problems by providing a governance model and a governance method based on government affairs data. Data elements are standardized through data classification and grading, and data quality auditing, multi-source data fusion, data cleaning and data desensitization are performed on the government affairs data according to a unified standard and consistent definitions. This improves the quality of the government affairs data, and more accurate data services are then provided externally in the form of data services, so that the value of the data is fully realized, government affairs data assets are formed, the value of the data is increased, systematic support is provided for data sharing, exchange and opening, and innovation in government data services is driven.
The invention achieves this aim through the following technical scheme: a governance model based on government affairs data, comprising a data classification and grading module, a data standard modeling module, a data quality auditing module, a multi-source data fusion module, a data cleaning module, a data desensitization module and a data service module.
A governance method based on government affairs data is implemented through the data classification and grading module, the data standard modeling module, the data quality auditing module, the multi-source data fusion module, the data cleaning module, the data desensitization module and the data service module.
The governance method based on government affairs data comprises the following steps:
1) Data classification and grading. This is an important basis for determining the balance point between data protection and data utilization, and it lays the foundation for the security protection of all kinds of data. Based on the attributes and characteristics of government affairs data, on the multidimensional features of the data and on the logical associations that objectively exist among those features, the data is systematically classified and graded in a way that matches its actual situation, and a classification system and an ordering are established so that government affairs data can be better managed and used.
2) Data standard modeling. A data standard is the definition and interpretation of each data item involved in an information system, and it is the basis of data quality rules, data management and data application. Data standard modeling is the process of associating the data catalog with the data element standard according to that standard.
3) Data quality auditing. Data is checked from multiple angles according to the data element standard rules and the configured inspection dimensions and rules. After the quality inspection is finished, the result data is stored in a specified database, a quality report is generated, and the problem data found is assigned to the corresponding data provider for review and correction.
4) Data fusion. This is the process of analyzing and processing multi-source data in order to obtain more accurate and unified information, and it is generally used to enhance decision-making.
5) Data cleaning. Data that does not conform to the quality rules is cleaned, filtered and processed, and dirty data is replaced with high-quality, usable data.
6) Data desensitization, that is, removing private information from data. Different desensitization rules are set for different data values, and the data is transformed according to those rules.
7) Data service. The data service module provides data assets for business parties to call by means of online-configured APIs. Data services are published externally in the form of a catalog; users can query the catalog for the corresponding data service and apply to use it, and after verification and approval they can use the data service of the data provider.
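The publish, query, apply, approve and call sequence described in step 7) can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `DataService` and `ServiceCatalog`, the in-memory storage and the sample service are all assumptions made for illustration, and a real deployment would expose the services through an API gateway rather than direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class DataService:
    """One published data service (hypothetical structure)."""
    name: str
    provider: str
    rows: list
    approved_users: set = field(default_factory=set)
    pending_users: set = field(default_factory=set)


class ServiceCatalog:
    """In-memory stand-in for the externally published service catalog."""

    def __init__(self):
        self._services = {}

    def publish(self, service):
        # Publish the service in the catalog so that users can find it.
        self._services[service.name] = service

    def search(self, keyword):
        # Users query the catalog for matching service names.
        return sorted(n for n in self._services if keyword in n)

    def apply(self, name, user):
        # A user applies to use a service; the request awaits approval.
        self._services[name].pending_users.add(user)

    def approve(self, name, user):
        # Approval moves a pending applicant to the approved set.
        svc = self._services[name]
        if user not in svc.pending_users:
            raise ValueError(f"{user} has not applied for {name}")
        svc.pending_users.remove(user)
        svc.approved_users.add(user)

    def call(self, name, user):
        # Only approved users may call the service.
        svc = self._services[name]
        if user not in svc.approved_users:
            raise PermissionError(f"{user} is not approved for {name}")
        return list(svc.rows)


catalog = ServiceCatalog()
catalog.publish(DataService("population_by_district", "statistics_bureau",
                            [{"district": "A", "population": 1000}]))
catalog.apply("population_by_district", "alice")
catalog.approve("population_by_district", "alice")
result = catalog.call("population_by_district", "alice")
```

A user who has not been approved receives a `PermissionError` from `call`, which mirrors the requirement that the service is usable only after verification and approval.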
Further, in step 1), the data classification and grading method is as follows:
1. Formulate the data classification and grading standard. The standard is formulated according to the relevant national and industry standards, combined with the characteristics of government services. Data classification emphasizes dividing data into categories according to attributes and characteristics, and the data can be classified by the subject object it describes and by the field, department and industry to which it belongs. Data grading focuses on ranking data of the same category by level, from high to low or from large to small, according to a defined standard, and the grade can be determined by jointly considering the sharing category, the opening category, the subject object, the data characteristics, the data volume, the degree of sensitivity and the degree of impact of the data.
2. Collate the classification and grading identification rules. The identification rules are collated according to the data classification and grading standard to form templates for the classification and grading rules.
3. Prepare the data assets. The data catalog is labeled and divided according to the dimensions in the data classification and grading standard, such as subject object, field, department, industry, data characteristics, data volume, degree of sensitivity and degree of impact, which lays the foundation for the subsequent classification and grading work.
4. Build the classification and grading model. The data assets are analyzed intelligently according to the identification rules by means of data content analysis and machine learning, forming the policy rules and model templates for data classification and grading.
5. Determine the classification and grade. The data is scanned automatically at the configured interval and analyzed intelligently, and the classification and grading results are output.
Because data is continuously updated and gives rise to further data as it is used, its volume and degree of impact change accordingly, so the classification and grading of the data must also be continuously updated and maintained.
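As a rough illustration of how identification rules could drive classification and grading, the following Python sketch assigns a category from the owning department and a grade from the most sensitive field of each catalog entry. The rule tables, the four-level grade scale and the department and field names are hypothetical, and the sketch is purely rule-based: it does not attempt the content analysis or machine learning mentioned in item 4.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One entry of the data catalog (hypothetical structure)."""
    name: str
    department: str
    fields: tuple


# Hypothetical identification rules: field name -> sensitivity grade
# (higher means more sensitive).
GRADE_RULES = {"id_number": 4, "phone": 3, "address": 3, "name": 2}
DEFAULT_GRADE = 1

# Hypothetical classification rule: owning department -> category.
CATEGORY_BY_DEPARTMENT = {
    "public_security": "population",
    "market_regulation": "legal_entity",
}


def classify(entry):
    # Classification: divide entries into categories by an attribute.
    return CATEGORY_BY_DEPARTMENT.get(entry.department, "other")


def grade(entry):
    # Grading: an entry is as sensitive as its most sensitive field.
    return max((GRADE_RULES.get(f, DEFAULT_GRADE) for f in entry.fields),
               default=DEFAULT_GRADE)


def scan(entries):
    # One scan pass; a scheduler would call this at the configured interval.
    return {e.name: (classify(e), grade(e)) for e in entries}


entries = [
    CatalogEntry("resident_registry", "public_security",
                 ("name", "id_number", "address")),
    CatalogEntry("park_opening_hours", "parks", ("park", "hours")),
]
results = scan(entries)
```

Re-running `scan` after the catalog changes is one simple way to keep the results up to date, in line with the need for continuous maintenance noted above.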
Further, in step 2), the data standard modeling method is as follows:
1. Collate the data element information. The data element information is collated according to the standard data elements defined in the data element standard document and the standardized template established in the system.
2. Add or import the data element information. The data element information is entered or imported into the system and submitted for audit. The data element information comprises the Chinese name, internal identifier, full Chinese pinyin spelling, English name, definition, data type, data length, data format, unit of measurement, value range description, associated code, associated data element, remarks, standard type of the reference source, standard number, standard name, implementation date and whether the standard is mandatory.
3. Audit and release. After the data element information is submitted, it must go through the corresponding audit and release process, which ensures the accuracy of the data element information.
4. Associate the data elements. The association between the government affairs information resource catalog and the standard data elements is collated, and an association model is established so that the standard data elements are managed in a unified and standardized way.
5. Establish the data model. On the basis of the unified and standardized management of the standard data elements, relationships with the quality rules and the data model are established to form a unified standard rule base.
6. Maintenance and management. If data element standards are added or updated after the data model has been established, they go through the corresponding audit and release process and the model is adjusted accordingly, which ensures that the updating, release and use of the data standards remain under management.
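The submit, audit, release and associate sequence above can be sketched as a small state machine. The sketch below models only a subset of the listed data element attributes; the identifier `DE0001`, the catalog and column names, and the rule that only released data elements may be associated are assumptions for illustration rather than details stated in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    RELEASED = "released"


@dataclass
class DataElement:
    # A subset of the data element attributes listed above.
    internal_identifier: str
    chinese_pinyin: str
    english_name: str
    data_type: str
    data_length: int
    mandatory: bool = False
    status: Status = Status.DRAFT

    def submit(self):
        # Entering or importing is followed by submission for audit.
        if self.status is not Status.DRAFT:
            raise ValueError("only draft data elements can be submitted")
        self.status = Status.SUBMITTED

    def release(self):
        # Audit passed: the data element is released for use.
        if self.status is not Status.SUBMITTED:
            raise ValueError("only submitted data elements can be released")
        self.status = Status.RELEASED


class StandardRegistry:
    """Holds data elements and their associations with catalog columns."""

    def __init__(self):
        self.elements = {}
        self.associations = {}  # (catalog, column) -> internal identifier

    def add(self, element):
        self.elements[element.internal_identifier] = element

    def associate(self, catalog, column, identifier):
        # Assumption: only released data elements may be associated.
        if self.elements[identifier].status is not Status.RELEASED:
            raise ValueError("data element has not been released")
        self.associations[(catalog, column)] = identifier


registry = StandardRegistry()
# "DE0001" and the length 50 are hypothetical values.
name_element = DataElement("DE0001", "xing-ming", "name", "C", 50)
registry.add(name_element)
name_element.submit()
name_element.release()
registry.associate("resident_registry", "resident_name", "DE0001")
```

Keeping the association table separate from the data elements means that an updated standard can be re-audited and re-released without touching the catalog entries that refer to it.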
Further, in step 3), the data quality auditing method is as follows:
1. Create the data audit rules. Creating multidimensional data check rules is the precondition for data quality auditing. The rules can be set along the dimensions of timeliness, availability, integrity, normalization, accuracy and consistency, and each rule specifies a dimension number, dimension name, assigned score, calculation method and description. The assigned score and the calculation method are the basis for evaluating the quality of the data and for generating the score in the quality report.
2. Associate the entity data with the data audit rules. After the data audit rules are created, the entity data must be associated with them, including the associated rule number, rule name, rule method, processing method, dimension and description.
3. Perform the fine-grained data quality check. Before the check, a quality check service must be created that references the quality rule methods; running the service performs the fine-grained data quality check. After the check, the data found to have problems is extracted into the problem library.
4. Generate the quality report. A data quality report is produced for each data provider from the results of the fine-grained data quality check and the assigned scores and calculation methods set in the data audit rules.
5. Feed back and rectify problem data. After the quality report is generated, it can be assigned to each data provider so that the provider can review the problem data and rectify it.
6. Display data statistics. The statistics display visualizes the data inspection dimensions, the inspection results, the assignment status of quality reports and the rectification status of problem data for each data provider, which gives an overall view of the data quality audit.
7. Monitoring and maintenance. All services configured for data quality auditing are brought under monitoring management. If an error alarm is found, the problem must be investigated and fixed promptly to ensure that the data quality audit runs normally.
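To make the role of the assigned score and the calculation method concrete, the following Python sketch runs two audit rules over a small table and computes a report score. The source does not specify the calculation method; the formula used here (assigned score multiplied by the pass rate of the rule, summed over rules) is an assumption, as are the rule numbers and the sample rows.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AuditRule:
    number: str
    name: str
    dimension: str                  # e.g. integrity, normalization
    score: float                    # assigned score for this rule
    check: Callable[[dict], bool]   # returns True when a row passes


def audit(rows, rules):
    """Return (report score, problem list) for one provider's rows."""
    problems = []
    total = 0.0
    for rule in rules:
        failed = [r for r in rows if not rule.check(r)]
        # Problem rows go to the problem library with the rule they broke.
        problems.extend((rule.number, r) for r in failed)
        pass_rate = 1 - len(failed) / len(rows) if rows else 1.0
        # Assumed calculation method: assigned score times pass rate.
        total += rule.score * pass_rate
    return total, problems


rules = [
    AuditRule("R001", "name is present", "integrity", 60.0,
              lambda r: bool(r.get("name"))),
    AuditRule("R002", "id_number has 18 characters", "normalization", 40.0,
              lambda r: len(r.get("id_number", "")) == 18),
]
rows = [
    {"name": "Zhang", "id_number": "1" * 18},
    {"name": "", "id_number": "2" * 18},
    {"name": "Li", "id_number": "123"},
    {"name": "Wang", "id_number": ""},
]
score, problem_library = audit(rows, rules)
```

With these sample rows, one of four rows fails the integrity rule and two of four fail the normalization rule, so the score is 60 × 0.75 + 40 × 0.5 = 65, and three entries land in the problem library.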
Further, in step 4), the multi-source data fusion method is as follows:
1. Select the tables to be fused. The first step in multi-source data fusion is to select the tables to be fused, or the tables of the data sources in the database.
2. Establish the data fusion model. Establishing the data fusion model first requires analyzing the association relationships and association fields between the data source tables, from which a new data model is worked out.
3. Map the source tables to the data fusion model. Using the association relationships and association fields identified between each data source table and the data fusion model, association mapping is performed according to the new data organization logic, with the aim of strengthening the internal associations of the data.
4. Set the data synchronization rules. The synchronization rules between the data in the data source tables and the data fusion model are analyzed, defined and configured in the system. For example, when a record is deleted from a data source table, the rule determines whether that record is also deleted from the data fusion model.
5. Fuse the data. The relevant services are run according to the configured methods and rules to complete the fusion of the multi-source data, finally forming a new table.
6. Monitoring and maintenance. All services configured for multi-source data fusion are brought under monitoring management. If an error alarm is found, the problem must be investigated and fixed promptly to ensure that the multi-source data fusion runs normally.
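Items 2, 3 and 5 above amount to joining source tables on an association field and projecting the result into a new model. The Python sketch below shows one possible reading using an inner join over in-memory rows; the table contents, the field names and the shape of the `mapping` argument are illustrative assumptions.

```python
def fuse(left_rows, right_rows, key, mapping):
    """Join two source tables on `key` and build rows of the fusion model.

    `mapping` maps each target field to a (source, source_field) pair,
    where source is "left" or "right".
    """
    right_index = {r[key]: r for r in right_rows}
    fused = []
    for left in left_rows:
        right = right_index.get(left[key])
        if right is None:
            continue  # inner join: skip rows with no match in the right table
        sources = {"left": left, "right": right}
        fused.append({target: sources[src][src_field]
                      for target, (src, src_field) in mapping.items()})
    return fused


residents = [
    {"id_number": "001", "name": "Zhang"},
    {"id_number": "002", "name": "Li"},
]
social_security = [
    {"id_number": "001", "insured": True},
]
mapping = {
    "person_id": ("left", "id_number"),
    "full_name": ("left", "name"),
    "insured": ("right", "insured"),
}
fused_table = fuse(residents, social_security, "id_number", mapping)
```

An inner join is only one choice. Whether unmatched source rows should appear in the fused table is a design decision that the fusion model and synchronization rules would need to settle.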
Further, in step 5), the data cleaning method is as follows:
1. Create the data cleaning rules. Creating data cleaning rules is the precondition for data cleaning. Standard cleaning can be performed with reference to the data element standard, with the corresponding cleaning rules configured accordingly. Cleaning rules can also be worked out and configured according to specific business requirements.
2. Associate the entity data with the data cleaning rules. After the data cleaning rules are created, the entity data must be associated with them, including the associated rule number, rule name, rule method, processing method, dimension and description.
3. Perform the data cleaning check. Before the check, a data cleaning check service must be created that references the data cleaning rules and methods; running the service performs the data cleaning check. After the check, data that does not conform to the data cleaning rules is extracted into the problem library, and data that conforms is extracted into the standard library.
4. Process the problem data. The problem data found must be processed, and dirty data is replaced with high-quality, usable data. For example, suppose a data cleaning rule is set according to the data element standard, in which the data element with the Chinese name "name" has the full pinyin spelling "xing-ming", the English name "name" and the data type "character string type C". Fields identified as "name" are then checked, and all values that do not conform to the specification are replaced. As another example, the data in column b of table a is cleaned according to business requirements: any value in column b that is not equal to the values a and c is replaced with the values a and c.
5. Reflow the processed data. When the processing of the problem data is completed, the data flows back into the standard library, forming a closed loop.
6. Display data statistics. The statistics display visualizes the data cleaning dimensions, the data cleaning status and the problem data handling status for each data provider, which gives an overall view of the data cleaning.
7. Monitoring and maintenance. All services configured for data cleaning are brought under monitoring management. If an error alarm is found, the problem must be investigated and fixed promptly to ensure that the data cleaning runs normally.
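The check, split, process and reflow loop of items 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the rule is an allowed-value check on one column, the replacement is a single hypothetical default value, and the two libraries are plain Python lists.

```python
def clean(rows, column, allowed, replacement):
    """Split rows by a cleaning rule, fix the problem rows and reflow them.

    Returns (standard_library, problem_library). The problem library keeps
    the original dirty rows so that they can still be counted and reviewed.
    """
    standard_library, problem_library = [], []
    for row in rows:
        if row[column] in allowed:
            standard_library.append(dict(row))   # conforms to the rule
        else:
            problem_library.append(dict(row))    # does not conform
    # Process problem data: replace the dirty value, then reflow the fixed
    # row into the standard library to close the loop.
    for row in problem_library:
        fixed = dict(row)
        fixed[column] = replacement
        standard_library.append(fixed)
    return standard_library, problem_library


# Hypothetical rule: "status" must be one of the allowed codes.
rows = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "ACTV"},
    {"id": 3, "status": "closed"},
]
standard_library, problem_library = clean(
    rows, "status", {"active", "closed"}, "unknown")
```

After the call, the standard library holds all three rows with conforming values, while the problem library still records the one dirty row for the statistics display.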
Further, in step 6), the data desensitization flow is as follows:
1. Create the data desensitization rules. Creating data desensitization rules is the precondition for data desensitization. The rules include data encryption and decryption, data fuzzification, data disguise and replacement, numeric desensitization and data shuffling.
2. Associate the entity data with the data desensitization rules. After the data desensitization rules are created, the entity data must be associated with them, including the associated rule number, rule name, rule method and processing method.
3. Desensitize the data. Before desensitization, a data desensitization service must be created that references the data desensitization rules and methods; running the service performs the data desensitization.
4. Load the desensitized data. After desensitization, the desensitized data must be extracted into the service library so that it can be used by data applications.
5. Monitoring and maintenance. All services configured for data desensitization are brought under monitoring management. If an error alarm is found, the problem must be investigated and fixed promptly to ensure that the data desensitization runs normally.
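Three of the rule types named in item 1 can be sketched with the Python standard library. The functions below are illustrative assumptions, not the patent's rules: `mask` stands in for fuzzification, `pseudonymize` uses a salted SHA-256 digest as one possible form of disguise and replacement (it is a one-way digest, not reversible encryption), and `shuffle_column` permutes one column across rows.

```python
import hashlib
import random


def mask(value, keep_head=3, keep_tail=4, fill="*"):
    """Fuzzification: keep the head and tail of a string, mask the middle."""
    if len(value) <= keep_head + keep_tail:
        return fill * len(value)
    middle = fill * (len(value) - keep_head - keep_tail)
    return value[:keep_head] + middle + value[len(value) - keep_tail:]


def pseudonymize(value, salt):
    """Disguise and replacement: substitute a truncated salted digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def shuffle_column(rows, column, seed):
    """Shuffling: permute one column across rows, leaving others intact."""
    values = [r[column] for r in rows]
    random.Random(seed).shuffle(values)
    return [{**r, column: v} for r, v in zip(rows, values)]


def desensitize(rows, rules):
    """Apply the rule associated with each field; other fields pass through."""
    return [{k: rules[k](v) if k in rules else v for k, v in row.items()}
            for row in rows]


# Hypothetical association of fields with desensitization rules.
rules = {
    "phone": mask,
    "id_number": lambda v: pseudonymize(v, salt="demo-salt"),
}
rows = [{"name": "Zhang", "phone": "13812345678", "id_number": "A001"}]
service_library = desensitize(rows, rules)
```

Here `mask("13812345678")` yields `138****5678`. The salt and seed values are placeholders, and in practice the salt would need to be kept secret for the digest to offer any protection.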
The technical effects and advantages of the invention are as follows. The invention aims to improve the quality of government affairs data and to provide accurate data services externally. By standardizing data elements and performing data quality auditing, multi-source data fusion, data cleaning and data desensitization on the government affairs data according to a unified standard and consistent definitions, the quality of the government affairs data is improved. More accurate data services are then provided externally in the form of data services, so that the value of the data is fully realized, government affairs data assets are formed, the value of the data is increased, systematic support is provided for data sharing, exchange and opening, and innovation in government data services is driven.
Drawings
FIG. 1 is a schematic block diagram of the governance model of the present invention;
FIG. 2 is a flow chart of the governance method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Referring to FIGS. 1 and 2, a governance model based on government affairs data includes a data classification and grading module, a data standard modeling module, a data quality auditing module, a multi-source data fusion module, a data cleaning module, a data desensitization module and a data service module.
The governance method based on government affairs data is implemented through the data classification and grading module, the data standard modeling module, the data quality auditing module, the multi-source data fusion module, the data cleaning module, the data desensitization module and the data service module.
The governance method based on government affairs data comprises the following steps:
1) Data classification and grading. This is an important basis for determining the balance point between data protection and data utilization, and it lays the foundation for the security protection of all kinds of data. Based on the attributes and characteristics of government affairs data, on the multidimensional features of the data and on the logical associations that objectively exist among those features, the data is systematically classified and graded in a way that matches its actual situation, and a classification system and an ordering are established so that government affairs data can be better managed and used.
2) Data standard modeling. A data standard is the definition and interpretation of each data item involved in an information system, and it is the basis of data quality rules, data management and data application. Data standard modeling is the process of associating the data catalog with the data element standard according to that standard.
3) Data quality auditing. Data is checked from multiple angles according to the data element standard rules and the configured inspection dimensions and rules. After the quality inspection is finished, the result data is stored in a specified database, a quality report is generated, and the problem data found is assigned to the corresponding data provider for review and correction.
4) Data fusion. This is the process of analyzing and processing multi-source data in order to obtain more accurate and unified information, and it is generally used to enhance decision-making.
5) Data cleaning. Data that does not conform to the quality rules is cleaned, filtered and processed; it is also the process of replacing dirty data with high-quality, usable data.
6) Data desensitization, that is, removing private information from data. Different desensitization rules are set for different data values, and the data is transformed according to those rules.
7) Data service. The data service module provides data assets for business parties to call by means of online-configured APIs. Data services are published externally in the form of a catalog; users can query the catalog for the corresponding data service and apply to use it, and after verification and approval they can use the data service of the data provider.
The data classification and grading method comprises the following steps:
1. Formulate the data classification and grading standard. The standard is formulated according to the relevant national and industry standards, combined with the characteristics of government services. Data classification emphasizes dividing data into categories according to attributes and characteristics, and the data can be classified by the subject object it describes and by the field, department and industry to which it belongs. Data grading focuses on ranking data of the same category by level, from high to low or from large to small, according to a defined standard, and the grade can be determined by jointly considering the sharing category, the opening category, the subject object, the data characteristics, the data volume, the degree of sensitivity and the degree of impact of the data.
2. Collate the classification and grading identification rules. The identification rules are collated according to the data classification and grading standard to form templates for the classification and grading rules.
3. Prepare the data assets. The data catalog is labeled and divided according to the dimensions in the data classification and grading standard, such as subject object, field, department, industry, data characteristics, data volume, degree of sensitivity and degree of impact, which lays the foundation for the subsequent classification and grading work.
4. Build the classification and grading model. The data assets are analyzed intelligently according to the identification rules by means of data content analysis and machine learning, forming the policy rules and model templates for data classification and grading.
5. Determine the classification and grade. The data is scanned automatically at the configured interval and analyzed intelligently, and the classification and grading results are output.
Because data is continuously updated and gives rise to further data as it is used, its volume and degree of impact change accordingly, so the classification and grading of the data must also be continuously updated and maintained.
The data standard modeling method is as follows:
1. Collate the data element information. The data element information is collated according to the standard data elements defined in the data element standard document and the standardized template established in the system.
2. Add or import the data element information. The data element information is entered or imported into the system and submitted for audit. The data element information comprises the Chinese name, internal identifier, full Chinese pinyin spelling, English name, definition, data type, data length, data format, unit of measurement, value range description, associated code, associated data element, remarks, standard type of the reference source, standard number, standard name, implementation date and whether the standard is mandatory.
3. Audit and release. After the data element information is submitted, it must go through the corresponding audit and release process, which ensures the accuracy of the data element information.
4. Associate the data elements. The association between the government affairs information resource catalog and the standard data elements is collated, and an association model is established so that the standard data elements are managed in a unified and standardized way.
5. Establish the data model. On the basis of the unified and standardized management of the standard data elements, relationships with the quality rules and the data model are established to form a unified standard rule base.
6. Maintenance and management. If data element standards are added or updated after the data model has been established, they go through the corresponding audit and release process and the model is adjusted accordingly, which ensures that the updating, release and use of the data standards remain under management.
The data quality auditing method comprises the following steps:
1. Create the data audit rules. Creating multidimensional data check rules is the precondition for data quality auditing. The rules can be set along the dimensions of timeliness, availability, integrity, normalization, accuracy and consistency, and each rule specifies a dimension number, dimension name, assigned score, calculation method and description. The assigned score and the calculation method are the basis for evaluating the quality of the data and for generating the score in the quality report.
2. Associate the entity data with the data audit rules. After the data audit rules are created, the entity data must be associated with them, including the associated rule number, rule name, rule method, processing method, dimension and description.
3. Perform the fine-grained data quality check. Before the check, a quality check service must be created that references the quality rule methods; running the service performs the fine-grained data quality check. After the check, the data found to have problems is extracted into the problem library.
4. Generate the quality report. A data quality report is produced for each data provider from the results of the fine-grained data quality check and the assigned scores and calculation methods set in the data audit rules.
5. Feed back and rectify problem data. After the quality report is generated, it can be assigned to each data provider so that the provider can review the problem data and rectify it.
6. Display data statistics. The statistics display visualizes the data inspection dimensions, the inspection results, the assignment status of quality reports and the rectification status of problem data for each data provider, which gives an overall view of the data quality audit.
7. Monitoring and maintenance. All services configured for data quality auditing are brought under monitoring management. If an error alarm is found, the problem must be investigated and fixed promptly to ensure that the data quality audit runs normally.
The multi-source data fusion method comprises the following steps:
1. Select the tables to be fused. The first step of multi-source data fusion is to select the tables to be fused, or the tables of the data sources in the database.
2. Establish a data fusion model. Establishing the data fusion model first requires analyzing the association relationships and association fields between the data source tables, so that a new data model can be worked out.
3. Map the source tables to the data fusion model. Using the analyzed association relationships and association fields, each data source table is mapped to the data fusion model according to the new data organization logic, with the aim of strengthening the internal associations of the data.
4. Set data synchronization rules. The synchronization rules between the data in the data source tables and the data fusion model are analyzed, defined and configured in the system. For example, a rule determines whether a record deleted from a data source table is also deleted synchronously in the data fusion model.
5. Data fusion. The related services are run according to the configured methods and rules to complete the fusion of the multi-source data, finally forming a new table.
6. Monitoring and maintenance. All services configured for multi-source data fusion are brought under monitoring management; if an error warning occurs, the problem must be investigated and fixed promptly to keep the multi-source data fusion running normally.
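As a rough illustration of the fusion steps, the following Python sketch merges two source tables on a shared association field and applies a delete-synchronization rule. The table names, field names, the `fuse` and `on_source_delete` helpers, and the choice to drop the whole fused row on a synchronized delete are assumptions made for this example only.

```python
from typing import Dict, List

Row = Dict[str, object]


def fuse(sources: Dict[str, List[Row]],
         key: str,
         field_map: Dict[str, Dict[str, str]]) -> Dict[object, Row]:
    """Merge rows from several source tables into one fused table.

    key       -- association field shared by all source tables
    field_map -- per-table mapping from source field to fusion-model field
    """
    fused: Dict[object, Row] = {}
    for table, rows in sources.items():
        for row in rows:
            target = fused.setdefault(row[key], {key: row[key]})
            for source_field, model_field in field_map[table].items():
                if source_field in row:
                    target[model_field] = row[source_field]
    return fused


def on_source_delete(fused: Dict[object, Row], key_value: object,
                     sync_delete: bool) -> Dict[object, Row]:
    """Apply the synchronization rule when a source row is deleted."""
    if sync_delete:
        fused.pop(key_value, None)   # assumed behaviour: drop the fused row
    return fused


# Hypothetical source tables keyed by the association field "id_no".
sources = {
    "population": [{"id_no": "A1", "xm": "Zhang San"}],
    "social_security": [{"id_no": "A1", "insured": True},
                        {"id_no": "B2", "insured": False}],
}
field_map = {
    "population": {"xm": "name"},
    "social_security": {"insured": "is_insured"},
}
fused = fuse(sources, "id_no", field_map)
print(fused["A1"])
```

A real deployment would more likely express the mapping as database joins or ETL jobs; the dictionary-based version only makes the roles of the association field, the field mapping and the synchronization rule explicit.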
The data cleaning method comprises the following steps:
1. Create data cleaning rules. Creating data cleaning rules is a prerequisite for data cleaning. Standardized cleaning can be carried out with reference to the data element standard, with the corresponding cleaning rules configured accordingly. Cleaning rules can also be worked out from specific business requirements and then configured.
2. Associate entity data with the data cleaning rules. After the data cleaning rules are created, the entity data must be associated with them; the association records the rule number, rule name, rule method, processing method, dimension and description.
3. Data quality cleaning check. Before the cleaning check, a data cleaning check service must be created, the data cleaning rules and methods referenced, and the service run; the data quality cleaning check is then carried out. After the check, data that does not conform to the cleaning rules is extracted into the problem library, and data that conforms is extracted into the standard library.
4. Problem data processing. The problem data found by the check must be processed so that dirty data is replaced with high-quality, usable data. For example, suppose a cleaning rule is set according to the data element standard, in which the data element for a person's name has the Chinese pinyin spelling "xing-ming", the English name "name" and the data type "character string type C"; fields identified as "name" are checked against this specification, and all fields that do not meet it are replaced. As another example, the data in column b of table a is cleaned according to business requirements: if the data in column b is not equal to the values in columns a and c, it is all replaced with the values in columns a and c.
5. Dirty data reflow. Once the problem data has been processed, it flows back into the standard library, forming a closed loop.
6. Data statistics display. The statistics display visually presents each data provider's cleaning dimensions, cleaning status and problem data processing status, giving an overall view of the data cleaning.
7. Monitoring and maintenance. All services configured for data cleaning are brought under monitoring management; if an error warning occurs, the problem must be investigated and fixed promptly to keep the data cleaning running normally.
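The cleaning loop (check, split into standard and problem libraries, repair, reflow) can be sketched as follows. The `CleaningRule` structure, the helper names and the example rule for a "name" field (a non-empty string without surrounding whitespace, repaired by trimming) are assumptions made for illustration; the description does not prescribe a concrete rule format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


@dataclass
class CleaningRule:
    rule_no: str
    field: str
    conforms: Callable[[object], bool]   # rule method
    repair: Callable[[object], object]   # processing method


def is_clean(record: Record, rules: List[CleaningRule]) -> bool:
    return all(rule.conforms(record.get(rule.field)) for rule in rules)


def cleaning_check(records: List[Record],
                   rules: List[CleaningRule]) -> Tuple[List[Record], List[Record]]:
    """Split records into the standard library and the problem library."""
    standard, problems = [], []
    for record in records:
        (standard if is_clean(record, rules) else problems).append(record)
    return standard, problems


def process_and_reflow(standard: List[Record], problems: List[Record],
                       rules: List[CleaningRule]) -> List[Record]:
    """Repair problem records; repaired ones flow back into the standard library."""
    unresolved = []
    for record in problems:
        fixed = dict(record)
        for rule in rules:
            if not rule.conforms(fixed.get(rule.field)):
                fixed[rule.field] = rule.repair(fixed.get(rule.field))
        if is_clean(fixed, rules):
            standard.append(fixed)       # reflow closes the loop
        else:
            unresolved.append(record)    # stays in the problem library
    return unresolved


# Hypothetical rule: "name" must be a non-empty string with no surrounding spaces.
name_rule = CleaningRule(
    "C001", "name",
    conforms=lambda v: isinstance(v, str) and v != "" and v == v.strip(),
    repair=lambda v: v.strip() if isinstance(v, str) else v,
)
records = [{"name": "Li Si"}, {"name": "  Wang Wu "}, {"name": None}]
standard, problems = cleaning_check(records, [name_rule])
problems = process_and_reflow(standard, problems, [name_rule])
print(standard)
print(problems)
```

Records that cannot be repaired automatically remain in the problem library, which matches the idea that only processed data returns to the standard library.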
The flow of data desensitization is as follows:
1. Create data desensitization rules. Creating data desensitization rules is a prerequisite for data desensitization. The rules include data encryption and decryption, data fuzzification, data masking replacement, numeric desensitization and data shuffling.
2. Associate entity data with the data desensitization rules. After the desensitization rules are created, the entity data must be associated with them; the association records the rule number, rule name, rule method and processing method.
3. Data desensitization. Before desensitization, a data desensitization service must be created, the desensitization rules and methods referenced, and the service run; the data desensitization is then carried out.
4. Store the desensitized data. After desensitization, the desensitized data is extracted into a service library to facilitate data applications.
5. Monitoring and maintenance. All services configured for data desensitization are brought under monitoring management; if an error warning occurs, the problem must be investigated and fixed promptly to keep the data desensitization running normally.
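Three of the rule types named above (masking replacement, numeric desensitization and shuffling) can be sketched in Python as follows. The function names, the choice of keeping the first three and last four characters when masking, the bucket size of ten, and the example fields are all assumptions for illustration; encryption and fuzzification are omitted from this sketch.

```python
import random
from typing import Callable, Dict, List

Row = Dict[str, object]


def mask(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Masking replacement: hide the middle of a string with '*'."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def generalize_number(value: int, bucket: int = 10) -> int:
    """Numeric desensitization: round a number down to its bucket."""
    return value - value % bucket


def shuffle_column(rows: List[Row], field: str, seed: int = 0) -> List[Row]:
    """Data shuffling: permute one column so values no longer match their rows."""
    values = [row[field] for row in rows]
    random.Random(seed).shuffle(values)
    return [{**row, field: value} for row, value in zip(rows, values)]


def desensitize(rows: List[Row],
                field_rules: Dict[str, Callable[[object], object]]) -> List[Row]:
    """Apply the rule associated with each field; other fields pass through."""
    return [
        {f: field_rules[f](v) if f in field_rules else v for f, v in row.items()}
        for row in rows
    ]


# Hypothetical association of fields with desensitization rules.
field_rules = {"phone": mask, "age": generalize_number}
rows = [
    {"phone": "13812345678", "age": 37, "dept": "Civil Affairs"},
    {"phone": "13987654321", "age": 52, "dept": "Education"},
]
service_library = desensitize(rows, field_rules)   # stored for data applications
print(service_library[0])
```

Different fields are bound to different rules through `field_rules`, which mirrors the association step; `shuffle_column` can be applied on top when row-level linkage of a column should also be broken.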
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. The description is written this way for clarity only; those skilled in the art should treat the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (9)

1. A governance model based on government affairs data is characterized in that: the system comprises a data classification grading module, a data standard modeling module, a data quality auditing module, a multi-source data fusion module, a data cleaning module, a data desensitization module and a data service module.
2. A government affairs data-based governance method, characterized by being realized by the data classification and grading module, the data standard modeling module, the data quality auditing module, the multi-source data fusion module, the data cleaning module, the data desensitization module and the data service module of claim 1.
3. A governance method based on government data according to claim 2, comprising the steps of:
1) Classifying and grading the data, systematically classifying the data according to the attribute and the characteristic of the government affair data and the multi-dimensional characteristic of the data and the logic association objectively existing among the data through the data classifying and grading module, distinguishing and classifying the data according with the actual condition of the data, and establishing a classification system and an arrangement sequence;
2) The data standard modeling is used for defining and explaining various data related to an information system, is the basis of data quality rules and data management and application, and establishes an association model between a data directory and a data element standard according to the data element standard through the data standard modeling module;
3) Data quality audit, which is to carry out multi-directional check and inspection on data according to a data element standard rule and set inspection dimensionality and rules through the data quality audit module, store result data into a specified database after the data quality inspection is finished, and generate a quality report, or assign the inspected problem data to a corresponding data provider for inquiry and correction;
4) Multi-source data fusion, namely analyzing and processing the multi-source data through the multi-source data fusion module;
5) Data cleaning, namely rechecking and verifying data through the data cleaning module, deleting repeated information, and replacing dirty data with high-quality available data;
6) Data desensitization, namely realizing data privacy removal through the data desensitization module, setting different data desensitization rules for different field contents, and deforming data through the desensitization rules;
7) Data service, wherein the data service module provides data assets for business parties to call by means of online-configured APIs, the data services are published externally in the form of a directory, a user can query the corresponding data service and apply to use it, and the user can use the data provider's data service after verification, examination and approval, and management.
4. A governance method based on government affairs data according to claim 3, wherein in step 1), the data classification and grading method is as follows:
1. formulating a data classification and grading standard, in which the division comprehensively considers the sharing category, the open category, the main object, the data characteristics, the data volume, the sensitivity and the degree of influence of the data;
2. sorting out classification and grading identification rules according to the data classification and grading standard to form related templates of the classification and grading rules;
3. preparing data assets, in which the data catalog is labeled and divided according to the relevant dimensions in the data classification and grading standard, such as the main object, the field, the department, the industry, the data characteristics, the data volume, the sensitivity and the degree of influence, laying a foundation for the subsequent classification and grading work;
4. classification and grading modeling, in which the data assets are intelligently analyzed according to the classification and grading identification rules by means of data content analysis and machine learning to form policy rules and model templates for the classification and grading of the data;
5. classification and grading determination, in which scanning runs automatically according to a configured time period, and the classification and grading results are intelligently analyzed and output;
6. continuously updating and maintaining the classification and grading of the data, because the data is continuously updated, propagates itself and generates more data in use, and its volume and degree of influence change accordingly.
5. A governance method based on government affairs data according to claim 3, wherein in step 2), the data standard modeling method is as follows:
1. sorting out data element information according to the standard data elements defined in the data element standard file and the standardized template established in the system;
2. adding or importing data element information, in which the data element information is entered or imported into the system and submitted for review;
3. review and release, in which the imported data elements are reviewed and released;
4. associating data elements, in which the association relationships with the standard data elements in the government affairs information resource catalog are sorted out and an association model is established, forming unified and standardized management of the standard data elements;
5. establishing a data model, in which, on the basis of the unified and standardized management of the standard data elements, relationships with the quality rules and the data model are established to form a unified standard rule base;
6. if new or updated data element standards appear after the data model is established, adjusting the corresponding review and release processes accordingly, thereby ensuring the management of the updating, release and use of the data standards.
6. A governance method based on government data according to claim 3, wherein: in step 3), the method for auditing the data quality comprises the following steps:
1. creating data audit rules, in which multidimensional data check rules are created;
2. associating entity data with the data audit rules, in which, after the data audit rules are created, the entity data is associated with them, the association comprising a rule number, a rule name, a rule method, a processing method, a dimension and a description;
3. detailed data quality check, in which, before the detailed check, a quality check service is created, the quality rule methods are referenced and the service is run, after which the detailed data quality check is carried out, and after the check the problematic data is extracted into a problem library;
4. generating quality reports, in which a data quality report is formed for each data provider according to the results of the detailed data quality check and the assigned values and calculation methods set in the data audit rules;
5. problem data feedback and rectification, in which, after a quality report is generated, it can be assigned to the corresponding data provider so that the provider can review the problem data and rectify it;
6. data statistics display, which visually presents each data provider's check dimensions, check results, quality report assignment status and problem data rectification status, giving an overall view of the data quality audit;
7. monitoring and maintenance, in which all services configured for data quality auditing are brought under monitoring management, and if an error warning occurs, the problem is investigated and fixed promptly to keep the data quality audit running normally.
7. A governance method based on government data according to claim 3, wherein: in step 4), the multi-source data fusion method is as follows:
1. selecting the tables to be fused, in which the first step of multi-source data fusion is to select the tables to be fused or the tables of the data sources in the database;
2. establishing a data fusion model, in which the association relationships and association fields between the data source tables are first analyzed so that a new data model can be worked out;
3. mapping the source tables to the data fusion model, in which, using the analyzed association relationships and association fields, each data source table is mapped to the data fusion model according to the new data organization logic, so as to strengthen the internal associations of the data;
4. setting data synchronization rules, in which the synchronization rules between the data in the data source tables and the data fusion model are analyzed, defined and configured in the system, for example whether a record deleted from a data source table is also deleted synchronously in the data fusion model;
5. data fusion, in which the related services are run according to the configured methods and rules to complete the fusion of the multi-source data, finally forming a new table;
6. monitoring and maintenance, in which all services configured for multi-source data fusion are brought under monitoring management, and if an error warning occurs, the problem is investigated and fixed promptly to keep the multi-source data fusion running normally.
8. A governance method based on government data according to claim 3, wherein: in step 5), the data cleaning method is as follows:
1. creating data cleaning rules, which is a prerequisite for data cleaning, in which standardized cleaning can be carried out with reference to the data element standard and the corresponding cleaning rules configured, or the cleaning rules can be worked out from specific business requirements and then configured;
2. associating entity data with the data cleaning rules, in which, after the data cleaning rules are created, the entity data is associated with them, the association comprising a rule number, a rule name, a rule method, a processing method, a dimension and a description;
3. data quality cleaning check, in which, before the cleaning check, a data cleaning check service is created, the data cleaning rules and methods are referenced and the service is run, after which the data quality cleaning check is carried out;
4. problem data processing, in which the problem data found by the check is processed and dirty data is replaced with high-quality, usable data;
5. dirty data reflow, in which, once the problem data has been processed, it flows back into the standard library to form a closed loop;
6. data statistics display, which visually presents each data provider's cleaning dimensions, cleaning status and problem data processing status, giving an overall view of the data cleaning;
7. monitoring and maintenance, in which all services configured for data cleaning are brought under monitoring management, and if an error warning occurs, the problem is investigated and fixed promptly to keep the data cleaning running normally.
9. A governance method based on government data according to claim 3, wherein in step 6), the flow of data desensitization is as follows:
1. creating data desensitization rules, which is a prerequisite for data desensitization, the rules comprising data encryption and decryption, data fuzzification, data masking replacement, numeric desensitization and data shuffling;
2. associating entity data with the data desensitization rules, in which, after the desensitization rules are created, the entity data is associated with them, the association comprising a rule number, a rule name, a rule method and a processing method;
3. data desensitization, in which, before desensitization, a data desensitization service is created, the desensitization rules and methods are referenced and the service is run, after which the data desensitization is carried out;
4. storing the desensitized data, in which, after desensitization, the desensitized data is extracted into a service library to facilitate data applications;
5. monitoring and maintenance, in which all services configured for data desensitization are brought under monitoring management, and if an error warning occurs, the problem is investigated and fixed promptly to keep the data desensitization running normally.
CN202211382916.9A 2022-11-07 2022-11-07 Government affair data-based governance model and method Pending CN115700495A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211382916.9A CN115700495A (en) 2022-11-07 2022-11-07 Government affair data-based governance model and method


Publications (1)

Publication Number Publication Date
CN115700495A true CN115700495A (en) 2023-02-07

Family

ID=85121023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211382916.9A Pending CN115700495A (en) 2022-11-07 2022-11-07 Government affair data-based governance model and method

Country Status (1)

Country Link
CN (1) CN115700495A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117892849A (en) * 2023-11-17 2024-04-16 深圳市销邦科技股份有限公司 E-government big data processing optimization method and system


Similar Documents

Publication Publication Date Title
CN110781236A (en) Method for constructing government affair big data management system
KR100815628B1 (en) System and method for electronically managing discovery pleading information
CN112231315A (en) Data management method based on big data
CN111190881A (en) Data management method and system
US20130166515A1 (en) Generating validation rules for a data report based on profiling the data report in a data processing tool
CN110851667B (en) Integration analysis method and tool for large amount of data of multiple sources
CN113592680A (en) Service platform based on regional education big data
CN109784721B (en) Employment data analysis and data mining analysis platform system
CN112199433A (en) Data management system for city-level data middling station
US8458178B2 (en) Dimensional data explorer
CN111680153A (en) Big data authentication method and system based on knowledge graph
Hikmawati et al. Improving Data Quality and Data Governance Using Master Data Management: A Review
CN113722301A (en) Big data processing method, device and system based on education information and storage medium
CN110751361A (en) Bank demand item level management method and system
Baškarada How spreadsheet applications affect information quality
CN110555675A (en) Method for realizing real-time online supervision
CN112506892A (en) Index traceability management system based on metadata technology
CN116384889A (en) Intelligent analysis method for information big data based on natural language processing technology
CN110555676A (en) Dynamic supervision platform system implementation method
CN115700495A (en) Government affair data-based governance model and method
CN110807583A (en) Configurable ERP role authority verification system and method based on RBAC
CN113342786A (en) Model management and control-based online data management and management method and system
CN116701358B (en) Data processing method and system
CN113407161A (en) Complex equipment-oriented collaborative research and development management system
CN116228402A (en) Financial credit investigation feature warehouse technical support system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination