CN115617776A - Data management system and method - Google Patents

Publication number
CN115617776A
Authority
CN
China
Prior art keywords
data
management
database
unit
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211217657.4A
Other languages
Chinese (zh)
Inventor
马剑林
沈忱
吴志锋
刘涛
刘昊
陈莎
宋雪峰
曹文琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Oil and Gas Pipeline Network Corp
National Pipeline Network Southwest Pipeline Co Ltd
Original Assignee
China Oil and Gas Pipeline Network Corp
National Pipeline Network Southwest Pipeline Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Oil and Gas Pipeline Network Corp and National Pipeline Network Southwest Pipeline Co Ltd
Priority to CN202211217657.4A
Publication of CN115617776A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification

Abstract

The invention provides a data management system and method, belonging to the field of data management. The system comprises a data source management module, a database management module, a data model management module, a data standard management module and a data quality management module. The data source management module is used for acquiring, from a plurality of preset service systems, a plurality of initial metadata of each preset service system and the corresponding data types, and for constructing a to-be-processed database from the initial metadata and the data types; the database management module is used for updating the to-be-processed database to obtain an updated database; and the data model management module is used for constructing a target model according to the updated database. The invention addresses the problems of low online availability of data, low structuring rate and multiple sources for the same data that arise as the data volume of business systems grows, establishes a unified data asset management standard, and lays a sound data management foundation for constructing an intelligent pipeline.

Description

Data management system and method
Technical Field
The invention mainly relates to the technical field of data management, in particular to a data management system and a data management method.
Background
At present, as the data volume of business systems keeps growing, many companies have not established a unified data asset management standard covering data acquisition, unified data storage, data application and data management. As a result they suffer from low online availability of data, a low structuring rate, multiple sources for the same data and similar problems, and cannot meet the requirements for constructing a pipeline digital twin, an intelligent pipeline and an intelligent pipeline network. Unifying data sources and eliminating data islands by constructing a unified data standard system has therefore become a key step in converting the traditional pipeline management mode into an intelligent management mode.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a data management system and method for overcoming the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a data management system, comprising: a data source management module, a database management module, a data model management module, a data standard management module and a data quality management module,
the data source management module is used for obtaining a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems, and constructing a to-be-processed database of each preset service system through the plurality of initial metadata of each preset service system and the data type corresponding to each initial metadata;
the database management module is used for respectively updating the databases to be processed to obtain updated databases of the preset service systems;
the data model management module is used for respectively constructing a target model of each preset service system according to each updated database;
the data standard management module is used for importing standard management information and updating the standard management information to obtain updated standard management information;
and the data quality management module is used for carrying out inspection analysis on all target models according to the updated standard management information to obtain a data quality inspection result report.
Another technical solution of the present invention for solving the above technical problems is as follows: a data management method, comprising the steps of:
acquiring a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems;
constructing a to-be-processed database of each preset service system through a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata;
updating each to-be-processed database respectively to obtain an updated database of each preset service system;
respectively constructing a target model of each preset service system according to each updated database;
importing standard management information, and updating the standard management information to obtain updated standard management information;
and checking and analyzing all target models according to the updated standard management information to obtain a data quality checking result report.
The invention has the following beneficial effects: a to-be-processed database is constructed from the plurality of initial metadata and data types of each preset service system; the to-be-processed database is updated to obtain an updated database; a target model is constructed according to the updated database; the standard management information is updated to obtain updated standard management information; and all target models are checked and analyzed according to the updated standard management information to obtain a data quality check result report.
Drawings
FIG. 1 is a block diagram of a data management system according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a data management method according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a block diagram of a data management system according to an embodiment of the present invention.
As shown in fig. 1, a data management system includes: a data source management module, a database management module, a data model management module, a data standard management module and a data quality management module,
the data source management module is used for obtaining a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems, and constructing a to-be-processed database of each preset service system through the plurality of initial metadata of each preset service system and the data type corresponding to each initial metadata;
the database management module is used for respectively updating the databases to be processed to obtain updated databases of the preset service systems;
the data model management module is used for respectively constructing a target model of each preset service system according to each updated database;
the data standard management module is used for importing standard management information and updating the standard management information to obtain updated standard management information;
and the data quality management module is used for carrying out inspection analysis on all target models according to the updated standard management information to obtain a data quality inspection result report.
Preferably, the preset service system may be an operation management system, a production operation system, a comprehensive support system, or other information systems.
In this embodiment, the to-be-processed database is constructed from the plurality of initial metadata and data types of each preset service system; the to-be-processed database is updated to obtain an updated database; the target model is constructed according to the updated database; the standard management information is updated to obtain updated standard management information; and all target models are inspected and analyzed according to the updated standard management information to obtain the data quality inspection result report.
Optionally, as an embodiment of the present invention, the data source management module includes a data source collection unit, a database distinguishing unit, a first data conversion unit, a second data conversion unit, a third data conversion unit, and a database integration unit,
the data source acquisition unit is used for acquiring a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems;
the database distinguishing unit is configured to calculate data capacities of a plurality of initial metadata of each preset service system, obtain data capacities of each preset service system, and respectively judge whether each preset service system satisfies a condition that data types corresponding to all the initial metadata in the preset service system are any data type of the plurality of preset data types, and the data capacity of the preset service system is greater than or equal to the preset data capacity; if so, creating a first database, and storing all the initial metadata in the preset service system meeting the condition into the first database; if not, creating a second database, and storing all the initial metadata in the preset service system which does not meet the condition into the second database;
the first data conversion unit is configured to perform format conversion on the initial metadata of which the data type is a first preset distinguishing data type in the first database or the second database, and convert the initial metadata into first converted metadata so as to update the first database or the second database;
the second data conversion unit is configured to perform format conversion on the initial metadata of which the data type is a second preset distinguishing data type in the first database or the second database, and convert the initial metadata into second converted metadata to update the first database or the second database;
the third data conversion unit is configured to perform format conversion on the initial metadata of which the data type is a third preset distinguishing data type in the first database or the second database, and convert the initial metadata into third converted metadata to update the first database or the second database;
the database integration unit is configured to take, for each preset service system, the first database or the second database as updated by the first data conversion unit, the second data conversion unit and the third data conversion unit as the to-be-processed database of that preset service system.
Preferably, the first database may be a relational database and the second database may be a non-relational database; the first preset distinguishing data type may be a text file and/or an XML file, the second preset distinguishing data type may be an Excel file, and the third preset distinguishing data type may be a PDM file and/or an XML file.
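The decision made by the database distinguishing unit can be illustrated with a minimal Python sketch. The set of preset data types, the preset data capacity and the names Metadata and choose_database are assumptions introduced for illustration only; the source does not specify them.

```python
from dataclasses import dataclass

# Hypothetical values: the source names neither the preset data types
# nor the preset data capacity.
PRESET_DATA_TYPES = {"text", "xml", "excel", "pdm"}
PRESET_CAPACITY_BYTES = 1_000_000


@dataclass
class Metadata:
    name: str
    data_type: str
    size_bytes: int


def choose_database(items: list[Metadata]) -> str:
    """Return "first" (e.g. relational) or "second" (e.g. non-relational)."""
    # Data capacity of the service system: sum over its initial metadata.
    total = sum(m.size_bytes for m in items)
    # Every initial metadata must have one of the preset data types.
    all_types_preset = all(m.data_type in PRESET_DATA_TYPES for m in items)
    if all_types_preset and total >= PRESET_CAPACITY_BYTES:
        return "first"
    return "second"
```

In this sketch a service system is routed to the first database only when both conditions hold at once; failing either one sends all of its initial metadata to the second database, as the embodiment describes.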
It should be understood that the data source management module allows the data sources to be managed to be defined and collected by service system (i.e. the preset service system), by management department and the like. It mainly comprises data source support, data source exploration and physical data modeling.
Specifically, based on database acquisition adapter technology, the data source management module supports accessing data from relational databases such as Oracle, MySQL and SQL Server, and from non-relational databases such as HBase and MongoDB. Importing data from PDM model relationships is also supported.
Specifically, a text file or an XML file (i.e., the first preset distinguishing data type) filled in according to a template is imported through an interface and parsed according to the parsing strategy preset in the data source management module, so that the interface file is serialized.
It should be understood that the data source management module presets a parsing strategy and parses an imported Excel file (i.e. the second preset distinguishing data type) filled in according to a template, so that the interface file is serialized. The Excel acquisition adapter only needs to support the formulated templates; if the existing templates cannot meet the service requirement or the format does not conform, a new template can be formulated through the data source management module (namely the second data conversion unit) to parse and acquire the Excel file.
Specifically, a PDM file or an XML file (i.e., the third preset distinguishing data type) that stores a data model can be parsed and acquired using existing model parsing technology, which provides the ability to parse the entities, the entity attributes and the relationships between entities in the data model.
In this embodiment, the to-be-processed database is constructed from the plurality of initial metadata and data types of each preset service system, importing data from PDM model relationships is supported, and the data formats are unified, thereby realizing management of the data sources.
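The idea of parsing each file type with its own preset strategy can be sketched as a dispatch table in Python. The template layouts assumed here (a comma-separated header line for text files, and a records/record element layout for XML files) are hypothetical, as are the function names; the source does not define the templates. Excel and PDM parsers are omitted because they would need libraries or formats the source does not name.

```python
import xml.etree.ElementTree as ET


def parse_text(content: str) -> list[dict]:
    # Assumed template: first non-empty line is a comma-separated header,
    # every following line is one record.
    lines = [ln for ln in content.splitlines() if ln.strip()]
    if not lines:
        return []
    header = [h.strip() for h in lines[0].split(",")]
    return [dict(zip(header, (v.strip() for v in ln.split(","))))
            for ln in lines[1:]]


def parse_xml(content: str) -> list[dict]:
    # Assumed template: <records><record><field>value</field>...</record></records>
    root = ET.fromstring(content)
    return [{child.tag: (child.text or "").strip() for child in record}
            for record in root.findall("record")]


# One parsing strategy per data type; new templates register a new entry.
PARSERS = {"text": parse_text, "xml": parse_xml}


def convert(data_type: str, content: str) -> list[dict]:
    """Convert file content of the given data type into uniform records."""
    try:
        parser = PARSERS[data_type]
    except KeyError:
        raise ValueError(f"no parser registered for data type {data_type!r}") from None
    return parser(content)
```

Both parsers return the same list-of-dictionaries shape, which is one way to read the statement that the formats of the data are unified before storage.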
Optionally, as an embodiment of the present invention, the metadata to be processed in the database to be processed includes a plurality of attribute information, and the metadata to be processed is any one of the initial metadata, the first converted metadata, the second converted metadata, and the third converted metadata; the database management module comprises an attribute management unit and a database association unit;
the attribute management unit is used for importing a plurality of department terminals which are in one-to-one correspondence with the attribute information, respectively sending the attribute information to the corresponding department terminals, respectively obtaining updated attribute information from the corresponding department terminals, respectively updating the corresponding metadata to be processed according to the updated attribute information corresponding to the metadata to be processed to obtain updated metadata corresponding to the metadata to be processed, and respectively updating the corresponding databases to be processed according to the updated metadata corresponding to the databases to be processed to obtain updated databases to be processed;
the database association unit is used for associating each updated database to be processed with at least one of the remaining updated databases to be processed respectively to obtain an associated database of each preset service system, and taking the associated database as an updated database;
the database management module also comprises any one unit or a combination of a plurality of units in a catalog generation unit, a data view generation unit, a business system management unit and an asset map generation unit;
the catalog generation unit is used for generating a data catalog table according to all the department terminals and all the updated attribute information of all the databases to be processed;
the data view generating unit is used for respectively performing visualization processing on each updated metadata to obtain the visualized metadata of each updated metadata, and accessing and/or querying each visualized metadata;
the service system management unit is used for adding, modifying and/or deleting each preset service system;
and the asset map generating unit is used for drawing maps of all the updated databases to obtain the asset map.
It should be understood that the remaining updated databases to be processed may be understood as all updated databases to be processed, except the updated database to be processed of the current preset service system.
It should be understood that after each attribute information is sent to the corresponding department terminal, the department terminal adds or modifies the corresponding attribute information, so as to obtain the updated attribute information.
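The round trip between the attribute management unit and the department terminals can be sketched as follows. Modelling a department terminal as a plain Python function is an assumption made for illustration; the source does not describe how terminals are reached or what they return.

```python
from typing import Callable

# Hypothetical stand-in: a "department terminal" receives the current value
# of the attribute it owns and returns the added or modified value.
Terminal = Callable[[str], str]


def update_metadata(metadata: dict[str, str],
                    terminals: dict[str, Terminal]) -> dict[str, str]:
    """Send each attribute to its terminal and collect the updated metadata."""
    updated = dict(metadata)  # leave the to-be-processed metadata untouched
    for attribute, value in metadata.items():
        terminal = terminals.get(attribute)
        if terminal is not None:  # attributes without a terminal stay as they are
            updated[attribute] = terminal(value)
    return updated
```

The one-to-one correspondence between attribute information and department terminals is represented here by keying the terminals on the attribute name.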
It should be understood that data asset management focuses on building a data asset management system. Through data asset management, data standard management and data processing can be organically combined to form data standards and to provide metadata descriptions of specific resource data. Using standardized data interfaces and a rich set of chart display tools, various data asset applications can be customized quickly; together with comprehensive evaluation of the data assets, this gradually improves both the standardization of the data assets and the ability to open them up and apply them.
Specifically, data asset management based on an all-round metadata profile manages and monitors the full life cycle of data, makes the full-process record traceable, and provides panoramic asset visualization. It offers a full-scene view of the data assets that meets the application scenarios of different users, including managers responsible for global planning, users concerned with detailed definitions, and developers responsible for processing and operation and maintenance. It provides multi-level graphical display to support graphical query and auxiliary analysis in these application scenarios.
A big data asset management system is built on data catalog technology, and data assets are managed on that basis. This realizes a "logically unified, physically dispersed" data management mode suited to the current situation in the field: data entering the data resource pool are classified and labeled with dimension attributes, so that the data can be traced back to their origin, which facilitates use and statistical analysis. The metadata attributes managed by the data catalog may fall into nine classes, as shown in Table one (metadata attribute classification).
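Table one:
[Table one (metadata attribute classification) appears only as an image in the original publication: Figure BDA0003873962040000081]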
It should be understood that the attribute management unit configures the affiliation between a department (i.e. the department terminal) and the service systems it manages, based on the department hierarchy configured in department management. This reflects the management level of each department and the service systems each department is responsible for. Service systems and their subordinate departments (i.e. the department terminals) can be added and modified. After the departments are configured, the business items of all departments are managed in a unified way through the organization business list, which mainly covers the responsibilities of each business item, the service objects, the materials required by the business item, the information systems that support the business, and the like. Business items can be exported in standard Excel format. Information list management provides unified management of the information resource requirements of all departments, and mainly covers information such as information resource names, source departments, acquisition forms and data update frequency.
It should be understood that the catalog generation unit can configure the information of the systems belonging to a department (i.e., the department terminal), reflecting how each department (i.e., the department terminal) manages its systems. The relevant functions are briefly described below.
It should be understood that the data view generating unit mainly provides access for viewing metadata information such as tables, fields and the like in the business system, and provides data sampling query, so that users can understand data conveniently.
Specifically, the service system management unit may be understood as performing addition, modification, and deletion operations on the service domain related information on the platform according to the service domain definition, and providing the result to other modules for use.
Specifically, the database association unit provides a data lineage (blood relationship) analysis function for tracking and tracing data: a designated table (i.e., an updated database to be processed) is selected and data tracking and tracing are performed on it, so as to analyze the sources of its data.
Specifically, the asset map generation unit provides panoramic management along the application scenario dimension. Based on the full-scene view of the metadata assets of each service system, it provides multi-level graphical display for managers responsible for global planning, users concerned with detailed definitions, and developers responsible for processing and operation and maintenance, and meets the needs of graphical query and auxiliary analysis in each application scenario.
It should be understood that the asset map allows the collected and stored metadata to be viewed from different perspectives (system dimension, subject domain dimension, service tags), supports displaying the metadata and their association relationships, and allows the metadata and association relationships at each level to be expanded, drilled down into or traced back layer by layer. For example, the physical attributes of cable anti-corrosion materials can be reached by drilling down through pipeline integrity management, line integrity and cables. The metadata can also be searched by entry, with support for advanced search and intelligent recommendation of results.
In this embodiment, the to-be-processed database is updated to obtain the updated database, which provides metadata descriptions of specific resource data and allows various data asset applications to be customized quickly. Together with comprehensive evaluation of the data assets, this gradually improves the standardization of the data assets and the ability to open them up and apply them. It also meets the application scenario requirements of different users, supports graphical query and auxiliary analysis of those scenarios, and provides multi-level graphical display.
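Tracing the sources of a designated table can be sketched as a breadth-first walk over the associations between tables. The representation of the associations as a dictionary from a table to its direct source tables, and the name trace_upstream, are assumptions for illustration; the source does not state how associations are stored.

```python
from collections import deque


def trace_upstream(table: str, sources: dict[str, list[str]]) -> list[str]:
    """Return every table the designated table derives from, nearest first."""
    seen = {table}
    order: list[str] = []
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for parent in sources.get(current, []):
            if parent not in seen:  # guards against shared or cyclic sources
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order
```

Reversing the dictionary (source table to derived tables) and running the same walk would give the downstream direction, which is one way to support impact analysis.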
Optionally, as an embodiment of the present invention, the data model management module includes a logical model management unit, a physical model management unit, and a mapping relationship management unit,
the logic model management unit is used for correspondingly constructing an original logic model of each preset service system through each updated database, and modifying each original logic model to obtain a target logic model of each preset service system;
the physical model management unit is used for correspondingly constructing an original physical model of each preset service system through each updated database, and modifying each original physical model to obtain a target physical model of each preset service system;
the mapping relationship management unit is configured to perform mapping processing on the target logic model and the target physical model of each preset service system, respectively, to obtain a target model of each preset service system.
It should be understood that the modifications may be creation, editing, deletion, etc. of the subject field.
It should be understood that metadata management is provided to define and store the metadata model, to package the metadata into various metadata functions at the functional layer, and finally to provide applications and presentation externally. Metadata classification and modeling as well as data lineage (blood relationship) analysis and impact analysis are provided, which facilitates data tracking and backtracking.
Specifically, a data model is an abstraction of real-world data features; it describes the concepts and definitions of a set of data. The data model determines how data are stored in a database and is the basis of a database system. In a database, the physical structure of data, also called the storage structure, is the representation and arrangement of data elements in computer storage; the logical structure of data refers to the logical relationships between data elements. Through the mapping relationship between the logical model and the physical model, the information corresponding to a data attribute of the conceptual model can be conveniently queried across a plurality of physical models. In addition, the data quality and the time at which data are generated by the business process can be combined and used by the data warehouse to determine data fusion rules.
It should be understood that the logical model management unit provides a data modeling function: it defines data models and the relationships between them visually, and supports the design of tables, fields, relationships, views, indexes and partitions. It provides model classification management, including creation, editing and deletion of subject domains. It supports displaying the data model as a visual E-R diagram. It also supports defining the logical model (namely the target logical model) by Excel input.
Specifically, the physical model management unit provides a data modeling function: it defines data models and the relationships between them visually, and supports the design of tables, fields, relationships, views, indexes and partitions. It provides model classification management, including creation, editing and deletion of subject domains. It supports displaying the data model as a visual E-R diagram. It supports reverse modeling of data. A created data model can be materialized as the corresponding table structure according to the selected database type.
It should be understood that the mapping relationship management unit provides a mapping relationship definition between a physical model (i.e. the target physical model) and a logical model (i.e. the target logical model), so as to embody a mapping relationship (field level) between a business system and a logical model evolved from a standard, and provides a corresponding data interface for a data standard module (i.e. the data standard management module) and the data quality management module, so that a data standard can generate a data quality detection rule according to the mapping relationship.
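How a field-level mapping lets a data standard produce data quality detection rules can be sketched in Python. The LogicalField attributes (required, max_length), the rule text and the name rules_for are hypothetical; the source does not say what constraints a standard carries or how rules are expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalField:
    """A field of the logical model, carrying constraints from the standard."""
    name: str
    required: bool
    max_length: int


def rules_for(mapping: dict[str, str],
              logical: dict[str, LogicalField]) -> list[str]:
    """mapping: physical "table.column" -> name of the logical field."""
    rules = []
    for column, field_name in sorted(mapping.items()):
        field = logical[field_name]
        if field.required:
            rules.append(f"{column} IS NOT NULL")
        rules.append(f"LENGTH({column}) <= {field.max_length}")
    return rules
```

Because the constraints live on the logical field, every physical column mapped to that field receives the same rules, regardless of which service system the column belongs to.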
In this embodiment, the target model is constructed according to the updated database, which facilitates data tracking and backtracking, and data quality detection rules can be generated from the data standard according to the mapping relationship.
Optionally, as an embodiment of the present invention, the standard management information includes original standard system information, original standard file information, and original standard specification information, the data standard management module includes a standard specification management unit, a standard system management unit, and a standard file management unit,
the standard system management unit is used for importing the original standard system information, and adding, modifying and/or deleting the original standard system information to obtain updated standard system information;
the standard file management unit is used for importing the original standard file information, and adding, modifying and/or deleting the original standard file information to obtain updated standard file information;
the standard specification management unit is used for importing the original standard specification information, adding, modifying and/or deleting the original standard specification information to obtain updated standard specification information, and taking the updated standard system information, the updated standard file information and the updated standard specification information as updated standard management information.
It should be understood that the standard system management unit is used for adding the corresponding standard system information under a first-level service domain directory, so that a user can formulate different standard systems by service domain category and search for a standard name by the service domain to which the standard belongs, thereby adding, modifying and deleting standard systems (i.e., the original standard system information).
Specifically, the standard file management unit is used for adding the corresponding standard file information under the standard system directory of a first-level service domain, so that a user can formulate different standard files by standard system category and search for a standard name by the service domain to which the standard belongs, thereby adding, modifying and deleting standard files (i.e., the original standard file information).
It should be understood that the standard specification management unit is used for adding the corresponding standard specification information under a standard file, so that a user can formulate different standard specifications by standard file category and search for a standard name by the service domain to which the standard belongs, thereby adding, modifying and deleting standard specifications (i.e., the original standard specification information).
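The nesting described above (service domain, then standard system, then standard file, then standard specification) can be sketched as a small in-memory registry. The class name StandardRegistry and its methods are hypothetical and only illustrate the hierarchy; the source does not describe a storage structure.

```python
class StandardRegistry:
    """Service domain -> standard system -> standard file -> specifications."""

    def __init__(self) -> None:
        self._tree: dict[str, dict[str, dict[str, list[str]]]] = {}

    def add_spec(self, domain: str, system: str, file: str, spec: str) -> None:
        # Missing levels of the hierarchy are created on demand.
        specs = (self._tree.setdefault(domain, {})
                 .setdefault(system, {})
                 .setdefault(file, []))
        if spec not in specs:
            specs.append(spec)

    def delete_spec(self, domain: str, system: str, file: str, spec: str) -> None:
        self._tree[domain][system][file].remove(spec)

    def search(self, domain: str) -> list[tuple[str, str, str]]:
        """List (system, file, specification) for one service domain."""
        return [(system, file, spec)
                for system, files in self._tree.get(domain, {}).items()
                for file, specs in files.items()
                for spec in specs]
```

Searching by service domain mirrors the text, where standards are looked up by the service domain to which they belong.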
In the embodiment, the standard management information is updated to obtain the updated standard management information, so that the standard management information is conveniently set in a user-defined manner.
Optionally, as an embodiment of the present invention, the data quality management module includes a quality rule management unit, a quality check unit, a data modification unit and a report generation unit,
the quality rule management unit is used for checking the updated standard management information and establishing a quality check rule through the updated standard management information;
the quality inspection unit is used for respectively performing quality inspection on each target model according to the quality inspection rule to obtain an initial inspection result of each preset service system and initial inspection parameter information of each preset service system;
the data modification unit is used for respectively modifying the data of each target model to obtain modified target models of each preset service system, and respectively carrying out quality inspection on each modified target model again according to the quality inspection rules to obtain target inspection results of each preset service system;
the report generating unit is used for generating a quality inspection result report according to all the initial metadata, the initial inspection result, the modified target model and the target inspection result;
the data quality management module also comprises any one unit or a combination of a plurality of units of an inspection parameter management unit and an inspection result sending unit;
the inspection parameter management unit is used for sending all initial inspection parameter information to a preset server;
the checking result sending unit is used for sending all the initial checking results to the appointed terminal.
It should be understood that the data quality management module implements quality management over the full life cycle of the data: it can visually configure data quality inspection strategies according to the standard rules, perform data quality inspection through a scheduling center, find problem data, and dispatch the problem data to the relevant personnel of the owning system for correction. It can also generate data quality evaluation reports, problem processing reports and the like as needed.
Specifically, to reduce the impact on the databases of the information systems, the data quality inspection adopts a data flow inspection technique in which the data quality inspection methods and SQL run in an engine instead of relying on the database. The system provides preliminary detection and inspection of libraries and tables and data quality inspection during data processing, and also provides after-the-fact refined data quality inspection, screening, troubleshooting and tracking of problem data, data quality reports and problem data analysis, abnormal data processing, and the like.
It should be understood that the quality rule management unit is configured to view the corresponding standard value range and quality rule information under the standard specification (i.e., the updated standard management information), so that a user can conveniently define different standard value ranges and rule information under the standard specification (i.e., the updated standard management information).
In particular, the quality inspection unit can provide accurate data quality inspection, facilitating refined data quality analysis of a given table. It provides a data quality inspection service that performs specified rule checks on a database table (i.e., the target model), and provides logical expression checks, composite checks, a visual definition interface, and a data quality inspection method interface through which new data quality inspection methods can conveniently be added.
Specifically, the quality inspection unit performs specified rule checks on a database table (i.e., the target model), including format checks, range checks, missing record checks, precision checks, logical expression checks, composite rule checks, and the like. The data quality inspection service supports visual configuration of multiple rules on a single field, of the same rule on multiple fields, and of association checks among multiple fields.
The format checking rules include time format checking, numeric checking, identity card number checking, regular expression checking and the like, and different input interfaces are provided according to the characteristics of each rule.
The range checking rules include a not-in-table check, a not-in-custom-range check and the like, and a visual definition interface is provided.
The logical expression check comprises a logical check and a string logical check. The logical check sets a logical judgment rule among one or more fields, including equal to, not equal to, less than or equal to, greater than or equal to, greater than and less than or equal to, true, false, etc., and the comparison value of the logical check may come from different fields of the table being checked. The string logical check includes equals, in-list, contains, starts-with, ends-with, string length comparison, and the like. A visual definition interface is provided.
The composite quality check refers to the combination, by AND and OR, of a plurality of data quality inspection rules, and can be configured visually.
The data quality inspection rules are open: a data quality inspection method interface is provided, so that Java extensions can conveniently be used to add data quality inspection methods.
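As a non-limiting illustration, the format, range, logical expression and composite checks described above could be sketched as follows. The sketch is written in Python for brevity, whereas the embodiment mentions Java extension; all function names (`format_rule`, `in_set_rule`, `compare_rule`, `all_of`, `any_of`, `check_table`) and the row representation are assumptions made for this example and are not part of the embodiment.

```python
import operator
import re
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]
Rule = Callable[[Row], bool]  # a rule returns True when the row passes the check

# Comparison operators available to the logical check (hypothetical subset).
_COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def format_rule(field: str, pattern: str) -> Rule:
    """Format check: the field value must fully match a regular expression."""
    compiled = re.compile(pattern)
    return lambda row: compiled.fullmatch(str(row.get(field, ""))) is not None


def in_set_rule(field: str, allowed: Iterable[Any]) -> Rule:
    """Range check: a row fails when its value is not in the custom range."""
    allowed_values = frozenset(allowed)
    return lambda row: row.get(field) in allowed_values


def compare_rule(left: str, op: str, right: str) -> Rule:
    """Logical check: compare two fields of the same row."""
    compare = _COMPARATORS[op]
    return lambda row: compare(row[left], row[right])


def all_of(*rules: Rule) -> Rule:
    """Composite check: AND combination of several rules."""
    return lambda row: all(rule(row) for rule in rules)


def any_of(*rules: Rule) -> Rule:
    """Composite check: OR combination of several rules."""
    return lambda row: any(rule(row) for rule in rules)


def check_table(rows: List[Row], rule: Rule) -> List[int]:
    """Return the indexes of the rows that fail the rule (the problem data)."""
    return [index for index, row in enumerate(rows) if not rule(row)]
```

Because every rule is an ordinary callable, a new inspection method can be added in this sketch without changing `check_table`, which mirrors the open method interface described above.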
It should be understood that the inspection parameter management unit is used for configuring the inspection model defined by the quality inspection service, the execution time, the frequency, the execution node, and the like (i.e., the initial inspection parameter information), and for generating the inspection service as a service and sending it to the corresponding node server (i.e., the preset server).
It should be understood that the checking result sending unit is used for assigning the checked problem data to the corresponding owner service user to perform operations of problem data query, modification and the like.
Specifically, the data modification unit may perform quality inspection after the data has been processed, for example checking sums and field value parameters, and may correct and complete problematic data. In particular, based on the result of each run, the problems found in the problematic data of each service are checked and corrected, and after the correction is finished the result is sent back to the source, so that the problematic data is completed. Meanwhile, after a service system administrator (i.e., the designated terminal) receives an assigned data quality problem, the administrator rectifies the data problem in the service system according to the error information in the data quality report and the inspection rule until the rectification is complete, and the system feeds back the status of the data quality rectification.
It should be understood that, for each data quality inspection service, the report generating unit provides a data quality inspection result report (i.e., the quality inspection result report) that includes the abnormal data and a description of the rules by which the abnormal data was detected, and supports problem data statistics, modification status statistics and inspection rule statistics. It also displays the problem data and the cause of each problem, and provides assignment for modification, a modification interface, and trend analysis of changes in the problem data.
In the embodiment, the data quality inspection result report is obtained by inspecting and analyzing all the target models according to the updated standard management information, so that quality management over the full life cycle of the data is realized, data quality inspection strategies can be configured visually according to the standard rules, and the problems of a low data online rate, a low structuring rate, multiple sources for the same data and the like, which arise as the data volume of the business systems keeps accumulating, are solved.
Optionally, as an embodiment of the present invention, the data management system further includes a collection task scheduling unit, where the collection task scheduling unit is configured to:
importing a scheduling work task instruction, acquiring scheduling task starting time according to the scheduling work task instruction, and acquiring scheduling states of the initial metadata from a plurality of preset service systems;
respectively judging whether the scheduling state is a scheduling success state, if so, storing the scheduling state of the initial metadata; if not, generating a scheduling failure reason according to the scheduling state, and storing the scheduling failure reason;
and importing a scheduling work stopping instruction, acquiring scheduling task closing time according to the scheduling work stopping instruction, and calculating the difference between the scheduling task closing time and the scheduling task starting time to obtain the scheduling time.
It should be understood that the data source management module (i.e., the collection task scheduling unit) configures work tasks so that they are scheduled on a timed basis, and records the running condition of each work task, including scheduling success, scheduling failure, failure reason, scheduling time and other information, where the failure reason is fed back by the program.
In the above embodiment, the start time and the scheduling state of the scheduling task are obtained through the scheduling work task instruction, whether the scheduling state is the successful scheduling state is judged, the close time of the scheduling task is obtained through the scheduling work stop instruction, and the scheduling time is obtained through calculation according to the difference value of the close time of the scheduling task and the start time of the scheduling task, so that the timed scheduling of the work task is realized, and the running condition of the work task can be recorded.
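As a non-limiting illustration, the recording of a scheduled collection task described above could be sketched as follows. The class name `ScheduleRecord`, its fields and the state string `"success"` are assumptions made for this example; the embodiment does not specify how the scheduling state is represented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ScheduleRecord:
    """Running condition of one scheduled collection task (hypothetical layout)."""
    start_time: datetime                    # scheduling task starting time
    succeeded: Optional[bool] = None        # None until a scheduling state is recorded
    failure_reason: Optional[str] = None    # stored only when scheduling fails
    close_time: Optional[datetime] = None   # scheduling task closing time

    def record_state(self, state: str, detail: str = "") -> None:
        """Store success, or derive and store a failure reason from the state."""
        if state == "success":
            self.succeeded = True
            self.failure_reason = None
        else:
            self.succeeded = False
            self.failure_reason = detail or f"collection ended in state '{state}'"

    def stop(self, close_time: datetime) -> timedelta:
        """Scheduling time = closing time minus starting time."""
        self.close_time = close_time
        return close_time - self.start_time
```

A caller would create one record when the scheduling work task instruction is imported, call `record_state` once the scheduling state is known, and call `stop` when the scheduling work stopping instruction is imported.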
Optionally, as an embodiment of the present invention, the data management system further includes a system management module, the system management module includes any one or a combination of a plurality of units of a department terminal management unit, a user management unit, a role management unit, a menu management unit, an authority management unit and an attribute information management unit,
the department terminal management unit is used for adding department terminals and for modifying the terminal information and the membership relations of the department terminals;
the user management unit is used for creating a user and adding, modifying and deleting a login name, a password, a department, a post responsibility and a user group of the user;
the role management unit is used for creating a plurality of roles and distributing at least one role to the user;
the menu management unit is used for creating a menu and adding, modifying and deleting the menu;
the authority management unit is used for creating, modifying and deleting authorities and distributing the authorities to the users;
the attribute information management unit is used for adding, modifying and deleting the attribute information.
It should be understood that the department terminal management unit is based on configuration of the hierarchical membership relations among departments. After the configuration is completed, the departments (i.e., the department terminals) are displayed in a tree structure, and when the departments are adjusted, departments (i.e., department terminals) can conveniently be added and modified and their membership relations changed.
It should be understood that the user management unit is used for configuring and managing the login name, the affiliated department, the duty of the post, the affiliated user group and other related information of the user.
Specifically, a role is a set of permissions, and a user group is a set of users sharing the same characteristic. One user group can be assigned a plurality of roles, and the users belonging to the user group inherit the roles owned by the user group; in addition, roles or permissions can be assigned to a user individually. A super administrator account is built into the system: after this account logs in successfully, no permission check is performed on it and it holds all permissions of the system. The account is reserved, and operations such as modifying its password and enabling or disabling it are available. A super administrator user group is also built into the system, and users belonging to this user group hold all system permissions without permission checks.
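As a non-limiting illustration, the inheritance of roles from user groups and the super administrator bypass described above could be sketched as follows. The names `User`, `SUPER_ADMIN_GROUP`, `effective_permissions` and `is_allowed`, as well as the permission strings used with them, are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

SUPER_ADMIN_GROUP = "super_admin"  # hypothetical name of the built-in group


@dataclass
class User:
    name: str
    groups: Set[str] = field(default_factory=set)       # user groups the user belongs to
    roles: Set[str] = field(default_factory=set)        # roles assigned individually
    permissions: Set[str] = field(default_factory=set)  # permissions assigned individually
    is_super_admin: bool = False                        # the built-in super administrator account


def effective_permissions(user: User,
                          group_roles: Dict[str, Set[str]],
                          role_permissions: Dict[str, Set[str]]) -> Set[str]:
    """Union of individual permissions and the permissions of all roles,
    including the roles inherited from the user's groups."""
    roles = set(user.roles)
    for group in user.groups:
        roles |= group_roles.get(group, set())
    permissions = set(user.permissions)
    for role in roles:
        permissions |= role_permissions.get(role, set())
    return permissions


def is_allowed(user: User, permission: str,
               group_roles: Dict[str, Set[str]],
               role_permissions: Dict[str, Set[str]]) -> bool:
    """Super administrators skip the check; everyone else needs the permission."""
    if user.is_super_admin or SUPER_ADMIN_GROUP in user.groups:
        return True
    return permission in effective_permissions(user, group_roles, role_permissions)
```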
It should be understood that the menu management unit implements menu addition, modification, and deletion functions.
Specifically, authorities (permissions) are divided into data authority and function authority. Data authority refers to the range of data visible to a user, including metadata viewing authority, data model viewing authority, and data asset viewing authority. Function authority is divided into menu authority and operation authority, where menu authority means that each menu item corresponds to one authority, and operation authority covers each sub-item in a menu page, such as "create". Function authority needs to be subdivided down to a single operation, such as adding a user or deleting an organization, and the authority granularity should match the operating habits at the level of the business requirements and be neither too complicated nor too simple.
It should be understood that the attribute information management unit adds, modifies and deletes the data attribute definitions used by the system, which are used for maintaining the menu items and drop-down lists in the system.
In the embodiment, the user-defined management of the system is realized, and the flexibility of the management of the system is increased.
Optionally, as an embodiment of the present invention, the data management system further includes a comprehensive management module, the comprehensive management module includes any one or a combination of an identity authentication unit and a data retrieval unit,
the identity authentication unit is used for performing identity authentication through the login name and the password and storing the IP address, login time and accumulated login times of the user who successfully authenticates;
the data retrieval unit is used for retrieving all initial metadata, the attribute information, the newly added and/or modified attribute information and the updated metadata.
It should be understood that the data retrieval unit provides a data retrieval function on every page that presents data, so that a user can quickly find the required data content and obtain the related information.
Specifically, the system can be accessed only after logging in. The user logs in to the system with a user name and a password through the identity authentication unit. To protect the password, the system applies base64encode to the password combined with a random number, and records the IP address, the login time and the accumulated number of logins of each user who logs in successfully. Meanwhile, after a successful login, the system reads the user information (user ID, user login name, company name, and the like), the information of the organization to which the user belongs (organization ID, organization number, organization name, superior organization ID, organization number, organization name, and the like), the user group information (user group ID, user group code, user group name, and the like), and the permission information. The information read is stored in the SESSION, and all SESSION information is destroyed after the user logs out.
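As a non-limiting illustration, the login flow described above could be sketched as follows. The embodiment does not state how the password and the random number are combined, so the `nonce:password` layout, the names `encode_credential`, `decode_credential`, `LoginStats` and `AuthService`, and the in-memory stores are all assumptions made for this example. Base64 is a reversible encoding, so this sketch illustrates only the encoding and bookkeeping steps and is not a complete security design.

```python
import base64
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


def encode_credential(password: str, nonce: str) -> str:
    """Client side: Base64-encode the random number together with the password."""
    return base64.b64encode(f"{nonce}:{password}".encode("utf-8")).decode("ascii")


def decode_credential(token: str, nonce: str) -> Optional[str]:
    """Server side: recover the password, or None if the token is malformed."""
    try:
        raw = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers invalid Base64 and invalid UTF-8
        return None
    prefix = nonce + ":"
    return raw[len(prefix):] if raw.startswith(prefix) else None


@dataclass
class LoginStats:
    last_ip: str = ""
    last_login: Optional[datetime] = None
    login_count: int = 0  # accumulated number of successful logins


class AuthService:
    def __init__(self, passwords: Dict[str, str]) -> None:
        self._passwords = passwords            # plain-text store only to keep the sketch short
        self.stats: Dict[str, LoginStats] = {}
        self.sessions: Dict[str, dict] = {}    # stands in for the SESSION store

    def login(self, name: str, token: str, nonce: str,
              ip: str, now: datetime) -> Optional[str]:
        """Return a session id on success, None on failure."""
        password = decode_credential(token, nonce)
        stored = self._passwords.get(name)
        if password is None or stored is None:
            return None
        if not hmac.compare_digest(stored.encode("utf-8"), password.encode("utf-8")):
            return None
        stats = self.stats.setdefault(name, LoginStats())
        stats.last_ip, stats.last_login = ip, now
        stats.login_count += 1
        session_id = secrets.token_hex(16)
        self.sessions[session_id] = {"login_name": name}
        return session_id

    def logout(self, session_id: str) -> None:
        """Destroy all session information when the user logs out."""
        self.sessions.pop(session_id, None)
```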
In the above embodiment, the information security of the user using the system can be ensured, and meanwhile, the required data content is conveniently and quickly found, so that the user can conveniently and quickly obtain the relevant information.
Optionally, as an embodiment of the present invention, the system management module further includes a node management unit, where the node management unit is specifically configured to:
in a multi-server environment, allocating the node server on which each system task is executed, so as to achieve high availability of the system and avoid system performance bottlenecks caused by resource contention.
Optionally, as an embodiment of the present invention, the comprehensive management module further includes any one unit or a combination of a plurality of units of a notification announcement display unit, an examination and approval pending issue unit, a task statistics unit and an operation monitoring unit,
the notification announcement display unit is used for displaying a notification information page;
the examination and approval pending issue unit is used for the examination and approval process for publishing the data views of the system;
the task statistics unit is used for producing report statistics on the data inspection tasks, the handling of data quality problems, and the like;
the operation monitoring unit is used for providing a page for viewing the running state of the scheduling tasks in the system, so that a system administrator can keep track of the states of the currently executing scheduling tasks and the historical scheduling tasks.
Optionally, as another embodiment of the present invention, the present invention solves the problems that, as the data volume of the business systems keeps growing, the company has not established a unified data asset management specification covering data acquisition, unified data storage, data application and data management, and that the data online rate is low, the structuring rate is low, the same data has multiple sources, and the like. It implements the "data standard system" and the "data asset map" on a data asset management and control platform, thereby laying a sound data management foundation for the company to build an intelligent pipeline.
Optionally, as another embodiment of the present invention, in view of the platform construction target, the platform needs to have metadata collection and data quality inspection capabilities. Meanwhile, the standards in the standard system need to be embodied in the platform, and association relations need to be established between the standards and the logical model and the physical model.
Fig. 2 is a flowchart illustrating a data management method according to an embodiment of the present invention.
Optionally, as another embodiment of the present invention, as shown in fig. 2, a data management method includes the following steps:
acquiring a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems;
constructing a to-be-processed database of each preset service system through a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata;
respectively updating the to-be-processed databases to obtain updated databases of the preset service systems;
respectively constructing a target model of each preset service system according to each updated database;
importing standard management information, and updating the standard management information to obtain updated standard management information;
and checking and analyzing all target models according to the updated standard management information to obtain a data quality checking result report.
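As a non-limiting illustration, the order of the steps above could be sketched as follows. The function `run_data_management` and the stage callables passed to it (`collect`, `build_database`, `update_database`, `build_model`, `update_standards`, `inspect`) are hypothetical placeholders for the corresponding modules; the sketch shows only the data flow between the steps and not how any single step is implemented.

```python
from typing import Any, Callable, Dict, Iterable


def run_data_management(
    systems: Iterable[str],
    collect: Callable[[str], Any],           # initial metadata and data types of one system
    build_database: Callable[[Any], Any],    # to-be-processed database
    update_database: Callable[[Any], Any],   # updated database
    build_model: Callable[[Any], Any],       # target model
    update_standards: Callable[[], Any],     # updated standard management information
    inspect: Callable[[Dict[str, Any], Any], Any],  # data quality inspection result report
) -> Any:
    """Run the method steps in order and return the inspection result report."""
    models: Dict[str, Any] = {}
    for system in systems:
        metadata = collect(system)
        pending = build_database(metadata)
        updated = update_database(pending)
        models[system] = build_model(updated)
    standards = update_standards()
    return inspect(models, standards)
```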
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is only a logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part thereof that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A data management system, comprising: a data source management module, a database management module, a data model management module, a data standard management module and a data quality management module,
the data source management module is used for obtaining a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems, and constructing a to-be-processed database of each preset service system through the plurality of initial metadata of each preset service system and the data type corresponding to each initial metadata;
the database management module is used for respectively updating the databases to be processed to obtain updated databases of the preset service systems;
the data model management module is used for respectively constructing a target model of each preset service system according to each updated database;
the data standard management module is used for importing standard management information and updating the standard management information to obtain updated standard management information;
and the data quality management module is used for carrying out inspection analysis on all target models according to the updated standard management information to obtain a data quality inspection result report.
2. The data management system of claim 1, wherein the data source management module comprises a data source collection unit, a database discrimination unit, a first data conversion unit, a second data conversion unit, a third data conversion unit, and a database integration unit,
the data source acquisition unit is used for acquiring a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems;
the database distinguishing unit is configured to calculate data capacities of a plurality of initial metadata of each preset service system, obtain the data capacities of each preset service system, and respectively judge whether each preset service system satisfies a condition that data types corresponding to all the initial metadata in the preset service system are any one of a plurality of preset data types, and the data capacity of the preset service system is greater than or equal to the preset data capacity; if so, creating a first database, and storing all the initial metadata in the preset service system meeting the condition into the first database; if not, creating a second database, and storing all the initial metadata in the preset service system which does not meet the condition into the second database;
the first data conversion unit is configured to perform format conversion on the initial metadata of which the data type is a first preset distinguishing data type in the first database or the second database, and convert the initial metadata into first converted metadata so as to update the first database or the second database;
the second data conversion unit is configured to perform format conversion on the initial metadata of which the data type is a second preset distinguishing data type in the first database or the second database, and convert the initial metadata into second converted metadata to update the first database or the second database;
the third data conversion unit is configured to perform format conversion on the initial metadata of which the data type is a third preset distinguishing data type in the first database or the second database, and convert the initial metadata into third converted metadata to update the first database or the second database;
the database integration unit is configured to use the first database or the second database updated by the preset service systems through the first data conversion unit, the second data conversion unit, and the third data conversion unit as the to-be-processed database of each preset service system.
3. The data management system according to claim 2, wherein the metadata to be processed in the database to be processed each includes a plurality of attribute information, and the metadata to be processed is any one of the initial metadata, the first converted metadata, the second converted metadata, and the third converted metadata; the database management module comprises an attribute management unit and a database association unit;
the attribute management unit is used for importing a plurality of department terminals which are in one-to-one correspondence with the attribute information, respectively sending the attribute information to the corresponding department terminals, respectively obtaining updated attribute information from the corresponding department terminals, respectively updating the corresponding metadata to be processed according to the updated attribute information corresponding to the metadata to be processed to obtain updated metadata corresponding to the metadata to be processed, and respectively updating the corresponding databases to be processed according to the updated metadata corresponding to the databases to be processed to obtain updated databases to be processed;
the database association unit is used for associating each updated database to be processed with at least one of the remaining updated databases to be processed respectively to obtain an associated database of each preset service system, and taking the associated database as an updated database;
the database management module also comprises any one unit or a combination of a plurality of units in a catalog generation unit, a data view generation unit, a business system management unit and an asset map generation unit;
the catalog generation unit is used for generating a data catalog table according to all the department terminals and all the updated attribute information of all the databases to be processed;
the data view generating unit is used for respectively performing visualization processing on each updated metadata to obtain the visualized metadata of each updated metadata, and accessing and/or querying each visualized metadata;
the service system management unit is used for carrying out addition, modification and/or deletion on each preset service system;
and the asset map generating unit is used for drawing maps of all the updated databases to obtain the asset map.
4. The data management system of claim 1, wherein the data model management module comprises a logical model management unit, a physical model management unit, and a mapping relationship management unit,
the logic model management unit is used for correspondingly constructing an original logic model of each preset service system through each updated database, and modifying each original logic model to obtain a target logic model of each preset service system;
the physical model management unit is used for correspondingly constructing an original physical model of each preset service system through each updated database, and modifying each original physical model to obtain a target physical model of each preset service system;
the mapping relationship management unit is configured to perform mapping processing on the target logic model and the target physical model of each preset service system, respectively, to obtain a target model of each preset service system.
5. The data management system according to claim 1, wherein the standard management information includes original standard system information, original standard file information, and original standard specification information, the data standard management module includes a standard specification management unit, a standard system management unit, and a standard file management unit,
the standard system management unit is used for importing the original standard system information, and adding, modifying and/or deleting the original standard system information to obtain updated standard system information;
the standard file management unit is used for importing the original standard file information, and adding, modifying and/or deleting the original standard file information to obtain updated standard file information;
the standard specification management unit is used for importing the original standard specification information, adding, modifying and/or deleting the original standard specification information to obtain updated standard specification information, and taking the updated standard system information, the updated standard file information and the updated standard specification information as updated standard management information.
6. The data management system of claim 1, wherein the data quality management module comprises a quality rule management unit, a quality check unit, a data modification unit, and a report generation unit,
the quality rule management unit is used for checking the updated standard management information and establishing a quality inspection rule from the updated standard management information;
the quality check unit is used for performing quality inspection on each target model according to the quality inspection rule, to obtain an initial inspection result and initial inspection parameter information for each preset service system;
the data modification unit is used for modifying the data of each target model to obtain a modified target model of each preset service system, and performing quality inspection again on each modified target model according to the quality inspection rule to obtain a target inspection result of each preset service system;
the report generation unit is used for generating a quality inspection result report according to all the initial metadata, the initial inspection results, the modified target models and the target inspection results;
the data quality management module further comprises either or both of an inspection parameter management unit and an inspection result sending unit;
the inspection parameter management unit is used for sending all initial inspection parameter information to a preset server;
the inspection result sending unit is used for sending all initial inspection results to a specified terminal.
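The inspect, modify and re-inspect cycle of claim 6 can be sketched as follows. This is only an illustration under stated assumptions: the claim does not say what a quality inspection rule looks like, so the sketch assumes one type rule per standard field, and the names `build_rules`, `check` and `run_quality_cycle` are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Rule = Callable[[Record], bool]


def build_rules(standard: Dict[str, type]) -> Dict[str, Rule]:
    """Assumed rule form: each standard field must be present with the expected type."""
    return {
        name: (lambda rec, n=name, t=expected: isinstance(rec.get(n), t))
        for name, expected in standard.items()
    }


def check(model: List[Record], rules: Dict[str, Rule]) -> List[str]:
    """Return one finding per (row, rule) pair that fails."""
    return [
        f"row {i}: {name}"
        for i, rec in enumerate(model)
        for name, rule in rules.items()
        if not rule(rec)
    ]


def run_quality_cycle(
    model: List[Record],
    rules: Dict[str, Rule],
    repair: Callable[[Record], Record],
) -> Dict[str, object]:
    """Inspect, modify a copy of every record, inspect again, and collect a report."""
    initial = check(model, rules)
    modified = [repair(dict(rec)) for rec in model]  # copy so the input model is untouched
    target = check(modified, rules)
    return {
        "initial_result": initial,
        "modified_model": modified,
        "target_result": target,
    }
```

A caller would supply its own `repair` function, for example one that converts a numeric string to an integer, and compare `initial_result` with `target_result` in the returned report.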
7. The data management system of claim 2, further comprising a collection task scheduling unit, the collection task scheduling unit configured to:
import a scheduling work task instruction, acquire a scheduling task starting time according to the scheduling work task instruction, and acquire the scheduling state of the initial metadata from each of the plurality of preset service systems;
judge, for each scheduling state, whether it is a scheduling success state; if so, store the scheduling state of the initial metadata; if not, generate a scheduling failure reason according to the scheduling state and store the scheduling failure reason;
and import a scheduling work stopping instruction, acquire a scheduling task closing time according to the scheduling work stopping instruction, and calculate the difference between the scheduling task closing time and the scheduling task starting time to obtain the scheduling duration.
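The bookkeeping described in claim 7 can be illustrated with a small sketch. The class name `CollectionRun`, the `"success"` state string and the wording of the failure reason are assumptions made for illustration; the claim only requires that successful states and failure reasons be stored and that the elapsed time be the closing time minus the starting time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class CollectionRun:
    """Tracks one scheduled metadata collection run (hypothetical structure)."""

    started_at: datetime
    states: Dict[str, str] = field(default_factory=dict)
    failure_reasons: Dict[str, str] = field(default_factory=dict)
    stopped_at: Optional[datetime] = None

    def record(self, system: str, state: str) -> None:
        # Store the state on success; otherwise derive and store a failure reason.
        if state == "success":
            self.states[system] = state
        else:
            self.failure_reasons[system] = (
                f"collection from {system} ended in state '{state}'"
            )

    def stop(self, stopped_at: datetime) -> timedelta:
        # Elapsed scheduling time = closing time - starting time.
        self.stopped_at = stopped_at
        return stopped_at - self.started_at
```

Keeping successes and failure reasons in separate mappings mirrors the two branches of the claim's success test.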
8. The data management system according to claim 3, further comprising a system management module including any one or a combination of a plurality of units of a department terminal management unit, a user management unit, a role management unit, a menu management unit, a right management unit, and an attribute information management unit,
the department terminal management unit is used for adding department terminals and modifying their terminal information and membership relations;
the user management unit is used for creating a user and adding, modifying and deleting the login name, password, department, job responsibilities and user group of the user;
the role management unit is used for creating a plurality of roles and distributing at least one role to the user;
the menu management unit is used for creating a menu and adding, modifying and deleting the menu;
the authority management unit is used for creating, modifying and deleting authorities and distributing the authorities to the users;
the attribute information management unit is used for adding, modifying and deleting the attribute information.
9. The data management system of claim 8, further comprising a comprehensive management module comprising any one or a combination of units of an identity verification unit and a data retrieval unit,
the identity verification unit is used for performing identity verification through the login name and the password, and storing the IP address, login time and cumulative number of logins of each user who is successfully verified;
the data retrieval unit is used for retrieving all initial metadata, the attribute information, the newly added and/or modified attribute information and the updated metadata.
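The identity verification step of claim 9 can be sketched as below. The claim does not say how passwords are stored or compared; salted PBKDF2 hashing and a constant-time comparison are assumptions added here, and `Account`, `register` and `authenticate` are hypothetical names.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class Account:
    salt: bytes
    password_hash: bytes
    last_ip: Optional[str] = None
    last_login: Optional[datetime] = None
    login_count: int = 0  # cumulative number of successful logins


def _hash(password: str, salt: bytes) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA256; the patent does not name a scheme.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(accounts: Dict[str, Account], login_name: str, password: str) -> None:
    salt = os.urandom(16)
    accounts[login_name] = Account(salt, _hash(password, salt))


def authenticate(
    accounts: Dict[str, Account],
    login_name: str,
    password: str,
    ip: str,
    now: datetime,
) -> bool:
    account = accounts.get(login_name)
    if account is None:
        return False
    if not hmac.compare_digest(account.password_hash, _hash(password, account.salt)):
        return False
    # On success, store the IP address, the login time and the running login count.
    account.last_ip = ip
    account.last_login = now
    account.login_count += 1
    return True
```

Only successful verifications update the stored IP address, login time and counter, which matches the wording of the claim.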
10. A data management method, comprising the steps of:
acquiring a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata from the plurality of preset service systems;
constructing a to-be-processed database of each preset service system through a plurality of initial metadata of each preset service system and a data type corresponding to each initial metadata;
updating each to-be-processed database to obtain an updated database of each preset service system;
respectively constructing a target model of each preset service system according to each updated database;
importing standard management information, and updating the standard management information to obtain updated standard management information;
and checking and analyzing all target models according to the updated standard management information to obtain a data quality checking result report.
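The first three steps of the method (acquire metadata with its data type, construct a to-be-processed database, update it) can be sketched with an in-memory SQLite database per service system. The single-table schema, the function names and the example system and column names are assumptions for illustration; the patent does not specify a storage engine or layout.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple

MetadataRow = Tuple[str, str]  # (metadata name, data type)


def build_pending_db(metadata: Iterable[MetadataRow]) -> sqlite3.Connection:
    """Construct a to-be-processed database from initial metadata and its data types."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata (name TEXT PRIMARY KEY, data_type TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO metadata (name, data_type) VALUES (?, ?)", metadata)
    conn.commit()
    return conn


def update_db(conn: sqlite3.Connection, changes: Iterable[MetadataRow]) -> None:
    """Add new entries or overwrite existing ones, giving the updated database."""
    conn.executemany(
        "INSERT OR REPLACE INTO metadata (name, data_type) VALUES (?, ?)", changes
    )
    conn.commit()


def dump(conn: sqlite3.Connection) -> List[MetadataRow]:
    return conn.execute(
        "SELECT name, data_type FROM metadata ORDER BY name"
    ).fetchall()


# One database per preset service system (system names are made up).
collected: Dict[str, List[MetadataRow]] = {
    "scada": [("pressure", "REAL"), ("valve_id", "TEXT")],
    "erp": [("order_no", "TEXT")],
}
databases = {system: build_pending_db(rows) for system, rows in collected.items()}
update_db(databases["scada"], [("pressure", "DOUBLE"), ("station", "TEXT")])
```

Here `dump(databases["scada"])` lists the updated metadata for one system; building a target model and checking it against the standard management information would follow as later steps.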
CN202211217657.4A 2022-09-30 2022-09-30 Data management system and method Pending CN115617776A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211217657.4A CN115617776A (en) 2022-09-30 2022-09-30 Data management system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211217657.4A CN115617776A (en) 2022-09-30 2022-09-30 Data management system and method

Publications (1)

Publication Number Publication Date
CN115617776A true CN115617776A (en) 2023-01-17

Family

ID=84859721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211217657.4A Pending CN115617776A (en) 2022-09-30 2022-09-30 Data management system and method

Country Status (1)

Country Link
CN (1) CN115617776A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858377A (en) * 2022-12-20 2023-03-28 北京领雁科技股份有限公司 Data testing system and method based on guest group management
CN116501757A (en) * 2023-06-20 2023-07-28 鹏城实验室 ER diagram-based simulation data construction method and device
CN116955463A (en) * 2023-06-12 2023-10-27 自然资源陕西省卫星应用技术中心 Multi-source heterogeneous data integration system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115858377A (en) * 2022-12-20 2023-03-28 北京领雁科技股份有限公司 Data testing system and method based on guest group management
CN116955463A (en) * 2023-06-12 2023-10-27 自然资源陕西省卫星应用技术中心 Multi-source heterogeneous data integration system
CN116955463B (en) * 2023-06-12 2024-04-02 自然资源陕西省卫星应用技术中心 Multi-source heterogeneous data integration system
CN116501757A (en) * 2023-06-20 2023-07-28 鹏城实验室 ER diagram-based simulation data construction method and device
CN116501757B (en) * 2023-06-20 2023-10-03 鹏城实验室 ER diagram-based simulation data construction method and device

Similar Documents

Publication Publication Date Title
CN112699175B (en) Data management system and method thereof
CN107315776B (en) Data management system based on cloud computing
Wang et al. Data quality
CN108492028A (en) Demand data standardized method and standardized system
CN115617776A (en) Data management system and method
CN103455540B (en) The system and method for generating memory model from data warehouse model
CN110019176B (en) Data management control system for improving success rate of data management service
CN106682097A (en) Method and device for processing log data
CN106682096A (en) Method and device for log data management
CN104200402A (en) Publishing method and system of source data of multiple data sources in power grid
CN112199433A (en) Data management system for city-level data middling station
CN110807015A (en) Big data asset value delivery management method and system
CA2673422C (en) Software for facet classification and information management
US9123006B2 (en) Techniques for parallel business intelligence evaluation and management
US20110131247A1 (en) Semantic Management Of Enterprise Resourses
US20040181518A1 (en) System and method for an OLAP engine having dynamic disaggregation
CN113722301A (en) Big data processing method, device and system based on education information and storage medium
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN112150122A (en) Agile network resource positioning and decision-making system
CN113592680A (en) Service platform based on regional education big data
CN108664509A (en) A kind of method, apparatus and server of extemporaneous inquiry
CN116415199B (en) Business data outlier analysis method based on audit intermediate table
CN110928963B (en) Column-level authority knowledge graph construction method for operation and maintenance service data table
Chen et al. Event-based spatio-temporal database design
CN116415203A (en) Government information intelligent fusion system and method based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination