CN115600803A - Data quality inspection method and system for multi-service system - Google Patents

Info

Publication number
CN115600803A
Authority
CN
China
Prior art keywords
data
data quality
service
inspection
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211209392.3A
Other languages
Chinese (zh)
Inventor
师蕊
赵璐
宋雪峰
曹文琛
连江桥
谢锐
郑宇恒
范人丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Oil and Gas Pipeline Network Corp
National Pipeline Network Southwest Pipeline Co Ltd
Original Assignee
China Oil and Gas Pipeline Network Corp
National Pipeline Network Southwest Pipeline Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Oil and Gas Pipeline Network Corp and National Pipeline Network Southwest Pipeline Co Ltd
Priority to CN202211209392.3A
Publication of CN115600803A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • General Factory Administration (AREA)

Abstract

The invention discloses a data quality inspection method and system for a multi-service system, and relates to the field of data quality inspection. The method comprises: collecting service data from a plurality of service systems and generating data quality inspection tasks according to all of the service data; configuring inspection rules through a visualization service; executing the corresponding data quality inspection tasks according to the inspection rules to obtain a task execution result; generating a problem data quality report according to the task execution result; and correcting the multi-service system according to the problem data quality report. The method inspects the data quality of the service systems so that data quality is greatly improved, and the data asset management and control platform provides functions covering the whole data quality management process, such as standard definition, quality monitoring and quality reporting. Through rules, scheduling times and workflows defined in advance, data quality inspection is completed automatically, which greatly reduces manual effort and process intervention, improves efficiency and reduces errors.

Description

Data quality inspection method and system for multi-service system
Technical Field
The present invention relates to the field of data quality inspection, and in particular, to a method and a system for inspecting data quality of a multi-service system.
Background
Data quality is an important guarantee for data applications such as the Southwest Pipeline Company's business operation analysis, assistance in policy making, and risk management and control. Over years of basic information construction, the Southwest Pipeline Company has built 102 information systems. These service systems effectively support the company's engineering construction, production operation and daily operation management, and the large volume of data accumulated in the production service systems is a precious resource for the company, comparable to a gold mine. However, as the data volume of the service systems keeps growing, the company has not established a unified data asset management standard for data acquisition, unified data storage, data application and data management, and therefore cannot meet the requirements for constructing a pipeline digital twin, an intelligent pipeline and an intelligent pipe network.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and a system for checking data quality of a multi-service system, aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows:
A data quality inspection method for a multi-service system comprises the following steps:
collecting service data of a plurality of service systems, and generating a data quality inspection task according to all the service data;
configuring an inspection rule through a visualization service;
executing the corresponding data quality inspection task according to the inspection rule to obtain a task execution result;
and generating a problem data quality report according to the task execution result, and correcting the multi-service system according to the problem data quality report.
The beneficial effects of the invention are as follows: the data quality inspection method of this scheme inspects the data quality of the service systems so that data quality is greatly improved, and the data asset management and control platform provides functions covering the whole data quality management process, such as standard definition, quality monitoring and quality reporting. Through rules, scheduling times and workflows defined in advance, data quality inspection is completed automatically, which greatly reduces manual effort and process intervention, improves efficiency and reduces errors. Users learn the system inspection results promptly, the data quality of the Southwest Pipeline is effectively improved, and a foundation is laid for further in-depth data governance of the Southwest Pipeline in the future.
Further, the method also includes: constructing the inspection rule according to an accuracy index, an integrity index, a consistency index and a timeliness index.
Further, collecting the service data of the plurality of service systems specifically includes:
collecting service data from each stage of the life cycle (acquisition, storage, sharing, maintenance, application and extinction) in the plurality of service systems.
Further, the method also includes: when a plurality of data quality inspection tasks are generated according to the service data, scheduling each data quality inspection task through a scheduling center;
executing the corresponding data quality inspection task according to the inspection rule specifically includes:
executing each data quality inspection task scheduled by the scheduling center according to the inspection rule.
Further, the inspection rule includes:
at least one of a format check, a range check, a missing record check, a precision check, a logical expression check, and a compound rule check.
Another technical solution of the present invention for solving the above technical problems is as follows:
A data quality inspection system for a multi-service system comprises: an inspection task creating module, a rule configuration module, an inspection module and a correction module;
the inspection task creating module is used for collecting the service data of a plurality of service systems and generating a data quality inspection task according to all the service data;
the rule configuration module is used for configuring an inspection rule through a visualization service;
the inspection module is used for executing the corresponding data quality inspection task according to the inspection rule to obtain a task execution result;
and the correction module is used for generating a problem data quality report according to the task execution result and correcting the multi-service system according to the problem data quality report.
The beneficial effects of the invention are as follows: the data quality inspection system of this scheme inspects the data quality of the service systems so that data quality is greatly improved, and the data asset management and control platform provides functions covering the whole data quality management process, such as standard definition, quality monitoring and quality reporting. Through rules, scheduling times and workflows defined in advance, data quality inspection is completed automatically, which greatly reduces manual effort and process intervention, improves efficiency and reduces errors. Users learn the system inspection results promptly, the data quality of the Southwest Pipeline is effectively improved, and a foundation is laid for further in-depth data governance of the Southwest Pipeline in the future.
Further, the system also includes: an inspection rule construction module, used for constructing the inspection rule according to an accuracy index, an integrity index, a consistency index and a timeliness index.
Further, the inspection task creating module is specifically configured to collect service data from each stage of the life cycle (acquisition, storage, sharing, maintenance, application and extinction) in the plurality of service systems.
Further, the system also includes: a scheduling module, used for scheduling each data quality inspection task through a scheduling center when a plurality of data quality inspection tasks are generated according to the service data;
the inspection module is specifically used for executing each data quality inspection task scheduled by the scheduling center according to the inspection rule.
Further, the inspection rule includes:
at least one of a format check, a range check, a missing record check, a precision check, a logical expression check, and a compound rule check.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
Fig. 1 is a schematic flow chart of a data quality inspection method for a multi-service system according to an embodiment of the present invention;
Fig. 2 is a block diagram of a data quality inspection system for a multi-service system according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a Southwest Pipeline data quality management method according to another embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth to illustrate, but are not to be construed to limit the scope of the invention.
As shown in fig. 1, a data quality inspection method for a multi-service system provided in an embodiment of the present invention includes:
S1, collecting service data of a plurality of service systems, and generating a data quality inspection task according to all the service data. The plurality of service systems may include: a GPS patrol inspection system, a production intelligence system, a QHSE system, a warehouse barcode system, a cathodic protection system, a weld seam file system, a China-Myanmar crude oil pipeline industrial Internet of Things data sharing system, a geological disaster monitoring system, a smart construction site system, a weld seam film file management system, an oil transfer pump operating state intelligent monitoring system, a pipeline management system, an operation area information platform, and the like. The inspection task may be constructed according to the accuracy, integrity, consistency and timeliness of the service data.
S2, configuring an inspection rule through a visualization service;
In one embodiment, S2 may include: visually configuring a single-field multi-rule check in the data quality inspection service, that is, setting several rules on one field to inspect its data quality, for example setting a null-value check, a range check and a unit check on the minimum operating-condition flow field of a valve. A multi-field same-rule check may also be configured, that is, the data quality of several fields is inspected under the same rule, for example applying an integrity rule check to the axial length, piston stroke and reciprocating frequency fields of a pump. An association check between multiple fields may also be configured, that is, checking whether the relationship between the fields holds, for example that the maximum operating temperature of a filter separator cannot be greater than its maximum design temperature.
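The three configuration modes described above can be sketched as follows. This is only an illustration under stated assumptions, not the platform's implementation: the field names (`min_flow`, `piston_stroke`, `max_operating_temp` and so on), the `Rule` type, the helper functions and the 0 to 1000 flow range are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Rule = Callable[[Any], bool]  # returns True when the value passes the rule


def not_null(value: Any) -> bool:
    """Hypothetical elementary rule: the value must be present."""
    return value is not None


def in_range(low: float, high: float) -> Rule:
    """Hypothetical elementary rule factory: the value must lie in [low, high]."""
    return lambda value: value is not None and low <= value <= high


def single_field_multi_rule(record: Record, field: str, rules: Dict[str, Rule]) -> List[str]:
    """Apply several rules to one field; return the names of the rules that fail."""
    return [name for name, rule in rules.items() if not rule(record.get(field))]


def multi_field_same_rule(record: Record, fields: List[str], rule: Rule) -> List[str]:
    """Apply one rule to several fields; return the fields that fail."""
    return [f for f in fields if not rule(record.get(f))]


def association_check(record: Record) -> bool:
    """Cross-field check: operating temperature must not exceed design temperature."""
    operating = record.get("max_operating_temp")
    design = record.get("max_design_temp")
    return operating is not None and design is not None and operating <= design


# Made-up sample records for a valve, a pump and a filter separator.
valve = {"min_flow": -5.0}
pump = {"axial_length": 1.2, "piston_stroke": None, "reciprocating_frequency": 30}
separator = {"max_operating_temp": 120.0, "max_design_temp": 100.0}

valve_failures = single_field_multi_rule(
    valve, "min_flow", {"not_null": not_null, "range": in_range(0.0, 1000.0)}
)
pump_failures = multi_field_same_rule(
    pump, ["axial_length", "piston_stroke", "reciprocating_frequency"], not_null
)
separator_ok = association_check(separator)
```

With these sample values, the valve fails only the range rule, the pump fails the not-null rule on `piston_stroke`, and the separator fails the association check.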
The scheduling center runs the inspection tasks, which apply the inspection rules to the accuracy, integrity, consistency and timeliness of the data, finds the problem data, forms a problem data quality report as required, and dispatches the problem data to the relevant personnel for correction according to data ownership.
S3, executing a corresponding data quality inspection task according to the inspection rule to obtain a task execution result;
S4, generating a problem data quality report according to the task execution result, and correcting the multi-service system according to the problem data quality report.
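Steps S1 to S4 can be sketched end to end as below. This is a minimal sketch under assumptions: the patent does not specify task granularity or report layout, so one task per service system and a per-system problem count are chosen here for illustration, and all class, function, field and rule names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class InspectionTask:
    system: str            # name of the source service system
    records: List[Record]  # service data collected from that system


@dataclass
class Problem:
    system: str
    record_index: int
    rule: str


def generate_tasks(collected: Dict[str, List[Record]]) -> List[InspectionTask]:
    """S1: create one inspection task per service system (assumed granularity)."""
    return [InspectionTask(system, records) for system, records in collected.items()]


def execute_task(task: InspectionTask, rules: Dict[str, Callable[[Record], bool]]) -> List[Problem]:
    """S3: run every configured rule against every record of the task."""
    return [
        Problem(task.system, index, name)
        for index, record in enumerate(task.records)
        for name, rule in rules.items()
        if not rule(record)
    ]


def build_report(problems: List[Problem]) -> Dict[str, int]:
    """S4: summarize the number of problems per source system."""
    report: Dict[str, int] = {}
    for problem in problems:
        report[problem.system] = report.get(problem.system, 0) + 1
    return report


# S1: collected service data (made-up sample values).
collected = {
    "pipeline_management": [{"max_pressure": 6.3}, {"max_pressure": None}],
    "cathodic_protection": [{"max_pressure": 4.0}],
}
# S2: rules as they might come out of the visual configuration (hypothetical name).
rules = {"max_pressure_not_null": lambda r: r.get("max_pressure") is not None}

problems = [p for task in generate_tasks(collected) for p in execute_task(task, rules)]
report = build_report(problems)
```

The correction itself is not modelled, since it is carried out in the source systems by the responsible personnel.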
In one embodiment, generating the problem data quality report based on the task execution result may include classifying the data verification results by cause:
In the data verification results for the processing flow: data errors, data loss or data duplication are introduced during data processing. The source system may change its structure without notifying downstream consumers (including people and systems) or without giving them sufficient time to respond to the change. This may produce invalid values or prevent data transfer and loading, and the downstream system may be unable to detect subtler changes immediately.
In the data verification results for information system functions: the system lacks the ability to validate and check data on entry. If the data entry interface has no editing controls to prevent incorrect data from being entered into the system, data handlers may take shortcuts, such as skipping non-mandatory fields and leaving default values unchanged.
In the data verification results for the database: the database design does not meet the data standard or lacks inspection and check functions, for example poor database performance, loss of data integrity, poorly designed fields, or errors in calculation or statistics.
In the data verification results for non-standard operation: improper use of the systems or databases by some personnel, and the like.
The data quality inspection method of this scheme inspects the data quality of the service systems so that data quality is greatly improved, and the data asset management and control platform provides functions covering the whole data quality management process, such as standard definition, quality monitoring and quality reporting. Through rules, scheduling times and workflows defined in advance, data quality inspection is completed automatically, which greatly reduces manual effort and process intervention, improves efficiency and reduces errors. Users learn the system inspection results promptly, the data quality of the Southwest Pipeline is effectively improved, and a foundation is laid for further in-depth data governance of the Southwest Pipeline in the future.
Optionally, in any of the above embodiments, the method further includes: constructing the inspection rule according to the accuracy index, the integrity index, the consistency index and the timeliness index.
In one embodiment, the accuracy index addresses data accuracy problems, where the data cannot accurately and truthfully reflect actual information, such as:
(1) The information reflected by the data is invalid or untrue, and the data items do not follow the specific business logic;
(2) Fields or records are unnecessarily repeated; for example, the same user should not have multiple duplicate records;
(3) The data does not conform to the definition in the data standard; for example, the data lacks the proper format, length or data type;
(4) The data content does not correspond to the column name.
The integrity index addresses data integrity problems, where the data does not contain all the information required for service operation, such as:
(1) Field information necessary for service operation is missing;
(2) The data does not contain all the information required for service operation;
(3) The data does not describe the necessary associations between pieces of service information.
The consistency index addresses data consistency problems, where data reflecting the same entity is inconsistent across different information systems, such as:
(1) The meanings of data items reflecting the same entity are inconsistent across different information systems;
(2) The logic between the data items of the same entity is inconsistent;
(3) The logic between data items of different entities is inconsistent.
The timeliness index addresses data timeliness problems, where the data cannot reflect the current service operation status in time, such as:
(1) The data cannot be processed on time within the required time window;
(2) The data cannot be provided to downstream applications on time within the required time window.
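A few of the problems listed above lend themselves to simple mechanical checks. The sketch below shows one possible check each for accuracy (duplicate records), integrity (missing required fields) and timeliness (delivery within a time window). The function names, the key fields and the one-hour window are assumptions chosen for illustration and do not come from the patent.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def find_duplicates(records: List[Record], key_fields: Tuple[str, ...]) -> List[int]:
    """Accuracy: indices of records whose key repeats an earlier record."""
    seen = set()
    duplicates = []
    for index, record in enumerate(records):
        key = tuple(record.get(f) for f in key_fields)
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates


def missing_required(record: Record, required: List[str]) -> List[str]:
    """Integrity: required fields that are absent or empty."""
    return [f for f in required if record.get(f) in (None, "")]


def is_timely(produced_at: datetime, loaded_at: datetime, window: timedelta) -> bool:
    """Timeliness: the record reached the downstream store within the window."""
    return timedelta(0) <= loaded_at - produced_at <= window


# Made-up sample data.
records = [
    {"user_id": "u1", "name": "A"},
    {"user_id": "u2", "name": ""},
    {"user_id": "u1", "name": "A"},
]
duplicates = find_duplicates(records, ("user_id",))
missing = missing_required(records[1], ["user_id", "name"])
timely = is_timely(
    datetime(2022, 1, 1, 0, 0), datetime(2022, 1, 1, 2, 0), timedelta(hours=1)
)
```

Here the third record is flagged as a duplicate of the first, the second record is missing `name`, and the two-hour delay falls outside the one-hour window.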
Optionally, in any of the above embodiments, collecting the service data of the plurality of service systems specifically includes:
collecting service data from each stage of the life cycle (acquisition, storage, sharing, maintenance, application and extinction) in the plurality of service systems.
Optionally, in any of the above embodiments, the method further includes: when a plurality of data quality inspection tasks are generated according to the service data, scheduling each data quality inspection task through a scheduling center;
executing the corresponding data quality inspection task according to the inspection rule specifically includes:
executing each data quality inspection task scheduled by the scheduling center according to the inspection rule.
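The patent does not describe how the scheduling center orders tasks. As one possible reading, the sketch below runs tasks in order of a scheduled time using a priority queue. The `SchedulingCenter` class, its methods, the integer timestamps and the task names are all hypothetical.

```python
import heapq
from typing import Callable, List, Tuple


class SchedulingCenter:
    """Hypothetical stand-in for the scheduling center: runs tasks by scheduled time."""

    def __init__(self) -> None:
        self._queue: List[Tuple[int, int, str, Callable[[], int]]] = []
        self._counter = 0  # tie-breaker so equal times keep submission order

    def schedule(self, run_at: int, name: str, task: Callable[[], int]) -> None:
        """Register a task to run at the given (abstract) time."""
        heapq.heappush(self._queue, (run_at, self._counter, name, task))
        self._counter += 1

    def run_all(self) -> List[Tuple[str, int]]:
        """Run every queued task in time order; return (name, problem count) pairs."""
        results = []
        while self._queue:
            _, _, name, task = heapq.heappop(self._queue)
            results.append((name, task()))
        return results


center = SchedulingCenter()
# Each task returns the number of problem records it found (made-up values).
center.schedule(200, "qhse_check", lambda: 3)
center.schedule(100, "gps_check", lambda: 0)
results = center.run_all()
```

A production scheduler would trigger tasks at wall-clock times and handle failures; this sketch only shows the ordering.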
Optionally, in any of the above embodiments, the inspection rule includes:
at least one of a format check, a range check, a missing record check, a precision check, a logical expression check, and a compound rule check.
In one embodiment, data quality inspection and evaluation of 13 systems serves as an effective means of implementing data quality management. Data quality inspection strategies are configured visually according to standard rules, and the platform provides precise data quality inspection along four dimensions (accuracy, integrity, consistency and timeliness), which makes fine-grained data quality analysis of a given table convenient.
Data quality is the degree to which data matches service requirements, that is, the degree to which data meets the needs of service operation, management and decision analysis. It can be measured by indexes such as accuracy, integrity, consistency and timeliness.
Data quality management refers to a series of management activities, such as identification, checking, monitoring and early warning, directed at the data quality problems that may arise at each stage of the data life cycle (acquisition, storage, sharing, maintenance, application and extinction); data quality is further improved by raising the management level of the organization. Data quality identification considers data quality mainly from three aspects: invalid cells (that is, invalid values), invalid variables, and invalid records. Data monitoring collects data quality information from each link of the service, diagnoses the data quality condition in combination with the relevant inspection rules and acquisition rules, and reports it promptly to the department responsible for implementing data quality management. Data early warning is based on a data warehouse and the ETL method: source data can be synchronized to the data warehouse and early warning values can be set.
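The early warning values mentioned above can be pictured as thresholds on a problem rate. The sketch below is one assumed form: the function name, the two levels and the default thresholds of 1% and 5% are illustrative and do not come from the patent.

```python
def warning_level(problem_count: int, total_count: int,
                  warn_at: float = 0.01, alert_at: float = 0.05) -> str:
    """Map the share of problem records to a warning level.

    warn_at and alert_at are hypothetical early warning values.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    rate = problem_count / total_count
    if rate >= alert_at:
        return "alert"
    if rate >= warn_at:
        return "warning"
    return "ok"


levels = [warning_level(0, 100), warning_level(2, 100), warning_level(10, 100)]
```

In practice the thresholds would be set per table or per rule after the source data has been synchronized to the warehouse.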
The scope of data quality management includes data quality problem discovery, data quality problem analysis, data quality improvement (big data analysis, intelligent perception, simulation analysis and the like), data quality inspection rule management (quality management rules are managed in a unified way so that different rules can conveniently be applied to fields), the construction and maintenance of data quality management tools (relations diagrams, affinity diagrams, systematic diagrams, matrix data analysis, the PDPC method and network diagrams), and other related work.
Data quality management provides solutions and guiding evaluation for the quality problems that enterprises encounter in data warehouse construction, data mining and data centers.
Data quality is an important guarantee for data applications such as the Southwest Pipeline Company's business operation analysis, assistance in policy making, and risk management and control.
Dimensions of data quality:
Accuracy
Data accuracy problems, where the data cannot accurately and truthfully reflect actual information, such as:
(1) The information reflected by the data is invalid or untrue, and the data items do not follow the specific business logic;
(2) Fields or records are unnecessarily repeated; for example, the same user should not have multiple duplicate records;
(3) The data does not conform to the definition in the data standard; for example, the data lacks the proper format, length or data type;
(4) The data content does not correspond to the column name.
Integrity
Data integrity problems, where the data does not contain all the information required for service operation, such as:
(1) Field information necessary for service operation is missing;
(2) The data does not contain all the information required for service operation;
(3) The data does not describe the necessary associations between pieces of service information.
Consistency
Data consistency problems, where data reflecting the same entity is inconsistent across different information systems, such as:
(1) The meanings of data items reflecting the same entity are inconsistent across different information systems;
(2) The logic between the data items of the same entity is inconsistent;
(3) The logic between data items of different entities is inconsistent.
Timeliness
Data timeliness problems, where the data cannot reflect the current service operation status in time, such as:
(1) The data cannot be processed on time within the required time window;
(2) The data cannot be provided to downstream applications on time within the required time window.
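The consistency dimension concerns the same entity appearing in different information systems. A minimal sketch of such a comparison follows, assuming the two systems share a key field. The function name, the key `pump_id` and the sample values are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def cross_system_inconsistencies(
    system_a: List[Record], system_b: List[Record], key: str, fields: List[str]
) -> List[Tuple[Any, str, Any, Any]]:
    """Compare records that describe the same entity in two systems.

    Returns (entity key, field, value in A, value in B) for every mismatch.
    Entities present in only one system are skipped here.
    """
    index_b = {record[key]: record for record in system_b}
    differences = []
    for record_a in system_a:
        record_b = index_b.get(record_a[key])
        if record_b is None:
            continue
        for field in fields:
            if record_a.get(field) != record_b.get(field):
                differences.append(
                    (record_a[key], field, record_a.get(field), record_b.get(field))
                )
    return differences


# Made-up sample data: the same pump described in two systems.
pipeline_management = [{"pump_id": "P-01", "max_pressure": 6.3, "station": "S1"}]
pump_monitoring = [{"pump_id": "P-01", "max_pressure": 6.0, "station": "S1"}]

differences = cross_system_inconsistencies(
    pipeline_management, pump_monitoring, "pump_id", ["max_pressure", "station"]
)
```

Entities that exist in only one system would be a separate finding, closer to the missing record check.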
Analysis of the causes of data quality problems:
(1) Processing flow:
Accuracy: data errors, data loss or data duplication are introduced during data processing.
Consistency: the source system may change its structure without notifying downstream consumers (including people and systems) or without giving them sufficient time to respond to the change. This may produce invalid values or prevent data transfer and loading, and the downstream system may be unable to detect subtler changes immediately.
(2) Information system functions (integrity and accuracy): the system lacks the ability to validate and check data on entry. If the data entry interface has no editing controls to prevent incorrect data from being entered into the system, data handlers may take shortcuts, such as skipping non-mandatory fields and leaving default values unchanged.
(3) Database (integrity and accuracy): the database design does not meet the data standard or lacks inspection and check functions, for example poor database performance, loss of data integrity, poorly designed fields, or errors in calculation or statistics.
(4) Non-standard operation: misoperation of the systems or databases by some users may result in incomplete data, such as skipping non-mandatory fields and leaving default values unchanged.
Data quality inspection is an effective means of realizing data quality management and mainly covers four aspects: accuracy, integrity, consistency and timeliness.
The data asset management and control platform configures data quality inspection strategies visually according to standard rules and provides precise data quality inspection along the four dimensions of accuracy, integrity, consistency and timeliness, which makes fine-grained data quality analysis of a given table convenient. It provides a data quality inspection service that checks database tables against specified rules (such as verifying accuracy, integrity, consistency and timeliness). To ensure data accuracy, it provides logical expression checks (for example, the "maximum operating pressure" of a pump cannot be null) and composite checks (for example, the "maximum operating pressure" of a pump cannot be null and cannot exceed the maximum allowable value). It provides a visual definition interface (inspection rules are packaged into inspection strategies according to standard rules, and a user can directly select the corresponding inspection strategy to perform a data governance inspection). It also provides a data quality inspection method interface (an interface that can be integrated with a system so that the quality of the data in that system can be inspected directly), through which further data quality inspection methods can be added.
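The idea of packaging rules into selectable inspection strategies, including a logical expression check and a composite check on the pump's maximum operating pressure, can be sketched as follows. This is an assumed structure rather than the platform's own: the strategy names, field names and the `all_of` combinator are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Check = Callable[[Record], bool]


def pressure_not_null(record: Record) -> bool:
    """Logical expression check: maximum operating pressure must be present."""
    return record.get("max_operating_pressure") is not None


def pressure_within_limit(record: Record) -> bool:
    """Maximum operating pressure must not exceed the maximum allowable value."""
    pressure = record.get("max_operating_pressure")
    limit = record.get("max_allowable_pressure")
    return pressure is not None and limit is not None and pressure <= limit


def all_of(*checks: Check) -> Check:
    """Composite check: passes only if every component check passes."""
    return lambda record: all(check(record) for check in checks)


# Hypothetical strategies a user could pick in the visual interface.
STRATEGIES: Dict[str, Dict[str, Check]] = {
    "pump_pressure_basic": {"not_null": pressure_not_null},
    "pump_pressure_strict": {
        "not_null_and_within_limit": all_of(pressure_not_null, pressure_within_limit)
    },
}


def run_strategy(name: str, records: List[Record]) -> List[Tuple[int, str]]:
    """Return (record index, failed check name) pairs for the chosen strategy."""
    checks = STRATEGIES[name]
    return [
        (index, check_name)
        for index, record in enumerate(records)
        for check_name, check in checks.items()
        if not check(record)
    ]


# Made-up pump records.
pumps = [
    {"max_operating_pressure": 5.0, "max_allowable_pressure": 6.4},
    {"max_operating_pressure": 7.0, "max_allowable_pressure": 6.4},
    {"max_operating_pressure": None, "max_allowable_pressure": 6.4},
]
basic_failures = run_strategy("pump_pressure_basic", pumps)
strict_failures = run_strategy("pump_pressure_strict", pumps)
```

Keeping strategies as named bundles means a new check can be added to a strategy without changing the code that runs it.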
The data quality inspection service of the data asset management and control platform performs specified rule checks on database tables, including format checks, range checks, missing record checks, precision checks, logical expression checks, composite rule checks and the like. For example, the format, range, missing records and precision of the pump's maximum operating pressure are checked against the standard, and the inspection result is returned.
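The elementary check types named above can each be written as a small function. The versions below are a sketch under assumptions: the patent does not define the checks precisely, so the interpretations (regular expression for format, closed interval for range, number of decimal places for precision, expected versus actual identifiers for missing records) and the sample identifiers are illustrative.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Iterable, List


def format_check(value: str, pattern: str) -> bool:
    """Format check: the whole value must match the regular expression."""
    return re.fullmatch(pattern, value) is not None


def range_check(value: float, low: float, high: float) -> bool:
    """Range check: the value must lie in the closed interval [low, high]."""
    return low <= value <= high


def precision_check(value: str, max_decimals: int) -> bool:
    """Precision check: the value parses as a decimal with at most max_decimals places."""
    try:
        exponent = Decimal(value).as_tuple().exponent
    except InvalidOperation:
        return False
    # Special values such as NaN carry a non-integer exponent and are rejected.
    return isinstance(exponent, int) and -exponent <= max_decimals


def missing_record_check(expected_ids: Iterable[str], actual_ids: Iterable[str]) -> List[str]:
    """Missing record check: identifiers that were expected but are absent."""
    return sorted(set(expected_ids) - set(actual_ids))


# Made-up values for a pump's maximum operating pressure and pump identifiers.
format_ok = format_check("P-001", r"P-\d{3}")
range_ok = range_check(6.3, 0.0, 10.0)
precision_ok = precision_check("6.305", 2)
missing_ids = missing_record_check(["P-001", "P-002", "P-003"], ["P-001", "P-003"])
```

Logical expression and composite rule checks can be built by combining these functions with boolean operators.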
Single-field multi-rule checks are configured visually in the data quality inspection service (for example, the maximum operating pressure of a pump cannot be null and cannot exceed the maximum allowable value), so that several rules are set on one field to inspect its data quality. Multi-field same-rule checks can be configured (for example, the "maximum operating pressure" and the "minimum operating pressure" of a pump are both checked against the not-null rule) to inspect the data quality of several fields under the same rule. Association checks between multiple fields can also be configured (for example, the maximum operating temperature of a filter separator cannot be greater than its maximum design temperature), that is, checking whether the relationship between the fields holds. The scheduling center runs the inspection tasks, which apply the inspection rules to the accuracy, integrity, consistency and timeliness of the data, finds the problem data, forms a problem data quality report as required, and dispatches the problem data to the relevant personnel for correction according to data ownership.
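Dispatching problem data to the relevant personnel can be pictured as grouping problems by the owner of the source system. The sketch below assumes that each source system has one designated owner, which is one reading of the text and is not confirmed by it. The owner names, system names and the fallback team are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List

Problem = Dict[str, Any]


def dispatch_problems(
    problems: List[Problem], owners: Dict[str, str], fallback: str = "data_governance_team"
) -> Dict[str, List[Problem]]:
    """Group problem records by the owner of the system they came from."""
    work_orders: Dict[str, List[Problem]] = defaultdict(list)
    for problem in problems:
        owner = owners.get(problem["system"], fallback)
        work_orders[owner].append(problem)
    return dict(work_orders)


# Hypothetical owner assignments and made-up problem records.
owners = {"pipeline_management": "pipeline_team", "qhse": "qhse_team"}
problems = [
    {"system": "pipeline_management", "field": "max_operating_pressure", "rule": "not_null"},
    {"system": "qhse", "field": "inspection_date", "rule": "format"},
    {"system": "smart_site", "field": "site_id", "rule": "not_null"},
]
work_orders = dispatch_problems(problems, owners)
```

Problems from a system with no registered owner go to the fallback team so that nothing is silently dropped.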
This inspection covered the 13 main business systems: the GPS patrol inspection system, production intelligence system, QHSE system, warehouse barcode system, cathodic protection system, weld seam file system, China-Myanmar crude oil pipeline industrial Internet of Things data sharing system, geological disaster monitoring system, smart construction site system, weld seam film file management system, oil transfer pump operating state intelligent monitoring system, pipeline management system and operation area information platform. In total, 19,646,659 data records were inspected, 1,026,324 problem records were found, and 1,026,324 problem records were corrected; a second inspection confirmed that all the problem data found had been corrected.
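The reported counts can be turned into rates with simple arithmetic. The three input numbers below are taken from the text; the derived percentages are computed here and are not stated in the patent.

```python
# Counts reported for the inspection of the 13 systems.
inspected = 19_646_659
problems_found = 1_026_324
problems_corrected = 1_026_324

# Share of inspected records that were flagged as problem data.
problem_rate = problems_found / inspected
# Share of flagged records that were corrected.
correction_rate = problems_corrected / problems_found

print(f"problem rate: {problem_rate:.2%}")
print(f"correction rate: {correction_rate:.0%}")
```

On these figures roughly 5.2% of the inspected records were problem data, and all of them were corrected.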
The platform adopts two modes of migration service system database detection and online data detection aiming at the data quality detection of a main cause service system, although the effective control effect of the data quality can be achieved, the complexity of the detection operation steps of the migration service system database is far greater than that of the online data quality detection mode, because the database needs to be migrated to the local, the platform has certain requirements on the butt joint personnel of each service system, and also has certain requirements on the detection operation personnel of the control platform, and the online data quality detection is more excellent from the aspects of timeliness and consistency, so that the condition of directly carrying out online data quality detection is gradually and uniformly achieved by each service system, the operation process of the platform is simplified, and the learning cost of the platform operation personnel is effectively reduced.
At present, data quality inspection covers structured data only. The Southwest Pipeline Company still holds a large amount of unstructured data; it is suggested that the next phase of the data governance project sort out this unstructured data and, where it can be converted into structured data, convert it and subject it to data quality inspection.
Inspecting the data quality of the service systems through the data asset control platform greatly improves data quality. The platform provides functions covering the whole data quality management process, such as standard definition, quality monitoring and quality reporting. Using rules, scheduling times and workflows defined in advance, data quality inspection is completed automatically, which greatly reduces manual effort and intervention in the process, improves efficiency and reduces errors. Users learn the inspection results in time, the data quality of the Southwest Pipeline is effectively improved, and a foundation is laid for further in-depth data governance at the Southwest Pipeline.
In one embodiment, as shown in fig. 2, a data quality inspection system for a multi-service system includes: an inspection task creation module 1101, a rule configuration module 1102, an inspection module 1103, and a correction module 1104;
the inspection task creation module 1101 is configured to collect service data of a plurality of service systems, and generate a data quality inspection task according to all the service data;
the rule configuration module 1102 is configured to configure an inspection rule through a visualization service;
the inspection module 1103 is configured to execute the corresponding data quality inspection task according to the inspection rule, and obtain a task execution result;
the correction module 1104 is configured to generate a problem data quality report according to the task execution result, and correct the multi-service system according to the problem data quality report.
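The flow through modules 1101 to 1104 can be sketched as four small functions. This is a hedged illustration only: the class and function names, the one-task-per-system granularity, and the sample rule are assumptions, and the sketch stops at the report, since the correction itself is carried out in the owning business systems.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

@dataclass
class InspectionTask:
    system: str            # owning business system
    records: List[Record]  # service data collected from that system

@dataclass
class Rule:
    name: str
    passes: Callable[[Record], bool]

@dataclass
class Problem:
    system: str
    rule: str
    record: Record

def create_tasks(data_by_system: Dict[str, List[Record]]) -> List[InspectionTask]:
    """Module 1101: one task per business system (an assumed granularity)."""
    return [InspectionTask(name, recs) for name, recs in data_by_system.items()]

def configure_rules() -> List[Rule]:
    """Module 1102: stands in for rules set up through the visualization service."""
    return [Rule("pressure not null", lambda r: r.get("pressure") is not None)]

def run_inspection(tasks: List[InspectionTask], rules: List[Rule]) -> List[Problem]:
    """Module 1103: apply every rule to every record of every task."""
    return [Problem(t.system, rule.name, rec)
            for t in tasks for rec in t.records for rule in rules
            if not rule.passes(rec)]

def build_report(problems: List[Problem]) -> Dict[str, List[Problem]]:
    """Module 1104 (report part): group problems by owning system for dispatch."""
    report: Dict[str, List[Problem]] = defaultdict(list)
    for p in problems:
        report[p.system].append(p)
    return dict(report)

# Hypothetical sample data for two business systems.
data = {
    "pipeline management": [{"id": 1, "pressure": 6.3}, {"id": 2, "pressure": None}],
    "cathodic protection": [{"id": 3, "pressure": 1.2}],
}
report = build_report(run_inspection(create_tasks(data), configure_rules()))
```

Grouping the report by owning system mirrors the dispatch of problem data to the personnel responsible for each system.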
The data quality inspection scheme above inspects the data quality of the service systems and thereby greatly improves data quality. The data asset management and control platform provides functions covering the whole data quality management process, such as standard definition, quality monitoring and quality reporting. Using rules, scheduling times and workflows defined in advance, data quality inspection is completed automatically, which greatly reduces manual effort and intervention in the process, improves efficiency and reduces errors. Users learn the inspection results in time, the data quality of the Southwest Pipeline is effectively improved, and a foundation is laid for further in-depth data governance at the Southwest Pipeline.
Optionally, in any of the above embodiments, the system further includes: an inspection rule construction module, configured to construct the inspection rule according to an accuracy index, an integrity index, a consistency index and a timeliness index.
Optionally, in any of the above embodiments, the inspection task creation module 1101 is specifically configured to collect service data from each stage of the data lifecycle (acquisition, storage, sharing, maintenance, application and extinction) in the plurality of service systems.
Optionally, in any of the above embodiments, the system further includes: a scheduling module, configured to schedule each data quality inspection task through a scheduling center when a plurality of data quality inspection tasks are generated according to the service data;
the inspection module is specifically configured to execute, according to the inspection rule, each data quality inspection task scheduled by the scheduling center.
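One way to picture a scheduling center dispatching several inspection tasks is a worker pool. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor` as a stand-in; the patent does not say how the scheduling center is built, the names `run_task` and `schedule` are hypothetical, and time-based triggering is not covered.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Rule = Callable[[Record], bool]  # returns True when the record passes

def run_task(records: List[Record], rules: Dict[str, Rule]) -> int:
    """Execute one inspection task: count the (record, rule) pairs that fail."""
    return sum(1 for rec in records for rule in rules.values() if not rule(rec))

def schedule(tasks: Dict[str, List[Record]],
             rules: Dict[str, Rule],
             max_workers: int = 4) -> Dict[str, int]:
    """Stand-in scheduling center: submit every task, then collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_task, recs, rules)
                   for name, recs in tasks.items()}
        return {name: fut.result() for name, fut in futures.items()}

# Hypothetical tasks, one per business system.
rules = {"value not null": lambda r: r.get("value") is not None}
tasks = {
    "system_a": [{"value": 1}, {"value": None}],
    "system_b": [{"value": None}, {"value": None}],
    "system_c": [],
}
results = schedule(tasks, rules)
```

Keying the results by task name keeps each count tied to the system it came from regardless of the order in which the workers finish.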
Optionally, in any embodiment above, the check rule includes:
at least one of format check, scope check, missing record check, precision check, logical expression check, and compound rule check.
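The six check types listed above are not defined further in the text, so the following sketch shows one plausible reading of each. All function names, field names and the interpretation of each check (for example, precision as a maximum number of decimal places, and missing records as expected keys with no record) are assumptions.

```python
import re
from decimal import Decimal
from typing import Any, Callable, Dict, Iterable, Set

Record = Dict[str, Any]
Check = Callable[[Record], bool]  # returns True when the record passes

def format_check(field: str, pattern: str) -> Check:
    """Format check: the whole value must match a regular expression."""
    regex = re.compile(pattern)
    return lambda rec: regex.fullmatch(str(rec.get(field, ""))) is not None

def scope_check(field: str, low: float, high: float) -> Check:
    """Scope (range) check: the value must lie within [low, high]."""
    return lambda rec: rec.get(field) is not None and low <= rec[field] <= high

def precision_check(field: str, max_decimals: int) -> Check:
    """Precision check: a finite number with at most `max_decimals` decimal places."""
    def check(rec: Record) -> bool:
        value = rec.get(field)
        if value is None:
            return False
        exponent = Decimal(str(value)).as_tuple().exponent
        return -exponent <= max_decimals
    return check

def logical_expression_check(predicate: Check) -> Check:
    """Logical expression check: any boolean expression over the record."""
    return predicate

def compound_check(*checks: Check) -> Check:
    """Compound rule check: the record must pass every component check."""
    return lambda rec: all(check(rec) for check in checks)

def missing_records(records: Iterable[Record], key: str, expected: Set[Any]) -> Set[Any]:
    """Missing record check (dataset level): expected keys that have no record."""
    return expected - {rec.get(key) for rec in records}

# Hypothetical record and checks.
rec = {"device_id": "PUMP-0012", "pressure": 6.35, "max_temp": 55, "design_temp": 60}
id_ok = format_check("device_id", r"PUMP-\d{4}")
pressure_ok = scope_check("pressure", 0, 10)
temp_ok = logical_expression_check(lambda r: r["max_temp"] <= r["design_temp"])
all_ok = compound_check(id_ok, pressure_ok, temp_ok)
```

Here `all_ok(rec)` passes, while `precision_check("pressure", 1)(rec)` fails because 6.35 has two decimal places.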
It is understood that some or all of the alternative embodiments described above may be included in some embodiments.
It should be noted that the above embodiments are product embodiments corresponding to the previous method embodiments, and for the description of each optional implementation in the product embodiments, reference may be made to corresponding descriptions in the above method embodiments, and details are not described here again.
The reader should understand that in the description of this specification, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, one skilled in the art can combine the various embodiments or examples described in this specification, and the features of those embodiments or examples, provided they do not contradict one another.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the method embodiments described above are merely illustrative: the division into steps is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple steps may be combined or integrated into another step, or some features may be omitted or not performed.
If the above method is implemented in the form of software functional units and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A data quality inspection method for a multi-service system is characterized by comprising the following steps:
collecting service data of a plurality of service systems, and generating a data quality inspection task according to all the service data;
configuring a checking rule through a visualization service;
executing a corresponding data quality inspection task according to the inspection rule to obtain a task execution result;
and generating a problem data quality report according to the task execution result, and correcting the multi-service system according to the problem data quality report.
2. The data quality inspection method of multi-service system according to claim 1, further comprising: and constructing the check rule according to the accuracy index, the integrity index, the consistency index and the timeliness index.
3. The method for checking data quality of a multi-service system according to claim 1, wherein the collecting the service data of a plurality of service systems specifically includes:
business data from each stage of the acquisition, storage, sharing, maintenance, application, and extinction lifecycle in a plurality of business systems is collected.
4. A method for data quality inspection of a multi-service system according to any one of claims 1 to 3, further comprising: when a plurality of data quality inspection tasks are generated according to the service data, scheduling each data quality inspection task through a scheduling center;
the executing of the corresponding data quality inspection task according to the inspection rule specifically includes:
and executing each data quality inspection task scheduled by the scheduling center according to the inspection rule.
5. The data quality inspection method of multi-service system according to claim 2, wherein said inspection rule comprises:
at least one of format check, scope check, missing record check, precision check, logical expression check, and compound rule check.
6. A data quality inspection system for a multi-service system, comprising: the system comprises a checking task creating module, a rule configuration module, a checking module and a correcting module;
the inspection task creating module is used for collecting the service data of a plurality of service systems and generating a data quality inspection task according to all the service data;
the rule configuration module is used for configuring a check rule through a visualization service;
the inspection module is used for executing a corresponding data quality inspection task according to the inspection rule to obtain a task execution result;
and the correction module is used for generating a problem data quality report according to the task execution result and correcting the multi-service system according to the problem data quality report.
7. The system of claim 6, further comprising: and the inspection rule construction module is used for constructing the inspection rule according to the accuracy index, the integrity index, the consistency index and the timeliness index.
8. The data quality inspection system of a multi-service system according to claim 6, wherein the inspection task creation module is specifically configured to collect the service data from each stage of the acquisition, storage, sharing, maintenance, application and extinction lifecycle in the plurality of service systems.
9. A data quality inspection system for a multi-service system according to any of claims 6 to 8, further comprising: the scheduling module is used for scheduling each data quality inspection task through a scheduling center when a plurality of data quality inspection tasks are generated according to the service data;
the checking module is specifically used for executing each data quality checking task scheduled by the scheduling center according to the checking rule.
10. The data quality inspection system of multi-service system as claimed in claim 7, wherein said inspection rule comprises:
at least one of format check, scope check, missing record check, precision check, logical expression check, and compound rule check.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211209392.3A CN115600803A (en) 2022-09-30 2022-09-30 Data quality inspection method and system for multi-service system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211209392.3A CN115600803A (en) 2022-09-30 2022-09-30 Data quality inspection method and system for multi-service system

Publications (1)

Publication Number Publication Date
CN115600803A true CN115600803A (en) 2023-01-13

Family

ID=84845435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211209392.3A Pending CN115600803A (en) 2022-09-30 2022-09-30 Data quality inspection method and system for multi-service system

Country Status (1)

Country Link
CN (1) CN115600803A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117093589A (en) * 2023-10-16 2023-11-21 北京国基科技股份有限公司 Unstructured data warehousing method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117093589A (en) * 2023-10-16 2023-11-21 北京国基科技股份有限公司 Unstructured data warehousing method and device
CN117093589B (en) * 2023-10-16 2024-01-16 北京国基科技股份有限公司 Unstructured data warehousing method and device

Similar Documents

Publication Publication Date Title
CN109784758B (en) Engineering quality supervision early warning system and method based on BIM model
CN110516820B (en) BIM-based steel structure bridge informatization operation and maintenance system and processing method
CN107810500A (en) Data quality analysis
US20090204517A1 (en) Intercompany accounting data analytics
Bigonha et al. The usefulness of software metric thresholds for detection of bad smells and fault prediction
CN104392297A (en) Method and system for realizing non-business process irregularity detection in large data environment
CN115600803A (en) Data quality inspection method and system for multi-service system
CN115828390A (en) Four-pre-function implementation method for safety monitoring of hydraulic and hydroelectric engineering
Zhao et al. Research on international standardization of software quality and software testing
CN115392805A (en) Transaction type contract compliance risk diagnosis method and system
CN117114412A (en) Safety pre-control method and device for dangerous chemical production enterprises
Wiesner et al. An ontology-based environment for effective collaborative and concurrent process engineering
Ibarra et al. Model for integrated production and quality control: implementation and testing using commercial software applications
Onyshchenko et al. Industry 4.0 and Accounting: a theoretical approach
CN110347741B (en) System for effectively improving output result data quality in big data processing process and control method thereof
CN113297146A (en) Processing model and method for local supervision submission data
Li et al. Research on Welding Quality Traceability Model of Offshore Platform Block Construction Process.
Atagoren et al. A case study in defect measurement and root cause analysis in a turkish software organization
Ma et al. Data management of salt cavern gas storage based on data model
CN113515542B (en) Pipeline data association method and device
Martins Maintenance management of a production line-a case study in a furniture industry
CN117972115B (en) Method, equipment and medium for constructing process automation rule base
He et al. Software component reliability evaluation method based on characteristic parameters
CN115576958B (en) Data verification method, equipment and medium for production equipment supervision report
Liashenko et al. The Impact of Data Analytics on the Nature of Doing Business

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination