CN111539633A - Service data quality auditing method, system, device and storage medium - Google Patents

Service data quality auditing method, system, device and storage medium

Info

Publication number
CN111539633A
Authority
CN
China
Prior art keywords
auditing
data
interface
audit
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010338216.4A
Other languages
Chinese (zh)
Inventor
白树军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Si Tech Information Technology Co Ltd
Original Assignee
Beijing Si Tech Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Si Tech Information Technology Co Ltd filed Critical Beijing Si Tech Information Technology Co Ltd
Priority to CN202010338216.4A priority Critical patent/CN111539633A/en
Publication of CN111539633A publication Critical patent/CN111539633A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395Quality analysis or management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24564Applying rules; Deductive queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis

Abstract

The invention relates to a method, a system, a device and a storage medium for auditing the quality of service data. Service data is acquired from a source system and sent to an ETL interface machine to obtain a target data file; a preset interface audit rule is acquired, and it is judged whether the ETL interface machine complies with the interface audit rule; if not, the audit is ended and interface alarm information is sent; if so, an interface audit report is obtained, the target data file is loaded into a target database, a preset data audit rule is acquired, and the target data file in the target database is audited according to the data audit rule to obtain a data audit result. The invention monitors the data from the source, discovers abnormalities in the service data in advance, analyzes and locates problems and controls risk in time, and feeds back the data monitoring of each node across the whole life cycle, thereby realizing full-process monitoring of service data quality and effectively ensuring the consistency of the service data.

Description

Service data quality auditing method, system, device and storage medium
Technical Field
The present invention relates to the field of network communication service support, and in particular, to a method, a system, an apparatus, and a storage medium for auditing service data quality.
Background
With the development of network communication technology, the communications industry is undergoing an internet transformation, and the volume of business acceptance and of transactions at online and offline front-end contact points keeps growing. As a result, discrepancies in business data become more and more likely; meanwhile, more and more platforms are involved in each business, so inconsistent business data occurs frequently, which increases customer complaints, lowers the customer service level and customer satisfaction, and causes loss of business revenue.
Therefore, it is necessary to audit the quality of the service data, monitor the quality of the service data, reduce the influence of the service data on the subsequent system, find the abnormality of the service data in time, further find and analyze the problem of the service system, ensure the consistency of the service data, and control the risk in time.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method, a system, a device and a storage medium for auditing the quality of service data, which can monitor data from the source, discover abnormalities in the service data in advance and thereby discover abnormalities in the service system, analyze and locate problems in time, monitor the quality of the service data across the full process, ensure the consistency of the service data, control risk in time and improve service efficiency.
The technical scheme for solving the technical problems is as follows:
a method for auditing service data quality comprises the following steps:
step 1: acquiring service data from a source system and sending it to an ETL interface machine to obtain a target data file;
step 2: acquiring a preset interface audit rule and judging whether the ETL interface machine complies with the interface audit rule; if so, obtaining an interface audit report, loading the target data file into a target database, and then executing step 3; if not, ending the audit and sending interface alarm information;
step 3: acquiring a preset data audit rule, and auditing the target data file in the target database according to the data audit rule to obtain a data audit result.
The invention has the beneficial effects that: when service data is acquired from the source system and sent to the ETL interface machine to obtain the target data file, the interface (namely the ETL interface machine) is audited first, and the subsequent data auditing process is carried out only when the ETL interface machine complies with the preset interface audit rule; if the ETL interface machine does not comply with the preset interface audit rule, interface alarm information is sent out to notify the source service production system of the abnormality; if the ETL interface machine complies with the preset interface audit rule, the target data file loaded from the ETL interface machine into the target database is audited using the preset data audit rule to obtain the data audit result corresponding to the target data file;
the method audits data quality starting from the interface, so the data is monitored from the source; if a problem exists, the subsequent flow is stopped and an alarm is given, which avoids the situation in which a problem is found only after the report data has been produced and the work must then be redone; problems are known and perceived first, abnormalities in the service data are discovered in advance, abnormalities in the service system are thereby discovered, problems are analyzed and located in time, risk is controlled and service efficiency is effectively improved; meanwhile, end-to-end data quality is monitored from the service domain to the analysis system, that is, from the data source of the production system, to the interface layer of the analysis system, to the summary layer of the intermediate data warehouse and to the report application; the data monitoring of each node is fed back across the whole life cycle, data abnormalities are found and can be intervened in promptly so that their influence does not spread, thereby realizing full-process monitoring of service data quality, effectively ensuring the consistency of the service data, greatly improving the data monitoring effect and helping to improve service quality.
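The gating behaviour of steps 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `AuditOutcome` and `run_audit`, the representation of rules as (name, predicate) pairs and the `load` callback are all assumptions made for illustration. The sketch only shows that the data audit runs when, and only when, every interface rule passes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

# Hypothetical rule shape: a label plus a predicate returning True on success.
Rule = Tuple[str, Callable[[Any], bool]]


@dataclass
class AuditOutcome:
    interface_passed: bool = True
    interface_report: List[str] = field(default_factory=list)
    alarms: List[str] = field(default_factory=list)
    data_result: str = ""  # "normal", "alarm", or "" if the data audit never ran


def run_audit(target_file: Any,
              interface_rules: List[Rule],
              data_rules: List[Rule],
              load: Callable[[Any], Any]) -> AuditOutcome:
    outcome = AuditOutcome()

    # Step 2: audit the interface before anything is loaded.
    for name, check in interface_rules:
        if check(target_file):
            outcome.interface_report.append(f"{name}: ok")
        else:
            outcome.interface_passed = False
            outcome.alarms.append(f"interface alarm: {name} failed")
    if not outcome.interface_passed:
        outcome.interface_report.clear()  # no report is produced on failure
        return outcome                    # end the audit here

    # Interface passed: load into the target database, then step 3.
    rows = load(target_file)
    failed = [name for name, check in data_rules if not check(rows)]
    outcome.alarms.extend(f"data alarm: {name} failed" for name in failed)
    outcome.data_result = "alarm" if failed else "normal"
    return outcome


# Usage with toy rules (assumed for illustration).
sample = {"arrived": True, "rows": [{"id": 1, "fee": 10.0}]}
result = run_audit(
    sample,
    interface_rules=[("arrived", lambda f: f["arrived"])],
    data_rules=[("non_empty", lambda rows: len(rows) > 0)],
    load=lambda f: f["rows"],
)
print(result.data_result)
```

Returning early on an interface failure mirrors the text: the alarm is raised at the source and nothing downstream is loaded or audited.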
On the basis of the technical scheme, the invention also has the following improvements:
further: the source system comprises a relational database, a Hadoop system and a file system;
the specific steps of the step 1 comprise:
extracting the business data from the relational database, the Hadoop system and the file system respectively, and performing data cleaning on the business data to obtain a first data file;
the relational database, the Hadoop system and the file system respectively generate a second data file according to a preset data file standard format;
obtaining the target data file from the first data file and/or the second data file.
The beneficial effects of the further technical scheme are as follows: the first data file is obtained by directly extracting service data from the different types of source systems and cleaning the data, the second data file is generated by the different types of source systems directly according to a preset data file standard format, and the target data file is obtained according to the first data file or/and the second data file, so that more types of data sources are covered, the quality of more service data can be audited in a full flow, the universality is higher, and the application range is wider; the preset data file standard format can be set and adjusted according to actual conditions.
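As an illustration of how a target data file might be assembled from the two kinds of data file, the following Python sketch cleans extracted records into a first data file and combines it with a second data file that a source system is assumed to have produced already. The pipe-delimited layout, the field names in `STANDARD_FIELDS` and all function names are hypothetical; the patent leaves the standard format to be set according to actual conditions.

```python
import csv
import io
from typing import Dict, Iterable, List, Optional

# Hypothetical "preset data file standard format": pipe-delimited with a header.
STANDARD_FIELDS = ["order_id", "channel", "amount"]


def clean(records: Iterable[Dict[str, object]]) -> List[Dict[str, str]]:
    """Trim whitespace and drop records that lack any standard field."""
    cleaned = []
    for rec in records:
        row = {k: str(rec.get(k, "")).strip() for k in STANDARD_FIELDS}
        if all(row.values()):
            cleaned.append(row)
    return cleaned


def to_standard_file(rows: List[Dict[str, str]]) -> str:
    """Serialise cleaned rows in the assumed standard format."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=STANDARD_FIELDS,
                            delimiter="|", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def build_target_file(first_file: Optional[str] = None,
                      second_file: Optional[str] = None) -> str:
    """Combine whichever files are present, keeping the header once."""
    parts = [f for f in (first_file, second_file) if f]
    if not parts:
        raise ValueError("at least one data file is required")
    header = parts[0].splitlines()[0]
    body = [line for part in parts for line in part.splitlines()[1:]]
    return "\n".join([header, *body]) + "\n"


# First data file: extracted by the auditing side, then cleaned.
extracted = [
    {"order_id": " 1001 ", "channel": "online", "amount": "25.00"},
    {"order_id": "", "channel": "offline", "amount": "9.90"},  # dropped: no id
]
first = to_standard_file(clean(extracted))
# Second data file: assumed to be delivered by the source system itself.
second = "order_id|channel|amount\n2001|offline|12.50\n"
target = build_target_file(first, second)
print(target)
```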
Further: the interface auditing rule comprises interface file timeliness auditing and interface file integrity auditing;
the specific steps of the step 2 comprise:
presetting a first auditing index corresponding to the timeliness auditing of the interface file and a second auditing index corresponding to the integrity auditing of the interface file;
calling an auditing engine to obtain the first auditing index and the second auditing index;
auditing the ETL interface machine against the first audit index and the second audit index respectively; if the ETL interface machine meets both the first audit index and the second audit index, obtaining an interface audit report, loading the target data file into the target database, and then executing step 3; if the ETL interface machine does not meet the first audit index and/or the second audit index, ending the audit and sending interface alarm information.
The beneficial effects of the further technical scheme are as follows: the preset first audit index, corresponding to the interface file timeliness audit, makes it easy to judge whether the interface files on the ETL interface machine are timely, which ensures that the interface files arrive on time, allows the quality of the service data to be monitored from the source and guards against abnormalities arising at the source; similarly, the preset second audit index, corresponding to the interface file integrity audit, makes it easy to judge whether the interface files on the ETL interface machine are complete, which improves the reliability of the subsequent audit of the target data file and likewise monitors the quality of the service data from the source; the subsequent target service data is audited only when the ETL interface machine meets both the first audit index and the second audit index, and whenever either or both audit indexes are not met, corresponding interface alarm information is sent out and the subsequent audit process is ended, which avoids finding a problem only after the report data has been produced and then having to reprocess it; problems are known first, abnormalities in the service data are discovered in advance, abnormalities in the service system are thereby discovered, problems are analyzed and located in time, risk is controlled and service efficiency is effectively improved; the first audit index and the second audit index can be selected and adjusted according to actual conditions.
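The patent does not define the two audit indexes concretely. As one possible reading, the Python sketch below treats the first audit index as an arrival deadline and the second as a comparison between the number of records received and an expected count (for example from a companion check file). The class `InterfaceFile`, its fields and the function names are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Union


@dataclass
class InterfaceFile:
    name: str
    arrived_at: datetime
    record_count: int
    expected_count: int  # hypothetical: declared by the source in a check file


def audit_timeliness(f: InterfaceFile, deadline: datetime) -> bool:
    """First audit index (assumed): the file arrived no later than the deadline."""
    return f.arrived_at <= deadline


def audit_integrity(f: InterfaceFile) -> bool:
    """Second audit index (assumed): received records match the declared count."""
    return f.record_count == f.expected_count


def audit_interface(f: InterfaceFile, deadline: datetime) -> Dict[str, Union[bool, str]]:
    failures = []
    if not audit_timeliness(f, deadline):
        failures.append("timeliness")
    if not audit_integrity(f):
        failures.append("integrity")
    if failures:
        # Either or both indexes failed: end the audit and raise an alarm.
        return {"passed": False,
                "alarm": f"{f.name}: failed " + " and ".join(failures)}
    return {"passed": True,
            "report": f"{f.name}: timeliness ok, integrity ok"}


deadline = datetime(2020, 4, 26, 6, 0)
on_time = InterfaceFile("orders.dat", datetime(2020, 4, 26, 5, 30), 100, 100)
late_and_short = InterfaceFile("orders.dat", datetime(2020, 4, 26, 7, 0), 90, 100)
print(audit_interface(on_time, deadline))
print(audit_interface(late_and_short, deadline))
```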
Further: the data auditing result comprises auditing normal information and auditing warning information;
the specific steps of the step 3 comprise:
calling the auditing engine to obtain the data auditing rule;
auditing the target data file in the target database according to the data audit rule; if the audit passes, generating the audit normal information; if the audit does not pass, generating the audit alarm information.
The beneficial effects of the further technical scheme are as follows: once the interface audit has passed, the target data file is audited, which realizes full-process monitoring of the service data; when the target data file is audited, if the target service data satisfies the preset data audit rule, the corresponding data audit result, namely the audit normal information, is obtained; if the data audit rule is not satisfied, the corresponding data audit result, namely the audit alarm information, is obtained, and the relevant personnel are notified to take corrective measures promptly, so that risk is controlled in time, the data monitoring of each node is fed back across the whole life cycle, and data consistency is ensured.
Further: the data auditing rule comprises at least one of null value verification, data type verification, value range verification, primary key uniqueness verification, index volatility verification and index expression verification.
The beneficial effects of the further technical scheme are as follows: through at least one of the data auditing rules, whether the target service data is abnormal or not can be effectively and accurately judged, the accuracy of data quality monitoring is improved, the consistency of the service data is effectively ensured, and the service quality is further improved.
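The six kinds of verification named above can be sketched in Python as small predicates over rows of a loaded table. This is only an illustration under stated assumptions: the column names (`order_id`, `fee`, `tax`, `total`), the 20% volatility threshold, the value range and the expression `fee + tax = total` are hypothetical, and index volatility is read here as the relative change of an indicator against its previous value.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def null_check(rows: List[Row], column: str) -> bool:
    return all(r.get(column) is not None for r in rows)


def type_check(rows: List[Row], column: str, expected) -> bool:
    return all(isinstance(r.get(column), expected) for r in rows)


def range_check(rows: List[Row], column: str, low: float, high: float) -> bool:
    return all(low <= r[column] <= high for r in rows)


def primary_key_unique(rows: List[Row], key: str) -> bool:
    keys = [r[key] for r in rows]
    return len(keys) == len(set(keys))


def volatility_check(current: float, previous: float, max_change: float) -> bool:
    """Relative change of an indicator versus its previous value."""
    if previous == 0:
        return current == 0
    return abs(current - previous) / abs(previous) <= max_change


def expression_check(rows: List[Row], predicate: Callable[[Row], bool]) -> bool:
    return all(predicate(r) for r in rows)


def audit_data(rows: List[Row], previous_total: float,
               max_change: float = 0.2) -> Dict[str, Any]:
    failed: List[str] = []
    # Structural checks first; the later checks assume non-null numbers.
    for col in ("order_id", "fee", "tax", "total"):
        if not null_check(rows, col):
            failed.append(f"null:{col}")
    if not failed and not all(type_check(rows, c, (int, float))
                              for c in ("fee", "tax", "total")):
        failed.append("type")
    if not failed:
        if not range_check(rows, "fee", 0, 10_000):
            failed.append("range:fee")
        if not primary_key_unique(rows, "order_id"):
            failed.append("primary_key")
        total = sum(r["total"] for r in rows)
        if not volatility_check(total, previous_total, max_change):
            failed.append("volatility")
        if not expression_check(
                rows, lambda r: abs(r["fee"] + r["tax"] - r["total"]) < 1e-9):
            failed.append("expression")
    return {"result": "alarm" if failed else "normal", "failed": failed}


rows = [
    {"order_id": 1, "fee": 30.0, "tax": 3.0, "total": 33.0},
    {"order_id": 2, "fee": 50.0, "tax": 5.0, "total": 55.0},
]
print(audit_data(rows, previous_total=80.0))
```

The two possible values of `result` correspond to the audit normal information and the audit alarm information; the `failed` list is an added convenience for locating the rule that triggered the alarm.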
Further: after the step 3, the method further comprises the following steps:
step 4: displaying the interface audit report and the data audit result respectively according to a preset display requirement.
The beneficial effects of the further technical scheme are as follows: by displaying the interface audit report and the data audit result, related monitoring personnel can conveniently, quickly and intuitively master the audit process of the whole service data quality full-flow monitoring, so that the abnormity can be timely found, and the audit efficiency is effectively improved.
According to another aspect of the present invention, a system for auditing quality of service data is also provided, which includes a data acquisition module, an interface auditing module, a report generation module, an interface alarm module and a data auditing module;
the data acquisition module is used for acquiring service data from a source system and sending it to an ETL interface machine to obtain a target data file;
the interface auditing module is used for acquiring a preset interface auditing rule and judging whether the ETL interface machine accords with the interface auditing rule or not;
the report generating module is used for obtaining an interface auditing report when the interface auditing module judges that the ETL interface machine accords with the interface auditing rule;
the interface warning module is used for ending auditing and sending interface warning information when the interface auditing module judges that the ETL interface machine does not accord with the interface auditing rule;
the data auditing module is used for loading the target data file into a target database when the interface auditing module judges that the ETL interface machine accords with the interface auditing rule, acquiring a preset data auditing rule, and auditing and judging the target data file in the target database according to the data auditing rule to obtain a data auditing result.
The invention has the beneficial effects that: the auditing system for service data quality audits data quality starting from the interface, so the data is monitored from the source; if a problem exists, the subsequent flow is stopped and an alarm is given, which avoids the situation in which a problem is found only after the report data has been produced and the report data must then be reprocessed; problems are known first, abnormalities in the service data are discovered in advance, abnormalities in the service system are thereby discovered, problems are analyzed and located in time, risk is controlled and service efficiency is effectively improved; meanwhile, end-to-end data quality is monitored from the service domain to the analysis system, that is, from the data source of the production system, to the interface layer of the analysis system, to the summary layer of the intermediate data warehouse and to the report application; the data monitoring of each node is fed back across the whole life cycle, data abnormalities are found and can be intervened in promptly so that their influence does not spread, thereby realizing full-process monitoring of service data quality, effectively ensuring the consistency of the service data, greatly improving the data monitoring effect and helping to improve service quality.
On the basis of the technical scheme, the invention also has the following improvements:
further: the source system comprises a relational database, a Hadoop system and a file system;
the data acquisition module is specifically configured to:
extracting the business data from the relational database, the Hadoop system and the file system respectively, and performing data cleaning on the business data to obtain a first data file;
the relational database, the Hadoop system and the file system respectively generate a second data file according to a preset data file standard format;
obtaining the target data file from the first data file and/or the second data file.
Further: the interface auditing rule comprises interface file timeliness auditing and interface file integrity auditing;
the interface auditing module is specifically configured to:
presetting a first auditing index corresponding to the timeliness auditing of the interface file and a second auditing index corresponding to the integrity auditing of the interface file;
calling an auditing engine to obtain the first auditing index and the second auditing index;
auditing and judging the ETL interface machine according to the first auditing index and the second auditing index respectively;
the report generation module is specifically configured to:
if the interface auditing module judges that the ETL interface machine meets the first auditing index and the second auditing index, the interface auditing report is obtained;
the interface alarm module is specifically configured to:
if the interface auditing module judges that the ETL interface machine does not meet the first audit index and/or the second audit index, ending the audit and sending interface alarm information.
Further: the data auditing result comprises auditing normal information and auditing warning information;
the data auditing module is specifically configured to:
calling the auditing engine to obtain the data auditing rule;
auditing the target data file in the target database according to the data audit rule; if the audit passes, generating the audit normal information; if the audit does not pass, generating the audit alarm information.
Further: the data auditing rule comprises at least one of null value verification, data type verification, value range verification, primary key uniqueness verification, index volatility verification and index expression verification.
Further: the system also comprises a display module, and the display module is specifically configured to:
display the interface audit report and the data audit result respectively according to a preset display requirement.
According to another aspect of the present invention, an auditing device for business data quality is provided, which includes a processor, a memory and a computer program stored in the memory and operable on the processor, wherein the computer program, when run, implements the steps of the auditing method for business data quality.
The invention has the beneficial effects that: the quality of the service data is audited by the computer program stored in the memory and running on the processor; data quality is audited starting from the interface, so the data is monitored from the source, problems are known and perceived first, abnormalities in the service data are discovered in advance, problems are analyzed and located in time, risk is controlled and service efficiency is effectively improved; meanwhile, the invention feeds back the data monitoring of each node across the whole life cycle, finds data abnormalities and allows timely intervention when data is abnormal so that the influence does not spread, thereby realizing full-process monitoring of service data quality, effectively ensuring the consistency of the service data, greatly improving the data monitoring effect and helping to improve service quality.
In accordance with another aspect of the present invention, there is provided a computer storage medium comprising: at least one instruction which, when executed, implements a step in a method for auditing quality of service data according to the invention.
The invention has the beneficial effects that: the quality of the service data is audited by executing the at least one instruction contained in the computer storage medium; data quality is audited starting from the interface, so the data is monitored from the source, problems are known first, abnormalities in the service data are discovered in advance, problems are analyzed and located in time, risk is controlled and service efficiency is effectively improved; meanwhile, the invention feeds back the data monitoring of each node across the whole life cycle, finds data abnormalities and allows timely intervention when data is abnormal so that the influence does not spread, thereby realizing full-process monitoring of service data quality, effectively ensuring the consistency of the service data, greatly improving the data monitoring effect and helping to improve service quality.
Drawings
FIG. 1 is a schematic flowchart of a method for auditing service data quality according to the first embodiment of the present invention;
FIG. 2 is a schematic flowchart of obtaining a target data file according to the first embodiment of the present invention;
FIG. 3 is a schematic flowchart of judging whether an ETL interface machine complies with the interface audit rule according to the first embodiment of the present invention;
FIG. 4 is a schematic flowchart of obtaining a data audit result according to the first embodiment of the present invention;
FIG. 5 is a schematic flowchart of another method for auditing service data quality according to the first embodiment of the present invention;
FIG. 6 is a block diagram of a complete model for auditing service data quality according to the first embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an auditing system for service data quality according to a second embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another auditing system for service data quality according to the second embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
The present invention will be described with reference to the accompanying drawings.
In a first embodiment, as shown in FIG. 1, a method for auditing service data quality includes the following steps:
S1: acquiring service data from a source system and sending it to an ETL interface machine to obtain a target data file;
S2: acquiring a preset interface audit rule and judging whether the ETL interface machine complies with the interface audit rule; if so, obtaining an interface audit report, loading the target data file into a target database, and then executing S3; if not, ending the audit and sending interface alarm information;
S3: acquiring a preset data audit rule, and auditing the target data file in the target database according to the data audit rule to obtain a data audit result.
When business data is acquired from the source system and sent to the ETL interface machine to obtain the target data file, the interface (namely the ETL interface machine) is audited first, and the subsequent data auditing process is carried out only when the ETL interface machine complies with the preset interface audit rule; if the ETL interface machine does not comply with the preset interface audit rule, interface alarm information is sent out to notify the source business production system of the abnormality; if the ETL interface machine complies with the preset interface audit rule, the target data file loaded from the ETL interface machine into the target database is audited using the preset data audit rule to obtain the data audit result corresponding to the target data file;
the method for auditing business data of this embodiment audits data quality starting from the interface, so the data is monitored from the source; if a problem exists, the subsequent flow is stopped and an alarm is given, which avoids the situation in which a problem is found only after the report data has been produced and the report data must then be reprocessed; problems are known and found first, abnormalities in the business data are discovered in advance, abnormalities in the business system are thereby discovered, problems are analyzed and located in time, risk is controlled and business efficiency is effectively improved; meanwhile, end-to-end data quality is monitored from the business domain to the analysis system, that is, from the data source of the production system, to the interface layer of the analysis system, to the summary layer of the intermediate data warehouse and to the report application; the data monitoring of each node is fed back across the whole life cycle, data abnormalities are found and can be intervened in promptly so that their influence does not spread, thereby realizing full-process monitoring of business data quality, effectively ensuring the consistency of the business data, greatly improving the data monitoring effect and helping to improve business service quality.
Specifically, the ETL interface can extract, clean and convert the data of the business system and then load the data into the data warehouse, so that the scattered, disordered and non-uniform data in the enterprise can be integrated together, and an analysis basis is provided for the decision of the enterprise.
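The extract, clean and convert, then load sequence described above can be sketched with Python's standard `sqlite3` module. In-memory SQLite databases stand in for both the business system and the data warehouse, and the table names `orders` and `dw_orders`, the columns and the cleaning rules are assumptions made for illustration; a real ETL interface machine would connect to the actual source systems.

```python
import sqlite3
from typing import List, Tuple


def extract(source: sqlite3.Connection) -> List[tuple]:
    """Extract raw rows from the (hypothetical) business table."""
    return source.execute(
        "SELECT order_id, channel, amount FROM orders").fetchall()


def transform(rows: List[tuple]) -> List[Tuple[int, str, float]]:
    """Clean and convert: drop rows without an id, normalise text, cast types."""
    out = []
    for order_id, channel, amount in rows:
        if order_id is None:
            continue
        out.append((int(order_id), channel.strip().lower(), float(amount)))
    return out


def load(warehouse: sqlite3.Connection, rows: List[Tuple[int, str, float]]) -> None:
    """Load the converted rows into the (hypothetical) warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS dw_orders ("
        "order_id INTEGER PRIMARY KEY, channel TEXT, amount REAL)")
    warehouse.executemany("INSERT INTO dw_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()


# Toy source data with inconsistent formatting and one unusable row.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id TEXT, channel TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("1", " Online ", "25.00"), (None, "offline", "9.90"), ("2", "OFFLINE", "12.5")],
)

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source)))
loaded = warehouse.execute(
    "SELECT order_id, channel, amount FROM dw_orders ORDER BY order_id").fetchall()
print(loaded)
```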
Preferably, the source system comprises a relational database, a Hadoop system and a file system;
as shown in FIG. 2, the specific steps of S1 include:
S11: extracting the business data from the relational database, the Hadoop system and the file system respectively, and performing data cleaning on the business data to obtain a first data file;
S12: the relational database, the Hadoop system and the file system respectively generate a second data file according to a preset data file standard format;
S13: obtaining the target data file from the first data file and/or the second data file.
The first data file is obtained by directly extracting service data from the different types of source systems and cleaning the data, the second data file is generated by the different types of source systems directly according to a preset data file standard format, and the target data file is obtained according to the first data file or/and the second data file, so that more types of data sources are covered, the quality of more service data can be audited in a full flow, the universality is higher, and the application range is wider; the preset data file standard format can be set and adjusted according to actual conditions.
Specifically, the relational database is a database built on the relational model, in which data is processed by means of mathematical concepts and methods such as set algebra; the standard data query language SQL is a language based on relational databases and is used to retrieve and manipulate the data in them; relational databases include Oracle, MySQL, and the like.
The Hadoop system is a distributed system infrastructure developed by the Apache Software Foundation; it allows users to develop distributed programs without understanding the underlying distributed details, and makes full use of the power of a cluster for high-speed computation and storage; it addresses two problems, big data storage and big data analysis, which correspond to the two cores of Hadoop: HDFS and MapReduce.
The file system is the method and data structure used by an operating system to organize files on a storage device (usually a magnetic disk, or a solid state disk based on NAND flash) or a partition, that is, the method for organizing files on the storage device; the operating system is responsible for managing and storing file information; the file system consists of three parts, namely the interface of the file system, the objects it manages, and their attributes; for file transfer, a variety of file transfer protocols are supported, such as FTP and SFTP.
Preferably, the interface auditing rule comprises interface file timeliness auditing and interface file integrity auditing;
as shown in FIG. 3, the specific steps of S2 include:
S21: presetting a first audit index corresponding to the interface file timeliness audit and a second audit index corresponding to the interface file integrity audit;
S22: calling an auditing engine to obtain the first audit index and the second audit index;
S23: auditing the ETL interface machine against the first audit index and the second audit index respectively; if the ETL interface machine meets both the first audit index and the second audit index, obtaining an interface audit report, loading the target data file into the target database, and then executing S3; if the ETL interface machine does not meet the first audit index and/or the second audit index, ending the audit and sending interface alarm information.
The preset first auditing index, corresponding to the interface file timeliness audit, makes it easy to judge whether the interface files on the ETL interface machine are timely, so that the quality of the service data is monitored at the source and anomalies at the source are avoided; similarly, the preset second auditing index, corresponding to the interface file integrity audit, makes it easy to judge whether the interface files on the ETL interface machine are complete, which improves the reliability of the subsequent audit of the target data file and likewise monitors the quality of the service data at the source. The target service data is audited only when the ETL interface machine meets both the first auditing index and the second auditing index; if either or both auditing indexes are not met, the corresponding interface alarm information is sent and the subsequent audit process ends. This avoids discovering problems only after the report data has been produced and then reprocessing it: problems are detected early, anomalies in the service data, and in turn in the service system, are found in advance, problems are analyzed and located in time, risk is controlled, and service efficiency is effectively improved. The first auditing index and the second auditing index can be selected and adjusted according to actual conditions.
Specifically, the first auditing index in this embodiment may be a time attribute of the interface file, and the second auditing index may be the number of interface files, the number of records in a file, or the file size, or may be the number of records in a data table (i.e., the table's row count) in the interface file.
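The two interface auditing indexes can be illustrated with a small sketch. It assumes that timeliness is judged against an arrival deadline and integrity against an expected file count and a minimum record count; the `InterfaceFile` type, the `audit_interface` function and its parameters are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class InterfaceFile:
    name: str
    arrived_at: datetime  # time attribute of the interface file
    record_count: int     # number of records in the file

def audit_interface(files, deadline, expected_files, min_records):
    """Return (passed, problems) for the timeliness and integrity audits."""
    problems = []
    # First auditing index (timeliness): every file must arrive by the deadline.
    for f in files:
        if f.arrived_at > deadline:
            problems.append(f"{f.name}: arrived late at {f.arrived_at.isoformat()}")
    # Second auditing index (integrity): file count and per-file record count.
    if len(files) != expected_files:
        problems.append(f"expected {expected_files} files, got {len(files)}")
    for f in files:
        if f.record_count < min_records:
            problems.append(f"{f.name}: only {f.record_count} records")
    return (not problems, problems)

deadline = datetime(2020, 4, 26, 6, 0)
on_time = [InterfaceFile("a.dat", deadline - timedelta(hours=1), 100)]
passed, problems = audit_interface(on_time, deadline, expected_files=1, min_records=1)
```

An empty `problems` list corresponds to the branch that yields the interface audit report; a non-empty list corresponds to ending the audit and sending interface alarm information.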
Preferably, the data auditing result comprises an audit normal message and audit alarm information;
As shown in fig. 4, the specific steps of S3 include:
S31: calling the auditing engine to obtain the data auditing rule;
S32: auditing and judging the target data file in the target database according to the data auditing rule; if the audit passes, generating the audit normal message; if the audit does not pass, generating the audit alarm information.
When the interface audit is passed, the target data file is audited, which realizes full-process monitoring of the service data. If the target service data satisfies the preset data auditing rule, the corresponding data auditing result, namely the audit normal message, is obtained; if it does not, the corresponding data auditing result, namely the audit alarm information, is obtained, and the relevant personnel are notified to take corrective measures in time so that risk is controlled promptly. This ensures that monitoring data is fed back from each node throughout the whole life cycle and that data consistency is maintained.
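Steps S31 and S32 can be sketched as a function that receives the configured rules and turns their outcome into one of the two data auditing results. The representation of a rule as a named predicate over the loaded rows, and the name `run_data_audit`, are assumptions made for illustration only.

```python
def run_data_audit(rules, target_rows):
    """rules: mapping of rule name -> predicate over the loaded target rows."""
    failed = [name for name, check in rules.items() if not check(target_rows)]
    if failed:
        # Audit did not pass: this result would drive the audit alarm information.
        return {"status": "alarm", "failed_rules": failed}
    # Audit passed: this result corresponds to the audit normal message.
    return {"status": "normal", "failed_rules": []}

# Two hypothetical rules standing in for entries fetched by the auditing engine.
rules = {
    "non_empty": lambda rows: len(rows) > 0,
    "ids_present": lambda rows: all(r.get("id") is not None for r in rows),
}
result = run_data_audit(rules, [{"id": 1}, {"id": 2}])
```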
Preferably, the data audit rule includes at least one of null value check, data type check, value range check, primary key uniqueness check, index volatility check and index expression check.
Through at least one of the data auditing rules, whether the target service data is abnormal or not can be effectively and accurately judged, the accuracy of data quality monitoring is improved, the consistency of the service data is effectively ensured, and the service quality is further improved.
Specifically, the null value check checks whether a field contains null values and returns the corresponding result through a configured SQL statement, for example: select count(*) from <table name> where <field name> is null; a result greater than 0 indicates that null values exist. The table name and the field name are passed in as parameters through variables, which makes the audit generic.
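A minimal runnable version of the null value check is sketched below using Python's built-in `sqlite3` module and an in-memory database. The table `orders`, the column `amount` and the function name `count_nulls` are hypothetical; the embodiment itself targets databases such as Oracle or MySQL.

```python
import sqlite3

def count_nulls(conn, table, field):
    """Return the number of rows in which the given field is NULL."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted here;
    # they are assumed to come from a trusted configuration table.
    sql = f'SELECT COUNT(*) FROM "{table}" WHERE "{field}" IS NULL'
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 9.5), (2, None), (3, None)])

null_count = count_nulls(conn, "orders", "amount")
has_nulls = null_count > 0  # a result greater than 0 means null values exist
```

Because the table and field names are arguments, the same function serves every table that the audit configuration lists.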
The data type check checks the data type of a field's values and returns the corresponding result through a configured SQL statement; similarly, the value range check checks whether a field's values fall within the permitted range and returns the corresponding result through a configured SQL statement; the primary key uniqueness check checks the values of the primary key field and returns the corresponding result through a configured SQL statement, thereby judging whether each primary key value is unique.
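These three checks can be expressed as SQL templates in the same spirit as the null value check. The sketch below again uses `sqlite3`; the template texts, the `CHECKS` dictionary, the `run_check` helper and the `users` table are assumptions for illustration, and `typeof()` is specific to SQLite, so another database would need its own type test.

```python
import sqlite3

# Hypothetical check templates; each returns the number of violating rows or keys.
CHECKS = {
    "data_type": "SELECT COUNT(*) FROM {t} WHERE {f} IS NOT NULL "
                 "AND typeof({f}) <> 'integer'",
    "value_range": "SELECT COUNT(*) FROM {t} WHERE {f} NOT BETWEEN {lo} AND {hi}",
    "pk_unique": "SELECT COUNT(*) FROM "
                 "(SELECT {f} FROM {t} GROUP BY {f} HAVING COUNT(*) > 1)",
}

def run_check(conn, name, **params):
    """Fill in the configured template and return the violation count."""
    return conn.execute(CHECKS[name].format(**params)).fetchone()[0]

conn = sqlite3.connect(":memory:")
# Columns are left untyped so that SQLite stores each value with its own type.
conn.execute("CREATE TABLE users (id, code, age)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, 10, 30), (2, "abc", 200), (2, 12, -5)])

type_violations = run_check(conn, "data_type", t="users", f="code")
range_violations = run_check(conn, "value_range", t="users", f="age", lo=0, hi=150)
duplicate_keys = run_check(conn, "pk_unique", t="users", f="id")
```

A count of zero means the check passes; any positive count would feed the audit alarm information.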
The index volatility check audits how much an index fluctuates, including record-count volatility, field volatility and the like, where the record count and the field are the indexes. For example, to check the volatility of a table in the target data file, today's record count and yesterday's record count for the table are obtained and compared to see how large the fluctuation is; both record counts are indexes. For example, when table A is audited, yesterday's record count and today's record count of table A are acquired and compared, and three conditions are set in a configuration table before the comparison: the absolute value of today's record count/yesterday's record count falls in 0-30%, 30%-1, or >1, respectively. If the result is 0-30%, the audit result is considered normal; if it is 30%-1, a short message alarm is sent; if it is greater than 1, an error is reported directly.
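The three-band configuration can be sketched as a small classifier. The source wording of the compared quantity is ambiguous, so this sketch assumes it is the relative change |today - yesterday| / yesterday; that reading, the treatment of the exact band boundaries, the handling of a zero baseline, and the name `classify_volatility` are all assumptions.

```python
def classify_volatility(today_count, yesterday_count, warn_at=0.30, error_at=1.0):
    """Map the day-over-day change in record count onto the three configured bands."""
    if yesterday_count == 0:
        return "error"  # assumption: without a baseline the check cannot pass
    # Assumed reading of the rule: relative change against yesterday's count.
    change = abs(today_count - yesterday_count) / yesterday_count
    if change > error_at:
        return "error"       # band ">1": report an error directly
    if change > warn_at:
        return "sms_alarm"   # band "30%-1": send a short message alarm
    return "normal"          # band "0-30%": audit result is normal

status_small = classify_volatility(1100, 1000)   # 10% change
status_medium = classify_volatility(1500, 1000)  # 50% change
status_large = classify_volatility(2500, 1000)   # 150% change
```

Keeping `warn_at` and `error_at` as parameters reflects the statement that the conditions live in a configuration table.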
Specifically, this embodiment further includes a difference check, which checks the audit ratio, that is, the ratio of difference data to total data, against a set ratio threshold; for example, the audit ratio (difference data/total data) of a current online service is kept within 1%, and the audit ratio (difference data/total data) of an offline service is kept within 5%.
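As a short worked sketch of the difference check, the function below compares the audit ratio with the 1% and 5% thresholds given above. The function name, the boolean `online` flag and the handling of an empty data set are assumptions.

```python
def difference_ratio_ok(diff_count, total_count, online):
    """Check that difference data / total data stays within the threshold."""
    threshold = 0.01 if online else 0.05  # 1% online, 5% offline, as in the text
    if total_count == 0:
        return diff_count == 0  # assumption: an empty data set passes only with no differences
    return diff_count / total_count <= threshold

online_ok = difference_ratio_ok(5, 1000, online=True)      # 0.5% of 1000 records
online_bad = difference_ratio_ok(30, 1000, online=True)    # 3% exceeds the 1% limit
offline_ok = difference_ratio_ok(30, 1000, online=False)   # 3% is within the 5% limit
```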
Specifically, the audit alarm information includes a voice alarm or/and a short message alarm, that is, an alarm is given to the relevant monitoring personnel in a voice mode or/and an alarm is given by sending a short message to a handheld terminal of the relevant monitoring personnel.
Preferably, as shown in fig. 5, after S3, the method further includes:
S4: displaying the interface audit report and the data audit result respectively according to a preset display requirement.
By displaying the interface audit report and the data audit result, related monitoring personnel can conveniently, quickly and intuitively master the audit process of the whole service data quality full-flow monitoring, so that the abnormity can be timely found, and the audit efficiency is effectively improved; the preset display requirements can be set and adjusted according to actual conditions.
Specifically, the embodiment can display the report, the view and the interface to be confirmed (i.e. the interface which needs the relevant monitoring personnel to confirm on the operation interface) in the interface audit report and the data audit result in a plurality of ways.
Specifically, a block diagram of a complete model for auditing the quality of the service data in the present embodiment is shown in fig. 6.
In the second embodiment, as shown in fig. 7, an auditing system for service data quality includes a data obtaining module, an interface auditing module, a report generating module, an interface alarm module, and a data auditing module;
the data acquisition module is used for acquiring service data in a source system to an ETL interface machine to obtain a target data file;
the interface auditing module is used for acquiring a preset interface auditing rule and judging whether the ETL interface machine accords with the interface auditing rule or not;
the report generating module is used for obtaining an interface auditing report when the interface auditing module judges that the ETL interface machine accords with the interface auditing rule;
the interface warning module is used for ending auditing and sending interface warning information when the interface auditing module judges that the ETL interface machine does not accord with the interface auditing rule;
the data auditing module is used for loading the target data file into a target database when the interface auditing module judges that the ETL interface machine accords with the interface auditing rule, acquiring a preset data auditing rule, and auditing and judging the target data file in the target database according to the data auditing rule to obtain a data auditing result.
The service data quality auditing system of this embodiment audits data quality starting at the interface, so the data is monitored at the source; if a problem exists, the subsequent flow is stopped and an alarm is raised. This avoids discovering problems only after the report data has been produced and then reprocessing it: problems are detected early, anomalies in the service data, and in turn in the service system, are found in advance, problems are analyzed and located in time, risk is controlled, and service efficiency is effectively improved. Meanwhile, end-to-end quality monitoring of the service data from the business domain to the analysis system covers the data source of the production system, the interface layer of the analysis system, the summary layer of the intermediate data warehouse, and the report application; monitoring data is fed back from each node throughout the whole life cycle, so data anomalies are found and timely intervention prevents their impact from spreading. Full-process monitoring of service data quality is thus realized, the consistency of the service data is effectively ensured, the data monitoring effect is greatly improved, and service quality is improved.
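The way the modules cooperate, with the interface audit acting as a gate in front of loading and data auditing, can be sketched as follows. Each module is modeled as a plain callable; the function name `run_audit_pipeline` and the callable signatures are hypothetical and only illustrate the control flow described for the system.

```python
def run_audit_pipeline(acquire, interface_audit, load, data_audit, alarm):
    """Wire the modules together; stop before loading if the interface audit fails."""
    target_file = acquire()                        # data acquisition module
    passed, report = interface_audit(target_file)  # interface auditing module
    if not passed:
        alarm(report)                              # interface alarm module
        return {"interface": report, "data": None}
    load(target_file)                              # load into the target database
    return {"interface": report, "data": data_audit()}  # data auditing module

events = []
outcome = run_audit_pipeline(
    acquire=lambda: "target.dat",
    interface_audit=lambda f: (False, "file arrived late"),
    load=lambda f: events.append("load"),
    data_audit=lambda: "normal",
    alarm=lambda report: events.append("alarm: " + report),
)
```

In this run the failing interface audit triggers only the alarm, so the target data file is never loaded or audited.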
Preferably, the source system comprises a relational database, a Hadoop system and a file system;
the data acquisition module is specifically configured to:
extracting the business data from the relational database, the Hadoop system and the file system respectively, and performing data cleaning on the business data to obtain a first data file;
the relational database, the Hadoop system and the file system respectively generate a second data file according to a preset data file standard format;
and obtaining the target data file according to the first data file or/and the second data file.
Through the data acquisition module, more types of data sources are covered, so that the quality of more service data can be audited in the whole process, the universality is higher, and the application range is wider.
Preferably, the interface auditing rule comprises interface file timeliness auditing and interface file integrity auditing;
the interface auditing module is specifically configured to:
presetting a first auditing index corresponding to the timeliness auditing of the interface file and a second auditing index corresponding to the integrity auditing of the interface file;
calling an auditing engine to obtain the first auditing index and the second auditing index;
auditing and judging the ETL interface machine according to the first auditing index and the second auditing index respectively;
the report generation module is specifically configured to:
if the interface auditing module judges that the ETL interface machine meets the first auditing index and the second auditing index, the interface auditing report is obtained;
the interface alarm module is specifically configured to:
if the interface auditing module judges that the ETL interface machine does not meet the first auditing index or/and the second auditing index, the auditing is finished, and interface warning information is sent out.
Through the interface auditing module, the report generating module and the interface alarm module, the quality of the service data is monitored and audited at the source. This avoids discovering problems only after the report data has been produced and then reprocessing it: problems are detected early, anomalies in the service data, and in turn in the service system, are found in advance, problems are analyzed and located in time, risk is controlled, and service efficiency is effectively improved.
Preferably, the data auditing result comprises auditing normal messages and auditing alarm information;
the data auditing module is specifically configured to:
calling the auditing engine to obtain the data auditing rule;
auditing and judging the target data files in the target database according to the data auditing rule, and if the auditing is passed, generating the normal auditing message; and if the audit does not pass, generating the audit alarm information.
Through the data auditing module, the full-process monitoring of the service data is realized, when the service data is abnormal, related personnel can be timely informed to carry out corresponding solving measures, risks are timely controlled, the data monitoring of each node fed back in the whole life cycle is effectively guaranteed, and the consistency of the data is guaranteed.
Preferably, the data audit rule includes at least one of null value check, data type check, value range check, primary key uniqueness check, index volatility check and index expression check.
Through at least one of the data auditing rules, whether the target service data is abnormal or not can be effectively and accurately judged, the accuracy of data quality monitoring is improved, the consistency of the service data is effectively ensured, and the service quality is further improved.
Preferably, as shown in fig. 8, the system further includes a display module, and the display module is specifically configured to:
display the interface audit report and the data audit result respectively according to a preset display requirement.
Through the display module, related monitoring personnel can conveniently, quickly and intuitively master the auditing process of the whole service data quality full-flow monitoring, so that the abnormity can be timely found, and the auditing efficiency is effectively improved.
Third embodiment, based on the first embodiment and the second embodiment, the present embodiment further discloses an auditing device for business data quality, which includes a processor, a memory, and a computer program stored in the memory and operable on the processor, where the computer program implements specific steps S1 to S3 shown in fig. 1 when running.
The quality of the service data is audited by the computer program stored in the memory and run on the processor. Data quality is audited starting at the interface, so the data is monitored at the source, problems are detected early, anomalies in the service data are found in advance, problems are analyzed and located in time, risk is controlled, and service efficiency is effectively improved; meanwhile, the invention feeds back monitoring data from each node throughout the whole life cycle, finds data anomalies, allows timely intervention when data is abnormal, prevents the impact from spreading, realizes full-process monitoring of service data quality, effectively ensures the consistency of the service data, greatly improves the data monitoring effect, and helps improve service quality.
The present embodiment also provides a computer storage medium having at least one instruction stored thereon, where the instruction when executed implements the specific steps of S1-S3.
The quality of the service data is audited by executing the at least one instruction contained in the computer storage medium. Data quality is audited starting at the interface, so the data is monitored at the source, problems are detected early, anomalies in the service data are found in advance, problems are analyzed and located in time, risk is controlled, and service efficiency is effectively improved; meanwhile, the invention feeds back monitoring data from each node throughout the whole life cycle, finds data anomalies, allows timely intervention when data is abnormal, prevents the impact from spreading, realizes full-process monitoring of service data quality, effectively ensures the consistency of the service data, greatly improves the data monitoring effect, and helps improve service quality.
For details of S1 to S3 that are not described in this embodiment, refer to the first embodiment and the detailed descriptions of figs. 1 to 6; they are not repeated here.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for auditing service data quality is characterized by comprising the following steps:
step 1: acquiring service data in a source system to an ETL interface machine to obtain a target data file;
step 2: acquiring a preset interface audit rule, judging whether the ETL interface machine accords with the interface audit rule, if so, acquiring an interface audit report, loading the target data file into a target database, and then executing a step 3; if not, finishing the audit and sending interface alarm information;
step 3: acquiring a preset data auditing rule, and auditing and judging the target data files in the target database according to the data auditing rule to obtain a data auditing result.
2. The method for auditing quality of business data according to claim 1, where the source system comprises a relational database, a Hadoop system and a file system;
the specific steps of the step 1 comprise:
extracting the business data from the relational database, the Hadoop system and the file system respectively, and performing data cleaning on the business data to obtain a first data file;
the relational database, the Hadoop system and the file system respectively generate a second data file according to a preset data file standard format;
and obtaining the target data file according to the first data file or/and the second data file.
3. The method of claim 1, wherein the interface auditing rules include interface file timeliness audit and interface file integrity audit;
the specific steps of the step 2 comprise:
presetting a first auditing index corresponding to the timeliness auditing of the interface file and a second auditing index corresponding to the integrity auditing of the interface file;
calling an auditing engine to obtain the first auditing index and the second auditing index;
auditing and judging the ETL interface machine according to the first auditing index and the second auditing index respectively, if the ETL interface machine meets the first auditing index and the second auditing index, obtaining an interface auditing report, loading the target data file into the target database, and then executing the step 3; if the ETL interface machine does not meet the first audit index or/and the second audit index, ending audit and sending interface alarm information.
4. The method of claim 3, wherein the data auditing results include audit normal messages and audit alarm messages;
the specific steps of the step 3 comprise:
calling the auditing engine to obtain the data auditing rule;
auditing and judging the target data files in the target database according to the data auditing rule, and if the auditing is passed, generating the normal auditing message; and if the audit does not pass, generating the audit alarm information.
5. The method of claim 1, wherein the data auditing rules include at least one of null value check, data type check, value range check, primary key uniqueness check, index volatility check, and index expression check.
6. The method for auditing service data quality according to any one of claims 1-5, characterized by further comprising after said step 3:
step 4: displaying the interface audit report and the data audit result respectively according to a preset display requirement.
7. An auditing system for service data quality is characterized by comprising a data acquisition module, an interface auditing module, a report generation module, an interface alarm module and a data auditing module;
the data acquisition module is used for acquiring service data in a source system to an ETL interface machine to obtain a target data file;
the interface auditing module is used for acquiring a preset interface auditing rule and judging whether the ETL interface machine accords with the interface auditing rule or not;
the report generating module is used for obtaining an interface auditing report when the interface auditing module judges that the ETL interface machine accords with the interface auditing rule;
the interface warning module is used for ending auditing and sending interface warning information when the interface auditing module judges that the ETL interface machine does not accord with the interface auditing rule;
the data auditing module is used for loading the target data file into a target database when the interface auditing module judges that the ETL interface machine accords with the interface auditing rule, acquiring a preset data auditing rule, and auditing and judging the target data file in the target database according to the data auditing rule to obtain a data auditing result.
8. The system of claim 7, wherein the interface auditing rules include interface file timeliness audit and interface file integrity audit;
the interface auditing module is specifically configured to:
presetting a first auditing index corresponding to the timeliness auditing of the interface file and a second auditing index corresponding to the integrity auditing of the interface file;
calling an auditing engine to obtain the first auditing index and the second auditing index;
auditing and judging the ETL interface machine according to the first auditing index and the second auditing index respectively;
the report generation module is specifically configured to:
if the interface auditing module judges that the ETL interface machine meets the first auditing index and the second auditing index, the interface auditing report is obtained;
the interface alarm module is specifically configured to:
if the interface auditing module judges that the ETL interface machine does not meet the first auditing index or/and the second auditing index, the auditing is finished, and interface warning information is sent out.
9. An auditing device for quality of service data, comprising a processor, a memory and a computer program stored in the memory and operable on the processor, the computer program when executed implementing the method steps of any one of claims 1 to 6.
10. A computer storage medium, the computer storage medium comprising: at least one instruction which, when executed, implements the method steps of any one of claims 1 to 6.
CN202010338216.4A 2020-04-26 2020-04-26 Service data quality auditing method, system, device and storage medium Pending CN111539633A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010338216.4A CN111539633A (en) 2020-04-26 2020-04-26 Service data quality auditing method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN111539633A true CN111539633A (en) 2020-08-14

Family

ID=71977203

Country Status (1)

Country Link
CN (1) CN111539633A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102497435A (en) * 2011-12-16 2012-06-13 海南杰福瑞网络科技有限公司 Data distributing method and device of data service
CN103473672A (en) * 2013-09-30 2013-12-25 国家电网公司 System, method and platform for auditing metadata quality of enterprise-level data center
CN105761010A (en) * 2016-02-24 2016-07-13 国网山东省电力公司 Method and system for real-time monitoring of group enterprise audit based on real-time data acquisition
CN109039710A (en) * 2018-07-10 2018-12-18 中国联合网络通信集团有限公司 Route data auditing method, device, server and storage medium
CN110008201A (en) * 2019-04-09 2019-07-12 浩鲸云计算科技股份有限公司 A kind of quality of data towards big data checks monitoring method
CN110729054A (en) * 2019-10-14 2020-01-24 平安医疗健康管理股份有限公司 Abnormal diagnosis behavior detection method and device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
佟鑫等: "数据平台安全风险分析与评估方法", 保密科学技术 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035159A (en) * 2020-08-28 2020-12-04 中国建设银行股份有限公司 Configuration method, device, equipment and storage medium of audit model
CN112035159B (en) * 2020-08-28 2024-03-08 中国建设银行股份有限公司 Configuration method, device, equipment and storage medium of audit model
CN112183952A (en) * 2020-09-08 2021-01-05 支付宝(杭州)信息技术有限公司 Index quality supervision processing method and device and electronic equipment
CN112214532A (en) * 2020-10-13 2021-01-12 北京思特奇信息技术股份有限公司 Service data auditing method and related device
CN112508346B (en) * 2020-11-17 2022-06-24 四川新网银行股份有限公司 Method for realizing indexed business data auditing
CN112508346A (en) * 2020-11-17 2021-03-16 四川新网银行股份有限公司 Method for realizing indexed business data auditing
CN112861499A (en) * 2020-12-02 2021-05-28 国网浙江省电力有限公司台州供电公司 Multi-source data statistical method for power grid dispatching and power market application
CN112579352A (en) * 2020-12-14 2021-03-30 广州信安数据有限公司 Quality monitoring result generation method, storage medium and quality monitoring system of service data processing link
CN112597165A (en) * 2020-12-28 2021-04-02 中国建设银行股份有限公司 Supervision data quality verification method and device, electronic equipment and storage medium
CN112882896A (en) * 2021-02-23 2021-06-01 广州虎牙科技有限公司 Data monitoring method and device and electronic equipment
CN112926941A (en) * 2021-03-04 2021-06-08 远光软件股份有限公司 Management method and device for financial auditing rules, storage medium and server
CN112926941B (en) * 2021-03-04 2023-07-11 远光软件股份有限公司 Management method and device of financial auditing rules, storage medium and server
CN112860410A (en) * 2021-03-08 2021-05-28 北京睿芯高通量科技有限公司 Method for enhancing hierarchical control of production system
CN114356902A (en) * 2021-12-14 2022-04-15 中核武汉核电运行技术股份有限公司 Industrial data quality management method and device
CN114048516A (en) * 2022-01-13 2022-02-15 武汉烽火信息集成技术有限公司 Data auditing method, device, equipment and storage medium based on loss data packet
CN115545682A (en) * 2022-12-05 2022-12-30 深圳迅策科技有限公司 Report form auditing method and computing equipment
CN116400836A (en) * 2023-03-30 2023-07-07 江西省通信产业服务有限公司 Auditing result custom export data middle stage and operation method

Similar Documents

Publication Publication Date Title
CN111539633A (en) Service data quality auditing method, system, device and storage medium
CN105095056B (en) A kind of method of data warehouse data monitoring
CN105956481B (en) A kind of data processing method and its device
US20180046662A1 (en) Processing of Updates in a Database System Using Different Scenarios
CN111459698A (en) Database cluster fault self-healing method and device
CN114741396B (en) Data service processing method and device, electronic equipment and storage medium
CN113595761A (en) Micro-service component optimization method of power system information and communication integrated scheduling platform
CN112905323A (en) Data processing method and device, electronic equipment and storage medium
CN112148779A (en) Method, device and storage medium for determining service index
CN112559567A (en) Query method and device suitable for OLAP query engine
CN110363381B (en) Information processing method and device
CN116628023B (en) Waiting event type query method and device, storage medium and electronic equipment
CN102855297B (en) Method and connector for controlling data transmission
CN117112651A (en) Enterprise data quality assessment method and equipment
WO2020208149A1 (en) Enterprise resource planning system, server and supervision method of sql queries in such a system or server
CN111752838A (en) Question checking method and device, server and storage medium
CN113472881B (en) Statistical method and device for online terminal equipment
CN106547860B (en) Method for positioning performance fault of distributed database
CN109829016B (en) Data synchronization method and device
CN113222223A (en) Risk control linkage early warning method, system, equipment and storage medium for real-time data warehouse
CN117573680B (en) Positioning data transmission management system and method based on big data
CN111209130B (en) Fault processing method, system, equipment and medium based on MySQL master-slave replication cluster
CN116662059B (en) MySQL database CPU fault diagnosis and self-healing method and readable storage medium
CN114462373B (en) Audit rule determination method and device, electronic equipment and storage medium
CN114020571A (en) Monitoring method and monitoring equipment for index server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200814