CN111159171A - Data auditing method and system - Google Patents

Publication number
CN111159171A
Authority
CN
China
Prior art keywords
audit
rule
data
script
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911419646.2A
Other languages
Chinese (zh)
Inventor
刘建波
吴子龙
程赓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Tower Co Ltd
Original Assignee
China Tower Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Tower Co Ltd
Priority to CN201911419646.2A
Publication of CN111159171A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G06N5/025 Extracting rules from data

Abstract

The invention provides a data auditing method and system. The method comprises: obtaining audit rule information that includes collected data problems; classifying the audit rule information and generating an audit rule audit script according to the classified audit rule information; performing iterative audit on the audit rule audit script in a preset audit rule sediment library; and performing data audit according to the iteratively audited audit rule audit script to obtain an audit result report. In this technical scheme, the corresponding audit rule audit script is generated automatically from the acquired audit rule information, and after the script has been deposited in the sediment library, data audit is performed according to it to obtain an audit result report. This addresses the inflexibility and error-proneness of manual auditing in traditional data audit methods, and reduces the labor and material cost of writing audit scripts by hand, thereby improving work efficiency.

Description

Data auditing method and system
Technical Field
The invention relates to the field of computer information management systems, in particular to a data auditing method and system.
Background
Most existing data auditing methods prepare the data to be audited in advance, have designated personnel set the problem audit rules beforehand, compare the data against those rules, and produce a data difference analysis from the comparison. Such methods are inflexible overall, consume considerable manpower and material resources, and are limited in how far they can improve data quality. When handling business models that span multiple business systems and multiple databases, they have the following problems:
1. Audit data is "not centralized, not timely": in traditional data auditing methods, the audit rules are preset, a series of timer tasks must be configured, and offline service data from multiple databases must be integrated into an analysis database in advance. The extracted data must be cleaned manually, which is inflexible and error-prone.
2. Manual auditing is "slow": when data quality is maintained manually, the service data changes frequently and a manual check takes a long time. Complex audit scripts must be written by hand, which requires a large number of developers. Data analysis is largely ad hoc, analysis reports must be written manually, and working efficiency is low.
3. Data quality tracking is "messy": the discovery, monitoring, rectification, and improvement of data quality problems are invisible, uncontrollable, and unidimensional.
4. Audit rule maintainability is "poor": traditional tools require designated personnel to maintain the audit rules, and converting business rules into technical rules consumes considerable manpower and material resources; maintenance personnel often lack an understanding of the business, which leads to inaccurate data and duplicated rules.
Disclosure of Invention
The technical scheme of the invention aims to provide a data auditing method and a system, which are used for solving the problems that the traditional data auditing method is not flexible in data processing and needs to spend a large amount of manpower and material resources.
The present invention provides a data auditing method, which includes:
obtaining audit rule information including collected data issues;
classifying the audit rule information, and generating an audit rule audit script according to the classified audit rule information;
performing iterative audit on the audit rule audit script in a preset audit rule sediment library;
and performing data audit according to the audit rule audit script after iterative audit to obtain an audit result report.
Specifically, before obtaining audit rule information including collected data problems, the data audit method further includes:
periodically sending rule collection information to a first target responsible end, and obtaining the data problems and/or audit rule information provided by the first target responsible end in response to the rule collection information;
and acquiring data problems and/or audit rule information registered by the second target responsible end.
Preferably, after obtaining the audit rule information including collected data problems, the data auditing method further includes:
performing duplicate checking on the audit rule information;
and storing the audit rule information that passes the duplicate checking into an audit rule storage library.
Preferably, the step of generating the audit rule audit script according to the categorized audit rule information in the data audit method includes:
configuring a data source corresponding to the classified audit rule information;
selecting corresponding metadata and a data standard needing to be matched according to the classified audit rule information;
generating an audit rule audit script according to the data source, the metadata, the data standard and the audit category to which the classified audit rule information belongs;
the audit rule audit script comprises an audit total amount script and/or a problem detail audit script.
Specifically, after generating the audit rule audit script, the data auditing method further includes:
testing the correctness of the audit rule audit script;
and storing the audit rule audit script which is tested correctly and the corresponding audit rule information into an audit rule intermediate library.
Preferably, in the data auditing method, before the iterative audit of the audit rule audit script in the preset audit rule sediment library, the method further includes:
generating a corresponding rule registration card according to the audit rule information in the audit rule intermediate library and the corresponding audit rule audit script;
and when the confirmation indication information of the rule registration card is acquired, storing the audit rule audit script into the audit rule sediment library.
Specifically, the data auditing method, which performs the data auditing according to the auditing rule auditing script after the iterative auditing, includes:
and acquiring an audit selection instruction or a timing task scheduling instruction, adopting a multi-thread concurrent mode, and executing a corresponding audit rule audit script to audit data according to pre-acquired audit parameters, audit task configuration information and audit task scheduling information.
Preferably, in the data auditing method, the audit result report includes an audit result chart report, and the audit result chart report comprises a rule difference distribution table, a data difference fluctuation analysis table and/or a key data audit rule ranking.
Further, the data auditing method as described above further includes:
according to the audit result report, the audit result and/or the problem detail information in the audit result report are/is sent to a second target responsible end;
and receiving feedback information of the second target responsible end to generate a four-quadrant data audit analysis card, and storing the four-quadrant data audit analysis card in a knowledge base.
Preferably, the data auditing method as described above, the method further includes:
and performing data quality evaluation at preset intervals according to a preset data quality evaluation model, and forming a data quality report.
Another preferred embodiment of the present invention further provides a data auditing system, including:
the first acquisition module is used for acquiring audit rule information comprising the collected data problems;
the first processing module is used for classifying the audit rule information and generating an audit rule audit script according to the classified audit rule information;
the second processing module is used for performing iterative audit on the audit rule audit script in a preset audit rule sediment library;
and the third processing module is used for auditing data according to the auditing rule auditing script after iterative auditing to obtain an auditing result report.
Specifically, the data auditing system described above further includes:
the fourth processing module is used for acquiring data problems and/or audit rule information provided by the first target responsible end responding to the rule collection information by periodically sending the rule collection information to the first target responsible end;
and the second acquisition module is used for acquiring the data problems and/or audit rule information registered by the second target responsible end.
Preferably, the data auditing system as described above, further includes:
the fifth processing module is used for performing duplicate checking on the audit rule information;
and the sixth processing module is used for storing the audit rule information that passes the duplicate checking into the audit rule storage library.
Preferably, in the data auditing system, the first processing module includes:
the first processing unit is used for configuring a data source corresponding to the classified audit rule information;
the second processing unit is used for selecting corresponding metadata and data standards needing to be matched according to the classified audit rule information;
the third processing unit is used for generating an audit rule audit script according to the data source, the metadata, the data standard and the audit category to which the classified audit rule information belongs;
the audit rule audit script comprises an audit total amount script and/or a problem detail audit script.
Specifically, the data auditing system described above further includes:
the test module is used for testing the correctness of the audit rule check script;
and the seventh processing module is used for storing the audit rule audit script which is tested correctly and the corresponding audit rule information into the audit rule intermediate library.
Preferably, in the data auditing system, the second processing module further includes:
the fourth processing unit is used for generating a corresponding rule registration card according to the audit rule information in the audit rule intermediate library and the corresponding audit rule audit script;
and the fifth processing unit is used for storing the audit rule audit script to the audit rule sediment library when the confirmation indication information of the rule registration card is acquired.
Specifically, in the data auditing system described above, the third processing module is specifically configured to:
and acquiring an audit selection instruction or a timing task scheduling instruction, adopting a multi-thread concurrent mode, and executing a corresponding audit rule audit script to audit data according to pre-acquired audit parameters, audit task configuration information and audit task scheduling information.
Preferably, in the data auditing system, the audit result report includes an audit result chart report, and the audit result chart report comprises a rule difference distribution table, a data difference fluctuation analysis table and/or a key data audit rule ranking.
Further, the data auditing system as described above further includes:
the eighth processing module is used for sending the audit result and/or the problem detail information in the audit result report to the second target responsible end according to the audit result report;
and the ninth processing module is used for receiving the feedback information of the second target responsible end to generate a four-quadrant data audit analysis card and storing the four-quadrant data audit analysis card in the knowledge base.
Preferably, the data auditing system as described above, further includes:
and the tenth processing module is used for carrying out data quality evaluation at preset time intervals according to a preset data quality evaluation model and forming a data quality report.
Yet another preferred embodiment of the present invention further provides a computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the steps in the data auditing method as described above.
At least one of the above technical solutions of the present invention has the following beneficial effects:
When the data auditing system executes the data auditing method, audit rule information including collected data problems is acquired first. The audit rule information is then classified along multiple dimensions, and a corresponding audit rule audit script is generated automatically from the classified information. This avoids the drawbacks of the traditional data audit method, which must integrate multiple data sources in advance and relies on manual cleaning that is inflexible and error-prone; it also avoids the labor and material cost of writing audit scripts by hand, and improves work efficiency. After the audit rule audit script is generated, it is deposited in the sediment library, that is, iteratively audited, which ensures that the script meets actual requirements and improves its reliability. Data audit is then performed according to the iteratively audited script to obtain an audit result report, so that users or technicians can analyze data problems visually.
Drawings
FIG. 1 is a flowchart illustrating a data auditing method according to one embodiment of the present invention;
FIG. 2 is a second flowchart illustrating a data auditing method according to the present invention;
FIG. 3 is a third flowchart illustrating a data auditing method according to the present invention;
FIG. 4 is a fourth flowchart illustrating a data auditing method according to the present invention;
FIG. 5 is a fifth flowchart illustrating a data auditing method according to the present invention;
FIG. 6 is a schematic diagram illustrating a data auditing system according to the present invention;
FIG. 7 is a diagram illustrating a data audit rule registration card according to the present invention;
FIG. 8 is a diagram illustrating a data audit rule quality evaluation table according to the present invention;
FIG. 9 is a schematic diagram of a problem rule difference distribution table of the present invention;
FIG. 10 is a schematic diagram of a data variance analysis table according to the present invention;
FIG. 11 is a second schematic diagram of a data variance analysis table according to the present invention;
FIG. 12 is a diagram illustrating an arrangement of key data audit rules according to the present invention;
FIG. 13 is a second illustration of the key data audit rule arrangement of the present invention;
FIG. 14 is a third exemplary diagram illustrating the arrangement of key data audit rules according to the present invention;
FIG. 15 is a diagram illustrating a four-quadrant data audit analysis card according to the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1, an embodiment of the present invention provides a data auditing method, including:
step S101, obtaining audit rule information including collected data problems;
step S102, classifying the audit rule information, and generating an audit rule audit script according to the classified audit rule information;
step S103, performing iterative audit on audit rule audit scripts in a preset audit rule sediment library;
and step S104, auditing data according to the auditing rule auditing script after iterative auditing to obtain an auditing result report.
In an embodiment of the present invention, when the data auditing system executes the data auditing method, audit rule information including collected data problems is acquired first. The audit rule information is then classified according to at least one of several dimensions it contains, including data quality information, audit category information, subject domain information, problem cause information and the audit rule description, and a corresponding audit rule audit script is generated automatically from the classified information. This avoids the drawbacks of the traditional data audit method, which must integrate multiple data sources in advance and relies on manual cleaning that is inflexible and error-prone; it also avoids the labor and material cost of writing audit scripts by hand, and improves work efficiency. After the audit rule audit script is generated, it is iteratively audited, that is, deposited, in a preset audit rule sediment library, which ensures that the script meets actual requirements and improves its reliability. Data audit is then performed according to the iteratively audited script to obtain an audit result report, so that users or technicians can analyze data problems visually.
Specifically, before obtaining audit rule information including collected data problems, the data audit method further includes:
periodically sending rule collection information to a first target responsible end, and obtaining the data problems and/or audit rule information provided by the first target responsible end in response to the rule collection information;
and acquiring data problems and/or audit rule information registered by the second target responsible end.
In the embodiment of the present invention, the data auditing system may periodically send rule collection information to the first target responsible end and acquire the data problems and/or audit rule information that the first target responsible end provides in response. The rule collection information includes, but is not limited to, data problem classification, data problem description, the influence of the data problem on services, the service modules related to the data problem, the level of the data problem, data problem keywords, the person giving feedback, and contact information. Users of the first target responsible end include, but are not limited to, service experts and persons responsible for services. In addition, the data auditing system can acquire data problems and/or audit rule information registered by a second target responsible end, whose users include, but are not limited to, persons responsible for data at the provincial and municipal levels. Expanding the sources of audit rule information in this way makes data auditing more timely and more centralized.
Referring to fig. 2, preferably, the data auditing method as described above, after obtaining auditing rule information including collected data issues, the method further includes:
step S201, performing duplicate checking on the audit rule information;
step S202, storing the audit rule information that passes the duplicate checking into the audit rule storage library.
In the embodiment of the invention, after the audit rule information is obtained, duplicate checking is performed on it against the existing audit rule storage library, and the audit rule information that passes the duplicate checking is stored in the audit rule storage library. Duplicate checking consolidates identical audit rule information fed back by different personnel and reduces the resources wasted on processing the same information multiple times. During duplicate checking, the comparison preferably proceeds in order of the scope covered by each item of the audit rule information. For example, in an embodiment of the present invention, candidates are first filtered by the service module and the problem classification in the audit rule information; when both have the same values, the degree of keyword repetition is examined. If, for example, two or more keywords are repeated, the newly added audit rule information is marked and the similar existing audit rule information is listed to facilitate manual verification.
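The duplicate-checking order described above (coarse filter by service module and problem classification, then a keyword-overlap threshold of two) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `AuditRuleInfo` record, its field names and the `find_similar_rules` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRuleInfo:
    """Hypothetical record for one item of audit rule information."""
    rule_id: str
    service_module: str
    problem_class: str
    keywords: frozenset = frozenset()


def find_similar_rules(new_rule, stored_rules, min_shared_keywords=2):
    """Return the stored rules that look like duplicates of new_rule.

    Filtering goes from the widest scope to the narrowest: service module,
    then problem classification, then the number of shared keywords.
    """
    similar = []
    for rule in stored_rules:
        if rule.service_module != new_rule.service_module:
            continue
        if rule.problem_class != new_rule.problem_class:
            continue
        shared = new_rule.keywords & rule.keywords
        if len(shared) >= min_shared_keywords:
            similar.append(rule)
    return similar
```

A non-empty return value would correspond to marking the new audit rule information and listing the similar existing entries for manual verification.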
Referring to fig. 3, preferably, the step of generating the audit rule audit script according to the categorized audit rule information by the data audit method as described above includes:
step S301, configuring a data source corresponding to the classified audit rule information;
step S302, selecting corresponding metadata and data standards needing to be matched according to the classified audit rule information;
step S303, generating an audit rule audit script according to the data source, the metadata, the data standard and the audit type to which the classified audit rule information belongs;
the audit rule audit script comprises an audit total amount script and/or a problem detail audit script.
In the embodiment of the invention, after the audit rule information is classified, the data sources corresponding to the audit rule information are configured; there may be one or more data sources. Metadata such as the corresponding libraries, tables, fields and subject domains is determined according to the service module information and the subject domain information in the audit rule information. The data standards to be matched, which include data model standards and data dictionary standards, are determined according to the audit rule description. Once configuration is complete, the audit rule audit script is generated automatically according to the data source, metadata and data standards corresponding to the audit rule information and the audit category information in the audit rule information. The resulting audit rule audit script is thus adapted to the corresponding data sources, the labor and material cost of writing audit scripts by hand is avoided, and work efficiency is improved. The audit category information comprises at least one of: control checking, code value checking, primary/foreign key checking, data model checking, and quantity difference checking. The audit rule audit script comprises an audit total amount script and/or a problem detail audit script.
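As an illustration of template-based script generation, the sketch below produces an audit total amount script and a problem detail audit script for a single audit category, code value checking. It is an assumption-laden example: the patent does not specify the script language, so SQL is assumed, and `RuleConfig`, `generate_audit_scripts` and the table and column names used with them are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleConfig:
    """Hypothetical configuration assembled from data source, metadata and standard."""
    data_source: str             # name of the configured data source
    table: str                   # metadata: table to audit
    column: str                  # metadata: field to audit
    category: str                # audit category, e.g. "code_value"
    allowed_values: tuple = ()   # code values taken from a data dictionary standard


def generate_audit_scripts(cfg: RuleConfig) -> dict:
    """Build total-amount and problem-detail SQL for a code value check.

    Identifiers are inserted as-is, so cfg is assumed to come from trusted
    metadata. Rows where the column is NULL are not matched by NOT IN.
    """
    if cfg.category != "code_value":
        raise NotImplementedError(f"no template for category {cfg.category!r}")
    if not cfg.allowed_values:
        raise ValueError("code value checking needs at least one allowed value")
    # Escape single quotes so each allowed value is a valid SQL string literal.
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in cfg.allowed_values)
    condition = f"{cfg.column} NOT IN ({quoted})"
    return {
        "data_source": cfg.data_source,
        "total": f"SELECT COUNT(*) FROM {cfg.table} WHERE {condition}",
        "detail": f"SELECT * FROM {cfg.table} WHERE {condition}",
    }
```

Other audit categories would need their own templates; the sketch raises `NotImplementedError` for them rather than guessing at their form.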
Specifically, after the data auditing method generates the auditing rule checking script, the method further includes:
step S304, testing the correctness of the audit rule check script;
step S305, the audit rule audit script with correct test and the corresponding audit rule information are saved in the audit rule intermediate library.
In the embodiment of the invention, after the audit rule audit script is generated, its correctness is also tested. Only an audit rule audit script that tests correctly, together with its corresponding audit rule information, is stored in the audit rule intermediate library for further processing. If the test fails, the audit rule audit script is adjusted and tested again, and this repeats until the test passes, which helps ensure that the script runs normally. Optionally, in an embodiment of the present invention, a portion of the data may be previewed when determining whether the audit rule audit script is correct. Determining the correctness of the audit rule audit script in other ways also falls within the scope of the present invention.
Referring to fig. 4, preferably, the data auditing method as described above further includes, before the iterative audit of the audit rule audit script in the preset audit rule sediment library:
step S401, generating a corresponding rule registration card according to the audit rule information in the audit rule intermediate library and the corresponding audit rule audit script;
step S402, when the confirmation indication information of the rule registration card is acquired, storing the audit rule audit script in the audit rule sediment library.
In the embodiment of the invention, after the audit rule audit script which is tested correctly and the corresponding audit rule information are stored in the audit rule intermediate library, a corresponding rule registration card (see fig. 7) is generated, and after the confirmation indication information of the rule registration card is obtained, the audit rule audit script is stored in the audit rule sediment library. The confirmation indication information is generated after all departments related to the rule registration card confirm, so that the audit rule is ensured to be confirmed by the related departments, and the credibility of the audit rule is improved.
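The confirmation gate can be sketched as follows: a script moves into the sediment library only once every department named on its registration card has confirmed. The class, its fields and the dictionary standing in for the sediment library are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RuleRegistrationCard:
    """Hypothetical registration card for one audit rule audit script."""
    rule_id: str
    script: str
    departments: frozenset                       # departments that must confirm
    confirmed_by: set = field(default_factory=set)

    def confirm(self, department: str) -> None:
        if department not in self.departments:
            raise ValueError(f"{department!r} is not listed on card {self.rule_id}")
        self.confirmed_by.add(department)

    def is_confirmed(self) -> bool:
        # Confirmation indication exists only when all departments have confirmed.
        return self.confirmed_by >= self.departments


def deposit_if_confirmed(card: RuleRegistrationCard, sediment_library: dict) -> bool:
    """Store the script in the sediment library if the card is fully confirmed."""
    if card.is_confirmed():
        sediment_library[card.rule_id] = card.script
        return True
    return False
```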
Optionally, the conversion rate of the audit rules may be obtained as the ratio of the number of audit rule audit scripts whose deposition has been completed to the number of items of collected audit rule information.
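The ratio can be written as a one-line helper; the function and parameter names are illustrative only.

```python
def rule_conversion_rate(deposited_scripts: int, collected_rules: int) -> float:
    """Ratio of deposited audit scripts to collected items of audit rule information."""
    if collected_rules <= 0:
        raise ValueError("at least one item of audit rule information is required")
    return deposited_scripts / collected_rules
```

For example, 30 deposited scripts out of 40 collected items of audit rule information give a conversion rate of 0.75.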
Specifically, the data auditing method, which performs the data auditing according to the auditing rule auditing script after the iterative auditing, includes:
and acquiring an audit selection instruction or a timing task scheduling instruction, adopting a multi-thread concurrent mode, and executing a corresponding audit rule audit script to audit data according to pre-acquired audit parameters, audit task configuration information and audit task scheduling information.
In the embodiment of the invention, after the audit selection instruction or the timing task scheduling instruction is obtained, the audit rule audit script is configured according to the pre-acquired audit parameters, audit task configuration information and audit task scheduling information, and is executed to audit data once configuration is finished. Specifically, the audit parameters may be system defaults or may be customized, and include: the maximum number of results of an audit query, the timeout of an audit query, the file path for audit query results, the concurrent execution of audit methods, whether to ignore exceptions during execution, and whether to re-audit automatically after execution. The audit task configuration information is used to group multiple audit methods into one audit task, and each audit task can be suspended, resumed, modified or deleted. The audit task scheduling information is used to assign the scheduling frequency, scheduling start time and scheduling end time to an audit task. To preserve the efficiency of multithreaded operation and reduce read-write bottlenecks, the audit results are first saved to a file and are read back after the audit has finished.
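A minimal sketch of the concurrent execution step, assuming a thread pool and one JSON result file per rule. The `execute` callable, which stands in for running a script against its data source, and all parameter names are hypothetical; the query timeout and the scheduling information are not modelled here.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_audit_task(scripts, execute, result_dir, max_workers=4,
                   max_results=None, ignore_errors=True):
    """Run audit scripts concurrently, write results to files, then read them back.

    scripts: dict mapping rule id to script text.
    execute: callable taking a script and returning a list of result rows.
    """
    result_dir = Path(result_dir)
    result_dir.mkdir(parents=True, exist_ok=True)

    def run_one(item):
        rule_id, script = item
        try:
            # Cap the rows kept per rule (a slice with None keeps everything).
            payload = {"rows": execute(script)[:max_results]}
        except Exception as exc:
            if not ignore_errors:
                raise
            payload = {"error": str(exc)}
        # Each worker writes its own file, so threads do not contend on one store.
        path = result_dir / f"{rule_id}.json"
        path.write_text(json.dumps(payload), encoding="utf-8")
        return rule_id, path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        paths = dict(pool.map(run_one, scripts.items()))

    # Results are read only after every script in the task has finished.
    return {rule_id: json.loads(path.read_text(encoding="utf-8"))
            for rule_id, path in paths.items()}
```

Writing per-rule files and deferring the read mirrors the file-first approach described above; a real system would likely also enforce the configured timeout per query.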
Preferably, in the data auditing method, the audit result report includes an audit result chart report and an audit result full-chain graph, wherein the audit result chart report comprises: a rule difference distribution table, a data difference fluctuation analysis table and/or a key data audit rule ranking.
In the embodiment of the invention, an audit result report is generated from the obtained audit results. A rule difference distribution table (see fig. 9), a data difference fluctuation analysis table (see fig. 10 and fig. 11) and/or a key data audit rule ranking (see fig. 12 to 14) are generated according to different analysis dimensions to form the audit result chart report, and an audit result full-chain graph is obtained at the same time. Based on the metadata information associated with the audit rule, the full-chain graph provides an upstream lineage analysis and a downstream impact analysis of the key databases, tables and fields. The lineage analysis locates the root source of the problem data, from which it is determined whether the data problems of the current field and its upstream fields need to be audited. The impact analysis determines the scope of influence, that is, which downstream fields are affected when the data problem of the current field is corrected, from which it is determined whether the data problems of the current field and the other downstream fields need to be audited. Potential data problems can thus be discovered in advance and fed back as new data problems from which audit rule information is obtained.
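Upstream lineage and downstream impact analysis can be illustrated as two breadth-first traversals over a field dependency graph. The graph representation (a dictionary from each field to the fields derived from it) and the field names in the usage note are assumptions made for this sketch.

```python
from collections import deque


def reachable(start, edges):
    """Return every node reachable from start by following edges (breadth-first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage_and_impact(field_name, downstream_edges):
    """Return (upstream fields, downstream fields) for one field.

    downstream_edges maps a field to the fields derived from it.
    """
    # Invert the graph once so the same traversal can walk upstream.
    upstream_edges = {}
    for src, targets in downstream_edges.items():
        for dst in targets:
            upstream_edges.setdefault(dst, []).append(src)
    upstream = reachable(field_name, upstream_edges)
    downstream = reachable(field_name, downstream_edges)
    return upstream, downstream
```

With hypothetical edges such as `{"src.site.code": ["dw.site.code"], "dw.site.code": ["rpt.bill.site_code"]}`, calling `lineage_and_impact("dw.site.code", edges)` yields the source field as upstream and the report field as downstream, which are the candidates for further auditing.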
Referring to fig. 5, further, the data auditing method as described above further includes:
step S501, according to the audit result report, the audit result and/or the problem detail information in the audit result report are sent to a second target responsible end;
step S502, receiving feedback information of a second target responsible end to generate a four-quadrant data audit analysis card, and storing the card in a knowledge base.
In the embodiment of the invention, the system also sends the audit result and/or the problem detail information in the audit result report to the second target responsible end according to the audit result report, so that a responsible person at the second target responsible end can analyze the problem according to the audit result report and feed back the reason of the data problem, the influence on the service, the solution and the processing progress, and the data audit system receives the feedback information to generate a four-quadrant data audit analysis card (see fig. 15) and stores the four-quadrant data audit analysis card in the knowledge base.
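For illustration, the feedback handling described above could be modelled as below. It is assumed here that the four quadrants of the card correspond to the four feedback items named in this paragraph (cause, influence on the service, solution and processing progress); the class and function names are hypothetical, and the knowledge base is represented by an in-memory dictionary.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class AuditAnalysisCard:
    """Hypothetical four-quadrant card built from the responsible end's feedback."""
    rule_id: str
    cause: str            # quadrant 1: cause of the data problem
    business_impact: str  # quadrant 2: influence on the service
    solution: str         # quadrant 3: proposed solution
    progress: str         # quadrant 4: processing progress


class KnowledgeBase:
    """In-memory stand-in for the knowledge base."""

    def __init__(self) -> None:
        self._cards: Dict[str, dict] = {}

    def store(self, card: AuditAnalysisCard) -> None:
        self._cards[card.rule_id] = asdict(card)

    def lookup(self, rule_id: str) -> Optional[dict]:
        return self._cards.get(rule_id)


def build_card(rule_id: str, feedback: Dict[str, str]) -> AuditAnalysisCard:
    """Validate the feedback and turn it into a card."""
    required = ("cause", "business_impact", "solution", "progress")
    missing = [key for key in required if not feedback.get(key)]
    if missing:
        raise ValueError(f"feedback is missing: {', '.join(missing)}")
    return AuditAnalysisCard(rule_id=rule_id,
                             **{key: feedback[key] for key in required})
```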
Specifically, in the embodiment of the present invention, an alarm condition is configured for the auditing rule of key monitoring, and after the alarm condition is reached, the auditing result report and the problem details are automatically sent to the corresponding second target responsible end.
In the embodiment of the invention, while a data problem is being rectified, the rectification status can be tracked in real time, a re-audit is performed promptly according to the processing result, and the latest audit result is fed back to confirm whether the rectification of the problem has been properly completed.
Specifically, the data auditing method further includes:
and performing data quality evaluation at preset intervals according to a preset data quality evaluation model, and forming a data quality report.
In the embodiment of the invention, data quality evaluation is carried out at preset time intervals according to a preset data quality evaluation model, and a data quality report is formed. Specifically, in the data quality evaluation model, weights are assigned according to the analysis dimension of the audit rule, the cause of the data problem, the rectification efficiency of the data problem and the Six Sigma data quality level; when data quality is evaluated according to the model, the score of each dimension is calculated according to its weight, and a data quality report is formed.
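The weighted evaluation described above can be sketched as a simple weighted sum. The weight values, the dimension keys and the function `quality_score` below are assumptions chosen for illustration; the embodiment does not disclose concrete weights or a scoring scale.

```python
from typing import Dict

# Hypothetical weights for the four dimensions named above; they sum to 1.
WEIGHTS: Dict[str, float] = {
    "rule_dimension": 0.4,            # analysis dimension of the audit rule
    "problem_cause": 0.2,             # cause of the data problem
    "rectification_efficiency": 0.2,  # rectification efficiency
    "six_sigma_level": 0.2,           # Six Sigma data quality level
}


def quality_score(scores: Dict[str, float],
                  weights: Dict[str, float] = WEIGHTS) -> Dict[str, float]:
    """Return the weighted score of each dimension plus the overall total."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    report = {dim: scores[dim] * weights[dim] for dim in weights}
    report["total"] = sum(report.values())
    return report
```

With raw scores on a 0 to 100 scale, the `total` entry stays on the same scale, which makes reports from successive evaluation periods directly comparable.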
Referring to fig. 6, another preferred embodiment of the present invention further provides a data auditing system, including:
a first obtaining module 601, configured to obtain audit rule information including collected data problems;
the first processing module 602 is configured to classify the audit rule information, and generate an audit rule audit script according to the classified audit rule information;
the second processing module 603 is configured to perform iterative audit on the audit rule audit script in a preset audit rule sediment library;
the third processing module 604 is configured to perform data audit according to the audit rule audit script after the iterative audit, so as to obtain an audit result report.
Specifically, the data auditing system described above further includes:
the fourth processing module is used for acquiring data problems and/or audit rule information provided by the first target responsible end responding to the rule collection information by periodically sending the rule collection information to the first target responsible end;
and the second acquisition module is used for acquiring the data problems and/or audit rule information registered by the second target responsible end.
Preferably, the data auditing system as described above, further includes:
the fifth processing module is used for performing duplicate checking on the audit rule information;
and the sixth processing module is used for storing the audit rule information which passes the duplicate checking into the audit rule storage library.
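As an illustrative sketch only, the duplicate checking performed before storage could compare a normalized fingerprint of each rule. The choice of table, field and description as the identifying attributes, and the names `rule_fingerprint` and `RuleStorageLibrary`, are assumptions; the embodiment does not state how duplicates are detected.

```python
import hashlib
import re
from typing import Dict


def rule_fingerprint(table: str, field: str, description: str) -> str:
    """Hash a whitespace- and case-normalized form of the rule attributes."""
    parts = (re.sub(r"\s+", " ", part.strip().lower())
             for part in (table, field, description))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


class RuleStorageLibrary:
    """In-memory stand-in for the audit rule storage library."""

    def __init__(self) -> None:
        self._rules: Dict[str, dict] = {}

    def add(self, table: str, field: str, description: str) -> bool:
        """Store the rule and return True, or return False if it is a duplicate."""
        key = rule_fingerprint(table, field, description)
        if key in self._rules:
            return False
        self._rules[key] = {"table": table, "field": field,
                            "description": description}
        return True

    def __len__(self) -> int:
        return len(self._rules)
```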
Preferably, in the data auditing system, the first processing module includes:
the first processing unit is used for configuring a data source corresponding to the classified audit rule information;
the second processing unit is used for selecting corresponding metadata and data standards needing to be matched according to the classified audit rule information;
the third processing unit is used for generating an audit rule audit script according to the data source, the metadata, the data standard and the audit category to which the classified audit rule information belongs;
the audit rule audit script comprises an audit total amount script and/or a problem detail audit script.
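A minimal sketch of how the third processing unit might derive the two kinds of script, assuming that audit scripts are SQL statements and that each audit category maps to a violation predicate. The templates, the `RuleSpec` fields and `generate_audit_scripts` are hypothetical; table and field names are interpolated directly, so they are assumed to come from trusted metadata rather than user input.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical violation predicates, one per audit category.
VIOLATION_TEMPLATES: Dict[str, str] = {
    "completeness": "{field} IS NULL",
    "validity": "NOT ({standard})",
}


@dataclass(frozen=True)
class RuleSpec:
    data_source: str    # configured data source
    table: str          # table name taken from the metadata
    field: str          # field name taken from the metadata
    category: str       # audit category the rule belongs to
    standard: str = ""  # data standard, as a predicate valid rows satisfy


def generate_audit_scripts(rule: RuleSpec) -> Dict[str, str]:
    """Build the audit total amount script and the problem detail script."""
    try:
        template = VIOLATION_TEMPLATES[rule.category]
    except KeyError:
        raise ValueError(f"unsupported audit category: {rule.category}") from None
    predicate = template.format(field=rule.field, standard=rule.standard)
    return {
        "data_source": rule.data_source,
        # Counts the rows that violate the rule.
        "total": f"SELECT COUNT(*) FROM {rule.table} WHERE {predicate}",
        # Lists the violating rows themselves.
        "detail": f"SELECT * FROM {rule.table} WHERE {predicate}",
    }
```

Generating both scripts from one predicate keeps the total and the detail rows consistent with each other, which also simplifies the later correctness test of the script.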
Specifically, the data auditing system described above further includes:
the test module is used for testing the correctness of the audit rule audit script;
and the seventh processing module is used for storing the audit rule information corresponding to the audit rule audit script which is tested correctly into the audit rule intermediate library.
Preferably, in the data auditing system, the second processing module further includes:
the fourth processing unit is used for generating a corresponding rule registration card according to the audit rule information in the audit rule intermediate library and the corresponding audit rule audit script;
and the fifth processing unit is used for storing the audit rule audit script to the audit rule sediment library when the confirmation indication information of the rule registration card is acquired.
Specifically, in the data auditing system described above, the third processing module is specifically configured to:
and acquiring an audit selection instruction or a timing task scheduling instruction, adopting a multi-thread concurrent mode, and executing a corresponding audit rule audit script to audit data according to pre-acquired audit parameters, audit task configuration information and audit task scheduling information.
Preferably, in the data auditing system, the audit result report includes: the audit result chart report comprises a rule difference distribution table, a data difference fluctuation analysis table and/or a key data audit rule ranking.
Further, the data auditing system as described above further includes:
the eighth processing module is used for sending the audit result and/or the problem detail information in the audit result report to the second target responsible end according to the audit result report;
and the ninth processing module is used for receiving the feedback information of the second target responsible end to generate a four-quadrant data audit analysis card and storing the four-quadrant data audit analysis card in the knowledge base.
Preferably, the data auditing system as described above, further includes:
and the tenth processing module is used for carrying out data quality evaluation at preset time intervals according to a preset data quality evaluation model and forming a data quality report.
Preferably, the data auditing system as described above, further includes:
and the eleventh processing module is used for carrying out data quality evaluation at preset time intervals according to a preset data quality evaluation model and forming a data quality report.
The system embodiment of the present invention is a system corresponding to the above method embodiment, and all implementation means in the above method embodiment are applicable to the system embodiment, and can achieve the same technical effect.
Yet another preferred embodiment of the present invention further provides a computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program implements the steps in the data auditing method as described above.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. The apparatus embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in practice; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware, or in the form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute some steps of the data auditing method according to the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the preferred embodiments of the present invention have been described, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (12)

1. A data auditing method, comprising:
obtaining audit rule information including collected data issues;
classifying the audit rule information, and generating an audit rule audit script according to the classified audit rule information;
performing iterative audit on the audit rule audit script in a preset audit rule sediment library;
and performing data audit according to the audit rule audit script after iterative audit to obtain an audit result report.
2. The data auditing method of claim 1, prior to obtaining audit rule information including collected data issues, the method further comprising:
periodically sending rule collection information to a first target responsible end, and acquiring data problems and/or audit rule information provided by the first target responsible end in response to the rule collection information;
and acquiring data problems and/or audit rule information registered by the second target responsible end.
3. The data auditing method according to claim 1 or 2, characterized in that after obtaining audit rule information including collected data issues, the method further comprises:
performing duplicate checking on the audit rule information;
and storing the audit rule information which passes the duplicate checking into an audit rule storage library.
4. The data auditing method of claim 1, where the step of generating an auditing rule audit script based on the categorized audit rule information comprises:
configuring a data source corresponding to the classified audit rule information;
selecting corresponding metadata and a data standard needing to be matched according to the classified audit rule information;
generating the audit rule audit script according to the data source, the metadata, the data standard and the audit category to which the audit rule information belongs after classification;
the audit rule audit script comprises an audit total amount script and/or a problem detail audit script.
5. The data auditing method of claim 4, after generating the auditing rule audit script, the method further comprising:
testing the correctness of the audit rule audit script;
and storing the audit rule audit script which is tested correctly and the corresponding audit rule information into an audit rule intermediate library.
6. The data auditing method according to claim 5, wherein before the iterative audit of the audit rule audit script in a preset audit rule sediment library, the method further comprises:
generating a corresponding rule registration card according to the audit rule information in the audit rule intermediate library and the corresponding audit rule audit script;
and when the confirmation indication information of the rule registration card is acquired, storing the audit rule audit script into an audit rule sediment library.
7. The data auditing method according to claim 1, where the step of auditing data according to the auditing rule audit script after iterative audit comprises:
and acquiring an audit selection instruction or a timing task scheduling instruction, adopting a multi-thread concurrent mode, and executing a corresponding audit rule audit script to audit data according to pre-acquired audit parameters, audit task configuration information and audit task scheduling information.
8. The data auditing method of claim 1, where the audit result report includes: the audit result chart report comprises a rule difference distribution table, a data difference fluctuation analysis table and/or a key data audit rule ranking.
9. The data auditing method of claim 1, said method further comprising:
according to the audit result report, sending the audit result and/or the problem detail information in the audit result report to a second target responsible end;
and receiving feedback information of the second target responsible end to generate a four-quadrant data audit analysis card, and storing the four-quadrant data audit analysis card in a knowledge base.
10. The data auditing method of claim 1, said method further comprising:
and performing data quality evaluation at preset intervals according to a preset data quality evaluation model, and forming a data quality report.
11. A data auditing system, comprising:
the acquisition module is used for acquiring audit rule information comprising the collected data problems;
the first processing module is used for classifying the audit rule information and generating an audit rule audit script according to the classified audit rule information;
the second processing module is used for performing iterative audit on the audit rule audit script in a preset audit rule sediment library;
and the third processing module is used for auditing data according to the auditing rule auditing script after iterative auditing to obtain an auditing result report.
12. A computer-readable storage medium, having a computer program stored thereon, which, when executed by a processor, performs the steps of the data auditing method according to any one of claims 1-10.
CN201911419646.2A 2019-12-31 2019-12-31 Data auditing method and system Pending CN111159171A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911419646.2A CN111159171A (en) 2019-12-31 2019-12-31 Data auditing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911419646.2A CN111159171A (en) 2019-12-31 2019-12-31 Data auditing method and system

Publications (1)

Publication Number Publication Date
CN111159171A true CN111159171A (en) 2020-05-15

Family

ID=70560327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911419646.2A Pending CN111159171A (en) 2019-12-31 2019-12-31 Data auditing method and system

Country Status (1)

Country Link
CN (1) CN111159171A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050066240A1 (en) * 2002-10-04 2005-03-24 Tenix Investments Pty Ltd Data quality & integrity engine
CN101902532A (en) * 2009-05-27 2010-12-01 北京汉铭通信有限公司 Data auditing method and system of telecommunication services
CN102222088A (en) * 2011-05-30 2011-10-19 大连银行股份有限公司 System and method for checking, summarizing and displaying data quality according to multidimensional attribute
CN103473643A (en) * 2013-09-10 2013-12-25 北京思特奇信息技术股份有限公司 Product management data auditing method and system for BOSS system
CN103473672A (en) * 2013-09-30 2013-12-25 国家电网公司 System, method and platform for auditing metadata quality of enterprise-level data center
CN103942633A (en) * 2013-12-26 2014-07-23 远光软件股份有限公司 Audit result data presentation and data penetrating system and method
CN107256247A (en) * 2017-06-07 2017-10-17 九次方大数据信息集团有限公司 Big data data administering method and device
CN110083623A (en) * 2019-03-12 2019-08-02 中国平安人寿保险股份有限公司 A kind of business rule generation method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
曹建军,习兴春: "数据质量导论", 31 October 2017, 国防工业出版社, pages: 268 - 270 *
蔡莉,朱扬勇: "大数据质量", 31 January 2017, 上海科学技术出版社, pages: 90 - 91 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767283A (en) * 2020-06-19 2020-10-13 北京思特奇信息技术股份有限公司 Data system monitoring method and system
CN111767283B (en) * 2020-06-19 2023-08-18 北京思特奇信息技术股份有限公司 Data system monitoring method and system
CN112508526A (en) * 2020-12-15 2021-03-16 中国联合网络通信集团有限公司 Data auditing method and device
CN112508526B (en) * 2020-12-15 2024-04-19 中国联合网络通信集团有限公司 Data auditing method and device
CN112785124A (en) * 2021-01-05 2021-05-11 科大国创云网科技有限公司 Method and system for auditing compliance of telecommunication service
CN112926941A (en) * 2021-03-04 2021-06-08 远光软件股份有限公司 Management method and device for financial auditing rules, storage medium and server
US11314489B1 (en) * 2021-04-16 2022-04-26 27 Software U.S. Inc. Automated authoring of software solutions by first analyzing and resolving anomalies in a data model
US11409505B1 (en) 2021-04-16 2022-08-09 27 Software U.S. Inc. Automated authoring of software solutions from a data model with related patterns

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 101, floors 1-3, building 14, North District, yard 9, dongran North Street, Haidian District, Beijing 100029

Applicant after: CHINA TOWER Co.,Ltd.

Address before: 100142 19th floor, 73 Fucheng Road, Haidian District, Beijing

Applicant before: CHINA TOWER Co.,Ltd.
