CN111949642B - Method and device for data quality control - Google Patents

Method and device for data quality control

Info

Publication number
CN111949642B
Authority
CN
China
Prior art keywords
index
detail
data quality
data
aggregation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010813723.9A
Other languages
Chinese (zh)
Other versions
CN111949642A (en)
Inventor
赵枫
黄浩
薛飞
李梦姣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202010813723.9A
Publication of CN111949642A
Application granted
Publication of CN111949642B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a method and a device for data quality control that can be used in the financial field. The method comprises: performing a weighted evaluation of the usage frequency, the index credibility, and the index validity of an index to obtain an index scoring result, wherein each index is derived from one aggregation and each aggregation is derived from a plurality of details; judging whether a data quality problem exists according to the index scoring result; and, if a data quality problem exists, determining the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details. By using the correspondence among detail, aggregation, and index data together with the index scoring result, the invention performs quality control effectively and promptly, locates the source of the problem, and thereby controls data quality.

Description

Method and device for data quality control
Technical Field
The invention relates to the technical field of data processing in the financial industry, in particular to a method and a device for data quality control.
Background
At present, in the quality control of supervision data at financial institutions, historical factors such as the division of departments and the way technology was built out at each banking institution have left the data quality standards of transaction systems misaligned with the quality requirements of supervision reporting, so that the supervision institutions' demands for stronger supervision cannot be met. As a result, when dealing with data quality problems, the banking industry lacks effective means of detection, analysis, and tracking, and finds it difficult to discover and resolve problems effectively and quickly.
Disclosure of Invention
The main purpose of the embodiments of the invention is to provide a method and a device for data quality control that can trace and locate data quality problems and manage them effectively.
To achieve the above object, an embodiment of the present invention provides a method for data quality control, including:
performing a weighted evaluation of the usage frequency, the index credibility, and the index validity of an index to obtain an index scoring result, wherein each index is derived from one aggregation and each aggregation is derived from a plurality of details;
judging whether a data quality problem exists according to the index scoring result;
if a data quality problem exists, determining the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details.
Optionally, in an embodiment of the present invention, performing the weighted evaluation of the usage frequency, the index credibility, and the index validity of the index to obtain the index scoring result includes: performing a weighted evaluation on the number of times the index is used in report generation and the number of times the index is individually referenced, to obtain the usage frequency of the index; determining the index credibility according to the number of checks the index has passed and the total number of checks; and determining the index validity according to the number of occurrences of the index.
Optionally, in an embodiment of the present invention, judging whether a data quality problem exists according to the index scoring result includes: if the index scoring result is lower than a preset index threshold, determining that a data quality problem exists.
Optionally, in an embodiment of the present invention, determining the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details includes: determining the score of the aggregation corresponding to the index according to the index scoring result; determining the score of each detail corresponding to the aggregation according to the score of the aggregation; and comparing the score of each detail with a preset detail threshold to determine the target detail.
Optionally, in an embodiment of the present invention, the method further includes: performing data quality control on the target detail.
The embodiment of the invention also provides a device for controlling the data quality, which comprises:
an index evaluation module, configured to perform a weighted evaluation of the usage frequency, the index credibility, and the index validity of an index to obtain an index scoring result, wherein each index is derived from one aggregation and each aggregation is derived from a plurality of details;
a quality problem judging module, configured to judge whether a data quality problem exists according to the index scoring result;
a data quality control module, configured to determine, if a data quality problem exists, the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details.
Optionally, in an embodiment of the present invention, the index evaluation module includes: a usage frequency unit, configured to perform a weighted evaluation on the number of times the index is used in report generation and the number of times the index is individually referenced, to obtain the usage frequency of the index; an index credibility unit, configured to determine the index credibility according to the number of checks the index has passed and the total number of checks; and an index validity unit, configured to determine the index validity according to the number of occurrences of the index.
Optionally, in an embodiment of the present invention, the quality problem judging module includes a quality problem judging unit, configured to determine that a data quality problem exists if the index scoring result is lower than a preset index threshold.
Optionally, in an embodiment of the present invention, the data quality control module includes: an aggregation scoring unit, configured to determine the score of the aggregation corresponding to the index according to the index scoring result; a detail scoring unit, configured to determine the score of each detail corresponding to the aggregation according to the score of the aggregation; and a target detail unit, configured to compare the score of each detail with a preset detail threshold to determine the target detail.
Optionally, in an embodiment of the present invention, the data quality control module further includes a data quality control unit, configured to perform data quality control on the target detail.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above method when executing the program.
The present invention also provides a computer readable storage medium storing a computer program for executing the above method.
By using the correspondence among detail, aggregation, and index data together with the index scoring result, the invention performs quality control effectively and promptly, locates the source of the problem, and thereby controls data quality.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for data quality control according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the data hierarchy correspondence in an embodiment of the present invention;
FIGS. 3A-3B are schematic diagrams illustrating the relationship between layers according to embodiments of the present invention;
FIGS. 4A-4B are schematic diagrams of the relationship between indexes and reports and the relationship between systems and reports in an embodiment of the present invention;
FIG. 5 is a diagram illustrating a relationship between supervision data in an embodiment of the present invention;
FIG. 6 is a schematic diagram of a data management and quality control process according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a data quality control process according to another embodiment of the present invention;
FIG. 8 is a schematic diagram of a data quality control process according to another embodiment of the present invention;
FIG. 9 is a schematic diagram illustrating a device for data quality control according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a method and a device for data quality control. It should be noted that the method and apparatus for data quality control of the present invention may be used in the financial field, and may be used in any field other than the financial field.
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a flowchart of a method for controlling data quality according to an embodiment of the present invention, where the method includes:
Step S1: performing a weighted evaluation of the usage frequency, the index credibility, and the index validity of an index to obtain an index scoring result, wherein each index is derived from one aggregation and each aggregation is derived from a plurality of details. As shown in fig. 2, the business data comprises an original detail layer, an aggregation summary layer, an abstract index layer, and a package display layer. The correspondence between the layers is shown in figs. 3A to 3B:
The original detail layer comprises a plurality of details; a detail is the result of a business transaction action.
The aggregation summary layer comprises a plurality of aggregations; an aggregation is the result of cleaning and summarizing the details and consolidating them by business domain, and each aggregation is derived from a plurality of details.
The abstract index layer comprises a plurality of indexes; an index is an aggregated value abstracted into a business definition along a particular statistical dimension, typically a balance, an amount incurred, a number of transactions, a number of accounts, or the like, and each index is derived from one aggregation.
The package display layer comprises a plurality of reports; each report has a specific template, each report item is obtained from one or more indexes through basic arithmetic operations, and each report is derived from a plurality of indexes.
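The four layers and their correspondences can be pictured as a small data model. The following is a minimal sketch only; the class names, field names, and sample records are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detail:
    """Original detail layer: raw result of a business transaction."""
    name: str


@dataclass
class Aggregation:
    """Aggregation summary layer: derived from several details."""
    name: str
    details: List[Detail]


@dataclass
class Index:
    """Abstract index layer: derived from exactly one aggregation."""
    name: str
    aggregation: Aggregation


@dataclass
class Report:
    """Package display layer: derived from several indexes."""
    name: str
    indexes: List[Index]


# Hypothetical sample data, for illustration only.
detail_1 = Detail("loan_detail_1")
detail_2 = Detail("loan_detail_2")
loan_agg = Aggregation("loan_aggregation", [detail_1, detail_2])
npl_balance = Index("corporate_npl_balance", loan_agg)
loan_count = Index("corporate_loan_count", loan_agg)
report_1 = Report("report_1", [npl_balance, loan_count])

# Walk from a report down to the names of the details behind it.
detail_names = sorted(
    {d.name for idx in report_1.indexes for d in idx.aggregation.details}
)
print(detail_names)
```

Holding a single `aggregation` reference on each index, and a list of details on each aggregation, mirrors the one-to-one and one-to-many correspondences described above.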
In addition, the usage frequency, credibility, and validity of a given index can be weighted to obtain an objective scoring result; the higher the score, the better the index.
Step S2: judging whether a data quality problem exists according to the index scoring result. Whether an index has a data quality problem is judged by comparing its scoring result with the preset threshold for that index's score. Generally, a data quality problem is determined to exist when the index scoring result is lower than the preset threshold.
Step S3: if a data quality problem exists, determining the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details. As to the correspondence between indexes and aggregations, one index is derived from one aggregation, while one aggregation may correspond to a plurality of indexes; as to the correspondence between aggregations and details, one aggregation is derived from a plurality of details.
An index with a data quality problem corresponds to one aggregation with a data quality problem, and that aggregation corresponds to a plurality of details in which the data quality problem may have arisen. In addition, the corresponding aggregation scores and detail scores can be calculated from the index scoring result. By comparing each detail score with the preset threshold for detail scores, the target detail or details are identified among the details that may have data quality problems. Because details are the original data of business transactions, locating the target details locates the original data that has the data quality problem, so that targeted data quality control can be performed.
As an embodiment of the present invention, performing the weighted evaluation of the usage frequency, the index credibility, and the index validity of the index to obtain the index scoring result includes: performing a weighted evaluation on the number of times the index is used in report generation and the number of times the index is individually referenced, to obtain the usage frequency of the index; determining the index credibility according to the number of checks the index has passed and the total number of checks; and determining the index validity according to the number of occurrences of the index.
In this embodiment, a weighted evaluation is performed on the number of times BY the index is used in report generation and the number of times CY the index is individually referenced, to obtain the index usage frequency PN:
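The expression for PN is not present in this text. As an illustration only, one plausible reading is a linear weighting of the two counts; the weights $w_{BY}$ and $w_{CY}$ and the constraint on them are assumptions, not values given by the patent:

```latex
PN = w_{BY} \cdot BY + w_{CY} \cdot CY, \qquad w_{BY} + w_{CY} = 1
```

Under this assumed form, an index that feeds many reports and is also frequently referenced on its own receives a higher usage frequency.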
In this embodiment, there is a class of reference indexes among the indexes, and every other index can establish check rules against a reference index using relations such as "equal to", "not greater than", "greater than or equal to", "not less than", and "less than or equal to".
The index credibility XN may be expressed as:
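The expression for XN is not present in this text. The sketch below shows how check rules against a reference index might be evaluated, and assumes that XN is the share of checks passed, which is one natural reading of "the number of checks passed and the total number of checks". The function names and the sample check history are hypothetical.

```python
import operator

# Relations named in the text, mapped to Python comparison functions.
RELATIONS = {
    "equal to": operator.eq,
    "not greater than": operator.le,
    "less than or equal to": operator.le,
    "not less than": operator.ge,
    "greater than or equal to": operator.ge,
}


def check(value, relation, reference_value):
    """Return True if `value <relation> reference_value` holds."""
    return RELATIONS[relation](value, reference_value)


def credibility(passed, total):
    """Assumed form of XN: share of checks passed (0.0 if nothing was checked)."""
    return passed / total if total else 0.0


# Hypothetical check history: (index value, relation, reference index value).
history = [
    (80.0, "not greater than", 100.0),        # passes
    (100.0, "equal to", 100.0),               # passes
    (120.0, "less than or equal to", 100.0),  # fails
    (90.0, "not less than", 95.0),            # fails
]
passed = sum(check(v, rel, ref) for v, rel, ref in history)
xn = credibility(passed, len(history))
print(passed, xn)
```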
In this embodiment, the more often an index is used across the various supervision systems, the higher its validity. The index validity XA can be expressed as:
Thus, by weighting the usage frequency, credibility, and validity of a given index, the scoring result ZKPI can be obtained objectively; the higher the score, the better the index. The index scoring result ZKPI may be expressed as:
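The expressions for XA and ZKPI are not present in this text. A minimal sketch of the weighting step follows, assuming that PN, XN, and XA have each been normalized to the range 0 to 1 and that ZKPI is their weighted sum; the weights and the normalization are hypothetical choices made for illustration.

```python
def zkpi(pn, xn, xa, weights=(0.3, 0.4, 0.3)):
    """Weighted index score.

    pn, xn, xa: usage frequency, credibility, and validity, assumed to be
    normalized to [0, 1]. The default weights are illustrative only.
    """
    w_pn, w_xn, w_xa = weights
    return w_pn * pn + w_xn * xn + w_xa * xa


# Hypothetical component values for two indexes.
score_a = zkpi(pn=0.5, xn=0.5, xa=1.0)
score_b = zkpi(pn=0.9, xn=0.9, xa=1.0)
print(score_a, score_b)
```

With weights that sum to 1, the score stays in the same 0 to 1 range as its inputs, which makes a single threshold comparison meaningful across indexes.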
As one embodiment of the present invention, judging whether a data quality problem exists according to the index scoring result includes: if the index scoring result is lower than the preset index threshold, determining that a data quality problem exists. Different indexes have different preset thresholds, and the preset threshold of an index can be adjusted as the supervision systems change, which keeps data quality control accurate.
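The per-index threshold comparison can be sketched as a simple lookup. The index names, threshold values, and the default threshold below are hypothetical.

```python
DEFAULT_THRESHOLD = 0.6  # assumed fallback for indexes without their own threshold

# Hypothetical per-index thresholds; different indexes may use different values.
index_thresholds = {
    "corporate_npl_balance": 0.8,
    "personal_loan_balance": 0.7,
}


def has_quality_problem(index_name, score, thresholds=index_thresholds):
    """A data quality problem exists when the score is below the index's threshold."""
    return score < thresholds.get(index_name, DEFAULT_THRESHOLD)


before = has_quality_problem("personal_loan_balance", 0.72)

# A threshold can be adjusted when a supervision system changes.
index_thresholds["personal_loan_balance"] = 0.75
after = has_quality_problem("personal_loan_balance", 0.72)
print(before, after)
```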
As one embodiment of the present invention, determining the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details includes: determining the score of the aggregation corresponding to the index according to the index scoring result; determining the score of each detail corresponding to the aggregation according to the score of the aggregation; and comparing the score of each detail with a preset detail threshold to determine the target detail.
In this embodiment, the data quality of the aggregation tables and the detail tables can be evaluated through the scoring results of the indexes. For a given aggregation table, the average ZKPI of all indexes derived from it is its aggregation score JKPI:
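The averaging described in the sentence above can be written out as follows, where $n$ (the number of indexes derived from the aggregation table) and the subscript $i$ are notation introduced here for illustration:

```latex
JKPI = \frac{1}{n} \sum_{i=1}^{n} ZKPI_{i}
```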
For a given detail table, the average JKPI of all aggregation tables that reference it is its detail score:
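Written out, with $m$ denoting the number of aggregation tables that reference the detail table; the symbol $D$ for the detail score is a placeholder introduced here, since the text does not name it:

```latex
D = \frac{1}{m} \sum_{j=1}^{m} JKPI_{j}
```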
As in the judgment of data quality problems, the detail score is compared with its preset threshold. Typically, if the detail score is below its preset threshold, the detail is deemed to be a target detail; there may be one or more target details.
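The two averaging steps and the final threshold comparison can be sketched end to end. The index scores, the mappings, and the threshold value are hypothetical sample data; only the averaging and the "below threshold" rule come from the description.

```python
from statistics import mean

# Hypothetical inputs.
index_scores = {"index_1": 0.4, "index_2": 0.6, "index_3": 0.9}
index_to_aggregation = {"index_1": "agg_1", "index_2": "agg_1", "index_3": "agg_2"}
aggregation_to_details = {
    "agg_1": ["detail_1", "detail_2"],
    "agg_2": ["detail_2", "detail_3"],
}


def aggregation_scores(scores, idx_to_agg):
    """Aggregation score: average score of all indexes derived from the aggregation."""
    grouped = {}
    for index, agg in idx_to_agg.items():
        grouped.setdefault(agg, []).append(scores[index])
    return {agg: mean(values) for agg, values in grouped.items()}


def detail_scores(agg_scores, agg_to_details):
    """Detail score: average score of all aggregations that reference the detail."""
    grouped = {}
    for agg, details in agg_to_details.items():
        for detail in details:
            grouped.setdefault(detail, []).append(agg_scores[agg])
    return {detail: mean(values) for detail, values in grouped.items()}


def target_details(d_scores, threshold):
    """Details whose score is below the preset detail threshold."""
    return sorted(d for d, s in d_scores.items() if s < threshold)


jkpi = aggregation_scores(index_scores, index_to_aggregation)
d_scores = detail_scores(jkpi, aggregation_to_details)
targets = target_details(d_scores, threshold=0.6)
print(targets)
```

In this sample, `detail_2` is shared by a low-scoring and a high-scoring aggregation, so its averaged score stays above the threshold while `detail_1` is flagged.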
As an embodiment of the present invention, the method further comprises: performing data quality control on the target detail. After a data quality problem is found, the problematic data needs to be corrected or deleted promptly to ensure that the data in the supervision reporting process is accurate.
In this embodiment, figs. 4A-4B show the relationship between indexes and reports and the relationship between systems and reports. Each report has a specific template, each report item is obtained from one or more indexes through basic arithmetic operations, and each report is derived from a plurality of indexes. In supervision reporting, the supervision system of each supervision department issues report templates and filling rules, and reports and systems have a many-to-many relationship: each supervision system corresponds to different supervision reports, and the same report can correspond to a plurality of supervision systems.
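The many-to-many relationship between supervision systems and reports can be held as a pair of lookup tables built from the same list of links. The system and report names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical (supervision system, report) links.
links = [
    ("system_1", "report_1"),
    ("system_1", "report_2"),
    ("system_2", "report_2"),
]

reports_by_system = defaultdict(set)
systems_by_report = defaultdict(set)
for system, report in links:
    reports_by_system[system].add(report)
    systems_by_report[report].add(system)

# One system maps to several reports, and one report to several systems.
print(sorted(reports_by_system["system_1"]))
print(sorted(systems_by_report["report_2"]))
```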
In a specific embodiment of the present invention, as shown in fig. 5, in addition to tracing problematic target details through the indexes, the invention can chain together the lineage between the supervision systems and the original details along the path of system, report, index, aggregation, and detail; in the reverse direction, from detail through aggregation, index, and report, it can also promptly discover the range of supervision data affected by fluctuations in transaction data.
In this embodiment, as shown in fig. 6, when index 1 (corporate non-performing loan balance) is found to have a poor score, it can quickly be traced to report 1 under supervision system 1, whose reporting accuracy is then known to be deviated, which may lead to data being omitted or misreported. Through index 1, the quality problem in the aggregated data for corporate loans rated non-performing in the loan aggregation table can be found quickly and then traced back to original detail data table 1, which completes the location of the root cause of the data quality problem. Meanwhile, the secondary problems caused by detail data 1, which propagate through aggregation 2 and index 2 to report 2, can also be revealed quickly along the link, so that remedial measures such as follow-up data governance can be taken promptly.
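Both directions of tracing can be sketched as reachability over a small lineage graph: upstream from a poorly scored index to its source details, and downstream from a detail to everything it may affect. The node names and edges below are hypothetical and only loosely follow the scenario described for fig. 6.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges, each pointing from a source to what is derived from it.
EDGES = [
    ("detail_1", "aggregation_1"),
    ("detail_1", "aggregation_2"),
    ("detail_2", "aggregation_1"),
    ("aggregation_1", "index_1"),
    ("aggregation_2", "index_2"),
    ("index_1", "report_1"),
    ("index_2", "report_2"),
    ("report_1", "system_1"),
    ("report_2", "system_1"),
]


def build(edges):
    """Build downstream and upstream adjacency maps from the edge list."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return downstream, upstream


def reachable(start, graph):
    """Breadth-first search: all nodes reachable from `start` in `graph`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


downstream, upstream = build(EDGES)

# Root-cause tracing: which details feed the poorly scored index_1?
sources = sorted(n for n in reachable("index_1", upstream) if n.startswith("detail"))

# Impact analysis: which reports may be affected by a problem in detail_1?
affected = sorted(n for n in reachable("detail_1", downstream) if n.startswith("report"))
print(sources, affected)
```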
In this embodiment, as shown in fig. 7, suppose that during routine auditing a loan is found to be problematic: it should have been classified as a loss, but business personnel improperly changed its classification to normal. Following the chain of detail, aggregation, index, and report, the supervision data already reported can be corrected promptly and re-submitted, which avoids distorting macroeconomic statistics and reduces the reputational impact and penalties caused by misreporting.
In one embodiment of the present invention, as shown in fig. 8, by evaluating and controlling the quality of each index, classified index sets such as index group 1, index group 2, and index group 3 can be evaluated objectively. For example, if index group 1 (personal loans) is rated poor, index group 2 (legal-person loans) is rated fair, and index group 3 (legal-person loans) is rated good, then objective quality evaluations of the aggregations and details can be obtained by tracing back, and source data with serious quality problems, such as detail 1 and detail 2, can be corrected in a targeted manner. This achieves quality control during data generation and makes data quality control systematic.
In the financial supervision field, the invention can analyze the scope of supervision reporting and the business areas affected by abnormal data fluctuations, and locate the sources of data quality problems, such as the business products and transaction channels involved. It supports building a supervision data quality assurance system that effectively manages the data quality problems discovered during comprehensive supervision reporting. By standardizing the full link of supervision data, it effectively addresses the data quality problems introduced at stages such as defining data collection standards, specifying data standards, centralizing data entities, and processing data results, and so meets the demands of increasingly strict and intensive supervision. The quality assurance mechanism extends to data along the full link and further promotes the improvement of data quality.
By using the correspondence among detail, aggregation, and index data together with the index scoring result, the invention performs quality control effectively and promptly, locates the source of the problem, and thereby controls data quality.
Fig. 9 is a schematic structural diagram of an apparatus for data quality control according to an embodiment of the present invention, where the apparatus includes:
an index evaluation module 10, configured to perform a weighted evaluation of the usage frequency, the index credibility, and the index validity of an index to obtain an index scoring result, wherein each index is derived from one aggregation and each aggregation is derived from a plurality of details;
a quality problem judging module 20, configured to judge whether a data quality problem exists according to the index scoring result;
a data quality control module 30, configured to determine, if a data quality problem exists, the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details.
As one embodiment of the present invention, the index evaluation module includes: a usage frequency unit, configured to perform a weighted evaluation on the number of times the index is used in report generation and the number of times the index is individually referenced, to obtain the usage frequency of the index; an index credibility unit, configured to determine the index credibility according to the number of checks the index has passed and the total number of checks; and an index validity unit, configured to determine the index validity according to the number of occurrences of the index.
As an embodiment of the present invention, the quality problem judging module includes a quality problem judging unit, configured to determine that a data quality problem exists if the index scoring result is lower than a preset index threshold.
As one embodiment of the present invention, the data quality control module includes: an aggregation scoring unit, configured to determine the score of the aggregation corresponding to the index according to the index scoring result; a detail scoring unit, configured to determine the score of each detail corresponding to the aggregation according to the score of the aggregation; and a target detail unit, configured to compare the score of each detail with a preset detail threshold to determine the target detail.
As an embodiment of the present invention, the data quality control module further includes a data quality control unit, configured to perform data quality control on the target detail.
The device for data quality control described above is based on the same inventive concept as the method for data quality control. Because the device solves the problem on a principle similar to that of the method, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above method when executing the program.
The present invention also provides a computer readable storage medium storing a computer program for executing the above method.
As shown in fig. 10, the electronic device 600 may further include: a communication module 110, an input unit 120, an audio processing unit 130, a display 160, and a power supply 170. It is noted that the electronic device 600 need not include all of the components shown in fig. 10; in addition, the electronic device 600 may further include components not shown in fig. 10, for which reference may be made to the related art.
As shown in fig. 10, the central processor 100, sometimes also referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device, which central processor 100 receives inputs and controls the operation of the various components of the electronic device 600.
The memory 140 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable media, a volatile memory, a non-volatile memory, or other suitable device. The information about failure may be stored, and a program for executing the information may be stored. And the central processor 100 can execute the program stored in the memory 140 to realize information storage or processing, etc.
The input unit 120 provides an input to the central processor 100. The input unit 120 is, for example, a key or a touch input device. The power supply 170 is used to provide power to the electronic device 600. The display 160 is used for displaying display objects such as images and characters. The display may be, for example, but not limited to, an LCD display.
The memory 140 may be a solid state memory such as Read Only Memory (ROM), Random Access Memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered down, can be selectively erased, and can be provided with further data; an example of such a memory is sometimes referred to as an EPROM or the like. Memory 140 may also be some other type of device. Memory 140 includes a buffer memory 141 (sometimes referred to as a buffer). The memory 140 may include an application/function storage 142, the application/function storage 142 being used for storing application programs and function programs, or a flow for executing operations of the electronic device 600 by the central processor 100.
The memory 140 may also include a data store 143, the data store 143 for storing data, such as contacts, digital data, pictures, sounds, and/or any other data used by the electronic device. The driver storage 144 of the memory 140 may include various drivers of the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, address book applications, etc.).
The communication module 110 is a transmitter/receiver 110 that transmits and receives signals via an antenna 111. A communication module (transmitter/receiver) 110 is coupled to the central processor 100 to provide an input signal and receive an output signal, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, etc., may be provided in the same electronic device. The communication module (transmitter/receiver) 110 is also coupled to a speaker 131 and a microphone 132 via an audio processor 130 to provide audio output via the speaker 131 and to receive audio input from the microphone 132 to implement usual telecommunication functions. The audio processor 130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 130 is also coupled to the central processor 100 so that sound can be recorded locally through the microphone 132 and so that sound stored locally can be played through the speaker 131.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present invention have been described with reference to specific examples, which are provided only to facilitate understanding of the method and its core ideas. Those skilled in the art may vary the specific embodiments and the scope of application in accordance with the ideas of the present invention; in view of the above, this description should not be construed as limiting the present invention.

Claims (8)

1. A method of data quality control, the method comprising:
performing a weighted evaluation of the usage frequency, the index credibility and the index validity of an index to obtain an index scoring result, wherein each index is derived from an aggregation, and each aggregation is derived from a plurality of details;
determining, according to the index scoring result, whether a data quality problem exists;
if a data quality problem is determined to exist, determining a target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details, wherein the target detail is used to locate the problematic data so that it can be corrected or deleted, and wherein the data is transaction data;
wherein performing the weighted evaluation of the usage frequency, the index credibility and the index validity of the index to obtain the index scoring result comprises:
performing a weighted evaluation of the report generation count BY of the index and the count CY of independent references to the index, and obtaining the index usage frequency PN by using the following formula:
determining the index credibility XN from the number of verifications the index has passed and the total number of verifications by using the following formula:
determining the index validity XA from the number of occurrences of the index by using the following formula:
obtaining the index scoring result ZKPI by using the following formula:
wherein determining the target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details comprises:
determining, from the index scoring result, a score JKPI of the aggregation corresponding to the index by using the following formula:
determining, from the score of the aggregation, a score MKPI of the detail corresponding to the aggregation by using the following formula:
comparing the score of the detail with a preset detail threshold to determine the target detail.
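The index-scoring steps of claim 1 can be sketched in Python as follows. The formulas for PN, XN, XA and ZKPI are not reproduced in this text, so every expression below is a placeholder assumption rather than the claimed formula: the weights, the `cap` normaliser, the `expected_occurrences` field, the threshold value and all function and field names are hypothetical and chosen only to make the data flow concrete.

```python
from dataclasses import dataclass


@dataclass
class IndexStats:
    """Raw counters for one index (all field names are hypothetical)."""
    report_count: int          # BY: report generation count of the index
    reference_count: int       # CY: count of independent references
    checks_passed: int         # verifications the index has passed
    checks_total: int          # total verifications
    occurrences: int           # times the index occurred
    expected_occurrences: int  # assumed normaliser, not from the source


def usage_frequency(s, w_by=0.5, w_cy=0.5, cap=100):
    # ASSUMPTION: PN is a weighted sum of BY and CY, scaled into [0, 1] by a cap.
    return min((w_by * s.report_count + w_cy * s.reference_count) / cap, 1.0)


def credibility(s):
    # ASSUMPTION: XN is the plain pass ratio; 0.0 when nothing was verified.
    return s.checks_passed / s.checks_total if s.checks_total else 0.0


def validity(s):
    # ASSUMPTION: XA is observed occurrences relative to an expected count.
    if s.expected_occurrences <= 0:
        return 0.0
    return min(s.occurrences / s.expected_occurrences, 1.0)


def index_score(s, weights=(0.2, 0.4, 0.4)):
    # ASSUMPTION: ZKPI is a weighted combination of PN, XN and XA.
    w_pn, w_xn, w_xa = weights
    return w_pn * usage_frequency(s) + w_xn * credibility(s) + w_xa * validity(s)


def has_quality_problem(score, index_threshold=0.6):
    # A score below the preset index threshold signals a data quality problem.
    # The threshold value 0.6 is an arbitrary illustration.
    return score < index_threshold
```

Keeping each component in its own function means that any of the placeholder expressions can be swapped for the actual formula without touching the others.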
2. The method of claim 1, wherein determining whether a data quality problem exists according to the index scoring result comprises: if the index scoring result is lower than a preset index threshold, determining that a data quality problem exists.
3. The method according to claim 1, further comprising: performing data quality control on the target detail.
4. An apparatus for data quality control, the apparatus comprising:
an index evaluation module configured to perform a weighted evaluation of the usage frequency, the index credibility and the index validity of an index to obtain an index scoring result, wherein each index is derived from an aggregation, and each aggregation is derived from a plurality of details;
a quality problem judging module configured to determine, according to the index scoring result, whether a data quality problem exists;
a data quality control module configured to, if a data quality problem is determined to exist, determine a target detail through the index scoring result, the correspondence between indexes and aggregations, and the correspondence between aggregations and details, wherein the target detail is used to locate the problematic data so that it can be corrected or deleted, and wherein the data is transaction data;
wherein the index evaluation module comprises:
a usage frequency unit configured to perform a weighted evaluation of the report generation count BY of the index and the count CY of independent references to the index, and to obtain the index usage frequency PN by using the following formula:
an index credibility unit configured to determine the index credibility XN from the number of verifications the index has passed and the total number of verifications by using the following formula:
an index validity unit configured to determine the index validity XA from the number of occurrences of the index by using the following formula:
and the index scoring result ZKPI is obtained by using the following formula:
wherein the data quality control module comprises:
an aggregation scoring unit configured to determine, from the index scoring result, a score JKPI of the aggregation corresponding to the index by using the following formula:
a detail scoring unit configured to determine, from the score of the aggregation, a score MKPI of the detail corresponding to the aggregation by using the following formula:
and a target detail unit configured to compare the score of the detail with a preset detail threshold to determine the target detail.
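The three units of the data quality control module can be sketched as three small Python functions that walk the index-to-aggregation and aggregation-to-detail correspondences. The formulas for JKPI and MKPI are not reproduced in this text, so the averaging rules, the direction of the threshold comparison, the threshold value and all names below are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def aggregation_scores(index_scores, index_to_aggregation):
    # ASSUMPTION: JKPI of an aggregation is the mean score of the indexes
    # derived from it.
    grouped = defaultdict(list)
    for index, score in index_scores.items():
        grouped[index_to_aggregation[index]].append(score)
    return {agg: mean(scores) for agg, scores in grouped.items()}


def detail_scores(agg_scores, aggregation_to_details):
    # ASSUMPTION: MKPI of a detail is the mean JKPI of the aggregations
    # that are built from it.
    grouped = defaultdict(list)
    for agg, score in agg_scores.items():
        for detail in aggregation_to_details[agg]:
            grouped[detail].append(score)
    return {detail: mean(scores) for detail, scores in grouped.items()}


def target_details(d_scores, detail_threshold=0.5):
    # ASSUMPTION: a detail becomes a target when its score is below the
    # preset detail threshold (the source does not state the direction).
    return sorted(d for d, s in d_scores.items() if s < detail_threshold)
```

For example, if two low-scoring indexes both come from one aggregation, every detail feeding only that aggregation inherits the low score and is returned by `target_details`, while a detail that also feeds a healthy aggregation is pulled back toward the middle.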
5. The apparatus of claim 4, wherein the quality problem judging module comprises a quality problem judging unit configured to determine that a data quality problem exists if the index scoring result is lower than a preset index threshold.
6. The apparatus of claim 4, wherein the data quality control module further comprises a data quality control unit configured to perform data quality control on the target detail.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any one of claims 1 to 3 when executing the program.
8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program for executing the method of any one of claims 1 to 3.
CN202010813723.9A 2020-08-13 Method and device for data quality control Active CN111949642B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010813723.9A CN111949642B (en) 2020-08-13 Method and device for data quality control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010813723.9A CN111949642B (en) 2020-08-13 Method and device for data quality control

Publications (2)

Publication Number Publication Date
CN111949642A (en) 2020-11-17
CN111949642B (en) 2024-07-09

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522318A (en) * 2018-10-22 2019-03-26 中国银行股份有限公司 A kind of data quality management method and system
CN111143334A (en) * 2019-11-13 2020-05-12 深圳市华傲数据技术有限公司 Data quality closed-loop control method
CN111324602A (en) * 2020-02-21 2020-06-23 上海软中信息技术有限公司 Method for realizing financial big data oriented analysis visualization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant