CN109522318A - Data quality management method and system - Google Patents

Publication number
CN109522318A
Authority
CN
China
Prior art keywords
data
index
report
analysis
quality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811228360.1A
Other languages
Chinese (zh)
Other versions
CN109522318B (en)
Inventor
范怡
蒋先虎
彭轶
高迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd filed Critical Bank of China Ltd
Priority to CN201811228360.1A priority Critical patent/CN109522318B/en
Publication of CN109522318A publication Critical patent/CN109522318A/en
Application granted granted Critical
Publication of CN109522318B publication Critical patent/CN109522318B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data quality management method and system. The method comprises: configuring data observation indexes to obtain an index configuration table, wherein a data observation index characterizes a point of concern in data reporting; calculating the data observation indexes according to the index configuration information in the index configuration table to obtain index values, and generating an index data quality report according to the change of the index values within a preset time range; determining the themes of the regulatory reporting data and performing data analysis on each theme to obtain a themed data quality report; determining an early-warning threshold according to the index values and performing early-warning processing on the data observation indexes to obtain early-warning information; and generating a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the early-warning information. The invention improves the quality of data reporting and the accuracy of data quality monitoring.

Description

A data quality management method and system
Technical Field
The present invention relates to the technical field of data processing, and in particular to a data quality management method and system.
Background
Some reporting systems of financial institutions serve regulators such as the State Administration of Foreign Exchange and the People's Bank of China, and submit data to the regulators' systems. As regulators keep raising their data quality requirements, the original mode of upstream acquisition plus submission is gradually becoming unable to cope with regulatory pressure. The business departments of the head offices and branches of some banks also report that the volume of data in these systems keeps growing and that regulatory pressure is increasing.
Existing data monitoring relies on data quality monitoring and analysis tools. The result data generated by these tools is presented as statements, that is, in tabular form, and the early-warning thresholds mostly depend on parameters maintained manually by the business departments. Tabular presentation makes the data hard to read, and manually set thresholds reduce the accuracy and timeliness of early warning, which in turn lowers the quality of data reporting.
Summary of the Invention
In view of the above problems, the present invention provides a data quality management method and system that improve the quality of data reporting and the accuracy of data quality monitoring.
To achieve the above object, the present invention provides the following technical solutions:
A data quality management method, comprising:
Configuring data observation indexes to obtain an index configuration table, wherein a data observation index characterizes a point of concern in data reporting;
Calculating the data observation indexes according to the index configuration information in the index configuration table to obtain index values, and generating an index data quality report according to the change of the index values within a preset time range;
Determining the themes of the regulatory reporting data, and performing data analysis on each theme to obtain a themed data quality report;
Determining an early-warning threshold according to the index values, and performing early-warning processing on the data observation indexes to obtain early-warning information;
Generating a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the early-warning information.
Optionally, configuring the data observation indexes to obtain the index configuration table comprises:
Obtaining the points of concern in data reporting, and defining each point of concern as a data observation index;
Determining the index dependencies between the data observation indexes according to the associations between the points of concern;
Verifying the index dependencies to obtain the index dependencies that satisfy the verification conditions;
Configuring the data observation indexes according to the data observation indexes and the verified index dependencies to obtain the index configuration table.
Optionally, calculating the data observation indexes according to the index configuration information in the index configuration table to obtain index values, and generating the index data quality report according to the change of the index values within the preset time range, comprises:
Parsing the index configuration information in the index configuration table to obtain the index calculation modes;
Determining whether the index configuration table contains an index pending calculation, and if so, determining the index calculation mode that matches the pending index;
Performing an aggregate calculation on the pending index according to its matching index calculation mode to obtain an index value;
Generating the index data quality report according to the change of the index value within the preset time range, wherein the index data quality report is used to display the data related to the data observation indexes visually.
Optionally, determining the themes of the regulatory reporting data and performing data analysis on each theme to obtain the themed data quality report comprises:
Determining the themes of the regulatory reporting data, wherein the themes include upstream system data, manually entered data, data reporting, feedback errors, and overdue data;
Performing data source analysis and data verification analysis on the upstream system data to obtain an upstream system data analysis result;
Performing data volume statistics on the manually entered data and analyzing the reasons for manual entry to obtain a manually entered data analysis result;
Analyzing the data reporting to obtain a data reporting result, wherein the data reporting result includes the volume of data reported normally and the volume of data reported overdue;
Analyzing the feedback errors along preset field dimensions to obtain a feedback error analysis result;
Performing statistical analysis on the overdue data to obtain an overdue data analysis result;
Generating the themed data quality report according to the upstream system data analysis result, the manually entered data analysis result, the data reporting result, the feedback error analysis result, and the overdue data analysis result.
Optionally, determining the early-warning threshold and performing early-warning processing on the data observation indexes to obtain the early-warning information comprises:
Calculating the mean and variance of a data observation index from its index values;
Determining the confidence interval of the data observation index according to the mean and variance;
Determining the early-warning threshold based on the confidence interval;
Performing early-warning processing on the data observation index according to the early-warning threshold to obtain the early-warning information.
A data quality management system, comprising:
A configuration unit, configured to configure data observation indexes to obtain an index configuration table, wherein a data observation index characterizes a point of concern in data reporting;
An index calculation unit, configured to calculate the data observation indexes according to the index configuration information in the index configuration table to obtain index values, and to generate an index data quality report according to the change of the index values within a preset time range;
A theme analysis unit, configured to determine the themes of the regulatory reporting data and to perform data analysis on each theme to obtain a themed data quality report;
An early-warning unit, configured to determine an early-warning threshold according to the index values and to perform early-warning processing on the data observation indexes to obtain early-warning information;
A report generation unit, configured to generate a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the early-warning information.
Optionally, the configuration unit comprises:
An obtaining subunit, configured to obtain the points of concern in data reporting and to define each point of concern as a data observation index;
A relationship determination subunit, configured to determine the index dependencies between the data observation indexes according to the associations between the points of concern;
A verification subunit, configured to verify the index dependencies and obtain the index dependencies that satisfy the verification conditions;
A configuration subunit, configured to configure the data observation indexes according to the data observation indexes and the verified index dependencies to obtain the index configuration table.
Optionally, the index calculation unit comprises:
A parsing subunit, configured to parse the index configuration information in the index configuration table to obtain the index calculation modes;
A judgment subunit, configured to determine whether the index configuration table contains an index pending calculation, and if so, to determine the index calculation mode that matches the pending index;
A first calculation subunit, configured to perform an aggregate calculation on the pending index according to its matching index calculation mode to obtain an index value;
A first report generation subunit, configured to generate the index data quality report according to the change of the index value within the preset time range, wherein the index data quality report is used to display the data related to the data observation indexes visually.
Optionally, the theme analysis unit comprises:
A theme determination subunit, configured to determine the themes of the regulatory reporting data, wherein the themes include upstream system data, manually entered data, data reporting, feedback errors, and overdue data;
A first analysis subunit, configured to perform data source analysis and data verification analysis on the upstream system data to obtain an upstream system data analysis result;
A second analysis subunit, configured to perform data volume statistics on the manually entered data and to analyze the reasons for manual entry to obtain a manually entered data analysis result;
A third analysis subunit, configured to analyze the data reporting to obtain a data reporting result, wherein the data reporting result includes the volume of data reported normally and the volume of data reported overdue;
A fourth analysis subunit, configured to analyze the feedback errors along preset field dimensions to obtain a feedback error analysis result;
A fifth analysis subunit, configured to perform statistical analysis on the overdue data to obtain an overdue data analysis result;
A second report generation subunit, configured to generate the themed data quality report according to the upstream system data analysis result, the manually entered data analysis result, the data reporting result, the feedback error analysis result, and the overdue data analysis result.
Optionally, the early-warning unit comprises:
A second calculation subunit, configured to calculate the mean and variance of a data observation index from its index values;
An interval determination subunit, configured to determine the confidence interval of the data observation index according to the mean and variance;
A threshold determination subunit, configured to determine the early-warning threshold based on the confidence interval;
An early-warning processing subunit, configured to perform early-warning processing on the data observation index according to the early-warning threshold to obtain the early-warning information.
Compared with the prior art, the present invention provides a data quality management method and system. In the method, the points of concern in data reporting are defined as data observation indexes, and the indexes are configured to obtain an index configuration table. Each data observation index is then tracked and analyzed on the basis of the index configuration table and its index value is calculated, so that indexes can be responded to and adjusted quickly, and the generated index data quality report is more accurate and can be displayed visually. Theme-based analysis allows theme data to be analyzed and displayed, which makes data monitoring more complete. Determining the early-warning threshold from the index values makes the threshold more accurate and more timely. The resulting data quality monitoring analysis report presents data quality information more clearly, completely, and accurately, which improves the quality of data reporting and the accuracy of data quality monitoring.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a data quality management method provided in an embodiment of the present invention;
Fig. 2 is a flow diagram of a method for obtaining an index configuration table provided in an embodiment of the present invention;
Fig. 3 is a flow diagram of an index calculation method provided in an embodiment of the present invention;
Fig. 4 is a structural diagram of a data quality management system provided in an embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The terms "first" and "second" in the description, the claims, and the drawings are used to distinguish different objects, not to describe a specific order. The terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, and may include steps or units that are not listed.
An embodiment of the present invention provides a data quality management method. Referring to Fig. 1, the method comprises:
S101, configuring data observation indexes to obtain an index configuration table;
Here, a data observation index characterizes a point of concern in data reporting. In the embodiment of the present invention, the data quality points that business personnel pay attention to every day are standardized in the form of "indexes": each data quality point of concern is turned into one data observation index. With the data observation index as the smallest unit, each point of concern in data reporting is calculated. The dependencies between data observation indexes need to be verified, and the indexes are configured according to their relevant information to obtain the index configuration table.
S102, calculating the data observation indexes according to the index configuration information in the index configuration table to obtain index values, and generating an index data quality report according to the change of the index values within a preset time range.
Specifically, this process includes:
Parsing the index configuration information in the index configuration table to obtain the index calculation modes;
Determining whether the index configuration table contains an index pending calculation, and if so, determining the index calculation mode that matches the pending index;
Performing an aggregate calculation on the pending index according to its matching index calculation mode to obtain an index value;
Generating the index data quality report according to the change of the index value within the preset time range, wherein the index data quality report is used to display the data related to the data observation indexes visually.
S103, determining the themes of the regulatory reporting data, and performing data analysis on each theme to obtain a themed data quality report;
Regulatory reporting systems generally follow a common pattern: upstream acquisition, system processing, business intervention, submission to the regulator, and regulator feedback. Based on this process, the embodiment of the present invention performs further analysis and display in the form of themes. The current themes are: upstream system data, manually entered data, data reporting, feedback errors, and overdue data. Most of the data that the analysis themes rely on comes from the index calculation results. However, to keep index configuration flexible and general, a small amount of system-specific data is calculated in the auxiliary data calculation of the analysis themes instead.
S104, determining an early-warning threshold according to the index values, and performing early-warning processing on the data observation indexes to obtain early-warning information;
Thresholds in regulatory systems are generally set by the business departments from experience, so both the validity of the thresholds and the timeliness of their updates are inadequate. For the early-warning indexes, a function that periodically calculates the threshold interval is therefore designed to give business personnel a reference. If the business personnel accept the threshold calculated by the system, it can be adopted and take effect.
Specifically, this process may include:
Calculating the mean and variance of a data observation index from its index values;
Determining the confidence interval of the data observation index according to the mean and variance;
Determining the early-warning threshold based on the confidence interval;
Performing early-warning processing on the data observation index according to the early-warning threshold to obtain the early-warning information.
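The threshold steps above can be sketched as follows. The text does not give the formula for the confidence interval, so this sketch assumes a normal approximation in which the interval is the mean plus or minus `z` standard deviations; the function names `warning_thresholds` and `check`, and the default `z = 1.96`, are illustrative assumptions rather than part of the described method.

```python
import statistics
from typing import List, Optional, Tuple


def warning_thresholds(values: List[float], z: float = 1.96) -> Tuple[float, float]:
    """Derive (lower, upper) early-warning thresholds from historical index values.

    Assumption: the confidence interval is mean +/- z * standard deviation.
    """
    mean = statistics.fmean(values)
    variance = statistics.variance(values)  # sample variance; needs at least 2 values
    half_width = z * variance ** 0.5
    return mean - half_width, mean + half_width


def check(value: float, thresholds: Tuple[float, float]) -> Optional[str]:
    """Return a warning message if the value falls outside the thresholds, else None."""
    low, high = thresholds
    if value < low:
        return f"index value {value} is below the lower threshold {low:.2f}"
    if value > high:
        return f"index value {value} is above the upper threshold {high:.2f}"
    return None
```

Recomputing the thresholds on a schedule from recent index values, and letting business personnel accept or reject the proposed interval, would match the periodic calculation described in the text.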
S105, generating a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the early-warning information.
The visual index data quality report, the themed data quality analysis, and the daily dynamic early warning together form the data quality monitoring analysis report, which provides a comprehensive and intuitive view of the daily data situation.
The present invention provides a data quality management method. In the method, the points of concern in data reporting are defined as data observation indexes, and the indexes are configured to obtain an index configuration table. Each data observation index is then tracked and analyzed on the basis of the index configuration table and its index value is calculated, so that indexes can be responded to and adjusted quickly, and the generated index data quality report is more accurate and can be displayed visually. Theme-based analysis allows theme data to be analyzed and displayed, which makes data monitoring more complete. Determining the early-warning threshold from the index values makes the threshold more accurate and more timely. The resulting data quality monitoring analysis report presents data quality information more clearly, completely, and accurately, which improves the quality of data reporting and the accuracy of data quality monitoring.
On the basis of the above embodiment, and referring to Fig. 2, another embodiment of the present invention provides a method for obtaining an index configuration table, comprising:
S201, obtaining the points of concern in data reporting, and defining each point of concern as a data observation index;
S202, determining the index dependencies between the data observation indexes according to the associations between the points of concern;
S203, verifying the index dependencies to obtain the index dependencies that satisfy the verification conditions;
S204, configuring the data observation indexes according to the data observation indexes and the verified index dependencies to obtain the index configuration table.
Specifically, each point of concern in data reporting is calculated with the data observation index as the smallest unit. Points of concern may be associated with one another, for example when ratios or proportions are calculated, so indexes are allowed to depend on each other: index A may depend on the calculation result of index B. Before a calculation is executed, the validity of the index dependencies is checked to avoid nested (circular) dependencies and dependencies on indexes that do not exist. To meet these requirements, the index configuration table is designed to contain mainly the following information:
Per-system calculation: configured through the SYSTEM field; different systems are configured and calculated separately;
Per-province calculation: configured through the BYBRANCH field; because different systems split data by province on different bases, the basis for the split is configured in a separate system configuration table;
Different calculation frequencies: the calculation frequency of an index is configured through TARGET_TYPE, and can be daily, monthly, or yearly;
Index dependencies: configured through REL_TARGET;
Index detail storage: during daily checks, business personnel often need to see the detail records behind a given number. For example, if the system reports 10 feedback errors today and can show the 10 specific records at the same time, business personnel do not have to look them up through other functions, which makes data processing more convenient and the function more useful to them. Whether detail is needed is therefore configured through NEED_DETAIL. To reduce the cost of storing detail while keeping configuration flexible, the DATA_KEY field must be configured when NEED_DETAIL is Y; it records the primary key of the detail (the detail table stores only the primary key);
Partial index execution: configured through VALID, so that only part of the indexes are executed (for example, when certain indexes need to be recalculated, a full recalculation is not required);
Multi-dimensional calculation: configured through the TARGET_UNIT field, so that monitoring can cover several dimensions such as amount and number of transactions.
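A minimal sketch of one configuration row and of the dependency validity check described above. The column names SYSTEM, BYBRANCH, TARGET_TYPE, REL_TARGET, NEED_DETAIL, DATA_KEY, VALID, and TARGET_UNIT come from the text; the `target_id` column, the field types, and the example codes are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexConfig:
    target_id: str            # hypothetical identifier column (not named in the text)
    system: str               # SYSTEM
    by_branch: bool           # BYBRANCH
    target_type: str          # TARGET_TYPE, e.g. "D", "M", "Y" (assumed codes)
    rel_target: List[str]     # REL_TARGET: ids of the indexes this index depends on
    need_detail: bool         # NEED_DETAIL
    data_key: Optional[str]   # DATA_KEY: primary-key column of the detail records
    valid: bool               # VALID
    target_unit: str          # TARGET_UNIT, e.g. "AMOUNT" or "COUNT" (assumed codes)


def validate(configs: List[IndexConfig]) -> List[str]:
    """Return a list of problems; an empty list means the table passes the check."""
    by_id = {c.target_id: c for c in configs}
    problems: List[str] = []
    for c in configs:
        if c.need_detail and not c.data_key:
            problems.append(f"{c.target_id}: NEED_DETAIL is Y but DATA_KEY is empty")
        for dep in c.rel_target:
            if dep not in by_id:
                problems.append(f"{c.target_id}: depends on missing index {dep}")

    # Depth-first search: reaching a node that is still "in progress" means a cycle.
    NEW, IN_PROGRESS, FINISHED = 0, 1, 2
    state = {i: NEW for i in by_id}

    def visit(node: str) -> None:
        state[node] = IN_PROGRESS
        for dep in by_id[node].rel_target:
            if dep not in by_id:
                continue  # already reported as missing
            if state[dep] == IN_PROGRESS:
                problems.append(f"{node}: circular dependency via {dep}")
            elif state[dep] == NEW:
                visit(dep)
        state[node] = FINISHED

    for i in by_id:
        if state[i] == NEW:
            visit(i)
    return problems
```

Running such a check before every calculation, as the text requires, keeps a bad configuration row from stalling the whole run.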
An embodiment of the present invention further provides an index calculation method, comprising:
Parsing the index configuration information in the index configuration table to obtain the index calculation modes;
Determining whether the index configuration table contains an index pending calculation, and if so, determining the index calculation mode that matches the pending index;
Performing an aggregate calculation on the pending index according to its matching index calculation mode to obtain an index value;
Generating the index data quality report according to the change of the index value within the preset time range, wherein the index data quality report is used to display the data related to the data observation indexes visually.
It should be noted that index calculation follows the configuration in the index configuration table: it starts from the leaf indexes (indexes that do not depend on any other index) and loops until all valid indexes have been calculated. The following takes per-province calculation as the index calculation mode. Fig. 3 is a flow diagram of an index calculation method provided in an embodiment of the present invention, and the method includes:
S301, updating the status of the indexes to be executed to the initial state;
S302, searching the indexes in the initial state for indexes that can be calculated;
S303, determining whether there is an index that can be calculated, and if so, executing S304;
S304, determining whether per-province execution is needed; if so, executing S305, otherwise executing S306;
S305, reading the province split table, and executing in a loop over the provinces and institutions according to the split basis;
S306, executing the aggregate calculation over the whole jurisdiction;
S307, determining whether detail needs to be recorded; if so, executing S308, otherwise executing S309;
S308, reading the detail configuration table and recording the detail;
S309, finishing execution, and updating the index status to the target state.
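The loop in S301 to S309 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the `compute` callback, the state names, and the `"ALL"` scope label are hypothetical, and detail recording (S307, S308) is left out to keep the sketch short.

```python
from typing import Callable, Dict, List

INITIAL, DONE = "INITIAL", "DONE"   # hypothetical names for the initial and target states

Results = Dict[str, Dict[str, float]]


def run_indexes(
    deps: Dict[str, List[str]],                       # index -> indexes it depends on
    by_branch: Dict[str, bool],                       # index -> per-province execution?
    branches: List[str],                              # provinces read from the split table
    compute: Callable[[str, str, Results], float],    # (index, scope, results so far)
) -> Results:
    status = {name: INITIAL for name in deps}         # S301
    results: Results = {}
    while True:
        # S302: an index can be calculated once all of its dependencies are done,
        # so leaf indexes (no dependencies) are picked up in the first pass.
        ready = [
            name for name, st in status.items()
            if st == INITIAL and all(status.get(d) == DONE for d in deps[name])
        ]
        if not ready:                                  # S303: nothing left to calculate
            break
        for name in ready:
            if by_branch[name]:                        # S304 -> S305: loop over provinces
                results[name] = {b: compute(name, b, results) for b in branches}
            else:                                      # S306: whole-jurisdiction aggregate
                results[name] = {"ALL": compute(name, "ALL", results)}
            status[name] = DONE                        # S309
    return results
```

If the dependencies contain a cycle or point to a missing index, the affected indexes simply never become ready and are absent from the result, which is why the text validates dependencies before calculation.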
Regarding detail storage: as noted above, what business personnel want to see is usually the underlying basic data. If 5 errors are fed back, they want to know what the errors are; if 10 records are reported late, they want to know which 10 records. Detail storage therefore touches a wide range of tables, and the data in those tables already exists in the system. Copying and extracting the corresponding data again when storing detail would duplicate data and would limit extensibility and flexibility. The design is therefore as follows:
An index detail table is established that records detail in the form "index - date - detail primary key", that is, only the primary key is recorded. In addition, an index detail configuration table records the specific business table that corresponds to each index. When a user queries or views detail, the detail table is joined with the index detail configuration table, and the data is read from the specific business data table and displayed. This design flexibly supports displaying detail from different business data tables. When new indexes or new detail are added later, only configuration is needed; no new tables have to be created and no query or export functions have to be redeveloped.
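The detail design described above can be sketched with an in-memory SQLite database. All table names, column names, and sample rows here (`feedback_error`, `index_detail`, `index_detail_config`, `FEEDBACK_ERR`, and so on) are invented for illustration; the text only specifies that the detail table holds "index - date - detail primary key" and that a configuration table maps each index to its business table.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical business table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE feedback_error (id TEXT PRIMARY KEY, branch TEXT, reason TEXT);
        CREATE TABLE index_detail (target_id TEXT, data_date TEXT, detail_key TEXT);
        CREATE TABLE index_detail_config (
            target_id TEXT PRIMARY KEY, table_name TEXT, key_column TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO feedback_error VALUES (?, ?, ?)",
        [("E1", "Beijing", "bad currency code"),
         ("E2", "Shanghai", "missing amount"),
         ("E3", "Beijing", "bad date")],
    )
    # The detail table stores only (index, date, primary key), never the business row.
    conn.executemany(
        "INSERT INTO index_detail VALUES (?, ?, ?)",
        [("FEEDBACK_ERR", "2024-01-01", "E1"), ("FEEDBACK_ERR", "2024-01-01", "E3")],
    )
    conn.execute(
        "INSERT INTO index_detail_config VALUES ('FEEDBACK_ERR', 'feedback_error', 'id')"
    )
    return conn


def load_details(conn: sqlite3.Connection, target_id: str, data_date: str) -> list:
    """Look up the business table for an index, then join it with the stored keys."""
    row = conn.execute(
        "SELECT table_name, key_column FROM index_detail_config WHERE target_id = ?",
        (target_id,),
    ).fetchone()
    if row is None:
        return []
    table_name, key_column = row
    # Identifiers come from the trusted configuration table, not from user input.
    sql = (
        f'SELECT b.* FROM index_detail AS d JOIN "{table_name}" AS b '
        f'ON b."{key_column}" = d.detail_key '
        "WHERE d.target_id = ? AND d.data_date = ? ORDER BY d.detail_key"
    )
    return conn.execute(sql, (target_id, data_date)).fetchall()
```

Adding detail for a new index then only takes a new row in the configuration table plus the stored keys, which is the extensibility benefit the text describes.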
An embodiment of the present invention further provides a method for theme data analysis, comprising:
Determining the themes of the regulatory reporting data, wherein the themes include upstream system data, manually entered data, data reporting, feedback errors, and overdue data;
Performing data source analysis and data verification analysis on the upstream system data to obtain an upstream system data analysis result;
Performing data volume statistics on the manually entered data and analyzing the reasons for manual entry to obtain a manually entered data analysis result;
Analyzing the data reporting to obtain a data reporting result, wherein the data reporting result includes the volume of data reported normally and the volume of data reported overdue;
Analyzing the feedback errors along preset field dimensions to obtain a feedback error analysis result;
Performing statistical analysis on the overdue data to obtain an overdue data analysis result;
Generating the themed data quality report according to the upstream system data analysis result, the manually entered data analysis result, the data reporting result, the feedback error analysis result, and the overdue data analysis result.
For example, based on daily production operation and maintenance experience and on communication with the business departments, the project team configured the following indexes, which mainly cover five themes: upstream data, manual supplementary entry, data reporting, feedback errors, and overdue data:
Upstream system data analysis:
Regulatory reporting systems typically involve several, a dozen, or even dozens of upstream systems. The data quality of the upstream systems has a very important influence on the reported data; improving upstream data quality can be said to be the basis for improving regulatory reporting quality and reducing manual intervention. The following source system data analyses were therefore designed:
Statistical analysis of each system's data volume: analyzes the daily distribution of data sources;
Statistical analysis of each system's data quality: analyzes the daily verification status of the data from each system, i.e., how many correct records and how many erroneous records each upstream system supplied, what the error reasons are, and which fields are involved;
The above analysis results help the project team and business departments discover problems in upstream systems in time and communicate with the upstream systems to analyze them, improving source system quality and fundamentally reducing erroneous data and manual intervention.
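The two upstream analyses can be sketched as a single pass over verified records. The record layout (a `system` name plus a list of `(field, reason)` verification errors) and the function names are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def _empty_quality():
    # Per-system tally: record counts plus error reasons and affected fields.
    return {"correct": 0, "wrong": 0, "reasons": Counter(), "fields": Counter()}


def analyze_upstream(records):
    """Return (volume per source system, verification quality per source system)."""
    volume = Counter(r["system"] for r in records)
    quality = defaultdict(_empty_quality)
    for r in records:
        tally = quality[r["system"]]
        if r["errors"]:
            tally["wrong"] += 1
            for field_name, reason in r["errors"]:
                tally["fields"][field_name] += 1
                tally["reasons"][reason] += 1
        else:
            tally["correct"] += 1
    return volume, dict(quality)


records = [
    {"system": "core_banking", "errors": []},
    {"system": "core_banking", "errors": [("customer_id", "missing value")]},
    {"system": "credit", "errors": [("amount", "invalid format"),
                                    ("currency", "missing value")]},
]
volume, quality = analyze_upstream(records)
```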
Manually supplemented data analysis:
At present, some business is still not collected from upstream systems and is entered into the system directly by business personnel. Each branch has its own reasons and difficulties in this situation, so the manually supplemented data analysis was designed: the volume of data manually entered by each branch is counted daily, together with the information of the corresponding branch tellers. After observing for a period of time, the project team can take the initiative to contact the head office and the branches involved to discuss and analyze the reasons for manual entry and whether the data can instead be acquired through automatic collection.
Data reporting analysis: includes the volume of normally reported data and the volume of overdue reported data.
Feedback error analysis:
Regulatory agencies return error feedback on reported data, so the feedback error analysis was designed to count each day's error feedback. It currently includes the following parts:
Feedback error analysis by field dimension: counts feedback errors with the field as the dimension, to find the fields where errors are concentrated and intervene in time.
Feedback error analysis by branch dimension: shows feedback errors with the institution as the dimension, to find the branches where errors are concentrated, contact them in time, and analyze and resolve the problems with them.
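Both feedback error views are the same grouping applied to a different dimension, which the following sketch makes explicit. The row layout and the `errors_by_dimension` function are hypothetical.

```python
from collections import Counter


def errors_by_dimension(feedback_rows, dimension):
    """Rank the values of one dimension by how many feedback errors they have."""
    return Counter(row[dimension] for row in feedback_rows).most_common()


feedback_rows = [
    {"field": "customer_id", "branch": "Branch A"},
    {"field": "customer_id", "branch": "Branch B"},
    {"field": "amount", "branch": "Branch A"},
    {"field": "customer_id", "branch": "Branch A"},
]
by_field = errors_by_dimension(feedback_rows, "field")
by_branch = errors_by_dimension(feedback_rows, "branch")
```

Sorting by count puts the fields or branches where errors are concentrated at the top of each list.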
Overdue data analysis:
Overdue data has always been a pain point for business departments in regulatory reporting, since being overdue can lead to point deductions or penalties from regulatory agencies. How to discover overdue data and handle it in time is therefore something the system needs to assist business departments with. The overdue data analysis function was designed accordingly: it analyzes the daily overdue situation by institution dimension, helping the head office focus on key branches and prioritize by importance and urgency, so as to improve data quality with less effort.
In addition, warning thresholds are generated dynamically by periodically combining previous data, so that the most accurate warning thresholds are derived from recent data experience. Monitoring and early warning are then performed along the dimensions of the above indexes, so that abnormal data points are discovered in time.
For example, in regulatory-type systems, thresholds are generally set according to the business department's experience, so the validity and timeliness of the thresholds are insufficient. Therefore, in the early-warning index part, a function that periodically calculates the threshold interval is designed to provide a reference for business personnel; if the business personnel accept the threshold calculated by the system, it can be adopted and take effect. The threshold calculation uses the normal distribution formula: the mean and variance of the index within a preset time range (such as the past year) are calculated periodically to obtain the confidence interval of the index. During the calculation, the characteristics of banking systems and regulatory reporting are taken into account (for example, no data is reported on holidays, and the first working day after a holiday reports the data of the preceding N days off), so different indexes can be assigned calculation-day types such as working day, holiday, and first working day after a holiday, and a confidence interval is calculated separately for each type. The calculated confidence interval is provided to the operations and business departments for reference; if adopted, it serves as the upper and lower thresholds of the index for early-warning judgment.
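The per-day-type threshold calculation can be sketched as below. The text only states that a normal distribution formula, the mean, and the variance are used; the concrete interval mean ± z·σ, the 95% confidence level, the use of the population standard deviation, and the day-type labels are assumptions of this sketch.

```python
from collections import defaultdict
from statistics import NormalDist, mean, pstdev


def threshold_intervals(history, confidence=0.95):
    """Compute a (lower, upper) interval per day type from historical index values.

    history: iterable of (day_type, index_value) pairs within the preset time range.
    """
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    grouped = defaultdict(list)
    for day_type, value in history:
        grouped[day_type].append(value)
    intervals = {}
    for day_type, values in grouped.items():
        mu, sigma = mean(values), pstdev(values)
        intervals[day_type] = (mu - z * sigma, mu + z * sigma)
    return intervals


history = [
    ("workday", 100), ("workday", 110), ("workday", 90),
    ("holiday", 0), ("holiday", 0),
    ("first_workday_after_holiday", 300), ("first_workday_after_holiday", 340),
]
intervals = threshold_intervals(history)
```

Grouping by day type keeps the zero volumes on holidays and the catch-up volumes after holidays from distorting the working-day interval.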
In this embodiment of the present invention, a data quality monitoring analysis report is generated and displayed at the front end of the system. The display items of the report are generated dynamically according to the index, theme, and subject analysis configurations, and can be shown with a visualization tool (such as echarts3). For example, the display page can be divided into left and right parts. The left side shows the index data quality report: by reading the index allocation table and the index calculation result table, it displays the status of the data under each index; clicking a numeric field shows the index's change over the most recent month at the bottom of the page, and querying and exporting the current day's details is supported. The right side shows the themed data quality report: by reading the subject analysis allocation table, it displays the analysis functions configured for each theme, with corresponding links; clicking a link shows the specific analysis result at the bottom of the page, so that analyses can continue to be added and refined as business and regulatory requirements evolve. Visualized data display, such as trend charts, is realized in the report.
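One way a back end could feed the trend display is to emit a chart option object for the visualization tool. The sketch below builds a line-chart option using the basic ECharts option keys (`title`, `tooltip`, `xAxis`, `yAxis`, `series`); the function name and the sample data are hypothetical, and how the option reaches the front end is not specified by the source.

```python
import json


def trend_chart_option(index_name, points):
    """Build a line-chart option from (date, value) pairs for one index."""
    dates = [d for d, _ in points]
    values = [v for _, v in points]
    return {
        "title": {"text": index_name},
        "tooltip": {"trigger": "axis"},
        "xAxis": {"type": "category", "data": dates},
        "yAxis": {"type": "value"},
        "series": [{"name": index_name, "type": "line", "data": values}],
    }


option = trend_chart_option(
    "Feedback errors",
    [("2018-10-20", 12), ("2018-10-21", 9), ("2018-10-22", 15)],
)
# Serialized form that a front-end page could pass to the chart instance.
option_json = json.dumps(option, ensure_ascii=False)
```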
The present invention turns data quality monitoring points into indexes and turns the data quality analysis report into a framework, thereby achieving the purpose of monitoring system data quality while responding quickly, adjusting monitoring points in time, and adapting to changing external regulatory and internal control requirements. For early-warning indexes, index thresholds are calculated monthly from historical index data, drawing on historical data to help business personnel adjust threshold parameters in time and adapt to continuously changing data. Using visualization tools, the dry data quality analysis report is turned into intuitive trend and change charts; because the report is built as a framework, further visual analyses can be developed and configured into the report as business needs arise, continuously increasing the value of the data quality analysis report.
Correspondingly, another embodiment of the present invention further provides a data quality management system. Referring to Fig. 4, the system includes:
A configuration unit 401, for configuring data observation indexes to obtain an index allocation table, wherein the data observation indexes characterize the focus points in data reporting;
An index calculation unit 402, for calculating the data observation indexes according to the index allocation information in the index allocation table to obtain index values, and generating an index data quality report according to the change data of the index values within a preset time range;
A subject analysis unit 403, for determining the themes of regulatory reporting data and performing data analysis on each theme to obtain a themed data quality report;
An early-warning unit 404, for determining a warning threshold according to the index values and performing early-warning processing on the data observation indexes to obtain warning information;
A report generation unit 405, for generating a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the warning information.
The present invention provides a data quality management device. The focus points in data reporting are determined as data observation indexes, and the configuration unit configures the data observation indexes to obtain the index allocation table. Based on the index allocation table, the index calculation unit performs data tracking analysis on each data observation index and calculates the index values, so that indexes can be responded to and adjusted quickly, making the generated index data quality report more accurate and suitable for visual display. Meanwhile, theme-based analysis in the subject analysis unit enables the analysis and display of themed data, making data monitoring more complete. Determining the warning threshold from the index values makes the threshold more accurate and timely. Finally, the data quality monitoring analysis report generated by the report generation unit presents data quality information more clearly, completely, and accurately, improving both the quality of data reporting and the accuracy of data quality monitoring.
On the basis of the above embodiments, the configuration unit includes:
An obtaining subunit, for obtaining the focus points in data reporting and defining the focus points as data observation indexes;
A relationship determining subunit, for determining the index dependencies between the data observation indexes according to the association relationships between the focus points;
A verification subunit, for verifying the index dependencies to obtain the index dependencies that meet the verification condition;
A configuration subunit, for configuring the data observation indexes according to the data observation indexes and the verified index dependencies to obtain the index allocation table.
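The text does not state what the verification condition is. The sketch below assumes, purely for illustration, that a dependency set passes when every referenced index is defined and the dependencies contain no cycle; it then returns an order in which the indexes could be configured or calculated. The function name and the sample index names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def verify_dependencies(indexes, dependencies):
    """Check index dependencies and return the indexes in dependency order.

    indexes: iterable of index names.
    dependencies: mapping of index name -> names of the indexes it depends on.
    """
    known = set(indexes)
    referenced = set(dependencies) | {d for deps in dependencies.values() for d in deps}
    unknown = referenced - known
    if unknown:
        raise ValueError(f"unknown indexes: {sorted(unknown)}")
    # Include indexes without dependencies so that every index appears in the order.
    graph = {name: dependencies.get(name, ()) for name in indexes}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


order = verify_dependencies(
    ["total_records", "error_records", "error_rate"],
    {"error_rate": ["total_records", "error_records"]},
)
```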
Optionally, the index calculation unit includes:
A parsing subunit, for parsing the index allocation information in the index allocation table to obtain index operation modes;
A judging subunit, for judging whether the index allocation table contains an index to be calculated and, if so, determining the index operation mode that matches the index to be calculated;
A first calculation subunit, for performing summary calculation on the index to be calculated according to the index operation mode that matches it, to obtain an index value;
A first report generation subunit, for generating an index data quality report according to the change data of the index value within a preset time range, wherein the index data quality report is used to visually display the data related to the data observation indexes.
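The parse, judge, and calculate steps can be sketched as a lookup from configured operation mode to an aggregation function. The operation modes (`count`, `sum`), the `pending` flag that marks an index as awaiting calculation, and the configuration layout are all assumptions of this example.

```python
# Hypothetical operation modes; a real allocation table may define others.
OPERATIONS = {
    "count": lambda rows, field: len(rows),
    "sum": lambda rows, field: sum(row[field] for row in rows),
}


def compute_index_values(config_rows, data_rows):
    """Calculate every pending index using the operation mode from its configuration."""
    values = {}
    for cfg in config_rows:
        if not cfg.get("pending", True):
            continue  # Not an index to be calculated in this run.
        operation = OPERATIONS[cfg["operation"]]
        values[cfg["index_id"]] = operation(data_rows, cfg.get("field"))
    return values


config_rows = [
    {"index_id": "record_count", "operation": "count"},
    {"index_id": "total_amount", "operation": "sum", "field": "amount"},
    {"index_id": "skipped_index", "operation": "count", "pending": False},
]
data_rows = [{"amount": 100}, {"amount": 200}, {"amount": 300}]
index_values = compute_index_values(config_rows, data_rows)
```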
Optionally, the subject analysis unit includes:
A theme determining subunit, for determining the themes of regulatory reporting data, wherein the themes include upstream system data, manually supplemented data, data reporting, feedback errors, and overdue data;
A first analysis subunit, for performing data source analysis and data verification analysis on the upstream system data to obtain an upstream system data analysis result;
A second analysis subunit, for performing data volume statistics and supplementary-entry reason analysis on the manually supplemented data to obtain a manually supplemented data analysis result;
A third analysis subunit, for analyzing the data reporting to obtain a data reporting result, wherein the data reporting result includes the volume of normally reported data and the volume of overdue reported data;
A fourth analysis subunit, for analyzing the feedback errors according to a preset field dimension to obtain a feedback error analysis result;
A fifth analysis subunit, for performing statistical analysis on the overdue data to obtain an overdue data analysis result;
A second report generation subunit, for generating a themed data quality report according to the upstream system data analysis result, the manually supplemented data analysis result, the data reporting result, the feedback error analysis result, and the overdue data analysis result.
Optionally, the early-warning unit includes:
A second calculation subunit, for calculating the mean and variance of the data observation index according to the index values;
An interval determining subunit, for determining the confidence interval of the data observation index according to the mean and variance;
A threshold determining subunit, for determining a warning threshold based on the confidence interval;
An early-warning processing subunit, for performing early-warning processing on the data observation index according to the warning threshold to obtain warning information.
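The last two subunits can be sketched as a comparison of current index values against adopted intervals. The shape of the warning records and the function name are hypothetical, and the intervals are taken as already calculated and accepted by business personnel.

```python
def early_warnings(index_values, thresholds):
    """Return warning information for index values outside their threshold bounds.

    index_values: mapping of index name -> current value.
    thresholds: mapping of index name -> (lower, upper) warning thresholds.
    """
    warnings = []
    for index_id, value in index_values.items():
        bounds = thresholds.get(index_id)
        if bounds is None:
            continue  # No adopted threshold for this index, so nothing to judge.
        lower, upper = bounds
        if value < lower or value > upper:
            warnings.append({
                "index_id": index_id,
                "value": value,
                "lower": lower,
                "upper": upper,
                "direction": "below" if value < lower else "above",
            })
    return warnings


warnings = early_warnings(
    {"feedback_errors": 42, "overdue_records": 3, "manual_entries": 7},
    {"feedback_errors": (5, 30), "overdue_records": (0, 10)},
)
```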
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may refer to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A data quality management method, characterized in that the method comprises:
Configuring data observation indexes to obtain an index allocation table, wherein the data observation indexes characterize the focus points in data reporting;
Calculating the data observation indexes according to the index allocation information in the index allocation table to obtain index values, and generating an index data quality report according to the change data of the index values within a preset time range;
Determining the themes of regulatory reporting data, and performing data analysis on each theme to obtain a themed data quality report;
Determining a warning threshold according to the index values, and performing early-warning processing on the data observation indexes to obtain warning information;
Generating a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the warning information.
2. The method according to claim 1, characterized in that configuring the data observation indexes to obtain the index allocation table comprises:
Obtaining the focus points in data reporting, and defining the focus points as data observation indexes;
Determining the index dependencies between the data observation indexes according to the association relationships between the focus points;
Verifying the index dependencies to obtain the index dependencies that meet the verification condition;
Configuring the data observation indexes according to the data observation indexes and the verified index dependencies to obtain the index allocation table.
3. The method according to claim 2, characterized in that calculating the data observation indexes according to the index allocation information in the index allocation table to obtain index values, and generating the index data quality report according to the change data of the index values within the preset time range, comprises:
Parsing the index allocation information in the index allocation table to obtain index operation modes;
Judging whether the index allocation table contains an index to be calculated and, if so, determining the index operation mode that matches the index to be calculated;
Performing summary calculation on the index to be calculated according to the index operation mode that matches it, to obtain an index value;
Generating the index data quality report according to the change data of the index value within the preset time range, wherein the index data quality report is used to visually display the data related to the data observation indexes.
4. The method according to claim 1, characterized in that determining the themes of regulatory reporting data and performing data analysis on each theme to obtain the themed data quality report comprises:
Determining the themes of regulatory reporting data, wherein the themes include upstream system data, manually supplemented data, data reporting, feedback errors, and overdue data;
Performing data source analysis and data verification analysis on the upstream system data to obtain an upstream system data analysis result;
Performing data volume statistics and supplementary-entry reason analysis on the manually supplemented data to obtain a manually supplemented data analysis result;
Analyzing the data reporting to obtain a data reporting result, wherein the data reporting result includes the volume of normally reported data and the volume of overdue reported data;
Analyzing the feedback errors according to a preset field dimension to obtain a feedback error analysis result;
Performing statistical analysis on the overdue data to obtain an overdue data analysis result;
Generating the themed data quality report according to the upstream system data analysis result, the manually supplemented data analysis result, the data reporting result, the feedback error analysis result, and the overdue data analysis result.
5. The method according to claim 1, characterized in that determining the warning threshold according to the index values and performing early-warning processing on the data observation indexes to obtain the warning information comprises:
Calculating the mean and variance of the data observation index according to the index values;
Determining the confidence interval of the data observation index according to the mean and variance;
Determining the warning threshold based on the confidence interval;
Performing early-warning processing on the data observation index according to the warning threshold to obtain the warning information.
6. A data quality management system, characterized in that the system comprises:
A configuration unit, for configuring data observation indexes to obtain an index allocation table, wherein the data observation indexes characterize the focus points in data reporting;
An index calculation unit, for calculating the data observation indexes according to the index allocation information in the index allocation table to obtain index values, and generating an index data quality report according to the change data of the index values within a preset time range;
A subject analysis unit, for determining the themes of regulatory reporting data and performing data analysis on each theme to obtain a themed data quality report;
An early-warning unit, for determining a warning threshold according to the index values and performing early-warning processing on the data observation indexes to obtain warning information;
A report generation unit, for generating a data quality monitoring analysis report according to the index data quality report, the themed data quality report, and the warning information.
7. The system according to claim 6, characterized in that the configuration unit comprises:
An obtaining subunit, for obtaining the focus points in data reporting and defining the focus points as data observation indexes;
A relationship determining subunit, for determining the index dependencies between the data observation indexes according to the association relationships between the focus points;
A verification subunit, for verifying the index dependencies to obtain the index dependencies that meet the verification condition;
A configuration subunit, for configuring the data observation indexes according to the data observation indexes and the verified index dependencies to obtain the index allocation table.
8. The system according to claim 7, characterized in that the index calculation unit comprises:
A parsing subunit, for parsing the index allocation information in the index allocation table to obtain index operation modes;
A judging subunit, for judging whether the index allocation table contains an index to be calculated and, if so, determining the index operation mode that matches the index to be calculated;
A first calculation subunit, for performing summary calculation on the index to be calculated according to the index operation mode that matches it, to obtain an index value;
A first report generation subunit, for generating an index data quality report according to the change data of the index value within a preset time range, wherein the index data quality report is used to visually display the data related to the data observation indexes.
9. The system according to claim 6, characterized in that the subject analysis unit comprises:
A theme determining subunit, for determining the themes of regulatory reporting data, wherein the themes include upstream system data, manually supplemented data, data reporting, feedback errors, and overdue data;
A first analysis subunit, for performing data source analysis and data verification analysis on the upstream system data to obtain an upstream system data analysis result;
A second analysis subunit, for performing data volume statistics and supplementary-entry reason analysis on the manually supplemented data to obtain a manually supplemented data analysis result;
A third analysis subunit, for analyzing the data reporting to obtain a data reporting result, wherein the data reporting result includes the volume of normally reported data and the volume of overdue reported data;
A fourth analysis subunit, for analyzing the feedback errors according to a preset field dimension to obtain a feedback error analysis result;
A fifth analysis subunit, for performing statistical analysis on the overdue data to obtain an overdue data analysis result;
A second report generation subunit, for generating a themed data quality report according to the upstream system data analysis result, the manually supplemented data analysis result, the data reporting result, the feedback error analysis result, and the overdue data analysis result.
10. The system according to claim 6, characterized in that the early-warning unit comprises:
A second calculation subunit, for calculating the mean and variance of the data observation index according to the index values;
An interval determining subunit, for determining the confidence interval of the data observation index according to the mean and variance;
A threshold determining subunit, for determining a warning threshold based on the confidence interval;
An early-warning processing subunit, for performing early-warning processing on the data observation index according to the warning threshold to obtain warning information.
CN201811228360.1A 2018-10-22 2018-10-22 Data quality management method and system Active CN109522318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811228360.1A CN109522318B (en) 2018-10-22 2018-10-22 Data quality management method and system

Publications (2)

Publication Number Publication Date
CN109522318A true CN109522318A (en) 2019-03-26
CN109522318B CN109522318B (en) 2022-01-21

Family

ID=65772784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811228360.1A Active CN109522318B (en) 2018-10-22 2018-10-22 Data quality management method and system

Country Status (1)

Country Link
CN (1) CN109522318B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant