CN101719139B - Method for monitoring data quality based on index set - Google Patents

Method for monitoring data quality based on index set

Info

Publication number
CN101719139B
Authority
CN
China
Prior art keywords
checkpoint
index set
index
data
data quality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009102126751A
Other languages
Chinese (zh)
Other versions
CN101719139A (en)
Inventor
万星明
刘树权
余志刚
孙力斌
沈鹏程
兰清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANJING CITY LINKAGE SYSTEM INTEGRATION CO Ltd
Original Assignee
NANJING CITY LINKAGE SYSTEM INTEGRATION CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NANJING CITY LINKAGE SYSTEM INTEGRATION CO Ltd
Priority to CN2009102126751A
Publication of CN101719139A
Application granted
Publication of CN101719139B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

The invention provides a method for monitoring data quality based on index sets. The method comprises the following steps: defining a number of index sets from data indicators; judging, by means of checkpoints, whether the data indicators in an index set are normal; and managing the checkpoints through a separable scheduling module. Every check in the method is centered on the data index set, which can reduce the amount of checking code by 60 percent and avoids tedious, inefficient manual operation. The checkpoint scheduling module is separable and can be reused by any other module.

Description

Data quality monitoring method based on index set
Technical field
The present invention relates to the field of data quality platforms, and in particular to a data quality monitoring method based on index sets.
Background art
At present, existing data quality platforms have no independent mechanism for monitoring data indicators. Obtaining a data indicator requires manual work, for example running commands on a server. This traditional approach involves duplicated effort and lacks flexibility.
Summary of the invention
In view of this, the present invention provides a data quality monitoring method based on index sets that makes data quality monitoring more flexible and more portable.
The present invention proposes a data quality monitoring method based on index sets, comprising the following steps: defining a number of index sets from data indicators; judging, by means of a checkpoint, whether the data indicators in an index set are normal; and managing the checkpoints through a separable scheduling module.
Further, the index set is generated by extracting data indicators with user-defined SQL, and is configured in XML.
Further, the checkpoint consists of a script.
Further, the checkpoint can be extended with built-in packages for sending email and SMS messages.
Further, the checkpoint uses a general-purpose caching package to cache the script startup engine.
Further, the trigger mechanisms of the scheduling module include time triggers and condition triggers.
The beneficial effects of the present invention are as follows. In the data quality monitoring method based on index sets provided by the present invention, every check is centered on the data index set, and every checkpoint is a script; using scripts to make the judgment greatly improves flexibility. The checkpoint scheduling module is a separable module that is not designed specifically for the data quality platform, so it can be applied to any other module and reused. The monitoring method of the present invention can reduce the amount of checking code by 60%, avoids tedious and inefficient manual operation, and allows non-experts to monitor some indicators as well.
Description of drawings
Fig. 1 is a schematic flowchart of the data quality monitoring method based on index sets according to the present invention.
Fig. 2 is a functional block diagram of the method of Fig. 1.
Embodiment
To make the above and other objects, features, and advantages of the present invention clearer and easier to understand, a preferred embodiment is described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of the data quality monitoring method based on index sets according to the present invention. Fig. 2 is a functional block diagram of the method of Fig. 1; please refer to Fig. 1 and Fig. 2 together. As shown in Fig. 1, the method comprises:
Step S10: define a number of index sets from data indicators. In the present embodiment, each index set is generated by extracting data indicators with a user-defined SQL statement. For portability, these indicators are configured in XML. Each indicator is defined as a MAP, and all of the MAPs combined form the required index set. The indicators do not overlap and are independent of one another. The XML is as follows:
<maps>
<map id="report1" db="ods" name="inspection 1">
<![CDATA[
SELECT KEY1 FROM TABLE_A WHERE....
]]>
</map>
</maps>
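As a rough illustration of how such a configuration could be read, the following Python sketch loads every map element into a dictionary keyed by its id. The patent does not describe a loader; the function name `load_index_set`, the second map, and the completed WHERE clause are assumptions made only so the example runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration modelled on the <maps> example above.
# The second map and the WHERE clause are invented for illustration only.
CONFIG = """
<maps>
  <map id="report1" db="ods" name="inspection 1">
    <![CDATA[
      SELECT KEY1 FROM TABLE_A WHERE KEY1 > 0
    ]]>
  </map>
  <map id="report2" db="ods" name="inspection 2">
    <![CDATA[
      SELECT KEY2 FROM TABLE_B
    ]]>
  </map>
</maps>
"""


def load_index_set(xml_text):
    """Return {map id: {"db", "name", "sql"}} for every <map> element."""
    root = ET.fromstring(xml_text)
    index_set = {}
    for node in root.findall("map"):
        map_id = node.get("id")
        # Indicators are meant to be independent, so ids must not repeat.
        if map_id in index_set:
            raise ValueError(f"duplicate map id: {map_id}")
        index_set[map_id] = {
            "db": node.get("db"),
            "name": node.get("name"),
            # CDATA content is exposed as ordinary element text;
            # collapse the surrounding whitespace into single spaces.
            "sql": " ".join((node.text or "").split()),
        }
    return index_set


index_set = load_index_set(CONFIG)
```

Keying the result by map id mirrors the way a checkpoint refers to an indicator by name.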
Step S11: judge, by means of a checkpoint, whether the data indicators in the index set are normal. In the present embodiment, a checkpoint judges whether a data indicator is normal. The checkpoint is a Ruby script. As shown in Fig. 2, if the checkpoint's result is normal, processing continues; if the result is abnormal, an alarm is raised.
A concrete example is as follows:
<checkpoint id="1" name="inspection of XXX">
<![CDATA[
$result.error() if $report1 < $report2
]]>
</checkpoint>
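The judgment made by such a checkpoint can be sketched in Python, purely as an analogue of the Ruby script described in the patent. The `Result` class, the `run_checkpoint` function, and the sample values are assumptions for illustration, not the patented implementation.

```python
class Result:
    """Collects the outcome of one checkpoint run (hypothetical helper)."""

    def __init__(self):
        self.errors = []

    def error(self, message="indicator out of range"):
        self.errors.append(message)

    @property
    def ok(self):
        return not self.errors


def run_checkpoint(script, indicators):
    """Execute a checkpoint script with each indicator bound by its map id."""
    result = Result()
    scope = dict(indicators, result=result)
    # Emptying the builtins keeps the sketch small; it is NOT a security sandbox.
    exec(script, {"__builtins__": {}}, scope)
    return result


# Python counterpart of: $result.error() if $report1 < $report2
CHECK = "if report1 < report2: result.error()"

normal = run_checkpoint(CHECK, {"report1": 120, "report2": 100})
abnormal = run_checkpoint(CHECK, {"report1": 80, "report2": 100})
```

A caller would continue processing when `ok` is true and raise an alarm otherwise.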
Using scripts for the checkpoint's judgment greatly improves the extensibility of the whole data indicator monitoring mechanism. For example, built-in packages for sending email and SMS messages can be added, and a checkpoint can use them as its alarm mechanism when it detects a problem.
However, using scripts in checkpoints has one major drawback: starting a script is very slow, which lowers system efficiency. In the present embodiment, the checkpoint scripts use general-purpose caching packages to cache the script startup engine. As a result, only the first start of the script engine is somewhat slow; later script executions show no perceptible delay.
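The patent does not say which caching package is used or what exactly is cached. As a loose analogue, the Python sketch below caches compiled script objects so that the expensive step happens only on first use. The names `compiled`, `run`, and `compile_calls` are hypothetical.

```python
from functools import lru_cache

compile_calls = 0  # counts real compilations, for demonstration only


@lru_cache(maxsize=128)
def compiled(script):
    """Compile a checkpoint script once and reuse the code object afterwards."""
    global compile_calls
    compile_calls += 1
    return compile(script, "<checkpoint>", "exec")


def run(script, indicators):
    """Run a cached checkpoint script and return the errors it recorded."""
    scope = dict(indicators, errors=[])
    exec(compiled(script), {"__builtins__": {}}, scope)
    return scope["errors"]


SCRIPT = "if report1 < report2: errors.append('report1 below report2')"

first = run(SCRIPT, {"report1": 1, "report2": 2})   # compiles the script
second = run(SCRIPT, {"report1": 3, "report2": 2})  # served from the cache
```

Only the first call pays the compilation cost, which matches the behavior described above: a slow first start followed by fast executions.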
Step S12: manage the checkpoints through a separable scheduling module. In the present embodiment, all checkpoints are managed by one unified scheduling module. The checking mechanism module and the scheduling module are fully decoupled, so the scheduling module is not designed specifically for the data quality platform and can be used by any other module. Other projects can therefore reuse this scheduling platform.
In the present embodiment, time scheduling in the scheduling module is driven by a CRONTAB expression that states when a call is triggered; CRONTAB expressions are highly expressive. Specifically, a CRONTAB expression consists of at least 6 (possibly 7) space-separated time elements. In order, they are:
Second (0~59)
Minute (0~59)
Hour (0~23)
Day of month (0~31, subject to the number of days in the month)
Month (0~11)
Day of week (1~7, where 1 = SUN, or SUN, MON, TUE, WED, THU, FRI, SAT)
Year (1970~2099)
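The field layout listed above can be sketched as a small Python helper that labels the elements of an expression and expands simple list and step values. This is a minimal sketch, not a full parser: the function names are hypothetical, and special symbols such as `?` and `#` are kept as raw strings rather than interpreted.

```python
FIELD_NAMES = ["second", "minute", "hour", "day_of_month",
               "month", "day_of_week", "year"]


def split_cron(expression):
    """Label the 6 or 7 space-separated elements of an expression."""
    fields = expression.split()
    if len(fields) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(fields)}")
    # zip stops at the shorter sequence, so "year" is absent for 6 fields.
    return dict(zip(FIELD_NAMES, fields))


def expand(field, low, high):
    """Expand 'a,b', 'start/step' and '*' into a sorted list of integers."""
    values = set()
    for part in field.split(","):
        if "/" in part:
            start, step = part.split("/")
            start = low if start == "*" else int(start)
            values.update(range(start, high + 1, int(step)))
        elif part == "*":
            values.update(range(low, high + 1))
        else:
            values.add(int(part))
    return sorted(values)


fields = split_cron("0 0/5 14,18 * * ?")
minutes = expand(fields["minute"], 0, 59)  # every 5 minutes from minute 0
hours = expand(fields["hour"], 0, 23)      # 14:00 and 18:00 hours
```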
Two simple examples are as follows:
"0 15 10 ? * 6#3" triggers at 10:15 a.m. on the third Friday of every month;
"0 0/5 14,18 * * ?" triggers every 5 minutes from 2:00 p.m. to 2:55 p.m. and from 6:00 p.m. to 6:55 p.m. every day.
Traditional monitoring relies mainly on time triggers. In the present embodiment, the trigger mechanisms include condition triggers in addition to time triggers. A condition trigger can be fired externally, for example in combination with the ETL flow of the ODS. An external trigger interface can be provided, for instance, so that when an ETL flow finishes, ETL only needs to call this interface to trigger a checkpoint.
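One way to picture such an external trigger interface is a scheduler that knows only event names and callables, which also keeps it independent of the data quality platform. The following Python sketch is an assumption-laden illustration: the `Scheduler` class, the event name `etl_finished`, and the row-count checkpoint are all hypothetical.

```python
from collections import defaultdict


class Scheduler:
    """Knows only event names and callables, nothing about data quality."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Bind a handler (for example a checkpoint) to a named event."""
        self._handlers[event].append(handler)

    def trigger(self, event, **context):
        """External trigger interface: run every handler bound to the event."""
        return [handler(**context) for handler in self._handlers.get(event, [])]


def row_count_checkpoint(loaded_rows, expected_rows):
    """Hypothetical checkpoint: alarm when fewer rows arrived than expected."""
    return "normal" if loaded_rows >= expected_rows else "alarm"


scheduler = Scheduler()
scheduler.on("etl_finished", row_count_checkpoint)

# An ETL flow would make this call when it finishes.
outcome = scheduler.trigger("etl_finished", loaded_rows=950, expected_rows=1000)
```

Because the scheduler receives plain callables, another project could register unrelated jobs with it without touching the checking code.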
Specifically, first deploy the data quality background scheduler on the server and initialize its database; then deploy the data quality indicator monitoring program and the SMS and email alarm service program on the server; finally, deploy the WEB module on the monitoring platform to configure scheduling and checkpoints. This configuration includes setting the scheduling service for each indicator check type, the XML for each indicator check, and the scheduling time for each indicator check, all on the indicator monitoring interface.
The specific embodiment described in the present invention is merely a preferred embodiment and is not intended to limit the scope of the invention. All equivalent changes and modifications made according to the claims of the present invention fall within the technical scope of the present invention.

Claims (1)

1. A data quality monitoring method based on index sets, characterized by comprising the following steps: defining a number of index sets from data indicators, wherein the index set is generated by extracting data indicators with user-defined SQL and is configured in XML, each indicator is defined as a MAP, and all of the MAPs combined form the required index set;
judging, by means of a checkpoint, whether the data indicators in the index set are normal, wherein the checkpoint consists of a script, the checkpoint can be extended with built-in packages for sending email and SMS messages, and the checkpoint uses a general-purpose caching package to cache the script startup engine; and managing the checkpoints through a separable scheduling module, wherein the trigger mechanisms of the scheduling module include time triggers and condition triggers.
CN2009102126751A 2009-11-10 2009-11-10 Method for monitoring data quality based on index set Expired - Fee Related CN101719139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102126751A CN101719139B (en) 2009-11-10 2009-11-10 Method for monitoring data quality based on index set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009102126751A CN101719139B (en) 2009-11-10 2009-11-10 Method for monitoring data quality based on index set

Publications (2)

Publication Number Publication Date
CN101719139A CN101719139A (en) 2010-06-02
CN101719139B true CN101719139B (en) 2012-07-04

Family

ID=42433713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102126751A Expired - Fee Related CN101719139B (en) 2009-11-10 2009-11-10 Method for monitoring data quality based on index set

Country Status (1)

Country Link
CN (1) CN101719139B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829750A (en) * 2018-05-24 2018-11-16 国信优易数据有限公司 A kind of quality of data determines system and method
CN110378599A (en) * 2019-07-22 2019-10-25 精英数智科技股份有限公司 Ranking method, system, equipment and the computer storage medium of accident prevention quality

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1547145A (en) * 2003-12-08 2004-11-17 西安交通大学 Dynamic detecting and ensuring method for equipment operating status data quality
CN101118550A (en) * 2007-09-04 2008-02-06 山东浪潮齐鲁软件产业股份有限公司 Application data quality detecting method

Also Published As

Publication number Publication date
CN101719139A (en) 2010-06-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120704

Termination date: 20211110