CN109993439A - A quality detection method based on government data - Google Patents

A quality detection method based on government data

Publication number
CN109993439A
Authority
CN
China
Prior art keywords
data
quality
task
government
quality testing
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910261220.2A
Other languages
Chinese (zh)
Inventor
李婕
孙延庆
刘建峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Cloud Information Technology Co Ltd
Original Assignee
Shandong Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-04-02
Filing date
2019-04-02
Publication date
2019-07-09

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395: Quality analysis or management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • General Factory Administration (AREA)

Abstract

The present invention relates to a quality detection method based on government data. In this method, quality detection rules are defined in the database, a rule model and distributed quality detection tasks are created, and the quality detection results are tallied after the tasks complete. Alert notifications are sent simultaneously to the person responsible for the problem and to the supervisor, the inspected data is scored, and a quality detection report is generated. Through efficient detection, analysis, alerting, supervision, and scoring of data, the method effectively ensures that data is accurate and conforms to standards. It not only locates problem data and the cause of the problem quickly and accurately, but also provides an interface for handling problems quickly, thereby supporting a better understanding of data, use of data, and mining of data value.

Description

A quality detection method based on government data
Technical field
The present invention relates to the technical field of data quality detection, and in particular to a quality detection method based on government data.
Background art
Government data covers every aspect of social management and public services and carries a high degree of authority. Governments at all levels hold 80% of society's data and are the largest data owners; this large body of data resources urgently needs to be opened to, shared with, and used by government and society.
Government data is itself an information resource. It passes through operations such as collection, processing, analysis, storage, and transmission, any of which may introduce anomalies or even errors. Yet government data must be authoritative and accurate, and it spans many sectors, is large in volume, and changes quickly. Efficient, general-purpose quality detection and visual handling of problems across large and diverse government data is therefore especially important for helping government and society recognize, understand, and use data faster and more intuitively.
In view of this, the present invention proposes a quality detection method based on government data that detects data problems effectively. While guaranteeing the accuracy and standardization of government data, it lets government and society find problem data quickly and intuitively. It also provides graphical display and handling of problem data, making it possible to tally problem data quickly, handle it, and monitor how problems are being dealt with.
Summary of the invention
To remedy the shortcomings of the prior art, the present invention provides a simple and efficient quality detection method based on government data.
The present invention is achieved through the following technical solution:
A quality detection method based on government data, characterized in that: quality detection rules are defined in the database; a rule model and distributed quality detection tasks are created; after a quality detection task completes, the quality detection results are tallied; alert notifications are sent simultaneously to the person responsible for the problem and to the supervisor; the inspected data is scored; and a quality detection report is generated.
The method uses the Druid database connection pool to monitor the connection pool and SQL execution, including the connection acquired and released on each call, so that resources are used reasonably. In addition, a large database SQL generation factory is established; when a quality detection task executes, the corresponding factory is selected according to the type of the data source, which ensures normal execution on different databases.
To avoid repeated monitoring, a time pointer offset and a maximum monitoring amount are set during data monitoring, so that each quality detection task completes quickly.
The quality detection rules include data element rules and general rules. Data element rules require the support of the data standards service of the corresponding government platform. General rules are defined in two types, SQL and regular expression; the SQL type provides support for multiple data sources, whose rules are configured and managed in a unified way. During quality detection, the quality detection rule of the corresponding type is selected for processing, so that data sources are configured dynamically and later extension is supported.
The quality detection tasks use a lightweight distributed task management scheme to balance the load across multiple tasks. After a quality detection task is assigned to an executor, the scheduling center triggers the executor to run the task. The scheduling center is implemented on clustered Quartz and supports cluster deployment, and the executors also support cluster deployment. When an executor comes online or goes offline, tasks are redistributed. The lightweight distributed task management scheme reduces the hardware requirements and load of any single server, and prevents a fault on one server from affecting data quality detection.
Each quality detection task is distributed to the executors according to its own configured routing mode. The executors are deployed as a cluster and periodically register themselves with the scheduling center; the scheduling center automatically discovers the registered quality detection tasks and triggers their execution.
The scheduling center also supports manual entry of executor addresses.
The routing modes include first, last, round-robin, and failover; each quality detection task is configured with one routing mode.
The beneficial effects of the present invention are as follows: through efficient detection, analysis, alerting, supervision, and scoring of data, the quality detection method based on government data effectively ensures that data is accurate and conforms to standards. It not only locates problem data and the cause of the problem quickly and accurately, but also provides an interface for handling problems quickly, thereby supporting a better understanding of data, use of data, and mining of data value.
Specific embodiment
To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, the invention is described in detail below with reference to embodiments. The specific embodiments described here serve only to explain the invention and do not limit it.
Because processing in memory consumes substantial resources and executes slowly, the quality detection method based on government data defines the quality detection rules in the database, creates a rule model and distributed quality detection tasks, tallies the quality detection results after the tasks complete, sends alert notifications simultaneously to the person responsible for the problem and to the supervisor, scores the inspected data, and generates a quality detection report.
The method uses the Druid database connection pool to monitor the connection pool and SQL (Structured Query Language) execution, including the connection acquired and released on each call, so that resources are used reasonably. In addition, a large database SQL generation factory is established; when a quality detection task executes, the corresponding factory is selected according to the type of the data source, which ensures normal execution on different databases.
The Druid database connection pool replaces DBCP and C3P0 and provides an efficient, powerful, and scalable database connection pool with the following functions:
(1) Druid has a powerful built-in StatFilter plug-in that collects detailed statistics on SQL execution performance. It can be used to monitor database access performance, which helps when analyzing database access performance in production.
(2) Both DruidDriver and DruidDataSource support PasswordCallback, which allows the database password to be encrypted and helps keep data secure.
(3) SQL execution logging is supported. Druid provides different LogFilter implementations that support Common-Logging, Log4j, and JdkLog; users select the corresponding LogFilter as needed to monitor the application's database access.
(4) JDBC extension is supported. When a user has programming requirements at the JDBC (Java Database Connectivity) layer, JDBC-layer extension plug-ins can easily be written using the Filter mechanism provided by Druid.
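The monitoring idea described above (tracking each connection acquire and release, and collecting per-statement SQL statistics) can be illustrated with a small sketch. This is not Druid's API: `MonitoredPool` and its fields are hypothetical names, and SQLite from the Python standard library stands in for the real data source.

```python
import sqlite3
import time
from contextlib import contextmanager


class MonitoredPool:
    """Tiny connection pool that records borrow/return counts and SQL timings.

    Hypothetical illustration only; it does not reproduce Druid's StatFilter.
    """

    def __init__(self, database, size=2):
        self._idle = [sqlite3.connect(database) for _ in range(size)]
        self.borrowed = 0
        self.returned = 0
        self.sql_stats = {}  # SQL text -> [execution count, total seconds]

    @contextmanager
    def connection(self):
        conn = self._idle.pop()  # raises IndexError if the pool is exhausted
        self.borrowed += 1
        try:
            yield conn
        finally:
            self._idle.append(conn)  # always release, even if the query fails
            self.returned += 1

    def execute(self, sql, params=()):
        with self.connection() as conn:
            start = time.perf_counter()
            rows = conn.execute(sql, params).fetchall()
            stat = self.sql_stats.setdefault(sql, [0, 0.0])
            stat[0] += 1
            stat[1] += time.perf_counter() - start
            return rows


pool = MonitoredPool(":memory:")
pool.execute("SELECT 1")
pool.execute("SELECT 1")
```

Comparing `borrowed` with `returned` shows whether any connection has leaked, and `sql_stats` gives a per-statement view of how often and how long each query ran.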
To avoid repeated monitoring, a time pointer offset and a maximum monitoring amount are set during data monitoring, so that each quality detection task completes quickly.
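The patent does not spell out how the time pointer and the maximum monitoring amount are used. One plausible reading, sketched below, is an incremental scan: each run reads only records updated after the stored pointer, caps the batch size, and then advances the pointer. The table name `records`, the column `updated_at`, and the function `scan_increment` are assumptions made for illustration; the sketch also assumes timestamps are unique, since ties split across a batch boundary would otherwise be skipped.

```python
import sqlite3


def scan_increment(conn, pointer, max_rows):
    """Return (rows, new_pointer) for records updated after `pointer`.

    At most `max_rows` records are read per run, so no record is checked twice
    and a single run stays short.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM records "
        "WHERE updated_at > ? ORDER BY updated_at LIMIT ?",
        (pointer, max_rows),
    ).fetchall()
    new_pointer = rows[-1][2] if rows else pointer
    return rows, new_pointer


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, name TEXT, updated_at INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [(i, f"row{i}", 100 + i) for i in range(5)],
)

first, pointer = scan_increment(conn, pointer=0, max_rows=3)
second, pointer = scan_increment(conn, pointer, max_rows=3)
```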
The quality detection rules include data element rules and general rules. Data element rules require the support of the data standards service of the corresponding government platform. General rules are defined in two types, SQL and regular expression; the SQL type provides support for multiple data sources, whose rules are configured and managed in a unified way. During quality detection, the quality detection rule of the corresponding type is selected for processing, so that data sources are configured dynamically and later extension is supported.
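A minimal sketch of the two general rule types may help. The `Rule` fields, the `citizens` table, and the two sample rules are illustrative assumptions, not the patent's rule model. The SQL rule is evaluated inside the database, while the regular-expression rule is applied to each value; rule expressions are assumed to come from trusted administrators, since they are interpolated into SQL.

```python
import re
import sqlite3
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    kind: str    # "sql" or "regex"
    column: str
    expr: str    # SQL predicate matching bad rows, or regex that valid values must match


def find_problems(conn, table, rule):
    """Return the ids of rows in `table` that violate `rule`."""
    if rule.kind == "sql":
        query = f"SELECT id FROM {table} WHERE {rule.expr} ORDER BY id"
        return [row[0] for row in conn.execute(query)]
    if rule.kind == "regex":
        pattern = re.compile(rule.expr)
        query = f"SELECT id, {rule.column} FROM {table} ORDER BY id"
        return [
            row_id
            for row_id, value in conn.execute(query)
            if value is None or not pattern.fullmatch(str(value))
        ]
    raise ValueError(f"unknown rule kind: {rule.kind}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citizens (id INTEGER, phone TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO citizens VALUES (?, ?, ?)",
    [(1, "13800000000", 30), (2, "12345", 40), (3, "13900000000", -5)],
)

age_rule = Rule("age_range", "sql", "age", "age < 0 OR age > 150")
phone_rule = Rule("phone_format", "regex", "phone", r"1\d{10}")

bad_age = find_problems(conn, "citizens", age_rule)
bad_phone = find_problems(conn, "citizens", phone_rule)
```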
Checking the correctness of data elements, that is, whether data is accurate, standardized, and complete, is an important part of data quality monitoring. A data element, also called a data type, is a data unit described by a set of attributes such as definition, identifier, representation, and permissible values; it is the smallest unit of data that cannot be subdivided further.
Every sector represented in government data has its own specific data elements. For example, "student category" is a data element. The government stores education information for the whole population, the data volume is very large, and checking it by hand is impractical. Instead, the "student category" data element rule can be selected for the data and all records screened against it; any record whose student category information does not conform to the "student category" data element standard is screened out.
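The "student category" example can be sketched as a data element with a set of permissible values and a screening query that runs in the database. The permissible values below are invented for illustration; in the method described here they would come from the government platform's data standards service. `DataElement` and `screen` are hypothetical names.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    identifier: str
    definition: str
    permissible_values: tuple


# Illustrative values only; the real standard is not given in the source.
STUDENT_CATEGORY = DataElement(
    identifier="student_category",
    definition="Category of a student",
    permissible_values=("undergraduate", "postgraduate", "vocational"),
)


def screen(conn, table, column, element):
    """Return (id, value) for rows whose value is not permitted by `element`."""
    placeholders = ", ".join("?" for _ in element.permissible_values)
    query = (
        f"SELECT id, {column} FROM {table} "
        f"WHERE {column} IS NULL OR {column} NOT IN ({placeholders}) "
        "ORDER BY id"
    )
    return conn.execute(query, element.permissible_values).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(1, "undergraduate"), (2, "undergrad"), (3, None), (4, "vocational")],
)

problems = screen(conn, "students", "category", STUDENT_CATEGORY)
```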
The quality detection tasks use a lightweight distributed task management scheme to balance the load across multiple tasks. After a quality detection task is assigned to an executor, the scheduling center triggers the executor to run the task. The scheduling center is implemented on clustered Quartz and supports cluster deployment, and the executors also support cluster deployment. When an executor comes online or goes offline, tasks are redistributed. The lightweight distributed task management scheme reduces the hardware requirements and load of any single server, and prevents a fault on one server from affecting data quality detection.
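How tasks are redistributed when an executor comes online or goes offline is not detailed in the text. A simple sketch, assuming an even spread over whichever executors are currently online, is shown below; `assign_tasks` and the executor names are hypothetical, and a real deployment built on Quartz would use its own clustering mechanism.

```python
def assign_tasks(tasks, executors):
    """Spread tasks evenly over the executors that are currently online."""
    if not executors:
        raise RuntimeError("no executor online")
    online = sorted(executors)
    assignment = {name: [] for name in online}
    for index, task in enumerate(sorted(tasks)):
        assignment[online[index % len(online)]].append(task)
    return assignment


tasks = ["t1", "t2", "t3", "t4"]

before = assign_tasks(tasks, {"exec-a", "exec-b"})
# A new executor comes online: recompute the assignment.
after_join = assign_tasks(tasks, {"exec-a", "exec-b", "exec-c"})
# One executor fails: the remaining one takes over every task.
after_failure = assign_tasks(tasks, {"exec-b"})
```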
The design and development follow the principles of easy extension, decoupling, and pluggability, so adding a new type of database does not require large changes to the code.
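The SQL generation factory and the pluggable design can be sketched as a registry of per-dialect factories selected by data source type. The class names, the `sample` method, and the `register` helper are illustrative assumptions; the dialect differences shown (`LIMIT` in MySQL, `ROWNUM` in Oracle, `TOP` in SQL Server) are standard.

```python
class SqlFactory:
    """Base factory: statements that are identical across dialects live here."""

    def null_count(self, table, column):
        return f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"

    def sample(self, table, limit):
        raise NotImplementedError


class MySqlFactory(SqlFactory):
    def sample(self, table, limit):
        return f"SELECT * FROM {table} LIMIT {limit}"


class OracleFactory(SqlFactory):
    def sample(self, table, limit):
        return f"SELECT * FROM {table} WHERE ROWNUM <= {limit}"


FACTORIES = {"mysql": MySqlFactory, "oracle": OracleFactory}


def register(source_type, factory_class):
    """Plug in a new database type without touching existing factories."""
    FACTORIES[source_type.lower()] = factory_class


def factory_for(source_type):
    """Select the factory that matches the task's data source type."""
    try:
        return FACTORIES[source_type.lower()]()
    except KeyError:
        raise ValueError(f"unsupported data source type: {source_type}") from None


class SqlServerFactory(SqlFactory):
    def sample(self, table, limit):
        return f"SELECT TOP {limit} * FROM {table}"


register("sqlserver", SqlServerFactory)
```

Adding SQL Server here required only a new subclass and one `register` call, which is the kind of low-impact extension the paragraph above describes.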
Each quality detection task is distributed to the executors according to its own configured routing mode. The executors are deployed as a cluster and periodically register themselves with the scheduling center; the scheduling center automatically discovers the registered quality detection tasks and triggers their execution.
The scheduling center also supports manual entry of executor addresses.
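Periodic self-registration and manually entered addresses can be combined in one registry, as in the sketch below. The heartbeat timeout, the class `ExecutorRegistry`, and the sample addresses are assumptions for illustration; the source does not specify the registration protocol.

```python
class ExecutorRegistry:
    """Tracks executors that register periodically, plus manually entered ones."""

    def __init__(self, timeout=90.0):
        self.timeout = timeout   # seconds without a heartbeat before an executor is dropped
        self._last_seen = {}     # address -> time of the last registration
        self._manual = set()     # addresses entered by an administrator

    def heartbeat(self, address, now):
        self._last_seen[address] = now

    def add_manual(self, address):
        self._manual.add(address)

    def online(self, now):
        alive = {
            address
            for address, seen in self._last_seen.items()
            if now - seen <= self.timeout
        }
        return sorted(alive | self._manual)


registry = ExecutorRegistry(timeout=90.0)
registry.heartbeat("10.0.0.1:9999", now=0)
registry.heartbeat("10.0.0.2:9999", now=60)
registry.add_manual("10.0.0.3:9999")

at_80 = registry.online(now=80)
at_100 = registry.online(now=100)   # the first executor has missed its window
```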
The routing modes include first, last, round-robin, and failover; each quality detection task is configured with one routing mode.
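The four routing modes can be sketched as a small router that is configured once per task. The class `Router`, the mode strings, and the `is_alive` callback are hypothetical; in particular, failover is interpreted here as "use the first executor that responds", which the source does not define precisely.

```python
import itertools


class Router:
    """Picks an executor for a task according to one configured routing mode."""

    MODES = {"first", "last", "round_robin", "failover"}

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown routing mode: {mode}")
        self.mode = mode
        self._counter = itertools.count()

    def pick(self, executors, is_alive=lambda address: True):
        if not executors:
            raise RuntimeError("no executor registered")
        if self.mode == "first":
            return executors[0]
        if self.mode == "last":
            return executors[-1]
        if self.mode == "round_robin":
            return executors[next(self._counter) % len(executors)]
        # failover: walk the list and use the first executor that responds
        for address in executors:
            if is_alive(address):
                return address
        raise RuntimeError("no executor is reachable")


executors = ["exec-a", "exec-b", "exec-c"]

round_robin = Router("round_robin")
rotation = [round_robin.pick(executors) for _ in range(4)]

failover = Router("failover")
fallback = failover.pick(executors, is_alive=lambda address: address != "exec-a")
```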
Based on government catalog data, the quality detection method provides a way to find problem data: on a one-stop interface, data undergoes effective detection from multiple angles, including data element detection and detection of standardization, consistency, and accuracy. The method comprises a complete series of processes: rule definition, quality model, quality inspection, problem alerting, quality analysis, quality supervision, and quality reporting. Faced with large and diverse open government data, the quality detection scheme helps government and society recognize and understand data faster and more intuitively and guides the use of data. It not only locates problem data and the cause of the problem quickly and accurately, but also provides an interface for handling problems quickly.
Because the method processes data in the database, it is more efficient than processing in memory, and it is compatible with all data source types in common use today, which makes it more convenient and widely applicable. For fast, stable, and efficient statistics, distributed task technology is used; it reduces the load on the servers and ensures that tasks on other servers continue to execute normally when one server is under maintenance or down.
Completing the quality detection task and screening out problem data is only the first step. Users need to understand the overall state of the data and handle problems promptly, so problem data must be presented as classified statistics along multiple dimensions, such as rule type, currently used rules, the departments where problems occur, and problem trends. Sending problem alerts simultaneously to the person responsible and to the supervisor notifies the data's author promptly so that the problem can be handled in time, while quality supervision personnel with the designated permissions can track and check the problem to ensure that the data is handled in time. The quality detection report summarizes the data for convenient storage and access, closing the loop on problem handling.
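The statistics, scoring, and dual alerting described above can be sketched as follows. The scoring formula (percentage of rows with no problem) is an assumption, since the source does not define how the score is computed; the record fields and the names `summarize` and `notifications` are likewise hypothetical.

```python
from collections import Counter


def summarize(problems, total_rows):
    """Classify problems by rule and department and compute a simple score."""
    bad_rows = len({p["row_id"] for p in problems})
    # Assumed scoring: share of rows without any problem, on a 0-100 scale.
    score = round(100 * (1 - bad_rows / total_rows), 1) if total_rows else 100.0
    return {
        "by_rule": dict(Counter(p["rule"] for p in problems)),
        "by_department": dict(Counter(p["department"] for p in problems)),
        "score": score,
    }


def notifications(problems, supervisor):
    """Count alerts per recipient: each owner gets their own problems,
    and the supervisor is notified of every problem at the same time."""
    counts = Counter(p["owner"] for p in problems)
    counts[supervisor] += len(problems)
    return dict(counts)


problems = [
    {"row_id": 1, "rule": "phone_format", "department": "education", "owner": "alice"},
    {"row_id": 2, "rule": "phone_format", "department": "education", "owner": "alice"},
    {"row_id": 2, "rule": "age_range", "department": "health", "owner": "bob"},
]

report = summarize(problems, total_rows=10)
alerts = notifications(problems, supervisor="carol")
```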

Claims (8)

1. A quality detection method based on government data, characterized in that: quality detection rules are defined in the database; a rule model and distributed quality detection tasks are created; after a quality detection task completes, the quality detection results are tallied; alert notifications are sent simultaneously to the person responsible for the problem and to the supervisor; the inspected data is scored; and a quality detection report is generated.
2. The quality detection method based on government data according to claim 1, characterized in that: the Druid database connection pool is used to monitor the connection pool and SQL execution, including the connection acquired and released on each call, so that resources are used reasonably; in addition, a large database SQL generation factory is established, and when a quality detection task executes, the corresponding factory is selected according to the type of the data source, which ensures normal execution on different databases.
3. The quality detection method based on government data according to claim 2, characterized in that: to avoid repeated monitoring, a time pointer offset and a maximum monitoring amount are set during data monitoring, so that each quality detection task completes quickly.
4. The quality detection method based on government data according to claim 1 or 3, characterized in that: the quality detection rules include data element rules and general rules; the data element rules require the support of the data standards service of the corresponding government platform; the general rules are defined in two types, SQL and regular expression; the SQL type provides support for multiple data sources, whose rules are configured and managed in a unified way; during quality detection, the quality detection rule of the corresponding type is selected for processing, so that data sources are configured dynamically and later extension is supported.
5. The quality detection method based on government data according to claim 1 or 3, characterized in that: the quality detection tasks use a lightweight distributed task management scheme to balance the load across multiple tasks; after a quality detection task is assigned to an executor, the scheduling center triggers the executor to run the task; the scheduling center is implemented on clustered Quartz and supports cluster deployment, and the executors support cluster deployment; when an executor comes online or goes offline, tasks are redistributed; the lightweight distributed task management scheme reduces the hardware requirements and load of any single server, and prevents a fault on one server from affecting data quality detection.
6. The quality detection method based on government data according to claim 5, characterized in that: each quality detection task is distributed to the executors according to its own configured routing mode; the executors are deployed as a cluster and periodically register themselves with the scheduling center, and the scheduling center automatically discovers the registered quality detection tasks and triggers their execution.
7. The quality detection method based on government data according to claim 6, characterized in that: the scheduling center supports manual entry of executor addresses.
8. The quality detection method based on government data according to claim 6, characterized in that: the routing modes include first, last, round-robin, and failover, and each quality detection task is configured with one routing mode.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910261220.2A (CN109993439A) | 2019-04-02 | 2019-04-02 | A quality detection method based on government data

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910261220.2A (CN109993439A) | 2019-04-02 | 2019-04-02 | A quality detection method based on government data

Publications (1)

Publication Number | Publication Date
CN109993439A | 2019-07-09

Family

ID=67130857

Family Applications (1)

Application Number | Status | Title
CN201910261220.2A (CN109993439A) | Pending | A quality detection method based on government data

Country Status (1)

Country | Link
CN (1) | CN109993439A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190709