CN107092694A - Method and device for generating data quality inspection tasks - Google Patents


Info

Publication number
CN107092694A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710278260.9A
Other languages
Chinese (zh)
Other versions
CN107092694B (en)
Inventor
周万
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN201710278260.9A (CN107092694B)
Priority to CN202010994742.6A (CN112115130A)
Publication of CN107092694A
Application granted
Publication of CN107092694B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

Embodiments of the invention provide a method and device for generating data quality inspection tasks. The method obtains the attribute to be inspected of a dataset to be inspected; obtains, from the preset data elements, the target data element corresponding to that attribute and establishes an association between the dataset and the target data element; determines, from the business rules preset for each data element, the target business rule corresponding to the target data element; and generates the inspection task for the dataset according to the target data element and the target business rule. Because the business rule for the data element corresponding to each attribute to be inspected is set only once rather than repeatedly, redundancy is reduced. Because an inspection task can be generated automatically once the association between the dataset and the target data element is established, setting up inspection tasks is simplified.

Description

Method and device for generating data quality inspection tasks
Technical field
The present application relates to the technical field of data processing, and more particularly to a method and device for generating data quality inspection tasks.
Background
Data management is the process of effectively collecting, storing, processing, and applying data. Collecting data from multiple data sources involves inspecting the quality of the collected data in order to improve it.
Current data quality inspection methods select a dataset to be inspected from a data source, manually configure an inspection task, and inspect the dataset according to that task. The business rule in an inspection task differs with the data type of the dataset: for a numeric type the rule may be a value range; for a string, a length range; for a date type, a date range; and for an enumeration type, a series of enumerated values. Setting up data quality inspection tasks in the prior art is therefore cumbersome.
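The dependence of the rule on the data type can be illustrated with a minimal Python sketch. The rule representation (a dictionary with a "kind" key) and the function name `satisfies` are assumptions made for illustration; the source does not specify how rules are encoded.

```python
from datetime import date

def satisfies(value, rule):
    """Return True if value obeys rule. The rule layout is hypothetical."""
    kind = rule["kind"]
    if kind == "value_range":      # numeric type: value must lie in a range
        return rule["min"] <= value <= rule["max"]
    if kind == "length_range":     # string type: length must lie in a range
        return rule["min"] <= len(value) <= rule["max"]
    if kind == "date_range":       # date type: date must lie in a range
        return rule["start"] <= value <= rule["end"]
    if kind == "enumeration":      # enumeration type: value must be listed
        return value in rule["values"]
    raise ValueError(f"unknown rule kind: {kind!r}")

# Illustrative rules; the bounds are made up.
age_rule = {"kind": "value_range", "min": 0, "max": 150}
date_rule = {"kind": "date_range",
             "start": date(2000, 1, 1), "end": date(2020, 12, 31)}
```

Each kind of rule needs its own parameters, which is why configuring every inspection task by hand becomes tedious.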
Summary of the invention
In view of this, the invention provides a method and device for generating data quality inspection tasks, to overcome the problem that setting up data quality inspection tasks in the prior art is cumbersome.
To achieve the above object, the invention provides the following technical solutions:
A method for generating a data quality inspection task, comprising:
obtaining the attribute to be inspected of a dataset to be inspected, the attribute including the data type and/or the data category of the dataset;
obtaining, from the preset data elements, the target data element corresponding to the attribute to be inspected, and establishing an association between the dataset to be inspected and the target data element, a data element being identification information that characterizes the business rule the corresponding dataset must satisfy;
determining, from the business rules preset for each data element, the target business rule corresponding to the target data element, a business rule characterizing the value range information of datasets that have the corresponding attribute to be inspected;
generating the inspection task for the dataset to be inspected according to the target data element and the target business rule.
Wherein obtaining the attribute to be inspected of the dataset to be inspected comprises:
obtaining the attribute to be inspected from the data table containing the dataset to be inspected;
displaying the attribute to be inspected in a human-computer interaction interface, wherein the interface further includes an associated data element field, which shows that the corresponding dataset to be inspected is associated with the corresponding data element.
Wherein establishing, according to the target attribute information, the association between the dataset to be inspected and the target data element comprises:
when it is detected that the association key of the associated data element field corresponding to the dataset to be inspected is triggered, displaying each pre-stored data element in the human-computer interaction interface;
determining, from the pre-stored data elements, the target data element corresponding to the attribute to be inspected of the dataset;
determining that the dataset to be inspected is associated with the target data element.
Wherein each data element has an identifier (ID), and the ID of a data element is identical to the ID of its corresponding business rule.
Wherein generating the inspection task for the dataset to be inspected according to the target data element and the target business rule comprises:
obtaining, from the preset business rules, the target business rule whose ID is identical to the target ID of the target data element;
generating the inspection task according to the data column name of the dataset to be inspected, the target ID, and the target business rule.
Wherein generating the inspection task for the dataset to be inspected according to the target data element and the target business rule comprises:
determining the user communication information of the user who established the association between the dataset to be inspected and the target data element;
setting an alarm flag for the dataset to be inspected, the alarm flag indicating that, when the dataset does not satisfy the target business rule, an alarm signal is sent to the user having that user communication information;
generating the inspection task according to the target data element, the target business rule, the user communication information, and the alarm flag.
A device for generating a data quality inspection task, comprising:
an obtaining module, configured to obtain the attribute to be inspected of a dataset to be inspected, the attribute including the data type and/or the data category of the dataset;
an establishing module, configured to obtain, from the preset data elements, the target data element corresponding to the attribute to be inspected, and to establish an association between the dataset to be inspected and the target data element, a data element being identification information that characterizes the business rule the corresponding dataset must satisfy;
a determining module, configured to determine, from the business rules preset for each data element, the target business rule corresponding to the target data element, a business rule characterizing the value range information of datasets that have the corresponding attribute to be inspected;
a generating module, configured to generate the inspection task for the dataset to be inspected according to the target data element and the target business rule.
Wherein the obtaining module comprises:
a first obtaining unit, configured to obtain the attribute to be inspected from the data table containing the dataset to be inspected;
a first display unit, configured to display the attribute to be inspected in a human-computer interaction interface, wherein the interface further includes an associated data element field, which shows that the corresponding dataset to be inspected is associated with the corresponding data element;
a second display unit, configured to display each pre-stored data element in the human-computer interaction interface when it is detected that the association key of the associated data element field corresponding to the dataset to be inspected is triggered.
Wherein the establishing module comprises:
a first determining unit, configured to determine, from the pre-stored data elements, the target data element corresponding to the attribute to be inspected of the dataset;
a second determining unit, configured to determine that the dataset to be inspected is associated with the target data element.
Wherein each data element has an identifier (ID), and the ID of a data element is identical to the ID of its corresponding business rule.
Wherein the generating module comprises:
a second obtaining unit, configured to obtain, from the preset business rules, the target business rule whose ID is identical to the target ID of the target data element;
a first generating unit, configured to generate the inspection task according to the data column name of the dataset to be inspected, the target ID, and the target business rule.
Wherein the generating module comprises:
a third determining unit, configured to determine the user communication information of the user who established the association between the dataset to be inspected and the target data element;
an alarm setting unit, configured to set an alarm flag for the dataset to be inspected, the alarm flag indicating that, when the dataset does not satisfy the target business rule, an alarm signal is sent to the user having that user communication information;
a second generating unit, configured to generate the inspection task according to the target data element, the target business rule, the user communication information, and the alarm flag.
As the above technical solutions show, compared with the prior art, the embodiments of the invention provide a method for generating a data quality inspection task that obtains the attribute to be inspected of a dataset to be inspected; obtains, from the preset data elements, the target data element corresponding to that attribute and establishes an association between the dataset and the target data element; determines, from the business rules preset for each data element, the target business rule corresponding to the target data element; and generates the inspection task for the dataset according to the target data element and the target business rule. In the invention, the business rule for the data element corresponding to each attribute to be inspected is set only once rather than repeatedly, which reduces redundancy. Establishing the association between the dataset and the target data element is enough to generate the inspection task automatically, which simplifies setting up inspection tasks.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below are merely embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a data quality inspection system provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of a method for generating a data quality inspection task provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the contents of a data element provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of a data element selection window provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a device for generating a data quality inspection task provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
The method for generating data quality inspection rules provided by the embodiments of the present application can be applied to a data quality inspection system. Fig. 1 is a schematic structural diagram of such a system, which includes a rule generating unit 11 and a data source unit 12, wherein:
The method for generating data quality inspection rules corresponds to a data quality inspection rule generation program, which is installed in the rule generating unit 11.
The data source unit 12 stores the datasets to be inspected, each of which has its own attributes to be inspected. An attribute to be inspected may include the data type and/or the data category of the corresponding dataset.
The data source unit 12 may include a database that stores data tables. A data table contains data columns, and each data column corresponds to a dataset to be inspected. For example, if a column is "name" and contains Zhang San, Li Si, Wang Wu, and Zhao Liu, then the dataset corresponding to that column contains Zhang San, Li Si, Wang Wu, and Zhao Liu.
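The correspondence between a data column and a dataset to be inspected can be sketched as follows. The in-memory table and the helper name `dataset_for_column` are assumptions for illustration; the source only states that the data source unit may include a database.

```python
# A data table modelled as a list of rows (hypothetical in-memory stand-in
# for a table stored in the data source unit).
table = [
    {"id": 1, "name": "Zhang San"},
    {"id": 2, "name": "Li Si"},
    {"id": 3, "name": "Wang Wu"},
    {"id": 4, "name": "Zhao Liu"},
]

def dataset_for_column(table, column_name):
    """Collect every value of one column; that column is the dataset to inspect."""
    return [row[column_name] for row in table]

names = dataset_for_column(table, "name")
```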
A user can open the data quality inspection rule generation program in the rule generating unit 11. The program obtains the attribute to be inspected of a dataset from the data source unit 12 and obtains, from the preset data elements, the target data element corresponding to that attribute; a data element is identification information characterizing the business rule that the corresponding dataset must satisfy. The program then determines, from the business rules preset for each data element, the target business rule corresponding to the target data element; a business rule characterizes the value range information of datasets that have the corresponding attribute to be inspected. The program thereby generates the corresponding inspection task and sends it to the data source unit 12.
The data source unit 12 can then inspect the dataset according to the inspection task.
The rule generating unit 11 and the data source unit 12 may be deployed on the same electronic device or on different electronic devices; the invention does not specifically limit this.
The method for generating data quality inspection rules provided by the embodiments of the present application is described below on the basis of the above system. Fig. 2 is a schematic flowchart of such a method, which includes:
Step S201: Obtain the attribute to be inspected of the dataset to be inspected, the attribute including the data type and/or the data category of the dataset.
In the embodiments of the present application, the data category includes the data column name, which differs between application scenarios. For a school, for example, data categories may include a student's student number, name, and grades, and a teacher's staff number, salary, and educational background. For a public security organ, data categories may include: public security detention facility accident event category code, gender code, marital status code, working status code, code for names of countries and regions of the world, political affiliation code, occupation classification code, service grade code, and so on.
Data types may be integer, Boolean, string, and so on.
Step S202: Obtain, from the preset data elements, the target data element corresponding to the attribute to be inspected, and establish an association between the dataset to be inspected and the target data element, a data element being identification information that characterizes the business rule the corresponding dataset must satisfy.
In the early stage of an enterprise's big data governance, business analysts and data architects survey and select the standard specifications that data elements must follow during governance. Where no ready-made standard exists, the specification of a data element can be designed and defined in advance according to historical data and user requirements, forming the enterprise's big data governance standard.
Machine learning can be performed on historical data (for example, historical datasets to be inspected and historical inspection tasks) and user requirements to obtain the data elements for different application scenarios and the business rule corresponding to each data element. Different user requirements, that is, different application scenarios, lead to different information in a data element and different corresponding business rules. The contents of a data element defined in the specification are illustrated below, taking the public security detention facility accident event category code as an example, as shown in Fig. 3.
A data element is identification information for the business rule that the corresponding dataset to be inspected must satisfy. A data element may include one or more of: an ID (e.g., DE00141), a Chinese name (e.g., public security detention facility accident event category code), the pinyin spelling of the Chinese name (e.g., gong-an-jian-suo-shi-gu-shi-jian-lei-bie-dai-ma), an identifier (e.g., GAJSSGSJLBDM), a version (e.g., 1.0), a data type (the data type of the data element's business rule, e.g., character), a data format (e.g., C2), a business rule (e.g., see Table 1), a submitting organization (e.g., the detention facility administration bureau of the Ministry of Public Security), main drafters (e.g., Zhang San, Li Si), and an approval date (e.g., March 14, 2011).
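The fields listed above can be modelled as a simple record. The following is a minimal sketch, assuming Python field names of our own choosing; making the ID mandatory is also an assumption, since the text says only that a data element may include one or more of these items.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

@dataclass(frozen=True)
class DataElement:
    """Hypothetical record for a data element; field names are illustrative."""
    element_id: str                      # e.g. "DE00141"
    chinese_name: Optional[str] = None
    pinyin: Optional[str] = None
    identifier: Optional[str] = None     # e.g. "GAJSSGSJLBDM"
    version: Optional[str] = None
    data_type: Optional[str] = None      # data type of the business rule
    data_format: Optional[str] = None    # e.g. "C2"
    submitting_org: Optional[str] = None
    drafters: Tuple[str, ...] = ()
    approved_on: Optional[date] = None

# Values taken from the example in the text; omitted fields stay None.
accident_category = DataElement(
    element_id="DE00141",
    pinyin="gong-an-jian-suo-shi-gu-shi-jian-lei-bie-dai-ma",
    identifier="GAJSSGSJLBDM",
    version="1.0",
    data_type="character",
    data_format="C2",
    drafters=("Zhang San", "Li Si"),
    approved_on=date(2011, 3, 14),
)
```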
Optionally, a data element may also include one or more of: a synonymous name (e.g., a description such as "the category code of accident events occurring in public security detention facilities"), an object class term (e.g., category), a relationship, a representation term (e.g., code), a unit of measurement, and a status (e.g., standard).
Step S203: Determine, from the business rules preset for each data element, the target business rule corresponding to the target data element, a business rule characterizing the value range information of datasets that have the corresponding attribute to be inspected.
The rule generating unit provided by the embodiments of the present application may include an attribute database, a data element database, and a business rule database.
The attribute database stores the attributes to be inspected of the datasets, obtained from the data source unit. The data element database stores the preset data elements. The business rule database stores the preset business rule corresponding to each data element.
A business rule is the rule that the values of the corresponding data to be inspected must obey.
Take a person's gender as an example: it can only be male, female, or unknown, and no other type. Assuming the code for male is 00, for female 01, and for unknown 10, the data type of the gender code is enumeration and its value can only be one of 00, 01, and 10. That is, for the data category "gender code", the business rule of the corresponding data element includes: 00, 01, 10.
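The gender code example can be checked with a few lines of Python. This is a minimal sketch; the function name `find_violations` and the sample column are assumptions, while the allowed codes 00, 01, and 10 come from the text.

```python
# Enumerated values of the gender code business rule, as given in the text.
GENDER_CODES = frozenset({"00", "01", "10"})

def find_violations(dataset, allowed_values):
    """Return (position, value) pairs for values outside the enumeration."""
    return [(i, v) for i, v in enumerate(dataset) if v not in allowed_values]

sample = ["00", "01", "11", "10", ""]   # made-up column contents
bad = find_violations(sample, GENDER_CODES)
```

Here `bad` lists the positions of the two values that are not valid gender codes.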
For a more complex data category such as the ID card number, the first 6 digits are the administrative division code, digits 7 to 10 are the year of birth, digits 11 and 12 are the month of birth, and digits 13 and 14 are the day of birth. The business rule for the ID card number can then be as shown in Table 1.
Table 1: Contents of the business rule corresponding to the ID card number
When a value is verified against a rule, positions can be counted in byte order, with the first byte as the first character and the last byte as the last character.
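The positional layout described above for the ID card number can be checked roughly as follows. This sketch covers only the segments named in the text (administrative division code and date of birth); it does not reproduce Table 1, does not validate the division code against any code list, and ignores the remaining digits. The function names and the sample numbers are made up.

```python
from datetime import date

def parse_id_segments(id_number):
    """Split out the division code and birth date by character position."""
    region = id_number[0:6]                       # characters 1-6
    birth = date(int(id_number[6:10]),            # characters 7-10: year
                 int(id_number[10:12]),           # characters 11-12: month
                 int(id_number[12:14]))           # characters 13-14: day
    return region, birth

def segments_look_valid(id_number):
    """True if the first 14 characters are digits forming a real date."""
    head = id_number[:14]
    if len(head) < 14 or not (head.isascii() and head.isdigit()):
        return False
    try:
        parse_id_segments(id_number)
    except ValueError:                            # e.g. month 13 or 30 February
        return False
    return True
```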
Some data types of business rules are shown in Table 2.
Table 2: Some data types of business rules
Some types of value range are shown in Table 3.
Table 3: Some types of value range, with explanations
GB/T 2659 is the code table for names of countries and regions of the world.
As another example, Table 4 shows part of the business rule corresponding to the data category "public security detention facility accident event category code".
Table 4: Part of the business rule corresponding to the public security detention facility accident event category code
Preferably, each data element has an ID, and the ID of a data element is identical to the ID of its corresponding business rule.
Step S204: Generate the inspection task for the dataset to be inspected according to the target data element and the target business rule.
In the prior art, data quality inspection tasks are usually configured by operations personnel, who are unfamiliar with the business rules and find it hard to ensure that the rules are correct. Business staff or data architects must therefore verify each inspection task during configuration, so generating a task requires cooperation among several people and is error-prone. In the embodiments of the present application, a data element is set for the datasets of each attribute to be inspected, and the business rule of each data element is preset. To generate an inspection task, it is only necessary to establish the association between the dataset to be inspected and the target data element; the task is then generated automatically, which removes the need for cooperation among several people and improves accuracy.
In the method provided by the embodiments of the present application, a data element is set in advance for each distinct attribute to be inspected, and a business rule is set for each data element. When a dataset needs to be inspected, the association between the dataset and the target data element is established according to the dataset's attribute to be inspected; the target business rule corresponding to the target data element is then determined from the preset business rules; finally, the inspection task for the dataset is generated according to the target data element and the target business rule. The business rule for the data element of each attribute is set only once rather than repeatedly, which reduces redundancy. Since establishing the association is enough to generate the inspection task automatically, setting up data quality inspection tasks is simplified.
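The flow summarized above can be sketched in Python as follows. The dictionary layout, the ID "DE_GENDER", and the column names are hypothetical; the sketch only illustrates that a rule is stored once under the ID it shares with its data element and is reused by every column associated with that element.

```python
# Business rules keyed by ID; each rule shares its ID with its data element.
# "DE_GENDER" is a made-up ID; the enumerated values come from the gender example.
RULES_BY_ID = {
    "DE_GENDER": {"kind": "enumeration", "values": ("00", "01", "10")},
}

def generate_tasks(associations, rules_by_id):
    """Build one inspection task per (column name -> data element ID) association."""
    tasks = []
    for column_name, element_id in associations.items():
        tasks.append({
            "column": column_name,
            "data_element_id": element_id,
            "rule": rules_by_id[element_id],   # look up the rule by the shared ID
        })
    return tasks

# Two hypothetical columns associated with the same data element.
associations = {"applicant_sex": "DE_GENDER", "guardian_sex": "DE_GENDER"}
tasks = generate_tasks(associations, RULES_BY_ID)
```

Both generated tasks refer to the same rule object, so a correction to the rule needs to be made in one place only.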
The embodiments of the present application also provide an implementation of obtaining the attribute to be inspected of the dataset to be inspected in the method for generating a data quality inspection task. The implementation includes:
Obtaining the attribute to be inspected from the data table containing the dataset to be inspected, and displaying the attribute in a human-computer interaction interface, wherein the interface further includes an associated data element field, which shows that the corresponding dataset to be inspected is associated with the corresponding data element.
The data table may be stored in the data source unit 12.
Fig. 4 is a schematic diagram of a human-computer interaction interface provided by an embodiment of the invention.
Suppose the data table containing the datasets to be inspected records the following fields: sync_seq, sync_stat, zjid, maintext, managementlevel, corporationname, organizationcode, lastdate, createdate, and so on, with each field corresponding to one dataset to be inspected.
The human-computer interaction interface may display: serial number 400, column name 401, column type 402, partition key flag 403, column description 404, associated data element 405, user communication information 406, alarm flag 407, the name 408 of the data table to which the dataset belongs, and so on.
The interface may also include only the column name 401, column type 402, and associated data element 405. Its contents can be decided according to the actual situation; the embodiments of the invention do not specifically limit this.
Column name 401 is the field name of the column in which each dataset to be inspected is located.
The data table in the data source unit 12 may store the attributes to be inspected of each dataset, such as the data type (that is, the column type), the partition key flag, and the column description. These attributes can be obtained from the data source unit 12 and displayed in the human-computer interaction interface.
Column type 402 is the data type of each dataset to be inspected.
The partition key flag 403 indicates whether the column is a key used to partition the data table.
The column description 404 may include: whether the column is a primary key, and/or the Chinese name of the column (for example, the Chinese name of corporation name is "enterprise name", that of organization code is "organization code", and that of management level is "management level"), and/or the creation or modification time of the column, and so on.
Associated data element 405: each dataset to be inspected has a corresponding associated data element field.
In Fig. 4, the dataset corresponding to an attribute to be inspected corresponds to the associated data element on the same row.
The association key of the associated data element field can take several forms: for example, a virtual association button shown at the intersection of each row and the associated data element 405 column; or a blank cell at that intersection (as shown in Fig. 4); or an input box at that intersection. After the user triggers the virtual association button or the blank cell, the preset data elements can be displayed; alternatively, the user can type the ID of a data element into the input box.
Suppose the user needs to set the data element for the dataset with serial number 3 and column name zjid. The user can click the intersection of that row and the associated data element 405 column. From the displayed preset data elements (which may be shown as a data list), the user can select the corresponding data element based on the dataset's attribute to be inspected. Assuming the ID of this data element is DE00141, the data element selection window can display that ID, or the storage path of the data element, and so on.
The association key is used to establish the association between a dataset to be inspected and the corresponding data element. Because the data element ID DE00141 is on the same row as the attribute to be inspected of the dataset, an association between the field zjid and the data element DE00141 is established.
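What happens when the user picks a data element for a row can be sketched as a small handler. The row layout and the function name `associate` are assumptions for illustration; the serial numbers follow the field order listed above (zjid is the third field, lastdate the eighth).

```python
def associate(rows, serial_number, element_id, preset_element_ids):
    """Record the chosen data element on the row and return the association."""
    if element_id not in preset_element_ids:
        raise KeyError(f"unknown data element: {element_id}")
    row = rows[serial_number]
    row["associated_data_element"] = element_id   # same row means associated
    return row["column_name"], element_id

# Hypothetical interface rows, keyed by serial number.
rows = {
    3: {"column_name": "zjid", "associated_data_element": None},
    8: {"column_name": "lastdate", "associated_data_element": None},
}
link = associate(rows, 3, "DE00141", {"DE00141"})
```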
In summary, obtaining the target data element corresponding to the attribute to be checked from the preset data elements, and establishing the association between the data set to be checked and the target data element, includes:
When it is detected that the association key of the associated data element corresponding to the data set to be checked is triggered, displaying the pre-stored data elements in the human-computer interaction interface;
Determining, from the pre-stored data elements, the target data element corresponding to the attribute to be checked of the data set to be checked;
Determining that the data set to be checked is associated with the target data element.
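The association step summarized above can be sketched as a lookup in a catalogue of preset data elements followed by a recorded mapping. This is a minimal illustration under stated assumptions, not the patent's implementation: the `DataElement` class, the `associate` function, the `PRESET_ELEMENTS` catalogue, and the element labels are all hypothetical; only the column name zjid and the ID DE00141 come from the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """A preset data element; element_id plays the role of the data element ID."""
    element_id: str
    label: str  # hypothetical descriptive label, not taken from the patent


# Hypothetical catalogue of preset data elements that the interface would display.
PRESET_ELEMENTS = {
    "DE00141": DataElement("DE00141", "example element A"),
    "DE00142": DataElement("DE00142", "example element B"),
}


def associate(associations, column_name, element_id):
    """Record that the data set in `column_name` is associated with a preset data element."""
    if element_id not in PRESET_ELEMENTS:
        raise KeyError(f"unknown data element: {element_id}")
    associations[column_name] = PRESET_ELEMENTS[element_id]
    return associations


# The user picks DE00141 for the row whose column name is zjid.
associations = associate({}, "zjid", "DE00141")
print(associations["zjid"].element_id)
```

Rejecting IDs that are not in the catalogue mirrors the constraint in the text that the target data element is chosen from the pre-stored data elements.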
User communication information 406 is used to set, for each data set to be checked, the contact details of the person who set up the association between that data set and its data element.
In Fig. 4, the data set to be checked that corresponds to an attribute to be checked is matched with the user communication information in the same row.
User communication information can include the user's name, employee number, department name, mobile phone number, e-mail address, QQ number, WeChat ID, and so on.
In the prior art, when a data quality check finds that a data set to be checked does not satisfy its business rule, operation and maintenance personnel must inform the relevant personnel, which makes their workload heavy and their efficiency low. The embodiment of the present application therefore records the user communication information of the person who established the association between the data set to be checked and the data element. When the data set to be checked does not satisfy the corresponding business rule, an alarm signal is sent to the person identified by that user communication information, so nobody has to look up manually who is responsible for the business rule of that data set, which improves efficiency.
Suppose the maintainer who established the association between data element DE00141 and the data set to be checked with serial number 3 and column name zjid is Zhang San. The user can click the intersection of that row with the user communication information column, and then either enter Zhang San's communication information or select it from the contact list that pops up.
Suppose the maintainer who established the association between data element DE00141 and the data set to be checked with serial number 8 and column name lastdate is Li Si. The user can click the intersection of that row with the user communication information column, and then either enter Li Si's communication information or select it from the contact list that pops up. The result is shown in Fig. 4.
Alarm flag 407 is used to set, for each data set to be checked, whether an alarm is raised when the data set does not satisfy its corresponding business rule.
In Fig. 4, the data set to be checked that corresponds to an attribute to be checked is matched with the alarm flag in the same row.
Suppose the user needs to set the alarm flag for the data set to be checked with serial number 8 and column name lastdate. The user can click the intersection of that row with the alarm flag column, and then either enter "yes" or select "yes" from the drop-down menu that pops up.
After the user has made the relevant settings in the human-computer interaction interface, an inspection task can be generated from that information. The methods for generating an inspection task are described below.
The first method of generating an inspection task.
In the embodiment of the present invention, each data element has an ID, and the ID of a data element is the same as the ID of its corresponding business rule. The data category in the attribute to be checked of the data set to be checked includes the data column name. Generating the inspection task of the data set to be checked according to the target data element and the target business rule includes:
Obtaining, from the preset business rules, the target business rule whose ID is the same as the target ID of the target data element;
Generating the inspection task according to the data column name of the data set to be checked, the target ID, and the target business rule.
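The first method can be sketched as a dictionary lookup keyed by the shared ID, followed by assembling the task from the column name, the target ID, and the rule. This is only an illustrative sketch: the `BUSINESS_RULES` store, the rule representation, the placeholder allowed values, and the `generate_task` function are assumptions made for the example and do not appear in the patent.

```python
# Hypothetical rule store. Per the text, a business rule carries the same ID
# as the data element it belongs to, so the element ID can be used as the key.
BUSINESS_RULES = {
    "DE00141": {"kind": "enum", "allowed": [1, 2, 3]},  # placeholder values
}


def generate_task(column_name, target_element_id, rules=BUSINESS_RULES):
    """Build an inspection task from the column name, the target ID, and the matching rule."""
    rule = rules.get(target_element_id)
    if rule is None:
        raise LookupError(f"no business rule with ID {target_element_id}")
    return {
        "column": column_name,
        "element_id": target_element_id,
        "rule": rule,
    }


task = generate_task("zjid", "DE00141")
print(task)
```

Because the rule ID and the data element ID coincide, no separate mapping table between data elements and rules is needed in this sketch.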
With the first method, the relevant personnel cannot be notified in time when the data set to be checked fails its check.
The second method of generating an inspection task.
Generating the inspection task of the data set to be checked according to the target data element and the target business rule includes:
Determining the user communication information of the person who established the association between the data set to be checked and the target data element;
Setting an alarm flag for the data set to be checked, the alarm flag indicating that an alarm signal is sent to the user identified by the user communication information when the data set to be checked does not satisfy the target business rule;
Generating the inspection task according to the target data element, the target business rule, the user communication information, and the alarm flag.
As in Fig. 4, the data element ID of the data set to be checked with serial number 3 and column name zjid is DE00141; its data type is bigint; its user communication information is that of Zhang San; and its alarm flag is "yes". Assume that the target business rule of this data element is the set of values enumerated in Table 4.
The inspection task can then be as follows:
Column name: zjid; data element ID: DE00141; business rule: value must be in Table 4; user communication information: that of Zhang San; alarm flag: yes
When one or more values in the zjid field are found not to belong to the values in Table 4, a text message, voice call, e-mail, QQ message, WeChat message, or the like is sent to Zhang San according to Zhang San's communication information.
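A task of this second kind, together with the check it drives, might look like the following sketch. It assumes an enumeration rule like the zjid example; the `InspectionTask` class, the `run_task` function, the `notify` callback, the placeholder allowed values standing in for Table 4, and the e-mail address are all hypothetical. How the alarm is actually delivered (text message, call, mail, QQ, WeChat) is left to the callback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InspectionTask:
    column: str                # data column name, e.g. zjid
    element_id: str            # data element ID, shared with the business rule
    allowed_values: frozenset  # enumeration rule (stand-in for Table 4)
    contact: str               # user communication information
    alarm: bool                # alarm flag


def run_task(task, values, notify):
    """Return the values that violate the rule; alert the contact if the alarm flag is set."""
    violations = [v for v in values if v not in task.allowed_values]
    if violations and task.alarm:
        notify(
            task.contact,
            f"{task.column}: {len(violations)} value(s) violate rule {task.element_id}",
        )
    return violations


task = InspectionTask("zjid", "DE00141", frozenset({1, 2, 3}), "zhangsan@example.com", True)
sent = []  # collects (contact, message) pairs instead of really sending anything
violations = run_task(task, [1, 2, 9], lambda contact, message: sent.append((contact, message)))
print(violations, sent)
```

Passing the notification channel as a callback keeps the check itself independent of the channel, which matches the text's open-ended list of ways to reach the contact.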
The data element ID of the data set to be checked with serial number 8 and column name lastdate is DE00142; its data type is DATETIME; its user communication information is that of Li Si; and its alarm flag is "yes". Assume that the business rule of the data set to be checked with serial number 8 is 20101022<value of record<2016112.
The inspection task can then be as follows:
Column name: lastdate; data element ID: DE00142; business rule: 20101022<value of record<2016112; user communication information: that of Li Si; alarm flag: yes
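A range rule such as the one in the lastdate example can be checked with a strict two-sided comparison. The sketch below assumes dates are stored as yyyymmdd integers; the function names are hypothetical, and the upper bound 20161231 used in the example call is a made-up placeholder chosen only so that the comparison is meaningful. It is not a correction of the bound printed in the text.

```python
def in_open_range(value, lower, upper):
    """True when lower < value < upper (both bounds excluded, as in the rule)."""
    return lower < value < upper


def find_out_of_range(records, lower, upper):
    """Return the records that fall outside the open interval (lower, upper)."""
    return [r for r in records if not in_open_range(r, lower, upper)]


# Lower bound taken from the text; upper bound is a placeholder (see lead-in).
bad = find_out_of_range([20101022, 20120101, 20170101], 20101022, 20161231)
print(bad)
```

Note that a record equal to the lower bound is reported as a violation, because the rule uses strict inequalities.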
It can be understood that if the data source unit stores only one data table, the inspection task may omit the identifier of the data set to be checked (the identifier can be the name of the data table to which the data set belongs). When the data source unit includes multiple data tables, the inspection task must also include the identifier of the data set to be checked, so that the data source unit can tell which data table each inspection task generated by rule generation unit 11 targets.
Any of the above methods for generating a data quality inspection task can further include: sending the inspection task to the data source unit, so that the data source unit checks the data set to be checked according to the inspection task.
The embodiment of the present application also provides a data quality inspection task generating device corresponding to the data quality inspection task generating method. The modules and units of the device are described below; for details of each module and unit, refer to the description of the corresponding steps in the method, which is not repeated here.
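The rule about when a task must carry a table identifier can be sketched as follows. This is an assumption-laden illustration: the `build_task` function, the dictionary layout of the task, and the table names `person` and `orders` are hypothetical.

```python
def build_task(column, element_id, rule, source_tables, table=None):
    """Build a task; add the table name only when the source holds several tables."""
    task = {"column": column, "element_id": element_id, "rule": rule}
    if len(source_tables) > 1:
        # With several tables the source cannot tell which one the task targets.
        if table not in source_tables:
            raise ValueError("a valid table name is required when the source has several tables")
        task["table"] = table
    return task


single = build_task("zjid", "DE00141", "in Table 4", ["person"])
multi = build_task("zjid", "DE00141", "in Table 4", ["person", "orders"], table="person")
print(single)
print(multi)
```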
Fig. 5 is a schematic structural diagram of a data quality inspection task generating device provided by the embodiment of the present application. The device includes an acquisition module 51, an establishing module 52, a determining module 53, and a generation module 54, wherein:
Acquisition module 51 is used to obtain the attribute to be checked of the data set to be checked, the attribute to be checked including the data type and/or the data category of the data set to be checked;
Establishing module 52 is used to obtain the target data element corresponding to the attribute to be checked from the preset data elements, and to establish the association between the data set to be checked and the target data element, where a data element characterizes the identification information of the business rule that the corresponding data set to be checked needs to satisfy;
Determining module 53 is used to determine, from the business rules corresponding to the preset data elements, the target business rule corresponding to the target data element, where a business rule characterizes the value range information of data sets belonging to the corresponding attribute to be checked;
Generation module 54 is used to generate the inspection task of the data set to be checked according to the target data element and the target business rule.
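The division of work among the four modules can be sketched as one class with a method per module. This is a rough structural illustration, not the device itself: the class name, method names, the schema dictionary, and the rule representation are hypothetical, and the user's selection in the interface is reduced to passing the chosen element ID as an argument.

```python
class InspectionTaskGenerator:
    """Hypothetical stand-in for the device; one method per module."""

    def __init__(self, data_elements, business_rules):
        self.data_elements = data_elements    # element ID -> description
        self.business_rules = business_rules  # element ID -> rule (IDs are shared)

    def acquire(self, schema, column):
        """Acquisition module: read the attribute to be checked from the table schema."""
        return {"column": column, "data_type": schema[column]}

    def establish(self, attribute, chosen_element_id):
        """Establishing module: associate the data set with the chosen data element."""
        if chosen_element_id not in self.data_elements:
            raise KeyError(f"unknown data element: {chosen_element_id}")
        return (attribute["column"], chosen_element_id)

    def determine(self, association):
        """Determining module: find the business rule of the target data element."""
        return self.business_rules[association[1]]

    def generate(self, association, rule):
        """Generation module: assemble the inspection task."""
        column, element_id = association
        return {"column": column, "element_id": element_id, "rule": rule}


gen = InspectionTaskGenerator(
    {"DE00141": "example element"},
    {"DE00141": {"kind": "enum", "allowed": [1, 2, 3]}},  # placeholder rule
)
attribute = gen.acquire({"zjid": "bigint"}, "zjid")
association = gen.establish(attribute, "DE00141")
task = gen.generate(association, gen.determine(association))
print(task)
```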
Optionally, the acquisition module includes:
A first acquisition unit, used to obtain the attribute to be checked from the data table containing the data set to be checked;
A first display unit, used to display the attribute to be checked in the human-computer interaction interface, wherein the human-computer interaction interface further includes an associated data element, which is used to display the data element associated with its corresponding data set to be checked;
A second display unit, used to display the pre-stored data elements in the human-computer interaction interface when it is detected that the association key of the associated data element corresponding to the data set to be checked is triggered.
Optionally, the establishing module includes:
A first determining unit, used to determine, from the pre-stored data elements, the target data element corresponding to the attribute to be checked of the data set to be checked;
A second determining unit, used to determine that the data set to be checked is associated with the target data element.
Optionally, each data element has an ID, and the ID of a data element is the same as the ID of its corresponding business rule.
Optionally, the generation module 54 includes:
A second acquisition unit, used to obtain, from the preset business rules, the target business rule whose ID is the same as the target ID of the target data element;
A first generation unit, used to generate the inspection task according to the data column name of the data set to be checked, the target ID, and the target business rule.
Optionally, the generation module 54 includes:
A third determining unit, used to determine the user communication information of the person who established the association between the data set to be checked and the target data element;
An alarm setting unit, used to set an alarm flag for the data set to be checked, the alarm flag indicating that an alarm signal is sent to the user identified by the user communication information when the data set to be checked does not satisfy the target business rule;
A second generation unit, used to generate the inspection task according to the target data element, the target business rule, the user communication information, and the alarm flag.
Fig. 6 is a schematic structural diagram of an electronic device provided by the embodiment of the present application. The electronic device includes a processor 61, a communication interface 62, a memory 63, and a communication bus 64;
The processor 61, the communication interface 62, and the memory 63 communicate with one another through the communication bus 64;
Optionally, the communication interface 62 can be the interface of a communication module, such as the interface of a GSM (Global System for Mobile Communications) module;
The processor 61 is used to execute programs;
The memory 63 is used to store programs and data;
A program can include program code, and the program code includes computer operation instructions. The data can include the attribute to be checked of the data set to be checked, the data elements, and the business rule corresponding to each data element.
The processor 61 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiment of the present invention.
The memory 63 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The program can specifically be used to:
Obtain the attribute to be checked of the data set to be checked, the attribute to be checked including the data type and/or the data category of the data set to be checked;
Obtain the target data element corresponding to the attribute to be checked from the preset data elements, and establish the association between the data set to be checked and the target data element, where a data element characterizes the identification information of the business rule that the corresponding data set to be checked needs to satisfy;
Determine, from the business rules corresponding to the preset data elements, the target business rule corresponding to the target data element, where a business rule characterizes the value range information of data sets belonging to the corresponding attribute to be checked;
Generate the inspection task of the data set to be checked according to the target data element and the target business rule.
Finally, it should be noted that relational terms such as first and second are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element introduced by "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes it.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments can be referred to one another.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A method for generating a data quality inspection task, characterized by including:
Obtaining the attribute to be checked of a data set to be checked, the attribute to be checked including the data type and/or the data category of the data set to be checked;
Obtaining the target data element corresponding to the attribute to be checked from preset data elements, and establishing the association between the data set to be checked and the target data element, where a data element characterizes the identification information of the business rule that the corresponding data set to be checked needs to satisfy;
Determining, from the business rules corresponding to the preset data elements, the target business rule corresponding to the target data element, where a business rule characterizes the value range information of data sets belonging to the corresponding attribute to be checked;
Generating the inspection task of the data set to be checked according to the target data element and the target business rule.
2. The method for generating a data quality inspection task according to claim 1, characterized in that obtaining the attribute to be checked of the data set to be checked includes:
Obtaining the attribute to be checked from the data table containing the data set to be checked;
Displaying the attribute to be checked in a human-computer interaction interface, wherein the human-computer interaction interface further includes an associated data element, which is used to display the data element associated with its corresponding data set to be checked.
3. The method for generating a data quality inspection task according to claim 2, characterized in that establishing the association between the data set to be checked and the target data element includes:
When it is detected that the association key of the associated data element corresponding to the data set to be checked is triggered, displaying the pre-stored data elements in the human-computer interaction interface;
Determining, from the pre-stored data elements, the target data element corresponding to the attribute to be checked of the data set to be checked;
Determining that the data set to be checked is associated with the target data element.
4. The method for generating a data quality inspection task according to any one of claims 1 to 3, characterized in that each data element has an ID, and the ID of a data element is the same as the ID of its corresponding business rule.
5. The method for generating a data quality inspection task according to claim 4, characterized in that generating the inspection task of the data set to be checked according to the target data element and the target business rule includes:
Obtaining, from the preset business rules, the target business rule whose ID is the same as the target ID of the target data element;
Generating the inspection task according to the data column name of the data set to be checked, the target ID, and the target business rule.
6. The method for generating a data quality inspection task according to claim 1, characterized in that generating the inspection task of the data set to be checked according to the target data element and the target business rule includes:
Determining the user communication information of the person who established the association between the data set to be checked and the target data element;
Setting an alarm flag for the data set to be checked, the alarm flag indicating that an alarm signal is sent to the user identified by the user communication information when the data set to be checked does not satisfy the target business rule;
Generating the inspection task according to the target data element, the target business rule, the user communication information, and the alarm flag.
7. A device for generating a data quality inspection task, characterized by including:
An acquisition module, used to obtain the attribute to be checked of a data set to be checked, the attribute to be checked including the data type and/or the data category of the data set to be checked;
An establishing module, used to obtain the target data element corresponding to the attribute to be checked from preset data elements, and to establish the association between the data set to be checked and the target data element, where a data element characterizes the identification information of the business rule that the corresponding data set to be checked needs to satisfy;
A determining module, used to determine, from the business rules corresponding to the preset data elements, the target business rule corresponding to the target data element, where a business rule characterizes the value range information of data sets belonging to the corresponding attribute to be checked;
A generation module, used to generate the inspection task of the data set to be checked according to the target data element and the target business rule.
8. The device for generating a data quality inspection task according to claim 7, characterized in that the acquisition module includes:
A first acquisition unit, used to obtain the attribute to be checked from the data table containing the data set to be checked;
A first display unit, used to display the attribute to be checked in a human-computer interaction interface, wherein the human-computer interaction interface further includes an associated data element, which is used to display the data element associated with its corresponding data set to be checked;
A second display unit, used to display the pre-stored data elements in the human-computer interaction interface when it is detected that the association key of the associated data element corresponding to the data set to be checked is triggered.
9. The device for generating a data quality inspection task according to claim 8, characterized in that the establishing module includes:
A first determining unit, used to determine, from the pre-stored data elements, the target data element corresponding to the attribute to be checked of the data set to be checked;
A second determining unit, used to determine that the data set to be checked is associated with the target data element.
10. The device for generating a data quality inspection task according to any one of claims 7 to 9, characterized in that each data element has an ID, and the ID of a data element is the same as the ID of its corresponding business rule.
11. The device for generating a data quality inspection task according to claim 10, characterized in that the generation module includes:
A second acquisition unit, used to obtain, from the preset business rules, the target business rule whose ID is the same as the target ID of the target data element;
A first generation unit, used to generate the inspection task according to the data column name of the data set to be checked, the target ID, and the target business rule.
12. The device for generating a data quality inspection task according to claim 7, characterized in that the generation module includes:
A third determining unit, used to determine the user communication information of the person who established the association between the data set to be checked and the target data element;
An alarm setting unit, used to set an alarm flag for the data set to be checked, the alarm flag indicating that an alarm signal is sent to the user identified by the user communication information when the data set to be checked does not satisfy the target business rule;
A second generation unit, used to generate the inspection task according to the target data element, the target business rule, the user communication information, and the alarm flag.
CN201710278260.9A 2017-04-25 2017-04-25 Data quality inspection task generation method and device Active CN107092694B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710278260.9A CN107092694B (en) 2017-04-25 2017-04-25 Data quality inspection task generation method and device
CN202010994742.6A CN112115130A (en) 2017-04-25 2017-04-25 Method, device, equipment and medium for acquiring data corresponding relation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710278260.9A CN107092694B (en) 2017-04-25 2017-04-25 Data quality inspection task generation method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202010994742.6A Division CN112115130A (en) 2017-04-25 2017-04-25 Method, device, equipment and medium for acquiring data corresponding relation

Publications (2)

Publication Number Publication Date
CN107092694A true CN107092694A (en) 2017-08-25
CN107092694B CN107092694B (en) 2020-10-20

Family

ID=59637075

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710278260.9A Active CN107092694B (en) 2017-04-25 2017-04-25 Data quality inspection task generation method and device
CN202010994742.6A Pending CN112115130A (en) 2017-04-25 2017-04-25 Method, device, equipment and medium for acquiring data corresponding relation

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010994742.6A Pending CN112115130A (en) 2017-04-25 2017-04-25 Method, device, equipment and medium for acquiring data corresponding relation

Country Status (1)

Country Link
CN (2) CN107092694B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040034699A1 (en) * 2002-08-16 2004-02-19 Martina Gotz Managing data integrity using a filter condition
CN101256588A (en) * 2008-03-18 2008-09-03 金蝶软件(中国)有限公司 Method and system for setting acquiesce data riddling plan
CN101515289A (en) * 2009-03-25 2009-08-26 中国工商银行股份有限公司 Device for detecting conventional data file and method thereof
US8463742B1 (en) * 2010-09-17 2013-06-11 Permabit Technology Corp. Managing deduplication of stored data
CN103699693A (en) * 2014-01-10 2014-04-02 中国南方电网有限责任公司 Metadata-based data quality management method and system
CN104766151A (en) * 2014-12-29 2015-07-08 国家电网公司 Quality management and control method for electricity transaction data warehouses and management and control system thereof
CN106203852A (en) * 2016-07-13 2016-12-07 广东电网有限责任公司 Online inspection rule determines method and device, method for processing business and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050204340A1 (en) * 2004-03-10 2005-09-15 Ruminer Michael D. Attribute-based automated business rule identifier and methods of implementing same
US8078643B2 (en) * 2006-11-27 2011-12-13 Sap Ag Schema modeler for generating an efficient database schema
KR20100058445A (en) * 2010-05-24 2010-06-03 (주)위세아이텍 Automatic extracting method of heterogeneous metadata by using rule-based technology and system thereof
CN103246753A (en) * 2013-05-30 2013-08-14 安徽皖通科技股份有限公司 Method for generating entity metadata model according to database structure
CN103514514A (en) * 2013-09-23 2014-01-15 广州供电局有限公司 On-line monitoring method for electricity marketing business data
CN103729713A (en) * 2013-11-06 2014-04-16 远光软件股份有限公司 Audit result display configuration method and device
CN104636484A (en) * 2015-02-16 2015-05-20 广东省公安厅 Monitoring task generating method and device based on data monitoring
CN105701626A (en) * 2016-03-03 2016-06-22 国网浙江省电力公司 Electric marketing inception lean control multi-system integrated method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
QING-XIANG ZHU 等: "Research of tax inspection cases-choice based on association rules in data mining", 《2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS》 *
皮洪琴: "《供电所技能人员培训教材 营销分册》", 30 June 2016, 长沙中南大学出版社 *
肖波 等: "基于模型驱动的中国石化企业数据中心模型架构", 《计算机与自动化工程》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958049A (en) * 2017-11-28 2018-04-24 航天科工智慧产业发展有限公司 A kind of quality of data checking and administration system
CN107958049B (en) * 2017-11-28 2021-09-14 航天科工智慧产业发展有限公司 Data quality inspection management system
CN109271377A (en) * 2018-08-10 2019-01-25 蜜小蜂智慧(北京)科技有限公司 A kind of data quality checking method and device
CN110569234A (en) * 2019-07-30 2019-12-13 深圳市华傲数据技术有限公司 Data checking method and device, electronic equipment and computer readable storage medium
CN111143335A (en) * 2019-11-13 2020-05-12 深圳市华傲数据技术有限公司 Data quality problem discovery method
CN111563074A (en) * 2020-04-28 2020-08-21 厦门市美亚柏科信息股份有限公司 Data quality detection method and system based on multi-dimensional label
CN111563074B (en) * 2020-04-28 2022-05-31 厦门市美亚柏科信息股份有限公司 Data quality detection method and system based on multi-dimensional label
CN112395325A (en) * 2020-11-27 2021-02-23 广州光点信息科技有限公司 Data management method, system, terminal equipment and storage medium
CN112508433A (en) * 2020-12-16 2021-03-16 广东电网有限责任公司惠州供电局 Data inspection method and device for operation and maintenance system
CN113377758A (en) * 2021-06-30 2021-09-10 数字郑州科技有限公司 Data quality auditing engine and auditing method thereof
CN114648316A (en) * 2022-05-18 2022-06-21 国网浙江省电力有限公司 Digital processing method and system based on inspection tag library

Also Published As

Publication number Publication date
CN107092694B (en) 2020-10-20
CN112115130A (en) 2020-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant