CN105653730B - Method and device for inspecting data quality - Google Patents

Method and device for inspecting data quality

Info

Publication number
CN105653730B
Authority
CN
China
Prior art keywords
data
inspection
days
tested
stock
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610069825.8A
Other languages
Chinese (zh)
Other versions
CN105653730A (en)
Inventor
赵维平
李琼
李辉
何海清
卞乃文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN201610069825.8A
Publication of CN105653730A
Application granted
Publication of CN105653730B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • General Factory Administration (AREA)

Abstract

The invention discloses a method and device for inspecting data quality. The method comprises: receiving an inspection request; in response to the inspection request, obtaining an inspection queue that holds data to be inspected; obtaining from a data warehouse first stock data of day t-1 and second stock data of day t+1, the first stock data and the second stock data having different attributes, where t is a positive integer greater than or equal to 2; searching the inspection queue for day-t data corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result; searching the inspection queue for day-t data corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result; and generating an inspection result for the data to be inspected based on the first inspection result and the second inspection result.

Description

Method and device for inspecting data quality
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and device for inspecting data quality.
Background art
As big data technology is gradually applied and developed across industries, quality problems in the source data from the underlying business systems of those industries are gradually being exposed. How to inspect the quality of source data and maintain data stability has become the primary problem to be faced.
In the prior art, data quality is usually inspected by comparison with historical data; that is, the historical data in the data warehouse is compared item by item with the current data in the source system to obtain an inspection result.
This scheme, however, has to compare data held in different environments, and historical data undergoes a degree of transformation on its way from the source system to the data warehouse. The comparison must therefore be based on consistent transformation rules, which means repeating the source-system-to-warehouse processing logic and performing looped comparisons across multiple dimensions. As a result, this way of inspecting data is inefficient.
Summary of the invention
In view of this, the object of the present invention is to provide a method and device for inspecting data quality, so as to solve the technical problem of inefficient data inspection in the prior art.
The present invention provides a method for inspecting data quality, comprising:
receiving an inspection request;
in response to the inspection request, obtaining an inspection queue, the inspection queue holding data to be inspected;
obtaining from a data warehouse first stock data of day t-1 and second stock data of day t+1, the first stock data and the second stock data having different attributes, where t is a positive integer greater than or equal to 2;
searching the inspection queue for day-t data corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result;
searching the inspection queue for day-t data corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result;
generating an inspection result for the data to be inspected based on the first inspection result and the second inspection result.
In the above method, preferably, obtaining the inspection queue comprises:
in response to the inspection request, judging whether data to be inspected exists in a preset preprocessing area; if it exists, sending the data to be inspected in the preprocessing area to a preset inspection queue; if it does not exist, obtaining data to be inspected by means of a preprocessing flow and placing it in the inspection queue.
In the above method, preferably, obtaining from the data warehouse the first stock data of day t-1 and the second stock data of day t+1 comprises:
obtaining the data records of day t-1 from the data warehouse as the first stock data, each data record having a unique identifier and corresponding to a row record in a database entity table;
obtaining the dimension data of the data records of day t+1 from the data warehouse as the second stock data, the dimension data corresponding to the column dimension values of the data records in the entity table.
In the above method, preferably, after the inspection result for the data to be inspected is generated, the method further comprises:
when the inspection result indicates that the data to be inspected is missing day-t data, repairing the data to be inspected;
placing the repaired data to be inspected in the data warehouse.
In the above method, preferably, after the inspection result for the data to be inspected is generated, the method further comprises:
transmitting the inspection result.
The present invention also provides a device for inspecting data quality, comprising:
a request receiving unit, configured to receive an inspection request;
a queue obtaining unit, configured to obtain an inspection queue in response to the inspection request, the inspection queue holding data to be inspected;
a stock acquiring unit, configured to obtain from a data warehouse first stock data of day t-1 and second stock data of day t+1, the first stock data and the second stock data having different attributes, where t is a positive integer greater than or equal to 2;
a first inspection unit, configured to search the inspection queue for day-t data corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result;
a second inspection unit, configured to search the inspection queue for day-t data corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result;
a result generating unit, configured to generate an inspection result for the data to be inspected based on the first inspection result and the second inspection result.
In the above device, preferably, the queue obtaining unit comprises:
a data judging subunit, configured to judge, in response to the inspection request, whether data to be inspected exists in a preset preprocessing area, and to trigger a data transmission subunit if it exists or a data acquisition subunit if it does not;
the data transmission subunit, configured to send the data to be inspected in the preprocessing area to a preset inspection queue;
the data acquisition subunit, configured to obtain data to be inspected by means of a preprocessing flow and place it in the inspection queue.
In the above device, preferably, the stock acquiring unit comprises:
a record obtaining subunit, configured to obtain the data records of day t-1 from the data warehouse as the first stock data, each data record having a unique identifier and corresponding to a row record in a database entity table;
a dimension obtaining subunit, configured to obtain the dimension data of day t+1 from the data warehouse as the second stock data, the dimension data corresponding to the column dimension values of the data records in the entity table.
The above device preferably further comprises:
a data repair unit, configured to repair the data to be inspected when the inspection result indicates that the data to be inspected is missing day-t data;
a data placing unit, configured to place the repaired data to be inspected in the data warehouse.
The above device preferably further comprises:
a result transmission unit, configured to transmit the inspection result.
As can be seen from the above scheme, the method and device for inspecting data quality provided by the present invention compare the data to be inspected, from the angle of different attributes, against the stock data of the preceding day and of the following day, so as to judge whether data is missing and thereby inspect data stability. Because the comparison uses only the stock data of the two adjacent days, the scheme occupies little space and runs quickly even with large data volumes, which improves the efficiency of data quality inspection without affecting the efficiency of the user's other queries.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a data quality inspection method according to Embodiment 1 of the present invention;
Fig. 2 is a partial flowchart of a data quality inspection method according to Embodiment 2 of the present invention;
Fig. 3 is a partial flowchart of a data quality inspection method according to Embodiment 3 of the present invention;
Fig. 4 is a flowchart of a data quality inspection method according to Embodiment 4 of the present invention;
Fig. 5 is a flowchart of a data quality inspection method according to Embodiment 5 of the present invention;
Fig. 6 is a schematic structural diagram of a data quality inspection device according to Embodiment 6 of the present invention;
Fig. 7 is a partial schematic structural diagram of a data quality inspection device according to Embodiment 7 of the present invention;
Fig. 8 is a partial schematic structural diagram of a data quality inspection device according to Embodiment 8 of the present invention;
Fig. 9 is a schematic structural diagram of a data quality inspection device according to Embodiment 9 of the present invention;
Fig. 10 is a schematic structural diagram of a data quality inspection device according to Embodiment 10 of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an implementation of a data quality inspection method according to Embodiment 1 of the present invention. The method is suitable for inspecting the quality of data. Data quality inspection is an indispensable step before data is subsequently analyzed and mined, and is the foundation of data processing work. A slip in any link while information is created, produced, and integrated can cause data quality problems. Data quality inspection ensures the accuracy, completeness, and consistency of data: that data elements conform to their business meaning, that records are complete, and that records persist continuously over their history.
In this embodiment, the method may include the following steps:
Step 101: receive an inspection request.
The inspection request may be generated by a user operating a terminal. It may be a batch inspection request; that is, this embodiment is suitable for batch inspection or concurrent inspection.
Step 102: in response to the inspection request, obtain an inspection queue that holds data to be inspected.
In this embodiment, a batch job group may respond to the inspection request and invoke batch data jobs to perform concurrent data inspection. The data to be inspected is placed in the inspection queue, and this embodiment obtains that queue.
Step 103: obtain from the data warehouse the first stock data of day t-1 and the second stock data of day t+1.
The first stock data and the second stock data have different attributes, and t is a positive integer greater than or equal to 2. That is, the first stock data and the second stock data are data on different attributes in the data warehouse.
It should be noted that stock data here refers to the historical data stored in the tables of the data warehouse, including business information that has already been generated.
Step 104: search the inspection queue for day-t data corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result.
That is, this embodiment determines whether the first stock data of day t-1 survives in the data to be inspected in the inspection queue. This establishes whether the data to be inspected contains day-t data corresponding to the attribute of the first stock data, and hence whether the data to be inspected is missing data on that attribute, which yields the first inspection result.
Step 105: search the inspection queue for day-t data corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result.
That is, this embodiment determines whether data of the second stock data of day t+1 is present in the data to be inspected in the inspection queue. This establishes whether the data to be inspected contains day-t data corresponding to the attribute of the second stock data, and hence whether the data to be inspected is missing data on that attribute, which yields the second inspection result.
Step 106: generate an inspection result for the data to be inspected based on the first inspection result and the second inspection result.
In this way, this embodiment uses the data quality inspection results on the different attributes to generate a quality inspection result that reflects the data stability of the data to be inspected.
As can be seen from the above scheme, the data quality inspection method provided by Embodiment 1 of the present invention compares the data to be inspected, from the angle of different attributes, against the stock data of the preceding day and of the following day, so as to judge whether data is missing and thereby inspect data stability. Because the comparison uses only the stock data of the two adjacent days, the scheme occupies little space and runs quickly even with large data volumes, which improves the efficiency of data quality inspection without affecting the efficiency of the user's other queries.
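As a rough illustration of steps 103 to 106, the following Python sketch combines the two inspection results into one. It is only a sketch under stated assumptions: the function name `inspect_day_t`, the `"id"` key, and the shape of the inputs are hypothetical, and reading "day-t data corresponding to the attribute" as "the identifier is present" (first inspection) and "the column has a non-empty value" (second inspection) is an interpretation that the embodiment does not spell out.

```python
def inspect_day_t(queue, stock_prev, stock_next_dims):
    """Sketch of steps 104 to 106 (all names are hypothetical).

    queue: list of dict rows for day t, each with an "id" key plus columns.
    stock_prev: list of dict rows of day t-1 (first stock data).
    stock_next_dims: dict mapping id -> set of column names that carry
        a value on day t+1 (second stock data).
    """
    by_id = {row["id"]: row for row in queue}

    # First inspection: every day t-1 record should still exist on day t.
    missing_records = sorted(
        row["id"] for row in stock_prev if row["id"] not in by_id
    )

    # Second inspection: every column populated on day t+1 should have
    # a corresponding day-t value (assumption: None means "missing").
    missing_elements = {}
    for rec_id, columns in stock_next_dims.items():
        row = by_id.get(rec_id, {})
        absent = sorted(c for c in columns if row.get(c) is None)
        if absent:
            missing_elements[rec_id] = absent

    # Step 106: combine both results into one inspection result.
    return {
        "missing_records": missing_records,
        "missing_elements": missing_elements,
        "stable": not missing_records and not missing_elements,
    }


queue = [
    {"id": "A", "balance": 10, "rate": None},
    {"id": "B", "balance": 5, "rate": 0.1},
]
stock_prev = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
stock_next_dims = {"A": {"balance", "rate"}, "B": {"balance"}}
result = inspect_day_t(queue, stock_prev, stock_next_dims)
```

In this example record C existed on day t-1 but is absent on day t, and record A lacks a day-t value for `rate`, so the combined result reports the data as unstable.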
Fig. 2 is a flowchart of an implementation of step 102 in a data quality inspection method according to Embodiment 2 of the present invention. Step 102 can be implemented by the following steps:
Step 121: in response to the inspection request, judge whether data to be inspected exists in a preset preprocessing area; if it exists, execute step 122; if it does not exist, execute step 123.
The preprocessing area refers to the actual data storage area inside the data warehouse, that is, the data pool where the data to be inspected resides.
Step 122: send the data to be inspected in the preprocessing area to a preset inspection queue.
Step 123: obtain data to be inspected by means of a preprocessing flow and place it in the inspection queue.
That is, in this embodiment, after the user issues a data quality inspection request, the method responds to the request issued by the user's host by judging whether data to be inspected exists in the preprocessing area. If it exists, the data is sent to the inspection queue for subsequent stability inspection. If it does not exist, the preprocessing flow is invoked to obtain data to be inspected from the database for stability inspection.
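The branching in steps 121 to 123 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the preprocessing area is modeled as a plain Python list, the preprocessing flow as a callable, and the names `acquire_inspection_queue` and `run_preprocessing_flow` are hypothetical.

```python
from collections import deque


def acquire_inspection_queue(preprocessing_area, run_preprocessing_flow):
    """Return an inspection queue holding the data to be inspected."""
    queue = deque()
    if preprocessing_area:
        # Step 122: move pending data from the preprocessing area
        # into the inspection queue, preserving its order.
        while preprocessing_area:
            queue.append(preprocessing_area.pop(0))
    else:
        # Step 123: nothing is pending, so run the preprocessing flow
        # to obtain the data to be inspected.
        queue.extend(run_preprocessing_flow())
    return queue


area = [{"id": "A"}, {"id": "B"}]
from_area = acquire_inspection_queue(area, lambda: [])
from_flow = acquire_inspection_queue([], lambda: [{"id": "X"}])
```

Draining the area as items are queued is a design choice of the sketch, so that the same data is not queued twice by a later request.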
Fig. 3 is a flowchart of an implementation of step 103 in a data quality inspection method according to Embodiment 3 of the present invention. Step 103 can be implemented by the following steps:
Step 131: obtain the data records of day t-1 from the data warehouse as the first stock data.
Each data record has a unique identifier and corresponds to a row record in a database entity table; this data may be called contract-level stock data.
After obtaining the first stock data, this embodiment judges whether the contract-level stock data of day t-1 survives on day t, so as to determine whether the data to be inspected is stable at the contract level. That is, the stability of the data is confirmed by checking whether whole contract-level records survive across the two adjacent days, which yields the first inspection result.
Step 132: obtain the dimension data of the data records of day t+1 from the data warehouse as the second stock data.
The dimension data corresponds to the column dimension values of the data records in the entity table and may cover one or more columns; this data may be called element-level stock data.
After obtaining the second stock data, this embodiment judges whether the element-level stock data of day t+1 has corresponding data on day t, so as to determine whether the data to be inspected is stable at the element level. That is, the stability of the data is confirmed by checking whether a given element of a record survives across the two adjacent days, which yields the second inspection result.
In other words, this embodiment judges whether the data to be inspected is continuous. Specifically, it determines whether the stock data persists continuously in the historical data of the data warehouse, without gaps, dropped records, or sudden surges, and from this judges whether the data to be inspected is stable.
As can be seen from the above scheme, this embodiment performs bidirectional verification based on the stock data of adjacent days (day t-1 with day t, and day t with day t+1), checking both the contract-level stability and the element-level stability of the current day's data. With large data volumes, abnormal data can be located quickly and accurately, so that it does not affect downstream applications.
It should be noted that the range of elements covered by the element-level stability inspection can be adjusted manually and flexibly as the business changes, so that abnormal data can still be caught under varied business changes.
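The difference between contract-level and element-level stock data can be illustrated with a small in-memory SQLite table. Everything here is an assumption made for illustration: the table `snapshot`, its columns, the integer day numbers, and the `ALLOWED_ELEMENTS` whitelist (which stands in for the adjustable element range) do not come from the embodiment.

```python
import sqlite3

ALLOWED_ELEMENTS = {"balance", "rate"}  # hypothetical adjustable element range


def contract_stock(conn, day):
    """Contract-level stock data: identifiers of the rows recorded on `day`."""
    rows = conn.execute(
        "SELECT contract_id FROM snapshot WHERE biz_day = ?", (day,)
    )
    return {r[0] for r in rows}


def element_stock(conn, day, elements):
    """Element-level stock data: chosen column values on `day`, keyed by id."""
    unknown = set(elements) - ALLOWED_ELEMENTS
    if not elements or unknown:
        raise ValueError(f"unsupported elements: {sorted(unknown)}")
    cols = ", ".join(elements)  # safe: every name was checked against the whitelist
    rows = conn.execute(
        f"SELECT contract_id, {cols} FROM snapshot WHERE biz_day = ?", (day,)
    )
    return {r[0]: dict(zip(elements, r[1:])) for r in rows}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshot "
    "(biz_day INTEGER, contract_id TEXT, balance REAL, rate REAL)"
)
conn.executemany(
    "INSERT INTO snapshot VALUES (?, ?, ?, ?)",
    [(1, "A", 10.0, 0.1), (1, "B", 5.0, 0.2), (3, "A", 12.0, 0.1)],
)
first_stock = contract_stock(conn, 1)              # day t-1 with t = 2
second_stock = element_stock(conn, 3, ["balance"])  # day t+1 with t = 2
```

Changing the list passed to `element_stock` (within the whitelist) is how this sketch models adjusting the element inspection range.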
Fig. 4 is a flowchart of an implementation of a data quality inspection method according to Embodiment 4 of the present invention. After step 106, the method may further include the following steps:
Step 107: when the inspection result indicates that the data to be inspected is missing day-t data, repair the data to be inspected.
Alternatively, in this embodiment, when the inspection result indicates that the data to be inspected is missing day-t data, the data to be inspected may first be recorded in a problem log and then repaired as required.
Step 108: place the repaired data to be inspected in the data warehouse.
That is, in this embodiment, when the inspection result indicates that the data to be inspected is unstable on day t, for example when contract-level or element-level data is discontinuous or unstable, the data to be inspected is repaired and completed to obtain stable data, which is then stored in the data warehouse.
As can be seen from the above scheme, this embodiment verifies the contract-level stability and element-level stability of the stock data, covering both positive and negative disturbances. It can therefore detect not only a decline in data quality but also an improvement in data quality, which ensures that the accuracy and consistency of the stock data are judged correctly. Missing data is also repaired and completed, which improves the efficiency of subsequent use of the data.
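Steps 107 and 108, together with the problem-log variant, might be sketched as below. The embodiment does not say how missing data is repaired; carrying the day t-1 record forward is purely an assumption of this sketch, and the names `repair_and_store`, `problem_log`, and `carried_forward` are hypothetical.

```python
def repair_and_store(result, queue, stock_prev, warehouse, problem_log):
    """Repair missing day-t records and place the repaired data in the warehouse."""
    if not result["missing_records"]:
        return list(queue)  # nothing is missing, so nothing is repaired

    # Record the problem first, then repair.
    problem_log.append({"missing_records": list(result["missing_records"])})

    prev_by_id = {row["id"]: row for row in stock_prev}
    repaired = list(queue)
    for rec_id in result["missing_records"]:
        # ASSUMPTION: repair by carrying the day t-1 record forward,
        # flagged so that downstream users can tell it apart.
        repaired.append({**prev_by_id[rec_id], "carried_forward": True})

    # Step 108: place the repaired data in the data warehouse.
    warehouse.extend(repaired)
    return repaired


warehouse, problem_log = [], []
stock_prev = [{"id": "A", "balance": 10}, {"id": "C", "balance": 7}]
queue = [{"id": "A", "balance": 11}]
result = {"missing_records": ["C"]}
repaired = repair_and_store(result, queue, stock_prev, warehouse, problem_log)
```

A real repair rule would depend on the business meaning of the records; the flag on repaired rows is one way to keep such rows auditable.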
Fig. 5 is a flowchart of an implementation of a data quality inspection method according to Embodiment 5 of the present invention. After step 106, the method may further include the following step:
Step 109: transmit the inspection result.
Specifically, in this embodiment the inspection result may be transmitted to the sender of the inspection request, that is, the user terminal. The inspection result may be displayed directly or stored on the user terminal for subsequent retrieval.
Fig. 6 is a schematic structural diagram of a data quality inspection device according to Embodiment 6 of the present invention. The device is used to inspect the quality of data. Data quality inspection is an indispensable step before data is subsequently analyzed and mined, and is the foundation of data processing work. A slip in any link while information is created, produced, and integrated can cause data quality problems. Data quality inspection ensures the accuracy, completeness, and consistency of data: that data elements conform to their business meaning, that records are complete, and that records persist continuously over their history.
Specifically, in this embodiment, the device may include the following structure:
A request receiving unit 601, configured to receive an inspection request.
The inspection request may be generated by a user operating a terminal. It may be a batch inspection request; that is, this embodiment is suitable for batch inspection or concurrent inspection. Specifically, this embodiment may be implemented with a batch task module.
A queue obtaining unit 602, configured to obtain an inspection queue in response to the inspection request, the inspection queue holding data to be inspected.
In this embodiment, a batch job group may respond to the inspection request and invoke batch data jobs to perform concurrent data inspection. The data to be inspected is placed in the inspection queue, and this embodiment obtains that queue.
A stock acquiring unit 603, configured to obtain from the data warehouse the first stock data of day t-1 and the second stock data of day t+1.
The first stock data and the second stock data have different attributes, and t is a positive integer greater than or equal to 2. That is, the first stock data and the second stock data are data on different attributes in the data warehouse.
It should be noted that stock data here refers to the historical data stored in the tables of the data warehouse, including business information that has already been generated.
A first inspection unit 604, configured to search the inspection queue for day-t data corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result.
That is, this embodiment determines whether the first stock data of day t-1 survives in the data to be inspected in the inspection queue. This establishes whether the data to be inspected contains day-t data corresponding to the attribute of the first stock data, and hence whether the data to be inspected is missing data on that attribute, which yields the first inspection result.
A second inspection unit 605, configured to search the inspection queue for day-t data corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result.
That is, this embodiment determines whether data of the second stock data of day t+1 is present in the data to be inspected in the inspection queue. This establishes whether the data to be inspected contains day-t data corresponding to the attribute of the second stock data, and hence whether the data to be inspected is missing data on that attribute, which yields the second inspection result.
A result generating unit 606, configured to generate an inspection result for the data to be inspected based on the first inspection result and the second inspection result.
In this way, this embodiment uses the data quality inspection results on the different attributes to generate a quality inspection result that reflects the data stability of the data to be inspected.
As can be seen from the above scheme, the data quality inspection device provided by Embodiment 6 of the present invention compares the data to be inspected, from the angle of different attributes, against the stock data of the preceding day and of the following day, so as to judge whether data is missing and thereby inspect data stability. Because the comparison uses only the stock data of the two adjacent days, the scheme occupies little space and runs quickly even with large data volumes, which improves the efficiency of data quality inspection without affecting the efficiency of the user's other queries.
Fig. 7 is a schematic structural diagram of the queue obtaining unit 602 in a data quality inspection device according to Embodiment 7 of the present invention. The queue obtaining unit 602 can be implemented by the following structure:
A data judging subunit 621, configured to judge, in response to the inspection request, whether data to be inspected exists in a preset preprocessing area, and to trigger the data transmission subunit 622 if it exists or the data acquisition subunit 623 if it does not.
The preprocessing area refers to the actual data storage area inside the data warehouse, that is, the data pool where the data to be inspected resides.
A data transmission subunit 622, configured to send the data to be inspected in the preprocessing area to a preset inspection queue.
A data acquisition subunit 623, configured to obtain data to be inspected by means of a preprocessing flow and place it in the inspection queue.
That is, in this embodiment, after the user issues a data quality inspection request, the device responds to the request issued by the user's host by judging whether data to be inspected exists in the preprocessing area. If it exists, the data is sent to the inspection queue for subsequent stability inspection. If it does not exist, the preprocessing flow is invoked to obtain data to be inspected from the database for stability inspection.
It is that storage described in a kind of verifying attachment for quality of data that the embodiment of the present invention eight provides obtains list with reference to Fig. 8 The structural schematic diagram of member 603, wherein the storage acquiring unit 603 may include being realized with flowering structure:
Record obtains subelement 631, for obtaining t-1 days data records in data warehouse as the first storage number According to.
Wherein, the data record has the row record in unique mark and correspondence database entity table, is referred to as: Contract data on stock.
It is logical to can use contract grade stability checking component after obtaining this first data on stock for the present embodiment as a result, It crosses and judges whether this t-1 days contract data on stock survived on t, to determine the data to be tested whether in contract grade Stablize, i.e., by examine contract grade it is whole be recorded in the survival in the time on the two of front and back and confirm the stability of data, Obtain the first inspection result.
Dimension obtaining subunit 632, configured to obtain the dimension data of day t+1 in the data warehouse as the second stock data.
Here, the dimension data corresponds to the column dimension values of the data records in the entity table, which may be the dimension values of one or more columns; it is also referred to as element stock data.
Thus, after obtaining the second stock data, this embodiment can use an element-level stability checking component to judge whether the element stock data of day t+1 has corresponding data on day t, and thereby determine whether the data to be tested is stable at the element level. In other words, the stability of the data is confirmed by examining whether a given element of a record survives across two consecutive days, which yields the second inspection result.
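The element-level check can be sketched in the same spirit. The patent does not define "corresponding data" precisely, so this sketch assumes that a column value present on day t+1 should also be present (non-null) for the same record on day t. The names `element_level_check` and `columns` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def element_level_check(
    stock_day_t_plus_1: Dict[str, Row],
    data_day_t: Dict[str, Row],
    columns: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (record key, column) pairs present on day t+1 but absent on day t."""
    checked = list(columns)
    gaps: List[Tuple[str, str]] = []
    for key, next_row in stock_day_t_plus_1.items():
        today_row = data_day_t.get(key, {})
        for col in checked:
            # Assumption: None marks a missing element value.
            if next_row.get(col) is not None and today_row.get(col) is None:
                gaps.append((key, col))
    return gaps
```

An empty list means the checked elements are stable. Passing `columns` explicitly keeps the element test range adjustable per business need.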
That is, this embodiment judges whether the data to be tested has continuity. Specifically, it judges whether the stock data exists continuously in the historical data of the data warehouse, without interruptions, missing data, or sudden surges, and thereby judges whether the data to be tested is stable.
As the above scheme shows, this embodiment performs two-way sampling verification based on the stock data of two adjacent days, namely day t-1 with day t and day t with day t+1, and checks the current day's data for contract-level stability and element-level stability. With large data volumes, abnormal data can be located quickly and accurately, ensuring that it does not affect downstream applications.
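One way to combine the two partial results into a single inspection result is sketched below. The `InspectionResult` structure and the rule that the data is stable only when both checks find nothing missing are assumptions made for illustration; the patent states only that the result is generated from the first and second inspection results.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class InspectionResult:
    # Record keys from day t-1 with no counterpart on day t (first result).
    missing_records: List[str] = field(default_factory=list)
    # (record key, column) pairs from day t+1 with no value on day t (second result).
    missing_elements: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def stable(self) -> bool:
        # Assumed rule: stable only if neither check found a gap.
        return not self.missing_records and not self.missing_elements


def combine_results(
    first: Iterable[str], second: Iterable[Tuple[str, str]]
) -> InspectionResult:
    """Merge the contract-level and element-level findings for day t."""
    return InspectionResult(list(first), list(second))
```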
It should be noted that the range of elements tested for element-level stability can change with the business and can be adjusted flexibly by hand, so that abnormal data can still be caught as business requirements vary.
Referring to Fig. 9, which is a schematic structural diagram of a data quality inspection device provided by Embodiment 9 of the present invention, the device may further include the following structure:
Data repair unit 607, configured to repair the data to be tested when the inspection result indicates that the data to be tested has missing data on day t.
Alternatively, when the inspection result indicates that the data to be tested has missing data on day t, the data repair unit 607 may first record the data to be tested in a problem log and then repair it as required.
Data placing unit 608, configured to place the repaired data to be tested in the data warehouse.
That is, in this embodiment, when the inspection result indicates that the data to be tested is unstable on day t, for example when the data is discontinuous or unstable at the contract level or the element level, the data to be tested is repaired and completed to obtain stable data, which is then stored in the data warehouse.
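The log-then-repair variant can be sketched as follows. The patent does not say how missing data is repaired, so carrying the day t-1 record forward is purely an assumed strategy; `repair_missing`, `problem_log`, and the dictionary layout are likewise hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def repair_missing(
    data_day_t: Dict[str, Row],
    stock_day_t_minus_1: Dict[str, Row],
    missing_keys: List[str],
    problem_log: List[str],
) -> Dict[str, Row]:
    """Log each missing record, then fill it from day t-1 when possible."""
    repaired = dict(data_day_t)  # leave the caller's data untouched
    for key in missing_keys:
        problem_log.append(key)  # record the problem before repairing
        if key in stock_day_t_minus_1:
            # Assumed repair strategy: carry the previous day's record forward.
            repaired[key] = dict(stock_day_t_minus_1[key])
    return repaired
```

The returned mapping is what a data placing step would then write back to the data warehouse for day t; keys with no day t-1 record remain logged but unrepaired.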
As the above scheme shows, by verifying contract-level stability and element-level stability of the stock data, including verification of positive and negative disturbances, this embodiment can not only detect declines in data quality but also capture improvements in data quality. This ensures that the accuracy and consistency of stock data quality are judged reliably, and data with missing parts is repaired and completed, which improves the efficiency of subsequent data use.
Referring to Fig. 10, which is a schematic structural diagram of a data quality inspection device provided by Embodiment 10 of the present invention, the device may further include the following structure:
Result transmission unit 609, configured to transmit the inspection result.
Specifically, in this embodiment the inspection result may be transmitted to the sender of the inspection request, that is, the user terminal. The user terminal may display the inspection result directly or store it for subsequent retrieval.
If the functions described in the method of this embodiment are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computing device (which may be a personal computer, a server, a mobile computing device, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method for inspecting data quality, characterized by comprising:
receiving an inspection request;
in response to the inspection request, obtaining an inspection queue, the inspection queue containing data to be tested;
obtaining first stock data of day t-1 and second stock data of day t+1 in a data warehouse, the first stock data and the second stock data having different attributes, t being a positive integer greater than or equal to 2; wherein obtaining the first stock data of day t-1 and the second stock data of day t+1 in the data warehouse comprises: obtaining the data records of day t-1 in the data warehouse as the first stock data, each data record having a unique identifier and corresponding to a row record in a database entity table; and obtaining the dimension data of the data records of day t+1 in the data warehouse as the second stock data, the dimension data corresponding to the column dimension values of the data records in the entity table;
searching the inspection queue for data of day t corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result;
searching the inspection queue for data of day t corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result;
generating an inspection result for the data to be tested based on the first inspection result and the second inspection result.
2. The method according to claim 1, characterized in that obtaining the current inspection queue comprises:
in response to the inspection request, judging whether data to be tested exists in a preset pretreatment zone; if it exists, sending the data to be tested in the pretreatment zone to a preset inspection queue; if it does not exist, obtaining the data to be tested through a pretreatment processing flow and placing it in the inspection queue.
3. The method according to claim 1, characterized in that, after generating the inspection result for the data to be tested, the method further comprises:
repairing the data to be tested when the inspection result indicates that the data to be tested has missing data on day t;
placing the repaired data to be tested in the data warehouse.
4. The method according to claim 1, characterized in that, after generating the inspection result for the data to be tested, the method further comprises:
transmitting the inspection result.
5. A device for inspecting data quality, characterized by comprising:
a request receiving unit, configured to receive an inspection request;
a queue obtaining unit, configured to obtain, in response to the inspection request, an inspection queue containing data to be tested;
a storage acquiring unit, configured to obtain first stock data of day t-1 and second stock data of day t+1 in a data warehouse, the first stock data and the second stock data having different attributes, t being a positive integer greater than or equal to 2;
wherein the storage acquiring unit comprises: a record obtaining subunit, configured to obtain the data records of day t-1 in the data warehouse as the first stock data, each data record having a unique identifier and corresponding to a row record in a database entity table; and a dimension obtaining subunit, configured to obtain the dimension data of day t+1 in the data warehouse as the second stock data, the dimension data corresponding to the column dimension values of the data records in the entity table;
a first inspection unit, configured to search the inspection queue for data of day t corresponding to the attribute of the first stock data of day t-1, to obtain a first inspection result;
a second inspection unit, configured to search the inspection queue for data of day t corresponding to the attribute of the second stock data of day t+1, to obtain a second inspection result;
a result generating unit, configured to generate an inspection result for the data to be tested based on the first inspection result and the second inspection result.
6. The inspection device according to claim 5, characterized in that the queue obtaining unit comprises:
a data judging subunit, configured to judge, in response to the inspection request, whether data to be tested exists in a preset pretreatment zone, to trigger a data transmission subunit if it exists, and to trigger a data acquisition subunit if it does not exist;
the data transmission subunit, configured to send the data to be tested in the pretreatment zone to a preset inspection queue;
the data acquisition subunit, configured to obtain the data to be tested through a pretreatment processing flow and place it in the inspection queue.
7. The inspection device according to claim 5, characterized by further comprising:
a data repair unit, configured to repair the data to be tested when the inspection result indicates that the data to be tested has missing data on day t;
a data placing unit, configured to place the repaired data to be tested in the data warehouse.
8. The inspection device according to claim 5, characterized by further comprising:
a result transmission unit, configured to transmit the inspection result.
CN201610069825.8A 2016-02-01 2016-02-01 A kind of method of inspection and device of the quality of data Active CN105653730B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610069825.8A CN105653730B (en) 2016-02-01 2016-02-01 A kind of method of inspection and device of the quality of data

Publications (2)

Publication Number Publication Date
CN105653730A CN105653730A (en) 2016-06-08
CN105653730B true CN105653730B (en) 2019-07-09

Family

ID=56489212

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610069825.8A Active CN105653730B (en) 2016-02-01 2016-02-01 A kind of method of inspection and device of the quality of data

Country Status (1)

Country Link
CN (1) CN105653730B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682173B (en) * 2016-12-28 2019-10-18 华南理工大学 A kind of social security big data OLAP preprocess method and on-line analysis querying method
CN108875056B (en) * 2018-06-28 2021-08-13 中国建设银行股份有限公司 Data checking method and device, electronic equipment and readable storage medium
CN110958126B (en) * 2018-09-26 2022-09-06 中国移动通信集团有限公司 Checking method, checking device and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073912A (en) * 2009-11-23 2011-05-25 中国移动通信集团黑龙江有限公司 Data quality control method, device and system
CN103716301A (en) * 2013-12-04 2014-04-09 深圳市华傲数据技术有限公司 Firewall-based data restoration method and system
CN104268064A (en) * 2014-09-11 2015-01-07 百度在线网络技术(北京)有限公司 Abnormity diagnosis method and device of product logs

Also Published As

Publication number Publication date
CN105653730A (en) 2016-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant