CN110059083A - Data evaluation method, apparatus, and electronic device - Google Patents
- Publication number: CN110059083A
- Application number: CN201910337642.3A
- Authority: CN (China)
- Prior art keywords: data, evaluated, evaluation, quality, preset
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention provides a data evaluation method, apparatus, and electronic device, and relates to the technical field of data processing. The method comprises: obtaining data to be evaluated; automatically evaluating the data quality of the data to be evaluated according to a preset evaluation rule, in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; and determining the data quality grade of the data to be evaluated according to the data evaluation result. The application alleviates the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
Description
Technical field
The present invention relates to the technical field of data processing, and more particularly to a data evaluation method, apparatus, and electronic device.
Background art
At present, big data brings massive, diverse, and unstructured data, enabling users to carry out broader and deeper analysis. However, data analysis is meaningful only when it is built on high-quality data. A common problem in enterprises is that, during the acquisition, use, management, and maintenance of data, business specifications are missing or poorly enforced, so the data lacks accuracy. Establishing a data quality evaluation system is therefore essential. A traditional data quality evaluation system requires a large amount of manpower for monitoring; its cost rises and its efficiency falls as the data volume grows.
Summary of the invention
In view of this, the purpose of the present invention is to provide a data evaluation method, apparatus, and electronic device, so as to alleviate the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
In a first aspect, an embodiment of the present invention provides a data evaluation method, comprising: obtaining data to be evaluated; automatically evaluating the data quality of the data to be evaluated according to a preset evaluation rule, in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; and determining the data quality grade of the data to be evaluated according to the data evaluation result.
Further, the evaluation rule includes evaluation parameters. Automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated, comprises: comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule to obtain at least one comparison result, wherein the evaluation parameters include at least one of the following: data volume, data accuracy, and data timeliness; and determining the data evaluation result of the data to be evaluated according to the at least one comparison result.
Further, the evaluation parameters include data timeliness. Comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises: sorting the data to be evaluated and the preset quality comparison data separately by publication time to obtain a first sorting result and a second sorting result, respectively; and determining the timeliness of the data to be evaluated according to the first sorting result and the second sorting result to obtain a first comparison result.
Further, the evaluation parameters include data volume. Comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises: cleaning the data to be evaluated and the preset quality comparison data separately; and comparing the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data to obtain a second comparison result.
Further, the method also comprises: extracting the data to be evaluated from a target message queue according to a preset automated test rule; and automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated.
Further, determining the data quality of the data to be evaluated according to the data evaluation result comprises: if the data evaluation result is greater than or equal to a preset threshold, determining that the data quality of the data to be evaluated is higher than a preset quality and setting a first preset flag for the data to be evaluated; and if the data evaluation result is less than the preset threshold, generating a data warning message and setting a second preset flag for the data to be evaluated.
Further, obtaining the data to be evaluated comprises: classifying the companies according to the importance of each company to obtain multiple groups; determining, in a preset data set, the data belonging to each group to obtain target data sets; and extracting data from each target data set according to a preset extraction ratio to obtain the data to be evaluated.
Further, the method also comprises: generating a target statistical report according to the data evaluation result, wherein the target statistical report includes the data evaluation result and/or historical data evaluation results.
In a second aspect, an embodiment of the present invention provides a data evaluation apparatus, comprising: an acquiring unit, configured to obtain data to be evaluated; an evaluation unit, configured to automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule, in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; and a determination unit, configured to determine the data quality grade of the data to be evaluated according to the data evaluation result.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the processor, when executing the computer program, implements the steps of the method described in any one of the implementations of the first aspect.
In the embodiments of the present invention, the data to be evaluated is first obtained; then the data quality of the data to be evaluated is automatically evaluated according to the preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; finally, the data quality grade of the data to be evaluated is determined according to the data evaluation result. As can be seen from the above description, a traditional data quality evaluation system requires a large amount of manpower and material resources, whereas the method provided by the present invention can monitor data quality in real time without manual participation, greatly reducing cost and improving efficiency. Further, by establishing effective evaluation rules, existing problems can be addressed in a targeted manner and data quality can be improved, which alleviates the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
Other features and advantages of the present invention will be set forth in the following description and, in part, will become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention are realized and obtained by the structure particularly pointed out in the description, the claims, and the drawings.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a structural schematic diagram of an electronic device according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data evaluation method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another data evaluation method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a data evaluation apparatus according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment one:
First, an exemplary electronic device 100 for implementing the data evaluation method and apparatus of the embodiments of the present invention is described with reference to Fig. 1.
As shown in Fig. 1, the electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and a data collector 110. These components are interconnected through a bus system 112 and/or another form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in Fig. 1 are only exemplary and not limiting; the electronic device may have other components and structures as needed.
The processor 102 can be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and can control the other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions can be stored on the computer-readable storage medium, and the processor 102 can run the program instructions to implement the client functionality (implemented by the processor) in the embodiments of the present invention described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, can also be stored in the computer-readable storage medium.
The input device 106 can be a device that a user uses to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 can output various information (for example, images or sound) to the outside (for example, to a user), and may include one or more of a display, a loudspeaker, and the like.
The data collector 110 can collect the data to be evaluated and store it in the storage device 104 for use by other components.
Illustratively, the exemplary electronic device for implementing the data evaluation method according to an embodiment of the present invention can be implemented on a device such as a server.
Embodiment two:
According to an embodiment of the present invention, an embodiment of a data evaluation method is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in an order different from the one given here.
Fig. 2 is a flowchart of a data evaluation method according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps S202 to S206.
It should be noted that, in this embodiment, the method described in steps S202 to S206 can be executed in an automated testing tool. Optionally, the automated testing tool can be the Selenium automated testing tool; other automated testing tools can also be chosen, and this embodiment places no specific limit on the choice.
Step S202: obtain the data to be evaluated.
In this embodiment, the data to be evaluated can be obtained by an automated testing tool (for example, the Selenium automated testing tool). The data to be evaluated can be any kind of data, for example court judgment documents or user behavior data; this application places no specific limit on the kind of data to be evaluated.
Step S204: automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated, to obtain a data evaluation result.
In this embodiment, after the automated testing tool obtains the data to be evaluated, it can obtain the preset quality comparison data and compare the quality of the data to be evaluated against that of the preset quality comparison data, thereby automatically evaluating the data quality of the data to be evaluated.
It should be noted that the preset quality comparison data and the data to be evaluated are data of the same type. For example, if the data to be evaluated is the court judgment documents of company A, then the preset quality comparison data is the court judgment documents of company B.
In this embodiment, the data evaluation result can reflect the quality of the data to be evaluated along at least one data dimension, for example along the following dimensions: data volume, data accuracy, and data timeliness.
In an optional embodiment, the data evaluation result may include the following: a total value reflecting the data quality of the data to be evaluated, and one or more sub-values reflecting the data quality of the data to be evaluated along each dimension, wherein the one or more sub-values are used to determine the total value.
Step S206: determine the data quality grade of the data to be evaluated according to the data evaluation result.
In this embodiment, after obtaining the data evaluation result, the automated testing tool can determine the data quality grade of the data to be evaluated according to that result.
The data quality grades can be grades preset by the user. For example, the data quality grades can be divided into low-quality data and high-quality data, or into data of a first quality grade, data of a second quality grade, and data of a third quality grade. This embodiment places no specific limit on the rule for dividing data quality grades, which can be set according to the preset threshold described in the following embodiments.
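The threshold-based grading described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `quality_grade`, the score range, and the threshold values are hypothetical and are not specified in this application.

```python
from typing import Sequence


def quality_grade(score: float, thresholds: Sequence[float]) -> int:
    """Map a data evaluation result to a quality grade.

    Grade 1 is the highest grade. `thresholds` holds the hypothetical
    preset cut-offs; a score at or above the largest cut-off gets grade 1,
    at or above the next largest gets grade 2, and so on.
    """
    for grade, cutoff in enumerate(sorted(thresholds, reverse=True), start=1):
        if score >= cutoff:
            return grade
    # Below every cut-off: the lowest grade.
    return len(thresholds) + 1


# Two cut-offs give the three-grade division mentioned above.
grades = [quality_grade(s, [0.8, 0.5]) for s in (0.9, 0.6, 0.3)]
```

With a single cut-off the same function yields the two-grade (high-quality and low-quality) division.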
In the embodiments of the present invention, the data to be evaluated is first obtained, wherein there may be one or more items of data to be evaluated; then the data quality of the data to be evaluated is automatically evaluated according to the preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; finally, the data quality grade of the data to be evaluated is determined according to the data evaluation result. As can be seen from the above description, a traditional data quality evaluation system requires a large amount of manpower and material resources, whereas the method provided by the present invention can monitor data quality in real time without manual participation, greatly reducing cost and improving efficiency. Further, by establishing effective evaluation rules, existing problems can be addressed in a targeted manner and data quality can be improved, which alleviates the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
As can be seen from the above description, in this embodiment the data to be evaluated is obtained first. Optionally, the data to be evaluated can be a portion extracted from a mass of data, or it can be the full data set.
If the data to be evaluated is a portion extracted from a mass of data, it can be extracted from the mass data by simple random sampling or by stratified sampling.
As shown in Fig. 3, assuming that the data is extracted by stratified sampling, step S202 (obtaining the data to be evaluated) includes the following steps:
Step S301: classify the companies according to the importance of each company to obtain multiple groups.
In this embodiment, the companies can be classified according to degree parameters that characterize the importance of each company. The degree parameters can be at least one of the following: the registered address of each company, the scale of each company, the enterprise nature of each company (for example, central state-owned enterprise, enterprise or public institution, state-owned enterprise, or individually owned enterprise), whether each company is a listed company, the profit of each company, and similar information.
Optionally, after the companies are classified according to importance, all companies in the database can be divided into three layers: important companies, general companies, and self-employed businesses. For example, well-known companies such as "Baidu" and "Xiaomi", or companies whose registered address is in Beijing, Shanghai, Guangdong, or Shenzhen, are placed in the important-company set; companies with lower page views whose registered address is not in those regions are placed in the general-company set; the last layer consists of enterprises whose company type is self-employed business.
Step S302: determine, in the preset data set, the data belonging to each group to obtain target data sets.
After the companies are classified according to the step described above and multiple groups are obtained, the data belonging to each group can be determined in the preset data set to obtain the target data sets.
For example, the data belonging to "important companies" is determined to obtain target data set A1; the data belonging to "general companies" is determined to obtain target data set A2; and the data belonging to "self-employed businesses" is determined to obtain target data set A3.
Step S303: extract data from each target data set according to a preset extraction ratio to obtain the data to be evaluated.
After the target data sets are obtained according to the step described above, sample data can be drawn from each target data set according to the preset extraction ratio, and the drawn sample data is used as the data to be evaluated. For this sample data, some dimensions are then randomly selected for comparison.
For example, for target data sets A1 to A3, sample data can be drawn from target data set A1 according to the preset extraction ratio as data to be evaluated; sample data can likewise be drawn from target data set A2 according to the preset extraction ratio as data to be evaluated; and sample data can be drawn from target data set A3 according to the preset extraction ratio as data to be evaluated.
As can be seen from the above description, when the data to be evaluated is extracted by stratified sampling, the mass data in the preset data set can be classified according to company attributes and the classified data can then undergo quality evaluation. Quality evaluation can thus be carried out on every type of data to be evaluated, which makes the evaluation of data quality more accurate.
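Steps S301 to S303 can be sketched as follows. This is a minimal sketch under assumptions: the record layout, the `tier_of` classification function, and the rule of drawing at least one record from every non-empty group are hypothetical choices made for illustration and are not stated in this application.

```python
import random
from collections import defaultdict


def stratified_sample(records, tier_of, ratio, seed=None):
    """Group records by tier (S301/S302) and draw `ratio` of each group (S303)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        # tier_of stands in for the importance-based classification.
        groups[tier_of(record)].append(record)

    sample = []
    for members in groups.values():
        # Assumption: draw at least one record from every non-empty group.
        k = max(1, round(len(members) * ratio))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical preset data set: (company name, tier) pairs.
records = (
    [(f"important-{i}", "important") for i in range(10)]
    + [(f"general-{i}", "general") for i in range(20)]
    + [(f"self-employed-{i}", "self-employed") for i in range(5)]
)
to_evaluate = stratified_sample(records, tier_of=lambda r: r[1], ratio=0.2, seed=0)
```

Using the same ratio for every layer keeps each layer represented in proportion to its size; a different ratio per layer would be an equally plausible reading of "preset extraction ratio".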
In this embodiment, after the Selenium automated testing tool extracts the data to be evaluated from the mass data by the method described above, the extracted data can be put into a target message queue (for example, a Redis queue). The Selenium automated testing tool then extracts the data to be evaluated from the target message queue according to a preset automated test rule and, according to the preset evaluation rule and in combination with the preset quality comparison data for the data to be evaluated, automatically evaluates the data quality of the data to be evaluated.
As can be seen from the above description, in this embodiment the Selenium automated testing tool can put the drawn sample data into the Redis queue, where it is consumed by the automated test program of the Selenium automated testing tool. The Redis queue does not need to hold a fixed number of samples at all times; it is only necessary to ensure that there are companies for the program to consume. When consumption reaches a certain amount, a monitoring script refills the queue using the same sampling scheme, thereby completing the whole cycle.
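The consume-and-refill cycle described above can be sketched as follows. To keep the sketch self-contained, a `collections.deque` stands in for the Redis queue; the names `consume_with_refill`, `draw_samples`, and `low_water_mark` are hypothetical, and the refill condition (queue length at or below a low-water mark) is an assumed reading of "when consumption reaches a certain amount".

```python
from collections import deque


def consume_with_refill(queue, draw_samples, evaluate, low_water_mark, max_items):
    """Consume queued samples, refilling the queue whenever it runs low.

    queue:        deque standing in for the Redis queue
    draw_samples: callable returning a fresh batch (the same sampling scheme)
    evaluate:     callable applied to each consumed sample
    """
    processed = 0
    while processed < max_items:
        if len(queue) <= low_water_mark:
            # Role of the monitoring script: top the queue up again.
            queue.extend(draw_samples())
        if not queue:
            break  # nothing left to consume
        evaluate(queue.popleft())
        processed += 1
    return processed


seen = []
pending = deque(["company-a"])
done = consume_with_refill(
    pending,
    draw_samples=lambda: ["company-x", "company-y", "company-z"],
    evaluate=seen.append,
    low_water_mark=0,
    max_items=5,
)
```

In a deployment the consumer and the monitoring script would normally be separate processes sharing the queue; they are folded into one loop here only to keep the example short.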
After the data to be evaluated is obtained in the manner described above, its data quality can be automatically evaluated according to the preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated, to obtain a data evaluation result.
For the data to be evaluated, the comparison can be subdivided by data dimension: the data to be evaluated is compared with the preset quality comparison data (also called competitor-product data) in terms of data accuracy, data timeliness, and data volume to obtain the data evaluation result.
In this embodiment, the evaluation rule is set to include the following evaluation parameters: data volume, data accuracy, and data timeliness.
Based on this, in this embodiment, step S204 (automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule, in combination with the preset quality comparison data for the data to be evaluated) includes the following steps:
Step S2041: compare the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule to obtain at least one comparison result, wherein the evaluation parameters include at least one of the following: data volume, data accuracy, and data timeliness.
Specifically, in this embodiment, the data to be evaluated can be compared with the preset quality comparison data according to the evaluation parameters. For example, the two can be compared in terms of data volume to obtain comparison result B1; they can also be compared in terms of data accuracy to obtain comparison result B2; and they can also be compared in terms of data timeliness to obtain comparison result B3.
Evaluating the data to be evaluated by means of evaluation parameters makes it possible to reflect the quality of the data along multiple dimensions and thus obtain a more accurate data evaluation result.
Step S2042: determine the data evaluation result of the data to be evaluated according to the at least one comparison result.
After at least one comparison result is obtained according to the step described above, a calculation can be performed on the at least one comparison result to obtain the data evaluation result of the data to be evaluated. For example, a weighted sum of the at least one comparison result is calculated to obtain the data evaluation result of the data to be evaluated.
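The weighted summation in step S2042 can be sketched as follows. The numeric sub-scores and the weights are hypothetical values chosen for illustration; this application does not specify how each comparison result is scored or what weights are used.

```python
def overall_score(sub_scores, weights):
    """Weighted sum of per-dimension comparison results.

    sub_scores: dimension name -> numeric comparison result (assumed in [0, 1])
    weights:    dimension name -> weight (assumed to sum to 1)
    """
    return sum(sub_scores[name] * weights[name] for name in sub_scores)


# Hypothetical sub-values for comparison results B1 (volume), B2 (accuracy), B3 (timeliness).
sub_scores = {"volume": 1.0, "accuracy": 0.5, "timeliness": 0.0}
weights = {"volume": 0.25, "accuracy": 0.5, "timeliness": 0.25}
total = overall_score(sub_scores, weights)
```

Here the sub-scores play the role of the sub-values and `total` plays the role of the total value in the data evaluation result.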
If the evaluation parameters include data timeliness, then in step S2041, comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule includes the following steps:
Step S11: sort the data to be evaluated and the preset quality comparison data separately by publication time to obtain a first sorting result and a second sorting result, respectively.
Step S12: determine the timeliness of the data to be evaluated according to the first sorting result and the second sorting result to obtain a first comparison result.
Specifically, when the timeliness of the data to be evaluated is compared with that of the preset quality comparison data, the data to be evaluated and the preset quality comparison data can first be sorted separately by publication time. For example, each is sorted in reverse chronological order of publication time, yielding the first sorting result and the second sorting result, respectively.
After the first sorting result and the second sorting result are obtained, the first N items (for example, the first 100) are selected from the first sorting result as data to be compared C1, and the first N items (for example, the first 100) are selected from the second sorting result as data to be compared C2. The timeliness of C1 and C2 is then compared, where the comparison criterion is to find the item with the most recent publication time among C1 and C2. The data to be compared that contains the item with the most recent publication time has the better timeliness. After the timeliness of C1 and C2 is compared, one comparison result, namely the first comparison result, is obtained; this first comparison result can then be stored in a database for later use.
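Steps S11 and S12 can be sketched as follows. This is a sketch under assumptions: each data set is represented only by its publication dates, both sets are assumed to be non-empty, and the return labels ("evaluated", "reference", "tie") are hypothetical names for the first comparison result.

```python
from datetime import date


def compare_timeliness(evaluated_dates, reference_dates, n=100):
    """Return which data set holds the most recently published item."""
    # S11: sort each set newest-first and keep the first n items (C1 and C2).
    c1 = sorted(evaluated_dates, reverse=True)[:n]
    c2 = sorted(reference_dates, reverse=True)[:n]

    # S12: the set containing the newest item has the better timeliness.
    newest_1, newest_2 = c1[0], c2[0]
    if newest_1 > newest_2:
        return "evaluated"
    if newest_2 > newest_1:
        return "reference"
    return "tie"


first_result = compare_timeliness(
    [date(2019, 3, 1), date(2019, 4, 20), date(2018, 12, 5)],
    [date(2019, 4, 18), date(2019, 1, 9)],
)
```

Because only the single newest item decides the outcome under this criterion, N matters only if a finer comparison (for example, item by item across the top N) is adopted instead.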
If the evaluation parameters include data volume, then in step S2041, comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule includes the following steps:
Step S21: clean the data to be evaluated and the preset quality comparison data separately.
Step S22: compare the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data to obtain a second comparison result.
Specifically, in this embodiment, duplicate data can be removed from the data to be evaluated and from the preset quality comparison data according to a preset data cleaning rule. Interfering data such as dirty data is likewise removed from the data to be evaluated and from the preset quality comparison data according to the preset data cleaning rule. This yields the cleaned data to be evaluated and the cleaned preset quality comparison data, whose data volumes can then be compared.
Court judgment documents are taken as an example below. Assume that judgment document group A is the data to be evaluated and judgment document group B is the preset quality comparison data. First, duplicate data and dirty data are removed from judgment document group A and from judgment document group B according to the preset data cleaning rule, yielding the cleaned group A and the cleaned group B. Next, the data volume of group A is determined from the number of judgment documents it contains, and the data volume of group B is determined in the same way. The second comparison result can then be determined from the data volume of group A and the data volume of group B.
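Steps S21 and S22 can be sketched as follows. The cleaning rule used here (strip whitespace, drop empty records, drop exact duplicates) and the ratio used as the second comparison result are assumptions made for illustration; this application does not spell out the preset data cleaning rule or the form of the comparison result.

```python
def clean(records):
    """Remove duplicates and (assumed) dirty records, preserving order."""
    seen = set()
    cleaned = []
    for record in records:
        text = record.strip()
        if not text or text in seen:
            continue  # empty string counts as dirty data; repeats are duplicates
        seen.add(text)
        cleaned.append(text)
    return cleaned


def compare_volume(evaluated, reference):
    """S21: clean both sets; S22: compare their sizes."""
    n_eval = len(clean(evaluated))
    n_ref = len(clean(reference))
    ratio = n_eval / n_ref if n_ref else None
    return {"evaluated": n_eval, "reference": n_ref, "ratio": ratio}


group_a = ["doc-1", "doc-1", "  ", "doc-2"]       # data to be evaluated
group_b = ["doc-1", "doc-2", "doc-3", "doc-3"]    # preset quality comparison data
second_result = compare_volume(group_a, group_b)
```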
If the evaluation parameters include accuracy, then in step S2041, the process of comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule is as follows:
Specifically, in the present embodiment, available normal data, then, respectively by data to be evaluated and preset quality
Comparison data is compared with normal data, to obtain the repetitive rate between data to be evaluated and normal data, and obtains
Repetitive rate between preset quality comparison data and normal data.To determine third comparison result according to the two repetitive rates,
Wherein, third comparison result is used to characterize the accuracy of data to be evaluated.
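A minimal sketch of the accuracy comparison follows. The text does not define the repetition rate or how the two rates are combined, so this sketch assumes the rate is the fraction of distinct records that also appear in the standard data, and that the third comparison result is the difference between the two rates. The function names are hypothetical.

```python
def repetition_rate(records, standard):
    """Fraction of distinct records that also appear in the standard data."""
    distinct = set(records)
    if not distinct:
        return 0.0
    return len(distinct & set(standard)) / len(distinct)


def accuracy_comparison(evaluated, reference, standard):
    """Compare both data sets against the standard data (assumed form of the third result)."""
    rate_evaluated = repetition_rate(evaluated, standard)
    rate_reference = repetition_rate(reference, standard)
    return {
        "evaluated_rate": rate_evaluated,
        "reference_rate": rate_reference,
        "difference": rate_evaluated - rate_reference,
    }


standard = ["a", "b", "c", "d"]
accuracy = accuracy_comparison(["a", "b", "x", "y"], ["a", "b", "c", "z"], standard)
```

A negative difference in this sketch means the data to be evaluated matches the standard data less often than the comparison data does.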
It should be noted that, in this embodiment, before the quality of the data to be evaluated is evaluated according to the preset evaluation rule, the environment required by the selenium automated testing tool can be set up on a Linux system in advance, and a headless Chrome browser can be installed. The finished data quality checking program is deployed on the server, and a scheduled script automatically checks the running status of the checking program so that the program is restarted automatically if it crashes, which ensures that the whole quality testing system needs no manual intervention. Once this deployment is complete, the quality of the data to be evaluated can be evaluated according to the preset evaluation rule.
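The scheduled status check mentioned above might look like the sketch below. It assumes the checking program runs as a child process of the watchdog; the actual deployment script is not given in the text, and `ensure_running` is a hypothetical name. A trivial Python command stands in for the real checking program.

```python
import subprocess
import sys


def ensure_running(process, command):
    """Return a live process for `command`, starting a new one if the old one has exited."""
    if process is not None and process.poll() is None:
        return process  # still running, nothing to do
    return subprocess.Popen(command)


# Stand-in for the real checking program: a process that exits immediately.
command = [sys.executable, "-c", "pass"]
first = subprocess.Popen(command)
first.wait()                              # simulate the program having gone down
second = ensure_running(first, command)   # the watchdog starts a replacement
second.wait()
```

In practice such a function would be called periodically, for example from a cron job or a loop with a sleep interval.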
The above evaluation process is described below, taking judgment documents as an example.
For example, when the judgment documents of a certain company (the data to be evaluated) are compared, the selenium automated testing tool can first be used to log in automatically to the competing product (the company to which the preset quality comparison data belongs), with a third-party CAPTCHA-solving platform called to handle the verification code. The tool then locates the DOM element that holds the judgment document dimension of the company to which the preset quality comparison data belongs, and the information shown on the competing product page is compared with the data to be evaluated. The comparison covers data volume, data accuracy and data timeliness; the specific comparison process is as described above and is not repeated in detail here. After the comparison is performed in the manner described above, at least one comparison result is obtained, and the data evaluation result of the data to be evaluated is determined according to the at least one comparison result.
In this embodiment, after the data evaluation result is obtained in the manner described above, the data quality grade of the data to be evaluated can be determined according to the data evaluation result.
If the data evaluation result is greater than or equal to a preset threshold, it is determined that the data quality of the data to be evaluated is higher than the preset quality, and a first preset mark is set for the data to be evaluated; if the data evaluation result is less than the preset threshold, data warning information is generated, and a second preset mark is set for the data to be evaluated.
Specifically, in this embodiment, it can be judged whether the data evaluation result is greater than or equal to the preset threshold. If it is, the first preset mark is added to the data to be evaluated, for example by rendering the data to be evaluated in green, which indicates that its quality is good. If it is below the preset threshold, the second preset mark is added to the data to be evaluated, for example by rendering the data to be evaluated in red, which indicates that its quality is problematic. In this embodiment, a scheduled-task script can also send an e-mail every day to notify the relevant person in charge of the data quality evaluation.
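The threshold check can be sketched as follows. The green and red marks come from the example in the text; the function name `grade`, the returned dictionary layout and the wording of the warning message are assumptions made for illustration.

```python
def grade(score, threshold):
    """Assign the first mark (green) at or above the threshold, otherwise the second mark (red)."""
    if score >= threshold:
        return {"mark": "green", "warning": None}
    # Below the threshold: generate warning information alongside the second mark.
    warning = f"data quality score {score:.2f} is below threshold {threshold:.2f}"
    return {"mark": "red", "warning": warning}


good = grade(0.80, 0.80)   # equal to the threshold still counts as passing
bad = grade(0.50, 0.80)
```

The warning string produced for a failing score could then be included in the daily notification e-mail.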
In this embodiment, after the data evaluation result is obtained, a target statistical report can also be generated according to the data evaluation result, where the target statistical report includes the data evaluation result and/or historical data evaluation results.
Specifically, in this embodiment, the data evaluation results of the current day can be read from the database and aggregated by company, and the comparison details of each dimension for each company can be stored in an Excel document so that the relevant person in charge can review the individual cases and use them to identify the problems in the current data. The comparison results of the current day are then logically deleted from the database so that the newest comparison data can be distinguished, and the aggregated results are stored in another table as the daily historical statistics. The trend of data quality over time can be analyzed from these historical statistics, and various statistical reports can be produced from them. Finally, the aggregated comparison results of the current day can be used as a daily report, so that the relevant person in charge can observe the day-to-day changes in data quality.
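The daily aggregation can be illustrated with the sketch below. The text stores the details in an Excel document; to stay within the standard library, this sketch writes CSV instead, which is a substitution and not the described format. The row layout (`company`, `dimension`, `score`) and the use of a per-company average as the summary are assumptions.

```python
import csv
import io
from collections import defaultdict


def daily_summary(rows):
    """Average the day's comparison scores per company."""
    by_company = defaultdict(list)
    for row in rows:
        by_company[row["company"]].append(row["score"])
    return {name: sum(scores) / len(scores) for name, scores in sorted(by_company.items())}


def write_details(rows, stream):
    """Write the per-dimension comparison details as CSV (stand-in for the Excel document)."""
    writer = csv.DictWriter(stream, fieldnames=["company", "dimension", "score"])
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"company": "Company A", "dimension": "volume", "score": 0.5},
    {"company": "Company A", "dimension": "accuracy", "score": 1.0},
    {"company": "Company B", "dimension": "volume", "score": 0.25},
]
buffer = io.StringIO()
write_details(rows, buffer)
summary = daily_summary(rows)
```

Appending each day's `summary` to a history table would give the series from which the quality trend is analyzed.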
It should be noted that, in this embodiment, the evaluation rule also needs to be built before the data to be evaluated is obtained. The rule can be established from three aspects, namely data volume, data accuracy and data timeliness, and the composite score over these three aspects can serve as the final evaluation index. The specific building process is as follows:
First, the preset data of each company and the competing product data corresponding to that company are obtained. Then, the preset data and the competing product data are compared in terms of data volume, data accuracy and data timeliness to obtain a data comparison result. Once the comparison result of each company has been obtained, the evaluation rule can be determined according to the comparison results of the companies.
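One possible form of the composite score is a weighted average of the three per-dimension results, sketched below. The text does not specify how the three aspects are combined, so the weighted average, the weights themselves and the assumption that each result lies in the range 0 to 1 are all illustrative choices.

```python
def composite_score(results, weights):
    """Weighted average of per-dimension comparison results (each assumed to lie in [0, 1])."""
    total_weight = sum(weights[name] for name in results)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(results[name] * weights[name] for name in results) / total_weight


results = {"volume": 0.5, "accuracy": 1.0, "timeliness": 0.75}
weights = {"volume": 1, "accuracy": 2, "timeliness": 1}   # hypothetical weights
score = composite_score(results, weights)
```

The resulting score is the value that would be compared with the preset threshold when the quality grade is determined.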
As can be seen from the above description, a traditional quality testing system requires a large amount of manpower and material resources, whereas the method provided by the present invention can monitor data quality in real time without manual participation, which greatly reduces cost and improves efficiency. Furthermore, by establishing effective evaluation rules, existing problems can be addressed in a targeted manner to improve data quality, thereby alleviating the technical problem of the low evaluation efficiency of existing quality testing systems.
Embodiment three:
An embodiment of the present invention further provides a data evaluation device, which is mainly used to execute the data evaluation method provided above by the embodiments of the present invention. The data evaluation device provided by the embodiment of the present invention is described in detail below.
Fig. 4 is a schematic diagram of a data evaluation device according to an embodiment of the present invention. As shown in Fig. 4, the data evaluation device mainly includes an acquiring unit 10, an evaluation unit 20 and a determination unit 30, wherein:
The acquiring unit 10 is configured to obtain data to be evaluated;
The evaluation unit 20 is configured to automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule and in combination with the preset quality comparison data of the data to be evaluated, to obtain a data evaluation result;
The determination unit 30 is configured to determine the data quality grade of the data to be evaluated according to the data evaluation result.
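The three-unit structure can be pictured as the sketch below, in which each unit is modeled as a plain callable. This is one possible software arrangement, not the device of Fig. 4 itself; the class name, the `run` method and the sample lambdas are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class DataEvaluationDevice:
    acquire: Callable[[], Sequence[str]]                        # acquiring unit 10
    evaluate: Callable[[Sequence[str], Sequence[str]], float]   # evaluation unit 20
    determine: Callable[[float], str]                           # determination unit 30

    def run(self, reference):
        """Chain the three units: acquire, evaluate against the reference, then grade."""
        data = self.acquire()
        result = self.evaluate(data, reference)
        return self.determine(result)


device = DataEvaluationDevice(
    acquire=lambda: ["doc-1", "doc-2"],
    evaluate=lambda data, ref: len(set(data) & set(ref)) / len(set(ref)),
    determine=lambda score: "good" if score >= 0.5 else "problematic",
)
quality_grade = device.run(["doc-1", "doc-2", "doc-3", "doc-4"])
```

Keeping the units as separate callables mirrors the point made later in the description that the division into units is a logical one and may be arranged differently in an actual implementation.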
In the embodiments of the present invention, data to be evaluated is first obtained, where there may be one or more pieces of data to be evaluated. The data quality of the data to be evaluated is then evaluated automatically according to the preset evaluation rule and in combination with the preset quality comparison data of the data to be evaluated, to obtain a data evaluation result. Finally, the data quality grade of the data to be evaluated is determined according to the data evaluation result. As can be seen from the above description, a traditional quality testing system requires a large amount of manpower and material resources, whereas the device provided by the present invention can monitor data quality in real time without manual participation, which greatly reduces cost and improves efficiency. Furthermore, by establishing effective evaluation rules, existing problems can be addressed in a targeted manner to improve data quality, thereby alleviating the technical problem of the low evaluation efficiency of existing quality testing systems.
Optionally, the evaluation rule includes an evaluation parameter, and the evaluation unit is configured to: compare the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule to obtain at least one comparison result, where the evaluation parameter includes at least one of the following: data volume, data accuracy and data timeliness; and determine the data evaluation result of the data to be evaluated according to the at least one comparison result.
Optionally, the evaluation parameter includes data timeliness, and the evaluation unit is further configured to: sort the data to be evaluated and the preset quality comparison data separately by release time to obtain a first sorting result and a second sorting result respectively; and determine the timeliness of the data to be evaluated according to the first sorting result and the second sorting result, to obtain the first comparison result.
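The timeliness comparison can be sketched as follows. The text only says that timeliness is determined from the two sorting results; this sketch assumes it is measured as the number of days by which the newest evaluated record lags behind the newest comparison record. The record layout and the function name are hypothetical.

```python
from datetime import date


def timeliness_comparison(evaluated, reference):
    """Sort both record sets by release date (newest first) and compare the newest entries."""
    first = sorted(evaluated, key=lambda r: r["released"], reverse=True)
    second = sorted(reference, key=lambda r: r["released"], reverse=True)
    if not first or not second:
        return None
    # Positive lag: the data to be evaluated is behind the comparison data.
    lag_days = (second[0]["released"] - first[0]["released"]).days
    return {"first_sorted": first, "second_sorted": second, "lag_days": lag_days}


evaluated = [
    {"id": "A-1", "released": date(2019, 4, 18)},
    {"id": "A-2", "released": date(2019, 4, 20)},
]
reference = [
    {"id": "B-1", "released": date(2019, 4, 23)},
    {"id": "B-2", "released": date(2019, 4, 19)},
]
timeliness = timeliness_comparison(evaluated, reference)
```

Here the newest evaluated record is three days older than the newest comparison record, which this sketch reports as a lag of 3.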
Optionally, the evaluation parameter includes data volume, and the evaluation unit is further configured to: clean the data to be evaluated and the preset quality comparison data separately; and compare the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data to obtain the second comparison result.
Optionally, the device is further configured to: extract the data to be evaluated from a target message queue according to a preset automated test rule; and automatically evaluate the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data of the data to be evaluated.
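Extraction from the target message queue might be sketched as below. The text does not name a message queue product or describe the automated test rule, so an in-process `queue.Queue` is used as a stand-in and the rule is reduced to a hypothetical batch size limit.

```python
import queue


def drain(message_queue, limit):
    """Take up to `limit` pending messages from the queue without blocking."""
    batch = []
    while len(batch) < limit:
        try:
            batch.append(message_queue.get_nowait())
        except queue.Empty:
            break  # nothing left to extract
    return batch


pending = queue.Queue()
for document_id in ["doc-1", "doc-2", "doc-3"]:
    pending.put(document_id)

first_batch = drain(pending, 2)
second_batch = drain(pending, 5)
```

Each extracted batch would then be passed to the evaluation step as the data to be evaluated.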
Optionally, the determination unit is configured to: if the data evaluation result is greater than or equal to the preset threshold, determine that the data quality of the data to be evaluated is higher than the preset quality and set the first preset mark for the data to be evaluated; and if the data evaluation result is less than the preset threshold, generate data warning information and set the second preset mark for the data to be evaluated.
Optionally, the acquiring unit is configured to: classify the companies according to the importance of each company to obtain multiple groups; determine, in a preset data set, the data belonging to each group to obtain target data sets; and extract data from each target data set according to a preset extraction ratio to obtain the data to be evaluated.
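The grouped extraction can be sketched as stratified random sampling. The text does not say how records are picked within a group, so uniform random sampling is an assumption, as are the group labels, the ratios and the function name `sample_by_group`.

```python
import random


def sample_by_group(records, group_of, ratios, seed=0):
    """Group records, then draw a fixed fraction from each group at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    groups = {}
    for record in records:
        groups.setdefault(group_of(record), []).append(record)
    sample = []
    for group, members in groups.items():
        count = round(len(members) * ratios.get(group, 0.0))
        sample.extend(rng.sample(members, count))
    return sample


# Hypothetical data set: (company, importance level) pairs.
records = [(f"key-{i}", "key") for i in range(10)]
records += [(f"ordinary-{i}", "ordinary") for i in range(20)]
ratios = {"key": 0.5, "ordinary": 0.1}
sample = sample_by_group(records, lambda r: r[1], ratios)
```

Giving important companies a higher extraction ratio, as in these sample ratios, concentrates the evaluation effort where data quality matters most.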
Optionally, the device is further configured to: generate a target statistical report according to the data evaluation result, where the target statistical report includes the data evaluation result and/or historical data evaluation results.
The implementation principle and technical effects of the device provided by the embodiment of the present invention are the same as those of the foregoing method embodiments. For brevity, where the device embodiment does not mention something, reference may be made to the corresponding content in the foregoing method embodiments.
The present application further provides a computer-readable medium having non-volatile program code executable by a processor, where the program code causes the processor to execute any of the methods in the above method embodiments.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified and limited, the terms "installed", "linked" and "connected" should be understood in a broad sense. For example, a connection may be a fixed connection, a detachable connection or an integral connection; it may be a mechanical connection or an electrical connection; and it may be a direct connection, an indirect connection through an intermediary, or internal communication between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood according to the specific situation.
In the description of the present invention, it should be noted that the orientation or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientation or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be understood as limiting the present invention. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
Finally, it should be noted that the embodiments described above are only specific implementations of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features. Such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A data evaluation method, characterized by comprising:
obtaining data to be evaluated;
automatically evaluating the data quality of the data to be evaluated according to a preset evaluation rule and in combination with preset quality comparison data of the data to be evaluated, to obtain a data evaluation result; and
determining a data quality grade of the data to be evaluated according to the data evaluation result.
2. The method according to claim 1, characterized in that the evaluation rule includes an evaluation parameter;
automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data of the data to be evaluated comprises:
comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule to obtain at least one comparison result, wherein the evaluation parameter includes at least one of the following: data volume, data accuracy and data timeliness; and
determining the data evaluation result of the data to be evaluated according to the at least one comparison result.
3. The method according to claim 2, characterized in that the evaluation parameter includes data timeliness;
comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises:
sorting the data to be evaluated and the preset quality comparison data separately by release time to obtain a first sorting result and a second sorting result respectively; and
determining the timeliness of the data to be evaluated according to the first sorting result and the second sorting result, to obtain a first comparison result.
4. The method according to claim 2, characterized in that the evaluation parameter includes data volume;
comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises:
cleaning the data to be evaluated and the preset quality comparison data separately; and
comparing the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data to obtain a second comparison result.
5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
extracting the data to be evaluated from a target message queue according to a preset automated test rule; and automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data of the data to be evaluated.
6. The method according to any one of claims 1 to 4, characterized in that determining the data quality grade of the data to be evaluated according to the data evaluation result comprises:
if the data evaluation result is greater than or equal to a preset threshold, determining that the data quality of the data to be evaluated is higher than a preset quality, and setting a first preset mark for the data to be evaluated; and
if the data evaluation result is less than the preset threshold, generating data warning information, and setting a second preset mark for the data to be evaluated.
7. The method according to any one of claims 1 to 4, characterized in that obtaining the data to be evaluated comprises:
classifying companies according to the importance of each company to obtain multiple groups;
determining, in a preset data set, the data belonging to each group to obtain target data sets; and
extracting data from each target data set according to a preset extraction ratio to obtain the data to be evaluated.
8. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
generating a target statistical report according to the data evaluation result, wherein the target statistical report includes the data evaluation result and/or historical data evaluation results.
9. A data evaluation device, characterized by comprising:
an acquiring unit, configured to obtain data to be evaluated;
an evaluation unit, configured to automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule and in combination with preset quality comparison data of the data to be evaluated, to obtain a data evaluation result; and
a determination unit, configured to determine a data quality grade of the data to be evaluated according to the data evaluation result.
10. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910337642.3A CN110059083A (en) | 2019-04-24 | 2019-04-24 | A kind of data evaluation method, apparatus and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910337642.3A CN110059083A (en) | 2019-04-24 | 2019-04-24 | A kind of data evaluation method, apparatus and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110059083A true CN110059083A (en) | 2019-07-26 |
Family
ID=67320778
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910337642.3A Pending CN110059083A (en) | 2019-04-24 | 2019-04-24 | A kind of data evaluation method, apparatus and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110059083A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111027821A (en) * | 2019-11-25 | 2020-04-17 | 泰康保险集团股份有限公司 | Service organization evaluation method and device, storage medium and electronic equipment |
CN112487453A (en) * | 2020-12-07 | 2021-03-12 | 马力 | Data security sharing method and device based on central coordinator |
CN113822602A (en) * | 2021-11-22 | 2021-12-21 | 武汉龙津科技有限公司 | Data value evaluation method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102262678A (en) * | 2011-08-16 | 2011-11-30 | 郑毅 | System for sampling mass data and managing sampled data |
US20150310166A1 (en) * | 2012-11-28 | 2015-10-29 | Institut National De La Sante Et De La Recherche Medicale (Inserm) | Method and system for processing data for evaluating a quality level of a dataset |
CN106469395A (en) * | 2016-08-31 | 2017-03-01 | 国信优易数据有限公司 | A kind of data commodity dynamic comprehensive appraisal procedure and system |
CN108764705A (en) * | 2018-05-24 | 2018-11-06 | 国信优易数据有限公司 | A kind of data quality accessment platform and method |
CN109254959A (en) * | 2018-08-17 | 2019-01-22 | 广东技术师范学院 | A kind of data evaluation method, apparatus, terminal device and readable storage medium storing program for executing |
2019-04-24: application CN201910337642.3A filed (patent CN110059083A, status: pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105283851B (en) | For selecting the cost analysis of tracking target | |
CN108520357B (en) | Method and device for judging line loss abnormality reason and server | |
CN105283848B (en) | Application tracking is carried out with distributed object | |
CN113792825A (en) | Fault classification model training method and device for electricity information acquisition equipment | |
CN105122212A (en) | Periodicity optimization in an automated tracing system | |
CN110059083A (en) | A kind of data evaluation method, apparatus and electronic equipment | |
CN105122234A (en) | Deploying trace objectives using cost analyses | |
WO2021254027A1 (en) | Method and apparatus for identifying suspicious community, and storage medium and computer device | |
CN110287316A (en) | A kind of Alarm Classification method, apparatus, electronic equipment and storage medium | |
CN110597719B (en) | Image clustering method, device and medium for adaptation test | |
CN108170830B (en) | Group event data visualization method and system | |
CN110659985A (en) | Method and device for fishing back false rejection potential user and electronic equipment | |
CN110706096A (en) | Method and device for managing credit line based on salvage-back user and electronic equipment | |
CN105184886A (en) | Cloud data center intelligence inspection system and cloud data center intelligence inspection method | |
CN109412839A (en) | A kind of recognition methods, device, equipment and the storage medium of exception account | |
CN112598294A (en) | Method, device, machine readable medium and equipment for establishing scoring card model on line | |
CN109787958A (en) | Network flow real-time detection method and detection terminal, computer readable storage medium | |
CN115237804A (en) | Performance bottleneck assessment method, performance bottleneck assessment device, electronic equipment, medium and program product | |
CN114090556B (en) | Electric power marketing data acquisition method and system | |
CN115357629A (en) | Processing method, system, electronic device and storage medium for financial data stream | |
CN111210332A (en) | Method and device for generating post-loan management strategy and electronic equipment | |
CN106528774A (en) | Method and apparatus for predicting distribution network project management trend | |
CN113269378A (en) | Network traffic processing method and device, electronic equipment and readable storage medium | |
CN109064211A (en) | Sales service data analysing method, device and server | |
CN107430590A (en) | Data compare |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||