CN110059083A - Data evaluation method, apparatus, and electronic device - Google Patents

Info

Publication number
CN110059083A
Authority
CN
China
Prior art keywords
data
evaluated
evaluation
quality
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910337642.3A
Other languages
Chinese (zh)
Inventor
杜波
程浩
黄文瀚
蓝春倩
柳超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dike Technology Co Ltd
Original Assignee
Beijing Dike Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dike Technology Co Ltd
Priority to CN201910337642.3A
Publication of CN110059083A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a data evaluation method, apparatus, and electronic device, and relates to the technical field of data processing. The method comprises: obtaining data to be evaluated; automatically evaluating the data quality of the data to be evaluated according to a preset evaluation rule and in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; and determining a data quality grade of the data to be evaluated according to the data evaluation result. The application alleviates the technical problem of the low evaluation efficiency of existing data quality evaluation systems.

Description

Data evaluation method, apparatus, and electronic device
Technical field
The present invention relates to the technical field of data processing, and in particular to a data evaluation method, apparatus, and electronic device.
Background
At present, big data brings massive, diverse, and unstructured data, allowing users to carry out broader and deeper analysis. However, data analysis is meaningful only when it is built on high-quality data. Enterprises commonly face the problem that, during the collection, use, management, and maintenance of data, business rules are missing or poorly enforced, so the data lacks accuracy. Establishing a data quality evaluation system is therefore very important. A traditional data quality evaluation system requires a large amount of manual monitoring; its cost rises and its efficiency falls as the data volume grows.
Summary of the invention
In view of this, the purpose of the present invention is to provide a data evaluation method, apparatus, and electronic device, so as to alleviate the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
In a first aspect, an embodiment of the present invention provides a data evaluation method, comprising: obtaining data to be evaluated; automatically evaluating the data quality of the data to be evaluated according to a preset evaluation rule and in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; and determining a data quality grade of the data to be evaluated according to the data evaluation result.
Further, the evaluation rule includes evaluation parameters. Automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data comprises: comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule, to obtain at least one comparison result, wherein the evaluation parameters include at least one of the following: data volume, data accuracy, and data timeliness; and determining the data evaluation result of the data to be evaluated according to the at least one comparison result.
Further, the evaluation parameters include data timeliness. Comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises: sorting the data to be evaluated and the preset quality comparison data separately by publication time, to obtain a first sorting result and a second sorting result respectively; and determining the timeliness of the data to be evaluated according to the first sorting result and the second sorting result, to obtain a first comparison result.
Further, the evaluation parameters include data volume. Comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises: cleaning the data to be evaluated and the preset quality comparison data separately; and comparing the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data, to obtain a second comparison result.
Further, the method also comprises: extracting the data to be evaluated from a target message queue according to a preset automated test rule; and automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data for the data to be evaluated.
Further, determining the data quality of the data to be evaluated according to the data evaluation result comprises: if the data evaluation result is greater than or equal to a preset threshold, determining that the data quality of the data to be evaluated is higher than a preset quality, and setting a first preset flag for the data to be evaluated; and if the data evaluation result is less than the preset threshold, generating a data warning message and setting a second preset flag for the data to be evaluated.
Further, obtaining the data to be evaluated comprises: classifying the companies according to the importance of each company, to obtain multiple groups; determining, in a preset data set, the data belonging to each group, to obtain target data sets; and extracting data from each target data set according to a preset extraction ratio, to obtain the data to be evaluated.
Further, the method also comprises: generating a target statistical report according to the data evaluation result, wherein the target statistical report includes the data evaluation result and/or historical data evaluation results.
In a second aspect, an embodiment of the present invention provides a data evaluation apparatus, comprising: an acquisition unit, configured to obtain data to be evaluated; an evaluation unit, configured to automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule and in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; and a determination unit, configured to determine a data quality grade of the data to be evaluated according to the data evaluation result.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of the implementations of the first aspect.
In the embodiments of the present invention, data to be evaluated are first obtained; then the data quality of the data to be evaluated is automatically evaluated according to a preset evaluation rule and in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; finally, the data quality grade of the data to be evaluated is determined according to the data evaluation result. As the above description shows, a traditional data quality evaluation system requires a large amount of manpower and material resources, whereas the solution provided by the present invention can monitor data quality in real time without human involvement, which greatly reduces cost and improves efficiency. Further, by establishing effective evaluation rules, existing problems can be addressed in a targeted way and data quality can be improved, which alleviates the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention are realized and obtained by the structures particularly pointed out in the description, the claims, and the drawings.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To describe the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the specific embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data evaluation method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another data evaluation method according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a data evaluation apparatus according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiment one:
First, an exemplary electronic device 100 for implementing the data evaluation method and apparatus of the embodiments of the present invention is described with reference to Fig. 1.
As shown in Fig. 1, the electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and a data collector 110. These components are interconnected through a bus system 112 and/or other forms of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in Fig. 1 are exemplary rather than limiting; the electronic device may have other components and structures as needed.
The processor 102 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control the other components in the electronic device 100 to perform the desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functionality (implemented by the processor) in the embodiments of the present invention described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (for example, images or sound) to the outside (for example, to a user), and may include one or more of a display, a loudspeaker, and the like.
The data collector 110 may collect the data to be evaluated and store it in the storage device 104 for use by other components.
Illustratively, the exemplary electronic device for implementing the data evaluation method according to the embodiments of the present invention may be implemented on a device such as a server.
Embodiment two:
According to an embodiment of the present invention, an embodiment of a data evaluation method is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Fig. 2 is a flowchart of a data evaluation method according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps S202 to S206.
It should be noted that, in this embodiment, the method described in steps S202 to S206 may be executed in an automated test tool. Optionally, the automated test tool may be the Selenium automated test tool; other automated test tools may also be chosen, and this embodiment places no specific limitation on the choice.
Step S202: obtain data to be evaluated.
In this embodiment, the data to be evaluated may be obtained by an automated test tool (for example, the Selenium automated test tool). The data to be evaluated may be any kind of data, for example court judgment documents or user behavior data; the application places no specific limitation on the type of the data to be evaluated.
Step S204: automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule and in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result.
In this embodiment, after the automated test tool obtains the data to be evaluated, it may obtain the preset quality comparison data and compare the quality of the data to be evaluated against it, thereby automatically evaluating the data quality of the data to be evaluated.
It should be noted that the preset quality comparison data and the data to be evaluated are data of the same type. For example, if the data to be evaluated are the court judgment documents of company A, the preset quality comparison data are the court judgment documents of company B.
In this embodiment, the data evaluation result may reflect the quality of the data to be evaluated in at least one data dimension, for example in terms of data volume, data accuracy, and data timeliness.
In an optional embodiment, the data evaluation result may include the following: a total value reflecting the data quality of the data to be evaluated, and one or more sub-values reflecting the data quality of the data to be evaluated in each dimension, wherein the one or more sub-values are used to determine the total value.
Step S206: determine the data quality grade of the data to be evaluated according to the data evaluation result.
In this embodiment, after obtaining the data evaluation result, the automated test tool can determine the data quality grade of the data to be evaluated according to that result.
The data quality grades may be grades preset by the user. For example, the data quality grades may be divided into low-quality data and high-quality data, or into data of a first quality grade, data of a second quality grade, and data of a third quality grade. This embodiment places no specific limitation on how the data quality grades are divided; they may be set according to the preset thresholds described in the following embodiments.
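The mapping from a data evaluation result to a preset grade can be sketched as follows. This is only a minimal illustration: the threshold values, the three grade labels, and the function name `quality_grade` are assumptions introduced here, since the embodiment leaves the division rule to the user.

```python
from bisect import bisect_right

# Hypothetical grade boundaries and labels; the embodiment does not fix them.
GRADE_THRESHOLDS = [0.6, 0.8]
GRADE_LABELS = ["third quality grade", "second quality grade", "first quality grade"]


def quality_grade(evaluation_result: float) -> str:
    """Map a numeric data evaluation result to a preset quality grade."""
    # bisect_right counts how many thresholds are <= the result,
    # which is exactly the index of the grade label to return.
    return GRADE_LABELS[bisect_right(GRADE_THRESHOLDS, evaluation_result)]
```

With these assumed thresholds, a result equal to a threshold is placed in the higher grade, which matches the "greater than or equal to" comparison used for the preset threshold.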
In the embodiments of the present invention, data to be evaluated are first obtained, wherein there may be one or more items of data to be evaluated; then the data quality of the data to be evaluated is automatically evaluated according to a preset evaluation rule and in combination with preset quality comparison data for the data to be evaluated, to obtain a data evaluation result; finally, the data quality grade of the data to be evaluated is determined according to the data evaluation result. As the above description shows, a traditional data quality evaluation system requires a large amount of manpower and material resources, whereas the solution provided by the present invention can monitor data quality in real time without human involvement, which greatly reduces cost and improves efficiency. Further, by establishing effective evaluation rules, existing problems can be addressed in a targeted way and data quality can be improved, which alleviates the technical problem of the low evaluation efficiency of existing data quality evaluation systems.
As the above description shows, in this embodiment the data to be evaluated are obtained first. Optionally, the data to be evaluated may be a portion of data extracted from a massive data set, or may be the full data set.
If the data to be evaluated are a portion of data extracted from a massive data set, they may be extracted by simple random sampling or by stratified sampling.
As shown in Fig. 3, assuming that the data are extracted by stratified sampling, step S202, obtaining data to be evaluated, includes the following steps:
Step S301: classify the companies according to the importance of each company, to obtain multiple groups.
In this embodiment, the companies may be classified according to degree parameters that characterize the importance of each company. The degree parameters may be at least one of the following: the registered address of the company, the scale of the company, the nature of the enterprise (for example, central state-owned enterprise, public institution, state-owned enterprise, or individually owned business), whether the company is listed, and information such as the company's profit.
Optionally, after the companies are classified by importance, all companies in the database may be divided into three layers: important companies, ordinary companies, and sole proprietors. For example, well-known companies such as "Baidu" and "Xiaomi", or companies whose registered address is Beijing, Shanghai, Guangdong, or Shenzhen, are placed in the important-company set; companies with lower page views whose registered address is outside those places are classed as ordinary companies; and the last layer consists of enterprises whose company type is sole proprietor.
Step S302: determine, in a preset data set, the data belonging to each group, to obtain target data sets.
After the companies have been classified into multiple groups according to the step described above, the data belonging to each group can be determined in the preset data set, to obtain the target data sets.
For example, the data belonging to "important companies" are determined, giving target data set A1; the data belonging to "ordinary companies" are determined, giving target data set A2; and the data belonging to "sole proprietors" are determined, giving target data set A3.
Step S303: extract data from each target data set according to a preset extraction ratio, to obtain the data to be evaluated.
After the target data sets are obtained according to the step described above, sample data can be drawn from each target data set according to the preset extraction ratio, and the sampled data are used as the data to be evaluated. Then, for these sample data, some dimensions are randomly selected for comparison.
For example, for target data sets A1 to A3, sample data may be drawn from target data set A1 according to the preset extraction ratio as data to be evaluated; sample data may likewise be drawn from target data set A2 according to the preset extraction ratio as data to be evaluated; and sample data may likewise be drawn from target data set A3 according to the preset extraction ratio as data to be evaluated.
As the above description shows, when the data to be evaluated are extracted by stratified sampling, the massive data in the preset data set can be classified according to company attributes, and quality evaluation can then be carried out on the classified data. In this way, every type of data to be evaluated is covered by the quality evaluation, which makes the evaluation of data quality more accurate.
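Steps S301 to S303 can be sketched as the short routine below. It is a minimal illustration under stated assumptions: the function name `stratified_sample`, the `group_of` callback that assigns a record to its company layer, and the rule of drawing at least one record per layer are all introduced here and are not specified by the embodiment.

```python
import random
from collections import defaultdict


def stratified_sample(records, group_of, ratio, seed=None):
    """Draw the same proportion of records from every group (layer)."""
    rng = random.Random(seed)

    # S301/S302: place every record in the target data set of its group.
    target_sets = defaultdict(list)
    for record in records:
        target_sets[group_of(record)].append(record)

    # S303: draw records from each target data set at the preset ratio.
    sample = []
    for name in sorted(target_sets):
        members = target_sets[name]
        # Assumption: keep at least one record per group, never more than exist.
        k = min(len(members), max(1, round(len(members) * ratio)))
        sample.extend(rng.sample(members, k))
    return sample
```

For instance, with 10 important companies, 20 ordinary companies, and 5 sole proprietors and a ratio of 0.2, the sketch draws 2, 4, and 1 records respectively, so small layers are still represented.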
In this embodiment, after the Selenium automated test tool has extracted the data to be evaluated from the massive data set according to the method described above, the extracted data can be put into a target message queue (for example, a Redis queue). The Selenium automated test tool then extracts the data to be evaluated from the target message queue according to a preset automated test rule, and automatically evaluates the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data for the data to be evaluated.
As the above description shows, in this embodiment the Selenium automated test tool can put the sampled data into the Redis queue, where they are consumed by the automated test program of the tool. The Redis queue does not need to hold a fixed number of samples at all times; it only needs to contain enterprises for the program to consume. When the queue has been consumed down to a certain number of items, a monitoring script uses the same sampling scheme to replenish the queue, so that the whole cycle keeps running.
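The consume-and-replenish cycle can be sketched as follows. To keep the example self-contained, an in-memory `collections.deque` stands in for the Redis queue; the function name `consume_one`, the `resample` callback, and the low-water mark of 2 are assumptions made for illustration and are not taken from the embodiment.

```python
from collections import deque


def consume_one(queue, evaluate, resample, low_water_mark=2):
    """Evaluate the next queued sample and refill the queue when it runs low.

    `queue` is an in-memory stand-in for the target message queue,
    `evaluate` is the automated evaluation of one sample, and `resample`
    reruns the same sampling scheme and returns new samples.
    """
    if not queue:
        # Nothing to consume yet: ask the sampler for a fresh batch first.
        queue.extend(resample())
    item = queue.popleft()
    result = evaluate(item)
    if len(queue) < low_water_mark:
        # Role of the monitoring script: top the queue up again.
        queue.extend(resample())
    return result
```

In a deployment backed by Redis, the same logic would be split between the consuming test program and a separate monitoring script; the sketch merges the two roles only to keep the example short.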
After the data to be evaluated are obtained in the manner described above, the data quality of the data to be evaluated can be automatically evaluated according to the preset evaluation rule and in combination with the preset quality comparison data for the data to be evaluated, to obtain a data evaluation result.
For the data to be evaluated, the data dimensions can be subdivided, and the data to be evaluated can be compared with the preset quality comparison data (also called competitor data) in terms of data accuracy, timeliness, and data volume, to obtain the data evaluation result.
In this embodiment, the evaluation rule includes the following evaluation parameters: data volume, data accuracy, and data timeliness.
On this basis, in this embodiment, step S204, automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule and in combination with the preset quality comparison data for the data to be evaluated, includes the following steps:
Step S2041: compare the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule, to obtain at least one comparison result, wherein the evaluation parameters include at least one of the following: data volume, data accuracy, and data timeliness.
Specifically, in this embodiment, the data to be evaluated can be compared with the preset quality comparison data according to the evaluation parameters. For example, the two can be compared in terms of data volume, giving comparison result B1; in terms of data accuracy, giving comparison result B2; and in terms of data timeliness, giving comparison result B3.
By evaluating the data to be evaluated against the evaluation parameters, the quality of the data can be reflected in several dimensions, which yields a more accurate data evaluation result.
Step S2042: determine the data evaluation result of the data to be evaluated according to the at least one comparison result.
After the at least one comparison result is obtained according to the step described above, a calculation can be performed on the at least one comparison result to obtain the data evaluation result of the data to be evaluated. For example, a weighted sum of the at least one comparison result is calculated to obtain the data evaluation result of the data to be evaluated.
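The weighted summation in step S2042 can be sketched as below. The sketch assumes that each comparison result has already been expressed as a numeric score; the function name `combine_comparisons`, the example weights, and the normalization by the total weight are illustrative choices that the embodiment does not prescribe.

```python
def combine_comparisons(comparisons, weights):
    """Weighted sum of per-dimension comparison scores.

    `comparisons` maps a dimension name (e.g. "volume") to a numeric score;
    `weights` maps the same names to their weights. Only the dimensions
    actually present in `comparisons` contribute to the result.
    """
    total_weight = sum(weights[name] for name in comparisons)
    if total_weight == 0:
        raise ValueError("weights of the compared dimensions sum to zero")
    weighted = sum(score * weights[name] for name, score in comparisons.items())
    # Assumption: normalize so the result stays on the scale of the scores.
    return weighted / total_weight
```

Normalizing by the total weight keeps the result comparable when only a subset of the three evaluation parameters is used, which the method explicitly allows ("at least one").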
If the evaluation parameters include data timeliness, then step S2041, comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule, includes the following steps:
Step S11: sort the data to be evaluated and the preset quality comparison data separately by publication time, to obtain a first sorting result and a second sorting result respectively.
Step S12: determine the timeliness of the data to be evaluated according to the first sorting result and the second sorting result, to obtain a first comparison result.
Specifically, when the timeliness of the data to be evaluated is compared with that of the preset quality comparison data, the data to be evaluated and the preset quality comparison data can first be sorted separately by publication time. For example, both are sorted in descending order of publication time, giving the first sorting result and the second sorting result respectively.
After the first and second sorting results are obtained, the first N items (for example, the first 100) are selected from the first sorting result as data to be compared C1, and the first N items (for example, the first 100) are selected from the second sorting result as data to be compared C2. The timeliness of C1 and C2 is then compared, where the comparison criterion is to find the item with the most recent publication time among C1 and C2; the set to which that item belongs has the better timeliness. Comparing the timeliness of C1 and C2 yields one comparison result, namely the first comparison result, which can be stored in a database for later use.
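Steps S11 and S12 can be sketched as follows, assuming each data set is represented simply by the publication dates of its records. The function name `compare_timeliness` and the three string outcomes are hypothetical; how the first comparison result is encoded is not specified by the embodiment.

```python
from datetime import date


def compare_timeliness(evaluated_dates, reference_dates, n=100):
    """Compare two data sets by the newest publication date in their top N."""
    # S11: sort each set in descending order of publication time, keep first N.
    c1 = sorted(evaluated_dates, reverse=True)[:n]
    c2 = sorted(reference_dates, reverse=True)[:n]

    # S12: the set that contains the most recent publication time wins.
    newest_1 = c1[0] if c1 else None
    newest_2 = c2[0] if c2 else None
    if newest_1 == newest_2:
        return "tie"
    if newest_2 is None or (newest_1 is not None and newest_1 > newest_2):
        return "evaluated"
    return "reference"
```

A usage example: `compare_timeliness([date(2019, 4, 1), date(2019, 4, 20)], [date(2019, 4, 18)])` reports that the data to be evaluated are the more timely set, because they contain the most recent publication date.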
If the evaluation parameters include data volume, then step S2041, comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule, includes the following steps:
Step S21: clean the data to be evaluated and the preset quality comparison data separately.
Step S22: compare the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data, to obtain a second comparison result.
Specifically, in this embodiment, duplicate data in the data to be evaluated and duplicate data in the preset quality comparison data can be removed according to a preset data cleaning rule. Interfering data, such as dirty data, are likewise removed from both the data to be evaluated and the preset quality comparison data according to the preset data cleaning rule. This yields the cleaned data to be evaluated and the cleaned preset quality comparison data, whose data volumes can then be compared.
Court judgment documents are used as an example below. Assume that judgment document group A is the data to be evaluated and judgment document group B is the preset quality comparison data. First, duplicate data and dirty data are removed from group A and group B according to the preset data cleaning rule, giving the cleaned group A and the cleaned group B. Next, the data volume of group A is determined from the number of judgment documents it contains, and the data volume of group B is determined in the same way. The second comparison result can then be determined from the data volume of group A and the data volume of group B.
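The judgment document example can be sketched as below. The sketch assumes that each document is identified by a hashable key such as a case number and that dirty data can be recognized by a caller-supplied predicate; the names `clean` and `compare_volume`, and the choice to report the two counts together with their ratio, are assumptions made for illustration.

```python
def clean(records, is_dirty):
    """Drop dirty records and duplicates, keeping first occurrences in order."""
    seen = set()
    cleaned = []
    for record in records:
        if is_dirty(record) or record in seen:
            continue
        seen.add(record)
        cleaned.append(record)
    return cleaned


def compare_volume(evaluated, reference, is_dirty):
    """Return (evaluated count, reference count, ratio) after cleaning both."""
    n_evaluated = len(clean(evaluated, is_dirty))
    n_reference = len(clean(reference, is_dirty))
    # The ratio is undefined when the reference set is empty after cleaning.
    ratio = n_evaluated / n_reference if n_reference else None
    return n_evaluated, n_reference, ratio
```

Counting only after cleaning matters here: a group padded with duplicates or empty records would otherwise appear to have a larger data volume than it really has.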
If the evaluation parameters include accuracy, then step S2041, comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule, proceeds as follows:
Specifically, in this embodiment, standard data may be obtained. The data to be evaluated and the preset quality comparison data are then each compared with the standard data, giving the repetition rate between the data to be evaluated and the standard data and the repetition rate between the preset quality comparison data and the standard data. A third comparison result is determined from these two repetition rates, where the third comparison result characterizes the accuracy of the data to be evaluated.
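The accuracy comparison can be sketched as below. The text does not define how the repetition rate is computed, so this example assumes it is the fraction of distinct standard records that also appear in the data set; the names `repetition_rate` and `compare_accuracy` and the "difference" field are hypothetical.

```python
def repetition_rate(records, standard):
    """Fraction of distinct standard records that also occur in `records`.

    Assumption: records are hashable and matched by exact equality.
    """
    standard_set = set(standard)
    if not standard_set:
        return 0.0
    return len(set(records) & standard_set) / len(standard_set)


def compare_accuracy(evaluated, reference, standard):
    """Compute both repetition rates and their difference."""
    rate_eval = repetition_rate(evaluated, standard)
    rate_ref = repetition_rate(reference, standard)
    return {
        "evaluated_rate": rate_eval,
        "reference_rate": rate_ref,
        "difference": rate_eval - rate_ref,
    }


# Hypothetical record identifiers.
standard = ["a", "b", "c", "d"]
third_comparison_result = compare_accuracy(
    evaluated=["a", "b", "c", "x"],
    reference=["a", "b"],
    standard=standard,
)
```

A positive difference would indicate that the data to be evaluated matches the standard data more closely than the comparison data does.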
It should be noted that, in this embodiment, before the quality of the data to be evaluated is evaluated according to the preset evaluation rule, the environment required by the Selenium automated testing tool may be set up in advance on a Linux system, a headless Chrome browser may be installed, and the finished data quality checking program may be deployed on the server. A scheduled script is used alongside it to check the running state of the program automatically and to restart the program automatically if it goes down, so that the entire quality inspection system requires no manual intervention. Once this deployment is complete, the quality of the data to be evaluated can be evaluated according to the preset evaluation rule.
The above evaluation process is described below using judgment documents as an example.
For example, when the judgment documents of a company (the data to be evaluated) are compared, the Selenium automated testing tool may first be used to log in automatically to the competing product (the product of the company to which the preset quality comparison data belongs), with a third-party captcha-solving platform called to handle the verification code. The tool then locates the DOM element holding the judgment document dimension for that company. The information shown on the competing product's page is compared with the data to be evaluated. The comparison covers data volume, data accuracy and data timeliness; the specific comparison process is as described above and is not repeated here. After the comparison is carried out as described, at least one comparison result is obtained, and the data evaluation result of the data to be evaluated is determined from the at least one comparison result.
In this embodiment, after the data evaluation result is obtained as described above, the data quality grade of the data to be evaluated can be determined from the data evaluation result.
If the data evaluation result is greater than or equal to a preset threshold, the data quality of the data to be evaluated is determined to be higher than a preset quality, and a first preset mark is set for the data to be evaluated. If the data evaluation result is less than the preset threshold, data warning information is generated and a second preset mark is set for the data to be evaluated.
Specifically, in this embodiment, it may be determined whether the data evaluation result is greater than or equal to the preset threshold. If it is, a first preset mark is added to the data to be evaluated, for example by rendering the data in green to indicate that its quality is good. If it is below the preset threshold, a second preset mark is added, for example by rendering the data in red to indicate that its quality is problematic. In this embodiment, a scheduled task script may also send an email every day to notify the relevant person in charge of the data quality evaluation.
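The threshold check can be sketched as follows. The `Grade` class, the `grade` function and the wording of the warning message are hypothetical; the green and red marks follow the example given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Grade:
    mark: str                # "green" = first preset mark, "red" = second preset mark
    warning: Optional[str]   # warning information, only set below the threshold


def grade(evaluation_result: float, threshold: float) -> Grade:
    """Assign a mark by comparing the evaluation result with the preset threshold."""
    if evaluation_result >= threshold:
        return Grade(mark="green", warning=None)
    message = (
        f"data quality result {evaluation_result} is below threshold {threshold}"
    )
    return Grade(mark="red", warning=message)


good = grade(0.8, threshold=0.8)   # equal to the threshold counts as passing
bad = grade(0.5, threshold=0.8)
```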
In this embodiment, after the data evaluation result is obtained, a target statistical report may also be generated from it, where the target statistical report includes the data evaluation result and/or historical data evaluation results.
Specifically, in this embodiment, the same day's data evaluation results may be read from the database and aggregated by company, and the comparison details of each dimension for each company may be stored in an Excel document so that the relevant person in charge can review individual cases and use them to identify problems in the current data. The same day's comparison results in the database are then logically deleted so that the newest comparison data can be distinguished. The aggregated results may be stored in another table as the daily historical statistics; from these historical statistics the trend in data quality can be analyzed and various statistical reports can be produced. Finally, the same day's aggregated comparison results may be used as a daily report, so that the relevant person in charge can observe day-to-day changes in data quality.
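The per-company aggregation and logical deletion can be sketched in memory as below. The row layout (`company`, `dimension`, `score`, `deleted`) and the function name `summarize_day` are assumptions for illustration; the text describes a database and an Excel export, which are omitted here.

```python
from collections import defaultdict


def summarize_day(rows):
    """Group today's live comparison rows by company, then logically delete them.

    Each row is a dict with hypothetical keys: company, dimension, score, deleted.
    Logical deletion sets the `deleted` flag instead of removing the row, so that
    the next run only sees newly written comparison results.
    """
    by_company = defaultdict(dict)
    for row in rows:
        if row["deleted"]:
            continue
        by_company[row["company"]][row["dimension"]] = row["score"]
        row["deleted"] = True
    return dict(by_company)


rows = [
    {"company": "A", "dimension": "volume", "score": 0.9, "deleted": False},
    {"company": "A", "dimension": "accuracy", "score": 0.7, "deleted": False},
    {"company": "B", "dimension": "volume", "score": 0.4, "deleted": False},
]
daily_report = summarize_day(rows)
second_run = summarize_day(rows)   # nothing new has been written since
```

Appending each `daily_report` to a history table would give the per-day series from which a quality trend can be read.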
It should be noted that, in this embodiment, the evaluation rule must be built in advance, before the data to be evaluated is obtained. The rule may be established along three dimensions, namely data volume, data accuracy and data timeliness, whose composite score serves as the final evaluation index. The specific building process is as follows:
First, the preset data of each company and the competing product data corresponding to that company are obtained. The preset data and the competing product data are then compared along the three dimensions of data volume, data accuracy and data timeliness to obtain a data comparison result. Once the comparison result of each company has been obtained, the evaluation rule can be determined from the comparison results of all companies.
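One way to form a composite score from the three dimensions is a weighted average, sketched below. The text only says that the composite score is the final evaluation index; the weighted-average form, the equal default weights and the function name `composite_score` are assumptions, and each dimension score is assumed to be normalized to the range 0 to 1.

```python
DIMENSIONS = ("volume", "accuracy", "timeliness")


def composite_score(scores, weights=None):
    """Weighted average of the three dimension scores.

    `scores` and `weights` are dicts keyed by the names in DIMENSIONS.
    Weights default to equal and are normalized by their sum.
    """
    if weights is None:
        weights = {name: 1.0 for name in DIMENSIONS}
    total_weight = sum(weights[name] for name in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(scores[name] * weights[name] for name in DIMENSIONS)
    return weighted / total_weight


example = composite_score(
    {"volume": 1.0, "accuracy": 0.5, "timeliness": 0.5},
    weights={"volume": 0.4, "accuracy": 0.4, "timeliness": 0.2},
)
```

In this form, building the evaluation rule from the per-company comparison results would amount to choosing the weights and the pass threshold.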
As can be seen from the above description, a traditional quality inspection system requires a large amount of manpower and material resources, whereas the method provided by the present invention can monitor data quality in real time without manual participation, which greatly reduces cost and improves efficiency. Further, by establishing effective evaluation rules, existing problems can be targeted so as to improve data quality, which alleviates the technical problem of low evaluation efficiency in existing quality inspection systems.
Embodiment three:
An embodiment of the present invention also provides a data evaluation device, which is mainly used to execute the data evaluation method provided by the above embodiments of the present invention. The data evaluation device provided by the embodiment of the present invention is described in detail below.
Fig. 4 is a schematic diagram of a data evaluation device according to an embodiment of the present invention. As shown in Fig. 4, the data evaluation device mainly includes an acquiring unit 10, an evaluation unit 20 and a determination unit 30, in which:
The acquiring unit 10 is used to obtain data to be evaluated;
The evaluation unit 20 is used to automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule, in combination with the preset quality comparison data of the data to be evaluated, to obtain a data evaluation result;
The determination unit 30 is used to determine the data quality grade of the data to be evaluated according to the data evaluation result.
In the embodiment of the present invention, first, data to be evaluated is obtained, where there may be one or more items of data to be evaluated. Then, the data quality of the data to be evaluated is automatically evaluated according to a preset evaluation rule, in combination with the preset quality comparison data of the data to be evaluated, to obtain a data evaluation result. Finally, the data quality grade of the data to be evaluated is determined according to the data evaluation result. As can be seen from the above description, a traditional quality inspection system requires a large amount of manpower and material resources, whereas the method provided by the present invention can monitor data quality in real time without manual participation, which greatly reduces cost and improves efficiency. Further, by establishing effective evaluation rules, existing problems can be targeted so as to improve data quality, which alleviates the technical problem of low evaluation efficiency in existing quality inspection systems.
Optionally, if the evaluation rule includes evaluation parameters, the evaluation unit is used to: compare the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule to obtain at least one comparison result, where the evaluation parameters include at least one of data volume, data accuracy and data timeliness; and determine the data evaluation result of the data to be evaluated according to the at least one comparison result.
Optionally, if the evaluation parameters include data timeliness, the evaluation unit is also used to: sort the data to be evaluated and the preset quality comparison data by publication time, respectively, to obtain a first sorting result and a second sorting result; and determine the timeliness of the data to be evaluated according to the first sorting result and the second sorting result to obtain a first comparison result.
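The timeliness comparison can be sketched as follows. The text states only that both data sets are sorted by publication time and that timeliness is determined from the two sorting results; measuring timeliness as the gap in days between the newest record of each set is an assumption, as are the record layout and the function names.

```python
from datetime import date


def sort_by_publish_time(records):
    """Sort records newest first by their hypothetical `published` date field."""
    return sorted(records, key=lambda r: r["published"], reverse=True)


def compare_timeliness(evaluated, reference):
    """Return both sorting results and how many days the evaluated data lags.

    A positive lag means the newest evaluated record is older than the newest
    reference record. The lag is None if either data set is empty.
    """
    first = sort_by_publish_time(evaluated)
    second = sort_by_publish_time(reference)
    lag = None
    if first and second:
        lag = (second[0]["published"] - first[0]["published"]).days
    return {"first_sorted": first, "second_sorted": second, "lag_days": lag}


first_comparison_result = compare_timeliness(
    evaluated=[
        {"id": "e1", "published": date(2019, 4, 18)},
        {"id": "e2", "published": date(2019, 4, 20)},
    ],
    reference=[
        {"id": "r1", "published": date(2019, 4, 23)},
        {"id": "r2", "published": date(2019, 4, 19)},
    ],
)
```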
Optionally, if the evaluation parameters include data volume, the evaluation unit is also used to: clean the data to be evaluated and the preset quality comparison data, respectively; and compare the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data to obtain a second comparison result.
Optionally, the device is also used to: extract the data to be evaluated from a target message queue according to a preset automated test rule; and automatically evaluate the data quality of the data to be evaluated according to the preset evaluation rule, in combination with the preset quality comparison data of the data to be evaluated.
Optionally, the determination unit is used to: if the data evaluation result is greater than or equal to a preset threshold, determine that the data quality of the data to be evaluated is higher than a preset quality, and set a first preset mark for the data to be evaluated; if the data evaluation result is less than the preset threshold, generate data warning information and set a second preset mark for the data to be evaluated.
Optionally, the acquiring unit is used to: classify the companies according to the importance of each company to obtain multiple groups; determine, in a preset data set, the data belonging to each group to obtain target data sets; and extract data from each target data set according to a preset extraction ratio to obtain the data to be evaluated.
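The grouping and extraction performed by the acquiring unit can be sketched as stratified sampling. The importance labels, the record layout, the single shared extraction ratio, rounding up within each group and the fixed random seed are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def sample_for_evaluation(importance, dataset, ratio, seed=0):
    """Sample records from each importance group at the given extraction ratio.

    importance: dict mapping company name -> importance level (the grouping).
    dataset:    list of dicts with a hypothetical `company` key.
    ratio:      preset extraction ratio between 0 and 1, applied per group.
    """
    rng = random.Random(seed)   # seeded so the sketch is reproducible
    groups = defaultdict(list)  # importance level -> target data set
    for record in dataset:
        level = importance.get(record["company"])
        if level is not None:
            groups[level].append(record)
    sampled = []
    for level in sorted(groups):
        pool = groups[level]
        k = min(len(pool), math.ceil(len(pool) * ratio))
        sampled.extend(rng.sample(pool, k))
    return sampled


importance = {"A": "high", "B": "high", "C": "low"}
dataset = (
    [{"company": "A", "doc": i} for i in range(4)]
    + [{"company": "B", "doc": i} for i in range(4)]
    + [{"company": "C", "doc": i} for i in range(2)]
)
to_evaluate = sample_for_evaluation(importance, dataset, ratio=0.5)
```

A per-group ratio (for example, sampling important companies more heavily) would fit the same structure by passing a dict of ratios keyed by importance level.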
Optionally, the device is also used to: generate a target statistical report according to the data evaluation result, where the target statistical report includes the data evaluation result and/or historical data evaluation results.
The device provided by the embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiments. For brevity, where the device embodiment does not mention a point, refer to the corresponding content in the foregoing method embodiments.
The present application also provides a computer-readable medium storing non-volatile program code executable by a processor, the program code causing the processor to execute the method of any of the above method embodiments.
In addition, in the description of the embodiments of the present invention, unless otherwise expressly specified and limited, the terms "mounted", "connected" and "coupled" should be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediary, or an internal connection between two elements. Those of ordinary skill in the art can understand the specific meaning of these terms in the present invention according to the specific circumstances.
In the description of the present invention, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they therefore should not be construed as limiting the present invention. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and should not be understood as indicating or implying relative importance.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices or units, and may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
Finally, it should be noted that the embodiments described above are only specific embodiments of the present invention, intended to illustrate its technical solution rather than to limit it, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features. Such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A data evaluation method, characterized by comprising:
obtaining data to be evaluated;
automatically evaluating the data quality of the data to be evaluated according to a preset evaluation rule, in combination with preset quality comparison data of the data to be evaluated, to obtain a data evaluation result;
determining the data quality grade of the data to be evaluated according to the data evaluation result.
2. The method according to claim 1, characterized in that the evaluation rule includes evaluation parameters;
automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule, in combination with the preset quality comparison data of the data to be evaluated, comprises:
comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule to obtain at least one comparison result, wherein the evaluation parameters include at least one of the following: data volume, data accuracy and data timeliness;
determining the data evaluation result of the data to be evaluated according to the at least one comparison result.
3. The method according to claim 2, characterized in that the evaluation parameters include data timeliness;
comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises:
sorting the data to be evaluated and the preset quality comparison data by publication time, respectively, to obtain a first sorting result and a second sorting result;
determining the timeliness of the data to be evaluated according to the first sorting result and the second sorting result to obtain a first comparison result.
4. The method according to claim 2, characterized in that the evaluation parameters include data volume;
comparing the data to be evaluated with the preset quality comparison data according to at least one evaluation parameter in the evaluation rule comprises:
cleaning the data to be evaluated and the preset quality comparison data, respectively;
comparing the data volume of the cleaned data to be evaluated with the data volume of the cleaned preset quality comparison data to obtain a second comparison result.
5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
extracting the data to be evaluated from a target message queue according to a preset automated test rule; and automatically evaluating the data quality of the data to be evaluated according to the preset evaluation rule, in combination with the preset quality comparison data of the data to be evaluated.
6. The method according to any one of claims 1 to 4, characterized in that determining the data quality grade of the data to be evaluated according to the data evaluation result comprises:
if the data evaluation result is greater than or equal to a preset threshold, determining that the data quality of the data to be evaluated is higher than a preset quality, and setting a first preset mark for the data to be evaluated;
if the data evaluation result is less than the preset threshold, generating data warning information and setting a second preset mark for the data to be evaluated.
7. The method according to any one of claims 1 to 4, characterized in that obtaining the data to be evaluated comprises:
classifying the companies according to the importance of each company to obtain multiple groups;
determining, in a preset data set, the data belonging to each group to obtain target data sets;
extracting data from each target data set according to a preset extraction ratio to obtain the data to be evaluated.
8. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
generating a target statistical report according to the data evaluation result, wherein the target statistical report includes the data evaluation result and/or historical data evaluation results.
9. A data evaluation device, characterized by comprising:
an acquiring unit, used to obtain data to be evaluated;
an evaluation unit, used to automatically evaluate the data quality of the data to be evaluated according to a preset evaluation rule, in combination with preset quality comparison data of the data to be evaluated, to obtain a data evaluation result;
a determination unit, used to determine the data quality grade of the data to be evaluated according to the data evaluation result.
10. An electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 8.
CN201910337642.3A 2019-04-24 2019-04-24 A kind of data evaluation method, apparatus and electronic equipment Pending CN110059083A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910337642.3A CN110059083A (en) 2019-04-24 2019-04-24 A kind of data evaluation method, apparatus and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910337642.3A CN110059083A (en) 2019-04-24 2019-04-24 A kind of data evaluation method, apparatus and electronic equipment

Publications (1)

Publication Number Publication Date
CN110059083A true CN110059083A (en) 2019-07-26

Family

ID=67320778

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910337642.3A Pending CN110059083A (en) 2019-04-24 2019-04-24 A kind of data evaluation method, apparatus and electronic equipment

Country Status (1)

Country Link
CN (1) CN110059083A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027821A (en) * 2019-11-25 2020-04-17 泰康保险集团股份有限公司 Service organization evaluation method and device, storage medium and electronic equipment
CN112487453A (en) * 2020-12-07 2021-03-12 马力 Data security sharing method and device based on central coordinator
CN113822602A (en) * 2021-11-22 2021-12-21 武汉龙津科技有限公司 Data value evaluation method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102262678A (en) * 2011-08-16 2011-11-30 郑毅 System for sampling mass data and managing sampled data
US20150310166A1 (en) * 2012-11-28 2015-10-29 Institut National De La Sante Et De La Recherche Medicale (Inserm) Method and system for processing data for evaluating a quality level of a dataset
CN106469395A (en) * 2016-08-31 2017-03-01 国信优易数据有限公司 A kind of data commodity dynamic comprehensive appraisal procedure and system
CN108764705A (en) * 2018-05-24 2018-11-06 国信优易数据有限公司 A kind of data quality accessment platform and method
CN109254959A (en) * 2018-08-17 2019-01-22 广东技术师范学院 A kind of data evaluation method, apparatus, terminal device and readable storage medium storing program for executing

Similar Documents

Publication Publication Date Title
CN105283851B (en) For selecting the cost analysis of tracking target
CN108520357B (en) Method and device for judging line loss abnormality reason and server
CN105283848B (en) Application tracking is carried out with distributed object
CN113792825A (en) Fault classification model training method and device for electricity information acquisition equipment
CN105122212A (en) Periodicity optimization in an automated tracing system
CN110059083A (en) A kind of data evaluation method, apparatus and electronic equipment
CN105122234A (en) Deploying trace objectives using cost analyses
WO2021254027A1 (en) Method and apparatus for identifying suspicious community, and storage medium and computer device
CN110287316A (en) A kind of Alarm Classification method, apparatus, electronic equipment and storage medium
CN110597719B (en) Image clustering method, device and medium for adaptation test
CN108170830B (en) Group event data visualization method and system
CN110659985A (en) Method and device for fishing back false rejection potential user and electronic equipment
CN110706096A (en) Method and device for managing credit line based on salvage-back user and electronic equipment
CN105184886A (en) Cloud data center intelligence inspection system and cloud data center intelligence inspection method
CN109412839A (en) A kind of recognition methods, device, equipment and the storage medium of exception account
CN112598294A (en) Method, device, machine readable medium and equipment for establishing scoring card model on line
CN109787958A (en) Network flow real-time detection method and detection terminal, computer readable storage medium
CN115237804A (en) Performance bottleneck assessment method, performance bottleneck assessment device, electronic equipment, medium and program product
CN114090556B (en) Electric power marketing data acquisition method and system
CN115357629A (en) Processing method, system, electronic device and storage medium for financial data stream
CN111210332A (en) Method and device for generating post-loan management strategy and electronic equipment
CN106528774A (en) Method and apparatus for predicting distribution network project management trend
CN113269378A (en) Network traffic processing method and device, electronic equipment and readable storage medium
CN109064211A (en) Sales service data analysing method, device and server
CN107430590A (en) Data compare

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination