CN104794234A - Data processing method and device for benchmarking - Google Patents

Publication number
CN104794234A
Authority
CN
China
Prior art keywords
index
achievement data
affairs
described multiple
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510226886.6A
Other languages
Chinese (zh)
Other versions
CN104794234B (en)
Inventor
王志强
戴天泽
夏宝亮
张洪奎
梁颖
单晓东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University
Priority to CN201510226886.6A
Publication of CN104794234A
Application granted
Publication of CN104794234B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device for benchmarking. The method comprises: acquiring index data of each of a plurality of indexes in a plurality of transactions, the plurality of indexes being indexes used for benchmarking; and determining, according to the index data of each index in the plurality of transactions, the subset of the indexes that satisfies a preset association rule. The method and device solve the technical problem that enterprise working efficiency is low because the association between different index data cannot be known.

Description

Data processing method and device for benchmarking
Technical field
The present invention relates to the field of data processing, and in particular to a data processing method and device for benchmarking.
Background art
In the related art, management platforms for benchmarking still analyze index data through top-down qualitative analysis and improvement of single indexes, without in-depth analysis. Analyzing all of the index data item by item in a top-down manner inevitably consumes a great deal of time, which reduces the working efficiency of the enterprise.
Furthermore, the inventors found that, in the related art, analyzing index data in isolation cannot reveal the degree of association between indexes, so that scientific decisions about enterprise business cannot be made.
No effective solution to the above problems has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a data processing method and device for benchmarking, to at least solve the technical problem that enterprise working efficiency is low because the association between different index data cannot be known.
According to one aspect of the embodiments of the present invention, a data processing method for benchmarking is provided, comprising: acquiring index data of each of a plurality of indexes in a plurality of transactions, the plurality of indexes being indexes used for benchmarking; and determining, according to the index data of each index in the plurality of transactions, the subset of the indexes that satisfies a preset association rule.
Further, the plurality of indexes form a candidate index set, each index is an element of the candidate index set, and each of the transactions is contained in the candidate index set. Determining the subset of the indexes that satisfies the preset association rule according to the index data of each index in the plurality of transactions comprises: obtaining a preset Apriori algorithm; mining, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, the frequent itemsets that satisfy the preset association rule from all indexes in the candidate index set over the plurality of transactions, a frequent itemset being an itemset in which the support of every item is greater than the minimum support; and taking the indexes involved in the frequent itemsets as the subset of indexes satisfying the preset association rule.
Further, mining the frequent itemsets that satisfy the preset association rule comprises: determining, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, all frequent itemsets of the indexes in the candidate index set over the plurality of transactions; generating strong association rules from all the frequent itemsets, the strong association rules meeting the minimum support requirement; and taking the indexes involved in the strong association rules as the subset of indexes satisfying the preset association rule.
Further, before the indexes involved in the strong association rules are taken as the subset of indexes satisfying the preset association rule, the data processing method also comprises: detecting whether the strong association rules meet the minimum confidence requirement, wherein, if a strong association rule is detected to meet the minimum confidence requirement, the indexes involved in that rule are taken as the subset of indexes satisfying the preset association rule; and/or, if a strong association rule is detected not to meet the minimum confidence requirement, the one or more indexes in that rule that do not meet the minimum confidence requirement are first rejected, and the indexes involved in the association rule that remains after the rejection are then taken as the subset of indexes satisfying the preset association rule.
Further, after the index data of each index in the plurality of transactions are acquired, the data processing method also comprises: detecting whether the acquired index data contain missing values and, if so, filling the missing values before performing the step of determining the subset of indexes that satisfies the preset association rule; and/or detecting whether the acquired index data contain isolated points (outliers) and, if so, filling the isolated points before performing the step of determining the subset of indexes that satisfies the preset association rule.
According to another aspect of the embodiments of the present invention, a data processing device for benchmarking is also provided, comprising: an acquiring unit, configured to acquire index data of each of a plurality of indexes in a plurality of transactions, the plurality of indexes being indexes used for benchmarking; and a determining unit, configured to determine, according to the index data of each index in the plurality of transactions, the subset of the indexes that satisfies a preset association rule.
Further, the plurality of indexes form a candidate index set, each index is an element of the candidate index set, and each of the transactions is contained in the candidate index set. The determining unit comprises: an obtaining module, configured to obtain a preset Apriori algorithm; a mining module, configured to mine, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, the frequent itemsets that satisfy the preset association rule from all indexes in the candidate index set over the plurality of transactions, a frequent itemset being an itemset in which the support of every item is greater than the minimum support; and a determining module, configured to take the indexes involved in the frequent itemsets as the subset of indexes satisfying the preset association rule.
Further, the mining module comprises: a determining submodule, configured to determine, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, all frequent itemsets of the indexes in the candidate index set over the plurality of transactions; a generating submodule, configured to generate strong association rules from all the frequent itemsets, the strong association rules meeting the minimum support requirement; and a determining submodule, configured to take the indexes involved in the strong association rules as the subset of indexes satisfying the preset association rule.
Further, the data processing device also comprises: a first detecting unit, configured to detect, before the indexes involved in the strong association rules are taken as the subset of indexes satisfying the preset association rule, whether the strong association rules meet the minimum confidence requirement, wherein the determining submodule is also configured to take the indexes involved in a strong association rule as the subset of indexes satisfying the preset association rule when that rule is detected to meet the minimum confidence requirement; and/or the determining submodule is also configured, when a strong association rule is detected not to meet the minimum confidence requirement, to first reject the one or more indexes in that rule that do not meet the minimum confidence requirement, and then take the indexes involved in the association rule that remains after the rejection as the subset of indexes satisfying the preset association rule.
Further, the data processing device also comprises: a second detecting unit, configured to detect, after the index data of each index in the plurality of transactions are acquired, whether the acquired index data contain missing values; a first filling unit, configured to fill the missing values when they are detected, after which the determining unit performs its function of determining the subset of indexes that satisfies the preset association rule; and/or a third detecting unit, configured to detect whether the acquired index data contain isolated points; and a second filling unit, configured to fill the isolated points when they are detected, after which the determining unit performs its function of determining the subset of indexes that satisfies the preset association rule.
In the embodiments of the present invention, association rule mining is adopted: the index data of each of a plurality of benchmarking indexes in a plurality of transactions are acquired, and the subset of the indexes that satisfies a preset association rule is determined from those index data. This achieves the purpose of mining, from all the indexes, the subset of indexes that are linked by association rules, thereby improving the working efficiency of the enterprise and solving the technical problem that enterprise working efficiency is low because the association between different index data cannot be known.
Brief description of the drawings
The accompanying drawings described herein provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their description are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an optional data processing method for benchmarking according to an embodiment of the present invention; and
Fig. 2 is a schematic diagram of an optional data processing device for benchmarking according to an embodiment of the present invention.
Detailed description
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the accompanying drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, a method embodiment of a data processing method for benchmarking is provided. It should be noted that the steps shown in the flowchart of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in a different order.
Fig. 1 is a flowchart of an optional data processing method for benchmarking according to an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S102, acquiring index data of each of a plurality of indexes in a plurality of transactions, the plurality of indexes being indexes used for benchmarking; and
Step S104, determining, according to the index data of each index in the plurality of transactions, the subset of the indexes that satisfies a preset association rule.
Here, benchmarking means evaluating an enterprise by indexes and feeding the performance results back to the enterprise, mainly through five categories of evaluation indexes: safety management, asset management, marketing services, power grid operation, and human resources.
In implementation, all indexes used to evaluate several enterprises in the same industry may be taken as the plurality of indexes, and all the indexes used to evaluate one enterprise may be taken as one transaction. Each index in each transaction has corresponding index data, and these index data can be used to select, from all the indexes, the subset of indexes that are linked by association rules.
Actively improving any one of the indexes linked by an association rule can substantially improve the values of the others. For an enterprise such as a power grid enterprise, the indexes actually considered are numerous and complex, so mining the subset of indexes linked by association rules with the data processing method of this embodiment highlights the key indexes. This greatly improves the working efficiency of the enterprise and gives management a clearer, more scientific basis for planning and decision-making.
Optionally, the plurality of indexes may form a candidate index set, in which each index is an element and in which each of the transactions is contained. In this case, determining the subset of the indexes that satisfies the preset association rule according to the index data of each index in the plurality of transactions may comprise:
S2, obtaining a preset Apriori algorithm;
S4, mining, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, the frequent itemsets that satisfy the preset association rule from all indexes in the candidate index set over the plurality of transactions, a frequent itemset being an itemset in which the support of every item is greater than the minimum support; and
S6, taking the indexes involved in the frequent itemsets as the subset of indexes satisfying the preset association rule.
The Apriori algorithm can be used to mine the frequent itemsets of association rules. It mines frequent itemsets in two stages: candidate set generation and downward-closure checking.
For example, the candidate index set (that is, the candidate index library) $I = \{i_1, i_2, \ldots, i_m\}$, which is a set of index items, is determined according to the goals of the grid company. For each management category, such as safety management or financial management, the indexes that reflect departmental performance are established first, and the candidate index set is then determined. The transaction set D is the set of relevant transactions, where each transaction T is a set of index items such that $T \subseteq I$.
To simplify counting, after the index data are acquired, a logical value can be determined for each index in each transaction from its index data. Specifically, if within a given management category the absolute value of a candidate index's assessed rate of change is greater than the absolute value of its performance-target rate of change, the logical value of that candidate index is marked "1"; otherwise it is marked "0". A logical value of "1" indicates that the candidate index is retained; a logical value of "0" indicates that it is rejected.
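The marking rule above can be sketched in Python as follows. This is only an illustration: the function names, the dictionary layout, and the sample rates are assumptions and do not come from the patent.

```python
def logical_value(assessed_rate: float, target_rate: float) -> int:
    """Return 1 if |assessed rate of change| > |target rate of change|, else 0."""
    return 1 if abs(assessed_rate) > abs(target_rate) else 0


def encode_transaction(assessed: dict, target: dict) -> dict:
    """Map every index of one transaction (one enterprise) to its logical value."""
    return {name: logical_value(assessed[name], target[name]) for name in assessed}


# Hypothetical rates of change for two indexes of one enterprise.
row = encode_transaction({"A": 0.08, "B": -0.02}, {"A": 0.05, "B": 0.03})
print(row)  # {'A': 1, 'B': 0}
```

Encoding every transaction this way yields a 0/1 matrix of the kind shown later in Table 1.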
In addition, an association rule is an implication of the form A → B, where A and B are made up of elements of I. For the rule A → B to hold in the transaction set D, two conditions must be satisfied simultaneously: the minimum support threshold and the minimum confidence threshold.
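The two thresholds can be stated with the standard definitions of support and confidence from association rule mining. The patent text does not spell these formulas out, so the notation below (including the names minsup and minconf as symbols) is an assumption, chosen to agree with the worked example later in this description:

```latex
\[
\operatorname{support}(A \rightarrow B)
  = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{confidence}(A \rightarrow B)
  = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}
         {\lvert \{\, T \in D : A \subseteq T \,\} \rvert}.
\]
\[
A \rightarrow B \text{ holds in } D
\iff
\operatorname{support}(A \rightarrow B) \ge \mathit{minsup}
\ \text{and}\
\operatorname{confidence}(A \rightarrow B) \ge \mathit{minconf}.
\]
```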
Benchmarking in the related art concentrates on analyzing single indexes and taking measures to improve index levels one by one. By contrast, this embodiment applies a maximal frequent itemset mining algorithm to the management of enterprise performance indexes for the first time: the Apriori algorithm is used to mine frequent patterns, the subset of indexes satisfying the preset association rule is obtained, and the associations among those indexes are analyzed. This reduces tedious and repetitive index analysis and improves the efficiency of the company. In addition, the maximal frequent itemset of the index items makes clear which index items are important within a given management scope, such as enterprise financial management, and shows how strongly different indexes influence one another, which helps enterprise management take targeted measures and make scientific decisions.
Optionally, mining the frequent itemsets that satisfy the preset association rule from all indexes in the candidate index set over the plurality of transactions, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, may comprise:
S8, determining, according to the preset Apriori algorithm and the index data of each index in the plurality of transactions, all frequent itemsets of the indexes in the candidate index set over the plurality of transactions;
S10, generating strong association rules from all the frequent itemsets, the strong association rules meeting the minimum support requirement;
S12, taking the indexes involved in the strong association rules as the subset of indexes satisfying the preset association rule.
Specifically, the embodiment of the present invention can be implemented according to the following steps:
(1) First count the support of each index over all transactions and sort the indexes by support count. Then remove the infrequent indexes according to the preset minimum support threshold minsup; that is, the indexes with s(A → B) ≥ minsup form the frequent 1-itemsets $L_1$. If $L_1$ is empty ($\emptyset$), stop; otherwise, perform (2).
(2) Combine (that is, join) the frequent 1-itemsets $L_1$ to form the candidate 2-itemsets $C_2$, scan all the 2-itemsets, and find those whose support count meets the minimum support threshold, which are the frequent 2-itemsets $L_2$. If $L_2$ is empty ($\emptyset$), stop; otherwise, perform (3).
(3) Following (2), recursively find the frequent (k-1)-itemsets $L_{k-1}$, k ≥ 2. Find every pair of frequent (k-1)-itemsets that have (k-2) items in common and combine each pair into a k-itemset. Check whether all of the (k-1)-item subsets of each resulting k-itemset appear in $L_{k-1}$; if so, retain the k-itemset, otherwise delete it (pruning).
(4) Check the k-itemsets obtained in (3): those that meet the minimum support threshold form the frequent k-itemsets $L_k$; the others are deleted (pruning).
(5) Return to (3) to find the frequent (k+1)-itemsets, until no further frequent itemsets can be produced.
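Steps (1) to (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: transactions are represented as Python sets of index names, the threshold is given as an absolute support count, and the function name `apriori` and the sample data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset: support count} for every frequent itemset."""
    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Step (1): frequent 1-itemsets.
    items = {i for t in transactions for i in t}
    singles = count({frozenset([i]) for i in items})
    level = {c: n for c, n in singles.items() if n >= min_count}
    frequent = dict(level)

    k = 2
    while level:  # stop as soon as a level is empty
        prev = set(level)
        # Join: two frequent (k-1)-itemsets sharing k-2 items give a k-itemset.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-item subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Support check: keep candidates that meet the minimum support count.
        level = {c: n for c, n in count(candidates).items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent


# Hypothetical transactions over three indexes.
data = [{"A", "B", "F"}, {"A", "B"}, {"A", "F"}, {"B", "F"}, {"A", "B", "F"}]
result = apriori(data, min_count=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

In this sample every single index and every pair is frequent, while the triple ABF occurs in only two transactions and is dropped at the support check.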
Optionally, before the indexes involved in the strong association rules are taken as the subset of indexes satisfying the preset association rule, the data processing method may also comprise:
S14, detecting whether the strong association rules meet the minimum confidence requirement, wherein, if a strong association rule is detected to meet the minimum confidence requirement, the indexes involved in that rule are taken as the subset of indexes satisfying the preset association rule; and/or, if a strong association rule is detected not to meet the minimum confidence requirement, the one or more indexes in that rule that do not meet the minimum confidence requirement are first rejected, and the indexes involved in the association rule that remains after the rejection are then taken as the subset of indexes satisfying the preset association rule.
That is, for all frequent itemsets, the association rules that meet the minimum confidence, namely those with confidence(A → B) ≥ minconf, are found. Rejecting the association rules that do not meet the minimum confidence threshold prevents misjudgments caused by chance. Once the strong association rules between the indexes have been obtained, the number of index items to be analyzed for a given management category can be greatly reduced, and more effective means can be taken, based on the associations between indexes, to improve the grid company's targets in that area.
Preferably, after the index data of each index in the plurality of transactions are acquired, the data processing method may also comprise:
S16, detecting whether the acquired index data contain missing values and, if so, filling the missing values before performing the step of determining the subset of indexes that satisfies the preset association rule; and/or
S18, detecting whether the acquired index data contain isolated points (outliers) and, if so, filling the isolated points before performing the step of determining the subset of indexes that satisfies the preset association rule.
Index data collected in practice often contain some unsatisfactory values, and a good data mining algorithm generally places certain requirements on the data set it processes. Preprocessing the collected index data through S16 and/or S18 therefore helps ensure that the data are complete, that redundancy is low, and that the correlation between data attributes is small.
In practice, the index data in a grid company's benchmarking system may contain missing values and isolated points. A missing value can be filled with the most probable value, for example the mean of the index data of the same class. An isolated point can likewise be filled with the mean of the index data of its class. In implementation, to identify isolated points, the value to be checked can be compared with the values of the same index item at adjacent times against a set threshold: if the difference between the value to be checked and the values of the same index item at adjacent times is greater than the threshold, the value is automatically identified as an isolated point.
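The preprocessing described above can be sketched in Python as follows. The function names and sample series are hypothetical, and two details are assumptions the patent leaves open: a point counts as isolated only if it differs from every available adjacent value by more than the threshold, and the replacement mean is taken over the points that were not flagged.

```python
from statistics import mean


def fill_missing(values):
    """Replace None with the mean of the observed values of the same index."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to average; leave unchanged
    m = mean(observed)
    return [m if v is None else v for v in values]


def replace_isolated(values, threshold):
    """Replace isolated points in a time-ordered series with the series mean."""
    flagged = set()
    for i, v in enumerate(values):
        neighbours = [values[j] for j in (i - 1, i + 1) if 0 <= j < len(values)]
        # Assumption: isolated means far from every adjacent-time value.
        if neighbours and all(abs(v - n) > threshold for n in neighbours):
            flagged.add(i)
    kept = [v for i, v in enumerate(values) if i not in flagged]
    if not kept:
        return list(values)
    m = mean(kept)
    return [m if i in flagged else v for i, v in enumerate(values)]


filled = fill_missing([1.0, None, 3.0])
cleaned = replace_isolated([1.0, 1.2, 9.0, 1.1, 0.9], threshold=3.0)
print(filled)
print(cleaned)
```

Both functions operate on the raw index data of a single index item, so they would run before the data are converted to logical values.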
The present invention is described in detail below with reference to tables and an example.
For example, consider mining the index data of the financial management category of a grid company. For convenience, each index name is represented by a letter from A to J: A represents "total asset turnover", B represents "electricity sales per unit of assets", C represents "contribution margin growth rate", D represents "fixed asset revenue-generating capacity growth rate", E represents "operating revenue growth rate", F represents "cost-to-income ratio", G represents "operating cost per 10,000 yuan of grid assets", H represents "economic value added rate", I represents "management contribution", and J represents "EBITDA profit margin". Each transaction name is represented by a number from 01 to 09: 01 represents Changchun, 02 Jilin, 03 Tonghua, 04 Baicheng, 05 Siping, 06 Yanbian, 07 Liaoyuan, 08 Baishan, and 09 Songyuan. Assume here that the minimum support threshold is 30% and the minimum confidence threshold is 40%. A performance target is determined for each index value, and the index data of the financial management category are processed into the form shown in Table 1:
Table 1 financial management achievement data
TID A B C D E F G H I J
01 1 1 1 1 1 1 0 1 1 1
02 1 1 0 0 0 1 0 1 1 1
03 1 0 0 1 0 1 0 0 0 0
04 0 0 1 1 1 0 1 0 0 0
05 1 1 0 0 0 0 0 0 0 0
06 0 0 0 0 0 0 0 0 0 0
07 0 0 0 0 0 0 1 0 0 0
08 0 0 0 1 1 0 0 0 0 0
09 1 1 0 0 0 1 0 1 0 0
Each item above is a member of the candidate 1-itemsets C1. All transactions are scanned and the number of occurrences of each item is counted, as shown in Table 2:
Table 2
The frequent 1-itemsets L1 are determined according to the minimum support minsup = 30% (that is, the count must reach 2.7, which is rounded up to 3), as shown in Table 3:
Table 3
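The counting and filtering that lead to Tables 2 and 3 can be reproduced from Table 1 with a short Python sketch. The encoding of each row as a string of ten digits and the names `TABLE_1`, `ITEMS`, and `support_counts` are illustrative assumptions.

```python
import math

ITEMS = "ABCDEFGHIJ"

# Rows of Table 1 keyed by TID; each string holds the logical values for A..J.
TABLE_1 = {
    "01": "1111110111",
    "02": "1100010111",
    "03": "1001010000",
    "04": "0011101000",
    "05": "1100000000",
    "06": "0000000000",
    "07": "0000001000",
    "08": "0001100000",
    "09": "1100010100",
}


def support_counts(table):
    """Count, for each index, the transactions whose logical value is 1."""
    return {item: sum(int(row[k]) for row in table.values())
            for k, item in enumerate(ITEMS)}


counts = support_counts(TABLE_1)
# 30% of 9 transactions is 2.7, so an index needs a count of at least 3.
min_count = math.ceil(0.30 * len(TABLE_1))
frequent_1 = [item for item in ITEMS if counts[item] >= min_count]

print(counts)      # A: 5, B: 4, C: 2, D: 4, E: 3, F: 4, G: 2, H: 3, I: 2, J: 2
print(frequent_1)  # ['A', 'B', 'D', 'E', 'F', 'H']
```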
The join $L_1 \bowtie L_1$ produces the candidate 2-itemsets C2, formed by taking every combination of 2 items from L1, which gives $\binom{6}{2} = 15$ 2-itemsets in total. The index database in Table 1 is scanned again to compute the support count of each candidate in C2 and to determine the frequent 2-itemsets L2 that meet the minimum support threshold. Likewise, L2 is joined with itself to produce the candidate 3-itemsets C3, which are pruned according to minsup to obtain the frequent 3-itemsets L3. L3 is then joined with itself to produce the candidate 4-itemsets C4, which are pruned according to minsup to obtain the frequent 4-itemsets L4. When $L_4 \bowtie L_4$ is used to produce candidate 5-itemsets, none is found that meets the minimum support threshold, so the search stops. The mining process above yields the following frequent itemsets:
Frequent 2-itemsets L2: AB, AF, AH, BF, BH, DE, FH;
Frequent 3-itemsets L3: ABF, ABH, AFH, BFH;
Frequent 4-itemsets L4: ABFH.
The confidences of the different association rules in this frequent itemset are calculated as follows:
Confidence(A→BFH) = 3/5 = 60%, Confidence(B→AFH) = 3/4 = 75%,
Confidence(F→ABH) = 3/4 = 75%, Confidence(H→ABF) = 3/3 = 100%,
Confidence(AB→FH) = 3/4 = 75%, Confidence(AF→BH) = 3/4 = 75%,
Confidence(AH→BF) = 3/3 = 100%, Confidence(BF→AH) = 3/3 = 100%,
Confidence(BH→AF) = 3/3 = 100%, Confidence(FH→AB) = 3/3 = 100%,
Confidence(ABF→H) = 3/3 = 100%, Confidence(AFH→B) = 3/3 = 100%,
Confidence(ABH→F) = 3/3 = 100%, Confidence(BFH→A) = 3/3 = 100%.
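The confidence of a rule X → Y is the support count of the whole itemset divided by the support count of the antecedent X. The following Python sketch enumerates every rule that can be formed from the frequent 4-itemset ABFH over the Table 1 data. The helper names `support_count` and `rules_from` are hypothetical and serve only as an illustration; the sketch assumes the itemset occurs in at least one transaction, so that no antecedent has a zero count.

```python
from itertools import combinations

# Table 1 as sets of the indexes marked 1 (transactions 01 to 09).
TRANSACTIONS = [
    set("ABCDEFHIJ"), set("ABFHIJ"), set("ADF"),
    set("CDEG"), set("AB"), set(),
    set("G"), set("DE"), set("ABFH"),
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from(itemset, transactions):
    """Yield (antecedent, consequent, confidence) for each non-trivial split."""
    itemset = frozenset(itemset)
    whole = support_count(itemset, transactions)
    for size in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), size):
            lhs = frozenset(lhs)
            yield lhs, itemset - lhs, whole / support_count(lhs, transactions)


confidence = {
    "".join(sorted(lhs)): conf
    for lhs, rhs, conf in rules_from("ABFH", TRANSACTIONS)
}
MIN_CONFIDENCE = 0.40
strong = {lhs for lhs, conf in confidence.items() if conf >= MIN_CONFIDENCE}
print(len(confidence), min(confidence.values()))
```

Fourteen rules are generated, and the lowest confidence among them is that of A → BFH (3/5), so every rule clears the assumed 40% minimum confidence threshold.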
It can be seen that every association rule formed from the frequent 4-itemset ABFH has a confidence of at least 60%, which indicates a very strong association. The association rules mined from the index data show that, for the four index items in the maximal frequent itemset {total asset turnover, electricity sales per unit of assets, cost-to-income ratio, economic value added rate}, vigorously improving any one of them can largely improve the other index values. These ten indexes cover only the financial management aspect. Given that the index factors a power grid enterprise actually has to consider are numerous and complicated, the data mining technique of the embodiment of the present invention can highlight the key indexes, thereby greatly improving the work efficiency of the enterprise and giving management a clearer plan for decision-making.
Embodiment 2
According to an embodiment of the present invention, a device embodiment of a data processing device for benchmarking is provided.
Fig. 2 is a schematic diagram of an optional data processing device for benchmarking according to an embodiment of the present invention. The device comprises an acquiring unit 10 and a determining unit 20. The acquiring unit 10 is configured to acquire the index data of each of multiple indexes in multiple transactions, the multiple indexes being indexes for benchmarking. The determining unit 20 is configured to determine, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy a preset association rule.
Here, benchmarking means evaluating enterprises by indexes and feeding the performance results back to the enterprises, mainly through five categories of evaluation indexes: safety management, asset management, marketing service, power grid operation, and human resources.
In implementation, all indexes used to evaluate several enterprises in the same industry can be acquired as the above multiple indexes, and all indexes used to evaluate each enterprise can be taken as one transaction. Each index corresponds to index data in each transaction, and these index data can be used to select, from all the indexes, the indexes that satisfy the association rule.
Because vigorously improving any one of the associated indexes can largely improve the other index values, for an enterprise (such as a power grid enterprise) whose actual index factors are numerous and complicated, mining the associated indexes with the data processing method of the embodiment of the present invention highlights the key indexes, thereby greatly improving the work efficiency of the enterprise and providing clearer and more scientific planning for management decisions.
Optionally, the multiple indexes form a candidate index set, each of the multiple indexes is an element of the candidate index set, and each of the multiple transactions is contained in the candidate index set. The above determining unit may comprise an acquisition module, a mining module and a determination module. The acquisition module is configured to acquire a preset Apriori algorithm. The mining module is configured to mine, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, the frequent itemsets that satisfy the preset association rule from all indexes of the candidate index set in the multiple transactions, a frequent itemset being an itemset in which the support of every item is greater than the minimum support. The determination module is configured to take the indexes involved in the frequent itemsets as the indexes that satisfy the preset association rule.
The Apriori algorithm can be used to mine the frequent itemsets of association rules. It mines frequent itemsets in two stages: candidate set generation and downward-closure checking.
For example, the candidate index set (i.e. the candidate index library) I = {i1, i2, ..., im}, which is the set of index items, is determined according to the targets of the grid company. For each management category, such as safety management or financial management, the indexes that reflect the quality of the department's work are established first, and the candidate index set is then determined. The transaction set D is the set of relevant transactions, where each transaction T is a set of index items such that T ⊆ I.
To make counting easier, after the index data are acquired, the logical value corresponding to the index data of each index in each transaction can be determined. Specifically, if, in a given management category, the absolute value of the assessed rate of change of a candidate index is greater than the absolute value of the rate of change of its performance target, the logical value of that candidate index is marked "1"; otherwise it is marked "0". A logical value of "1" means the candidate index is retained; a logical value of "0" means the candidate index is rejected.
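The marking rule above amounts to a simple comparison of absolute values. A minimal Python sketch follows; the function names and the sample rates of change are hypothetical and are not taken from the patent.

```python
def logical_value(assessed_change_rate, target_change_rate):
    """Return 1 if |assessed rate of change| > |target rate of change|, else 0."""
    return 1 if abs(assessed_change_rate) > abs(target_change_rate) else 0


def to_transaction(assessed, targets):
    """Build one transaction: the set of indexes whose logical value is 1."""
    return {
        name
        for name, rate in assessed.items()
        if logical_value(rate, targets[name]) == 1
    }


# Hypothetical rates of change for three indexes of one enterprise.
assessed = {"A": 0.08, "B": -0.06, "C": 0.01}
targets = {"A": 0.05, "B": 0.05, "C": 0.05}
print(sorted(to_transaction(assessed, targets)))
```

Each enterprise processed this way yields one row of 0/1 values of the kind shown in Table 1.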
In addition, an association rule is an implication of the form A → B, where A and B are composed of elements of I. For the rule A → B to hold in the transaction set D, two conditions must be satisfied simultaneously: the minimum support threshold and the minimum confidence threshold.
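The two conditions can be written out with the usual definitions of support and confidence. The notation σ(X), for the number of transactions in D that contain every item of X, and the symbol minconf are assumptions introduced here for illustration; the worked values use the rule A → BFH from Table 1 of Embodiment 1.

```latex
s(A \rightarrow B) = \frac{\sigma(A \cup B)}{|D|} \ge \mathrm{minsup},
\qquad
c(A \rightarrow B) = \frac{\sigma(A \cup B)}{\sigma(A)} \ge \mathrm{minconf}

% Worked example with |D| = 9, minsup = 30%, minconf = 40%:
s(A \rightarrow BFH) = \frac{3}{9} \approx 33.3\% \ge 30\%,
\qquad
c(A \rightarrow BFH) = \frac{3}{5} = 60\% \ge 40\%
```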
Compared with benchmarking in the related art, which focuses on analysing single indexes and takes measures to raise the level of the indexes one by one, the embodiment of the present invention is the first to apply a maximal frequent itemset mining algorithm to enterprise performance index management. The Apriori algorithm is used to mine frequent patterns, the indexes that satisfy the preset association rule are finally obtained, and the association between these indexes is analysed. This reduces complicated and repetitive index analysis and improves the efficiency of the company. Moreover, the maximal frequent itemset of index items makes clear which index items are important in a given area of management, such as enterprise financial management, and shows how strongly different indexes influence one another, which helps enterprise management take targeted measures and make scientific decisions.
Further optionally, the above mining module may comprise a determination submodule, a generation submodule and a further determination submodule. One determination submodule is configured to determine, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, all frequent itemsets from all indexes of the candidate index set in the multiple transactions. The generation submodule is configured to generate strong association rules from all the frequent itemsets, the strong association rules satisfying the minimum support requirement. The other determination submodule is configured to take the indexes involved in the strong association rules as the indexes that satisfy the preset association rule.
Specifically, the embodiment of the present invention can be implemented in the following steps:
(1) First count the support of each index over all transactions, then sort the indexes by support count, and then remove the non-frequent indexes according to the preset minimum support threshold minsup; that is, the indexes with s(A → B) ≥ minsup form the frequent 1-itemset L1. If L1 is empty, stop; if L1 is not empty, perform (2).
(2) Combine (i.e. join) the items of the frequent 1-itemset L1 to form the candidate 2-itemsets C2, scan all the 2-itemsets, and find those whose support count satisfies the minimum support threshold, i.e. the frequent 2-itemsets L2. If L2 is empty, stop; if L2 is not empty, perform (3).
(3) Proceeding recursively as in (2), find the frequent (k-1)-itemsets Lk-1, k ≥ 2; combine any two frequent (k-1)-itemsets that have (k-2) items in common into a k-itemset; then check whether every (k-1)-item subset of that k-itemset appears in the frequent (k-1)-itemsets Lk-1. If so, keep the k-itemset; otherwise delete it (i.e. prune).
(4) Check the k-itemsets obtained in (3): those that satisfy the minimum support threshold form the frequent k-itemsets Lk; the others are deleted (i.e. pruned).
(5) Return to (3) to find the frequent (k+1)-itemsets, until no more frequent itemsets can be produced.
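Steps (1) to (5) can be sketched as a level-wise loop in Python. This is a compact illustration of the join, prune and support-check stages under the assumption that transactions are held in memory as sets; the function name `apriori` and the variable names are hypothetical, and the sketch is not the patent's implementation.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {k: set of frequent k-itemsets (frozensets)} for k = 1, 2, ..."""

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Step (1): frequent 1-itemsets.
    items = sorted({i for t in transactions for i in t})
    level = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}

    frequent, k = {}, 1
    while level:
        frequent[k] = level
        k += 1
        # Join: two (k-1)-itemsets sharing k-2 items give a k-itemset.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-item subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Support check against the transaction set.
        level = {c for c in candidates if count(c) >= min_count}
    return frequent


# Table 1 of Embodiment 1 as sets of the indexes marked 1.
TRANSACTIONS = [
    set("ABCDEFHIJ"), set("ABFHIJ"), set("ADF"),
    set("CDEG"), set("AB"), set(),
    set("G"), set("DE"), set("ABFH"),
]

frequent = apriori(TRANSACTIONS, min_count=3)
for k, itemsets in frequent.items():
    print(k, sorted("".join(sorted(s)) for s in itemsets))
```

Run on the Table 1 data with a minimum count of 3, the loop stops after the 4-itemsets, matching the frequent itemsets L2, L3 and L4 listed in Embodiment 1.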
Further optionally, the above data processing device may further comprise a first detecting unit. The first detecting unit is configured to detect, before the indexes involved in the strong association rules are taken as the indexes that satisfy the preset association rule, whether the strong association rules satisfy the minimum confidence requirement. When it is detected that a strong association rule satisfies the minimum confidence requirement, the determination submodule is further configured to take the indexes involved in that strong association rule as the indexes that satisfy the preset association rule; and/or, when it is detected that a strong association rule does not satisfy the minimum confidence requirement, the determination submodule is further configured to first reject the one or more indexes in the strong association rule that do not satisfy the minimum confidence requirement, and then take the indexes involved in the association rule remaining after the rejection as the indexes that satisfy the preset association rule.
That is, for all frequent itemsets, the association rules that satisfy the minimum confidence, i.e. the strong association rules, are found. Rejecting the association rules that do not satisfy the minimum confidence threshold prevents misjudgements caused by chance. Once the strong association rules between the indexes have been obtained, the number of index items to be analysed for a given management category can be greatly reduced, and more effective measures can be taken, based on the associations between indexes, to improve the grid company's targets in a given aspect.
Optionally, the above data processing device may further comprise a second detecting unit and a first filling unit; or a third detecting unit and a second filling unit; or a second detecting unit, a first filling unit, a third detecting unit and a second filling unit. The second detecting unit is configured to detect, after the index data of each of the multiple indexes in the multiple transactions are acquired, whether the acquired index data contain missing values. The first filling unit is configured to fill the missing values when it is detected that the acquired index data contain missing values, after which the determining unit performs the function of determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy the preset association rule. The third detecting unit is configured to detect whether the acquired index data contain isolated-point data. The second filling unit is configured to fill the isolated-point data when it is detected that the acquired index data contain isolated-point data, after which the determining unit performs the same determining function.
Index data acquired in practice often contain some unsatisfactory data, and a good data mining algorithm generally places certain requirements on the data set it processes. Preprocessing the collected index data with the above functional units therefore helps ensure that the data are complete, that data redundancy is low, and that the correlation between data attributes is small.
In practice, the index data in a grid company's benchmarking system may contain missing values and isolated-point data. When the index data contain missing values, they can be filled with the most probable value, for example the mean of the index data of the same category. When the index data contain isolated-point data, the mean of the index data of the same category can likewise be used for filling. In implementation, isolated-point data can be identified by comparing the data to be checked with the data of the same index item at adjacent times and setting a threshold: if the difference between the data to be checked and the data of that index item at the adjacent time exceeds the threshold, the data are automatically identified as isolated-point data.
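The preprocessing described above can be sketched as follows. The sketch makes several assumptions that the text does not fix: a missing value is represented by `None`, "the same category" is taken to be the time series of a single index, a point counts as isolated only if it differs from every adjacent-period value by more than the threshold, and the replacement mean is computed over the non-isolated values. All function names are hypothetical.

```python
from statistics import mean


def fill_missing(series):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    avg = mean(observed)  # raises StatisticsError if nothing was observed
    return [avg if v is None else v for v in series]


def find_isolated_points(series, threshold):
    """Positions whose value differs from all adjacent values by more than threshold."""
    flagged = []
    for i, value in enumerate(series):
        neighbours = [series[j] for j in (i - 1, i + 1) if 0 <= j < len(series)]
        if neighbours and all(abs(value - n) > threshold for n in neighbours):
            flagged.append(i)
    return flagged


def replace_isolated_points(series, threshold):
    """Replace each isolated point with the mean of the remaining values."""
    flagged = set(find_isolated_points(series, threshold))
    avg = mean(v for i, v in enumerate(series) if i not in flagged)
    return [avg if i in flagged else v for i, v in enumerate(series)]


# Hypothetical monthly values of one index.
raw = [10, 11, 50, 12, 11]
print(find_isolated_points(raw, threshold=20))
print(replace_isolated_points(raw, threshold=20))
print(fill_missing([1.0, None, 3.0]))
```

Requiring the difference to exceed the threshold on both sides avoids flagging the ordinary neighbours of a spike; a one-sided comparison is an equally plausible reading of the text and would flag more points.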
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be through certain interfaces, and the indirect coupling or communication connection between units or modules may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a portable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. A data processing method for benchmarking, characterized by comprising:
acquiring the index data of each of multiple indexes in multiple transactions, the multiple indexes being indexes for benchmarking; and
determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy a preset association rule.
2. The data processing method according to claim 1, characterized in that the multiple indexes form a candidate index set, each of the multiple indexes is an element of the candidate index set, each of the multiple transactions is contained in the candidate index set, and determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy the preset association rule comprises:
acquiring a preset Apriori algorithm;
mining, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, the frequent itemsets that satisfy the preset association rule from all indexes of the candidate index set in the multiple transactions, a frequent itemset being an itemset in which the support of every item is greater than the minimum support; and
taking the indexes involved in the frequent itemsets as the indexes that satisfy the preset association rule.
3. The data processing method according to claim 2, characterized in that mining, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, the frequent itemsets that satisfy the preset association rule from all indexes of the candidate index set in the multiple transactions comprises:
determining, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, all frequent itemsets from all indexes of the candidate index set in the multiple transactions;
generating strong association rules from all the frequent itemsets, the strong association rules satisfying the minimum support requirement;
taking the indexes involved in the strong association rules as the indexes that satisfy the preset association rule.
4. The data processing method according to claim 3, characterized in that, before the indexes involved in the strong association rules are taken as the indexes that satisfy the preset association rule, the data processing method further comprises:
detecting whether the strong association rules satisfy a minimum confidence requirement,
wherein, if it is detected that a strong association rule satisfies the minimum confidence requirement, the indexes involved in that strong association rule are taken as the indexes that satisfy the preset association rule; and/or, if it is detected that a strong association rule does not satisfy the minimum confidence requirement, the one or more indexes in the strong association rule that do not satisfy the minimum confidence requirement are first rejected, and the indexes involved in the association rule remaining after the rejection are then taken as the indexes that satisfy the preset association rule.
5. The data processing method according to claim 1, characterized in that, after the index data of each of the multiple indexes in the multiple transactions are acquired, the data processing method further comprises:
detecting whether the acquired index data contain missing values and, if it is detected that the acquired index data contain missing values, filling the missing values, and then performing the step of determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy the preset association rule; and/or
detecting whether the acquired index data contain isolated-point data and, if it is detected that the acquired index data contain isolated-point data, deleting and filling the isolated-point data, and then performing the step of determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy the preset association rule.
6. A data processing device for benchmarking, characterized by comprising:
an acquiring unit, configured to acquire the index data of each of multiple indexes in multiple transactions, the multiple indexes being indexes for benchmarking; and
a determining unit, configured to determine, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy a preset association rule.
7. The data processing device according to claim 6, characterized in that the multiple indexes form a candidate index set, each of the multiple indexes is an element of the candidate index set, each of the multiple transactions is contained in the candidate index set, and the determining unit comprises:
an acquisition module, configured to acquire a preset Apriori algorithm;
a mining module, configured to mine, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, the frequent itemsets that satisfy the preset association rule from all indexes of the candidate index set in the multiple transactions, a frequent itemset being an itemset in which the support of every item is greater than the minimum support; and
a determination module, configured to take the indexes involved in the frequent itemsets as the indexes that satisfy the preset association rule.
8. The data processing device according to claim 7, characterized in that the mining module comprises:
a determination submodule, configured to determine, according to the preset Apriori algorithm and the index data of each of the multiple indexes in the multiple transactions, all frequent itemsets from all indexes of the candidate index set in the multiple transactions;
a generation submodule, configured to generate strong association rules from all the frequent itemsets, the strong association rules satisfying the minimum support requirement;
a determination submodule, configured to take the indexes involved in the strong association rules as the indexes that satisfy the preset association rule.
9. The data processing device according to claim 8, characterized in that the data processing device further comprises:
a first detecting unit, configured to detect, before the indexes involved in the strong association rules are taken as the indexes that satisfy the preset association rule, whether the strong association rules satisfy a minimum confidence requirement,
wherein the determination submodule is further configured to, when it is detected that a strong association rule satisfies the minimum confidence requirement, take the indexes involved in that strong association rule as the indexes that satisfy the preset association rule; and/or the determination submodule is further configured to, when it is detected that a strong association rule does not satisfy the minimum confidence requirement, first reject the one or more indexes in the strong association rule that do not satisfy the minimum confidence requirement, and then take the indexes involved in the association rule remaining after the rejection as the indexes that satisfy the preset association rule.
10. The data processing device according to claim 6, characterized in that the data processing device further comprises:
a second detecting unit, configured to detect, after the index data of each of the multiple indexes in the multiple transactions are acquired, whether the acquired index data contain missing values; and a first filling unit, configured to fill the missing values when it is detected that the acquired index data contain missing values, after which the determining unit performs the function of determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy the preset association rule; and/or
a third detecting unit, configured to detect whether the acquired index data contain isolated-point data; and a second filling unit, configured to fill the isolated-point data when it is detected that the acquired index data contain isolated-point data, after which the determining unit performs the function of determining, according to the index data of each of the multiple indexes in the multiple transactions, the indexes among the multiple indexes that satisfy the preset association rule.
CN201510226886.6A 2015-05-06 2015-05-06 Data processing method and device for benchmarking Active CN104794234B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510226886.6A CN104794234B (en) 2015-05-06 2015-05-06 Data processing method and device for benchmarking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510226886.6A CN104794234B (en) 2015-05-06 2015-05-06 Data processing method and device for benchmarking

Publications (2)

Publication Number Publication Date
CN104794234A true CN104794234A (en) 2015-07-22
CN104794234B CN104794234B (en) 2019-02-15

Family

ID=53559026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510226886.6A Active CN104794234B (en) 2015-05-06 2015-05-06 Data processing method and device for benchmarking

Country Status (1)

Country Link
CN (1) CN104794234B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110004631A1 (en) * 2008-02-26 2011-01-06 Akihiro Inokuchi Frequent changing pattern extraction device
CN101546912A (en) * 2009-04-28 2009-09-30 江苏省电力试验研究院有限公司 Same power network line loss classifying and assessing method
CN103440539A (en) * 2013-09-13 2013-12-11 国网信息通信有限公司 Method for processing electricity consumption data of consumers
CN103903094A (en) * 2014-03-28 2014-07-02 国家电网公司 System and method for bearing capacity evaluation of power grid enterprise

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
徐华: "《数据挖掘 方法与应用》", 31 October 2014, 清华大学出版社 *
顾敏奕: "数据挖掘技术在电网企业中的应用需求分析", 《上海电力学院学报》 *
马刚: "《商业智能》", 31 July 2010, 东北财经大学出版社 *
黄宜华: "《深入理解大数据 大数据处理与编程实践》", 31 August 2014, 机械工业出版社 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845833A (en) * 2017-01-20 2017-06-13 广东广业开元科技有限公司 A kind of processing system based on the compound making assessments of the performance of each enterprise's achievement of big data
CN108461127A (en) * 2018-01-12 2018-08-28 平安科技(深圳)有限公司 Medical data relationship image acquiring method, device, terminal device and storage medium
CN108461127B (en) * 2018-01-12 2020-10-20 平安科技(深圳)有限公司 Medical data relation image acquisition method and device, terminal equipment and storage medium
CN109656969A (en) * 2018-11-16 2019-04-19 北京奇虎科技有限公司 Data unusual fluctuation analysis method and device
CN110459276A (en) * 2019-08-15 2019-11-15 北京嘉和海森健康科技有限公司 A kind of data processing method and relevant device
CN110459276B (en) * 2019-08-15 2022-05-24 北京嘉和海森健康科技有限公司 Data processing method and related equipment

Also Published As

Publication number Publication date
CN104794234B (en) 2019-02-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant