CN105590167A - Method and device for analyzing electric field multivariate operating data - Google Patents

Publication number
CN105590167A
Authority
CN
China
Prior art keywords
standardization
frequent
electric field
data
membership
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510956727.1A
Other languages
Chinese (zh)
Inventor
刘辉
赵宇思
吴林林
崔正湃
刘晓鹏
徐海翔
任巍曦
王若阳
王皓靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
North China Electric Power Research Institute Co Ltd
Electric Power Research Institute of State Grid Jibei Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
North China Electric Power Research Institute Co Ltd
Electric Power Research Institute of State Grid Jibei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, North China Electric Power Research Institute Co Ltd, and Electric Power Research Institute of State Grid Jibei Electric Power Co Ltd
Priority to CN201510956727.1A
Publication of CN105590167A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393: Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and device for analyzing electric field multivariate operating data. The method comprises: standardizing each index data of the electric field multivariate operating data to generate a standardized item set of the index data; fuzzifying the standardized item set using normal distribution membership functions to generate fuzzy sets, and determining the membership value of each fuzzy set; determining the frequent 1-itemsets according to the membership value of each index data in each fuzzy set; iteratively identifying and computing the frequent itemsets from the determined frequent 1-itemsets; and generating the association rules of the frequent itemsets using the rule-construction steps of the Apriori algorithm, the association rules being the result of the electric field multivariate data analysis. The proposed improved fuzzy-function association rule mining algorithm is introduced into the analysis of wind farm multivariate operating data. Boolean association rules are converted into quantitative association rules through the concept of partitioning, and the normal distribution membership function from fuzzy mathematics is introduced so that the partition boundaries are reasonable and effective.

Description

Method and device for analyzing electric field multivariate operating data
Technical field
The present invention relates to electric power technology, and in particular to a method and device for analyzing electric field multivariate operating data.
Background art
With the rapid development of wind power, the steady advance of power equipment automation, and the arrival of the big data era, wind power operating data has accumulated in massive volume and in many types. Existing methods for analyzing operating data still concentrate on statistical analysis, including probability distributions, regression analysis, and variance, so the potential correlations within the data are not fully exploited.
By the task they perform, data mining techniques fall into five broad classes: classification, prediction, association rule analysis, cluster analysis, and outlier analysis. Association rule mining is an important class of algorithms in data mining that focuses on determining the relationships between different data items. R. Agrawal et al. first proposed Boolean association rule mining in 1993 (the basis of the Apriori algorithm). However, the values handled by Boolean association rules are all discrete and categorical (that is, each data value can only be 0 or 1), and the rules only express relationships among such variables.
Summary of the invention
To quantify the degree of influence of each factor in electric field multivariate data, an embodiment of the present invention provides a method for analyzing electric field multivariate operating data, the method comprising:
Standardizing each index data of the electric field multivariate operating data to generate a standardized transaction item set of each index data;
Partitioning and fuzzifying the standardized transaction item set using normal distribution membership functions to generate fuzzy sets, and determining the membership value of each fuzzy set;
Determining the frequent 1-itemsets according to the membership value of each index data in each fuzzy set;
Iteratively identifying and computing the frequent K-itemsets from the determined frequent 1-itemsets;
Determining the association rules of the frequent K-itemsets and their confidence using the rule-construction steps of the Apriori algorithm, and generating the electric field multivariate data analysis result.
In an embodiment of the present invention, standardizing each index data of the electric field multivariate operating data comprises:
Standardizing the electric field multivariate operating data using the extreme value (min-max) standardization method.
In an embodiment of the present invention, fuzzifying the standardized transaction item set using normal distribution membership functions to generate fuzzy sets, and determining the membership value of each fuzzy set, comprises:
Determining the expected value of each index data in the standardized transaction item set;
Partitioning each index data in the standardized transaction item set according to the expected value of each index data and the maximum and minimum values of each index data in the standardized transaction item set;
Converting the standardized transaction item set into fuzzy sets according to the preset normal distribution membership function of each partition, and determining the membership values of the fuzzy sets.
In an embodiment of the present invention, the preset normal distribution membership functions of the partitions are:
$$ r_{high}(t_{ij}) = \begin{cases} 1, & t_{ij} \le \mu_{\min} \\ e^{-\frac{(t_{ij}-\mu_{\min})^{2}}{2\sigma_{high}^{2}}}, & \mu_{\min} \le t_{ij} \le \mu_{\max},\quad \sigma_{high} = \dfrac{\mu_{0}-\mu_{\min}}{3} \end{cases} $$
$$ r_{middle}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_{0}-t_{ij})^{2}}{2\sigma_{middle1}^{2}}}, & \mu_{\min} \le t_{ij} \le \mu_{0},\quad \sigma_{middle1} = \dfrac{\mu_{0}-\mu_{\min}}{3} \\ e^{-\frac{(t_{ij}-\mu_{0})^{2}}{2\sigma_{middle2}^{2}}}, & \mu_{0} \le t_{ij} \le \mu_{\max},\quad \sigma_{middle2} = \dfrac{\mu_{\max}-\mu_{0}}{3} \end{cases} $$
$$ r_{low}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_{\max}-t_{ij})^{2}}{2\sigma_{low}^{2}}}, & \mu_{\min} \le t_{ij} \le \mu_{\max},\quad \sigma_{low} = \dfrac{\mu_{\max}-\mu_{0}}{3} \\ 1, & t_{ij} \ge \mu_{\max} \end{cases} $$
where $t_{ij}$ is the standardized data in the standardized transaction item set, $\mu_{\min}$ and $\mu_{\max}$ are the minimum and maximum values of each index data in the standardized transaction item set, and $\mu_{0}$ is the expected value of the index data.
In an embodiment of the present invention, determining the frequent 1-itemsets according to the membership value of each index data in each fuzzy set comprises:
Computing the membership weight of each index data in its corresponding fuzzy set;
Placing a fuzzy set into the frequent 1-itemsets when its membership weight is judged to be not less than a preset minimum support.
The present invention also provides a device for analyzing electric field multivariate operating data, the device comprising:
A standardization module, configured to standardize each index data of the electric field multivariate operating data and generate a standardized transaction item set of each index data;
A fuzzy processing module, configured to fuzzify the standardized transaction item set using normal distribution membership functions to generate fuzzy sets, and to determine the membership value of each fuzzy set;
A frequent 1-itemset generation module, configured to determine the frequent 1-itemsets according to the membership value of each index data in each fuzzy set;
A frequent K-itemset generation module, configured to iteratively identify and compute the frequent K-itemsets from the determined frequent 1-itemsets;
An analysis result generation module, configured to determine the association rules of the frequent K-itemsets and their confidence using the rule-construction steps of the Apriori algorithm, and to generate the electric field multivariate data analysis result.
In an embodiment of the present invention, the standardization module standardizes the electric field multivariate operating data using the extreme value standardization method.
In an embodiment of the present invention, the fuzzy processing module comprises:
An expected value determining unit, configured to determine the expected value of each index data in the standardized transaction item set;
A partitioning unit, configured to partition the standardized transaction item set according to the expected value of each index data and the maximum and minimum values of each index data in the standardized transaction item set;
A fuzzy set generation unit, configured to convert the standardized transaction item set into fuzzy sets according to the preset normal distribution membership function of each partition and to determine the membership values of the fuzzy sets.
In an embodiment of the present invention, the frequent 1-itemset generation module comprises:
A membership weight generation unit, configured to compute the membership weight of each index data in its corresponding fuzzy set;
A judging unit, configured to place a fuzzy set into the frequent 1-itemsets when its membership weight is judged to be not less than the preset minimum support.
The present invention proposes an improved fuzzy-function association rule mining algorithm and introduces it into the analysis of massive multivariate wind farm operating data. Through the concept of partitioning, the algorithm converts Boolean association rules into quantitative association rules, and it introduces the normal distribution function concept from fuzzy mathematics so that the partition boundaries are more reasonable and effective.
To make the above and other objects, features, and advantages of the present invention clearer, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the basic model of association rules;
Fig. 2 is a flow chart of the electric field multivariate operating data analysis method provided by an embodiment of the present invention;
Fig. 3 is a flow chart of the fuzzy-function association rule algorithm in an embodiment of the present invention;
Fig. 4 shows the membership functions in an embodiment of the present invention;
Fig. 5 is a statistical chart of the annual power generation of the wind turbines in an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Abbreviations and key term definitions:
(1) Definition of association rules:
Association rule mining discovers interesting associations or correlations among the items or attributes in a database. It identifies frequently occurring attribute sets (also called frequent itemsets) in the data set and then uses these frequent itemsets to create rules that describe the associations. These relationships are unknown in advance and hidden; they cannot be derived through logical database operations (such as table joins) or statistical methods. In other words, association rules are not based on inherent properties of the data itself (such as functional dependencies) but on the co-occurrence of data items.
(2) Concepts of association rules:
Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of items. Let the task-relevant data D be a set of database transactions, where each transaction T is a set of items such that $T \subseteq I$. Each transaction has an identifier, called its TID. Let A be an itemset; a transaction T contains A if and only if $A \subseteq T$.
Definition 1: The record set for association rule mining is denoted D (D is the transaction database), $D = \{t_1, t_2, \ldots, t_k, \ldots, t_n\}$, where $t_k = \{i_1, i_2, \ldots, i_j, \ldots, i_p\}$ (k = 1, 2, ..., n) is a transaction and each element $i_j$ (j = 1, 2, ..., p) of $t_k$ is called an item.
Definition 2: Let $I = \{i_1, i_2, \ldots, i_m\}$ be the set of all items in D. Any subset A of I is called an itemset in D; if |A| = k, the set A is called a k-itemset.
Definition 3: An association rule is an implication of the form $A \Rightarrow B$, where $A \subset I$, $B \subset I$, and $A \cap B = \varnothing$. The rule $A \Rightarrow B$ holds in the transaction set D with support s, where s is the percentage of transactions in D that contain $A \cup B$ (that is, both A and B); this is the probability $P(A \cup B)$. The rule $A \Rightarrow B$ has confidence c in D if c is the percentage of the transactions in D containing A that also contain B; this is the conditional probability $P(B \mid A)$. That is, the support is:
$$ \mathrm{support}(A \Rightarrow B) = P(A \cup B) $$
and the confidence is:
$$ \mathrm{confidence}(A \Rightarrow B) = P(B \mid A) $$
Support and confidence are the two key measures of an association rule: the former measures the statistical significance of the rule over the whole data set, and the latter measures its reliability. The user usually specifies a minimum support (denoted minsupport) and a minimum confidence (denoted minconfidence) according to the mining needs. The former sets the minimum significance of an association rule; the latter sets the minimum reliability an association rule must satisfy.
Definition 4: If $\mathrm{support}(A \Rightarrow B) \ge \mathrm{minsupport}$ and $\mathrm{confidence}(A \Rightarrow B) \ge \mathrm{minconfidence}$, the association rule $A \Rightarrow B$ is called a strong rule; otherwise it is called a weak rule.
The association rule mining problem in D is to find all association rules whose support and confidence reach minsupport and minconfidence, that is, all rules $A \Rightarrow B$ satisfying $\mathrm{support}(A \Rightarrow B) \ge \mathrm{minsupport}$ and $\mathrm{confidence}(A \Rightarrow B) \ge \mathrm{minconfidence}$.
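As an illustration of Definitions 3 and 4, the following minimal Python sketch estimates support and confidence by counting transactions. The function names and the toy transactions are hypothetical and are not taken from the patent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(B | A) as support(A and B together) / support(A)."""
    a = frozenset(antecedent)
    denominator = support(transactions, a)
    if denominator == 0:
        return 0.0  # A never occurs, so the rule has no evidence
    return support(transactions, a | frozenset(consequent)) / denominator


def is_strong(transactions, antecedent, consequent, minsupport, minconfidence):
    """Definition 4: both thresholds must be reached."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (support(transactions, both) >= minsupport
            and confidence(transactions, antecedent, consequent) >= minconfidence)


# Hypothetical Boolean transactions (illustrative labels only).
TRANSACTIONS = [
    frozenset({"wind_high", "power_high"}),
    frozenset({"wind_high", "power_high", "yaw_high"}),
    frozenset({"wind_high"}),
    frozenset({"power_high"}),
]
```

With this toy data, the rule wind_high ⇒ power_high has support 2/4 and confidence 2/3, so it is strong for minconfidence = 0.6 but weak for minconfidence = 0.7.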
Apriori algorithm:
The Apriori algorithm mines the frequent itemsets of Boolean association rules. The discovery of association rules can be divided into two steps:
(1) Iteratively identify all frequent itemsets according to the user-specified minimum support (minsupport);
(2) Build, from the frequent itemsets, the strong association rules that satisfy the user-specified minimum confidence (minconfidence).
Fig. 1 shows the basic model of association rules, where D is the data set, Algorithm-1 is the frequent itemset search algorithm, Algorithm-2 is the association rule generation algorithm, and R is the set of mined association rules. The user interacts with Algorithm-1 and Algorithm-2 by specifying minsupport and minconfidence respectively, and interprets and evaluates the mining results by interacting with R.
The core of the Apriori algorithm is the iterative identification of all frequent itemsets, which is also the most computationally expensive part. First, the frequent 1-itemsets $L_1$ are produced by the first scan. Before the K-th scan (K > 1), the result of the (K-1)-th scan (that is, the frequent (K-1)-itemsets $L_{K-1}$) is used together with the items of the transaction set D to generate the candidate K-itemsets $C_K$. The support of each element of $C_K$ is then determined, and the frequent K-itemsets $L_K$ are obtained at the end of each scan. The algorithm terminates when the candidate set $C_K$ is empty.
During mining, to improve the efficiency of the level-wise generation of frequent itemsets, the Apriori property is used to compress the search space. It involves two steps, join and prune. The join step joins $L_{K-1}$ with itself to obtain $C_K$. The prune step relies on the fact that if any subset of a K-itemset is not frequent, the K-itemset itself is not frequent and can be removed from $C_K$.
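The join and prune steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `apriori_gen` and `frequent_itemsets` and the toy transactions are hypothetical, and supports are recomputed by rescanning the list rather than with an optimized counting structure.

```python
from itertools import combinations


def apriori_gen(frequent_prev):
    """Join L_{K-1} with itself, then prune candidates that have an infrequent subset."""
    prev = set(frequent_prev)
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue  # join step: a and b must differ in exactly one item
            # prune step: every (K-1)-subset must itself be frequent
            if all(frozenset(sub) in prev for sub in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


def frequent_itemsets(transactions, minsupport):
    """Level-wise search; returns {itemset: support} for all frequent itemsets."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= minsupport}
    result = {}
    while level:  # stops when no candidate survives
        for itemset in level:
            result[itemset] = sup(itemset)
        level = {c for c in apriori_gen(level) if sup(c) >= minsupport}
    return result


# Hypothetical Boolean transactions (illustrative labels only).
TRANSACTIONS = [
    frozenset({"wind", "power", "yaw"}),
    frozenset({"wind", "power"}),
    frozenset({"wind", "yaw"}),
    frozenset({"power", "yaw"}),
    frozenset({"wind", "power", "yaw"}),
]
FREQUENT = frequent_itemsets(TRANSACTIONS, minsupport=0.5)
```

With minsupport = 0.5, all three single items and all three pairs are frequent, while the triple appears in only 2 of 5 transactions and is dropped.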
Once the frequent itemsets are obtained, association rules are generated as follows. For each frequent itemset l, generate all of its non-empty proper subsets. For each such subset s, if confidence ≥ minconfidence, output the rule $s \Rightarrow (l - s)$, where the confidence is computed as follows.
$$ \mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)} $$
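The rule-generation step can be sketched as below. This is an illustrative sketch only: `generate_rules` is a hypothetical name, and the support counts are made-up values supplied as a dictionary that is assumed to contain every frequent itemset together with all of its subsets (which the Apriori property guarantees).

```python
from itertools import combinations


def generate_rules(support_count, minconfidence):
    """Return (antecedent, consequent, confidence) for every rule reaching minconfidence."""
    rules = []
    for itemset, count in support_count.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                s = frozenset(subset)
                conf = count / support_count[s]  # support_count(l) / support_count(s)
                if conf >= minconfidence:
                    rules.append((s, itemset - s, conf))
    return rules


# Hypothetical support counts (illustrative values only).
COUNTS = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"a", "b"}): 3,
}
RULES = generate_rules(COUNTS, minconfidence=0.8)
```

Here a ⇒ b has confidence 3/4 and is rejected at minconfidence = 0.8, while b ⇒ a has confidence 3/3 and is kept.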
As shown in Fig. 2, an embodiment of the present invention provides a method for analyzing electric field multivariate operating data, the method comprising:
Step S201: standardize each index data of the electric field multivariate operating data to generate a standardized transaction item set of each index data;
Step S202: partition and fuzzify the standardized transaction item set using normal distribution membership functions to generate fuzzy sets, and determine the membership value of each fuzzy set;
Step S203: determine the frequent 1-itemsets according to the membership value of each index data in each fuzzy set;
Step S204: iteratively identify and compute the frequent K-itemsets from the determined frequent 1-itemsets;
Step S205: determine the association rules of the frequent K-itemsets and their confidence using the rule-construction steps of the Apriori algorithm, and generate the electric field multivariate data analysis result.
On the basis of the Apriori algorithm, the present invention proposes an improved fuzzy-function association rule mining algorithm and introduces it into the analysis of massive multivariate wind farm operating data. Through the concept of partitioning, the algorithm converts Boolean association rules into quantitative association rules, and it introduces the normal distribution function concept from fuzzy mathematics so that the partition boundaries are more reasonable and effective.
Fig. 3 is a flow chart of the fuzzy-function association rule algorithm in an embodiment of the present invention. The algorithm is described in detail below with reference to the steps above:
The electric field multivariate operating data is standardized using the extreme value standardization method (step S201), specifically:
The collected operating data of the selected variables is preprocessed, and the raw data is standardized to the interval [0, 1] using the extreme value standardization method, as follows.
$$ t_{ij} = \frac{X'_{ij} - \min(X'_{ij})}{\max(X'_{ij}) - \min(X'_{ij})} \tag{3} $$
In the formula, $t_{ij}$ is the standardized data, and $\max(X'_{ij})$ and $\min(X'_{ij})$ are the upper and lower limits of the raw data $X'_{ij}$. In the raw data $X'_{ij}$, i can be understood as the kind of variable (wind power data, for example, includes different kinds of data such as wind speed, pitch angle, and power generation), and j corresponds to different times.
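Formula (3) can be illustrated with a short Python sketch that standardizes each variable separately over its own time series. The function names and sample values are hypothetical, and the handling of a constant series is an assumption, since the text does not cover that case.

```python
def min_max_standardize(values):
    """Map a series to [0, 1] with formula (3): (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # ASSUMPTION: a constant series maps to 0.0; the text does not cover this case.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize_variables(raw):
    """Standardize every variable (index i) over its own time series (index j)."""
    return {name: min_max_standardize(series) for name, series in raw.items()}


# Hypothetical raw data: one list per variable, one entry per time step.
RAW = {
    "wind_speed": [3.0, 6.0, 9.0, 12.0],
    "pitch_angle": [0.0, 5.0, 10.0, 2.5],
}
STANDARDIZED = standardize_variables(RAW)
```

Each variable is scaled by its own minimum and maximum, so variables with different units become comparable on [0, 1].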
In an embodiment of the present invention, step S202, fuzzifying the standardized transaction item set using normal distribution membership functions to generate fuzzy sets and determining the membership value of each fuzzy set, comprises:
Determining the expected value of each index data in the standardized transaction item set;
Partitioning the standardized transaction item set according to the expected value of each index data and the maximum and minimum values of each index data in the standardized transaction item set;
Converting the standardized transaction item set into fuzzy sets according to the preset normal distribution membership function of each partition, and determining the membership values of the fuzzy sets.
Specifically, in an embodiment of the present invention, each item $t_{ij}$ of each transaction $T_i$ (i = 1, 2, ..., n) in the transaction database D is expressed as a fuzzy set using the given normal distribution functions. The membership functions are as follows, and their graphs are shown in Fig. 4:
$$ r_{high}(t_{ij}) = \begin{cases} 1, & t_{ij} \le \mu_{\min} \\ e^{-\frac{(t_{ij}-\mu_{\min})^{2}}{2\sigma_{high}^{2}}}, & \mu_{\min} \le t_{ij} \le \mu_{\max},\quad \sigma_{high} = \dfrac{\mu_{0}-\mu_{\min}}{3} \end{cases} $$
$$ r_{middle}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_{0}-t_{ij})^{2}}{2\sigma_{middle1}^{2}}}, & \mu_{\min} \le t_{ij} \le \mu_{0},\quad \sigma_{middle1} = \dfrac{\mu_{0}-\mu_{\min}}{3} \\ e^{-\frac{(t_{ij}-\mu_{0})^{2}}{2\sigma_{middle2}^{2}}}, & \mu_{0} \le t_{ij} \le \mu_{\max},\quad \sigma_{middle2} = \dfrac{\mu_{\max}-\mu_{0}}{3} \end{cases} $$
$$ r_{low}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_{\max}-t_{ij})^{2}}{2\sigma_{low}^{2}}}, & \mu_{\min} \le t_{ij} \le \mu_{\max},\quad \sigma_{low} = \dfrac{\mu_{\max}-\mu_{0}}{3} \\ 1, & t_{ij} \ge \mu_{\max} \end{cases} $$
where $\mu_{\min}$ and $\mu_{\max}$ are the minimum and maximum values of the parameter attribute, that is, the minimum and maximum values of each index data in the standardized transaction item set, and $\mu_{0}$ is the expected value of the parameter attribute, that is, the expected value of the index data. By the properties of the Gaussian function, 99.73% of the area under the curve lies within 3 standard deviations (3σ) on either side of the expected value $\mu_{0}$, so the commonly used 6σ span is adopted as the domain of the function.
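The three membership functions can be sketched in Python as follows. This is a minimal illustration, assuming mu_min < mu0 < mu_max; the function name `membership` is hypothetical, the branch labels follow the expressions as printed above, and values of t outside [mu_min, mu_max] are handled by the nearest branch, which is an assumption because the expressions do not define that case.

```python
import math


def _gauss(distance, sigma):
    """Unnormalized Gaussian exp(-d^2 / (2 sigma^2))."""
    return math.exp(-(distance * distance) / (2.0 * sigma * sigma))


def membership(t, mu_min, mu0, mu_max):
    """Return (high, middle, low) membership degrees of a standardized value t.

    Requires mu_min < mu0 < mu_max. Each sigma is one third of the distance
    between the expected value and the corresponding boundary (the 3-sigma rule).
    """
    sigma_left = (mu0 - mu_min) / 3.0   # sigma_high and sigma_middle1
    sigma_right = (mu_max - mu0) / 3.0  # sigma_middle2 and sigma_low

    # "high" as printed: 1 at or below mu_min, Gaussian decay above it
    high = 1.0 if t <= mu_min else _gauss(t - mu_min, sigma_left)

    # "middle": Gaussian centred on mu0, with a different sigma on each side
    if t <= mu0:
        middle = _gauss(mu0 - t, sigma_left)
    else:
        middle = _gauss(t - mu0, sigma_right)

    # "low" as printed: 1 at or above mu_max, Gaussian decay below it
    low = 1.0 if t >= mu_max else _gauss(mu_max - t, sigma_right)
    return high, middle, low


# Hypothetical parameters: data standardized to [0, 1] with expected value 0.5.
HIGH, MIDDLE, LOW = membership(0.5, mu_min=0.0, mu0=0.5, mu_max=1.0)
```

At t = mu0 the middle degree is exactly 1, and the other two degrees equal exp(-4.5), the value of the Gaussian at 3σ from its centre.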
Let the fuzzy set corresponding to $t_{ij}$ be $f_{ij}$. In Zadeh notation, $f_{ij}$ is written as
$$ f_{ij} = \frac{r_i(R_{j1})}{R_{j1}} + \frac{r_i(R_{j2})}{R_{j2}} + \cdots + \frac{r_i(R_{jk})}{R_{jk}} \tag{4} $$
where $R_{jl}$ is the l-th fuzzy partition of item $t_{ij}$ (in the concrete example, the $R_{jl}$ are the high, middle, and low curves 301, 302, and 303 in Fig. 4), and $r_i(R_{jl})$ is the membership value on partition $R_{jl}$ (the fractional value computed from the normal distribution function expressions, which is also the initial membership degree in the algorithm).
As for how the fuzzy partitions are determined: the number of partitions (three in this patent: high, middle, and low) depends on the degree of fuzziness required in practice and is usually 3. The partition boundaries are the $\mu_{\min}$, $\mu_{0}$, and $\mu_{\max}$ described above: the range of "high" is $[0, \mu_{0}]$, the range of "middle" is $[\mu_{\min}, \mu_{\max}]$, and the range of "low" is $[\mu_{0}, 1]$.
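As a worked illustration of formula (4), assume the hypothetical values $\mu_{\min} = 0$, $\mu_{0} = 0.5$, $\mu_{\max} = 1$ and a standardized value $t_{ij} = 0.25$ (these numbers are chosen for illustration and do not come from the patent's data). Every σ then equals 1/6, and the three membership expressions give:

```latex
\begin{aligned}
r_{high}(0.25)   &= e^{-\frac{(0.25-0)^2}{2\,(1/6)^2}}   = e^{-1.125}  \approx 0.325 \\
r_{middle}(0.25) &= e^{-\frac{(0.5-0.25)^2}{2\,(1/6)^2}} = e^{-1.125}  \approx 0.325 \\
r_{low}(0.25)    &= e^{-\frac{(1-0.25)^2}{2\,(1/6)^2}}   = e^{-10.125} \approx 4.0\times 10^{-5} \\[4pt]
f_{ij} &\approx \frac{0.325}{\text{high}} + \frac{0.325}{\text{middle}} + \frac{4.0\times 10^{-5}}{\text{low}}
\end{aligned}
```

In Zadeh notation the fraction bar and the plus sign are separators, not arithmetic: each term pairs a partition with the membership degree of the value in that partition.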
In an embodiment of the present invention, determining the frequent 1-itemsets according to the membership value of each index data in each fuzzy set comprises:
Computing the membership weight of each index data in its corresponding fuzzy set;
Placing a fuzzy set into the frequent 1-itemsets when its membership weight is judged to be not less than a preset minimum support.
In this embodiment, the specific steps for determining the frequent 1-itemsets according to the membership value of each index data in each fuzzy set, and for iteratively identifying and computing the frequent itemsets from the determined frequent 1-itemsets, are as follows:
(1) For the n transactions $T_i$ (i = 1, 2, ..., n), compute the weight ($support_{js}$) of the membership of each item $t_{ij}$ (j = 1, 2, ..., m) in the corresponding fuzzy set $R_{js}$ (s = 1, 2, ..., k):
$$ support_{js} = \frac{1}{n}\sum_{i=1}^{n} r_i(R_{js}) \tag{5} $$
(2) For each partition $R_{js}$ (1 ≤ j ≤ m, 1 ≤ s ≤ k), check whether the weight of the fuzzy set is greater than or equal to the preset minimum support. If partition $R_{js}$ satisfies this condition, put it into the frequent 1-itemsets $L_1$:
$$ L_1 = \{ R_{js} \mid support_{js} \ge S_{\min},\ 1 \le j \le m,\ 1 \le s \le k \} \tag{6} $$
(3) Set r = 1, where r is the number of items currently kept in each frequent itemset.
(4) Generate the candidate set $C_{r+1}$ from the frequent itemsets $L_r$ in a manner similar to the Apriori algorithm: two itemsets are joined when r-1 of their items are identical and the remaining items differ, and two partitions belonging to the same item cannot appear together in the same itemset of $C_{r+1}$.
(5) For each newly generated (r+1)-itemset $t = (t_1, t_2, \ldots, t_{r+1})$ in the candidate set $C_{r+1}$, proceed as follows:
1. For each transaction $T_i$, compute the membership degree $r_{it}$ of the candidate itemset t on that transaction from the membership values of $T_i$ on the partitions that make up t.
2. Compute the weight of the candidate itemset t:
$$ support_t = \frac{1}{n}\sum_{i=1}^{n} r_{it} \tag{8} $$
3. If $support_t$ is greater than or equal to minsupport, put the itemset $t = (t_1, t_2, \ldots, t_{r+1})$ into $L_{r+1}$.
(6) If $L_{r+1}$ is empty, go to the next step; otherwise set r = r+1 and repeat steps (4) to (6) to compute the maximal frequent itemsets.
Then, using the Apriori algorithm, association rules are constructed for every large q-itemset t (q ≥ 2) with items $(t_1, t_2, \ldots, t_q)$, and the strong association rules are output (step S205).
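Steps (1) to (6) and the rule step can be sketched in Python as follows. This is a sketch under stated assumptions, not the patent's implementation. Two points are assumptions: the membership of a multi-partition itemset in one transaction is taken as the minimum of its partitions' memberships (the combination rule is not legible in the text), and the fuzzy confidence is taken as the ratio of fuzzy supports, mirroring the count ratio used by Apriori. All names and the toy membership values are hypothetical.

```python
from itertools import combinations


def fuzzy_support(rows, itemset):
    """Mean membership of `itemset` over all transactions (formulas (5) and (8)).

    ASSUMPTION: the membership of a multi-partition itemset in one transaction
    is the minimum of its partitions' memberships.
    """
    return sum(min(row[p] for p in itemset) for row in rows) / len(rows)


def fuzzy_frequent_itemsets(rows, minsupport):
    """Return {itemset: support} for every fuzzy frequent itemset, steps (1)-(6)."""
    partitions = sorted(rows[0])  # each partition is a (variable, label) pair
    level = {frozenset([p]) for p in partitions
             if fuzzy_support(rows, [p]) >= minsupport}
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = fuzzy_support(rows, itemset)
        size = len(next(iter(level))) + 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) != size:
                continue  # the two itemsets must share size-2 partitions
            if len({variable for variable, _ in union}) != size:
                continue  # two partitions of the same variable cannot co-occur
            candidates.add(union)
        level = {c for c in candidates if fuzzy_support(rows, c) >= minsupport}
    return frequent


def fuzzy_confidence(rows, antecedent, consequent):
    """ASSUMPTION: confidence is the ratio of fuzzy supports."""
    both = frozenset(antecedent) | frozenset(consequent)
    return fuzzy_support(rows, both) / fuzzy_support(rows, antecedent)


WH, WL = ("wind", "high"), ("wind", "low")
PH, PL = ("power", "high"), ("power", "low")
# Hypothetical membership degrees of three transactions (illustrative values only).
ROWS = [
    {WH: 0.9, WL: 0.1, PH: 0.8, PL: 0.2},
    {WH: 0.7, WL: 0.3, PH: 0.6, PL: 0.4},
    {WH: 0.2, WL: 0.8, PH: 0.1, PL: 0.9},
]
FREQUENT = fuzzy_frequent_itemsets(ROWS, minsupport=0.45)
```

With this toy data, three single partitions reach the threshold, and the only frequent 2-itemset pairs the "high" partition of wind with the "high" partition of power; the pair built from the two partitions of power is never generated because both belong to the same variable.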
The present invention also provides a device for analyzing electric field multivariate operating data, the device comprising:
A standardization module, configured to standardize each index data of the electric field multivariate operating data and generate a standardized transaction item set of each index data;
A fuzzy processing module, configured to fuzzify the standardized transaction item set using normal distribution membership functions to generate fuzzy sets, and to determine the membership value of each fuzzy set;
A frequent 1-itemset generation module, configured to determine the frequent 1-itemsets according to the membership value of each index data in each fuzzy set;
A frequent K-itemset generation module, configured to iteratively identify and compute the frequent K-itemsets from the determined frequent 1-itemsets;
An analysis result generation module, configured to determine the association rules of the frequent K-itemsets and their confidence using the rule-construction steps of the Apriori algorithm, and to generate the electric field multivariate data analysis result.
The solution of the present invention is further described below with a specific embodiment. As shown in Fig. 5, the power generation of wind turbine 13# is the lowest among all the turbines. The fuzzy association rule algorithm proposed by the present invention is therefore used to mine all the factors that contribute to the low generation of turbine 13# and to describe the degree of influence of each factor quantitatively. The data used by the algorithm are the daily mean values of the operating data for the whole of 2013.
Daily means were computed from the 2013 operating data of turbine 13#. After data selection and checking, 365 groups of operating data were obtained for association rule extraction, each comprising seven indicators: daily mean wind speed, daily mean active power, effective wind-hour rate, wind direction, availability, yaw angle, and daily power generation. All attributes of the data set were normalized to [0, 1]. Each parameter attribute was fuzzified using the membership functions of Fig. 4 to obtain the membership values of each parameter; some of the values are shown in Table 1.
Taking the "daily mean wind speed" variable as an example of how membership values are calculated: the 365 data values are substituted in turn into the three normal distribution function expressions given above, yielding 3 × 365 membership values. Note: Table 1 gives the high, middle, and low partition memberships of each variable for 5 days.
Table 1. Selected membership values of each parameter
The basic variables of the fuzzified data set serve as the input vectors for fuzzy association rule mining. With minimum support minsupport = 0.3 and minimum confidence minconfidence = 0.7, mining with the fuzzy association rule algorithm described above gives the following results.
Frequent 1-itemsets L1:
Daily generation, high
Daily mean wind speed, high
Daily mean active power, high
Wind direction, high
Yaw angle, high
Daily mean wind speed, middle
Wind direction, middle
Effective wind-hour rate, low
Availability, low
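How an (index, partition) pair could qualify for L1 is sketched below. The sketch assumes that the support of a pair is the mean of its membership values over all records and that this mean is what is compared with minsupport; the function names and the membership values are hypothetical and are not the values of Table 1.

```python
def fuzzy_support(values):
    """Mean membership of one (index, partition) pair over all records.

    Assumption: this average is the quantity compared with the minimum support.
    """
    return sum(values) / len(values)


def frequent_1_itemsets(membership_table, min_support):
    """Keep every (index, partition) pair whose support reaches min_support."""
    return {
        item: fuzzy_support(values)
        for item, values in membership_table.items()
        if fuzzy_support(values) >= min_support
    }


# Hypothetical membership values for four records (illustrative only).
membership_table = {
    ("availability", "low"): [0.9, 0.8, 0.7, 0.6],
    ("availability", "middle"): [0.1, 0.2, 0.3, 0.2],
    ("daily mean wind speed", "high"): [0.5, 0.25, 0.25, 0.5],
}
L1 = frequent_1_itemsets(membership_table, min_support=0.3)
```

With these illustrative numbers the pair ("availability", "middle") is dropped because its mean membership is below 0.3, while the other two pairs are kept.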
Frequent 2-itemsets L2:
Daily generation, high; Daily mean wind speed, high
Daily generation, high; Daily mean active power, high
Daily generation, high; Effective wind-hour rate, low
Daily generation, high; Availability, low
Daily mean wind speed, high; Daily mean active power, high
Daily mean active power, high; Wind direction, high
Daily mean active power, high; Yaw angle, high
Daily mean active power, high; Effective wind-hour rate, low
Daily mean active power, high; Availability, low
Wind direction, high; Yaw angle, high
Daily mean wind speed, middle; Effective wind-hour rate, low
Daily mean wind speed, middle; Availability, low
Effective wind-hour rate, low; Availability, low
Frequent 3-itemsets L3:
Daily generation, high; Daily mean wind speed, high; Daily mean active power, high
Daily generation, high; Daily mean active power, high; Effective wind-hour rate, low
Daily generation, high; Daily mean active power, high; Availability, low
Daily mean wind speed, middle; Effective wind-hour rate, low; Availability, low
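The step from frequent (K-1)-itemsets to candidate K-itemsets can be illustrated on the listing above. The sketch below uses the usual Apriori pruning property, namely that a K-itemset can only be frequent if all of its (K-1)-subsets are frequent. The function name `candidates_from` and the short item labels are assumptions introduced here; the itemsets themselves are transcribed from the listing.

```python
from itertools import combinations


def candidates_from(frequent_prev, k):
    """Return every k-itemset whose (k-1)-subsets are all in frequent_prev."""
    items = sorted({item for itemset in frequent_prev for item in itemset})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent_prev for sub in combinations(combo, k - 1))
    }


# Short labels for the (index, partition) pairs in the listing above.
G, WS_H, WS_M = "generation:high", "wind_speed:high", "wind_speed:middle"
AP_H, WD_H, YA_H = "active_power:high", "wind_direction:high", "yaw_angle:high"
EWR_L, AV_L = "effective_wind_hour_rate:low", "availability:low"

# Frequent 2-itemsets L2 and 3-itemsets L3, transcribed from the listing.
L2 = {frozenset(p) for p in [
    (G, WS_H), (G, AP_H), (G, EWR_L), (G, AV_L), (WS_H, AP_H),
    (AP_H, WD_H), (AP_H, YA_H), (AP_H, EWR_L), (AP_H, AV_L),
    (WD_H, YA_H), (WS_M, EWR_L), (WS_M, AV_L), (EWR_L, AV_L),
]}
L3 = {frozenset(t) for t in [
    (G, WS_H, AP_H), (G, AP_H, EWR_L), (G, AP_H, AV_L), (WS_M, EWR_L, AV_L),
]}

C3 = candidates_from(L2, 3)
```

Every listed 3-itemset is contained in `C3`, that is, each of its three pairs appears in L2. Candidates that pass this pruning but are absent from L3 would be the ones whose support falls below minsupport.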
From the frequent itemsets, the required items (those related to daily generation) are selected and their confidence is calculated.
From the above results, the confidence of both association rules is greater than the set minimum confidence of 0.7, so both are strong association rules. The following conclusions can therefore be drawn:
(1) The main factor causing the generated energy of wind turbine 13# to be lower than that of the other turbines is its low availability, i.e. the turbine is in the generating state comparatively rarely; the degree of influence of this factor is 0.9963.
(2) The second factor causing the generated energy of wind turbine 13# to be lower than that of the other turbines is its low effective wind-hour rate, i.e. the wind speed at this turbine lies between the cut-in and cut-out speeds comparatively rarely; the degree of influence of this factor is 0.9875.
The association rules mined by this algorithm are all consistent with the conclusions of general statistical analysis, and on top of that statistical analysis they give the degree of influence of each factor quantitatively. It can be inferred that this turbine may suffer from frequent faults or from a problem with its control strategy, and that the wind resource at its location is slightly poor, being more prone to both very strong wind and near-calm conditions. It can also be seen that the variables with the greatest influence on a long-time-scale variable (one whose time scale is a year) are those that vary on the same time scale, whereas instantaneously varying variables such as wind speed and wind direction have very little influence on it. This further illustrates the importance of the time scale in analysis work, which deserves close attention.
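The confidence used to grade a rule against minconfidence can be sketched as follows. The sketch uses the standard definition confidence(X → Y) = support(X ∪ Y) / support(X). It assumes that the fuzzy support of an itemset is the mean, over all records, of the smallest membership value among its items; a product is another common choice. All names and membership values are hypothetical and do not reproduce the figures 0.9963 and 0.9875 reported above.

```python
def itemset_support(records, itemset):
    """Fuzzy support: mean over records of the minimum membership in itemset.

    Assumption: the minimum acts as the fuzzy AND of the items.
    """
    return sum(min(r[i] for i in itemset) for r in records) / len(records)


def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    joint = itemset_support(records, antecedent | consequent)
    return joint / itemset_support(records, antecedent)


# Hypothetical membership values for four records (illustrative only).
records = [
    {"availability:low": 1.0, "generation:high": 1.0},
    {"availability:low": 0.5, "generation:high": 0.5},
    {"availability:low": 0.5, "generation:high": 0.25},
    {"availability:low": 0.0, "generation:high": 0.5},
]
conf = confidence(records, {"availability:low"}, {"generation:high"})
is_strong = conf >= 0.7  # compared with minconfidence = 0.7 from the text
```

A rule whose confidence reaches the minimum confidence is reported as a strong association rule, which is the test applied to the two rules discussed above.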
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Specific embodiments have been used herein to set forth the principles and implementations of the present invention. The description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. At the same time, for those of ordinary skill in the art, changes may be made to the specific implementations and the scope of application in accordance with the idea of the present invention. In summary, the content of this description should not be construed as limiting the present invention.

Claims (10)

1. A method for analyzing electric field multivariate operating data, characterized in that the method comprises:
standardizing each index data of the electric field multivariate operating data to generate a standardized transaction itemset of each index data;
partitioning and fuzzifying the standardized transaction itemset using normal distribution membership functions to generate fuzzy sets, and determining the membership value of each fuzzy set;
determining frequent 1-itemsets according to the membership values of each index data in each fuzzy set;
calculating frequent K-itemsets by iterative identification from the determined frequent 1-itemsets;
determining the association rules of the frequent K-itemsets and their confidence using the rule-construction steps of the Apriori algorithm, and generating the electric field multivariate data analysis result.
2. The method for analyzing electric field multivariate operating data according to claim 1, characterized in that standardizing each index data of the electric field multivariate operating data comprises:
standardizing the electric field multivariate operating data using an extreme-value standardization method.
3. The method for analyzing electric field multivariate operating data according to claim 1, characterized in that fuzzifying the standardized transaction itemset using normal distribution membership functions to generate fuzzy sets, and determining the membership value of each fuzzy set, comprises:
determining the expected value of each index data in the standardized transaction itemset;
partitioning each index data in the standardized transaction itemset according to the expected value of each index data and the maximum and minimum values in the standardized transaction itemset of each index data;
converting the standardized transaction itemset into fuzzy sets according to the preset normal distribution membership function of each partition, and determining the membership values of the fuzzy sets.
4. The method for analyzing electric field multivariate operating data according to claim 3, characterized in that the preset normal distribution membership functions of the partitions are:
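$$r_{high}(t_{ij}) = \begin{cases} 1, & t_{ij} \le \mu_{\min} \\ e^{-\frac{(t_{ij}-\mu_{\min})^2}{2\sigma_{high}^2}}, & \mu_{\min} \le t_{ij} \le \mu_{\max} \end{cases}, \quad \sigma_{high} = \frac{\mu_0-\mu_{\min}}{3}$$
$$r_{middle}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_0-t_{ij})^2}{2\sigma_{middle1}^2}}, & \mu_{\min} \le t_{ij} \le \mu_0, \quad \sigma_{middle1} = \frac{\mu_0-\mu_{\min}}{3} \\ e^{-\frac{(t_{ij}-\mu_0)^2}{2\sigma_{middle2}^2}}, & \mu_0 \le t_{ij} \le \mu_{\max}, \quad \sigma_{middle2} = \frac{\mu_{\max}-\mu_0}{3} \end{cases}$$
$$r_{low}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_{\max}-t_{ij})^2}{2\sigma_{low}^2}}, & \mu_{\min} \le t_{ij} \le \mu_{\max} \\ 1, & t_{ij} \ge \mu_{\max} \end{cases}, \quad \sigma_{low} = \frac{\mu_{\max}-\mu_0}{3}$$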
where $t_{ij}$ is the standardized data in the standardized transaction itemset, $\mu_{\min}$ and $\mu_{\max}$ are the minimum and maximum values in the standardized transaction itemset of each index data, and $\mu_0$ is the expected value of the index data.
5. The method for analyzing electric field multivariate operating data according to claim 4, characterized in that determining the frequent 1-itemsets according to the membership values of each index data in each fuzzy set comprises:
generating the membership weight of each index data in the corresponding fuzzy set;
when the membership weight of a fuzzy set is judged to be not less than a preset minimum support, putting the fuzzy set into the frequent 1-itemsets.
6. A device for analyzing electric field multivariate operating data, characterized in that the device comprises:
a standardization module, configured to standardize each index data of the electric field multivariate operating data and to generate a standardized transaction itemset of each index data;
a fuzzy processing module, configured to fuzzify the standardized transaction itemset using normal distribution membership functions to generate fuzzy sets, and to determine the membership value of each fuzzy set;
a frequent 1-itemset generation module, configured to determine frequent 1-itemsets according to the membership values of each index data in each fuzzy set;
a frequent K-itemset generation module, configured to calculate frequent K-itemsets by iterative identification from the determined frequent 1-itemsets;
an analysis result generation module, configured to determine the association rules of the frequent K-itemsets and their confidence using the rule-construction steps of the Apriori algorithm, and to generate the electric field multivariate data analysis result.
7. The device for analyzing electric field multivariate operating data according to claim 6, characterized in that the standardization module standardizes the electric field multivariate operating data using an extreme-value standardization method.
8. The device for analyzing electric field multivariate operating data according to claim 7, characterized in that the fuzzy processing module comprises:
an expected value determining unit, configured to determine the expected value of each index data in the standardized transaction itemset;
a partitioning unit, configured to partition each index data in the standardized transaction itemset according to the expected value of each index data and the maximum and minimum values in the standardized transaction itemset of each index data;
a fuzzy set generation unit, configured to convert the standardized transaction itemset into fuzzy sets according to the preset normal distribution membership function of each partition, and to determine the membership values of the fuzzy sets.
9. The device for analyzing electric field multivariate operating data according to claim 8, characterized in that the preset normal distribution membership functions of the partitions are:
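$$r_{high}(t_{ij}) = \begin{cases} 1, & t_{ij} \le \mu_{\min} \\ e^{-\frac{(t_{ij}-\mu_{\min})^2}{2\sigma_{high}^2}}, & \mu_{\min} \le t_{ij} \le \mu_{\max} \end{cases}, \quad \sigma_{high} = \frac{\mu_0-\mu_{\min}}{3}$$
$$r_{middle}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_0-t_{ij})^2}{2\sigma_{middle1}^2}}, & \mu_{\min} \le t_{ij} \le \mu_0, \quad \sigma_{middle1} = \frac{\mu_0-\mu_{\min}}{3} \\ e^{-\frac{(t_{ij}-\mu_0)^2}{2\sigma_{middle2}^2}}, & \mu_0 \le t_{ij} \le \mu_{\max}, \quad \sigma_{middle2} = \frac{\mu_{\max}-\mu_0}{3} \end{cases}$$
$$r_{low}(t_{ij}) = \begin{cases} e^{-\frac{(\mu_{\max}-t_{ij})^2}{2\sigma_{low}^2}}, & \mu_{\min} \le t_{ij} \le \mu_{\max} \\ 1, & t_{ij} \ge \mu_{\max} \end{cases}, \quad \sigma_{low} = \frac{\mu_{\max}-\mu_0}{3}$$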
where $t_{ij}$ is the standardized data in the standardized transaction itemset, $\mu_{\min}$ and $\mu_{\max}$ are the minimum and maximum values in the standardized transaction itemset of each index data, and $\mu_0$ is the expected value of the index data.
10. The device for analyzing electric field multivariate operating data according to claim 9, characterized in that the frequent 1-itemset generation module comprises:
a membership weight generation unit, configured to generate the membership weight of each index data in the corresponding fuzzy set;
a judging unit, configured to put a fuzzy set into the frequent 1-itemsets when the membership weight of the fuzzy set is judged to be not less than a preset minimum support.
CN201510956727.1A 2015-12-18 2015-12-18 Method and device for analyzing electric field multivariate operating data Pending CN105590167A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160518