CN111177130A

CN111177130A - Relay protection data integrity checking method and system based on correlation algorithm

Info

Publication number: CN111177130A
Application number: CN201911309372.1A
Authority: CN
Inventors: 郭鹏; 王文焕; 杨国生; 詹荣荣; 张烈; 康逸群; 闫周天; 李妍霏; 张瀚方; 王丽敏; 姜宏丽; 申华
Original assignee: State Grid Corp of China SGCC; China Electric Power Research Institute Co Ltd CEPRI
Current assignee: State Grid Corp of China SGCC; China Electric Power Research Institute Co Ltd CEPRI
Priority date: 2019-12-18
Filing date: 2019-12-18
Publication date: 2020-05-19

Abstract

The invention relates to a relay protection data integrity checking method and system based on a correlation algorithm. The method comprises the following steps: determining an item set according to the acquired historical records, and constructing a transaction set; mining frequent item sets from the item set and the transaction set based on different attribute information; determining association rules according to the multi-item frequent item sets among the frequent item sets, and establishing an association rule base; acquiring current relay protection data, and determining incomplete records according to a preset incomplete-record determination strategy; and searching the association rule base according to the attribute values of the determined attributes of an incomplete record to find an association rule matching that record, and using the matching rule to determine the actual value of the record's uncertain attribute. The invention replaces the preset value with the inferred value, so that the checked data better conforms to the association relationships of the big data, and data support can be provided for research based on relay protection big data.

Description

Relay protection data integrity checking method and system based on correlation algorithm

Technical Field

The invention relates to the technical field of relay protection data processing, in particular to a relay protection data integrity checking method and system based on an association algorithm.

Background

For relay protection big data, ensuring data integrity is an important goal of data cleaning. This requires first checking the integrity of the data and then predicting the missing attribute values of incomplete data by some method.

General methods for checking the integrity of data include: (1) checking the integrity of the data by means of a coding rule, such as a parity check, a checksum, or a CRC check; (2) using a complete reference data set, comparing the data to be verified with the items in that data set, and judging the integrity of the data accordingly. For example, the integrity check data set for relay protection action information includes the equipment, the substation, the data set to which an information point belongs, the information name, the standard semantics, the information value, and the time. For vacant attribute values in incomplete data, the most probable value is often used for filling, obtained for example by regression prediction or interpolation estimation.

The above data integrity checking methods and methods for predicting missing attribute values in incomplete data are suitable only for specific situations and are not general.

Disclosure of Invention

The invention provides a relay protection data integrity checking method and system based on a correlation algorithm, and aims to solve the problem of checking the integrity of relay protection data.

In order to solve the above problem, according to an aspect of the present invention, there is provided a method for checking integrity of relay protection data based on a correlation algorithm, the method including:

determining an item set according to the attribute value sets of different attributes in the acquired historical records, and constructing a transaction set by using the acquired historical records;

respectively mining a frequent item set by utilizing the item set and the transaction set based on different attribute information;

determining an association rule according to a plurality of frequent item sets in the frequent item set, and establishing an association rule base;

acquiring current relay protection data, and determining an incomplete record according to a preset incomplete-record determination strategy;

searching the association rule base according to the attribute value of the determined attribute of the determined incomplete record to determine an association rule matched with the incomplete record, and determining the actual value of the uncertain attribute of the incomplete record by using the association rule matched with the incomplete record.

Preferably, the mining of frequent item sets using the item sets and the transaction sets based on different attribute information includes:

step 21, comparing the support degree of each item in the item set with a preset support degree threshold, and retaining the items whose support degree is greater than or equal to the threshold to obtain the 1-item frequent item set;

step 22, setting k to 2;

step 23, in the (k-1) item frequent item set, determining a union of any two item sets with different last elements, and judging whether all subsets of each union are in the (k-1) item frequent item set;

step 24, if all the subsets of a certain union set are in the (k-1) item frequent item sets, calculating the support degree of the union set, and screening the item sets with the support degree being greater than or equal to a preset support degree threshold value for reservation to obtain k item frequent item sets;

step 25, judging whether the item number of the (k-1) item frequent item set is more than or equal to 2; if yes, updating k to k +1, and returning to step 23; otherwise, the process is finished.
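Steps 21 to 25 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the helper names `support` and `mine_frequent_itemsets`, the toy transactions, and the 50% threshold are all invented for the example, and the loop simply stops when a level comes out empty instead of applying the count test of step 25.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def mine_frequent_itemsets(transactions, min_sup):
    """Return {k: set of frozensets} of frequent k-item sets (hypothetical helper)."""
    items = {i for t in transactions for i in t}
    # Step 21: keep single items whose support reaches the threshold.
    level = {frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_sup}
    frequent = {}
    k = 1
    while level:
        frequent[k] = level
        k += 1  # Steps 22 and 25: move on to the next item-set size.
        prev = sorted(tuple(sorted(s)) for s in level)
        next_level = set()
        # Step 23: join two (k-1)-item sets that differ only in the last element.
        for a, b in combinations(prev, 2):
            if a[:-1] != b[:-1]:
                continue
            cand = frozenset(a) | frozenset(b)
            # Every (k-1)-item subset of the union must already be frequent.
            if any(frozenset(s) not in level
                   for s in combinations(cand, k - 1)):
                continue
            # Step 24: keep the union if its support reaches the threshold.
            if support(cand, transactions) >= min_sup:
                next_level.add(cand)
        level = next_level
    return frequent


# Invented toy transactions; the codes only mimic the A/B/C attribute coding.
toy = [frozenset(t) for t in (
    {"A5", "B2", "C2"}, {"A5", "B2", "C1"}, {"A5", "B2", "C2"}, {"A1", "B1", "C2"},
)]
result = mine_frequent_itemsets(toy, min_sup=0.5)
```

With this toy input the mining ends at the 3-item level, because a single 3-item set cannot be joined with anything.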

Preferably, the determining an association rule according to a plurality of frequent item sets in the frequent item set and establishing an association rule base includes:

for any one of the multiple frequent item sets, determining multiple corresponding antecedents and consequent items according to elements in the multiple frequent item sets to respectively determine multiple initial association rules;

screening the initial association rules with the confidence degrees larger than or equal to a preset confidence degree threshold value in the plurality of initial association rules as strong association rules, and establishing an association rule base by using the strong association rules.

Preferably, the screening of the initial association rules with the confidence level greater than or equal to a preset confidence level threshold in the plurality of initial association rules includes:

step 31, selecting one multi-item frequent item set;

step 32, setting g to 2;

step 33, generating the 1-consequent initial association rules from the multi-item frequent item set, comparing the confidence of each initial association rule with the confidence threshold, and determining the initial association rules whose confidence is greater than or equal to the confidence threshold as strong association rules;

step 34, collecting the consequents of the (g-1)-consequent strong association rules of the multi-item frequent item set into a (g-1)-consequent set, taking the union of any 2 consequents in the (g-1)-consequent set that differ in only 1 element, and judging whether all (g-1)-item subsets of the union are in the (g-1)-consequent set;

step 35, if all (g-1)-item subsets of a union are in the (g-1)-consequent set, taking the union as the consequent to form an association rule of the multi-item frequent item set, judging whether the confidence of the association rule is greater than or equal to the confidence threshold, and if so, determining the association rule as a strong association rule;

step 36, judging whether the current g is smaller than the difference value between the number of items in the frequent item set and 1; if yes, updating g to g +1, and returning to step 34; otherwise, the process is finished.
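The consequent-growing procedure of steps 31 to 36 can be sketched for a single frequent item set as follows. This is an illustrative sketch only: the function names `count` and `strong_rules` and the toy transactions are assumptions made for the example, and the confidence of a rule is computed as the count of the whole item set divided by the count of its antecedent.

```python
from itertools import combinations


def count(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def strong_rules(freq_set, transactions, min_conf):
    """Return [(antecedent, consequent, confidence)] for one frequent item set."""
    freq_set = frozenset(freq_set)
    total = count(freq_set, transactions)
    rules = []

    def keep_strong(consequents):
        kept = set()
        for cons in consequents:
            ante = freq_set - cons
            conf = total / count(ante, transactions)
            if conf >= min_conf:
                rules.append((ante, cons, conf))
                kept.add(cons)
        return kept

    # Step 33: rules with a single-item consequent.
    level = keep_strong({frozenset([i]) for i in freq_set})
    g = 2  # Step 32
    # Steps 34 to 36: grow consequents while a non-empty antecedent remains.
    while level and g <= len(freq_set) - 1:
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            # Two (g-1)-item consequents differing in one element give a g-item
            # union; all of its (g-1)-item subsets must be strong consequents.
            if len(union) == g and all(
                    frozenset(s) in level for s in combinations(union, g - 1)):
                candidates.add(union)
        level = keep_strong(candidates)
        g += 1
    return rules


# Invented toy transactions, chosen so that two 1-consequent rules are strong.
toy = ([frozenset({"A12", "B3", "C1"})] * 8 + [frozenset({"A12", "B3"})]
       + [frozenset({"A12", "C1"})] + [frozenset({"B3", "C1"})] * 8
       + [frozenset({"A12"})] * 2)
rules = strong_rules({"A12", "B3", "C1"}, toy, min_conf=0.85)
```

Growing consequents only from those that already produced strong rules avoids evaluating larger consequents that cannot pass the confidence threshold.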

Preferably, the determining an incomplete record according to a preset incomplete record determining policy includes:

and if the attribute value of certain attribute of a certain record in the obtained current relay protection data is a preset filling value or a null value, determining that the record is an incomplete record.

Preferably, the searching the association rule base according to the attribute value of the determined attribute of the determined incomplete record to determine the association rule matching the incomplete record, and determining the actual value of the uncertain attribute of the incomplete record by using the association rule matching the incomplete record comprises:

matching the attribute values of the determined attributes of the incomplete record against the antecedent of each association rule in the association rule base to determine the matching association rules;

and taking the attribute value given by the consequent of the matching association rule as the actual value of the uncertain attribute of the incomplete record, and filling it in.
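The matching and filling described above might look like the following sketch. The function name `fill_record`, the rule representation as (antecedent, consequent, confidence) tuples, the item codes, and the choice to try higher-confidence rules first are all assumptions made for illustration; the text does not prescribe how competing rules are ordered.

```python
def fill_record(record, rules, placeholders=("other", None)):
    """Return a copy of `record` with placeholder attributes inferred from rules.

    `record` maps attribute name -> value; each rule is a tuple
    (antecedent dict, consequent dict, confidence). All names are hypothetical.
    """
    known = {k: v for k, v in record.items() if v not in placeholders}
    filled = dict(record)
    # Assumption: try the most confident rules first.
    for ante, cons, _conf in sorted(rules, key=lambda r: r[2], reverse=True):
        # The antecedent must agree with the determined attributes ...
        if any(known.get(k) != v for k, v in ante.items()):
            continue
        # ... and the consequent may only touch undetermined attributes.
        if any(k in known for k in cons):
            continue
        filled.update(cons)
        known.update(cons)
    return filled


# Invented rules and record using the A-F coding style; the codes are made up.
rules = [
    ({"C": "C1", "D": "D7", "F": "F1"}, {"E": "E3"}, 0.92),
    ({"E": "E9"}, {"A": "A5"}, 0.96),
]
record = {"A": "A2", "B": "B2", "C": "C1", "D": "D7", "E": "other", "F": "F1"}
filled = fill_record(record, rules)
```

Here only the first rule applies, so the placeholder in attribute E is replaced while the determined attributes are left untouched.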

According to another aspect of the present invention, there is provided a system for checking integrity of relay protection data based on a correlation algorithm, the system including:

the transaction set construction unit is used for determining an item set according to the attribute value sets of different attributes in the acquired historical records and constructing a transaction set by using the acquired historical records;

the frequent item set determining unit is used for respectively mining frequent item sets by utilizing the item set and the transaction set based on different attribute information;

the association rule base establishing unit is used for determining association rules according to a plurality of frequent item sets in the frequent item set and establishing an association rule base;

the incomplete record determining unit is used for acquiring current relay protection data and determining an incomplete record according to a preset incomplete record determining strategy;

and the data checking unit is used for searching the association rule base according to the attribute value of the determined attribute of the determined incomplete record to determine an association rule matched with the incomplete record, and determining the actual value of the uncertain attribute of the incomplete record by using the association rule matched with the incomplete record.

Preferably, the mining of the frequent item sets by the frequent item set constructing unit based on different attribute information includes:

step 22, setting k to 2;

Preferably, the association rule base establishing unit determines the association rule according to a plurality of frequent item sets in the frequent item set, and establishes the association rule base, including:

the initial association rule determining module is used for determining a plurality of corresponding antecedents and postcedents according to elements in any one of the multiple frequent item sets so as to respectively determine a plurality of initial association rules;

and the association rule base establishing module is used for keeping the initial association rules with the confidence degrees larger than or equal to a preset confidence degree threshold value in the plurality of initial association rules as strong association rules and establishing an association rule base by using the strong association rules.

Preferably, the screening, by the association rule base establishing module, of the initial association rules with the confidence level greater than or equal to a preset confidence level threshold includes:

step 31, selecting a multiple item frequent item set;

step 32, setting g to 2;

Preferably, the incomplete record determining unit determines an incomplete record according to a preset incomplete record determining strategy, and includes:

Preferably, the data checking unit searches the association rule base according to the attribute value of the determined attribute of the determined incomplete record to determine an association rule matching the incomplete record, and determines the actual value of the uncertain attribute of the incomplete record by using the association rule matching the incomplete record, including:

The invention provides a method and a system for checking the integrity of relay protection data based on an association algorithm, in which the integrity of the relay protection data is checked by association analysis. The association relationships in relay protection big data are mined first; suspected incomplete records are then selected; the actual value of an uncertain attribute that currently holds a preset value is inferred from the determined attributes in the record and the big data association rules; and the inferred value replaces the preset value. The checked data thus better conforms to the association relationships of the big data, which benefits the management and application of relay protection big data and provides data support for research based on relay protection big data.

Drawings

A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:

fig. 1 is a flowchart of an integrity checking method 100 for relay protection data based on a correlation algorithm according to an embodiment of the present invention;

FIG. 2 is a flow chart of the Apriori algorithm; and

fig. 3 is a schematic structural diagram of an integrity checking system 300 for relay protection data based on a correlation algorithm according to an embodiment of the present invention.

Detailed Description

The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the present invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to be limiting of the invention. In the drawings, the same units/elements are denoted by the same reference numerals.

Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.

Fig. 1 is a flowchart of an integrity checking method 100 for relay protection data based on a correlation algorithm according to an embodiment of the present invention. As shown in fig. 1, the method provided in the embodiment of the present invention checks the integrity of relay protection data by association analysis: the association relationships in relay protection big data are mined first; suspected incomplete records are then selected; the actual value of an uncertain attribute that currently holds a preset value is inferred from the determined attributes in the record and the big data association rules; and the inferred value replaces the preset value. The checked data thus better conforms to the association relationships of the big data, which benefits the management and application of relay protection big data and provides data support for research based on relay protection big data.

The Apriori algorithm is the most basic and most commonly used algorithm in association rule mining and is mainly used for quickly mining association rules from big data. In big data association rule mining, an indivisible minimum unit of information $i$ is called an item, and a set $I_k = \{i_1, i_2, \dots, i_k\}$ is a k-item set. Let $I$ be the set of all items and $T = \{t_1, t_2, \dots, t_n\}$ a set of data transactions, where the set of items contained in each transaction $t_i$ is a subset of $I$. The number of transactions that contain an item set is referred to as the frequency of that item set. An association rule has the form $A \Rightarrow B$, where $A$ and $B$ are both proper subsets of $I$ and $A \cap B = \varnothing$.

The support degree $\mathrm{support}(A \Rightarrow B)$ reflects the probability that the items contained in $A$ and $B$ occur simultaneously in the transaction set. The calculation formula is as follows:

$$\mathrm{support}(A \Rightarrow B) = P(A \cup B) = \frac{|\{t \in T : A \cup B \subseteq t\}|}{|T|}$$

The confidence $\mathrm{confidence}(A \Rightarrow B)$ reflects the conditional probability that $B$ occurs in a transaction containing $A$. The calculation formula is as follows:

$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)}$$

If the support degree of an item set $I_k$ satisfies $\mathrm{support}(I_k) \ge \mathit{min\_sup}$, where $\mathit{min\_sup}$ is the minimum support threshold, then $I_k$ is a frequent item set.
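As a small worked illustration of these two measures, the sketch below uses the usual definitions: support is the fraction of transactions containing every item of A and B, and confidence is that support divided by the support of A alone. The function names and the four toy transactions are assumptions made for the example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of A together with B divided by support of A (estimate of P(B | A))."""
    a = frozenset(antecedent)
    return support(a | frozenset(consequent), transactions) / support(a, transactions)


# Invented toy transactions using the A/B coding style of the embodiment.
toy = [frozenset(t) for t in (
    {"A5", "B2"}, {"A5", "B2"}, {"A5", "B1"}, {"A1", "B2"},
)]
sup = support({"A5", "B2"}, toy)        # 2 of the 4 transactions
conf = confidence({"A5"}, {"B2"}, toy)  # 2 of the 3 transactions containing A5
```

With a minimum support of 0.5, the item set {A5, B2} in this toy data would just qualify as frequent.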

The Apriori algorithm uses a layer-by-layer iterative search to find frequent item sets by means of candidate sets. The basic idea is as follows: first find all 1-item frequent item sets $L_1$; $L_1$ is used to generate the candidate item set $C_2$, and the members of $C_2$ are tested to mine $L_2$, i.e., the 2-item frequent item sets; $L_2$ is then used to find $C_3$ and $L_3$, and so on, until no further k-item frequent item sets $L_k$ can be found. The algorithm exploits the Apriori property: if a candidate $C_k$ has a subset $C_{k-1}$ that is not in the previously determined $L_{k-1}$, then $C_k$ cannot be a frequent item set and is deleted directly. This screening reduces the number of candidates and thereby speeds up association rule mining. The Apriori algorithm flow is shown in fig. 2. Here the candidate item set $C_k$ corresponds, in claim 2, to the union of any two (k-1)-item frequent item sets whose last elements differ.

The integrity checking method 100 for relay protection data based on the association algorithm provided by the embodiment of the invention starts from step 101, determines an item set according to the attribute value sets of different attributes in the acquired history record in step 101, and constructs a transaction set by using the acquired history record.

For example, in an embodiment of the present invention, there are 22529 defect records, and the attributes of each record include: protection type, defect severity, whether the protection exits operation, defect location, defect cause, and equipment manufacturer. For convenience of program implementation, the information in these 6 dimensions is first coded. There are 20 protection types, represented by A1 to A20; 3 levels of defect severity, represented by B1 to B3; 2 cases for whether the protection exits operation, represented by C1 and C2; 91 defect locations, represented by D1 to D91; 55 defect causes, represented by E1 to E55; and 77 equipment manufacturers, represented by F1 to F77. Therefore, the item set is I = {A1 to A20, B1 to B3, C1, C2, D1 to D91, E1 to E55, F1 to F77}, which contains 248 items in total.

Each data record covers the 6 attributes of protection type, defect severity, whether the protection exits operation, defect location, defect cause, and equipment manufacturer, so each record forms one transaction. For example, the 1st and 2nd defect records constitute the first 2 transactions: {A5, B2, C2, D83, E23, F22} and {A5, B2, C2, D64, F39}. The defect cause of record 2 is "other", which matches no item in the item set I, so transaction 2 contains only 5 items.
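The coding of records into transactions can be illustrated with a short sketch. The `to_transaction` helper, the attribute names, and the codebook entries are hypothetical; only the mapping of line protection to A5 is taken from the text, and just three of the six attributes are shown to keep the example small.

```python
def to_transaction(record, codebook):
    """Encode one raw record as a frozenset of item codes.

    Values with no entry in the codebook (for example "other") contribute no item.
    """
    return frozenset(
        codebook[attr][value]
        for attr, value in record.items()
        if value in codebook.get(attr, {})
    )


# Hypothetical codebook fragment; the value labels other than
# "line protection" are invented stand-ins.
codebook = {
    "protection_type": {"line protection": "A5"},
    "defect_cause": {"cause-23": "E23"},
    "manufacturer": {"maker-22": "F22", "maker-39": "F39"},
}
rec1 = {"protection_type": "line protection", "defect_cause": "cause-23",
        "manufacturer": "maker-22"}
rec2 = {"protection_type": "line protection", "defect_cause": "other",
        "manufacturer": "maker-39"}
t1 = to_transaction(rec1, codebook)
t2 = to_transaction(rec2, codebook)
```

The second record yields one item fewer than the first, because "other" has no code in the item set.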

In step 102, based on different attribute information, the item set and the transaction set are utilized to respectively mine a frequent item set.

step 22, setting k to 2;

step 25, judging whether the item number of the (k-1) item frequent item set is more than or equal to 2; if yes, updating k to k +1, and returning to step 23; otherwise, the process is finished.

In an embodiment of the present invention, the minimum support degree is set to be 0.5%, and the process of determining the frequent item set includes:

(1) The 1-item frequent item set is generated first, on the basis of the item set. Specifically, with a minimum support degree of 0.5%, the items in the item set {A1 to A20, B1 to B3, C1, C2, D1 to D91, E1 to E55, F1 to F77} that do not meet the minimum support degree are deleted, and the remaining items form the 1-item frequent item set. For example, among the protection types A1 to A20, the items that are not in the 1-item frequent item set are: A6 (intelligent terminal), A16 (short lead protection), A18 (generator protection), A19 (generator protection), and A20 (fault location device).

(2) The 2-item frequent item sets are generated. In the 1-item frequent item set, any 2 item sets whose last elements differ are found and their union is taken. Because all subsets of each union are in the 1-item frequent item set, the support degree of the union is calculated directly; if it is greater than or equal to the minimum support degree, the union is a 2-item frequent item set, otherwise it is deleted. For example, the support degree of the union {A13, A5} of A13 (transformer protection) and A5 (line protection) is less than the minimum support degree, so it is not a 2-item frequent item set; the support degree of the union {A5, B1} of A5 (line protection) and B1 (critical defect) is greater than the minimum support degree, so it is a 2-item frequent item set.

(3) The 3-item frequent item sets are generated. In the 2-item frequent item sets, any 2 item sets whose last elements differ are found and their union is taken. It is then judged whether all subsets of the union are 2-item frequent item sets; if any subset is not, the union is not a 3-item frequent item set. For a union whose subsets are all 2-item frequent item sets, the support degree is calculated, and if it is greater than or equal to the minimum support degree, the union is a 3-item frequent item set. For example, the union of the 2-item frequent item sets {A5, B1} and {A5, B2} is {A5, B1, B2}; since the subset {B1, B2} of the union is not a 2-item frequent item set, {A5, B1, B2} is not a 3-item frequent item set.
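The subset test in the {A5, B1, B2} example can be isolated in a few lines. The `candidates` function below is a hypothetical helper written for this illustration; it performs only the join and the subset check and leaves the support calculation out.

```python
from itertools import combinations


def candidates(prev_level):
    """Join (k-1)-item sets that share all but the last element, then drop
    unions that have a (k-1)-item subset outside `prev_level`."""
    prev = {tuple(sorted(s)) for s in prev_level}
    out = set()
    for a, b in combinations(sorted(prev), 2):
        if a[:-1] == b[:-1]:
            cand = a + (b[-1],)
            if all(sub in prev for sub in combinations(cand, len(a))):
                out.add(cand)
    return out


# {B1, B2} is not a frequent 2-item set, so the union is rejected.
rejected = candidates([{"A5", "B1"}, {"A5", "B2"}])
# If {B1, B2} were frequent (an invented case), the union would survive.
accepted = candidates([{"A5", "B1"}, {"A5", "B2"}, {"B1", "B2"}])
```

A union rejected here never needs a pass over the transaction set, which is where the saving in computation comes from.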

(4) By the same method, the (k+1)-item frequent item sets are generated from the k-item frequent item sets, and the procedure is executed cyclically; in this example the loop terminates once the 6-item frequent item sets are obtained.

In step 103, an association rule is determined according to a plurality of frequent item sets in the frequent item set, and an association rule base is established.

step 31, selecting a multiple item frequent item set;

step 32, setting g to 2;

In the embodiment of the invention, association rules are generated from the 2-item, 3-item, 4-item, 5-item and 6-item frequent item sets. Take the generation of association rules from a 3-item frequent item set as an example. For one 3-item frequent item set {A12, B3, C1} (A12 denotes a protection fault information system substation, B3 a general defect, and C1 that the protection does not exit operation), the 1-consequent association rules are generated:

A12, B3 → C1: confidence 93.78% > 85%, so the association rule is a strong association rule;

A12, C1 → B3: confidence 86.48% > 85%, so the association rule is a strong association rule;

B3, C1 → A12: the confidence does not meet the requirement, so the association rule is not a strong association rule.

Thus the 1-consequent set {{C1}, {B3}} is obtained, and {C1, B3} is then taken as the consequent:

A12 → C1, B3: the confidence does not meet the requirement, so the association rule is not a strong association rule.

Therefore, the strong association rules generated by the 3-item frequent item set {A12, B3, C1} are A12, B3 → C1 and A12, C1 → B3.

And finally, establishing an association rule base according to the strong association rule obtained by each frequent item set.

In step 104, current relay protection data is obtained, and incomplete records are determined according to a preset incomplete-record determination strategy.

In an embodiment of the present invention, because the item set I cannot cover every situation encountered in the field, "other" or "NULL" is allowed to refer to the remaining situations. However, because field entries are sometimes filled in carelessly, some attribute fields that could have been described by an item of definite meaning in the item set I are instead filled in roughly with such a preset value, which produces incomplete data. Thus, an incomplete record can be identified by the presence of a preset filling value or a null value.
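A minimal sketch of this determination strategy follows. The placeholder set, the function names, and the sample record are assumptions for illustration; which preset filling values count as placeholders would be configured for the actual data source.

```python
# Assumed placeholder values; the text names "other" and "NULL" as examples.
PLACEHOLDERS = frozenset({"other", "NULL", "", None})


def uncertain_attributes(record, placeholders=PLACEHOLDERS):
    """Return the attribute names whose value is a preset filling value or null."""
    return [name for name, value in record.items() if value in placeholders]


def is_incomplete(record, placeholders=PLACEHOLDERS):
    """A record is incomplete if at least one attribute is uncertain."""
    return bool(uncertain_attributes(record, placeholders))


# Invented record in the A-F coding style, with the defect cause left as "other".
record = {"A": "A5", "B": "B2", "C": "C2", "D": "D64", "E": "other", "F": "F39"}
flagged = uncertain_attributes(record)
```

Returning the names of the uncertain attributes, rather than only a yes/no flag, gives the later rule-matching step the list of fields it is allowed to fill.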

In step 105, the association rule base is searched according to the attribute values of the determined attributes of the determined incomplete records to determine the association rule matching the incomplete records, and the actual values of the uncertain attributes of the incomplete records are determined by using the association rule matching the incomplete records.

In the implementation of the invention, a defect is characterized mainly by 6 dimensions: protection type, defect severity, whether the protection exits operation, defect location, defect cause, and equipment manufacturer, represented by A to F respectively. Listing the possible values in each dimension yields the item set I. As noted above, the items in the item set I cannot cover every situation in the field, so "other" or "NULL" is allowed; when such a preset value is entered carelessly in a field that an item of definite meaning could have described, incomplete data results.

According to the embodiment of the invention, big data association analysis was performed on the 22529 defect records collected from 2012 to April 2019, with the minimum confidence set to 85% and the minimum support degree set to 0.5%. In total, 146 association rules were mined from the 248 items in the 6 attribute dimensions; 16 representative association rules are shown in table 1.

TABLE 1 Association rules between relay protection defect data attributes

When the relay protection defect data contains a suspected incomplete record, for example {A = bus protection; B = severe; C = yes; D = output plug-in; E = other; F = manufacturer 1}, the defect cause field E is suspected to be an incomplete attribute. According to association rule 9 in table 1, when "C = yes & D = output plug-in & F = manufacturer 1" holds, the probability of "E = plug-in damage" is 92.21%, so it can be inferred that "E = other" in the original record is an incomplete attribute and should be replaced by "E = plug-in damage". As another example, in the record {A = other protection; B = severe; C = yes; D = AC circuit; E = external force damage; F = manufacturer 1}, the protection type field A is suspected to be an incomplete attribute. According to association rule 5 in table 1, when "E = external force damage" holds, the probability of "A = line protection" is 95.95%, so it can be inferred that "A = other protection" in the original record is an incomplete attribute and should be replaced by "A = line protection".

Relay protection big data can create good conditions for improving professional applications, and data integrity is an important aspect of data quality. Exploiting the strong associations within relay protection big data, the embodiment of the invention applies the Apriori algorithm to mine the associations in the data, generates association rules, checks the integrity of the relay protection data according to those rules, and predicts the vacant attribute values in incomplete data. Taking relay protection defect data as an example, the associations among the 248 items in the 6 dimensions of protection type, defect severity, whether the protection exits operation, defect location, defect cause, and equipment manufacturer were mined from 22529 defect records, and the incomplete data was processed accordingly with good results.

Fig. 3 is a schematic structural diagram of an integrity checking system 300 for relay protection data based on a correlation algorithm according to an embodiment of the present invention. As shown in fig. 3, an integrity checking system 300 for relay protection data based on a correlation algorithm according to an embodiment of the present invention includes: a transaction set construction unit 301, a frequent item set determination unit 302, an association rule base establishment unit 303, an incomplete record determination unit 304, and a data check unit 305.

Preferably, the transaction set constructing unit 301 is configured to determine an item set according to attribute value sets of different attributes in the obtained history record, and construct a transaction set by using the obtained history record.

Preferably, the frequent item set determining unit 302 is configured to mine a frequent item set by using the item set and the transaction set respectively based on different attribute information.

Preferably, the frequent item set constructing unit 302, based on different attribute information, respectively mining a frequent item set by using the item set and the transaction set, including:

step 22, setting k to 2;

Preferably, the association rule base establishing unit 303 is configured to determine an association rule according to a plurality of frequent item sets in the frequent item set, and establish an association rule base.

Preferably, the association rule base establishing unit 303 determines an association rule according to a plurality of frequent item sets in the frequent item set, and establishes an association rule base, including:

and the association rule base establishing module is used for screening the initial association rules with the confidence degrees larger than or equal to a preset confidence degree threshold value in the plurality of initial association rules as strong association rules and establishing an association rule base by using the strong association rules.

step 31, selecting a multiple item frequent item set;

step 32, setting g to 2;

Preferably, the incomplete record determining unit 304 is configured to obtain current relay protection data, and determine an incomplete record according to a preset incomplete record determining policy.

Preferably, the incomplete record determining unit 304, determining an incomplete record according to a preset incomplete record determining strategy, includes:

Preferably, the data checking unit 305 is configured to search the association rule base according to the attribute value of the determined attribute of the determined incomplete record to determine an association rule matching the incomplete record, and determine the actual value of the uncertain attribute of the incomplete record by using the association rule matching the incomplete record.

Preferably, the data checking unit 305 searching the association rule base using the attribute values of the determined attributes of the incomplete record to find the association rules matching the incomplete record, and determining the actual values of the uncertain attributes of the incomplete record using the matched association rules, includes:

The integrity checking system 300 for relay protection data based on the association algorithm in this embodiment of the present invention corresponds to the integrity checking method 100 for relay protection data based on the association algorithm in another embodiment of the present invention, and is not described again here.

The invention has been described with reference to a few embodiments. However, other embodiments of the invention than the one disclosed above are equally possible within the scope of the invention, as would be apparent to a person skilled in the art from the appended patent claims.

Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [device, component, etc.]" are to be interpreted openly as referring to at least one instance of said device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims

1. A relay protection data integrity checking method based on a correlation algorithm, characterized by comprising the following steps:

2. The method of claim 1, wherein mining frequent item sets, using the item set and the transaction set, separately for each type of attribute information comprises:

step 22, setting k to 2;

3. The method of claim 1, wherein determining association rules from the multi-item frequent item sets among the frequent item sets and establishing an association rule base comprises:

for any one of the multi-item frequent item sets, determining a plurality of corresponding antecedents and consequents from the elements of that item set, so as to determine a plurality of initial association rules;

and screening, from the plurality of initial association rules, those whose confidence is greater than or equal to a preset confidence threshold as strong association rules, and establishing the association rule base from the strong association rules.

4. The method of claim 1, wherein the screening of the initial association rules with a confidence level greater than or equal to a preset confidence level threshold from among the plurality of initial association rules comprises:

step 31, selecting a multi-item frequent item set;

step 32, setting g to 2;

5. The method of claim 1, wherein determining incomplete records according to a preset incomplete record determination strategy comprises:

6. The method of claim 1, wherein searching the association rule base using the attribute values of the determined attributes of the incomplete record to find the association rules matching the incomplete record, and determining the actual values of the uncertain attributes of the incomplete record using the matched association rules, comprises:

7. An integrity checking system of relay protection data based on a correlation algorithm, the system comprising:

8. The system of claim 7, wherein the frequent item set constructing unit mining frequent item sets, using the item set and the transaction set, separately for each type of attribute information, comprises:

step 22, setting k to 2;

9. The system according to claim 7, wherein the association rule base establishing unit determining association rules from the multi-item frequent item sets among the frequent item sets, and establishing the association rule base, comprises:

10. The system according to claim 9, wherein the association rule base establishing module for screening the initial association rules with confidence levels greater than or equal to a preset confidence level threshold from among the plurality of initial association rules comprises:

step 31, selecting a multi-item frequent item set;

step 32, setting g to 2;

11. The system of claim 7, wherein the incomplete record determining unit determines the incomplete record according to a preset incomplete record determining strategy, comprising:

12. The system of claim 7, wherein the data checking unit searching the association rule base using the attribute values of the determined attributes of the incomplete record to find the association rules matching the incomplete record, and determining the actual values of the uncertain attributes of the incomplete record using the matched association rules, comprises: