CN111292201A - Method for pushing field operation and maintenance information of power communication network based on Apriori and RETE - Google Patents

Publication number
CN111292201A
Authority
CN
China
Prior art keywords
node
information
rule
maintenance
storage area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010058050.0A
Other languages
Chinese (zh)
Inventor
莫穗江
高国华
李瑞德
王�锋
张欣欣
温志坤
黄定威
梁英杰
廖振朝
杨玺
张欣
汤铭华
陈嘉俊
李伟雄
童捷
张天乙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Jiangmen Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Jiangmen Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd, Jiangmen Power Supply Bureau of Guangdong Power Grid Co Ltd filed Critical Guangdong Power Grid Co Ltd
Priority to CN202010058050.0A priority Critical patent/CN111292201A/en
Publication of CN111292201A publication Critical patent/CN111292201A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Energy or water supply
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for pushing field operation and maintenance information of a power communication network based on Apriori and RETE. First, the Apriori algorithm is used to mine association rules from the pushed content information, service information and operation and maintenance personnel information in the power communication network, and a rule base is established. After the operation and maintenance personnel terminal returns the field information, the rule engine selects a rule set R to be executed according to the operation and maintenance personnel information and the operation and maintenance task information; each rule in the rule set R is then executed with the RETE algorithm against all the push information stored on the server, giving the correlation value between each piece of information and the current operation and maintenance personnel. The push information is then sorted in descending order of correlation, an information push threshold is set, the push information whose correlation is not less than the threshold is selected, and it is pushed in order of correlation from high to low. In this way, personalized operation and maintenance information that better matches the actual conditions of the operation and maintenance site can be pushed to the operation and maintenance personnel, improving the management efficiency and level of on-site operation and maintenance of the power communication network.

Description

Method for pushing field operation and maintenance information of power communication network based on Apriori and RETE
Technical Field
The invention relates to the field of power communication networks, in particular to a method for pushing field operation and maintenance information of a power communication network based on Apriori and RETE.
Background
The power communication network is the core network supporting the production and operation of power enterprises in China and an important platform for building smart power grids. As the information network of the power system, it plays a vital role in the safe and stable operation of that system. On-site operation and maintenance is one of the means of ensuring the safe and stable operation of the power communication network; its main tasks include clearing alarms on power communication equipment, finding equipment faults, analyzing their causes and eliminating them.
The power communication network is paired with an electronic communication operation and maintenance system, which raises the level of informatization and intelligence of on-site operation and maintenance by digitizing on-site information, and thereby improves operation and maintenance quality to a certain extent. However, this system has the following problems: it lacks analysis of the information from the operation and maintenance site, it cannot exchange information with operation and maintenance personnel, and it cannot provide accurate remote judgment and guidance for on-site work.
Applying an information push technology on top of the power communication operation and maintenance system solves these problems to a certain extent. The technology analyzes the information returned from the operation and maintenance site, takes the skill level of the operation and maintenance personnel into account, and pushes the information they need, which improves the interaction between the personnel and the back end and further improves operation and maintenance quality. However, information push technology is not yet widely applied in the on-site operation and maintenance of power communication systems. In most cases the operation and maintenance personnel must actively search for the information they need, and because the search process lacks information about the personnel, the same request from different personnel returns the same results. The existing information push technology therefore cannot give personalized results according to the characteristics of the operation and maintenance personnel.
Disclosure of Invention
In order to solve the problem that the information pushing technology in the prior art cannot provide personalized operation and maintenance information according to the characteristics of operation and maintenance personnel, the invention provides a method for pushing the site operation and maintenance information of the power communication network based on Apriori and RETE, which can push the personalized operation and maintenance information more conforming to the actual conditions of the operation and maintenance site for the operation and maintenance personnel, thereby improving the management efficiency and level of the site operation and maintenance of the power communication network. The Apriori algorithm is an association rule mining algorithm, and the RETE algorithm is a forward rule fast matching algorithm.
In order to solve the technical problems, the invention adopts the technical scheme that: a method for pushing operation and maintenance information on site of a power communication network based on Apriori and RETE comprises the following steps:
step one: collecting push information, service information and operation and maintenance personnel information from a power communication network, and preprocessing data;
step two: mining association rules based on an Apriori algorithm to form a rule base;
step three: carrying out rule matching on the information of the on-site operation and maintenance personnel and the operation and maintenance task based on the RETE algorithm;
step four: by setting an information pushing threshold, the pushing of the on-site operation and maintenance information of the power communication network is realized, and the correlation between the current on-site operation and maintenance personnel and the pushed information is updated.
Preferably, in the first step, the preprocessing of the data includes removing data records containing missing values.
The push information comprises temperature abnormity, load abnormity, flow abnormity, misoperation, power abnormity, operation flow, notice, configuration files, video guidance, equipment historical fault information, equipment manufacturer information, equipment age information, task completion degree and operation normalization;
the service information includes: communication equipment inspection, communication network management inspection, communication line inspection, communication optical transmission equipment inspection, communication network management equipment inspection, communication data network equipment inspection, mobile emergency communication system inspection, television telephone conference equipment inspection, communication clock synchronization equipment inspection, communication power supply equipment inspection, communication switching equipment inspection, communication access equipment inspection, communication carrier equipment inspection, communication cable line inspection, communication optical cable line inspection, fault treatment, mode opening, communication implementation and communication acceptance inspection;
the operation and maintenance personnel information includes: operation and maintenance personnel number, service type and skill level.
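The preprocessing in step one, removing records that contain missing values, can be sketched as follows. This is a minimal illustration only: the record layout and the field names (`staff_id`, `service`, `skill`, `push`) are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Optional

# Hypothetical record layout; the field names are illustrative assumptions.
Record = Dict[str, Optional[str]]


def drop_incomplete(records: List[Record], required: List[str]) -> List[Record]:
    """Keep only the records in which every required field is present and non-empty."""
    return [
        r for r in records
        if all(r.get(name) not in (None, "") for name in required)
    ]


records = [
    {"staff_id": "OM001", "service": "fault treatment", "skill": "senior", "push": "video guidance"},
    {"staff_id": "OM002", "service": None, "skill": "junior", "push": "notice"},
    {"staff_id": "OM003", "service": "communication line inspection", "skill": "", "push": "operation flow"},
]
clean = drop_incomplete(records, ["staff_id", "service", "skill", "push"])
```

Only the first record survives here, because the second has a missing service type and the third an empty skill level.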
Preferably, in the second step, the mining of association rules first mines a frequent item set, and then generates association rules through the frequent item set.
Preferably, the specific steps of mining the frequent item set are as follows:
S2.1.1: scanning the database; for the given database, creating a two-dimensional table F_0 as shown in Table 1, the storage direction of F_0 being opposite to the storage direction of the table in the database; in F_0, one column holds the set I of all items or attributes in the database that need to be mined (I = {I_1, I_2, …, I_m}), which are also all the 1-item sets in the association rule mining process; each row of F_0 holds the set of records in which the corresponding item or attribute occurs, that is, all records in which that 1-item set appears;
Table 1: F_0 table
ItemSet (item set)    RecordsSet (record set)    Count (count)
I_1                   T_1                        p
I_2                   T_2                        q
……                    ……                         ……
I_m                   T_m                        n
S2.1.2: counting the number of records in each row of the two-dimensional table F_0 to obtain the support of each item I_i (i = 1, 2, …, m), comparing it with the minimum support, and deleting the rows of the items whose support is less than the minimum support, to obtain the frequent 1-item set table F_1, i.e. the frequent 1-item set L_1, L_1 ⊆ I;
S2.1.3: generating the frequent K-item sets; starting from the frequent 1-item set table, joining and pruning are performed together to generate the frequent K-item sets; the table from which they are generated has the structure shown in Table 2. In Table 2, the first column holds the frequent (k-1)-item sets, and the T_1 entries in the following column are the identifiers of the records in which the preceding item set occurs;
Table 2: F_{k-1} table
Figure RE-GDA0002429084350000021
In the table, I_a, I_b, …, I_m represent different items or attributes, and {I_a, I_b, …, I_m} contains k-1 items; {T_a, T_b, …} represents the set of identifiers of the records in which the items I_a, I_b, …, I_m occur together.
The pruning operation uses the property of the Apriori algorithm to prune the item sets to be joined before the join generates the frequent K-item sets. The requirement on each item of a frequent (K-1)-item set is that it occurs at least K-1 times across all the frequent (K-1)-item sets;
when joining, the two (K-1)-item sets selected must satisfy a condition: they differ in exactly one item, and the remaining K-2 items are identical and form a frequent set. The join operation takes the transaction sets corresponding to the two item sets that satisfy the join condition, computes their intersection, and counts the transactions in the result; if the count is greater than or equal to the minimum support count, the item set is put into the frequent K-item set table F_k; otherwise the item set is skipped and the next pair of (K-1)-item sets satisfying the join condition is joined, until no item sets among the frequent (K-1)-item sets satisfy the join condition.
S2.1.4: repeating step S2.1.3 until, after pruning, no frequent item sets satisfy the join condition; execution then ends and all frequent item sets have been obtained.
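Steps S2.1.1 to S2.1.4 can be sketched in Python as below. This is a hedged illustration under stated assumptions, not the patented implementation: the function name `mine_frequent_itemsets`, the use of Python sets for the record-identifier columns, and the sample transactions are all assumptions made for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set


def mine_frequent_itemsets(
    transactions: List[Iterable[str]], min_support: float
) -> Dict[FrozenSet[str], float]:
    """Return {item set: support} using one database scan and record-set intersections."""
    n = len(transactions)
    min_count = min_support * n

    # Single scan: build F_0, mapping each 1-item set to the ids of the records containing it.
    f0: Dict[FrozenSet[str], Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            f0.setdefault(frozenset([item]), set()).add(tid)

    # F_1: drop the rows whose record count is below the minimum support count.
    level = {s: tids for s, tids in f0.items() if len(tids) >= min_count}
    frequent: Dict[FrozenSet[str], float] = {}
    k = 2  # size of the item sets generated in the current round
    while level:
        frequent.update({s: len(tids) / n for s, tids in level.items()})

        # Pruning: every item of a joinable (k-1)-item set must occur in at
        # least k-1 of the frequent (k-1)-item sets.
        occurrences: Dict[str, int] = {}
        for s in level:
            for item in s:
                occurrences[item] = occurrences.get(item, 0) + 1
        joinable = [s for s in level if all(occurrences[i] >= k - 1 for i in s)]

        # Join: two (k-1)-item sets that differ in exactly one item; the support
        # of their union is the size of the intersection of their record sets.
        next_level: Dict[FrozenSet[str], Set[int]] = {}
        for a, b in combinations(joinable, 2):
            union = a | b
            if len(union) != k or union in next_level:
                continue
            tids = level[a] & level[b]
            if len(tids) >= min_count:
                next_level[union] = tids
        level = next_level
        k += 1
    return frequent


transactions = [
    {"fault treatment", "junior", "video guidance"},
    {"fault treatment", "junior"},
    {"fault treatment", "video guidance"},
    {"junior", "video guidance"},
    {"fault treatment", "junior", "video guidance"},
]
frequent = mine_frequent_itemsets(transactions, min_support=0.4)
```

Because the record set of a joined item set is exactly the intersection of the record sets of its two parents, no candidate item set has to be verified by rescanning the database.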
Preferably, the generating of the association rule includes the following steps:
S2.2.1: according to the frequent K-item set F_ki, establishing a frequent item set tree with root node root[{I_a, I_b, …, I_m}, Min_Sup, θ];
Figure RE-GDA0002429084350000031
Wherein, F_ki represents the i-th frequent item set among the frequent K-item sets; I_a, I_b, …, I_m represent different items or attributes; Min_Sup represents the minimum support; θ is the reciprocal of the correlation degree, and since Corr ≥ 1, θ ≤ 1, which serves to remove false rules; Support(L) is the support of a frequent item set L; Min_Conf is the minimum confidence;
S2.2.2: inserting the first-layer child nodes into the frequent item set tree, the first-layer child nodes being the frequent 1-item sets of F_ki, each containing the support of its item set, arranged under the root node in ascending order of support;
S2.2.3: verifying node N_{j,x}, which corresponds to the frequent item set l: if Support(l) ≤ θ, the correlation degree is computed as
Figure RE-GDA0002429084350000032
Wherein, Corr(l → F_ki − l) is the correlation degree between the antecedent item set l and the consequent item set F_ki − l; Conf(l → F_ki − l) is the confidence of the rule l → F_ki − l; Support(F_ki − l) is the support of the frequent item set F_ki − l;
if Corr(l → F_ki − l) > Min_Corr, the association rule r: l → F_ki − l is derived, where l is the antecedent item set of the rule and F_ki − l is the consequent item set, with confidence:
Figure RE-GDA0002429084350000033
wherein Support(F_ki) is the support of the frequent item set F_ki;
and different operations are performed for different types of N_{j,x};
S2.2.4: inserting the child nodes of layer j+1 under node N_{j,x}: applying the join method of the Apriori algorithm, N_{j,x} is joined with each of its following sibling nodes under the same branch to generate the child nodes of N_{j,x} at layer j+1; the support Support(l) of the frequent item set is added to each child node, the child nodes are inserted into the frequent item set tree as child nodes of N_{j,x}, N_{j,x} is set to N_{j+1,1}, and the method goes to step S2.2.3;
S2.2.5: repeating steps S2.2.1 to S2.2.4 until every item set in the frequent item sets has been processed, to obtain the complete association rule set R, namely the rule base.
Preferably, the different operations performed for N_{j,x} are:
1) if j = 1 and x = k, N_{j,x} is the last node N_{1,k} of the first layer, and construction of the frequent item set tree is finished;
2) if N_{j,x} contains the frequent item set in N_{1,k}, let N_{j,x} = N_{j,x}.Next, i.e. the next sibling of the parent, and perform step S2.2.3;
3) if j = k-1 and N_{j,x} does not contain the frequent item set in N_{1,k}, let N_{j,x} = N_{j,x}.Next and perform step S2.2.3;
4) if 1 < j < k-1, N_{j,x} is an intermediate node; the subtree under the node is no longer constructed, let N_{j,x} = N_{j,x}.Next and perform step S2.2.3.
The Apriori algorithm adopts a two-stage mining approach: association rule mining is divided into two steps, first generating from the database the frequent item sets that satisfy the minimum support, and then generating, from those frequent item sets, the strong association rules that satisfy the minimum confidence. The Apriori algorithm is the most classical algorithm in the field of association rule mining, but it has several shortcomings in application, mainly too many database scans, a large number of candidate item sets, a complex rule generation step, and serious redundancy in the generated rules. The method therefore combines set operations with the Apriori algorithm: the database is scanned only once, and the frequent K-item sets are generated iteratively without rescanning the database, using set intersection and union instead. This optimizes the join operation, generates no candidate item sets during execution, and improves the efficiency of frequent item set mining. On this basis, the frequent item set tree is used to optimize the generation of association rules, and the concept of correlation degree is introduced, which reduces redundant and false rules, optimizes the rule generation process, and improves the overall efficiency and accuracy of the algorithm.
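The role of confidence and correlation degree in filtering rules can be illustrated with a short sketch. The formula images are not reproduced in this text, so the sketch assumes the standard definitions Conf(l → F − l) = Support(F) / Support(l) and Corr(l → F − l) = Conf(l → F − l) / Support(F − l); these are assumptions, as are the function name and the sample supports. It also enumerates antecedents directly instead of building the frequent item set tree, and it omits the θ pre-check, so it shows the filtering criteria rather than the tree-based procedure described above.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

RuleRow = Tuple[FrozenSet[str], FrozenSet[str], float, float]


def generate_rules(
    frequent: Dict[FrozenSet[str], float], min_conf: float, min_corr: float
) -> List[RuleRow]:
    """Return (antecedent, consequent, confidence, correlation) for the rules that pass both filters."""
    rules: List[RuleRow] = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), size)):
                consequent = itemset - antecedent
                # Assumed definitions (standard confidence and lift).
                conf = support / frequent[antecedent]
                corr = conf / frequent[consequent]
                if conf >= min_conf and corr > min_corr:
                    rules.append((antecedent, consequent, conf, corr))
    return rules


# Hypothetical supports; every subset of a frequent item set must be present.
supports = {
    frozenset({"fault treatment"}): 0.4,
    frozenset({"video guidance"}): 0.5,
    frozenset({"fault treatment", "video guidance"}): 0.3,
}
rules = generate_rules(supports, min_conf=0.7, min_corr=1.0)
```

With these supports, only the rule from "fault treatment" to "video guidance" is kept; the reverse rule has the same correlation but a confidence of 0.6, below the 0.7 minimum.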
Preferably, in the third step, the specific rule matching step is as follows:
s3.1: sequencing the original rules;
s3.2: creating a RETE network;
s3.3: matching the RETE network.
Preferably, the specific steps of sequencing the original rule are as follows:
s3.1.1: establishing a classification set;
S3.1.2: adding the conditions that appear in the rule set to the set F, and calculating the frequency with which each condition appears in the rule set, namely the node sharing degree, which is the degree to which the node is shared in the rule set. Let S_x be the node sharing degree of node x; then
Figure RE-GDA0002429084350000041
Wherein
Figure RE-GDA0002429084350000042
S3.1.3: sorting the elements in the set F in a descending order according to the occurrence frequency;
S3.1.4: returning the processed rule set.
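The ordering in steps S3.1.1 to S3.1.4 can be sketched as follows. The formula for the node sharing degree is not reproduced in this text, so the sketch assumes it is simply the number of rules in which a condition appears; that assumption, the rule layout and the condition strings are all illustrative.

```python
from collections import Counter
from typing import Dict, List, Tuple

Rule = Tuple[List[str], str]  # (conditions, conclusion); hypothetical layout


def sort_conditions(rules: List[Rule]) -> Tuple[List[Rule], Dict[str, int]]:
    """Reorder each rule's conditions by descending sharing degree (ties broken by name)."""
    # Assumed sharing degree: the number of rules that contain the condition.
    sharing = Counter(cond for conditions, _ in rules for cond in set(conditions))
    ordered = [
        (sorted(conditions, key=lambda c: (-sharing[c], c)), conclusion)
        for conditions, conclusion in rules
    ]
    return ordered, dict(sharing)


rule_set = [
    (["skill=junior", "service=fault treatment"], "push=video guidance"),
    (["service=fault treatment", "alarm=temperature"], "push=temperature abnormity"),
    (["alarm=power", "skill=junior", "service=fault treatment"], "push=power abnormity"),
]
ordered, sharing = sort_conditions(rule_set)
```

Placing the most widely shared conditions first means that rules with common prefixes can reuse the same nodes when the network is built.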
Preferably, creating the RETE network comprises the steps of:
s3.2.1: creating a root node;
S3.2.2: extracting a rule from the processed rule set and extracting a pattern from the rule; checking whether the parameter type of the pattern is a new type; if it is, creating a new type node and adding to it a HashMap for recording the successor nodes of that node, wherein the key of the HashMap is a literal name and the value is an α node;
S3.2.3: judging whether an α node corresponding to the pattern already exists; if so, recording the position of the node; if not, creating an α node corresponding to the pattern, adding it to the network and to the HashMap of the corresponding type node, and creating an α memory table for it;
S3.2.4: repeating steps S3.2.2 to S3.2.3 until all the patterns in the rule have been processed;
S3.2.5: combining the β nodes: α(1) is the left input node of β(2) and α(2) is the right input node of β(2); for i > 2, β(i-1) is the left input node of β(i) and α(i) is the right input node of β(i); each β node inner-joins the memory tables of its two parent nodes into its own memory table;
S3.2.6: repeating step S3.2.5 until all β nodes have been processed;
S3.2.7: packaging the conclusion part of the rule into a leaf node and using it as the output node of β(n);
S3.2.8: repeating steps S3.2.2 to S3.2.7 until all rules have been processed.
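The type-node HashMap of steps S3.2.2 and S3.2.3 can be sketched with a Python dict keyed by the literal name. The class names, the `(type, attribute, value)` pattern layout and the sample rules are assumptions for illustration; β nodes and leaf nodes are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Pattern = Tuple[str, str, str]  # (fact type, attribute, required value); hypothetical


@dataclass
class AlphaNode:
    pattern: Pattern
    memory: List[dict] = field(default_factory=list)  # the α memory table


@dataclass
class TypeNode:
    type_name: str
    # Successor map: literal name -> α node (the HashMap of step S3.2.2).
    alpha_nodes: Dict[str, AlphaNode] = field(default_factory=dict)


class Network:
    def __init__(self) -> None:
        self.type_nodes: Dict[str, TypeNode] = {}  # children of the root node
        self.rules: List[Tuple[List[AlphaNode], str]] = []

    def add_rule(self, patterns: List[Pattern], conclusion: str) -> None:
        chain: List[AlphaNode] = []
        for pattern in patterns:
            type_name, attribute, value = pattern
            type_node = self.type_nodes.setdefault(type_name, TypeNode(type_name))
            literal = f"{attribute}={value}"
            alpha = type_node.alpha_nodes.get(literal)
            if alpha is None:  # create the α node only if it does not exist yet
                alpha = AlphaNode(pattern)
                type_node.alpha_nodes[literal] = alpha
            chain.append(alpha)
        self.rules.append((chain, conclusion))

    def alpha_count(self) -> int:
        return sum(len(t.alpha_nodes) for t in self.type_nodes.values())


net = Network()
net.add_rule(
    [("Staff", "skill", "junior"), ("Task", "service", "fault treatment")],
    "push video guidance",
)
net.add_rule(
    [("Staff", "skill", "junior"), ("Alarm", "kind", "temperature abnormity")],
    "push temperature abnormity",
)
```

The two rules contain four patterns but the network holds only three α nodes, because the lookup in the type node's map finds the existing `skill=junior` node instead of creating a second one.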
Preferably, the concrete steps of RETE network matching are as follows:
S3.3.1: adding all the facts that need to be processed into the fact set;
S3.3.2: if the fact set is not empty, selecting a fact for processing; otherwise stopping the matching process;
S3.3.3: putting the working storage area element into the discrimination network and matching it starting from the root node; if the type of the working storage area element is the same as the type of a type node, saving the fact in the α storage area corresponding to that node, after which the working storage area element continues matching along the network;
S3.3.4: if the working storage area element is passed to the right end of a β node, adding it to the right storage area of the β node and matching it against the Tokens in the left storage area; if the match succeeds, adding it to the Token and passing the Token to the next node; if the match fails, abandoning it;
continuing the matching process of the working storage area elements along the network; if a working storage area element is passed to the left end of a β node, packaging it as a Token and passing the Token to the next node;
S3.3.5: if a Token is passed to the left end of a β node, adding it to the left storage area of the β node and matching it against the working storage area elements in the right storage area; if the match succeeds, the Token encapsulates the matched working storage area element to form a new Token, which is passed to the next node; if the match fails, abandoning it;
S3.3.6: if a Token is passed to a terminal node, the rule corresponding to the root node is activated, a corresponding Activation is created, and the Activation is stored in the Agenda to await firing; if a working storage area element is passed to a terminal node, it is packaged as a Token, the rule corresponding to the root node is activated, a corresponding Activation is created, and the Activation is stored in the Agenda.
The RETE algorithm is improved in two respects. First, before the RETE network is established, the condition parts of all rules in the rule set are stored in a condition library, and the conditions in the library are sorted in descending order of occurrence frequency. Second, during establishment of the RETE network, a HashMap is added to each type node to record the successor nodes of that node; the key of the HashMap is a literal name and the value is an α node, and whenever a new α node is added under a type node it is also added to the HashMap. Because the rule base is processed before the network is established and the α nodes are looked up through the HashMap, the improved RETE algorithm reduces memory consumption and shortens the overall matching time.
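The left and right activation of a β node described in the matching steps can be sketched as below. This is a simplified illustration, not the patent's engine: the patent does not state the join test, so the sketch assumes that facts are joined on a shared `staff_id` field, and the fact layout, the α tests and the rule conclusion are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Fact = Dict[str, str]
Token = Tuple[Fact, ...]  # a partial match: the facts matched so far


class BetaNode:
    def __init__(self, join_key: str) -> None:
        self.join_key = join_key  # assumed join test: equal value of this field
        self.left_memory: List[Token] = []
        self.right_memory: List[Fact] = []
        self.children: List[Callable[[Token], None]] = []

    def _consistent(self, token: Token, fact: Fact) -> bool:
        return all(f[self.join_key] == fact[self.join_key] for f in token)

    def _emit(self, token: Token) -> None:
        for child in self.children:
            child(token)

    def left_activate(self, token: Token) -> None:
        # Store the Token, then match it against the elements in the right storage area.
        self.left_memory.append(token)
        for fact in self.right_memory:
            if self._consistent(token, fact):
                self._emit(token + (fact,))

    def right_activate(self, fact: Fact) -> None:
        # Store the element, then match it against the Tokens in the left storage area.
        self.right_memory.append(fact)
        for token in self.left_memory:
            if self._consistent(token, fact):
                self._emit(token + (fact,))


agenda: List[Tuple[str, Token]] = []  # activations waiting to be fired
join = BetaNode(join_key="staff_id")
join.children.append(lambda token: agenda.append(("push video guidance", token)))


def assert_fact(fact: Fact) -> None:
    # α tests (hypothetical): junior staff feed the left input, fault-treatment tasks the right.
    if fact["type"] == "staff" and fact.get("skill") == "junior":
        join.left_activate((fact,))
    if fact["type"] == "task" and fact.get("service") == "fault treatment":
        join.right_activate(fact)


for fact in [
    {"type": "task", "staff_id": "OM001", "service": "fault treatment"},
    {"type": "staff", "staff_id": "OM001", "skill": "junior"},
    {"type": "staff", "staff_id": "OM002", "skill": "junior"},
    {"type": "task", "staff_id": "OM003", "service": "communication line inspection"},
]:
    assert_fact(fact)
```

Only the staff and task facts for `OM001` join, so the agenda ends up with a single activation whose Token holds both facts.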
Preferably, in the fourth step, the specific step of pushing the on-site operation and maintenance information of the power communication network includes:
s4.1: calculating the correlation values of various information in the information base and storing the correlation values into a correlation set F;
S4.2: sorting the elements in the set F in descending order, selecting the top-x elements, and setting the minimum value among them as the initial information push threshold k;
s4.3: obtaining information to be pushed according to the screening result, and pushing the information to be pushed, such as equipment temperature abnormal information, power supply abnormal alarm information, equipment-related operation flow information, operation video guidance or configuration files and the like, to operation and maintenance personnel by a pushing engine;
s4.4: analyzing the operation and maintenance effect, and updating a threshold k according to the accuracy and the recall rate of the operation and maintenance effect representing the push information, wherein the calculation formulas of the accuracy Pr and the recall rate Rr are as follows:
Figure RE-GDA0002429084350000061
Figure RE-GDA0002429084350000062
wherein, the set S represents a standard set, that is, all information to be pushed; the set M represents the information pushed this time.
Figure RE-GDA0002429084350000063
Wherein g (i) represents the correlation value of the ith element in the set F;
s4.5: step S4.4 is repeated and the threshold in step S4.2 is replaced by the new threshold.
The information push threshold value is dynamically changed according to the accuracy and the recall rate, so that push contents are screened, the information push threshold value can adapt to a changing scene, the accuracy and the recall rate of information push can be greatly improved, and the quality and the efficiency of field operation and maintenance of the power communication network are improved.
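The screening in steps S4.1 to S4.4 can be sketched as follows. The formula images are not reproduced in this text, so the sketch assumes the standard definitions Pr = |S ∩ M| / |M| and Rr = |S ∩ M| / |S|, and it does not attempt the threshold update rule, whose formula is unavailable here. The function names and the correlation values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def initial_threshold(correlations: Dict[str, float], top_x: int) -> float:
    """Smallest correlation among the top-x elements (expects a non-empty mapping)."""
    ranked = sorted(correlations.values(), reverse=True)
    return ranked[min(top_x, len(ranked)) - 1]


def select_push(correlations: Dict[str, float], threshold: float) -> List[str]:
    """Information whose correlation is not less than the threshold, highest first."""
    kept = [(info, c) for info, c in correlations.items() if c >= threshold]
    return [info for info, _ in sorted(kept, key=lambda pair: pair[1], reverse=True)]


def precision_recall(pushed: Set[str], standard: Set[str]) -> Tuple[float, float]:
    """Assumed definitions: Pr = |S ∩ M| / |M| and Rr = |S ∩ M| / |S|."""
    hits = len(pushed & standard)
    precision = hits / len(pushed) if pushed else 0.0
    recall = hits / len(standard) if standard else 0.0
    return precision, recall


# Hypothetical correlation values between push information and the current personnel.
correlations = {
    "temperature abnormity": 0.9,
    "operation flow": 0.7,
    "video guidance": 0.6,
    "notice": 0.2,
    "equipment age information": 0.1,
}
k = initial_threshold(correlations, top_x=3)
pushed = select_push(correlations, k)
standard = {"temperature abnormity", "video guidance", "configuration files"}
pr, rr = precision_recall(set(pushed), standard)
```

Here the threshold is 0.6, three items are pushed, and two of them belong to the standard set, so both Pr and Rr equal 2/3.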
Compared with the prior art, the beneficial effects are as follows: the method obtains the correlation value between every piece of push information in the power communication network and the current operation and maintenance personnel, and pushes the information to the current operation and maintenance personnel in order of correlation from high to low, so that personalized operation and maintenance information that better matches the actual conditions of the operation and maintenance site is pushed, thereby improving the management efficiency and level of on-site operation and maintenance of the power communication network.
Drawings
Fig. 1 is a schematic flow chart of a method for pushing operation and maintenance information on a power communication network site based on Apriori and RETE according to the present invention;
FIG. 2 is a line graph of accuracy for data set D of example 1 of the present invention;
FIG. 3 is a recall line plot of dataset D of example 1 of the present invention;
FIG. 4 is a line graph of accuracy of a data set D of example 1 of the present invention in a new scene;
fig. 5 is a plot of recall rates for dataset D in a new scene for example 1 of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent.
The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:
Example 1
Fig. 1 shows an embodiment of a method for pushing operation and maintenance information on site of a power communication network based on Apriori and RETE, which includes the following steps:
Step one: first, acquiring push information, service information and operation and maintenance personnel information from a power communication network, and then cleaning the data to remove records with missing values;
the pre-processed information attributes are shown in table 3:
TABLE 3 data preprocessing table
Figure RE-GDA0002429084350000064
Figure RE-GDA0002429084350000071
Step two: firstly, mining a frequent item set by using an Apriori algorithm, then generating a rule base, wherein the frequent item set mining steps are as follows:
S2.1.1: scanning the database; for the given database, creating a two-dimensional table F_0, the storage direction of F_0 being opposite to the storage direction of the table in the database; in F_0, one column holds the set I of all items or attributes in the database that need to be mined (I = {I_1, I_2, …, I_m}), which are also all the 1-item sets in the association rule mining process; each row of F_0 holds the set of records in which the corresponding item or attribute occurs, that is, all records in which that 1-item set appears;
S2.1.2: counting the number of records in each row of the two-dimensional table F_0 to obtain the support of each item I_i (i = 1, 2, …, m), comparing it with the minimum support, and deleting the rows of the items whose support is less than the minimum support, to obtain the frequent 1-item set table F_1, i.e. the frequent 1-item set L_1, L_1 ⊆ I;
S2.1.3: generating the frequent K-item sets; starting from the frequent 1-item set table, joining and pruning are performed together to generate the frequent K-item sets;
S2.1.4: repeating step S2.1.3 until, after pruning, no frequent item sets satisfy the join condition; execution then ends and all frequent item sets have been obtained.
Based on the practical case of operation and maintenance of the power communication field, the invention constructs a data set D containing 540 RDF triples for describing a series of operation and maintenance related information and situation information.
Setting the minimum support degree to be 0.03, and generating a frequent item set according to the steps, wherein part of the frequent item set is shown in table 4:
Table 4: frequent item sets of data set D
Figure RE-GDA0002429084350000072
Figure RE-GDA0002429084350000081
Then, the generation of the association rule comprises the following steps:
S2.2.1: according to the frequent K-item set F_ki, establishing a frequent item set tree with root node root[{I_a, I_b, …, I_m}, Min_Sup, θ];
Figure RE-GDA0002429084350000082
Wherein, F_ki represents the i-th frequent item set among the frequent K-item sets; I_a, I_b, …, I_m represent different items or attributes; Min_Sup represents the minimum support; θ is the reciprocal of the correlation degree, and since Corr ≥ 1, θ ≤ 1, which serves to remove false rules; Support(L) is the support of a frequent item set L; Min_Conf is the minimum confidence;
S2.2.2: inserting the first-layer child nodes into the frequent item set tree, the first-layer child nodes being the frequent 1-item sets of F_ki, each containing the support of its item set, arranged under the root node in ascending order of support;
S2.2.3: verifying node N_{j,x}, which corresponds to the frequent item set l: if Support(l) ≤ θ, the correlation degree is computed as
Figure RE-GDA0002429084350000083
where Corr(l → F_ki − l) is the correlation between the antecedent set l and the consequent set F_ki − l; Conf(l → F_ki − l) is the confidence of the rule l → F_ki − l; Support(F_ki − l) is the support of the frequent item set F_ki − l;
If Corr(l → F_ki − l) > Min_Corr, the association rule r: l → F_ki − l is derived, where l is the antecedent set of the rule and F_ki − l is the consequent set, with confidence:
Figure RE-GDA0002429084350000084
where Support(F_ki) is the support of the frequent item set F_ki;
and different operations are performed for the different types of N_{j,x};
S2.2.4: Insert the (j+1)-th layer child nodes of node N_{j,x}: applying the join method of the Apriori algorithm, join N_{j,x} with each of its following sibling nodes under the same branch to generate the (j+1)-th layer child nodes of N_{j,x}, add the support Support(l) of the corresponding frequent item set to each child node, insert them into the frequent item set tree as children of N_{j,x}, let N_{j,x} = N_{j+1,1}, and go to step S2.2.3;
S2.2.5: Repeat steps S2.2.1 to S2.2.4 until every item set in the frequent item sets has been processed, which yields the complete association rule set R, i.e., the rule base.
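As a rough illustration of the rule filter described in these steps, the following Python sketch enumerates antecedents directly instead of building the frequent item set tree. Because the formula images are not reproduced in this text, it assumes the usual definitions Conf(l → F − l) = Support(F) / Support(l) and Corr(l → F − l) = Conf(l → F − l) / Support(F − l). The function name `generate_rules` and the sample supports are hypothetical.

```python
from itertools import combinations

def generate_rules(supports, min_conf, min_corr=1.0):
    """supports maps frozenset -> support and contains every subset of each
    frequent item set, as an Apriori-style miner produces."""
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), size):
                l = frozenset(ante)
                consequent = itemset - l
                conf = sup / supports[l]            # assumed confidence formula
                corr = conf / supports[consequent]  # assumed correlation formula
                # Keep only rules that are confident and positively correlated.
                if conf >= min_conf and corr > min_corr:
                    rules.append((l, consequent, conf, corr))
    return rules

# Hypothetical supports for items a, b, c and two frequent 2-item sets.
supports = {
    frozenset("a"): 0.75, frozenset("b"): 0.75, frozenset("c"): 0.5,
    frozenset("ab"): 0.5, frozenset("ac"): 0.5,
}
rules = generate_rules(supports, min_conf=0.6)
```

In this sample the pair {a, b} yields no rule because its correlation falls below 1, which is the kind of false rule the correlation test is meant to remove.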
In this embodiment, part of the association rule set of data set D is shown in Table 5:
Table 5: Association rule set of data set D
Figure RE-GDA0002429084350000085
Figure RE-GDA0002429084350000091
Step three: based on RETE algorithm, the rule matching is carried out on the information of the on-site operation and maintenance personnel and the operation and maintenance task, and the specific steps are as follows:
s3.1: sequencing the original rules;
s3.1.1: establishing a classification set;
s3.1.2: adding the conditions appearing in the rule set into the set F, and calculating the frequency of each condition appearing in the rule set, namely the node sharing degree; the degree of node sharing is the degree to which it is shared in the rule set. Let SxNode sharing degree of node x, then
Figure RE-GDA0002429084350000092
Wherein
Figure RE-GDA0002429084350000093
S3.1.3: Sort the elements of the set F in descending order of occurrence frequency;
S3.1.4: Return the processed rule set.
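A minimal Python sketch of this sorting step follows. It assumes the node sharing degree of a condition is the number of rules in which the condition appears, and that the processed rule set is obtained by reordering each rule's conditions so that the most widely shared ones come first. The function name `sort_rule_conditions` and the sample conditions are hypothetical.

```python
from collections import Counter

def sort_rule_conditions(rules):
    """rules is a list of (conditions, conclusion) pairs."""
    # Sharing degree: count each condition once per rule that uses it.
    share = Counter(c for conds, _ in rules for c in set(conds))
    # Most shared conditions first; ties are broken alphabetically.
    ordered = [(sorted(conds, key=lambda c: (-share[c], c)), concl)
               for conds, concl in rules]
    return ordered, share

# Hypothetical rule set.
rules = [
    (["temp_high", "dev_router"], "push_cooling_guide"),
    (["dev_router", "power_alarm"], "push_power_flow"),
    (["dev_router", "temp_high"], "push_video"),
]
ordered, share = sort_rule_conditions(rules)
```

Putting widely shared conditions first lets more rules reuse the same leading nodes when the network is built.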
S3.2: Create the RETE network;
S3.2.1: Create a root node;
S3.2.2: Take a rule from the processed rule set and extract a pattern from it. Check whether the parameter type of the pattern is a new type; if so, create a new type node and attach a HashMap to it to record its successor nodes, where the key of the HashMap is the literal name and the value is an α node;
S3.2.3: Determine whether an α node corresponding to the pattern already exists. If so, record its position; if not, create an α node for the pattern, add it to the network and to the HashMap of the corresponding type node, and create an α memory table for it;
S3.2.4: Repeat steps S3.2.2 to S3.2.3 until all patterns in the rule have been processed;
S3.2.5: Combine the β nodes: α(1) is the left input node of β(2) and α(2) is the right input node of β(2); for i > 2, β(i−1) is the left input node of β(i) and α(i) is the right input node of β(i). Each β node inner-joins the memory tables of its two parent nodes into its own memory table;
S3.2.6: Repeat step S3.2.5 until all β nodes have been processed;
S3.2.7: Encapsulate the conclusion part of the rule in a leaf node and make it the output node of β(n);
S3.2.8: Repeat steps S3.2.2 to S3.2.7 until all rules have been processed.
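The construction steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a pattern is a (type, literal) pair, the per-type HashMap is a plain dictionary, and join tests inside β nodes are omitted. The class and function names (`AlphaNode`, `BetaNode`, `build_network`) and the sample rules are hypothetical.

```python
class AlphaNode:
    def __init__(self, pattern):
        self.pattern = pattern
        self.memory = []  # alpha memory table

class BetaNode:
    def __init__(self, left, right):
        self.left = left    # alpha(1) or the previous beta node
        self.right = right  # alpha(i)
        self.memory = []    # would hold the joined parent memories

def build_network(rules):
    """rules is a list of (patterns, conclusion) pairs."""
    type_nodes = {}  # below the root: type name -> {literal: AlphaNode}
    terminals = []   # (last node of the rule, conclusion) pairs
    for patterns, conclusion in rules:
        alphas = []
        for ptype, literal in patterns:
            alpha_map = type_nodes.setdefault(ptype, {})  # new type node if needed
            if literal not in alpha_map:                  # reuse an existing alpha node
                alpha_map[literal] = AlphaNode((ptype, literal))
            alphas.append(alpha_map[literal])
        # Chain the beta nodes: beta(2) joins alpha(1) with alpha(2), and so on.
        node = alphas[0]
        for alpha in alphas[1:]:
            node = BetaNode(node, alpha)
        terminals.append((node, conclusion))
    return type_nodes, terminals

# Hypothetical rules that share the pattern ("Device", "temp_high").
rules = [
    ([("Device", "temp_high"), ("Task", "inspect")], "push_cooling_guide"),
    ([("Device", "temp_high"), ("Device", "power_alarm")], "push_power_flow"),
]
type_nodes, terminals = build_network(rules)
```

Both rules end up pointing at the same α node for the shared pattern, which is the node sharing that the earlier sorting step tries to maximize.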
S3.3: Match against the RETE network.
S3.3.1: Add all facts to be processed to the Facts set;
S3.3.2: If the Facts set is not empty, select a fact for processing; otherwise stop the matching process;
S3.3.3: Put the working memory element into the discrimination network and match it starting from the root node. If the type of the working memory element is the same as that of a type node, save the fact in the α memory corresponding to that node, and the working memory element continues to be matched along the network;
S3.3.4: The working memory element continues to be matched along the network. If it is passed to the right input of a β node, add it to the right memory of that β node and match it against the Tokens in the left memory; if the match succeeds, add the working memory element to the Token and pass the Token to the next node; if the match fails, discard it;
If the working memory element is passed to the left input of a β node, encapsulate it as a Token and pass the Token to the next node;
S3.3.5: If a Token is passed to the left input of a β node, add it to the left memory of that β node and match it against the working memory elements in the right memory; if the match succeeds, the Token encapsulates the matched working memory element to form a new Token, which is passed to the next node; if the match fails, discard it;
S3.3.6: If a Token is passed to a terminal node, the rule corresponding to the root node is activated, and a corresponding Activation is created and stored in the Agenda to await firing; if a working memory element is passed to a terminal node, encapsulate it as a Token, activate the rule corresponding to the root node, create the corresponding Activation, and store it in the Agenda.
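The left and right activation logic of the matching steps can be illustrated with a small Python sketch for a single rule. It is a simplification under several assumptions: facts are dictionaries, a pattern tests the `type` and `value` fields, every join tests equality of one shared field, the rule has at least two patterns, and the agenda is a plain list. The names `Join`, `compile_rule`, `run`, and the sample facts are hypothetical.

```python
class Join:
    """A beta node with a left memory of tokens and a right memory of facts."""
    def __init__(self, key):
        self.key = key
        self.left_mem = []
        self.right_mem = []
        self.child = None  # next Join, or None when the terminal node follows
        self.rule = None   # rule name, set on the last Join only

    def left_activate(self, token, agenda):
        self.left_mem.append(token)
        for fact in self.right_mem:
            self._try(token, fact, agenda)

    def right_activate(self, fact, agenda):
        self.right_mem.append(fact)
        for token in self.left_mem:
            self._try(token, fact, agenda)

    def _try(self, token, fact, agenda):
        # Assumed join test: the new fact shares the key field with the token.
        if token[-1][self.key] == fact[self.key]:
            new_token = token + (fact,)
            if self.child is not None:
                self.child.left_activate(new_token, agenda)
            else:
                agenda.append((self.rule, new_token))  # activation awaiting firing

def compile_rule(name, patterns, key):
    # Requires at least two patterns, so that at least one Join exists.
    joins = [Join(key) for _ in patterns[1:]]
    for upper, lower in zip(joins, joins[1:]):
        upper.child = lower
    joins[-1].rule = name
    return patterns, joins

def run(rule, facts):
    patterns, joins = rule
    agenda = []
    for fact in facts:  # process the facts one at a time
        for i, (ftype, value) in enumerate(patterns):
            if fact["type"] == ftype and fact["value"] == value:
                if i == 0:
                    joins[0].left_activate((fact,), agenda)  # wrap as a token
                else:
                    joins[i - 1].right_activate(fact, agenda)
    return agenda

rule = compile_rule("push_cooling_guide",
                    [("temperature", "high"), ("power", "alarm")], key="device")
facts = [
    {"type": "power", "value": "alarm", "device": "R1"},
    {"type": "temperature", "value": "high", "device": "R1"},
    {"type": "temperature", "value": "high", "device": "R2"},
]
agenda = run(rule, facts)
```

Because both memories are retained, the order in which facts arrive does not matter: the power alarm for R1 is stored on the right and is joined later when the matching temperature fact arrives on the left.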
Step four: through setting an information pushing threshold, pushing of the site operation and maintenance information of the power communication network is achieved, and the correlation degree between current site operation and maintenance personnel and the pushed information is updated, and the method specifically comprises the following steps:
s4.1: calculating the correlation values of various information in the information base and storing the correlation values into a correlation set F;
s4.2: sorting the elements in the set F in a descending order, selecting the elements of top-x, and setting the minimum value of the elements as a threshold value k of initial information;
s4.3: obtaining information to be pushed according to the screening result, and pushing the information to be pushed, such as equipment temperature abnormal information, power supply abnormal alarm information, equipment-related operation flow information, operation video guidance or configuration files and the like, to operation and maintenance personnel by a pushing engine;
s4.4: analyzing the operation and maintenance effect, and updating a threshold k according to the accuracy and the recall rate of the operation and maintenance effect representing the push information, wherein the calculation formulas of the accuracy Pr and the recall rate Rr are as follows:
Figure RE-GDA0002429084350000101
Figure RE-GDA0002429084350000102
wherein, the set S represents a standard set, that is, all information to be pushed; the set M represents the information pushed this time.
Figure RE-GDA0002429084350000103
where g(i) denotes the correlation value of the i-th element in the set F;
S4.5: Repeat step S4.4, replacing the threshold in step S4.2 with the new threshold.
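The threshold selection and evaluation in step four can be sketched in Python as follows. Since the formula images are not reproduced in this text, the sketch assumes the usual definitions Pr = |S ∩ M| / |M| and Rr = |S ∩ M| / |S|, and it does not attempt to reproduce the threshold update formula. The function names and the sample correlation values are hypothetical.

```python
def initial_threshold(correlations, x):
    """Threshold k = smallest correlation value among the top-x elements."""
    ranked = sorted(correlations.values(), reverse=True)
    return ranked[:x][-1]

def select_push(correlations, k):
    """Information whose correlation value reaches the threshold k."""
    return {info for info, g in correlations.items() if g >= k}

def precision_recall(standard, pushed):
    """standard is the set S of information that should be pushed,
    pushed is the set M of information actually pushed."""
    hit = len(standard & pushed)
    pr = hit / len(pushed) if pushed else 0.0
    rr = hit / len(standard) if standard else 0.0
    return pr, rr

# Hypothetical correlation set F and standard set S.
correlations = {"temp_alarm": 0.9, "power_alarm": 0.7, "flow_doc": 0.4, "video": 0.2}
k = initial_threshold(correlations, x=2)
pushed = select_push(correlations, k)
pr, rr = precision_recall({"temp_alarm", "flow_doc"}, pushed)
```

Lowering k pushes more items, which tends to raise recall and lower accuracy; this is the trade-off that the dynamic threshold is meant to balance.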
The following examines the effect on accuracy and recall in fixed scenes and in dynamically changing scenes, respectively.
Data set D was tested with fixed push top-5, fixed push top-8, dynamic threshold top-5 (initial threshold g(5)), and dynamic threshold top-8 (initial threshold g(8)). The experimental results are as follows:
As shown in FIG. 2 and FIG. 3, when a fixed amount of information is pushed, the accuracy fluctuates within a certain range and is therefore relatively stable. As the amount of pushed information increases, however, the accuracy gradually decreases and the recall gradually increases, because the amount of useful information pushed keeps growing while the proportion of useful information among everything pushed to the operation and maintenance personnel keeps falling. The push method with a dynamic threshold initially performs similarly to fixed push top-x, but as the method is used longer the push threshold is continuously optimized, the accuracy and recall keep increasing, and a stable state is finally reached in which they fluctuate around a certain value; this result is independent of the initial threshold setting.
As shown in FIG. 4 and FIG. 5, for the method that pushes a fixed amount of information, if the scene changes during pushing, i.e., a new, previously unseen scene appears, the accuracy and recall decrease and then stabilize at the lower level. For the push content decision method with a dynamic threshold, the accuracy and recall also decrease when the scene changes, but the push threshold is adjusted accordingly, so that the accuracy returns to its level before the change and remains stable.
The beneficial effects of this embodiment: the method obtains the correlation values between all pushable information in the power communication network and the current operation and maintenance personnel, and pushes the information to those personnel in descending order of correlation. Personalized operation and maintenance information that better matches the actual conditions of the operation and maintenance site is thus delivered, which improves the efficiency and level of on-site operation and maintenance management of the power communication network.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A method for pushing field operation and maintenance information of a power communication network based on Apriori and RETE is characterized by comprising the following steps:
step one: collecting push information, service information and operation and maintenance personnel information from the power communication network, and preprocessing the data;
step two: mining association rules based on an Apriori algorithm to form a rule base;
step three: carrying out rule matching on the information of the on-site operation and maintenance personnel and the operation and maintenance task based on the RETE algorithm;
step four: by setting an information pushing threshold, the pushing of the on-site operation and maintenance information of the power communication network is realized, and the correlation between the current on-site operation and maintenance personnel and the pushed information is updated.
2. The method for pushing operation and maintenance information on the power communication network site based on Apriori and RETE as claimed in claim 1, wherein in the second step, the mining of association rules first mines frequent item sets, and then generates association rules through the frequent item sets.
3. The method for pushing operation and maintenance information on the power communication network site based on Apriori and RETE as claimed in claim 2, wherein the specific step of mining the frequent item set is:
S2.1.1: Scan the database; create a two-dimensional table F_0 for the given database, where the storage direction of F_0 is opposite to that of the table in the database; in F_0, a column represents the set I of all items or attributes in the database that need to be mined (I = {I_1, I_2, …, I_m}), which is also the set of all 1-item sets in the association rule mining process; each row of F_0 represents the set of records in which an item or attribute occurs, i.e., all records in which the corresponding 1-item set appears;
S2.1.2: Count the number of records in each row of the two-dimensional table F_0 to obtain the support of each item I_i (i = 1, 2, …, m) and compare it with the minimum support. Delete every row whose item has a support below the minimum support, which yields the frequent 1-item set table F_1, i.e., the frequent 1-item set L_1, L_1 ⊆ I;
S2.1.3: Generate the frequent K-item sets: join and prune using the frequent 1-item set table to generate the frequent K-item sets;
S2.1.4: Repeat step S2.1.3 until no frequent item set satisfies the join condition after pruning; execution then ends and all frequent item sets have been obtained.
4. The method for pushing the operation and maintenance information of the electric power communication network site based on Apriori and RETE as claimed in claim 3, wherein the step of generating the association rule comprises the following steps:
S2.2.1: According to the frequent K-item set F_ki, establish a frequent item set tree with root node root[{I_a, I_b, …, I_m}, Min_Sup, θ];
Figure FDA0002370685270000021
where F_ki denotes the i-th frequent item set among the frequent K-item sets; I_a, I_b, …, I_m denote different items or attributes; Min_Sup denotes the minimum support; θ denotes the reciprocal of the correlation degree, and since Corr ≥ 1, θ ≤ 1, which serves to remove false rules; Support(L) denotes the support of a frequent item set L; Min_Conf denotes the minimum confidence;
S2.2.2: Insert the first-layer child nodes into the frequent item set tree. The first-layer child nodes are the frequent 1-item sets of F_ki, each carrying its support, arranged in ascending order of their support in the root node;
S2.2.3: Verify node N_{j,x}, which corresponds to the frequent item set l. If Support(l) ≤ θ, the correlation degree is determined as
Figure FDA0002370685270000022
where Corr(l → F_ki − l) is the correlation between the antecedent set l and the consequent set F_ki − l; Conf(l → F_ki − l) is the confidence of the rule l → F_ki − l; Support(F_ki − l) is the support of the frequent item set F_ki − l;
If Corr(l → F_ki − l) > Min_Corr, the association rule r: l → F_ki − l is derived, where l is the antecedent set of the rule and F_ki − l is the consequent set, with confidence:
Figure FDA0002370685270000023
where Support(F_ki) is the support of the frequent item set F_ki;
and different operations are performed for the different types of N_{j,x};
S2.2.4: Insert the (j+1)-th layer child nodes of node N_{j,x}: applying the join method of the Apriori algorithm, join N_{j,x} with each of its following sibling nodes under the same branch to generate the (j+1)-th layer child nodes of N_{j,x}, add the support Support(l) of the corresponding frequent item set to each child node, insert them into the frequent item set tree as children of N_{j,x}, let N_{j,x} = N_{j+1,1}, and go to step S2.2.3;
S2.2.5: Repeat steps S2.2.1 to S2.2.4 until every item set in the frequent item sets has been processed, which yields the complete association rule set R, i.e., the rule base.
5. The method for pushing the operation and maintenance information of the electric power communication network site based on Apriori and RETE as claimed in claim 4, wherein the different operations performed for the different types of N_{j,x} are:
1) if j = 1 and x = k, N_{j,x} is the last node N_{1,k} of the first layer, and construction of the frequent item set tree is finished;
2) if N_{j,x} contains the frequent item set in N_{1,k}, let N_{j,x} = N_{j,x}.Next, i.e., the next sibling of the parent, and perform step S2.2.3;
3) if j = k−1 and N_{j,x} does not contain the frequent item set in N_{1,k}, let N_{j,x} = N_{j,x}.Next and perform step S2.2.3;
4) if 1 < j < k−1, N_{j,x} is an intermediate node and the subtree under it is no longer constructed; let N_{j,x} = N_{j,x}.Next and perform step S2.2.3.
6. The method for pushing the operation and maintenance information of the electric power communication network site based on Apriori and RETE as claimed in claim 2, wherein in the third step, the specific step of rule matching is as follows:
S3.1: Sort the original rules;
S3.2: Create the RETE network;
S3.3: Match against the RETE network.
7. The method for pushing the operation and maintenance information of the electric power communication network site based on Apriori and RETE as claimed in claim 6, wherein the step of sorting the original rules comprises:
S3.1.1: Establish a classification set;
S3.1.2: Add the conditions appearing in the rule set to the set F and compute how often each condition appears in the rule set, i.e., its node sharing degree. The node sharing degree is the degree to which a node is shared in the rule set. Let S_x be the node sharing degree of node x; then
Figure FDA0002370685270000031
Wherein
Figure FDA0002370685270000032
where Rule_i denotes the i-th rule;
S3.1.3: Sort the elements of the set F in descending order of occurrence frequency;
S3.1.4: Return the processed rule set.
8. The method for pushing the operation and maintenance information of the electric power communication network site based on Apriori and RETE as claimed in claim 7, wherein the step of creating the RETE network comprises the following steps:
S3.2.1: Create a root node;
S3.2.2: Take a rule from the processed rule set and extract a pattern from it. Check whether the parameter type of the pattern is a new type; if so, create a new type node and attach a hash key-value pair to it to record its successor nodes, where the value of the hash key-value pair is an α node;
S3.2.3: Determine whether an α node corresponding to the pattern already exists. If so, record its position; if not, create an α node for the pattern, add it to the network and to the HashMap of the corresponding type node, and create an α memory table for it;
S3.2.4: Repeat steps S3.2.2 to S3.2.3 until all patterns in the rule have been processed;
S3.2.5: Combine the β nodes: α(1) is the left input node of β(2) and α(2) is the right input node of β(2); for i > 2, β(i−1) is the left input node of β(i) and α(i) is the right input node of β(i). Each β node inner-joins the memory tables of its two parent nodes into its own memory table;
S3.2.6: Repeat step S3.2.5 until all β nodes have been processed;
S3.2.7: Encapsulate the conclusion part of the rule in a leaf node and make it the output node of β(n);
S3.2.8: Repeat steps S3.2.2 to S3.2.7 until all rules have been processed.
9. The method for pushing the operation and maintenance information of the electric power communication network site based on Apriori and RETE according to claim 8, wherein the concrete steps of RETE network matching are as follows:
S3.3.1: Add all facts to be processed to the Facts set;
S3.3.2: If the Facts set is not empty, select a fact for processing; otherwise stop the matching process;
S3.3.3: Put the working memory element into the discrimination network and match it starting from the root node. If the type of the working memory element is the same as that of a type node, save the fact in the α memory corresponding to that node, and the working memory element continues to be matched along the network;
S3.3.4: The working memory element continues to be matched along the network. If it is passed to the right input of a β node, add it to the right memory of that β node and match it against the Tokens in the left memory; if the match succeeds, add the working memory element to the Token and pass the Token to the next node; if the match fails, discard it;
If the working memory element is passed to the left input of a β node, encapsulate it as a Token and pass the Token to the next node;
S3.3.5: If a Token is passed to the left input of a β node, add it to the left memory of that β node and match it against the working memory elements in the right memory; if the match succeeds, the Token encapsulates the matched working memory element to form a new Token, which is passed to the next node; if the match fails, discard it;
S3.3.6: If a Token is passed to a terminal node, the rule corresponding to the root node is activated, and a corresponding Activation is created and stored in the Agenda to await firing; if a working memory element is passed to a terminal node, encapsulate it as a Token, activate the rule corresponding to the root node, create the corresponding Activation, and store it in the Agenda.
10. The method for pushing the operation and maintenance information of the power communication network site based on Apriori and RETE as claimed in claim 9, wherein in the fourth step, the specific step of pushing the operation and maintenance information of the power communication network site includes:
S4.1: Calculate the correlation values of the various kinds of information in the information base and store them in the correlation set F;
S4.2: Sort the elements of the set F in descending order, select the top-x elements, and set the minimum of these elements as the initial information threshold k;
S4.3: Obtain the information to be pushed according to the screening result; the push engine pushes it to the operation and maintenance personnel, for example abnormal equipment temperature information, power supply abnormality alarms, equipment-related operation procedures, operation video guidance, or configuration files;
S4.4: Analyze the operation and maintenance effect, and update the threshold k according to the accuracy and recall of the pushed information, which characterize that effect. The accuracy Pr and the recall Rr are calculated as follows:
Figure FDA0002370685270000051
Figure FDA0002370685270000052
wherein, the set S represents a standard set, that is, all information to be pushed; the set M represents the information pushed this time;
Figure FDA0002370685270000053
where g(i) denotes the correlation value of the i-th element in the set F, and k is the new threshold;
S4.5: Repeat step S4.4, replacing the threshold in step S4.2 with the new threshold.
CN202010058050.0A 2020-01-16 2020-01-16 Method for pushing field operation and maintenance information of power communication network based on Apriori and RETE Pending CN111292201A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010058050.0A CN111292201A (en) 2020-01-16 2020-01-16 Method for pushing field operation and maintenance information of power communication network based on Apriori and RETE

Publications (1)

Publication Number Publication Date
CN111292201A true CN111292201A (en) 2020-06-16

Family

ID=71022313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010058050.0A Pending CN111292201A (en) 2020-01-16 2020-01-16 Method for pushing field operation and maintenance information of power communication network based on Apriori and RETE

Country Status (1)

Country Link
CN (1) CN111292201A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301151A (en) * 2014-10-28 2015-01-21 国家电网公司 Movement operation and maintenance system and method of power communication network
CN204145521U (en) * 2014-10-28 2015-02-04 国家电网公司 Power telecom network moves operational system
CN104732322A (en) * 2014-12-12 2015-06-24 国家电网公司 Mobile operation and maintenance method for power communication network machine rooms

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
尹艳红: "基于Apriori算法的增量式关联规则控制研究", 《中国优秀硕士学位论文全文数据库·信息科技辑》 *
张文豪: "面向电力通信网现场运维的推送内容决策方法", 《中国优秀硕士学位论文全文数据库·工程科技II辑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015768A (en) * 2020-08-28 2020-12-01 平安国际智慧城市科技股份有限公司 Information matching method based on Rete algorithm and related products thereof
CN112084761A (en) * 2020-09-02 2020-12-15 董萍 Hydraulic engineering information management method and device
CN112084761B (en) * 2020-09-02 2024-09-06 董萍 Hydraulic engineering information management method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200616
