CN110008253B - Industrial data association rule mining and abnormal working condition prediction method - Google Patents
- Publication number
- CN110008253B (application CN201910244856.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- sequence
- fitting
- association
- line segment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an industrial data association rule mining and abnormal working condition prediction method that can be applied to fault prediction and health management of industrial processes. The invention introduces association rule mining into industrial equipment fault prediction and uses an association rule mining algorithm to find the associations between operating parameters. Based on the characteristics of industrial data, the method starts from the variation trend of the equipment's operating parameters, generates a transaction set with that variation trend as the primary index, mines association rules between the parameters on the basis of the transaction set, and introduces the mined association rules into the prediction of abnormal working conditions of industrial equipment to obtain a more accurate prediction result. The method is of great application value for fault prediction and health management in engineering.
Description
Technical Field
The invention belongs to the technical field of reliability maintenance engineering, and relates to an industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent item set generation strategy.
Background
With the continuous emergence of complex systems and the growing demand for real-time monitoring of industrial processes, modern industrial equipment is often fitted with multiple sensors that monitor its operating state. At the same time, several fault modes may occur during operation, and a single fault may correspond to several symptoms. In this situation the information from a single sensor cannot fully reflect the operating state of the equipment, and fault prediction based on multi-sensor information has emerged in response. Fault prediction based on multi-sensor information aims to analyze the operating state of the equipment using the combined sensor information, so as to make equipment diagnosis and prediction more reliable. With the continuous development of sensing technology, using multiple sensors for condition monitoring, fault diagnosis and prediction of equipment has become a trend.
In the field of fault prediction, work that combines association rule mining with fault prediction is still rare. For time series data, equipment faults or failures are often reflected in parameters or in features extracted from them, and prediction typically targets the variation trend of those parameters or features. Mining the association rules among the parameters yields more complete parameter information, i.e. information on the operating state of the equipment, and provides a basis for subsequent prediction.
Disclosure of Invention
In view of the state of the prior art, the invention addresses the problem that existing data-driven prediction techniques rarely consider the association rules of sensor data. It provides a method for predicting abnormal working conditions of equipment based on association rules between operating parameters, and constructs a more suitable wavelet neural network to perform the abnormal working condition prediction (fault prediction).
The concept of the present invention is explained as follows:
The invention uses association rules to describe the associations between the operating parameters of an industrial process, and studies abnormal working condition prediction based on association rule mining of time series data. To mine association rules at the sequence level for time series data, the invention provides a time series association rule mining algorithm with a two-stage frequent item set generation process. In the first stage, the variation trend information of each time series is extracted as the basic pattern for association rule mining, and the frequent item sets of time series variation patterns are found. In the second stage, on the basis of those frequent item sets, the frequent item sets whose basic pattern is the whole sequence are found, and association rules are mined for every two sequences. Abnormal working conditions are then predicted using the system variables involved in the mined association rules, and the association rules are introduced into a wavelet neural network to improve the prediction accuracy. Because the method takes the association rules between operating parameters into account, it can obtain a more accurate fault prediction result.
According to the invention concept, the invention provides an industrial data association rule mining and predicting method based on a two-stage frequent item set generation strategy, which comprises the following specific steps:
step 1: perform piecewise linear representation and symbolization of the time series data, and construct a discrete data set suitable for association rule mining;
step 2: generate the frequent item sets of the data set with a two-stage frequent item set mining algorithm;
step 3: generate association rules from the frequent item sets, and extract the association rules that satisfy the minimum support and minimum confidence thresholds;
step 4: introduce the association rule mining results into a wavelet neural network and predict the abnormal working conditions of the industrial equipment.
Based on the above scheme, the following implementation manner can be specifically adopted for each step:
preferably, the step 1 comprises the following substeps:
step 1.1: let the measurement time series of sensor $i$ be $X_k^i=\{x_1^i,x_2^i,\dots,x_k^i\}$, $i=1,2,\dots,N$, where $N$ is the number of sensors and $k$ is the length of the time series; the initial fitting start point is $start=1$ and the initial fitting end point is $end=start+h$ with $h=2$; the fitting start point is denoted $x_{start}^i$, the fitting end point $x_{end}^i$, and the fitting error threshold $\omega_E$;
step 1.2: segment each time series by piecewise linear fitting as follows:
1.2.1: initialize the segmentation point counter $count=1$;
1.2.2: fit the current window:
1) first, compute $end=start+h$;
2) fit a straight line to the data from $x_{start}^i$ to $x_{end}^i$ by the least squares method and compute the fitting error ERR;
3) if ERR is not greater than the fitting error threshold $\omega_E$, set $h=h+1$ and return to step 1);
4) if ERR is greater than $\omega_E$, take the line-segment fit obtained so far as the fitted sequence of the current segment, record the segmentation point, set $start=start+h$, reset $h=2$, and set $count=count+1$;
1.2.3: repeat step 1.2.2 until $end>k$, obtaining the piecewise-linear fitted time series and the segmentation point sequence $P_i$ composed of the recorded segmentation points;
step 1.3: denote the fitted time series of any sensor by $Y_k=\{y_1,y_2,\dots,y_k\}$, extract the trend and numerical information of each fitted line segment, and represent a fitted line segment $s_i$ as a triple of its slope, its time span and its growth rate,
where $k_i$ is the slope of the line segment, the second element is the span of the line segment on the time axis, and $r_i$ is the growth rate of the data $\{y_j,y_{j+1},\dots,y_{j+h}\}$ covered by the line segment, $j$ being the starting point of the line segment;
all line segments of the segmented time series $Y_k$ are represented as triples, giving the triple sequence $S_n=\{s_1,s_2,\dots,s_n\}$, where $n$ is the number of line segments into which the time series $X_k$ is divided;
step 1.4: cluster the line segments in the triple sequence and symbolize them, so that the symbols represent the different change patterns of the equipment or system; the similarity $d_{ij}$ of line segments $s_i$ and $s_j$ is measured by a Euclidean distance,
where $\omega_k$ and $\omega_r$ are weights; the smaller $d_{ij}$ is, the more similar the change patterns of the two line segments;
then, according to the similarity index $d_{ij}$, cluster $S_n$ with the K-means clustering algorithm and assign the same symbol to line segments of the same class to represent the change pattern of the operating parameter, obtaining the symbolized sequence $F_n=\{f_1,f_2,\dots,f_n\}$, where $f_1,f_2,\dots,f_n$ are the symbols assigned to the 1st, 2nd, …, $n$-th line segments;
step 1.5: for the measurement time series $X_k^i$ and $X_k^j$ of every two sensors, merge their segmentation point sequences $P_i$ and $P_j$ into $P_{ij}$, where $n_{ij}-1$ is the number of segmentation points after merging; then re-segment the symbolized sequences of $X_k^i$ and $X_k^j$ according to the merged segmentation points to obtain the two reconstructed symbolized sequences.
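The sliding-window segmentation of steps 1.1 and 1.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure itself: the fitting error ERR is taken to be the root-mean-square residual of a least-squares line, indices are 0-based, adjacent segments share their boundary sample, and the names `fit_error` and `segment` are hypothetical.

```python
import numpy as np

def fit_error(y):
    """RMS residual of a least-squares line through y (assumed form of ERR)."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return float(np.sqrt(np.mean((y - (slope * t + intercept)) ** 2)))

def segment(x, err_threshold, h0=2):
    """Grow a window from `start` until the fit error exceeds the threshold,
    then close the segment and restart with the initial window length h0.
    Returns the 0-based boundary indices, including first and last sample."""
    x = np.asarray(x, dtype=float)
    k = len(x)
    points = [0]
    start, h = 0, h0
    while start + h < k:
        end = start + h
        if fit_error(x[start:end + 1]) <= err_threshold:
            h += 1                 # window still fits well: extend it
        else:
            start = end - 1        # last sample that still fitted closes the segment
            points.append(start)
            h = h0                 # reset the window length
    points.append(k - 1)
    return points

print(segment([0, 1, 2, 3, 4, 3, 2, 1, 0], err_threshold=0.1))
```

On the triangular series in the usage line, the window grows along the rising edge, fails once the first falling sample is included, and a boundary is placed at the peak.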
preferably, the step 2 comprises the following substeps:
step 2.1: for the operating parameters $V_i$ and $V_j$ corresponding to the measurement time series $X_k^i$ and $X_k^j$, take the reconstructed symbolized sequences obtained in step 1 and form a transaction set from them, each transaction being formed from the symbols that the two sequences take on the same merged segment; denote the line-segment class symbols contained in the two sequences by $a_k$ and $b_k$ respectively, and denote the minimum support thresholds of the two stages by $minsup_1$ and $minsup_2$;
step 2.2: compute the support of each item in a single scan of the data set to obtain the frequent 1-item sets, following 2.2.1 to 2.2.3:
2.2.1: let $\sigma(\cdot)$ be the support count of an item or item set, initially 0; denote a class symbol by $t_k$, where $t$ stands for $a$ or $b$;
2.2.2: scan the transaction set and accumulate the support count $\sigma(t_k)$ of each class symbol;
2.2.3: for each $t_k$, if its support is not less than the minimum support threshold $minsup_1$, $t_k$ is a frequent 1-item set, so retain $t_k$ and record its support count; otherwise $t_k$ is not a frequent 1-item set;
step 2.3: use the frequent 1-item sets $t_k$ obtained in step 2.2 to form 2-item sets and compute their support to find the frequent 2-item sets, as follows:
2.3.1: let $a_p$ and $b_q$ denote the items retained after step 2.2 from the original line-segment class symbols $a_k$ and $b_k$ respectively;
2.3.2: for each $\{a_p,b_q\}$, perform the following steps:
1) scan the transaction set and compute the support count $\sigma(\{a_p,b_q\})$;
2) if the support of $\{a_p,b_q\}$ is not less than $minsup_1$, $\{a_p,b_q\}$ is a frequent 2-item set, so retain it and record its support count;
step 2.4: use the frequent 2-item sets $\{a_p,b_q\}$ obtained in step 2.3 to compute the support of every two operating parameters over the whole data set and obtain the parameter-level frequent item sets, as follows: for the item set $\{V_i,V_j\}$ formed by every two operating parameters $V_i$ and $V_j$, compute $\sigma(\{V_i,V_j\})=\mathrm{sum}(\sigma(\{a_p,b_q\}))$; if the support of $\{V_i,V_j\}$ is not less than the minimum support threshold $minsup_2$, retain $\{V_i,V_j\}$, record its support, and compute $\sigma(V_i)=\mathrm{sum}(\sigma(a_p))$ and $\sigma(V_j)=\mathrm{sum}(\sigma(b_q))$.
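A compact sketch of the two-stage frequent item set generation in steps 2.2 to 2.4 for one pair of parameters is given below. It assumes that support is the support count divided by the number of transactions, which the text does not spell out, and the function name `mine_pair` and the returned dictionary keys are hypothetical.

```python
from collections import Counter

def mine_pair(transactions, minsup1, minsup2):
    """transactions: list of (a_symbol, b_symbol) pairs for parameters V_i and V_j.
    Returns parameter-level support counts, or None if {V_i, V_j} is not frequent."""
    n = len(transactions)
    # Stage 1, 1-item sets: one scan gives the support count of every symbol.
    count_a = Counter(a for a, _ in transactions)
    count_b = Counter(b for _, b in transactions)
    keep_a = {s: c for s, c in count_a.items() if c / n >= minsup1}
    keep_b = {s: c for s, c in count_b.items() if c / n >= minsup1}
    # Stage 1, 2-item sets: only pairs built from retained symbols are counted.
    count_ab = Counter((a, b) for a, b in transactions
                       if a in keep_a and b in keep_b)
    keep_ab = {p: c for p, c in count_ab.items() if c / n >= minsup1}
    # Stage 2: aggregate symbol-level counts to the parameter level.
    sigma_pair = sum(keep_ab.values())
    if sigma_pair / n < minsup2:
        return None
    return {"sigma_pair": sigma_pair,
            "sigma_i": sum(keep_a.values()),
            "sigma_j": sum(keep_b.values())}

demo = [("up", "up")] * 5 + [("up", "down")] + [("down", "down")] * 4
print(mine_pair(demo, minsup1=0.2, minsup2=0.2))
```

In the demo, the pair ("up", "down") occurs only once in ten transactions, so it is dropped in the first stage and does not contribute to the parameter-level count.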
Preferably, the step 3 comprises the following substeps:
step 3.1: for each item set $\{V_i,V_j\}$ obtained in step 2 that satisfies the support threshold, generate the association rules $V_j\to V_i$ and $V_i\to V_j$, and denote the minimum confidence threshold by $minconf$;
step 3.2: compute the confidence of each generated association rule and extract the rules as follows: for each association rule $V_i\to V_j$, compute $conf(V_i\to V_j)=\sigma(\{V_i,V_j\})/\sigma(V_i)$; if $conf(V_i\to V_j)$ is not less than the minimum confidence threshold $minconf$, retain the association rule $V_i\to V_j$ and record its support and its confidence $\omega_i$.
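Rule extraction in step 3 can be illustrated with the standard definition of confidence, the support count of the item set divided by the support count of the antecedent. The sketch below assumes that definition applies here; the function name `extract_rules` and the sample counts are hypothetical.

```python
def extract_rules(vi, vj, sigma_pair, sigma_i, sigma_j, minconf):
    """Generate V_i -> V_j and V_j -> V_i and keep those meeting minconf.
    Returns a list of (antecedent, consequent, confidence) tuples."""
    rules = []
    for ante, cons, sigma_ante in ((vi, vj, sigma_i), (vj, vi, sigma_j)):
        conf = sigma_pair / sigma_ante   # conf(A -> B) = sigma({A, B}) / sigma(A)
        if conf >= minconf:
            rules.append((ante, cons, conf))
    return rules

# Hypothetical counts: only the rule with the rarer antecedent passes.
print(extract_rules("V13", "V7", sigma_pair=9, sigma_i=10, sigma_j=12, minconf=0.8))
```

Because the two directions share the same numerator, the rule whose antecedent has the smaller support count always has the higher confidence.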
Preferably, the step 4 comprises the following substeps:
step 4.1: denote any group of associated parameters extracted from the association rules by $\{V_1,V_2,\dots,V_u\}$, where $u$ is the number of associated parameters and $V_u$ is the rule consequent, i.e. the target parameter; each association rule $V_i\to V_u$, $i=1,2,\dots,u-1$, has a confidence, denoted $\omega_i$; abnormal working conditions are predicted for the target parameter $V_u$ with a wavelet neural network;
step 4.2: construct the training samples: denote the preset prediction step by $l$ and let $\{V_1,V_2,\dots,V_u\}$ be a group of associated parameters extracted by association rule mining; from the complete training data set formed by these parameters, construct the matrix $I_{train}$ as the training input of the neural network,
where each column of $I_{train}$ is one training input sample, together with the corresponding training output $O_{train}$;
step 4.3: train the wavelet neural network with the constructed training samples: the input parameters are $V_i$, $i=1,2,\dots,u-1$, and the output parameter is $V_u$; at network initialization, the confidences $\omega_i$, $i=1,2,\dots,u-1$, derived from the association rules are used to set the initial weights between the input layer and the hidden layer;
step 4.4: predict on new data: denote the preset threshold for the occurrence of an abnormal working condition by $\omega_p$; for newly acquired sensor measurement data, use the model trained in step 4.3 to make an $l$-step prediction, and if the drift of the predicted target parameter value from its initial normal value exceeds the threshold $\omega_p$, judge that an abnormal working condition occurs.
Preferably, while the equipment has not yet failed, the model is rebuilt and retrained each time a predetermined number of new measurement data have been collected, so as to obtain a more accurate prediction result.
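The decision rule of step 4.4 can be sketched as a simple threshold test on the predicted values. The sketch assumes that the drift is measured as a relative deviation from a known normal value; the function name `first_abnormal_index` and the sample numbers are hypothetical.

```python
def first_abnormal_index(predictions, normal_value, threshold):
    """Index of the first predicted value whose relative drift from the
    normal value exceeds the threshold, or None if none does."""
    for idx, value in enumerate(predictions):
        drift = abs(value - normal_value) / abs(normal_value)
        if drift > threshold:
            return idx
    return None

# Hypothetical predictions drifting away from a normal value of 100.
print(first_abnormal_index([100.0, 100.5, 101.0, 101.6, 102.0], 100.0, 0.015))
```

In a monitoring loop this test would be applied to each new batch of predictions, with the model retrained between batches as described above.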
The industrial data association rule mining and prediction method based on the two-stage frequent item set generation strategy can be used for complex industrial systems monitored by sensors. Mining the association rules of the operating parameters of industrial equipment yields the associations between those parameters, and introducing these associations into wavelet neural network prediction gives a more accurate prediction. The method supports subsequent equipment maintenance planning, benefits the maintenance management of equipment with strict reliability requirements, and has broad prospects in practical engineering applications.
Drawings
FIG. 1 shows the predicted result of variable 7 of IDV (13) in the example and the comparison with the actual value;
FIG. 2 shows the predicted result of the variable 11 of IDV (13) in the example and the comparison with the actual value;
FIG. 3 shows the predicted error rate of IDV (13) variable 7 in the example;
FIG. 4 shows the predicted error rate of the IDV (13) variable 11 in the example.
Detailed Description
The embodiments of the present invention will now be further described with reference to the accompanying drawings.
The following example uses Tennessee Eastman (TE) process simulation data to illustrate the specific operating steps and to verify the effectiveness of the method.
The data set is sampled at 3-minute intervals and records the measurement of every variable at each sampling instant. For each operating condition (the normal operating state and the fault states under 21 preset faults), the simulation produces two data sets, a training set and a test set. Each training set contains the measurements of all 52 variables from a 25-hour simulation run; except for the training set collected in the normal operating state, the other 21 training sets have the fault introduced after 1 hour of simulation and record only the following 24 hours. The training set for the normal operating state therefore has 500 observation samples, and each of the 21 fault training sets has 480. Each of the 22 test sets contains the variable measurements from a 48-hour simulation run, i.e. 960 samples. In the test runs for the 21 process faults, the fault is introduced after 8 hours of simulation, so in each fault test set the first 160 observation samples are normal data and the last 800 are fault data. In the TE process simulation model, only IDV(13) is a slowly varying fault, so this example uses the IDV(13) data for the experiments. The specific process of the industrial data association rule mining and abnormal working condition prediction method is as follows:
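The sample counts quoted above follow directly from the 3-minute sampling interval. The short check below reproduces them; the helper name `samples` is hypothetical.

```python
SAMPLE_INTERVAL_MIN = 3  # sampling interval stated in the text

def samples(hours):
    """Number of samples recorded over the given number of simulated hours."""
    return hours * 60 // SAMPLE_INTERVAL_MIN

print(samples(25))                # normal-state training set
print(samples(24))                # fault training sets (first hour discarded)
print(samples(48))                # every test set
print(samples(8))                 # normal samples before the fault in a test set
print(samples(48) - samples(8))   # fault samples in a test set
```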
step 1: perform piecewise linear representation and symbolization of the time series data, and construct a discrete data set suitable for association rule mining. This specifically comprises the following substeps:
step 1.1: let the measurement time series of sensor $i$ be $X_k^i=\{x_1^i,x_2^i,\dots,x_k^i\}$, $i=1,2,\dots,N$, where $N$ is the number of sensors and $k$ is the length of the time series; the initial fitting start point is $start=1$ and the initial fitting end point is $end=start+h$ with $h=2$; the fitting start point is denoted $x_{start}^i$, the fitting end point $x_{end}^i$, and the fitting error threshold $\omega_E$. Note that in the present invention $i$ and $j$ denote sensor numbers when used as superscripts; when used as subscripts they are merely ordinal indices and are unrelated to the sensor numbers.
step 1.2: segment each time series by piecewise linear fitting as follows:
1.2.1: initialize the segmentation point counter $count=1$;
1.2.2: fit the current window:
1) first, compute $end=start+h$;
2) fit a straight line to the data from $x_{start}^i$ to $x_{end}^i$ by the least squares method and compute the fitting error ERR;
3) if ERR is not greater than the fitting error threshold $\omega_E$, set $h=h+1$ and return to step 1);
4) if ERR is greater than $\omega_E$, take the line-segment fit obtained so far as the fitted sequence of the current segment, record the segmentation point, set $start=start+h$, reset $h=2$, and set $count=count+1$;
1.2.3: repeat step 1.2.2 until $end>k$, obtaining the piecewise-linear time series produced by the least squares fitting and the segmentation point sequence $P_i$ composed of the recorded segmentation points;
step 1.3: denote the fitted time series of any sensor by $Y_k=\{y_1,y_2,\dots,y_k\}$; it consists of the line segments fitted by the least squares method described above. Extract the trend and numerical information of each fitted line segment, and represent a fitted line segment $s_i$ as a triple of its slope, its time span and its growth rate,
where $k_i$ is the slope of the line segment, the second element is the span of the line segment on the time axis, and $r_i$ is the growth rate of the data $\{y_j,y_{j+1},\dots,y_{j+h}\}$ covered by the line segment, $j$ being the starting point of the line segment;
all line segments of the segmented time series $Y_k$ are represented as triples, giving the triple sequence $S_n=\{s_1,s_2,\dots,s_n\}$, where $n$ is the number of line segments into which the time series $X_k$ is divided;
step 1.4: cluster the line segments in the triple sequence and symbolize them to represent the different change patterns of the equipment or system, in preparation for the subsequent association rule mining. The similarity $d_{ij}$ of line segments $s_i$ and $s_j$ is measured by a Euclidean distance,
where $\omega_k$ and $\omega_r$ are weights; the smaller $d_{ij}$ is, the more similar the change patterns of the two line segments;
then, according to the similarity index $d_{ij}$, cluster $S_n$ with the K-means clustering algorithm and assign the same symbol to line segments of the same class to represent the change pattern of the operating parameter, obtaining the symbolized sequence $F_n=\{f_1,f_2,\dots,f_n\}$, where $f_1,f_2,\dots,f_n$ are the symbols assigned to the 1st, 2nd, …, $n$-th line segments;
step 1.5: for the measurement time series $X_k^i$ and $X_k^j$ of every two sensors, merge their segmentation point sequences $P_i$ and $P_j$ into $P_{ij}$, where $n_{ij}-1$ is the number of segmentation points after merging; then re-segment the symbolized sequences of $X_k^i$ and $X_k^j$ according to the merged segmentation points to obtain the two reconstructed symbolized sequences.
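The symbolization in step 1.4 can be sketched with a small K-means loop. Several points are assumptions made for illustration only: the distance is read as a weighted Euclidean distance over slope and growth rate, $d_{ij}=\sqrt{\omega_k(k_i-k_j)^2+\omega_r(r_i-r_j)^2}$, because only the weights $\omega_k$ and $\omega_r$ are named in the text; the cluster centers are initialized naively from the first segments; and the function name `symbolize` is hypothetical.

```python
import numpy as np

def symbolize(triples, n_clusters, w_k=1.0, w_r=1.0, n_iter=50):
    """triples: rows of (slope, span, growth rate), one per fitted segment.
    Returns one letter per segment, equal letters meaning the same cluster."""
    feats = np.asarray(triples, dtype=float)
    # Scaling by sqrt(weight) turns plain Euclidean distance into the weighted one.
    z = feats[:, [0, 2]] * np.sqrt([w_k, w_r])
    centers = z[:n_clusters].copy()          # naive deterministic initialization
    for _ in range(n_iter):
        dist = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)         # assign each segment to nearest center
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = z[labels == c].mean(axis=0)
    return [chr(ord("A") + int(label)) for label in labels]

segments = [(1.0, 4, 0.50), (-1.0, 4, -0.50), (1.1, 3, 0.55), (-0.9, 5, -0.45)]
print(symbolize(segments, n_clusters=2))
```

Rising and falling segments end up in different clusters here, so the symbol sequence alternates between two letters.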
step 2: generate the frequent item sets of the data set with a two-stage frequent item set mining algorithm. This specifically comprises the following substeps:
step 2.1: for the operating parameters $V_i$ and $V_j$ corresponding to the measurement time series $X_k^i$ and $X_k^j$, take the reconstructed symbolized sequences obtained in step 1 and form a transaction set from them, each transaction being formed from the symbols that the two sequences take on the same merged segment; denote the line-segment class symbols contained in the two sequences by $a_k$ and $b_k$ respectively, and denote the minimum support thresholds of the two stages by $minsup_1$ and $minsup_2$. In this example, the minimum support thresholds are set to $minsup_1=0.2$ and $minsup_2=0.2$.
step 2.2: compute the support of each item in a single scan of the data set to obtain the frequent 1-item sets, following 2.2.1 to 2.2.3:
2.2.1: let $\sigma(\cdot)$ be the support count of an item or item set, initially 0; denote a class symbol by $t_k$, where $t$ stands for $a$ or $b$;
2.2.2: scan the transaction set and accumulate the support count $\sigma(t_k)$ of each class symbol;
2.2.3: for each $t_k$, if its support is not less than the minimum support threshold $minsup_1$, $t_k$ is a frequent 1-item set, so retain $t_k$ and record its support count; otherwise $t_k$ is not a frequent 1-item set;
step 2.3: use the frequent 1-item sets $t_k$ obtained in step 2.2 to form 2-item sets and compute their support to find the frequent 2-item sets, as follows:
2.3.1: let $a_p$ and $b_q$ denote the items retained after step 2.2 from the original line-segment class symbols $a_k$ and $b_k$ respectively;
2.3.2: for each $\{a_p,b_q\}$, perform the following steps:
1) scan the transaction set and compute the support count $\sigma(\{a_p,b_q\})$;
2) if the support of $\{a_p,b_q\}$ is not less than $minsup_1$, $\{a_p,b_q\}$ is a frequent 2-item set, so retain it and record its support count;
step 2.4: use the frequent 2-item sets $\{a_p,b_q\}$ obtained in step 2.3 to compute the support of every two operating parameters over the whole data set and obtain the parameter-level frequent item sets, as follows: for the item set $\{V_i,V_j\}$ formed by every two operating parameters $V_i$ and $V_j$, compute $\sigma(\{V_i,V_j\})=\mathrm{sum}(\sigma(\{a_p,b_q\}))$; if the support of $\{V_i,V_j\}$ is not less than the minimum support threshold $minsup_2$, retain $\{V_i,V_j\}$, record its support, and compute $\sigma(V_i)=\mathrm{sum}(\sigma(a_p))$ and $\sigma(V_j)=\mathrm{sum}(\sigma(b_q))$.
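The transaction set used in step 2 is built from two symbol sequences that have been re-segmented on a common set of segmentation points (step 1.5). One possible reading of that alignment is sketched below; it assumes both series cover the same sample range and that boundaries are given as sorted indices including the first and last sample. The function name `merge_and_align` is hypothetical.

```python
def merge_and_align(points_i, symbols_i, points_j, symbols_j):
    """points_*: sorted boundary indices; symbols_*: one symbol per segment.
    Returns the merged boundaries and one (symbol_i, symbol_j) transaction
    per merged segment."""
    merged = sorted(set(points_i) | set(points_j))

    def symbol_at(points, symbols, left):
        # Find the original segment that contains the merged segment starting at `left`.
        for seg, (a, b) in enumerate(zip(points[:-1], points[1:])):
            if a <= left < b:
                return symbols[seg]
        raise ValueError("boundary outside the series")

    transactions = [(symbol_at(points_i, symbols_i, left),
                     symbol_at(points_j, symbols_j, left))
                    for left in merged[:-1]]
    return merged, transactions

merged, transactions = merge_and_align([0, 4, 8], ["A", "B"], [0, 2, 8], ["C", "D"])
print(merged)
print(transactions)
```

Each merged segment inherits the symbol of the original segment it lies in, so both sequences end up with the same number of symbols and can be paired one to one.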
step 3: generate association rules from the frequent item sets and extract the association rules that satisfy the minimum support and minimum confidence thresholds. This specifically comprises the following substeps:
step 3.1: for each item set $\{V_i,V_j\}$ obtained in step 2 that satisfies the support threshold, generate the association rules $V_j\to V_i$ and $V_i\to V_j$, and denote the minimum confidence threshold by $minconf$; in this example, the minimum confidence threshold is set to $minconf=0.7$;
step 3.2: compute the confidence of each generated association rule and extract the rules as follows: for each association rule $V_i\to V_j$, compute $conf(V_i\to V_j)=\sigma(\{V_i,V_j\})/\sigma(V_i)$; if $conf(V_i\to V_j)$ is not less than the minimum confidence threshold $minconf$, retain the association rule $V_i\to V_j$ and record its support and its confidence $\omega_i$.
This step generates the association rules that satisfy the threshold conditions; some of the extracted associated parameters and their confidence values are shown in Table 1. Based on the results in Table 1, this example uses variable 7 and variable 11 as the target parameters for prediction.
step 4: introduce the association rule mining results into a wavelet neural network and predict the abnormal working conditions of the industrial equipment. This specifically comprises the following substeps:
step 4.1: denote any group of associated parameters extracted from the association rules by $\{V_1,V_2,\dots,V_u\}$, where $u$ is the number of associated parameters and $V_u$ is the rule consequent, i.e. the target parameter; each association rule $V_i\to V_u$, $i=1,2,\dots,u-1$, has a confidence, denoted $\omega_i$; abnormal working conditions are predicted for the target parameter $V_u$ with a wavelet neural network;
step 4.2: construct the training samples: denote the preset prediction step by $l$, which is set to 10 in this example. Let $\{V_1,V_2,\dots,V_u\}$ be a group of associated parameters extracted by association rule mining; from the complete training data set formed by these parameters, construct the matrix $I_{train}$ as the training input of the neural network,
where each column of $I_{train}$ is one training input sample, together with the corresponding training output $O_{train}$;
in particular, the training set here uses not only the fault data of the IDV(13)-related variables but also the data of those variables under normal operating conditions.
step 4.3: train the wavelet neural network with the constructed training samples: the input parameters are $V_i$, $i=1,2,\dots,u-1$, and the output parameter is $V_u$; at network initialization, the confidences $\omega_i$, $i=1,2,\dots,u-1$, derived from the association rules are used to set the initial weights between the input layer and the hidden layer. In this example, for variable 7 the input layer has 4 nodes and the hidden layer 8 nodes; for variable 11 the input layer has 3 nodes and the hidden layer 6 nodes; the output layer has 1 node for both variables; the wavelet basis functions are all Morlet mother wavelets, and the corresponding confidence values in Table 1 are used as the initial weights between the input layer and the hidden layer of the neural network;
step 4.4: predict on new data: denote the preset threshold for the occurrence of an abnormal working condition by $\omega_p$; for newly acquired sensor measurement data, use the model trained in step 4.3 to make an $l$-step prediction, and if the drift of the predicted target parameter value from its initial normal value exceeds the threshold $\omega_p$, judge that an abnormal working condition occurs. While the equipment has not yet failed, the model is rebuilt and retrained each time a predetermined number $N_l$ of new measurement data have been collected, so as to obtain more accurate prediction results, where $N_l$ depends on the sensor sampling frequency and on actual industrial field requirements. This example uses the first 300 data points of the test set (960 sample points in total) to verify the prediction performance and updates the neural network every 10 data points. The threshold for the occurrence of an abnormal working condition (fault), i.e. the fraction by which a parameter deviates from its normal value, is set to $\omega_p=0.015$.
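How rule confidences might seed a wavelet network is sketched below as an untrained forward pass. The text does not specify the network equations, so everything here is an assumption for illustration: the real-valued Morlet form $\cos(1.75x)e^{-x^2/2}$ common in wavelet-network literature, a single hidden layer with shift and scale parameters, every hidden node starting from the same confidence vector, and three inputs chosen purely for the demo. The names `morlet`, `init_wnn` and `forward` are hypothetical.

```python
import numpy as np

def morlet(x):
    """Real-valued Morlet mother wavelet (assumed form)."""
    return np.cos(1.75 * x) * np.exp(-x ** 2 / 2)

def init_wnn(confidences, n_hidden, seed=0):
    """Input-to-hidden weights start from the rule confidences;
    the remaining parameters get a generic random initialization."""
    rng = np.random.default_rng(seed)
    conf = np.asarray(confidences, dtype=float)
    return {
        "w_in": np.tile(conf, (n_hidden, 1)),   # one copy of the confidences per hidden node
        "shift": rng.normal(size=n_hidden),     # wavelet translation parameters
        "scale": np.ones(n_hidden),             # wavelet dilation parameters
        "w_out": rng.normal(scale=0.1, size=n_hidden),
    }

def forward(params, x):
    """Single-output forward pass: wavelet hidden layer, linear output."""
    z = (params["w_in"] @ x - params["shift"]) / params["scale"]
    return float(params["w_out"] @ morlet(z))

params = init_wnn([0.7527, 0.7446, 0.7017], n_hidden=8)
print(forward(params, np.array([0.1, -0.2, 0.05])))
```

Training would then adjust all four parameter groups, for example by gradient descent; the confidences only fix where the input weights start.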
Table 1 Association rules
Rule antecedent | Rule consequent | Confidence |
Variable 13 | Variable 7 | 0.7527 |
Variable 16 | Variable 7 | 0.7446 |
Variable 36 | Variable 7 | 0.7017 |
Variable 35 | Variable 11 | 0.7513 |
Variable 36 | Variable 11 | 0.7390 |
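Grouping the rules of Table 1 by consequent shows which antecedents feed each target parameter and which confidences are available as initial weights. The sketch below only rearranges the values from the table; the function name `group_by_consequent` is hypothetical.

```python
from collections import defaultdict

# (antecedent variable, consequent variable, confidence) taken from Table 1
RULES = [(13, 7, 0.7527), (16, 7, 0.7446), (36, 7, 0.7017),
         (35, 11, 0.7513), (36, 11, 0.7390)]

def group_by_consequent(rules):
    """Map each target parameter to its (antecedent, confidence) pairs."""
    groups = defaultdict(list)
    for antecedent, consequent, confidence in rules:
        groups[consequent].append((antecedent, confidence))
    return dict(groups)

for target, inputs in group_by_consequent(RULES).items():
    print(target, inputs)
```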
Table 2 Total prediction error rate
Variable | With association rules | Without association rules |
Variable 7 | 1.0482 | 1.8548 |
Variable 11 | 0.8536 | 1.2135 |
Fig. 1 and Fig. 2 show the prediction results for variable 7 and variable 11. To verify the advantage of introducing the association rules, the results are compared with the neural network predictions obtained without the association rules. In Fig. 1 and Fig. 2, the vertical solid line marks the actual occurrence time of the abnormal working condition under the chosen threshold, and the two vertical dotted lines mark the predicted occurrence times with and without the association rules, respectively. As Fig. 1 and Fig. 2 show, the predictions of the proposed method track the true values closely, especially in the first half of the test data: that half consists of operating data in the normal state, for which the training set is relatively complete and the values are relatively concentrated. The proposed method also predicts the failure time well: in Fig. 1 the predicted value lags the true value by 8 sampling points, and in Fig. 2 by 5 sampling points. Compared with prediction without the association rules, the proposed method gives a clearly more accurate result. The prediction error rates for variables 7 and 11 are shown in Fig. 3 and Fig. 4. To quantify the results further, the total prediction error rate was calculated and is given in Table 2. In terms of total prediction error, introducing the association rules markedly reduces the prediction error of the neural network, as the data in Table 2 show.
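The size of the improvement reported in Table 2 can be expressed as a relative reduction in the total prediction error rate. The short calculation below uses only the four values from the table; the helper name `relative_reduction` is hypothetical.

```python
# variable: (error rate with association rules, error rate without), from Table 2
ERROR_RATE = {7: (1.0482, 1.8548), 11: (0.8536, 1.2135)}

def relative_reduction(with_rules, without_rules):
    """Fraction by which the error rate drops when the rules are introduced."""
    return (without_rules - with_rules) / without_rules

for variable, (with_rules, without_rules) in ERROR_RATE.items():
    print(f"variable {variable}: {relative_reduction(with_rules, without_rules):.1%}")
```

This works out to a reduction of roughly 43% for variable 7 and roughly 30% for variable 11.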
Claims (5)
1. A method for mining industrial data association rules and predicting abnormal working conditions is characterized by comprising the following specific steps:
step 1: performing piecewise linearization representation and symbolization on time series data, and constructing a discrete data set suitable for association rule mining;
step 2: generating a frequent item set of the data set by adopting a two-stage frequent item set mining algorithm;
step 3: generating association rules according to the frequent item sets, and extracting the association rules meeting the minimum support degree and the minimum confidence degree threshold;
step 4: introducing the association rule mining result into a wavelet neural network and predicting the abnormal working condition of the industrial equipment;
the step 1 comprises the following substeps:
step 1.1: the measurement time series of the sensors are recorded, where $N$ is the number of sensors and $k$ is the time series length; an initial fitting start point and an initial fitting end point are set, with $h = 2$; the fitting start point is recorded as start and the fitting end point as end; the fitting error threshold is $\omega_E$;
1.2.1: initialize the segmentation point counter, count = 1;
1) first, compute end = start + h;
3) if the fitting error ERR is not greater than the fitting error threshold $\omega_E$, set h = h + 1 and return to step 1);
4) if the fitting error ERR is greater than the fitting error threshold $\omega_E$, obtain the fitted line segment for the current data, set start = start + h, record the segmentation point, reset h = 2, and set count = count + 1;
1.2.3: repeat step 1.2.2 until end is greater than $k$, obtaining the fitted piecewise-linear time series and the sequence of segmentation points $P_i$ formed by the recorded segmentation points;
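The growing-window segmentation of steps 1.2.1 to 1.2.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the extracted text does not preserve how the fitting error ERR is computed, so the sketch assumes each segment is the straight line through its two end samples and that ERR is the maximum absolute deviation from that line. The function names `segment_error` and `piecewise_linear_segments` are hypothetical, and indices are 0-based.

```python
def segment_error(x, start, end):
    """Assumed ERR: max absolute deviation of x[start..end] from the chord
    joining the two end samples (the source does not define ERR)."""
    slope = (x[end] - x[start]) / (end - start)
    return max(abs(x[start] + slope * (t - start) - x[t])
               for t in range(start, end + 1))


def piecewise_linear_segments(x, err_threshold, h0=2):
    """Return 0-based segmentation indices, growing each segment until
    its fitting error exceeds err_threshold."""
    k = len(x)
    split_points = [0]
    start, h = 0, h0
    while start + h <= k - 1:
        end = start + h
        if segment_error(x, start, end) <= err_threshold:
            h += 1                      # segment still fits: extend by one sample
        else:
            split_points.append(end)    # record the segmentation point
            start, h = end, h0          # start = start + h, reset h
    if split_points[-1] != k - 1:
        split_points.append(k - 1)      # close the final segment at the last sample
    return split_points


series = [0, 1, 2, 3, 4, 3, 2, 1, 0]
print(piecewise_linear_segments(series, err_threshold=0.5))
```

Because the sketch follows the rule start = start + h literally, a split is recorded at the first end point whose error exceeds the threshold, so for the triangular series above the split lands one sample after the peak.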
step 1.3: for any sensor, the fitted time series is denoted $Y_k=\{y_1,y_2,\dots,y_k\}$; the trend and value information of each fitted line segment is extracted, and each fitted line segment $s_i$ is represented as a triple,
where $k_i$ denotes the slope of the line segment, the span of the line segment on the time axis is also recorded, and $r_i$ denotes the growth rate of the data $\{y_j,y_{j+1},\dots,y_{j+h}\}$ corresponding to the line segment, with $j$ the starting point of the line segment;
all line segments of the segmented time series $Y_k$ are represented as triples, giving the triple sequence $S_n=\{s_1,s_2,\dots,s_n\}$, where $n$ is the number of line segments obtained by segmenting the time series $X_k$;
step 1.4: the line segments in the triple sequence are clustered and symbolized, so that the symbols represent the different change patterns of the equipment or system; the similarity between line segments $s_i$ and $s_j$ is described by a Euclidean distance $d_{ij}$,
where a smaller $d_{ij}$ means that the change patterns of the two line segments are more similar, and $\omega_k$ and $\omega_r$ are weights;
then, using the similarity index $d_{ij}$, $S_n$ is clustered with the K-means clustering algorithm, and line segments in the same cluster are assigned the same symbol to represent the change pattern of the operating parameter, giving the symbolized sequence $F_n=\{f_1,f_2,\dots,f_n\}$, where $f_1,f_2,\dots,f_n$ are the symbols assigned to the 1st, 2nd, …, $n$-th line segments;
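A rough sketch of the triple representation and the clustering-based symbolization is given below. Several details are assumptions, because the corresponding formulas did not survive extraction: the growth rate is taken as the relative change over the segment, and the distance is taken as a weighted Euclidean distance over slope and growth rate only, using the two weights named in the text. The K-means loop is a plain textbook version with a simple deterministic initialization, and all function names are hypothetical.

```python
import math


def to_triples(y, split_points):
    """Describe each fitted segment as (slope, span, growth_rate)."""
    triples = []
    for a, b in zip(split_points, split_points[1:]):
        span = b - a
        slope = (y[b] - y[a]) / span
        # Assumed growth rate: relative change over the segment (0.0 if y[a] == 0).
        growth = (y[b] - y[a]) / abs(y[a]) if y[a] != 0 else 0.0
        triples.append((slope, span, growth))
    return triples


def distance(s_i, s_j, w_k=0.5, w_r=0.5):
    """Assumed weighted Euclidean distance over slope and growth rate."""
    return math.sqrt(w_k * (s_i[0] - s_j[0]) ** 2 + w_r * (s_i[2] - s_j[2]) ** 2)


def kmeans_symbols(triples, n_clusters, w_k=0.5, w_r=0.5, n_iter=50):
    """Plain K-means; returns one cluster label per segment."""
    centers = list(triples[:n_clusters])   # first segments serve as initial centres
    labels = [0] * len(triples)
    for _ in range(n_iter):
        labels = [min(range(n_clusters),
                      key=lambda c: distance(s, centers[c], w_k, w_r))
                  for s in triples]
        new_centers = []
        for c in range(n_clusters):
            members = [s for s, lab in zip(triples, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep the centre of an empty cluster
        if new_centers == centers:
            break                                # assignments have stabilised
        centers = new_centers
    return labels


y = [1, 2, 3, 2, 1, 2, 3]
triples = to_triples(y, [0, 2, 4, 6])
labels = kmeans_symbols(triples, n_clusters=2)
symbols = [f"a{lab + 1}" for lab in labels]   # one symbol per segment
print(symbols)
```

In this toy series the two rising segments fall into one cluster and the falling segment into the other, so rising and falling behaviour receive different symbols.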
step 1.5: for every two sensor measurement time series, their segmentation point sequences $P_i$ and $P_j$ are merged into $P_{ij}$, where $n_{ij}-1$ is the number of segmentation points after merging; the two symbolized sequences are then re-segmented according to the merged segmentation points, giving the reconstructed symbolized sequences.
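The merging and re-segmentation of step 1.5 can be illustrated with the short sketch below. It assumes that segmentation points are sorted sample indices that include both end points of the series, and that each original symbol is simply repeated over every merged sub-interval it covers. The names `merge_split_points` and `resegment` are hypothetical.

```python
from bisect import bisect_right


def merge_split_points(p_i, p_j):
    """Union of two sorted segmentation point sequences."""
    return sorted(set(p_i) | set(p_j))


def resegment(symbols, own_points, merged_points):
    """Repeat each original symbol over every merged sub-interval it covers."""
    out = []
    for left in merged_points[:-1]:
        # Index of the original segment that contains the sub-interval starting at `left`.
        seg = bisect_right(own_points, left) - 1
        out.append(symbols[seg])
    return out


p_i, f_i = [0, 4, 8], ["a1", "a2"]
p_j, f_j = [0, 2, 8], ["b1", "b2"]
p_ij = merge_split_points(p_i, p_j)
print(p_ij)
print(resegment(f_i, p_i, p_ij))
print(resegment(f_j, p_j, p_ij))
```

After this step both symbol sequences have the same length, so the symbols at the same position can be paired up for mining.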
2. The method for mining industrial data association rules and predicting abnormal working conditions as claimed in claim 1, wherein step 2 comprises the following sub-steps:
step 2.1: for the two measurement time series corresponding to operating parameters $V_i$ and $V_j$, the symbolized sequences obtained from step 1 are used to form the transaction set; the line segment class symbols contained in the two sequences are denoted by $a$ and $b$ respectively; the minimum support thresholds of the two stages are denoted $minsup_1$ and $minsup_2$;
step 2.2: compute the support of each item in a single scan of the data set to obtain the frequent 1-item sets, as in 2.2.1 to 2.2.3:
2.2.1: let $\sigma(\cdot)$ be the support count of an item or item set, initially 0; let the class symbols be denoted $t_k$, where $t$ stands for $a$ or $b$;
2.2.3: for each $t_k$, if its support is not less than the minimum support threshold $minsup_1$, $t_k$ is considered a frequent 1-item set, so $t_k$ is retained and its support count is recorded; if its support is less than $minsup_1$, $t_k$ is not a frequent 1-item set;
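The single-scan counting of step 2.2 might look like the following sketch. It assumes that support is measured as the count divided by the number of transactions and that each transaction pairs the symbols at the same position of the two re-segmented sequences; the exact support formula is not preserved in the extracted text. The function name `frequent_1_items` is hypothetical.

```python
from collections import Counter


def frequent_1_items(sym_i, sym_j, minsup1):
    """Count every class symbol once and keep those with relative support >= minsup1."""
    n = len(sym_i)                              # number of transactions
    counts = Counter(sym_i) + Counter(sym_j)    # symbols of the two parameters are disjoint
    return {item: c for item, c in counts.items() if c / n >= minsup1}


sym_i = ["a1", "a1", "a2", "a1"]
sym_j = ["b1", "b2", "b2", "b2"]
print(frequent_1_items(sym_i, sym_j, minsup1=0.5))
```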
step 2.3: form 2-item sets from the frequent 1-item sets $t_k$ obtained in step 2.2 and compute their support to find the frequent 2-item sets, as follows:
2.3.1: let $a_p$ and $b_q$ denote the items retained after step 2.2 from the original line segment class symbols of the two sequences;
2.3.2: for each $\{a_p,b_q\}$, perform the following steps:
2) if the support of $\{a_p,b_q\}$ is not less than $minsup_1$, $\{a_p,b_q\}$ is considered a frequent 2-item set, so $\{a_p,b_q\}$ is retained and its support count is recorded;
step 2.4: using the frequent 2-item sets $\{a_p,b_q\}$ obtained in step 2.3, compute the support of every two operating parameters over the whole data set to obtain the parameter-level frequent item sets, as follows: for the item set $\{V_i,V_j\}$ formed by every two operating parameters $V_i$ and $V_j$, compute $\sigma(\{V_i,V_j\})=\mathrm{sum}(\sigma(\{a_p,b_q\}))$; if the support of $\{V_i,V_j\}$ is not less than the minimum support threshold $minsup_2$, retain $\{V_i,V_j\}$, record the corresponding support, and compute $\sigma(V_i)=\mathrm{sum}(\sigma(a_p))$ and $\sigma(V_j)=\mathrm{sum}(\sigma(b_q))$.
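The two-stage procedure of steps 2.2 to 2.4 can be sketched end to end as below. As assumptions, support is taken as count divided by the number of transactions, and the sums for $\sigma(V_i)$ and $\sigma(V_j)$ run over the distinct symbols that appear in at least one frequent 2-item set. The function name `parameter_support` and the returned dictionary layout are hypothetical.

```python
from collections import Counter


def parameter_support(sym_i, sym_j, minsup1, minsup2):
    """Return parameter-level support counts, or None if {Vi, Vj} is not frequent."""
    n = len(sym_i)
    cnt_i, cnt_j = Counter(sym_i), Counter(sym_j)
    # Stage 1a: frequent 1-item sets per parameter.
    keep_i = {a for a, c in cnt_i.items() if c / n >= minsup1}
    keep_j = {b for b, c in cnt_j.items() if c / n >= minsup1}
    # Stage 1b: frequent 2-item sets built only from retained symbols.
    pair_counts = Counter((a, b) for a, b in zip(sym_i, sym_j)
                          if a in keep_i and b in keep_j)
    freq_pairs = {p: c for p, c in pair_counts.items() if c / n >= minsup1}
    # Stage 2: aggregate symbol-level counts to the parameter level.
    sigma_ij = sum(freq_pairs.values())
    if sigma_ij / n < minsup2:
        return None
    sigma_i = sum(cnt_i[a] for a in {a for a, _ in freq_pairs})
    sigma_j = sum(cnt_j[b] for b in {b for _, b in freq_pairs})
    return {"pairs": freq_pairs, "sigma_ij": sigma_ij,
            "sigma_i": sigma_i, "sigma_j": sigma_j}


sym_i = ["a1", "a1", "a2", "a1"]
sym_j = ["b1", "b2", "b2", "b2"]
print(parameter_support(sym_i, sym_j, minsup1=0.5, minsup2=0.5))
```

Pruning infrequent symbols before forming pairs keeps the number of candidate 2-item sets small, which is the usual motivation for a staged search of this kind.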
3. The method for mining industrial data association rules and predicting abnormal conditions as claimed in claim 2, wherein the step 3 comprises the following sub-steps:
step 3.1: for each item set $\{V_i,V_j\}$ obtained in step 2 that satisfies the support threshold, generate the association rules $V_j \to V_i$ and $V_i \to V_j$, and denote the minimum confidence threshold by $minconf$;
step 3.2: compute the confidence of each generated association rule and extract the association rules as follows: for each association rule $V_i \to V_j$, compute $conf(V_i \to V_j)=\sigma(\{V_i,V_j\})/\sigma(V_i)$; if $conf(V_i \to V_j)$ is not less than the minimum confidence threshold $minconf$, retain the association rule $V_i \to V_j$ and record its support and its confidence $\omega_i$.
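Rule extraction in step 3 can be illustrated with the sketch below, which uses the standard confidence definition, the joint support count divided by the support count of the antecedent. The function name `extract_rules` and the parameter labels passed in `names` are hypothetical.

```python
def extract_rules(sigma_ij, sigma_i, sigma_j, minconf, names=("Vi", "Vj")):
    """Return {(antecedent, consequent): confidence} for rules meeting minconf."""
    rules = {}
    candidates = (
        (names[0], names[1], sigma_i),   # Vi -> Vj uses sigma(Vi) as denominator
        (names[1], names[0], sigma_j),   # Vj -> Vi uses sigma(Vj) as denominator
    )
    for antecedent, consequent, sigma_antecedent in candidates:
        conf = sigma_ij / sigma_antecedent
        if conf >= minconf:
            rules[(antecedent, consequent)] = conf
    return rules


print(extract_rules(sigma_ij=2, sigma_i=3, sigma_j=4, minconf=0.6))
```

With these counts only the rule from the first parameter to the second is kept, since its confidence is 2/3 while the reverse rule reaches only 0.5.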
4. The method for mining industrial data association rules and predicting abnormal operating conditions as claimed in claim 3, wherein the step 4 comprises the following sub-steps:
step 4.1: any group of associated parameters extracted from the association rules is denoted $\{V_1,V_2,\dots,V_u\}$, where $u$ is the number of associated parameters and $V_u$ is the consequent of the rules, i.e. the target parameter; each association rule $V_i \to V_u$, $i=1,2,\dots,u-1$, has a confidence, denoted $\omega_i$; abnormal working conditions of the target parameter $V_u$ are predicted with a wavelet neural network;
step 4.2: construct the training samples: denote the preset prediction step by $l$ and the group of associated parameters extracted by association rule mining by $V_1,V_2,\dots,V_u$; from the complete training data set that they form, the matrix $I_{train}$ is constructed as the training input of the neural network,
where each column of $I_{train}$ is one training input sample, and the training output is $O_{train}$;
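One possible way to build the training samples of step 4.2 is sketched below. The matrices themselves are not preserved in the extracted text, so the layout is an assumption: each input sample holds the values of $V_1,\dots,V_{u-1}$ at one time index, and the matching output is the value of $V_u$ taken $l$ steps later. Samples are returned as rows here, whereas the text stores them as columns. The function name `build_training_samples` is hypothetical.

```python
def build_training_samples(series, step):
    """series: list of u equal-length lists; the last one is the target V_u.

    Returns (inputs, outputs), pairing inputs at time t with the target at t + step.
    """
    *input_series, target = series
    n_samples = len(target) - step
    inputs = [[v[t] for v in input_series] for t in range(n_samples)]
    outputs = [target[t + step] for t in range(n_samples)]
    return inputs, outputs


v1 = [1, 2, 3, 4]
v2 = [10, 20, 30, 40]
v_target = [5, 6, 7, 8]
x_train, y_train = build_training_samples([v1, v2, v_target], step=1)
print(x_train)
print(y_train)
```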
step 4.3: train the wavelet neural network with the constructed training samples: the input parameters are $V_i$, $i=1,2,\dots,u-1$, and the output parameter is $V_u$; at network initialization, the confidences $\omega_i$, $i=1,2,\dots,u-1$, obtained from the association rules are used to set the initial weights between the network input layer and the hidden layer;
step 4.4: predict on new data: denote the preset abnormal working condition threshold by $\omega_p$; for newly acquired sensor measurement data, perform $l$-step prediction with the model trained in step 4.3; if the drift of the predicted target parameter value relative to its initial normal value exceeds the set threshold $\omega_p$, an abnormal working condition is judged to occur.
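The threshold test of step 4.4 can be sketched as follows. It assumes that the drift is the absolute difference between a predicted value and a fixed baseline representing the initial normal value; how the baseline is obtained is not stated in the text. The function name `first_abnormal_index` is hypothetical.

```python
def first_abnormal_index(predictions, baseline, drift_threshold):
    """Return the index of the first prediction whose drift from the baseline
    exceeds drift_threshold, or None if no prediction does."""
    for t, value in enumerate(predictions):
        if abs(value - baseline) > drift_threshold:
            return t
    return None


predicted = [1.0, 1.1, 1.6, 2.0]
print(first_abnormal_index(predicted, baseline=1.0, drift_threshold=0.5))
```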
5. The method as claimed in claim 1, wherein, before the equipment fails, the model is rebuilt and retrained each time a predetermined number of new measurement data have been collected, so as to obtain a more accurate prediction result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910244856.6A CN110008253B (en) | 2019-03-28 | 2019-03-28 | Industrial data association rule mining and abnormal working condition prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110008253A (en) | 2019-07-12 |
CN110008253B (en) | 2021-02-23 |
Family
ID=67168723
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910244856.6A Active CN110008253B (en) | 2019-03-28 | 2019-03-28 | Industrial data association rule mining and abnormal working condition prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110008253B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112130541A (en) * | 2020-10-20 | 2020-12-25 | 陕西煤业新型能源科技股份有限公司 | Energy comprehensive management control system based on Internet of things |
CN112380274B (en) * | 2020-11-16 | 2023-08-22 | 北京航空航天大学 | Abnormality detection method for control process |
CN112800686A (en) * | 2021-03-29 | 2021-05-14 | 国网江西省电力有限公司电力科学研究院 | Transformer DGA online monitoring data abnormal mode judgment method |
CN112801426B (en) * | 2021-04-06 | 2021-06-22 | 浙江浙能技术研究院有限公司 | Industrial process fault fusion prediction method based on correlation parameter mining |
CN113032912A (en) * | 2021-04-20 | 2021-06-25 | 上海交通大学 | Ship diesel engine fault detection method based on association rule |
CN114936581B (en) * | 2022-06-01 | 2024-04-26 | 中国人民解放军63796部队 | Multi-parameter association mining method based on time sequence data segmentation |
CN115497267A (en) * | 2022-09-06 | 2022-12-20 | 江西小手软件技术有限公司 | Equipment early warning platform based on time sequence association rule |
CN115689071B (en) * | 2023-01-03 | 2023-05-02 | 南京工大金泓能源科技有限公司 | Equipment fault fusion prediction method and system based on associated parameter mining |
CN116204842B (en) * | 2023-03-10 | 2023-09-08 | 广东省建设工程质量安全检测总站有限公司 | Abnormality monitoring method and system for electrical equipment |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201142630A (en) * | 2009-12-21 | 2011-12-01 | Ibm | Method for training and using a classification model with association rule models |
CN201898519U (en) * | 2010-09-01 | 2011-07-13 | 燕山大学 | Equipment maintenance early-warning device with risk control |
CN103676645B (en) * | 2013-12-11 | 2016-08-17 | 广东电网公司电力科学研究院 | A kind of method for digging of the correlation rule in time series data stream |
CN108873859B (en) * | 2018-05-31 | 2020-07-31 | 浙江工业大学 | Bridge type grab ship unloader fault prediction model method based on improved association rule |
Also Published As
Publication number | Publication date |
---|---|
CN110008253A (en) | 2019-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110008253B (en) | Industrial data association rule mining and abnormal working condition prediction method | |
CN110018670B (en) | Industrial process abnormal working condition prediction method based on dynamic association rule mining | |
JP7240691B1 (en) | Data drive active power distribution network abnormal state detection method and system | |
CN110008565B (en) | Industrial process abnormal working condition prediction method based on operation parameter correlation analysis | |
CN112418277B (en) | Method, system, medium and equipment for predicting residual life of rotating machine parts | |
JP6216242B2 (en) | Anomaly detection method and apparatus | |
CN109298697B (en) | Method for evaluating working state of each part of thermal power plant system based on dynamic baseline model | |
CN102789545B (en) | Based on the Forecasting Methodology of the turbine engine residual life of degradation model coupling | |
Said et al. | Machine learning technique for data-driven fault detection of nonlinear processes | |
JP2019527413A (en) | Computer system and method for performing root cause analysis to build a predictive model of rare event occurrences in plant-wide operations | |
CN110414154B (en) | Fan component temperature abnormity detection and alarm method with double measuring points | |
CN105548764A (en) | Electric power equipment fault diagnosis method | |
CN105607631B (en) | The weak fault model control limit method for building up of batch process and weak fault monitoring method | |
CN109917777B (en) | Fault detection method based on mixed multi-sampling rate probability principal component analysis model | |
CN112683535B (en) | Bearing life prediction method based on multi-stage wiener process | |
Mosallam et al. | Component based data-driven prognostics for complex systems: Methodology and applications | |
CN111950627A (en) | Multi-source information fusion method and application thereof | |
CN116380445B (en) | Equipment state diagnosis method and related device based on vibration waveform | |
CN111103137A (en) | Wind turbine gearbox fault diagnosis method based on deep neural network | |
CN111382494A (en) | System and method for detecting anomalies in sensory data of industrial machines | |
CN115186762A (en) | Engine abnormity detection method and system based on DTW-KNN algorithm | |
CN109299201B (en) | Power plant production subsystem abnormity monitoring method and device based on two-stage clustering | |
CN114896861A (en) | Rolling bearing residual life prediction method based on square root volume Kalman filtering | |
CN110308713A (en) | A kind of industrial process failure identification variables method based on k neighbour reconstruct | |
JP6915693B2 (en) | System analysis method, system analyzer, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||