WO2007064860A1 - Robust sensor correlation analysis for machine condition monitoring - Google Patents

Robust sensor correlation analysis for machine condition monitoring

Info

Publication number
WO2007064860A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
sensors
correlation coefficient
weight
clustering
Prior art date
Application number
PCT/US2006/045959
Other languages
French (fr)
Inventor
Chao Yuan
Christian Balderer
Tzu Kuo Huang
Claus Neubauer
Original Assignee
Siemens Corporate Research, Inc.
Priority date
Filing date
Publication date
Application filed by Siemens Corporate Research, Inc.
Priority to EP06838755A (EP1955119A1)
Publication of WO2007064860A1

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 Programme-control systems
    • G05B19/02 Programme-control systems electric
    • G05B19/18 Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form
    • G05B19/406 Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form characterised by monitoring or safety
    • G05B19/4065 Monitoring tool breakage, life or condition
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 Testing or monitoring of control systems or parts thereof
    • G05B23/02 Electric testing or monitoring
    • G05B23/0205 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0224 Process history based detection method, e.g. whereby history implies the availability of large amounts of data
    • G05B23/024 Quantitative history assessment, e.g. mathematical relationships between available data; Functions therefor; Principal component analysis [PCA]; Partial least square [PLS]; Statistical classifiers, e.g. Bayesian networks, linear regression or correlation analysis; Neural networks
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/30 Nc systems
    • G05B2219/37 Measurements
    • G05B2219/37508 Cross correlation


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Manufacturing & Machinery (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

A method for monitoring machine conditions is based on machine learning through the use of a statistical model. A correlation coefficient is calculated using weights assigned to each sample that indicate the likelihood that that sample is an outlier. The resulting correlation coefficient is more robust against outliers. The calculation of the weight is based on the Mahalanobis distance from the sample to the sample mean. Additionally, hierarchical clustering is applied to intuitively reveal group information among sensors. By specifying a similarity threshold, the user can easily obtain desired clustering results.

Description

ROBUST SENSOR CORRELATION ANALYSIS FOR MACHINE CONDITION MONITORING
Cross Reference to Related Applications
[0001] This application claims the benefit of U.S. Provisional Application Serial No. 60/741,298 entitled "Robust Sensor Correlation Analysis for Machine Condition Monitoring," filed on December 1, 2005, the contents of which are hereby incorporated by reference in their entirety.
Field of the Invention
[0002] The present invention relates generally to the field of machine condition monitoring, and more particularly, to techniques and systems for the statistical modeling of relationships among machine sensors from historical data.
Background of the Invention
[0003] Many manufacturing and service equipment installations today include, in addition to systems for controlling machines and processes, systems for machine condition monitoring. Machine condition monitoring systems include an array of sensors installed on the equipment, a communications network linking those sensors, and a processor connected to the network for receiving signals from the sensors and making determinations on machine conditions from those signals.
[0004] The purpose of machine condition monitoring is to detect faults as early as possible to avoid further damage to machines. Traditionally, physical models were employed to describe the relationship between sensors that measure performance of a machine. Violation of those physical relationships could indicate faults. However, accurate physical models are often difficult to acquire.
[0005] An alternative to the use of physical models is the use of statistical models based on machine learning techniques. That approach has gained increased interest in recent decades. In contrast to a physical model, which assumes known sensor relationships, a statistical model learns the relationships among sensors from historical data. That characteristic of the statistical models is a big advantage in that the same generic model can be applied to different machines. The learned models differ only in their parameters.
[0006] To ensure the success of a statistical model, the sensors to be included in the model must be selected carefully. For example, for a regression model, which uses a set of input sensors to predict the value of an output sensor, the output sensor should be correlated with the input sensors. Large systems such as a power plant can contain over a thousand sensors. A systematic technique for exploring the relationship between sensors is therefore needed.
[0007] In the statistics field, correlation analysis has been extensively used to find the dependence between random variables. If the signal from each sensor is viewed as a random variable and its value at a certain time is viewed as an independent observation, it is possible to similarly apply statistical correlation analysis to sensors to find out their relationship. A well-known method is to calculate the correlation coefficient between two random variables x and y as:
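$$\rho_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
[0009] where $[x_i, y_i]$ is the $i$th observation (or sample) of x and y, $\bar{x}$ and $\bar{y}$ are the observation means of x and y, and n is the number of samples. For simplicity, $\rho_{xy}$ is also abbreviated as $\rho$.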
[0010] The correlation coefficient defined above suffers from the effects of outliers. A single outlier could significantly lower the $\rho$ score between two random variables even if they are, in fact, highly correlated. To tackle that problem, researchers have proposed Spearman and Kendall correlation coefficients, which are known to be among the top performers.
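The unweighted coefficient and its sensitivity to a single outlier can be illustrated with a minimal sketch. The function name `pearson` and all data values below are made up for illustration and do not come from the application; the formula is the standard unweighted correlation coefficient.

```python
import math

def pearson(xs, ys):
    """Unweighted correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical readings: y = 2x + 1, so the two "sensors" are perfectly correlated.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x + 1.0 for x in xs]
rho_clean = pearson(xs, ys)

# Append one made-up outlier far from the linear pattern.
rho_outlier = pearson(xs + [5.0], ys + [-60.0])
```

With these invented values, one bad sample out of eleven is enough to pull the coefficient from essentially 1 down to well below 0.5, which is the failure mode the paragraph above describes.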
[0011] After calculating $\rho$ for each pair of sensors, a cluster analysis may be performed in order to group the sensors according to their similarity. The well-known k-means clustering method requires a number of clusters to be specified. The value has little or no physical meaning. In addition, the classical k-means clustering technique requires the calculation of the mean of each cluster, which is not directly possible using the correlation-coefficient based measure.
[0012] There is therefore presently a need to provide a method and system for establishing relationships among sensors in a machine condition monitoring system using statistical models based on machine learning techniques. The technique should be capable of dealing with data containing outliers, and should have physically meaningful criteria for setting cluster parameters.
Summary of the Invention
[0013] The present invention addresses the needs described above by providing several machine monitoring methods. A new technique is presented to calculate the correlation coefficient, which is more robust against outliers. A weight is assigned to each sample, indicating the likelihood that that sample is an outlier. The calculation of the weight is based on the Mahalanobis distance from the sample to the sample mean.
[0014] Additionally, hierarchical clustering is applied to intuitively reveal the group information among sensors. By specifying a similarity threshold, the user can easily obtain desired clustering results.
[0015] One embodiment of the invention is a method for machine condition monitoring. The method includes the step of determining a robust correlation coefficient $\rho_{xy}$ between pairs of sensors using data from a group of samples $(x_i, y_i)$ from the pairs of sensors. The robust correlation coefficient $\rho_{xy}$ is determined by initializing a weight $w_i$ for each sample $(x_i, y_i)$, wherein $0 \le w_i \le 1$ and $\sum_i w_i = 1$, each weight $w_i$ being proportional to an inverse of a distance between the sample $(x_i, y_i)$ and a sample mean; estimating a mean $\mu$ and covariance matrix $\Omega$ of the sample as $\mu = \sum_i w_i z_i$ and $\Omega = \sum_i w_i (z_i - \mu)(z_i - \mu)^T$, wherein $z_i = [x_i \; y_i]^T$; updating the weight $w_i$ for each observation $(x_i, y_i)$ as $w_i = w(x_i, y_i) / \sum_k w(x_k, y_k)$, wherein
[Equation image imgf000006_0001 (weighting function $w(x_i, y_i)$) is not reproduced in this text.]
$f(z_i) = (z_i - \mu)^T \Omega^{-1} (z_i - \mu)$ and
[Equation image imgf000006_0002 is not reproduced in this text.]
repeating the estimating and updating steps until convergence. The robust correlation coefficient is calculated as
$$\rho_{xy} = \frac{\sum_i w_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i w_i (x_i - \bar{x})^2 \, \sum_i w_i (y_i - \bar{y})^2}}$$
where $\bar{x} = \sum_i w_i x_i$ and $\bar{y} = \sum_i w_i y_i$. The method further includes the step of predicting sensor readings using the robust correlation coefficient.
[0016] The method may also include the step of clustering the sensors based on the robust correlation coefficient. The clustering may be a hierarchical clustering.
[0017] The step of clustering the sensors may further include the steps of initializing a cluster list by placing each sensor in its own cluster $C_i$; determining distances between pairs of clusters $d_{avg}$, wherein
$$d_{avg}(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{x \in C_i} \sum_{y \in C_j} d_{xy}, \qquad d_{xy} = 1 - \mathrm{abs}(\rho_{xy});$$
and, if a lowest of the distances $d_{avg}(C_i, C_j)$ is smaller than a threshold, combining the respective clusters $C_i$, $C_j$, updating the cluster list and continuing with the determining step.
[0018] Another method for machine condition monitoring includes the steps of receiving a group of readings from a plurality of sensors; for at least one pair of sensors (x, y) of the plurality of sensors, determining a robust correlation coefficient $\rho_{xy}$, using a plurality of samples $(x_i, y_i)$ from the group of readings, and using a weight $w_i$ for each sample $(x_i, y_i)$ based on how closely the sample obeys a joint distribution of the readings of the pair of sensors (x, y); and clustering the sensors in a hierarchical cluster scheme using distances calculated from the robust correlation coefficient $\rho_{xy}$.
[0019] The robust correlation coefficient $\rho_{xy}$ may be determined as detailed above, and the clustering of the sensors may be performed as detailed above.
[0020] The method may further include the step of adjusting the threshold to adjust a dissimilarity of sensors in each cluster. The weight $w_i$ for each sample $(x_i, y_i)$ may further be based on a Mahalanobis distance from the sample to a sample mean.
[0021] Another embodiment of the invention is a computer-usable medium having computer readable instructions stored thereon for execution by a processor to perform the methods described above.
Brief Description of the Drawings
[0022] FIG. 1 is a schematic illustration of a system for machine condition monitoring according to one embodiment of the invention.
[0023] FIG. 2 is a plot showing a weighting function according to one embodiment of the invention.
[0024] FIG. 3 is a flow chart showing a method according to one embodiment of the invention.
[0025] FIG. 4 is a plot showing data used in testing a method of the invention.
[0026] FIG. 5 is a pseudocode listing showing an implementation of a method according to one embodiment of the invention.
[0027] FIG. 6 is a plot showing data used in testing a method of the invention.
[0028] FIG. 7 is a plot showing hierarchical clustering as a function of cluster dissimilarity according to one embodiment of the invention.
Description of the Invention
[0029] A system 110 for monitoring conditions of machines 120, 130, 140 according to one embodiment of the invention is shown in FIG. 1. The system includes a plurality of machine sensors such as the sensors 121A, 121B connected to machine 120. The sensors may, for example, be accelerometers, temperature sensors, flow sensors, position sensors, rate sensors, chemical sensors or any sensor that measures a condition of a machine or process. The sensors measure conditions chosen because they are related in predictable ways that reflect the presence or absence of normal operating conditions in an installation 100.
[0030] The sensors 121A, 121B are connected through a data network 150 to a data interface 118 in a machine condition monitoring system 110. A processor 116 receives the sensor data from the data interface 118 and performs the monitoring methods of the invention. The processor is connected to storage media 112 for storing computer-readable instructions that, when executed, perform the monitoring methods. The storage media 112 may also store historical data received from the sensors 121A, 121B. A user interface 114 is provided for communicating results and receiving instructions from a user.
[0031] ROBUST CORRELATION COEFFICIENT
[0032] The correlation coefficient $\rho$ between x and y is ideally calculated from good samples that represent the joint distribution of x and y. Due to the existence of outliers (observations that lie outside the overall pattern of a distribution), however, the estimated $\rho$ using prior art methods is often incorrect. To obtain a correct correlation coefficient $\rho$, outliers must either be removed from samples or their effects must be reduced in the calculation of the correlation coefficient $\rho$.
[0033] In the present invention, the second approach, reducing the effect of outliers, is used. In many cases, it is difficult or impossible to set a clear boundary between normal samples and outliers. Instead of requiring a binary outlier decision (yes or no) on a sample, the technique of the present invention assigns weights $w_i$ to samples. If a sample obeys the joint distribution of x and y, that sample is given a high weight; otherwise, it is given a low weight. The weights $w_i$ are defined such that $0 \le w_i \le 1$ and $\sum_i w_i = 1$. Each sample can be represented by a vector $[x_i \; y_i \; w_i]$. The calculation of the correlation coefficient $\rho$, in robust form, therefore becomes:
$$\rho = \frac{\sum_i w_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i w_i (x_i - \bar{x})^2 \, \sum_i w_i (y_i - \bar{y})^2}}$$
[0035] where $\bar{x} = \sum_i w_i x_i$ and $\bar{y} = \sum_i w_i y_i$. In addition to $\rho$, other measures may be transformed into a corresponding robust form, as demonstrated below.
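A minimal sketch of the weighted form follows. It assumes the robust coefficient replaces each sum of the unweighted formula with its weighted counterpart and uses the weighted means stated in the text; the function name `weighted_correlation`, the data, and the hand-picked weights are hypothetical.

```python
import math

def weighted_correlation(xs, ys, ws):
    """Weighted correlation coefficient; ws are non-negative and sum to 1."""
    x_bar = sum(w * x for w, x in zip(ws, xs))
    y_bar = sum(w * y for w, y in zip(ws, ys))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    syy = sum(w * (y - y_bar) ** 2 for w, y in zip(ws, ys))
    return sxy / math.sqrt(sxx * syy)

# Made-up samples on the line y = 2x, with the last sample an outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 3.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, -40.0]

# Equal weights reproduce the unweighted coefficient.
rho_uniform = weighted_correlation(xs, ys, [1.0 / 6] * 6)

# Hand-picked weights that give the outlier no influence.
rho_down = weighted_correlation(xs, ys, [0.2, 0.2, 0.2, 0.2, 0.2, 0.0])
```

Here the weights are chosen by hand only to show the effect of down-weighting; the method described in the text derives them from the data.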
[0036] A weight $w_i$ must be assigned to each sample $[x_i \; y_i]$. To do so, a statistic $f(x, y)$ is developed such that the probability of $f(x, y) < f_0$ is very large (such as 0.95), where $f_0$ is a threshold. That provides a criterion to distinguish outliers from normal samples.
[0037] For a sample $[x_i, y_i]$, if $f(x_i, y_i) < f_0$, it is concluded that that sample obeys the distribution. If $f(x_i, y_i) > f_0$, it is concluded that that sample violates the distribution, because the probability for that sample to occur under the distribution is very low (such as 0.05). When the sample violates the distribution, a decreasing weight $w_i$ is assigned to it based on the deviation of $f(x_i, y_i)$ from $f_0$. The weighting function is defined as:
[Equation image imgf000010_0002 (weighting function $w(x_i, y_i)$) is not reproduced in this text.]
[0039] The shape of the weighting function $w(x_i, y_i)$ is illustrated in the graph 200 of FIG. 2. $w_i$ is simply the normalized version of $w(x_i, y_i)$:
[0040] $$w_i = \frac{w(x_i, y_i)}{\sum_k w(x_k, y_k)}$$
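The thresholding and normalization can be sketched as follows. The exact weighting function is given only in an equation image and FIG. 2, neither of which survives in this text, so `raw_weight` below is a hypothetical stand-in (1 inside the threshold, exponentially decaying beyond it) and not the function of the application. The names `raw_weight`, `normalize` and the sample statistic values are likewise assumptions.

```python
import math

F0 = 6.0  # threshold value suggested later in the text for a 0.95 probability

def raw_weight(f, f0=F0):
    """Hypothetical stand-in: full weight up to f0, decaying weight beyond it."""
    if f <= f0:
        return 1.0
    return math.exp(-(f - f0) / f0)

def normalize(raw):
    """Scale raw weights so that they sum to 1."""
    total = sum(raw)
    return [r / total for r in raw]

# Made-up values of the statistic f for five samples; the last two exceed f0.
f_values = [0.5, 2.0, 5.9, 12.0, 30.0]
weights = normalize([raw_weight(f) for f in f_values])
```

Any function that equals a constant below the threshold and decreases above it would satisfy the description in the text; the exponential decay rate used here is arbitrary.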
[0041] The function $f(x_i, y_i)$ is defined to be the Mahalanobis distance from the mean of $[x \; y]$. By so defining the function $f(x_i, y_i)$, it is implicitly assumed that x and y satisfy a joint Gaussian distribution $N(\mu, \Omega)$, where $\mu$ is the mean and $\Omega$ is the covariance matrix. Let $z_i = [x_i \; y_i]^T$. Then $f(x_i, y_i)$ can be expressed as:
[0042] $$f(z_i) = (z_i - \mu)^T \Omega^{-1} (z_i - \mu)$$
[0043] It can be proved that $f(z)$ satisfies a chi-square distribution. As noted above, a preferred embodiment of the invention requires that the probability of $f(z) < f_0$ is 0.95, if z is a good sample. That suggests the use of $f_0 = 6.0$ based on the standard chi-square distribution lookup table. Note that the Mahalanobis distance can be used in cases where the distribution of x and y is not Gaussian, since in most cases, outliers are located far from the mean of the distribution. The same weighting strategy is applied to estimate $\mu$ and $\Omega$, since standard ways to calculate them also suffer from outliers. The estimates of $\mu$ and $\Omega$ are defined as:
[0044] $$\mu = \sum_i w_i z_i, \text{ and}$$
$$\Omega = \sum_i w_i (z_i - \mu)(z_i - \mu)^T$$
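The weighted estimates and the statistic f can be sketched for the two-sensor case as below. The function names and sample values are hypothetical. The threshold computation relies on the fact that, for two Gaussian variables, f follows a chi-square distribution with 2 degrees of freedom, whose cumulative distribution is 1 - exp(-f/2); this reproduces the value of about 6.0 quoted in the text.

```python
import math

def weighted_mean_cov(zs, ws):
    """Weighted mean (mx, my) and covariance entries (sxx, sxy, syy); ws sum to 1."""
    mx = sum(w * x for w, (x, y) in zip(ws, zs))
    my = sum(w * y for w, (x, y) in zip(ws, zs))
    sxx = sum(w * (x - mx) ** 2 for w, (x, y) in zip(ws, zs))
    syy = sum(w * (y - my) ** 2 for w, (x, y) in zip(ws, zs))
    sxy = sum(w * (x - mx) * (y - my) for w, (x, y) in zip(ws, zs))
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(z, mean, cov):
    """Quadratic form f(z) = (z - mu)^T Omega^{-1} (z - mu) for a 2x2 Omega."""
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    # Closed-form inverse of a symmetric 2x2 matrix applied to (dx, dy).
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

# 0.95 quantile of a chi-square distribution with 2 degrees of freedom.
f0 = -2.0 * math.log(0.05)

# Made-up samples with equal weights; the last one lies off the trend.
zs = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (2.5, 9.0)]
ws = [0.2] * 5
mean, cov = weighted_mean_cov(zs, ws)
f_values = [mahalanobis_sq(z, mean, cov) for z in zs]
```

With so few equally weighted samples no value of f can exceed the threshold, which is one reason the weights have to be refined iteratively rather than computed in a single pass.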
[0046] A method for calculating the correlation coefficient $\rho$, shown in FIG. 3, may now be defined as a series of steps for execution by a processor:
[0047] 1. Initialize $w_i$ (step 310) by calculating the sample mean and assigning to each weight $w_i$ a value proportional to the inverse of the distance between the $i$th sample and the sample mean.
[0048] 2. Estimate $\mu$ and $\Omega$ (step 320) as described above, using the sample weights $w_i$ and $z_i = [x_i \; y_i]^T$.
[0049] 3. Update $w_i$ (step 330) using the estimates of $\mu$ and $\Omega$, by expressing $f(x_i, y_i)$ as $f(z_i) = (z_i - \mu)^T \Omega^{-1} (z_i - \mu)$, substituting in the weighting function $w(x_i, y_i)$:
[Equation image imgf000012_0001 (weighting function $w(x_i, y_i)$) is not reproduced in this text.]
[0051] and normalizing to find $w_i$:
[0052] $$w_i = \frac{w(x_i, y_i)}{\sum_k w(x_k, y_k)}$$
[0053] 4. If the algorithm converges (decision 340), continue; otherwise, return to Step 2.
[0054] 5. Calculate and output (step 350) the robust correlation coefficient $\rho$ over the samples $(x_i, y_i)$, as
$$\rho = \frac{\sum_i w_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i w_i (x_i - \bar{x})^2 \, \sum_i w_i (y_i - \bar{y})^2}}$$
[0056] and end the process (block 360).
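The five steps can be put together in a short sketch. Several details are assumptions rather than the application's own choices: the weighting function `raw_weight` is a hypothetical stand-in for the one shown only in FIG. 2, the initial distance is taken to be Euclidean, convergence is judged by the largest change in any weight, and `tol`, `max_iter` and `eps` are illustrative parameters. The data are made up.

```python
import math

F0 = 6.0  # threshold from the text (0.95 chi-square quantile)

def raw_weight(f, f0=F0):
    # Hypothetical stand-in for the weighting function of FIG. 2.
    return 1.0 if f <= f0 else math.exp(-(f - f0) / f0)

def normalize(raw):
    total = sum(raw)
    return [r / total for r in raw]

def moments(xs, ys, ws):
    """Weighted means and second central moments."""
    mx = sum(w * x for w, x in zip(ws, xs))
    my = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    syy = sum(w * (y - my) ** 2 for w, y in zip(ws, ys))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return mx, my, sxx, sxy, syy

def robust_correlation(xs, ys, tol=1e-9, max_iter=100):
    n = len(xs)
    # Step 1: weights proportional to inverse distance from the sample mean.
    mx, my = sum(xs) / n, sum(ys) / n
    eps = 1e-12  # avoids division by zero; an added assumption
    ws = normalize([1.0 / (math.hypot(x - mx, y - my) + eps)
                    for x, y in zip(xs, ys)])
    for _ in range(max_iter):
        # Step 2: weighted estimates of mu and Omega.
        mx, my, sxx, sxy, syy = moments(xs, ys, ws)
        det = sxx * syy - sxy * sxy
        # Step 3: evaluate f(z_i), then re-weight and normalize.
        fs = [(syy * (x - mx) ** 2 - 2.0 * sxy * (x - mx) * (y - my)
               + sxx * (y - my) ** 2) / det for x, y in zip(xs, ys)]
        new_ws = normalize([raw_weight(f) for f in fs])
        # Step 4: stop once the weights no longer change.
        converged = max(abs(a - b) for a, b in zip(ws, new_ws)) < tol
        ws = new_ws
        if converged:
            break
    # Step 5: weighted correlation coefficient.
    _, _, sxx, sxy, syy = moments(xs, ys, ws)
    return sxy / math.sqrt(sxx * syy), ws

# Made-up data: 20 samples near the line y = x, plus two gross outliers.
xs = [float(k) for k in range(1, 21)] + [10.0, 11.0]
ys = [k + (0.5 if k % 2 == 0 else -0.5) for k in range(1, 21)] + [60.0, -40.0]
rho, weights = robust_correlation(xs, ys)
```

On this invented data set the two outliers end up with essentially zero weight and the returned coefficient is close to the value obtained from the 20 clean samples alone.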
[0057] An example data set 400 including 20 samples with two outliers 420, 430 located at [2.1, 3.2] and [1.8, 4.5], respectively, is shown in FIG. 4. Using the unweighted method discussed in the background section above for calculating the correlation coefficient $\rho_{xy}$, the result is $\rho$ = 0.254. Apparently, that result is corrupted by the two outliers 420, 430. If those two outliers are excluded, the same unweighted equation yields the ideal $\rho$ = 0.973. The method of the invention discussed above produces $\rho$ = 0.923. For comparison, the Spearman and Kendall correlation coefficient estimators produce $\rho$ = 0.582 and 0.605, respectively. The method of the present invention produces the best correlation coefficients; i.e., the correlation coefficients that are closest to the ideal value.
[0058] HIERARCHICAL SENSOR CLUSTERING
[0059] A hierarchical clustering method is used for clustering sensors for several reasons. First, the output of hierarchical clustering provides the user with a graphical tree structure, which is intuitive and easy to analyze. Second, by specifying different similarity thresholds corresponding to different levels of the tree, a user can easily obtain preferred clustering results. Such a threshold is directly related to the correlation relationship among sensors.
[0060] To create a clustering schema, the correlation coefficient $\rho_{xy}$ is initially converted into a distance-based dissimilarity measure, which is often used in the clustering literature:
[0061] $$d_{xy} = 1 - \mathrm{abs}(\rho_{xy}).$$
[0062] In general, one of the following three functions is used to measure the distance between two clusters $C_i$ and $C_j$:
[0063] $$d_{\min}(C_i, C_j) = \min_{x \in C_i,\, y \in C_j} d_{xy};$$
[0064] $$d_{\max}(C_i, C_j) = \max_{x \in C_i,\, y \in C_j} d_{xy}; \text{ or}$$
$$d_{avg}(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{x \in C_i} \sum_{y \in C_j} d_{xy}.$$
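The dissimilarity measure and the three cluster distances can be sketched as below. The third function appears in this text only as an equation image, so the average of all pairwise dissimilarities is assumed from its name. Sensor names and correlation values are made up.

```python
def dissimilarity(rho):
    """d_xy = 1 - |rho_xy|: strongly correlated sensors are close together."""
    return 1.0 - abs(rho)

def d_min(ci, cj, d):
    """Smallest pairwise dissimilarity between two clusters."""
    return min(d[x][y] for x in ci for y in cj)

def d_max(ci, cj, d):
    """Largest pairwise dissimilarity between two clusters."""
    return max(d[x][y] for x in ci for y in cj)

def d_avg(ci, cj, d):
    """Mean pairwise dissimilarity between two clusters (assumed form)."""
    return sum(d[x][y] for x in ci for y in cj) / (len(ci) * len(cj))

# Hypothetical correlation coefficients between sensors A, B and sensor C.
rho = {"A": {"C": -0.6}, "B": {"C": 0.2}}
d = {x: {y: dissimilarity(r) for y, r in row.items()} for x, row in rho.items()}

ci, cj = ["A", "B"], ["C"]
distances = (d_min(ci, cj, d), d_max(ci, cj, d), d_avg(ci, cj, d))
```

Because the absolute value is used, the strong negative correlation between A and C counts as similarity: the three distances here come out as 0.4, 0.8 and 0.6.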
[0066] The inventors have chosen the third function, $d_{avg}$, which is also known as complete linkage. That choice guarantees that all sensors in a cluster have a correlation coefficient larger than the user-specified threshold.
[0067] The implementation of the above hierarchical clustering technique is as follows. A matrix is used to store the pair-wise distances $d_{avg}$, and a list is used to store clusters $C_i$. In each step, the data is searched for the closest pair of clusters. That pair is merged, and the distance matrix and the cluster list are updated accordingly. If the distance of the closest pair is larger than the threshold, the process is stopped and the resulting clusters are returned. Note that the correlation coefficient must be transformed into a dissimilarity value as described above. A pseudocode representation 500 of the method is presented in FIG. 5.
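A compact sketch of the merge loop follows; it is not the pseudocode of FIG. 5, which is not reproduced in this text. For brevity it recomputes cluster distances on every pass instead of maintaining a distance matrix, and it assumes the cluster distance is the mean pairwise dissimilarity. The function name `cluster_sensors`, the sensor names and the correlation values are hypothetical.

```python
from itertools import combinations

def cluster_sensors(names, rho, threshold):
    """Merge the closest pair of clusters until its distance exceeds threshold."""
    def d_xy(a, b):
        return 1.0 - abs(rho[frozenset((a, b))])

    def d_avg(ci, cj):
        # Assumed cluster distance: mean of all pairwise dissimilarities.
        return sum(d_xy(a, b) for a in ci for b in cj) / (len(ci) * len(cj))

    clusters = [[name] for name in names]  # each sensor starts in its own cluster
    while len(clusters) > 1:
        dist, i, j = min(
            (d_avg(clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index j is still valid
        del clusters[j]
    return clusters

# Made-up correlations: A-B and C-D are tight pairs, the pairs are moderately
# related to each other, and E is nearly unrelated to everything.
names = ["A", "B", "C", "D", "E"]
rho = {frozenset(pair): 0.1 for pair in combinations(names, 2)}
rho[frozenset(("A", "B"))] = 0.97
rho[frozenset(("C", "D"))] = 0.99
for a in "AB":
    for c in "CD":
        rho[frozenset((a, c))] = 0.8

tight = cluster_sensors(names, rho, threshold=0.1)
loose = cluster_sensors(names, rho, threshold=0.3)
```

With these values a threshold of 0.1 leaves three clusters and a threshold of 0.3 leaves two. Under an averaged distance the threshold bounds the mean dissimilarity of a merge; a bound on every individual pair would require the max-based distance instead.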
[0068] TEST RESULTS
[0069] The methods of the invention have been experimentally applied to real power plant data. The data set contained 35 sensors, including power (MW), inlet guide vane actuator position (IGV), inlet temperature (T1C), and 32 blade path temperature (BPTC) sensors: BPTC1A, BPTC1B, BPTC2A ... BPTC16B. Each sensor has values corresponding to 2939 time stamps (i.e., 2939 samples). According to domain knowledge, MW and IGV are correlated and all 32 BPTC sensors are highly correlated. High absolute correlation coefficients are expected between correlated sensors. Among the 32 correlated BPTC sensors, the last sensor (BPTC16B) produced several extremely small values as compared with the remaining values. The plot 600 of FIG. 6 shows the sample distribution between BPTC16A and BPTC16B. The outliers 610 can be clearly seen. Using the unweighted method discussed in the background section above, the correlation coefficient $\rho_{xy}$ = 0.2532 for those two sensors. Using the method of the invention, the robust correlation coefficient $\rho_{xy}$ = 0.9993, which reflects the correct correlation between those two sensors.
[0070] Hierarchical clustering of that data in accordance with the invention is shown in FIG. 7. The y-axis denotes the dissimilarity values between the pairs of joined clusters. The tree is cut according to the threshold and a cluster is formed for each new sub-tree. For a dissimilarity threshold of 0.1, three clusters are produced: {MW, IGV}, {BPTC1A ... BPTC16B} and {TIC}. For a threshold of 0.3, two clusters are produced: {MW, IGV, BPTC1A ... BPTC16B} and {TIC}.
[0071] The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Description of the Invention, but rather from the Claims as interpreted according to the full breadth permitted by the patent laws. For example, while the method is disclosed herein as describing machine condition monitoring in an industrial environment, the method may be used in any environment where conditions are monitored by sensors having relationships that are repeatable, while remaining within the scope of the invention. For example, the invention may be applied to an agricultural system, a highway traffic system or a natural ecological system. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.

Claims

What is claimed is:
1. A method for machine condition monitoring, comprising the steps of: determining a robust correlation coefficient $\rho_{xy}$ between pairs of sensors using data from a group of samples $(x_i, y_i)$ from the pairs of sensors, the robust correlation coefficient $\rho_{xy}$ being determined by:
initializing a weight $w_i$ for each sample $(x_i, y_i)$, wherein $0 \le w_i \le 1$ and $\sum_i w_i = 1$, each weight $w_i$ being proportional to an inverse of a distance between the sample $(x_i, y_i)$ and a sample mean;
estimating a mean $\mu$ and covariance matrix $\Omega$ of the sample as
$\mu = \sum_i w_i z_i$ and $\Omega = \sum_i w_i (z_i - \mu)(z_i - \mu)^T$,
wherein $z_i = [x_i, y_i]^T$;
updating the weight $w_i$ for each observation $(x_i, y_i)$ as
$w_i = \frac{w(x_i, y_i)}{\sum_j w(x_j, y_j)}$,
wherein w(x,, y,) = <
wherein $f(x_i, y_i) = (z_i - \mu)^T \Omega^{-1} (z_i - \mu)$; and
repeating the estimating and updating steps until convergence; calculating the robust correlation coefficient as
Figure imgf000016_0002
where $\bar{x} = \sum_i w_i x_i$ and $\bar{y} = \sum_i w_i y_i$; and
predicting sensor readings using the robust correlation coefficient.
2. The method of claim 1, further comprising the step of: clustering the sensors based on the robust correlation coefficient.
3. The method of claim 2, wherein the clustering is a hierarchical clustering.
4. The method of claim 2, wherein the step of clustering the sensors further comprises: initializing a cluster list by placing each sensor in its own cluster $C_i$;
determining distances between pairs of clusters $d_{avg}(C_i, C_j)$ =
Figure imgf000017_0001
wherein $d_{xy} = 1 - \mathrm{abs}(\rho_{xy})$; and
if a lowest of the distances $d_{avg}(C_i, C_j)$ is smaller than a threshold, combining the respective clusters $C_i$, $C_j$, updating the cluster list and continuing with the determining step.
5. A method for machine condition monitoring, comprising the steps of: receiving a group of readings from a plurality of sensors; for at least one pair of sensors (x, y) of the plurality of sensors, determining a robust correlation coefficient $\rho_{xy}$, using a plurality of samples $(x_i, y_i)$ from the group of readings, and using a weight $w_i$ for each sample $(x_i, y_i)$ based on how closely the sample obeys a joint distribution of the readings of the pair of sensors (x, y); and clustering the sensors in a hierarchical cluster scheme using distances calculated from the robust correlation coefficient $\rho_{xy}$.
6. The method of claim 5, wherein the robust correlation coefficient $\rho_{xy}$ is determined by:
initializing the weight $w_i$ for each sample $(x_i, y_i)$, wherein $0 < w_i \le 1$ and $\sum_i w_i = 1$, each $w_i$ being proportional to an inverse of a distance between the sample $(x_i, y_i)$ and a sample mean;
estimating a mean $\mu$ and covariance matrix $\Omega$ of the sample as
$\mu = \sum_i w_i z_i$ and $\Omega = \sum_i w_i (z_i - \mu)(z_i - \mu)^T$,
wherein $z_i = [x_i, y_i]^T$;
updating the weight $w_i$ for each sample $(x_i, y_i)$ as
Figure imgf000018_0002
w ,herei -n w ((x, ,yl ,) = J <[ ! f (.Xi^i) < fo ,
wherein $f(x_i, y_i) = (z_i - \mu)^T \Omega^{-1} (z_i - \mu)$; and
repeating the estimating and updating steps until convergence; calculating the robust correlation coefficient as
Figure imgf000018_0003
where $\bar{x} = \sum_i w_i x_i$ and $\bar{y} = \sum_i w_i y_i$.
7. The method of claim 5, wherein the step of clustering the sensors further comprises: initializing a cluster list by placing each sensor in its own cluster $C_i$;
determining distances between pairs of clusters $d_{avg}(C_i, C_j)$ =
Figure imgf000019_0001
wherein $d_{xy} = 1 - \mathrm{abs}(\rho_{xy})$; and
if a lowest of the distances $d_{avg}(C_i, C_j)$ is smaller than a threshold, combining the respective clusters $C_i$, $C_j$, updating the cluster list and continuing with the determining step.
8. The method of claim 7, further comprising the step of: adjusting the threshold to adjust a dissimilarity of sensors in each cluster.
9. The method of claim 5, wherein the weight $w_i$ for each sample $(x_i, y_i)$ is further based on a Mahalanobis distance from the sample to a sample mean.
10. A computer-usable medium having computer readable instructions stored thereon for execution by a processor to perform a method comprising: receiving a group of readings from a plurality of sensors; for at least one pair of sensors (x, y) of the plurality of sensors, determining a robust correlation coefficient $\rho_{xy}$, using a plurality of samples $(x_i, y_i)$ from the group of readings, and using a weight $w_i$ for each sample $(x_i, y_i)$ based on how closely the sample obeys a joint distribution of the readings of the pair of sensors (x, y); and clustering the sensors in a hierarchical cluster scheme using distances calculated from the robust correlation coefficient $\rho_{xy}$.
11. The computer-usable medium of claim 10, wherein the robust correlation coefficient $\rho_{xy}$ is determined by:
initializing the weight $w_i$ for each sample $(x_i, y_i)$, wherein $0 < w_i \le 1$ and $\sum_i w_i = 1$, each $w_i$ being proportional to an inverse of a distance between the sample $(x_i, y_i)$ and a sample mean;
estimating a mean $\mu$ and covariance matrix $\Omega$ of the sample as
$\mu = \sum_i w_i z_i$ and $\Omega = \sum_i w_i (z_i - \mu)(z_i - \mu)^T$,
wherein $z_i = [x_i, y_i]^T$;
updating the weight $w_i$ for each sample $(x_i, y_i)$ as
$w_i = \frac{w(x_i, y_i)}{\sum_j w(x_j, y_j)}$,
wherein
Figure imgf000020_0001
wherein $f(x_i, y_i) = (z_i - \mu)^T \Omega^{-1} (z_i - \mu)$; and
repeating the estimating and updating steps until convergence; calculating the robust correlation coefficient as
Figure imgf000020_0002
where $\bar{x} = \sum_i w_i x_i$ and $\bar{y} = \sum_i w_i y_i$.
12. The computer-usable medium of claim 10, wherein the step of clustering the sensors further comprises: initializing a cluster list by placing each sensor in its own cluster $C_i$;
determining distances between pairs of clusters
Figure imgf000021_0001
wherein $d_{xy} = 1 - \mathrm{abs}(\rho_{xy})$; and
if a lowest of the distances $d_{avg}(C_i, C_j)$ is smaller than a threshold, combining the respective clusters $C_i$, $C_j$, updating the cluster list and continuing with the determining step.
PCT/US2006/045959 2005-12-01 2006-12-01 Robust sensor correlation analysis for machine condition monitoring WO2007064860A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP06838755A EP1955119A1 (en) 2005-12-01 2006-12-01 Robust sensor correlation analysis for machine condition monitoring

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US74129805P 2005-12-01 2005-12-01
US60/741,298 2005-12-01
US11/563,396 2006-11-27
US11/563,396 US7769561B2 (en) 2005-12-01 2006-11-27 Robust sensor correlation analysis for machine condition monitoring

Publications (1)

Publication Number Publication Date
WO2007064860A1 (en) 2007-06-07

Family

ID=37845311

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/045959 WO2007064860A1 (en) 2005-12-01 2006-12-01 Robust sensor correlation analysis for machine condition monitoring

Country Status (3)

Country Link
US (1) US7769561B2 (en)
EP (1) EP1955119A1 (en)
WO (1) WO2007064860A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7756678B2 (en) 2008-05-29 2010-07-13 General Electric Company System and method for advanced condition monitoring of an asset system
US8144005B2 (en) 2008-05-29 2012-03-27 General Electric Company System and method for advanced condition monitoring of an asset system
US8352216B2 (en) 2008-05-29 2013-01-08 General Electric Company System and method for advanced condition monitoring of an asset system
EP3163463A1 (en) 2015-10-26 2017-05-03 Aphilion BVBA A correlation estimating device and the related method
WO2018111479A1 (en) * 2016-12-12 2018-06-21 General Electric Company System and method for issue detection of industrial processes
CN110426677A (en) * 2019-06-19 2019-11-08 中国航空工业集团公司雷华电子技术研究所 Clutter covariance matrix estimation method based on correlation coefficient weighted

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007067521A1 (en) * 2005-12-05 2007-06-14 Siemens Corporate Research, Inc. Use of sequential clustering for instance selection in machine condition monitoring
US20090018410A1 (en) * 2006-03-02 2009-01-15 Koninklijke Philips Electronics N.V. Body parameter sensing
US7797265B2 (en) * 2007-02-26 2010-09-14 Siemens Corporation Document clustering that applies a locality sensitive hashing function to a feature vector to obtain a limited set of candidate clusters
US7711668B2 (en) * 2007-02-26 2010-05-04 Siemens Corporation Online document clustering using TFIDF and predefined time windows
US8013738B2 (en) 2007-10-04 2011-09-06 Kd Secure, Llc Hierarchical storage manager (HSM) for intelligent storage of large volumes of data
US7382244B1 (en) 2007-10-04 2008-06-03 Kd Secure Video surveillance, storage, and alerting system having network management, hierarchical data storage, video tip processing, and vehicle plate analysis
JP4917527B2 (en) * 2007-12-21 2012-04-18 東京エレクトロン株式会社 Information processing apparatus, information processing method, and program
AT507019B1 (en) * 2008-07-04 2011-03-15 Siemens Vai Metals Tech Gmbh METHOD FOR MONITORING AN INDUSTRIAL PLANT
CN102265227B (en) * 2008-10-20 2013-09-04 西门子公司 Method and apparatus for creating state estimation models in machine condition monitoring
US9285983B2 (en) * 2010-06-14 2016-03-15 Amx Llc Gesture recognition using neural networks
US20120029298A1 (en) * 2010-07-28 2012-02-02 Yongji Fu Linear classification method for determining acoustic physiological signal quality and device for use therein
CN104067011B (en) * 2011-11-23 2017-07-28 Skf公司 Rotary system state monitoring device and method, computer readable medium and management server
US8988238B2 (en) 2012-08-21 2015-03-24 General Electric Company Change detection system using frequency analysis and method
DE102012019026A1 (en) * 2012-09-26 2014-03-27 Blum-Novotest Gmbh Method and device for measuring a tool received in a workpiece processing machine
SG10201405714SA (en) * 2014-09-15 2016-04-28 Yokogawa Engineering Asia Pte Ltd Method, system and computer program for fault detection in a machine
US10013820B2 (en) 2015-12-15 2018-07-03 Freeport-Mcmoran Inc. Vehicle speed-based analytics
US10378160B2 (en) 2015-12-15 2019-08-13 Freeport-Mcmoran Inc. Systems and methods of determining road quality
CN105787159B (en) * 2016-02-14 2019-04-16 同济大学 A kind of Product Modeling Method based on Products Eco system model
CN107480341B (en) * 2017-07-21 2018-10-23 河海大学 A kind of dam safety comprehensive method based on deep learning
US11314242B2 (en) * 2019-01-28 2022-04-26 Exxonmobil Research And Engineering Company Methods and systems for fault detection and identification

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030065462A1 (en) * 2001-08-13 2003-04-03 Potyrailo Radislav Alexandrovich Multivariate statistical process analysis systems and methods for the production of melt polycarbonate
US20040181299A1 (en) 2003-03-12 2004-09-16 Tokyo Electron Limited Prediction method and apparatus of a processing result

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6643613B2 (en) * 2001-07-03 2003-11-04 Altaworks Corporation System and method for monitoring performance metrics
EP1483720A1 (en) * 2002-02-01 2004-12-08 Rosetta Inpharmactis LLC. Computer systems and methods for identifying genes and determining pathways associated with traits
US7433497B2 (en) * 2004-01-23 2008-10-07 Hewlett-Packard Development Company, L.P. Stabilizing a sequence of image frames

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CAMPBELL N A: "ROBUST PROCEDURES IN MULTIVARIATE ANALYSIS I: ROBUST COVARIANCE ESTIMATION", APPLIED STATISTICS, ROYAL STATISTICAL SOCIETY, LONDON,, GB, vol. 29, no. 3, 1980, pages 231 - 237, XP009012499, ISSN: 0035-9254 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7756678B2 (en) 2008-05-29 2010-07-13 General Electric Company System and method for advanced condition monitoring of an asset system
US8144005B2 (en) 2008-05-29 2012-03-27 General Electric Company System and method for advanced condition monitoring of an asset system
US8352216B2 (en) 2008-05-29 2013-01-08 General Electric Company System and method for advanced condition monitoring of an asset system
EP3163463A1 (en) 2015-10-26 2017-05-03 Aphilion BVBA A correlation estimating device and the related method
WO2018111479A1 (en) * 2016-12-12 2018-06-21 General Electric Company System and method for issue detection of industrial processes
US10401847B2 (en) 2016-12-12 2019-09-03 General Electric Company System and method for issue detection of industrial processes
CN110426677A (en) * 2019-06-19 2019-11-08 中国航空工业集团公司雷华电子技术研究所 Clutter covariance matrix estimation method based on correlation coefficient weighted

Also Published As

Publication number Publication date
US20070162241A1 (en) 2007-07-12
US7769561B2 (en) 2010-08-03
EP1955119A1 (en) 2008-08-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
REEP Request for entry into the european phase

Ref document number: 2006838755

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2006838755

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE