CN113454553A

CN113454553A - System and method for detecting and measuring anomalies in signaling originating from components used in industrial processes

Info

Publication number: CN113454553A
Application number: CN202080011347.XA
Authority: CN
Inventors: 阿莉森·米尚
Original assignee: Buehler AG
Current assignee: Buehler AG
Priority date: 2019-01-30
Filing date: 2020-01-30
Publication date: 2021-09-28
Anticipated expiration: 2040-01-30
Also published as: CN113454553B

Abstract

The present invention relates to a method and system for detecting anomalies in sensor data derived from components used in industrial processes. The anomaly detection steps include: (i) obtaining process and alarm/fault data from a component or a group of components; (ii) learning typical frequencies of abnormal operation or alarms/faults; (iii) comparing new data to the learned normal operation; and (iv) identifying the new data as anomalous based on an adjustable threshold. The present invention also provides novel, automated and efficient alarm monitoring, detection and visualization specific to this application.

Description

System and method for detecting and measuring anomalies in signaling originating from components used in industrial processes

Technical Field

The present invention relates to industrial process control and/or monitoring systems. In particular, the present invention relates to a method for detecting anomalies or early indications of equipment failure in industrial equipment or a production plant by monitoring measurement data and/or process parameters originating from components used in the industrial process according to claim 1, and to a system for detecting anomalies or early indications of equipment failure in industrial equipment or a production plant by monitoring measurement data and/or process parameters originating from components used in the industrial process according to claim 15. The present invention includes an adaptive closed-loop and/or open-loop control device for automated closed-loop and/or open-loop control of grinding and roller systems, more particularly mill plants having roller frames, and mill systems and grinding plants in general. In addition to its application to control devices for controlling and manipulating grinding and roller systems, the present invention also relates generally to systems and methods for detecting and measuring anomalies in signaling originating from components used in industrial processes. Possible applications of the device according to the invention in grinding and roller systems also include: real-time or near real-time measurement and monitoring of one or more roll drive operating parameters such as roll temperature, roll gap, roll speed, roll nip force, and/or power delivery; and/or real-time or near real-time measurement of composition or quality parameters, e.g. measured variables such as water content, protein content, starch damage, ash content (minerals) of the flour (or grinding intermediate), residual starch content, fineness of grind, etc., during production regulation and processing in grain grinding plants, for process monitoring (measurement, monitoring) and open-loop and/or closed-loop control of the plants and processes.

Background

In industrial process and plant settings, control systems are used to monitor, control, manipulate and signal facilities, plants or other equipment, operations/processes, etc. of an industrial or chemical process. Typically, systems that perform control and monitoring use field devices distributed at key locations in the industrial process that are coupled to control circuitry through a process control loop. The term "field device" refers to any device that performs a function in a distributed control or process monitoring system, including all devices used in measurement, such as sensors and measurement devices, control, monitoring and signaling of industrial processes, and processing facilities. For example, each field device may include a communication device and circuitry for communicating, particularly wired or wireless, with a process controller, other field devices, or other circuitry over a process control loop. In some installations, the process control loop is also used to deliver a regulated current and/or voltage to the field device to power the field device. The process control loop also carries data in analog or digital format. Typically, field devices are used to sense or control process variables in industrial processes and/or particular equipment to monitor the local environment of the field device, if desired.

One of the technical problems of such systems is that control and monitoring of large industrial assets (e.g., in grain mills or food processing plants) often produces a large amount of process and alarm/fault data. Many alarms/faults are often triggered, but they are typically ignored or switched off to keep the process and control running. Furthermore, alarms/faults can be triggered by simple maintenance events on the machine that give no further cause for technical concern. Likewise, process data from, for example, motor current may frequently display atypical values relative to a threshold, and such values do not always warrant attention as individual events. There is a need for automated extraction of alarm/fault signaling into a short list of significant events that are abnormal with respect to typical operation. This short list will enable effective preventive maintenance and root cause analysis of downtime events, which are very costly and should be minimized in an industrial process. Identifying abnormal patterns in advance is a challenge, and therefore an unsupervised approach is desirable.

Furthermore, machines or other industrial equipment, such as factories, engines, mills or turbines, etc., fail for a variety of reasons. As mentioned above, known plant or machine faults are typically detected by sensors and, once a fault is detected, the fault is reported to an operator for correction or signaled to an appropriately assigned alarm device. However, conventional strategies employed for detecting faults are typically developed based on known problems that have previously occurred in the machine, plant, or device. These previously occurring situations may be determined by automatically inferring sensor profiles corresponding to known abnormal behavior associated with a particular problem. However, for problems that never occurred before, failures tend to occur without any warning or previous indication. In such cases, the cost of repair may be significantly greater than if the fault was detected early. Furthermore, the later detection of a fault or impending fault may compromise the safety of the machine. Accordingly, it is desirable to provide systems and methods for detecting unknown abnormal behavior in a machine in an automatic and accurate manner.

Anomaly detection from sensor data is an important application of data mining, particularly in grain mills and food processing plants. Using grinding production as an example, remote monitoring of equipment is an important part of the production process in order to ensure safe and optimal grinding and to prevent major system failures. A key task of remote monitoring is anomaly detection, i.e. detecting an early indication of a fault before it occurs. For example, roll pressure and roll temperature are key elements in ensuring stable production and are monitored for this purpose. In the prior art, much effort has been put into automating anomaly detection, but this remains a very challenging task. As already discussed in part above, there are several technical challenges. Sensor data such as temperature, pressure, displacement, flow rate, vibration, etc. are noisy, sensor values may change discontinuously, and correlation structures may change even daily. Intelligent monitoring and detection systems and methods therefore need to handle unwanted noise automatically. Dependencies among multiple variables are important, so variables should not be analyzed individually, as this may generate false alarms. Furthermore, monitored systems are often unstable because operating conditions may change over time, for example environmental conditions such as air pressure or relative/local air humidity. Therefore, diagnostic information is also needed, such as which variables exhibit abnormalities. However, known prior art methods often have serious problems in practice and are not able to handle both multiple operating modes and variable-wise anomaly scores for multivariate data. Most systems are not efficient at providing variable-wise information, which is particularly problematic in many industrial applications where the dimension of the measurement parameters can often be large.

US 2011/288836 discloses a method and system for detecting anomalies in an aircraft engine. The method and system define a behavior model of a controller of an aircraft engine using time regression that models the behavior of the controller as a function of a data set that is related to the controller and that includes a measure of past behavior of the controller and command and state measurements for the controller; continuously recalculate the behavior model for each new data set; and monitor the statistical change of the behavior model to detect a behavior abnormality of the controller representing an operation abnormality of the engine. US 2016/371600 A discloses a system and method of monitoring data recorded from a system over time. The techniques described therein include the ability to detect and classify system events and to provide an indication of normal system operation and anomaly detection. The systems and methods of that disclosure may represent events occurring in a monitored system in such a manner that temporal characteristics of the events may be captured and used for detection, classification, and/or anomaly detection, which may be particularly useful when dealing with complex systems and/or events. US 2017/139398 discloses a plurality of production facilities and an analysis device connected by a fog network. The analysis device performs data analysis based on detection information of a detector acquired through the fog network, and stores, as a result of the data analysis, determination information on an abnormality of each of the plurality of production facilities or an abnormality of the production object. Each of the plurality of production facilities determines an abnormality of each of the plurality of production facilities or an abnormality of the production object based on the determination information stored in the analysis device.
EP 3379360 discloses that the abnormality detection system 1 includes an arithmetic device 1H101, the arithmetic device 1H101 performing the following processing: a process of learning a predictive model that predicts a behavior of a monitoring target device based on operation data on the device; a process of adjusting the abnormality score based on a deviation of the operation data acquired from the monitoring target device from a predicted result obtained by the predictive model so that the abnormality score of the operation data under normal operation falls within a predefined range; a process for detecting an anomaly or an indication of an anomaly based on the adjusted anomaly score; and a process of displaying information on at least one of the abnormality score and the detection result on an output device.

Finally, in prior art systems, alarm/fault messages are typically recorded in automation system software and control systems, where it is difficult to drill down into, explore and view trends in such log-form data. Thus, operators rely on their own observations of alarm/fault events to monitor plant health. There is a need for signaling and visualization that allows for better output of alarm/fault messages so that operators can easily track plant operation, safety and health conditions. The visualization will also enable other individuals, such as owners and maintenance providers, to gain insight and communicate better with the operator. Avoiding/measuring plant downtime in large process plants is important because it represents a significant loss of revenue.

Disclosure of Invention

The object of the present invention is to overcome the drawbacks and technical problems known from the prior art. In particular, it is an object to provide an accurate and efficient control system and method for detecting anomalies in measurement data and sensor data originating from components used in industrial processes. The system should be able to provide automated techniques to efficiently condense large amounts of alarm/fault information into triggered, significant events that are anomalous with respect to typical operation. The system should be able to perform the control and monitoring process in real time or near real time. More specifically, the object of the present invention is to provide an intelligent, adaptive open/closed loop control device for the automatic optimization and control of the grinding lines of a roller system, which can be used to perform grinding in an optimized and automatic manner, and which increases the reliability of the mill and at the same time optimizes the operation by automatically reacting to the anomalies that occur.

The objects of the invention are achieved and attained according to the present invention by means of the elements and combinations particularly pointed out in the independent and dependent claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as described.

According to the present invention, the above-mentioned objects are achieved for a system and method for detecting anomalies or early indications of equipment failure in industrial equipment or a production plant by monitoring sensor data or measurement data originating from components used in the industrial process, in particular: in that sensor data and/or measurement data of a component used in the industrial process are measured by means of a measuring device or sensor, and time frames or time periods of equal size are identified within a data stream of received sensor data and/or measurement data within a time period of normal operation of the component used in the industrial process, the sensor data and/or measurement data comprising sensor values of a plurality of measurement parameters; in that, for each of the identified time frames of equal size, the sensor values of the plurality of measurement parameters are converted into observable binary processing codes, and the binary processing codes are assigned to a data store or data structure holding a sequence of storable Markov chain states; in that a multi-dimensional data structure comprising a definable number of variable hidden Markov model parameter values is generated, wherein the variable model parameters of the multi-dimensional data structure are determined by means of a machine learning module applied to the assigned sequence of binary processing codes, and wherein the variable hidden Markov model parameters of the multi-dimensional data structure are varied and trained by learning the normal state frequencies of alarm events occurring on the basis of the sensor data and/or measurement data of the identified time frames of equal size; in that a plurality of probability state values are initialized and stored by applying the trained multi-dimensional data structure with the variable hidden Markov model parameter values to pre-sampled binary processing codes of time frames of the same size as those of the measured sensor data and/or measurement data; in that a logarithmic threshold value of an anomaly score is determined by sorting the logarithmic result values of the stored probability state values; and in that the trained multi-dimensional data structure with the variable hidden Markov model parameter values is deployed to monitor newly measured sensor data and/or measurement data from industrial equipment or a plant, using the threshold value of the anomaly score to detect anomalous sensor data values that may be indicative of an impending system failure, wherein, for triggering on an anomalous sensor data value, a logarithmic result value of the probability state value of the newly measured sensor data and/or measurement data is generated and compared to the stored probability state values based on the logarithmic threshold value of the anomaly score.

Furthermore, there are different approaches to providing distances between binary vectors for effective correlation, such as the Hamming distance based on a window of n rows (for n = 1, this is the classical Hamming distance). For vectors a and b, the distance is equal to the number of positions at which a and b differ over a window of n rows, divided by the length of a. Another approach is based on the Jaccard distance, $J(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|}$. In an implementation variant, the distance may be generated periodically, and the algorithm is able to detect an anomaly if the effective correlation obtained by the example methods described above is anomalous.
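The two binary vector distances mentioned above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names and the example alarm vectors are hypothetical, the Hamming distance is shown only in its classical form (the window of n rows is not modeled), and treating two all-zero vectors as identical in the Jaccard case is an assumption of this sketch.

```python
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Classical Hamming distance, normalized by the vector length:
    the fraction of positions at which a and b differ."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jaccard_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard distance 1 - |A n B| / |A u B|, where A and B are the
    sets of positions at which a and b hold a 1."""
    if len(a) != len(b):
        raise ValueError("vectors must be of equal length")
    set_a = {i for i, x in enumerate(a) if x}
    set_b = {i for i, x in enumerate(b) if x}
    union = set_a | set_b
    if not union:
        # Assumption of this sketch: two all-zero vectors are identical.
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# Hypothetical binary alarm vectors of two components over eight time steps.
alarm_a = [1, 0, 1, 1, 0, 0, 1, 0]
alarm_b = [1, 0, 0, 1, 0, 1, 1, 0]

print(hamming_distance(alarm_a, alarm_b))  # 2 differing positions out of 8
print(jaccard_distance(alarm_a, alarm_b))  # 3 shared ones, 5 ones in the union
```

Computing such a distance periodically for pairs of alarm signals would give a time series of distances that can itself be checked for atypical values.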

It is important to note that the system and method of the present invention work in principle both with and without conversion of the analog signal into a (threshold-based) binary signal or code. However, converting an analog signal into a binary signal or code has, inter alia, the following advantages. Time series anomaly detection typically relies on a threshold and a moving average or similar value to detect anomalies. As a result, too many anomalous events may be detected because of oscillating/noisy signals (a typical situation in industrial processes). For example, for time series anomaly detection, the threshold can be adjusted for more or less sensitivity (see FIG. 13, where anomalies are marked by gray vertical lines). In the present method, each event that exceeds the threshold is assigned a true/1 value (false/0 otherwise) to generate a binary sequence. This makes it possible to look at the frequency of threshold crossings, so that anomalies can be classified based on that frequency. One advantage is that there is no need to worry about excessive sensitivity to thresholds and to oscillating/noisy industrial IoT data. Beyond these advantages of binary conversion, extensions of the algorithm can also be used to find anomalies in analog process data. A moving average and variance threshold may be applied to generate the binary sequence, after which the above anomaly detection algorithm may be used. As a result, anomalies are identified when a process exceeds a threshold atypically. FIG. 14 illustrates anomalies in process data. In FIG. 9, a binary sequence is generated based on a threshold value applied to the process data. In the subsequent steps, the described anomaly detection method is applied to the binary sequence and the anomalous time periods are marked accordingly.
It is important to note that the above-described inventive system and method of converting analog signal anomalies into binary vectors and then applying the statistical HMM anomaly detection according to the present invention, i.e., a Hidden Markov Model (HMM) based structure, is technically unique and not provided by any prior art system. The present invention uses HMMs for anomaly detection, whereas prior art systems use different techniques to label anomalies using HMMs. In particular, prior art systems do not use the threshold steps that are being used by the present invention. Furthermore, the prior art system does not mention the conversion of an analog signal into a binary sequence, which is also part of the differentiator of claim 1.
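The conversion of an analog signal into a binary sequence by means of a moving average and variance threshold, as described above, can be sketched as follows. This is an illustrative Python sketch only: the window length, the factor `k`, the function names and the motor current values are assumptions and do not come from the patent.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def to_binary_sequence(values: Sequence[float], window: int = 5,
                       k: float = 3.0) -> List[int]:
    """Return 1 where a sample deviates from the trailing moving average
    by more than k trailing standard deviations, else 0.
    The first `window` samples have no full history and are set to 0."""
    bits = [0] * len(values)
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if abs(values[i] - mu) > k * sigma:
            bits[i] = 1
    return bits


def crossings_per_frame(bits: Sequence[int], frame_size: int) -> List[int]:
    """Count threshold crossings in consecutive time frames of equal size."""
    return [sum(bits[i:i + frame_size])
            for i in range(0, len(bits), frame_size)]


# Hypothetical motor current samples with one spike at index 6.
motor_current = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 14.5, 10.0, 9.8, 10.2]

bits = to_binary_sequence(motor_current)
print(bits)                          # [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
print(crossings_per_frame(bits, 5))  # [0, 1]
```

The per-frame counts are the kind of threshold-crossing frequency that the subsequent anomaly detection step operates on, rather than the individual crossings themselves.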

With respect to potential inventive-step or obviousness objections to any of the claims, this is much more subjective than novelty, and I cannot comment on how patent officials in various regions will apply their subjective judgment and local laws. While the use of HMMs is certainly a well-known technique, the claims are very specific to the application (grain milling, etc.), and the description is quite specific to the exact implementation that is suited to our process. There is a good body of literature on anomaly detection with HMMs, but none of it teaches that HMMs are applied specifically to these types of industrial processes, nor to alarm data. Furthermore, scientific papers/topics are often ambiguous because they do not state claims as such.

For example, the machine learning module may process the sequence of assigned binary processing codes by applying maximum likelihood parameter estimation to train the multi-dimensional data structure with the variable hidden Markov model parameters, wherein the elements of the sequence of storable Markov chain parameter states are assumed to be measurements independent of each other, and wherein the model parameters of the multi-dimensional data structure are varied by maximizing the product of probabilities in order to obtain the trained model parameters of the multi-dimensional data structure. For example, the model parameters of the multi-dimensional data structure may be varied iteratively until a predefined convergence threshold is exceeded. To determine the threshold of the anomaly score, an averaging process may be applied, for example, based on the different frequencies of occurrence of alarm events in the sensor data and/or measurement data of the identified time frames. The invention has the following advantages: it provides novel methods and systems for automatically detecting and triggering on individual anomalies in data derived from components used in industrial processes. It provides an efficient automated system for controlling and monitoring large industrial assets (e.g., in grain mills or food processing plants) that often produce large amounts of process and alarm/fault data that are difficult to process.
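To make the scoring and threshold steps concrete, the following Python sketch computes the log-likelihood of a binary time frame under a discrete hidden Markov model and derives a logarithmic threshold by sorting the log values of frames from normal operation. It is a sketch under stated assumptions, not the patented implementation: the iterative maximum likelihood training itself is omitted, the two-state parameters `START`, `TRANS` and `EMIT` are hypothetical stand-ins for trained values, and taking a low quantile of the sorted values as the threshold is one possible choice.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def log_likelihood(obs: Sequence[int], start: List[float],
                   trans: Matrix, emit: Matrix) -> float:
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n))
                     * emit[s][obs[t]] for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_prob += math.log(scale)          # accumulate log of scaling factors
        alpha = [a / scale for a in alpha]   # renormalize to avoid underflow
    return log_prob


def log_threshold(normal_frames: Sequence[Sequence[int]], start: List[float],
                  trans: Matrix, emit: Matrix, quantile: float = 0.05) -> float:
    """Sort the log values of normal frames and return a low quantile."""
    scores = sorted(log_likelihood(f, start, trans, emit) for f in normal_frames)
    return scores[int(quantile * (len(scores) - 1))]


# Hypothetical "trained" parameters: state 0 = quiet, state 1 = alarm-prone.
START = [0.9, 0.1]
TRANS = [[0.95, 0.05], [0.30, 0.70]]
EMIT = [[0.98, 0.02], [0.60, 0.40]]   # EMIT[state][observed bit]

normal_frames = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
threshold = log_threshold(normal_frames, START, TRANS, EMIT)

new_frame = [1, 1, 1, 1, 1, 1, 1, 1]  # alarm bit set in every time step
score = log_likelihood(new_frame, START, TRANS, EMIT)
print(score < threshold)  # True: the frame is less likely than any normal frame
```

Raising or lowering `quantile` changes how many frames fall below the threshold, which corresponds to the adjustable sensitivity described in the text.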

It should be noted that the machine is stopped for each relevant alarm (e.g., violation of a roll temperature limit, exceeding a roll pressure threshold, etc.), regardless of the anomaly detection system according to the present invention. However, the present invention provides novel systems and methods for unsupervised anomaly detection, for example, associated with industrial multivariate time series data. Unsupervised detection can be very important especially in "unknown-unknown" scenarios, where the operator is unaware of the underlying fault and does not observe any previous occurrences of such unknown faults. The system of the present invention may also provide data quality assessment, missing value padding, and additional or new feature generation, validation, and assessment. The present invention allows for the determination of unknown faults based on comparing normal operating profiles (e.g., all sensors indicating values within a normal range) with differences reported in the current state of operation. Sensors may be associated with various measurable elements of a machine, such as vibration, temperature, pressure, and environmental changes. In some cases, determining an unknown fault is related to discovering an impending fault (e.g., early detection). In some cases, determining unknown faults is related to early detection and other situations where faults may have occurred in the past but have an impact on current operation. Furthermore, the invention allows to effectively filter and distinguish alarms/faults triggered by simple maintenance events on the machine and not causing concern. This also applies to process data from, for example, motor currents, which may frequently display atypical values based on threshold values and do not always draw attention to individual events. 
The present invention allows alarm/fault data streamed from sensors and measurement devices to be condensed efficiently and automatically into a short list of significant events that are anomalous relative to typical operation. This short list provides the basis for a novel approach to effective preventive maintenance and root cause analysis of shutdown events, which are very costly and should be minimized in an industrial process. The invention allows abnormal patterns to be identified in advance, making an unsupervised, fully automated method of controlling and monitoring the correct operation of the machine technically possible. Thus, the invention allows unsupervised anomaly detection, in particular for industrial multivariate time series data. Unsupervised detection is very important in "unknown-unknown" scenarios, where the operator is unaware of the potential fault and has not observed any previous occurrence of such unknown faults. The present invention is able to compare the normal operation or machine/engine profile (e.g., all sensors indicating values within a normal range) with the reported differences in the current state of the machine/engine to determine an unknown fault. Sensors may be associated with various measurable elements of a machine, such as vibration, temperature, pressure, and environmental changes. In some cases, determining (e.g., evaluating) an unknown fault is related to discovering an impending fault (e.g., early detection). In some other cases, determining unknown faults is related to early detection and to situations in which the faults occurred in the past. Furthermore, the present invention allows insight to be gained into, and trends to be viewed in a new way in, sensor data and/or alarm/fault messages in log form, which also makes continuous monitoring of alarm/fault events by the operator superfluous. 
The present invention also allows for novel monitoring of alarm/fault messages so that operators can easily track plant operation and health conditions. The novel monitoring also enables other individuals, such as owners and maintenance providers, to gain automated insight and better communicate with operators. The present invention allows avoiding/measuring plant down time in large process plants as it represents a significant loss of revenue.

In an embodiment variant, for example, the sensitivity of the selected time frame may be automatically adjusted based on a dynamic adjustment of the threshold. This embodiment variant has the following advantages, among others: the convergence speed may be optimized by training the variable hidden Markov model parameters of the multi-dimensional data structure.

In another embodiment variant, for example, anomalous time frames are evaluated across many assets of the same industrial production line, wherein the anomalous time frames are applied to a root cause analysis of plant downtime for triggering on the anomaly score. Further, as a variant, maintenance service signaling may be generated, for example, based on the root cause analysis of plant downtime. This embodiment variant has, among other things, the advantage of allowing a robust application of the invention across various assets and industrial production lines. Another advantage is that this embodiment variant allows for cloud-based and/or network-based automatic maintenance and/or service applications and signaling.

In a further embodiment variant, in order to determine the threshold value of the anomaly score, a frequency pattern is generated for each of the identified time frames of equal size using pattern recognition to initialize a plurality of Markov chain sequences of storable parameter states, wherein each storable parameter state is a function of a plurality of measurement parameters, wherein, by means of the applied pattern recognition, a weighting factor and/or a mean and/or a variance of each of the plurality of sequences of storable parameter states is determined, and irrelevant time frames are removed from the set of identified time frames of equal size used. This embodiment variant has the following advantages, among others: the convergence speed may be optimized by training the variable hidden Markov model parameters of the multi-dimensional data structure. Furthermore, the pattern recognition and weighting factors allow an associated anomaly measure to be determined for each of the variables in a noisy data sample by comparing the measured data sample with reference data, even when some of the variables are highly correlated. Thus, spurious dependencies introduced by noise can be removed by focusing on the most important dependencies of each variable. For example, neighborhood selection may be performed in an adaptive manner by fitting a sparse graphical Gaussian model as the maximum likelihood estimate. The associated anomaly measure for each measured parameter can then be generated from the distance between the fitted conditional distributions.

In an embodiment variant, the gating signal is generated as a digital signal or pulse, thereby providing an appropriate time window, wherein an abnormal time frame of occurrence of the newly measured sensor data is selected from a number of measured time frames of the measurement data, and the normal time frame may be eliminated or discarded, and wherein the selection of the abnormal time frame of occurrence triggers an appropriate signaling generation and conversion to the assigned alarm and/or monitoring and/or control/steering device. This embodiment variant has the following advantages, among others: it allows for efficient inter-machine signaling by generating appropriate handling signaling that controls the operation of the associated equipment triggered by abnormal or early indications of detected equipment failures in industrial equipment or production plants.
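The selection of anomalous time frames by a gating signal can be pictured with a short Python sketch. It assumes that each time frame already carries a logarithmic score and that a threshold has been determined; the function names, the scores, the threshold and the frame length of 60 samples are hypothetical illustration values, and the downstream alarm or control signaling is not shown.

```python
from typing import List, Sequence, Tuple


def gating_signal(frame_scores: Sequence[float], threshold: float) -> List[int]:
    """Digital gate per time frame: 1 for an anomalous frame (score below
    the threshold), 0 for a normal frame that can be discarded."""
    return [1 if score < threshold else 0 for score in frame_scores]


def gated_windows(gate: Sequence[int], frame_size: int) -> List[Tuple[int, int]]:
    """Merge consecutive open gates into (start, end) sample windows,
    with `end` exclusive."""
    windows: List[Tuple[int, int]] = []
    start = None
    for i, is_open in enumerate(gate):
        if is_open and start is None:
            start = i
        elif not is_open and start is not None:
            windows.append((start * frame_size, i * frame_size))
            start = None
    if start is not None:
        windows.append((start * frame_size, len(gate) * frame_size))
    return windows


# Hypothetical log scores of six consecutive time frames.
scores = [-3.1, -2.8, -9.4, -8.7, -3.0, -10.2]
gate = gating_signal(scores, threshold=-6.0)

print(gate)                     # [0, 0, 1, 1, 0, 1]
print(gated_windows(gate, 60))  # [(120, 240), (300, 360)]
```

Only the windows returned by `gated_windows` would need to be forwarded to an assigned alarm, monitoring or control device in such a setup.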

In a further embodiment variant, the above-described method and system for automatically detecting anomalies or early indications of equipment failure in industrial equipment or a production plant is applied to an intelligent, adaptive closed-loop and open-loop control method for a closed-loop and/or open-loop control device for the self-optimizing control of the grinding line of a mill plant and/or of a roller system of a mill plant, wherein the grinding line comprises a plurality of processing units which can each be driven individually by means of the closed-loop and open-loop control device and can be adjusted individually during operation on the basis of operating process parameters, wherein the closed-loop and open-loop control device comprises a pattern recognition module based on the above-described method for detecting anomalies, the operation of the control device being triggered by the signaling of the pattern recognition module, and wherein the operation of the mill plant is manipulated and adjusted by means of the control device on the basis of the transmitted trigger signal. As a variant, for example, the closed-loop and/or open-loop control device may comprise a batch controller with a defined process sequence in the processing units, which is adjustable by means of an operational process recipe and the control device, wherein a defined number of end products may be produced from one or more starting materials by means of the operational process recipe, wherein the processing units are controlled on the basis of operational batch process parameters specifically associated with the operational process recipe, and wherein the operational batch controller is adjusted or optimized by means of the control device on the basis of one or more occurring and detected anomalous time frames of the newly measured sensor data. 
For example, the control device may further comprise a second pattern recognition module for recognizing an operational process recipe having a multidimensional batch process parameter pattern, wherein the operational process recipe comprises at least one or more starting materials, a defined sequence of grinding processes within the processing units of the grinding line, and operational batch process parameters stored in association with the respective processing units of the grinding line, wherein the closed-loop and open-loop control device comprises a memory device for storing historical operational process recipes having historical batch process parameters, wherein the historical batch process parameters of the process recipe each define a process-typical, multidimensional batch process parameter pattern of an optimized batch process within a normal range, wherein an input of a new operational process recipe results in a pattern recognition by means of the pattern recognition module based on the associated multidimensional batch process parameter pattern, triggering and/or selecting one or more of the stored historical operational process recipes as the closest batch process parameter mode, and wherein upon detection of one or more occurring abnormal time frames of the newly measured sensor data by means of the control means based on the triggered closest batch normal process parameter mode, a new batch process parameter mode having new batch process parameters is generated by means of the closed-loop and open-loop control means, and the processing unit is actuated and adjusted by means of the closed-loop and open-loop control means as appropriate based on the generated operational process recipe having associated batch process parameters. 
This embodiment variant has, among others, the following advantages: it provides an intelligent, adaptive open-loop/closed-loop control device for the automatic optimization and control of the grinding line of a roller system, which can be used to perform the grinding in an optimized and automated manner, to increase the reliability of the mill, and at the same time to optimize operation or react automatically to anomalies that occur.

It is noted that sensors and measurement devices, which are field devices, are used to sense or control process variables of industrial equipment or plants in industrial processes. However, in some installations, it may be desirable to monitor the local environment of the field device. Further note that, for example, the anomaly detection system or method may include additional threshold evaluator modules to store upper and lower thresholds for each received technical state data for one or more of the respective sensor signals. The threshold evaluator compares the received technical state data with a threshold and generates an anomaly indication of the particular technical state data regardless of the evaluation of the anomaly detection system or method if the respective data value falls outside of the interval defined by the respective upper and lower thresholds. In other words, threshold-based sensor data evaluation may provide a shortcut for detection of an indication of an anomaly. If a particular technical state data value is outside of a tolerance range defined by upper and lower thresholds, then the corresponding anomaly indication is detected immediately regardless of the results provided by the anomaly detection. For example, the threshold may be predefined (e.g., by an operator) based on prior experience, or the threshold may be learned by the machine learning module as a shortcut value from historical sensor data.
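
The threshold shortcut described above can be illustrated with a minimal Python sketch. The names `ThresholdBand`, `shortcut_anomaly`, `evaluate` and the signal name `roll_temperature` are hypothetical and chosen only for illustration; the sketch assumes that the anomaly detection result is already available as a boolean flag per sensor signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdBand:
    """Tolerance interval for one sensor signal (illustrative type)."""
    lower: float
    upper: float


def shortcut_anomaly(value: float, band: ThresholdBand) -> bool:
    """Return True if the value lies outside the tolerance interval."""
    return value < band.lower or value > band.upper


def evaluate(readings: dict[str, float],
             bands: dict[str, ThresholdBand],
             model_flags: dict[str, bool]) -> dict[str, bool]:
    """Combine the threshold shortcut with the anomaly detector's result.

    A reading outside its band is flagged regardless of the detector's
    verdict; otherwise the detector's verdict is used (False if absent).
    """
    result = {}
    for name, value in readings.items():
        band = bands.get(name)
        out_of_band = band is not None and shortcut_anomaly(value, band)
        result[name] = out_of_band or model_flags.get(name, False)
    return result


# Example: an out-of-band reading is flagged even though the detector saw nothing.
bands = {"roll_temperature": ThresholdBand(lower=20.0, upper=80.0)}
flags = evaluate({"roll_temperature": 95.0}, bands, {"roll_temperature": False})
```

Whether the bounds are entered by an operator or learned from historical sensor data does not change this evaluation step; only the source of the `ThresholdBand` values differs.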

Other aspects of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as described.

Drawings

The invention will be described in more detail, by way of example, with reference to the accompanying drawings, in which:

Fig. 1 shows a diagram schematically illustrating a monitoring and adaptation process in an industrial plant with intelligent, adaptive control means for the adaptive control of the industrial plant.

Fig. 2 shows a diagram schematically illustrating a monitoring and adaptation process in an industrial plant. Data from the sensors is sent periodically, for example, every 3 minutes.

Figs. 3 to 12 illustrate plant fault/pause events (downtime) in the quality grinding section of the exemplary plant according to Fig. 2. The main fault pause events due to mechanical error in the grinding/cleaning/first cleaning section are shown in the operational state overview. The exemplary data for 2017 and 2018, respectively, show frequent failures at the single sensor level. Anomalies are shown at the mill section level and for each sensor. Signaling for optimization and preventative maintenance is also shown.

Figure 3 shows a graph schematically illustrating monitoring of exemplary mill yields over a period of 2017 to 2018.

Fig. 4 shows a graph schematically illustrating monitoring of an exemplary F1 production over a period of 2017 to 2018.

Fig. 5 shows a graph schematically illustrating monitoring of an exemplary grinding section pause summary over a period of 2017 to 2018. In this example, the total number of pause events in 2018 is 80, with a total duration of 2 days, 27 hours, and 11 minutes. The longest pause in the grinding section was 14 hours 51 minutes, on January 1, 2018. The total number of pause events in 2017 was 275, with a total duration of 9 days, 8 hours, and 58 minutes. The totals do not include missing events.

Figs. 6a to 6o show graphs schematically illustrating error monitoring of an exemplary milling plant over the period from November 1 to November 30, 2017.

Fig. 7 shows a graph schematically illustrating a summary of error monitoring frequencies for an exemplary grinding plant over a time period of 2017 to 2018. The figure shows plant pause events for mechanical error failures of the second cleaning section (MUEPS001), the grinding section (MUEPS002) and the first cleaning section (RE1PS001) in cycles. Periods of missing data longer than 10 minutes are plotted at the top. The vertical bar indicates when a fault occurred and is enlarged (15 h) to make short time scale fault events visible. A thicker vertical line indicates a longer fault event or several short fault events close together. Failures shorter than 3 minutes (data sampled every 3 minutes) are not included.
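
As an illustration of how pause events and their summary figures (count, total duration, longest pause) could be derived from fault flags sampled every 3 minutes, the following Python sketch groups consecutive fault samples into events. The function names `fault_events` and `summarize` are hypothetical, and treating each fault sample as covering one full sampling interval is an assumption of this sketch, not a statement from the description.

```python
from datetime import datetime, timedelta

SAMPLE_INTERVAL = timedelta(minutes=3)  # sampling period stated in the text


def fault_events(start: datetime, samples: list[bool]) -> list[tuple[datetime, timedelta]]:
    """Group consecutive fault samples into (start_time, duration) events."""
    events = []
    run_start = None  # index of the first sample of the current fault run
    for i, faulty in enumerate(samples):
        if faulty and run_start is None:
            run_start = i
        elif not faulty and run_start is not None:
            events.append((start + run_start * SAMPLE_INTERVAL,
                           (i - run_start) * SAMPLE_INTERVAL))
            run_start = None
    if run_start is not None:  # a fault run that lasts until the end of the data
        events.append((start + run_start * SAMPLE_INTERVAL,
                       (len(samples) - run_start) * SAMPLE_INTERVAL))
    return events


def summarize(events: list[tuple[datetime, timedelta]]) -> tuple[int, timedelta, timedelta]:
    """Return (number of events, total duration, longest single event)."""
    durations = [duration for _, duration in events]
    total = sum(durations, timedelta())
    longest = max(durations, default=timedelta())
    return len(events), total, longest


# Example with five samples: two separate fault events.
events = fault_events(datetime(2018, 1, 1), [False, True, True, False, True])
count, total, longest = summarize(events)
```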

Fig. 8a and 8b show diagrams schematically illustrating exemplary top 10 fault alarms divided by duration for 2018.

Fig. 9 shows a graph schematically illustrating error monitoring in the cleaning section of an exemplary plant over a period of 2017 to 2018. The system and method of the present invention allow signaling to be generated if preventative maintenance or further monitoring is required for the sorter. The anomaly detection method of the present invention is capable of identifying devices that require preventative maintenance or monitoring. The figure shows preliminary results of detecting abnormal failure frequencies, with the anomalous weeks for the cleaning section marked in orange. The vertical bar indicates when a fault occurred and is slightly enlarged to make short time scale fault events visible. The color bars indicate the classification of each fault signal. The time period marked grey is assumed to be typical plant operation. The time period marked blue is classified as normal operation, and the weeks marked orange are classified as abnormal. The classification of the missing data period as normal operation is not indicated. Note that the sorting machine, high level sensor-WT, flow balancer 203, and cleaning section are merely examples of different machines. Thus, in Fig. 9, the sorter may also be referred to more generally as "machine 1" of the overall system, the high level sensor-WT as "machine 2", the flow balancer 203 as "machine 3", and the cleaning section as "machine 4".

Fig. 10 shows a graph that schematically illustrates error monitoring in the cleaning section of an exemplary grinding plant by sensor location in the plant. Reference numerals having the form a-xxxx denote sensors and measurement devices that capture measurement data during operation of the cleaning section of the grinding plant and are employed at different locations in the process.

Fig. 11 shows a graph schematically illustrating error monitoring in the cleaning section of an exemplary grinding plant by a weight scale and a flow balancer.

Fig. 12 shows a diagram schematically illustrating the error/fault correlation in sensor data and measurement data. The system and novel method of the present invention for a control device that detects anomalies in the operation of a plant is capable of handling a large number of correlations and sensor values. The chord chart of Fig. 12 shows simultaneous failures, which indicate a possible correlation between mechanical failures.
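
The kind of data behind such a chord chart can be sketched as a count of simultaneous failures per pair of machines. The following Python sketch is illustrative only: the function name `co_occurrence` and the machine labels are hypothetical, and it assumes that each machine's fault signal is available as a 0/1 list on a common time grid.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(faults: dict[str, list[int]]) -> Counter:
    """Count the time steps at which two machines report a fault simultaneously.

    faults maps a machine label to its 0/1 fault sequence; all sequences are
    assumed to share the same sampling grid.
    """
    counts = Counter()
    for a, b in combinations(sorted(faults), 2):
        both = sum(1 for x, y in zip(faults[a], faults[b]) if x and y)
        if both:
            counts[(a, b)] = both
    return counts


# Example: machines 1 and 2 fail together at two time steps.
pairs = co_occurrence({
    "machine 1": [1, 0, 1, 1],
    "machine 2": [1, 0, 0, 1],
    "machine 3": [0, 0, 0, 0],
})
```

A high count for a pair only suggests a possible correlation; it does not by itself establish a common cause.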

Fig. 13 shows a diagram which schematically shows an embodiment variant of the invention with binary conversion of the process data. Typically, time series anomaly detection algorithms rely on thresholds and moving averages or similar values to detect anomalies. As a result, too many abnormal events may be detected when the signal oscillates or is noisy (a typical situation in industrial processes). In the present invention, the threshold can be adjusted for more or less sensitivity (see Fig. 13, where anomalies are marked by vertical lines). Each event that exceeds the threshold is assigned a true/1 value (false/0 otherwise) to generate a binary sequence. This makes it possible to examine the frequency of threshold crossings, so that anomalies can be classified based on their frequency. A technical advantage is that excessive threshold sensitivity and oscillating or noisy industrial IoT data are no longer a concern.
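
The binary conversion can be illustrated with a short Python sketch. The function names `binarize` and `crossings_per_frame`, the sample values and the frame size are assumptions made for this example; the sketch treats every sample above the threshold as a 1 and counts the 1s in each equally sized time frame.

```python
def binarize(values: list[float], threshold: float) -> list[int]:
    """Map each sample to 1 if it exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def crossings_per_frame(bits: list[int], frame_size: int) -> list[int]:
    """Count the 1s in each complete, equally sized time frame.

    Trailing samples that do not fill a whole frame are ignored.
    """
    n_frames = len(bits) // frame_size
    return [sum(bits[i * frame_size:(i + 1) * frame_size]) for i in range(n_frames)]


# Example: six samples, threshold 0.5, frames of three samples each.
bits = binarize([0.1, 0.9, 0.2, 0.8, 0.95, 0.3], threshold=0.5)
counts = crossings_per_frame(bits, frame_size=3)
```

Working on the per-frame frequency rather than on individual crossings is what makes the result less sensitive to the exact threshold value on noisy signals.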

Fig. 14 shows a diagram schematically illustrating anomaly detection of process data, wherein in a first step a binary sequence is generated based on a threshold value applied to the process data (see fig. 14). In a second step, the anomaly detection described herein is applied to the binary sequence and the anomaly time period is marked accordingly. Converting analog signal anomalies into binary vectors and then applying the statistical HMM (hidden markov model) anomaly detection structure according to the present invention is technically unique and cannot be derived from any prior art system.
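
The second step can be sketched in Python as scoring each binary time frame with the log-likelihood under a hidden Markov model and comparing it with a log threshold obtained by sorting the scores of normal frames. This is a minimal sketch under stated assumptions, not the patented implementation: the model parameters are taken as given (the training step is omitted), and the function names as well as the choice of a low-percentile cut-off (`fraction`) are hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete-emission HMM.

    obs   : non-empty sequence of 0/1 symbols
    start : start[i]    = P(first hidden state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][o]  = P(symbol o | hidden state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)          # P(obs[t] | obs[0..t-1])
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def log_threshold(normal_scores, fraction=0.05):
    """Sort the scores of normal frames and pick a low cut-off (assumed rule)."""
    ranked = sorted(normal_scores)
    index = min(int(fraction * len(ranked)), len(ranked) - 1)
    return ranked[index]


def is_anomalous(frame, threshold, start, trans, emit):
    """Flag a frame whose log-likelihood falls below the log threshold."""
    return log_likelihood(frame, start, trans, emit) < threshold
```

In this sketch a frame with many threshold crossings receives a low log-likelihood under a model that describes mostly quiet operation, so it falls below the cut-off and is marked as an anomalous time period.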

Fig. 15 shows further graphs that schematically illustrate anomaly detection of process data based on downtime and error sensor data. The left column shows downtime measurements in various sections over time. In the right column, the top graph shows machine faults for all machines measured, the middle graph shows the error time per machine, and the bottom graph shows the error frequency per day.

Fig. 16 shows further graphs schematically illustrating the error/fault correlation in the sensor data and the measurement data of the example shown in fig. 15. FIG. 16 illustrates how the system and method of the present invention for triggering sensor data or measurement data derived from components used in an industrial process to detect anomalies or early indications of equipment failures in an industrial equipment or production plant allows for the provision of appropriate manipulation signals based on detected and measured alarm frequencies and correlations and anomalies. The system of the invention thus allows a technically new way of operating as follows: triggering correlations between alarm events, and/or timely visualizing alarm events, and/or anomaly detection of abnormal downtime/alarms, and/or alarm playback and corresponding electronic signaling generation.

List of reference numerals

1 Industrial plant/production plant

11 production line

12 plant down time

13 monitoring device

14 control/operating device

15 alarm device

2 measuring device/sensor

3 equal sized time frames

31 anomalous time frame

4 measuring and/or processing parameters

41 sensor parameter/measurement parameter

42 process variable

43 abnormal sensor data value

5 Industrial Process Components/Industrial devices

6 Industrial Process

61 occurrence of an alarm event

611 of occurring alarm events

612 frequency pattern of alarm events occurring

7 System for detecting abnormal or early indication of equipment failure in an industrial equipment or production plant

71 monitoring device

8 machine learning module

81 multidimensional data structure

811, 812, …, 81x variable hidden Markov model parameter values

821, 822, …, 82x storable Markov chain states

831, 832, …, 83x trained model parameters

82 normal state frequency of alarm events

83 probability state value

84 logarithmic threshold

841 abnormal score

85 logarithmic result value

86 predefined convergence threshold

9 binary converter/differentiator

91 binary processing code

911 generated binary processing code

912 pre-sampled binary processing code

92 threshold value

Claims

1. A method for detecting an abnormal or early indication of an equipment failure (43) in an industrial equipment or production plant (1) by monitoring measurement data and/or process parameters (4) originating from components (5) used in an industrial process (6), characterized by the steps of:

measuring and/or monitoring measurement data of a component (5) used in an industrial process (6) and monitoring the process parameter (4) by means of a measuring device or sensor (2), and identifying a time frame (3) of equal size among the measurement and/or process parameters (4) for the time frame in case the component (5) used in the industrial process (6) is operating normally, the measurement and/or process parameters (4) comprising parameter values of a plurality of measurement parameters/sensor parameters (41) and/or process variables (42),

for each of the identified equally sized time frames (3), converting the parameter values (4) of the plurality of measured/sensor parameters (41) and/or process variables (42) into observable binary processing codes (91/911), and assigning the binary processing codes (91/911) to a sequence of storable Markov chain states (821, 822, …, 82x),

generating a multi-dimensional data structure (81) comprising a definable number of variable hidden Markov model parameter values (811, 812, …, 81x), wherein the variable model parameters (811, 812, …, 81x) of the multi-dimensional data structure (81) are determined by means of a machine learning module (8) applied to the sequence of storable Markov chain states (821, 822, …, 82x) with assigned binary processing code (91), and wherein the variable hidden Markov model parameters (811, 812, …, 81x) of the multi-dimensional data structure (81) are changed and trained by learning normal state frequencies (82) based on the measurement data and/or the process parameters (4) of identified equally sized time frames (3),

initializing and storing a plurality of probability state values (83) by applying a trained multi-dimensional data structure (81/831, 832, …, 83x) with the variable hidden Markov model parameter values (811, 812, …, 81x) to a pre-sampled binary processing code (912) of a time frame (3) of equal size as the parameter values (4) of the plurality of measured/sensor parameters (41) and/or process variables (42),

determining a log threshold (84) for an anomaly score (841) by sorting the stored log result values of the probability state values (83), and

deploying the trained multidimensional data structure (81/831, 832, …, 83x) with the variable hidden Markov model parameter values (811, 812, …, 81x) to monitor newly measured and determined measurement data and/or process parameters (4) from the industrial equipment or production plant (1) using the threshold value (84) of the anomaly score (841), in order to detect abnormal sensor data values (43) that may indicate an impending system failure, wherein, for triggering an abnormal sensor data value (43), a logarithmic result value (85) of the probability state value (83) of the newly measured and determined measurement data and/or process parameter (4) is generated and compared with the stored probability state values (83) based on the log threshold (84) of the anomaly score (841).

2. Method for detecting an abnormal or early indication of a device failure (43) in an industrial device or production plant (1) according to claim 1, wherein the binary processing code (91) is generated based on a threshold value (92) applied to the measurement data and/or the process parameter (4).

3. Method for detecting an abnormal or early indication of a device failure (43) in an industrial device or production plant (1) according to claim 2, characterized in that the bandwidth of the selected time frame (3) is automatically adjusted based on a dynamic adjustment of the threshold (92).

4. Method for detecting abnormal or early indication of equipment failure in an industrial system according to one of claims 1 to 3 characterized in that an abnormal time frame (31) is measured across many assets of the same industrial production line (11), wherein for triggering at the abnormal score (841) the abnormal time frame is applied for root cause analysis of plant downtime and maintenance service signaling is generated based on the root cause analysis of plant downtime (12).

5. Method for detecting an abnormal or early indication of a device failure (43) in an industrial device or production plant (1) according to one of claims 1 to 4, characterized in that the machine learning module (8) processes the sequence of assigned binaries (91) by applying a maximum likelihood parameter estimation, to train the multi-dimensional data structure (81) with the variable hidden Markov model parameters (811, 812, …, 81x), wherein the elements of the sequence of markov chains that can store parameter states (821, 822, …, 82x) are assumed to be measurements independent of each other, and wherein the model parameters of the multidimensional data structure (81) are changed by maximizing the product of the probabilities, in order to obtain trained model parameters (831, 832, …, 83x) of the multi-dimensional data structure (81).

6. Method for detecting an abnormal or early indication of a device failure (43) in an industrial device or production plant (1) according to claim 5, characterized in that the model parameters of the multidimensional data structure (81) are iteratively changed until a predefined convergence threshold (86) is exceeded.

7. Method for detecting an abnormal or early indication of a device failure (43) in an industrial device or production plant (1) according to one of the claims 1 to 6, characterized in that for determining the threshold value (84) of the abnormality score (841) an averaging process is applied based on the different frequencies of occurrence of alarm events (61) of the measurements and/or process parameters (4) of the identified time frame (31).

8. Method for detecting anomalies or early indications of equipment faults (43) in an industrial plant or production plant (1) according to one of claims 1 to 7, characterized in that, in order to determine the threshold (84) of the anomaly score (841), a frequency pattern (612) is generated for each of the identified time frames (31) of equal size using pattern recognition to initialize a plurality of Markov chain sequences of storable parameter states (821, 822, …, 82x), wherein each storable parameter state (821, 822, …, 82x) is a function of the plurality of measurement data and/or process parameters (4), wherein, by means of the applied pattern recognition, a weighting factor and/or a mean and/or a variance of each of the plurality of sequences of storable parameter states (821, 822, …, 82x) is determined, and removing irrelevant time frames from the set of identified time frames (3) of equal size used.

9. Method for detecting an abnormal or early indication of a device failure (43) in an industrial device or production plant (1) according to one of the claims 1 to 8, characterized in that a gating signal is generated as a digital signal or pulse, providing an appropriate time window, wherein an abnormal time frame (31) of occurrence of a newly measured measurement data and/or process parameter (4) is selected from time frames (3) of a number of measurements of the measurement data and/or process parameter (4) and the normal time frame is eliminated or discarded, and wherein the selection of an occurring abnormal time frame (31) triggers an appropriate signaling generation and a transition to an assigned alarm (15) and/or monitoring (13) and/or control/steering means (14).

10. Method for detecting an abnormal or early indication of an equipment failure (43) in an industrial equipment or production plant (1) according to one of the claims 1 to 9, characterized in that electronic control and steering signaling is generated based on detected abnormal time frames (31) of occurrence of newly measured measurement data and/or process parameters (4), wherein selection of at least one occurring abnormal time frame (31) triggers the appropriate signaling generation and conversion to adjust the operation of the industrial equipment and/or production plant (1) or component (5) by means of a control/steering device (14).

11. A system (7) for detecting abnormal or early indications of equipment faults (43) in an industrial equipment or production plant (1) by monitoring measurement data and/or process parameters (4) originating from components (5) used in the industrial process (6), characterized by:

the system (7) comprises a sensor or measuring device (2/13) for measuring the measurement data of a component (5) used in an industrial process (6) and/or the process parameter (4), and a detection device for identifying a time frame (3) of equal size in the measurement data for a time frame and/or the process parameter (4) in case the component (5) used in the industrial process (6) is functioning normally, the measurement data and/or the process parameter (4) comprising parameter values of a plurality of measurement parameters/sensor parameters (41) and/or process variables (42),

the system (7) comprising a differentiator (9), the differentiator (9) for converting the parameter values (4) of the plurality of measured/sensor parameters (41) and/or process variables (42) into observable binary processing codes (91/911) for each of the identified equally sized time frames (3), and assigning the binary processing codes (91/911) to a sequence of storable Markov chain states (821, 822, …, 82x),

the system (7) comprises a machine learning module (8), the machine learning module (8) for generating a multi-dimensional data structure (81) comprising a definable number of variable hidden Markov model parameter values (811, 812, …, 81x), wherein the variable model parameters (811, 812, …, 81x) of the multi-dimensional data structure (81) are determined by means of the machine learning module (8) applied to the sequence of storable Markov chain states (821, 822, …, 82x) with the assigned binary processing code (91), and wherein the variable hidden Markov model parameters (811, 812, …, 81x) of the multi-dimensional data structure (81) are changed and trained by learning normal state frequencies (82) based on the measurement data and/or the process parameters (4) of identified equally sized time frames (3),

the machine learning module (8) comprising means for initializing and storing a plurality of probability state values (83) by applying a trained multi-dimensional data structure (81/831, 832, …, 83x) with the variable hidden Markov model parameter values (811, 812, …, 81x) to a pre-sampled binary processing code (912) of a time frame (3) of equal size as the parameter values (4) of the plurality of measured/sensor parameters (41) and/or process variables (42),

the machine learning module (8) comprises means for determining a log threshold (84) for an anomaly score (841) by sorting the stored log result values of the probability state values (83), and

the machine learning module (8) comprising means for deploying the trained multidimensional data structure (81/831, 832, …, 83x) with the variable hidden Markov model parameter values (811, 812, …, 81x), to monitor newly measured and determined measurement data and/or process parameters (4) from an industrial plant or plant (1) using the threshold values (84) of the anomaly score (841) to detect anomalous sensor data values that may indicate an impending system failure, wherein, for triggering an abnormal sensor data value (43), a logarithmic result value (85) of the probability state value (83) of a newly measured measurement and/or process parameter (4) is generated and compared with the stored probability state value (83) based on the logarithmic threshold value (84) of the abnormality score (841).