CN115510302B

CN115510302B - Intelligent factory data classification method based on big data statistics

Info

Publication number: CN115510302B
Application number: CN202211432355.9A
Authority: CN
Inventors: 冯璟煕; 陈柏林; 乔迁
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2022-11-16
Filing date: 2022-11-16
Publication date: 2023-04-07
Anticipated expiration: 2042-11-16
Also published as: CN115510302A

Abstract

The invention relates to the field of data processing, in particular to an intelligent factory data classification method based on big data statistics.

Description

Intelligent factory data classification method based on big data statistics

Technical Field

The application relates to the field of data processing, in particular to an intelligent factory data classification method based on big data statistics.

Background

With the continuous development of intelligent technology, intelligent monitoring and intelligent management are being developed vigorously across industries. In large-scale plants, for example, operation monitoring is digitized, so that abnormal plant operation is reflected by abnormal monitoring data. However, as a plant operates over a long period, the volume of operation data keeps growing, and analyzing the monitoring data requires processing a large amount of data. To allow abnormal data in plant operation monitoring to be retrieved quickly, the data need to be classified according to the abnormality of the original data, that is, abnormal data and normal data are classified and stored separately. The data must therefore first be analyzed for abnormality.

Abnormality analysis of data mainly relies on the differences between data and on the distribution density of the data, as in existing clustering algorithms. Clustering, however, considers only differences in data magnitude and cannot adequately reflect the abnormality of data that follow a changing trend, so abnormality in plant operation data cannot be judged accurately. In the present scheme, the abnormal degree of the data is determined from both the overall distribution of the data and the differences of the data over the time sequence. The overall abnormal score of the data is analyzed with the CBLOF algorithm; because conventional CBLOF clustering depends excessively on the distinction between large and small clusters and neglects the characteristics of the clusters themselves, the resulting abnormal score is one-sided and not very reliable. The final abnormal score is therefore determined by combining the size of each cluster with the time span of the data in the cluster.

Disclosure of Invention

In order to solve the technical problem, the invention provides an intelligent factory data classification method based on big data statistics, which comprises the following steps:

acquiring an intelligent factory data sequence formed by intelligent factory data, and obtaining a plurality of clusters according to the intelligent factory data sequence;

obtaining the time span of the cluster to which each intelligent factory data belongs according to each cluster, and obtaining the abnormal score of each intelligent factory data according to a plurality of clusters, the time span and the number of data contained in the clusters; obtaining a plurality of time windows according to the intelligent factory data sequence; obtaining the difference of each intelligent factory data relative to the time window according to the difference of the adjacent data in the time window and the abnormal score; obtaining a first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window; obtaining a second abnormal degree of each intelligent factory data according to the first abnormal degree and the abnormal score of the intelligent factory data;

and obtaining an abnormal data set and a normal data set according to the intelligent factory data sequence and the second abnormal degree of each intelligent factory data, and performing distributed storage on the abnormal data set and the normal data set.

Preferably, the method for obtaining the abnormal score of each smart plant data according to the plurality of clusters, the time span and the number of data included in the cluster includes:

acquiring the number of data contained in the cluster to which the intelligent factory data belongs, and recording it as the first number of the cluster to which the intelligent factory data belongs; acquiring the number of data contained in each cluster and recording it as a second number, and acquiring the maximum value of the second numbers of all clusters and recording it as the maximum number; acquiring the distance between each intelligent factory data in each cluster and the center of the cluster to which the intelligent factory data belongs, and recording the distance as a first distance;

and obtaining the abnormal score of each intelligent factory data according to the time span, the first number, the maximum number and the first distance of the cluster to which each intelligent factory data belongs.

Preferably, the formula for obtaining the abnormal score of each smart plant data according to the time span, the first number, the maximum number and the first distance of the cluster to which each smart plant data belongs is as follows:

wherein the quantities in the formula are: the maximum number; the first number of the cluster to which the i-th intelligent factory data belongs; the time span of the cluster to which the i-th intelligent factory data belongs; the first distance of the i-th intelligent factory data; and the abnormal score of the i-th intelligent factory data.

Preferably, the method for obtaining the difference of each smart plant data relative to the time window according to the difference of the neighboring data in the time window and the abnormal score includes:

acquiring a plurality of belonging time windows of each intelligent factory data; calculating a time difference value between each intelligent factory data and each data in each belonging time window; acquiring a plurality of adjacent data of each intelligent factory data in each belonging time window according to the time difference value; acquiring the standard deviation of each belonging time window of each intelligent factory data by using the data in that time window; and obtaining the difference of each intelligent factory data relative to each belonging time window according to the adjacent data of each belonging time window, the abnormal score of each adjacent data and the standard deviation of each belonging time window.

Preferably, the formula for obtaining the difference of each smart plant data relative to each belonging time window according to the respective neighboring data of each belonging time window of each smart plant data, the anomaly score of each neighboring data and the standard deviation of each belonging time window of each smart plant data is as follows:

wherein the quantities in the formula are: the i-th intelligent factory data; the k-th adjacent data of the j-th belonging time window of the i-th intelligent factory data; the normalized value of the abnormal score of the k-th adjacent data of the j-th belonging time window of the i-th intelligent factory data; the number of adjacent data in the j-th belonging time window of the i-th intelligent factory data; the standard deviation of the j-th belonging time window of the i-th intelligent factory data; and the difference of the i-th intelligent factory data relative to its j-th belonging time window.

Preferably, the method for obtaining the first abnormal degree of each smart factory data according to the difference of each smart factory data relative to each time window includes:

acquiring a plurality of belonging time windows of each intelligent factory data; obtaining the standard deviation of each belonging time window of each intelligent factory data by using the data in that time window; and obtaining the first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window and the standard deviation of each belonging time window of each intelligent factory data.

Preferably, the formula for obtaining the first abnormal degree of each smart plant data according to the difference of each smart plant data relative to each time window and the standard deviation of each smart plant data in each time window is as follows:

wherein the quantities in the formula are: the difference of the i-th intelligent factory data relative to its j-th belonging time window; the number of belonging time windows of the i-th intelligent factory data; the standard deviation of the differences of the i-th intelligent factory data over all of its belonging time windows; and the first abnormal degree of the i-th intelligent factory data.

Preferably, the method for obtaining the time span of the cluster to which each smart plant data belongs according to each cluster includes:

and forming a data pair by any two intelligent factory data in the cluster to which the intelligent factory data belongs, calculating the time difference of the two intelligent factory data in each data pair, and obtaining the time span of the cluster to which the intelligent factory data belongs according to the time difference of all the data pairs in the cluster to which the intelligent factory data belongs.

The embodiment of the invention at least has the following beneficial effects: firstly, reflecting the possibility of the cluster itself having abnormality according to the size of the cluster, and highlighting the influence of the cluster size on the data abnormality; and then, judging the influence relationship among the data of the same cluster according to the time sequence span of the data contained in the cluster, namely considering the influence of the data time sequence relationship on the data abnormity judgment, and performing more accurate data abnormity judgment.

Then, the relative difference of the data is determined according to the difference of the time sequence data and the relative relation of the window data in the calculation window on the time sequence, the data difference abnormity caused by the larger difference between the trend change data is avoided, the influence of the abnormal score of other data on the window calculation is considered in the window calculation, and the influence of abnormal data in the window on the abnormal judgment of other data is avoided.

And in the abnormal degree of the data time sequence, the influence of the data abnormal score on the abnormal judgment of the data time sequence is introduced, the common influence of the data abnormal score and the abnormal judgment of the data time sequence is strengthened, the final abnormal degree of the data is obtained and is used as the basis for data abnormal classification, namely, the data is classified more accurately.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a flow chart of an intelligent plant data classification method based on big data statistics according to the present invention.

Detailed Description

To further illustrate the technical means and effects of the present invention for achieving the predetermined objects, the following detailed description of the method for classifying plant data based on big data statistics, the detailed implementation, structure, features and effects thereof according to the present invention will be provided with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" refers to not necessarily the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following describes a specific scheme of the intelligent factory data classification method based on big data statistics in detail with reference to the accompanying drawings.

Referring to fig. 1, a flowchart illustrating steps of a big data statistics-based intelligent plant data classification method according to an embodiment of the present invention is shown, where the method includes the following steps:

Step S001, acquiring data to obtain an intelligent factory data sequence, and obtaining a plurality of clusters according to the intelligent factory data sequence.

1. Collecting data:

In an intelligent factory, real-time operation monitoring is required so that the operating state of the factory can be controlled in time, and, to facilitate data analysis, the generated data must be transmitted to a unified data management platform for analysis and management. The data collected in this scheme are the intelligent factory operation monitoring data held by the factory's unified data management platform.

The collected intelligent factory operation monitoring data are arranged in time order to obtain an intelligent factory data sequence, and each data item in the sequence is called an intelligent factory data, for example vibration data of an equipment engine or temperature data of equipment.

2. Obtaining a plurality of clusters according to the smart factory data sequence:

Each data in the intelligent factory data sequence is clustered using the K-Means clustering algorithm to obtain K clusters; in the present scheme K is taken as 10, and the average value of all data in each cluster is taken as the center data of that cluster.
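The clustering step can be sketched as follows. This is a minimal illustration only, assuming scalar monitoring values and a plain Lloyd-style K-Means; the function name `kmeans_1d`, the deterministic initialization and the iteration cap are assumptions made for the example and are not taken from the patent.

```python
from statistics import mean

def kmeans_1d(values, k=10, iterations=50):
    """Lloyd-style K-Means on scalar values; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic start: k evenly spaced order statistics (an assumption).
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: the center is the average of the cluster's members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

With k = 10 as in the present scheme, `labels[i]` gives the cluster of the i-th data and `centers[c]` the center data of cluster c.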

Step S002, obtaining the abnormal score of each intelligent factory data according to the clusters.

The intelligent factory operation data are mainly used for monitoring and analyzing abnormal operation of the intelligent factory. Abnormal operation is mainly reflected in abnormal data, and abnormal data must be retrieved frequently during factory abnormality monitoring and analysis; the data are therefore classified by abnormality so that abnormal data can be retrieved conveniently during abnormality analysis.

First, the number of intelligent factory data contained in each of the K clusters is obtained, where K represents the number of clusters. The clusters are then arranged in ascending order of the number of data they contain, and the first W clusters in this ordering are selected as small clusters; W = 3 is set in the invention, and the remaining clusters are large clusters.

The small clusters and the large clusters obtained in this way are each denoted separately, where n denotes the number of small clusters and the number of large clusters is K − n.
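The division into small and large clusters can be sketched as below. The function name `split_small_large` and the tie-breaking rule for clusters of equal size are assumptions made for the illustration; the text only states that the W least-populated clusters are small and the rest are large.

```python
from collections import Counter

def split_small_large(labels, w=3):
    """Return (small_ids, large_ids): the w least-populated clusters are small."""
    sizes = Counter(labels)
    # Sort cluster ids by member count, ascending; ties broken by id (an assumption).
    ordered = sorted(sizes, key=lambda c: (sizes[c], c))
    return ordered[:w], ordered[w:]
```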

For the i-th intelligent factory data in the sequence, any two intelligent factory data in the cluster to which it belongs form a data pair, and the time difference between the two intelligent factory data in each data pair is calculated. The maximum value of the time differences of all data pairs in the cluster to which the i-th intelligent factory data belongs is taken as the time span of the cluster to which the i-th intelligent factory data belongs.
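The time span described above can be sketched directly from its definition. Because the largest pairwise time difference is always the gap between the latest and the earliest member, a linear-time equivalent is shown as well. The function names, and the choice to treat a single-member cluster as having span 0, are assumptions made for the example.

```python
from itertools import combinations

def cluster_time_span(timestamps):
    """Largest time difference over all pairs of members of one cluster."""
    if len(timestamps) < 2:
        return 0  # no pair exists; a span of 0 here is an assumption
    return max(abs(a - b) for a, b in combinations(timestamps, 2))

def cluster_time_span_fast(timestamps):
    """Same value in linear time: the maximum pairwise gap is max minus min."""
    return max(timestamps) - min(timestamps)
```

The pairwise form mirrors the wording of the text; the second form avoids the quadratic number of pairs for large clusters.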

The number of data contained in the cluster to which the i-th intelligent factory data belongs is obtained and recorded as the first number of the cluster to which the i-th intelligent factory data belongs; the number of data contained in each cluster is obtained and recorded as a second number, and the maximum value of the second numbers of all clusters is obtained and recorded as the maximum number. When the cluster to which the i-th intelligent factory data belongs is a large cluster, the Euclidean distance between the i-th intelligent factory data and the center data of its cluster is recorded as the first distance of the i-th intelligent factory data; when the cluster to which the i-th intelligent factory data belongs is a small cluster, the Euclidean distance between the i-th intelligent factory data and the center data of each large cluster is obtained and recorded as the first distance of the i-th intelligent factory data.
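A minimal sketch of the first distance follows, assuming scalar data so that the Euclidean distance reduces to an absolute difference. For data in a small cluster the text refers to the distance to the large-cluster centers; the sketch takes the minimum of these distances, which is the convention of the conventional CBLOF algorithm and is an assumption here. The function name and argument layout are likewise hypothetical.

```python
def first_distance(value, label, centers, large_ids):
    """First distance of one scalar data point.

    centers: mapping cluster id -> center value; large_ids: ids of large clusters.
    """
    if label in large_ids:
        # Member of a large cluster: distance to its own cluster center.
        return abs(value - centers[label])
    # Member of a small cluster: distance to the nearest large-cluster center
    # (taking the minimum is an assumption borrowed from conventional CBLOF).
    return min(abs(value - centers[c]) for c in large_ids)
```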

According to the time span of the cluster to which the i-th intelligent factory data belongs, the first number, the maximum number and the first distance of the i-th intelligent factory data, the abnormal score of the i-th intelligent factory data is obtained:

wherein the first term is the maximum number and the second is the first number of the cluster to which the i-th intelligent factory data belongs. Together they express the relative size of the cluster to which the i-th intelligent factory data belongs: the larger this value, the fewer data that cluster contains, the more likely it is that the cluster itself is abnormal, and hence the larger the abnormal score of the i-th intelligent factory data. The next term is the time span of the cluster to which the i-th intelligent factory data belongs: the larger this value, the larger the time-sequence span of the data in that cluster, the weaker the influence relationship among the data, and the more likely it is that data in the cluster are abnormal, that is, the larger the abnormal score. The next term is the first distance of the i-th intelligent factory data: the larger this value, the farther the i-th intelligent factory data lies from the cluster center, that is, the more it differs from the majority of the data. The result is the abnormal score of the i-th intelligent factory data.
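The formula itself is not reproduced in this text, so the following is only a hypothetical combination that respects the monotonic behaviour described above: the score grows as the cluster becomes relatively smaller, as the time span grows, and as the first distance grows. The product form and the function name `anomaly_score` are assumptions and should not be read as the patented formula.

```python
def anomaly_score(max_count, first_count, time_span, first_dist):
    """Hypothetical score, increasing in relative smallness, time span and distance."""
    # Ratio of the largest cluster size to this data's cluster size:
    # larger when the data's own cluster is smaller.
    relative_size = max_count / first_count
    return relative_size * time_span * first_dist
```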

Obtaining the abnormal score of each intelligent factory data, and when analyzing the abnormal score of each intelligent factory data, firstly reflecting the possibility of the cluster itself having abnormality according to the size of the cluster aiming at the abnormal score, and highlighting the influence of the cluster size on the data abnormality; and then, judging the influence relationship among the same cluster data according to the time sequence span of the data contained in the cluster, namely, considering the influence of the data time sequence relationship on data abnormity judgment, and carrying out more accurate data abnormity judgment, namely obtaining more accurate data abnormity scores, thereby carrying out more accurate classification on the factory data.

Step S003, a second abnormal degree of each smart factory data is obtained according to the plurality of clusters and the abnormal score of each smart factory data.

1. Calculating the difference of each smart plant data relative to each time window:

for the intelligent factory data, stable and invariable data may exist, and data with a certain trend change may also exist, so that the abnormal degree of the final data needs to be determined according to the change relation of the data on the time sequence at the moment, and the abnormal degree is used as the classification basis of the final abnormal data.

A time window of fixed size is set; in this scheme the window size is taken as 40. The time window slides over the intelligent factory data sequence with a step length of 1; each slide corresponds to one time window, so a plurality of time windows are obtained during the sliding process.
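The sliding-window construction can be sketched with index ranges. The helper names are hypothetical; windows are represented as (start, end) pairs with an exclusive end, which is a representation chosen for the example. The second helper lists the windows that contain a given data index, which the following steps refer to as the belonging time windows of that data.

```python
def sliding_windows(n, size=40, step=1):
    """Return (start, end) index pairs, end exclusive, for windows over n samples."""
    return [(s, s + size) for s in range(0, n - size + 1, step)]

def belonging_windows(i, n, size=40, step=1):
    """Windows that contain sample index i."""
    return [(s, e) for s, e in sliding_windows(n, size, step) if s <= i < e]
```

With a window size of 40 and a step of 1, a data point far from both ends of the sequence lies in 40 windows, while the very first data point lies in only one.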

The time windows that contain the i-th intelligent factory data are obtained and recorded as the plurality of belonging time windows of the i-th intelligent factory data;

the time difference between the i-th intelligent factory data and each data in its j-th belonging time window is calculated, all data in the j-th belonging time window are arranged in ascending order of this time difference, and the first several data in this ordering are taken as the plurality of adjacent data of the i-th intelligent factory data in its j-th belonging time window; in this scheme the number of adjacent data is taken as 10;

the standard deviation of all data in the j-th belonging time window of the i-th intelligent factory data is calculated and taken as the standard deviation of the j-th belonging time window of the i-th intelligent factory data.
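The selection of adjacent data and the window standard deviation can be sketched as follows. Several details are assumptions made for the illustration: the data point itself is excluded from its own adjacent data, ties in time difference are broken by index, and the population standard deviation is used because the text does not say which variant is meant. Windows are (start, end) index pairs with an exclusive end.

```python
from statistics import pstdev

def nearest_in_time(i, window, times, k=10):
    """Indices of the k samples in `window` closest in time to sample i (i excluded)."""
    start, end = window
    others = [j for j in range(start, end) if j != i]
    # Ascending absolute time difference; ties broken by index (an assumption).
    others.sort(key=lambda j: (abs(times[j] - times[i]), j))
    return others[:k]

def window_std(window, values):
    """Population standard deviation of all values inside the window."""
    start, end = window
    return pstdev(values[start:end])
```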

According to the plurality of belonging time windows of the i-th intelligent factory data, and the plurality of adjacent data and the standard deviation of each belonging time window, the difference of the i-th intelligent factory data relative to each belonging time window is obtained:

wherein the first quantity is the i-th intelligent factory data, and the second is the k-th adjacent data of the j-th belonging time window of the i-th intelligent factory data. Their difference is the difference between the i-th intelligent factory data and the k-th adjacent data of its j-th belonging time window: the larger this value, the larger the difference between the value of the i-th intelligent factory data and the values of the data adjacent to it in the time sequence. The next quantity is the normalized value of the abnormal score of the k-th adjacent data of the j-th belonging time window of the i-th intelligent factory data: the larger this value, the larger the abnormal score of the adjacent data, and the less that adjacent data can serve as a reference when the i-th intelligent factory data is analyzed for abnormality. The next quantity is the number of adjacent data in the j-th belonging time window of the i-th intelligent factory data. The next quantity is the standard deviation of the j-th belonging time window of the i-th intelligent factory data: the larger this value, the larger the differences among the data in the window, and the less the i-th intelligent factory data stands out relative to the window. The result is the difference of the i-th intelligent factory data relative to its j-th belonging time window.
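Since the formula is not reproduced in this text, the sketch below is only one hypothetical form that matches the described behaviour: gaps to adjacent data raise the difference, adjacent data with a high normalized abnormal score contribute less, and a large window standard deviation lowers the result. The weighting `1 - score`, the averaging, the small constant `eps` and the function name are all assumptions, not the patented formula.

```python
def window_difference(x_i, neighbor_values, neighbor_scores, sigma, eps=1e-9):
    """Hypothetical difference of one data point relative to one belonging window.

    neighbor_scores are abnormal scores already normalized to [0, 1].
    """
    # Down-weight adjacent data that are themselves likely abnormal.
    weighted = [(1.0 - s) * abs(x_i - x)
                for x, s in zip(neighbor_values, neighbor_scores)]
    mean_gap = sum(weighted) / len(weighted)
    # A window with a large spread makes the same gap less remarkable.
    return mean_gap / (sigma + eps)
```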

2. Calculating a first degree of anomaly of each smart plant data:

In the time series, each intelligent factory data may lie in a plurality of belonging time windows, that is, the i-th intelligent factory data has a plurality of relative differences with respect to window data. From the relative differences of the i-th intelligent factory data over its plurality of belonging time windows, its first abnormal degree in the time series is determined:

wherein the first quantity is the difference of the i-th intelligent factory data relative to its j-th belonging time window: the larger this value, the greater the abnormal degree of the i-th intelligent factory data. The next quantity is the number of belonging time windows of the i-th intelligent factory data. The next quantity is the mean of the relative differences of the i-th intelligent factory data over its belonging time windows, which reflects the overall relative difference of the i-th intelligent factory data in the time series: the larger this value, the greater the abnormal degree of the i-th intelligent factory data in the time series. The result is the first abnormal degree of the i-th intelligent factory data.

When the first abnormal degree is determined, the relative difference of the data is determined according to the difference of time sequence data and the relative relation of window data in a time window on a time sequence, the data difference abnormality caused by large difference between trend change data is avoided, the influence of abnormal scores of other data on window calculation is considered during window operation, and the influence of abnormal score data in the window on the abnormal judgment of other data is avoided; and finally, obtaining the abnormal degree of the final data on the time sequence through the relative difference of a plurality of calculation windows where the data are positioned, wherein the local abnormality of the data is further reflected by considering the difference of the relative difference of the plurality of windows.

3. Calculating a second degree of anomaly of each smart plant data:

Combining the abnormal score of the i-th intelligent factory data with the first abnormal degree of the i-th intelligent factory data, the second abnormal degree of the i-th intelligent factory data is obtained as:

wherein the first quantity is the abnormal score of the i-th intelligent factory data and the second is the first abnormal degree of the i-th intelligent factory data; the larger these values are, the larger the second abnormal degree of the i-th intelligent factory data.

And obtaining a second abnormal degree of the data of each intelligent factory, obtaining a data abnormal score and the abnormal degree of the data on a time sequence by combining the CBLOF algorithm and the data time sequence change during the abnormal degree analysis, and judging the data abnormality from the overall data distribution and the data time sequence. And in the abnormal degree of the data time sequence, the influence of the data abnormal score on the judgment of the data time sequence abnormality is introduced, the common influence of the data abnormal score and the judgment of the data abnormal score is strengthened, the final abnormal degree of the data is obtained and is used as the basis for data abnormal classification, namely, the data is classified more accurately.

Step S004, obtaining an abnormal data set and a normal data set according to the second abnormal degree of each intelligent factory data, and performing distributed storage on the abnormal data set and the normal data set.

The intelligent factory data are arranged in descending order of the second abnormal degree; the set formed by the top-ranked intelligent factory data in this ordering is used as the abnormal data set, and the set formed by the intelligent factory data remaining in the intelligent factory data sequence is used as the normal data set.

And the abnormal data set and the normal data set are stored in a distributed manner, so that the abnormal data can be inquired and called quickly.
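The final split can be sketched as follows. The parameter `top_n`, which fixes how many top-ranked data are treated as abnormal, is hypothetical: its value is not recoverable from this text. Keeping the normal set in its original time order is also a choice made for the example, and the distributed storage itself is outside the scope of the sketch.

```python
def classify(data, second_degree, top_n):
    """Split data into (abnormal, normal) by descending second abnormal degree.

    top_n is a hypothetical parameter: how many top-ranked items count as abnormal.
    """
    order = sorted(range(len(data)), key=lambda i: second_degree[i], reverse=True)
    abnormal_idx = set(order[:top_n])
    abnormal = [data[i] for i in order[:top_n]]
    # The normal set keeps the original time order of the sequence.
    normal = [d for i, d in enumerate(data) if i not in abnormal_idx]
    return abnormal, normal
```

Each of the two returned sets can then be written to its own storage location so that abnormal data can be queried without scanning the normal data.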

In summary, the embodiment of the present invention provides an intelligent factory data classification method based on big data statistics, which reflects the possibility that a cluster itself has an abnormality according to the size of the cluster, and highlights the influence of the cluster size on data abnormality; and then, judging the influence relationship among the data of the same cluster according to the time sequence span of the data contained in the cluster, namely, considering the influence of the data time sequence relationship on data abnormity judgment, and performing more accurate data abnormity judgment. Then, the relative difference of the data is determined according to the difference of the time sequence data and the relative relation of the window data in the calculation window on the time sequence, the data difference abnormity caused by the larger difference between the trend change data is avoided, the influence of the abnormal score of other data on the window calculation is considered in the window calculation, and the influence of abnormal data in the window on the abnormal judgment of other data is avoided. And in the abnormal degree of the data time sequence, the influence of the data abnormal score on the abnormal judgment of the data time sequence is introduced, the common influence of the data abnormal score and the abnormal judgment of the data time sequence is strengthened, the final abnormal degree of the data is obtained and is used as the basis for data abnormal classification, namely, the data is classified more accurately.

It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit of the present invention.

Claims

1. The intelligent factory data classification method based on big data statistics is characterized by comprising the following steps:

acquiring an intelligent factory data sequence formed by intelligent factory data, and acquiring a plurality of clusters according to the intelligent factory data sequence;

obtaining the time span of the cluster to which each smart factory data belongs according to each cluster, and obtaining the abnormal score of each smart factory data according to the clusters, the time span and the number of data contained in the clusters; obtaining a plurality of time windows according to the intelligent factory data sequence; obtaining the difference of each intelligent factory data relative to the time window according to the difference of the adjacent data in the time window and the abnormal score; obtaining a first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window; obtaining a second abnormal degree of each intelligent factory data according to the first abnormal degree and the abnormal score of the intelligent factory data;

obtaining an abnormal data set and a normal data set according to the intelligent factory data sequence and the second abnormal degree of each intelligent factory data, and performing distributed storage on the abnormal data set and the normal data set;
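The final classification step can be illustrated with a minimal Python sketch. It assumes the second abnormal degree of each data has already been computed and that a fixed cutoff separates abnormal from normal data; the function name `split_by_abnormal_degree`, the `threshold` parameter and its default value are hypothetical, since the claim does not state how the two sets are delimited. Distributed storage of the resulting sets is outside the sketch.

```python
from typing import List, Sequence, Tuple


def split_by_abnormal_degree(
    data: Sequence[float],
    second_degree: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff; the claim does not fix one
) -> Tuple[List[Tuple[int, float]], List[Tuple[int, float]]]:
    """Return (abnormal_set, normal_set) as lists of (index, value) pairs."""
    if len(data) != len(second_degree):
        raise ValueError("data and second_degree must have the same length")
    abnormal: List[Tuple[int, float]] = []
    normal: List[Tuple[int, float]] = []
    for idx, (value, degree) in enumerate(zip(data, second_degree)):
        # Assumed rule: a degree strictly above the threshold marks the value abnormal.
        if degree > threshold:
            abnormal.append((idx, value))
        else:
            normal.append((idx, value))
    return abnormal, normal
```

Keeping the original sequence index with each value lets the two sets be stored separately while still being traceable to their position in the intelligent factory data sequence.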

the method for obtaining the difference of each intelligent factory data relative to each time window according to the difference between adjacent data in the time window and the abnormal score comprises the following steps:

acquiring the plurality of time windows to which each intelligent factory data belongs; calculating the time difference between each intelligent factory data and each data in each of its belonging time windows; obtaining, according to the time differences, a plurality of adjacent data of each intelligent factory data in each belonging time window; obtaining the standard deviation of each belonging time window from the data in that time window; and obtaining the difference of each intelligent factory data relative to each of its belonging time windows, namely the difference of each intelligent factory data relative to each time window, according to the adjacent data in each belonging time window, the abnormal score of each adjacent data and the standard deviation of each belonging time window;
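The preparatory quantities named in this step (belonging time windows, adjacent data selected by time difference, and the per-window standard deviation) can be sketched in Python as follows. This is only an illustration under stated assumptions: windows are taken to be fixed-width sliding windows over the sequence indices, and adjacency is taken to mean a time difference no larger than `max_gap`. The names `belonging_windows`, `adjacent_indices`, `window_std`, `width` and `max_gap` are hypothetical, and the difference formula itself is not implemented.

```python
from statistics import pstdev
from typing import List, Sequence


def belonging_windows(i: int, n: int, width: int) -> List[range]:
    """All sliding windows of length `width` over indices 0..n-1 that contain index i.

    Assumption: windows slide one position at a time.
    """
    first_start = max(0, i - width + 1)
    last_start = min(i, n - width)
    return [range(s, s + width) for s in range(first_start, last_start + 1)]


def adjacent_indices(
    i: int, window: range, times: Sequence[float], max_gap: float
) -> List[int]:
    """Indices in `window`, other than i, whose timestamp lies within max_gap of data i.

    Assumption: adjacency is decided by a fixed bound on the time difference.
    """
    return [j for j in window if j != i and abs(times[j] - times[i]) <= max_gap]


def window_std(window: range, values: Sequence[float]) -> float:
    """Population standard deviation of the data values inside one window."""
    return pstdev([values[j] for j in window])
```

For a sequence of five data and a window width of three, the data at index 2 belongs to three windows, while the data at index 0 belongs to only one, so data near the ends of the sequence contribute fewer per-window differences.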

the formula for obtaining the difference of each intelligent factory data relative to each of its belonging time windows, according to the adjacent data in each belonging time window, the abnormal score of each adjacent data and the standard deviation of each belonging time window, is as follows:

[formula not recoverable from the source text]

wherein the symbols of the formula denote, respectively: the intelligent factory data under consideration; an adjacent data of that intelligent factory data in one of its belonging time windows; the normalized value of the abnormal score of that adjacent data; the number of adjacent data of the intelligent factory data in that time window; the standard deviation of that time window; and the difference of the intelligent factory data relative to that time window;

the method for obtaining the first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window comprises the following steps:

acquiring the plurality of time windows to which each intelligent factory data belongs; obtaining the standard deviation of each belonging time window from the data in that time window; and obtaining the first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window and the standard deviation of each belonging time window;

the formula for obtaining the first abnormal degree of each intelligent factory data, according to the difference of each intelligent factory data relative to each time window and the standard deviation of each of its belonging time windows, is as follows:

[formula not recoverable from the source text]

wherein the symbols of the formula denote, respectively: the difference of the intelligent factory data under consideration relative to one of its belonging time windows; further symbols whose definitions are lost in the source text; and the first abnormal degree of that intelligent factory data;

the method for obtaining, according to each cluster, the time span of the cluster to which each intelligent factory data belongs comprises the following steps:

2. The intelligent factory data classification method based on big data statistics of claim 1, wherein the method for obtaining the abnormal score of each intelligent factory data according to the clusters, the time span and the number of data contained in the clusters comprises:

acquiring the number of data contained in the cluster to which each intelligent factory data belongs, and recording it as the first number of the cluster to which the intelligent factory data belongs; acquiring the number of data contained in each cluster and recording it as a second number, and acquiring the maximum of the second numbers of all clusters and recording it as the maximum number; acquiring the distance between each intelligent factory data in each cluster and the center of the cluster to which it belongs, and recording the distance as a first distance;
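The three cluster quantities named in this step can be sketched in Python, assuming one-dimensional data values, cluster labels already assigned by some clustering step, and the cluster center taken as the mean of the cluster's values. The function name `cluster_quantities` and the choice of the mean as the center are assumptions for illustration; the claim does not state how the center is defined, and the abnormal score formula is not implemented here.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def cluster_quantities(
    values: Sequence[float], labels: Sequence[Hashable]
) -> Tuple[List[int], int, List[float]]:
    """Return (first_numbers, maximum_number, first_distances).

    first_numbers[i]   : size of the cluster that data i belongs to
    maximum_number     : size of the largest cluster
    first_distances[i] : distance from data i to its cluster center
    """
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    counts = Counter(labels)  # the "second number" of every cluster
    maximum_number = max(counts.values())

    # Assumed center: arithmetic mean of the values in each cluster.
    sums: Dict[Hashable, float] = {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
    centers = {label: sums[label] / counts[label] for label in counts}

    first_numbers = [counts[label] for label in labels]
    first_distances = [abs(v - centers[label]) for v, label in zip(values, labels)]
    return first_numbers, maximum_number, first_distances
```

For values `[1.0, 3.0, 10.0]` with labels `[0, 0, 1]`, the first numbers are `[2, 2, 1]`, the maximum number is `2`, and the first distances are `[1.0, 1.0, 0.0]`.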

3. The intelligent factory data classification method based on big data statistics of claim 2, wherein the formula for obtaining the abnormal score of each intelligent factory data according to the time span, the first number, the maximum number and the first distance of the cluster to which each intelligent factory data belongs is as follows: