CN115510302B - Intelligent factory data classification method based on big data statistics - Google Patents

Publication number
CN115510302B
Authority
CN
China
Prior art keywords
data
intelligent
intelligent factory
factory data
time window
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211432355.9A
Other languages
Chinese (zh)
Other versions
CN115510302A (en
Inventor
冯璟煕
陈柏林
乔迁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202211432355.9A priority Critical patent/CN115510302B/en
Publication of CN115510302A publication Critical patent/CN115510302A/en
Application granted granted Critical
Publication of CN115510302B publication Critical patent/CN115510302B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/906: Clustering; Classification
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The invention relates to the field of data processing, in particular to an intelligent factory data classification method based on big data statistics.

Description

Intelligent factory data classification method based on big data statistics
Technical Field
The application relates to the field of data processing, in particular to an intelligent factory data classification method based on big data statistics.
Background
With the continuous development of intelligent technology, intelligent monitoring and intelligent management are being widely adopted across industries. In large-scale plants, for example, operation monitoring has become digital and intelligent: abnormal plant operation is reflected by abnormalities in the monitoring data. However, as a plant runs over a long period, its operation data keeps growing, and analyzing the monitoring data means analyzing a large volume of data. To make abnormal data quick to retrieve during plant operation monitoring, the data needs to be classified according to the abnormality of the original data, that is, abnormal data and normal data are classified and stored separately. The data therefore has to be analyzed for abnormality first.
Abnormality analysis of data mainly relies on the differences between data values and on the distribution density of the data, as in existing clustering algorithms. However, clustering only considers differences in data magnitude and cannot adequately reflect abnormality in data that follows a changing trend, so abnormality in plant operation data cannot be judged accurately. This scheme determines the abnormal degree of the data from both the overall data distribution and the differences between data in the time series. The overall abnormal score of the data is analyzed with the CBLOF (Cluster-Based Local Outlier Factor) algorithm; however, conventional CBLOF clustering relies excessively on distinguishing large clusters from small clusters and ignores the characteristics of the clusters themselves, so the resulting abnormal score is one-dimensional and not very reliable. The final abnormal score is therefore determined by combining the size of each cluster with the time span of the data within it.
Disclosure of Invention
In order to solve the technical problem, the invention provides an intelligent factory data classification method based on big data statistics, which comprises the following steps:
acquiring an intelligent factory data sequence formed by intelligent factory data, and obtaining a plurality of clusters according to the intelligent factory data sequence;
obtaining the time span of the cluster to which each intelligent factory data belongs according to each cluster, and obtaining the abnormal score of each intelligent factory data according to a plurality of clusters, the time span and the number of data contained in the clusters; obtaining a plurality of time windows according to the intelligent factory data sequence; obtaining the difference of each intelligent factory data relative to the time window according to the difference of the adjacent data in the time window and the abnormal score; obtaining a first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window; obtaining a second abnormal degree of each intelligent factory data according to the first abnormal degree and the abnormal score of the intelligent factory data;
and obtaining an abnormal data set and a normal data set according to the intelligent factory data sequence and the second abnormal degree of each intelligent factory data, and performing distributed storage on the abnormal data set and the normal data set.
Preferably, the method for obtaining the abnormal score of each smart plant data according to the plurality of clusters, the time span and the number of data included in the cluster includes:
acquiring the number of data contained in the cluster to which the intelligent factory data belongs, and recording it as the first number of that cluster; acquiring the number of data contained in each cluster and recording it as a second number, and acquiring the maximum of the second numbers of all clusters and recording it as the maximum number; acquiring the distance between each intelligent factory data in each cluster and the center of the cluster to which it belongs, and recording it as a first distance;
and obtaining the abnormal score of each intelligent factory data according to the time span, the first number, the maximum number and the first distance of the cluster to which each intelligent factory data belongs.
Preferably, the formula for obtaining the abnormal score of each smart plant data according to the time span, the first number, the maximum number and the first distance of the cluster to which each smart plant data belongs is as follows:
[Formula image not available]
wherein the terms of the formula denote, respectively: the maximum number; the first number of the cluster to which the i-th intelligent factory data belongs; the time span of the cluster to which the i-th intelligent factory data belongs; the first distance of the i-th intelligent factory data; and the abnormal score of the i-th intelligent factory data.
Preferably, the method for obtaining the difference of each smart plant data relative to the time window according to the difference of the neighboring data in the time window and the abnormal score includes:
obtaining the plurality of belonging time windows of each intelligent factory data; calculating the time difference between each intelligent factory data and each data in each belonging time window, and obtaining a plurality of neighboring data of each intelligent factory data in each belonging time window according to the time difference; obtaining the standard deviation of each belonging time window of each intelligent factory data; and obtaining the difference of each intelligent factory data relative to each belonging time window according to the neighboring data of each belonging time window, the abnormal score of each neighboring data, and the standard deviation of each belonging time window of each intelligent factory data.
Preferably, the formula for obtaining the difference of each smart plant data relative to each belonging time window according to the respective neighboring data of each belonging time window of each smart plant data, the anomaly score of each neighboring data and the standard deviation of each belonging time window of each smart plant data is as follows:
[Formula image not available]
wherein the terms of the formula denote, respectively: the i-th intelligent factory data; the k-th neighboring data of the j-th belonging time window of the i-th intelligent factory data; the normalized value of the abnormal score of the k-th neighboring data of the j-th belonging time window of the i-th intelligent factory data; the number of neighboring data in the j-th belonging time window of the i-th intelligent factory data; the standard deviation of the j-th belonging time window of the i-th intelligent factory data; and the difference of the i-th intelligent factory data relative to its j-th belonging time window.
Preferably, the method for obtaining the first abnormal degree of each smart factory data according to the difference of each smart factory data relative to each time window includes:
obtaining the plurality of belonging time windows of each intelligent factory data; obtaining the standard deviation of each belonging time window of each intelligent factory data using the data in that belonging time window; and obtaining the first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each belonging time window and the standard deviation of each belonging time window of each intelligent factory data.
Preferably, the formula for obtaining the first abnormal degree of each smart plant data according to the difference of each smart plant data relative to each time window and the standard deviation of each smart plant data in each time window is as follows:
[Formula image not available]
wherein the terms of the formula denote, respectively: the difference of the i-th intelligent factory data relative to its j-th belonging time window; the number of belonging time windows of the i-th intelligent factory data; the standard deviation of the differences of the i-th intelligent factory data over all of its belonging time windows; and the first abnormal degree of the i-th intelligent factory data.
Preferably, the method for obtaining the time span of the cluster to which each smart plant data belongs according to each cluster includes:
and forming a data pair by any two intelligent factory data in the cluster to which the intelligent factory data belongs, calculating the time difference of the two intelligent factory data in each data pair, and obtaining the time span of the cluster to which the intelligent factory data belongs according to the time difference of all the data pairs in the cluster to which the intelligent factory data belongs.
The embodiments of the invention have at least the following beneficial effects. First, the size of a cluster is used to reflect how likely the cluster itself is to be abnormal, which highlights the influence of cluster size on data abnormality. The influence relationship among the data of the same cluster is then judged from the time-series span of the data contained in the cluster; that is, the influence of the time-series relationship of the data on the abnormality judgment is taken into account, giving a more accurate abnormality judgment.
Next, the relative difference of the data is determined from the differences between time-series data and from the relationship of the data to the other data in its calculation window on the time series. This avoids flagging data as abnormal merely because data following a changing trend differ strongly from one another. The abnormal scores of the other data are also taken into account in the window calculation, so that abnormal data in the window does not distort the abnormality judgment of other data.
Finally, the abnormal score of the data is introduced into the time-series abnormal degree, which strengthens the joint effect of the abnormal score and the time-series abnormality judgment. The resulting final abnormal degree serves as the basis for abnormality classification, so the data is classified more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of an intelligent plant data classification method based on big data statistics according to the present invention.
Detailed Description
To further explain the technical means adopted by the present invention to achieve its intended purpose, and their effects, the specific implementation, structure, features and effects of the intelligent factory data classification method based on big data statistics according to the present invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following describes a specific scheme of the intelligent factory data classification method based on big data statistics in detail with reference to the accompanying drawings.
Referring to FIG. 1, a flowchart of the steps of an intelligent factory data classification method based on big data statistics according to an embodiment of the present invention is shown. The method includes the following steps:
Step S001: acquire data to obtain an intelligent factory data sequence, and obtain a plurality of clusters according to the intelligent factory data sequence.
1. Collecting data:
In an intelligent factory, real-time operation monitoring is required in order to keep the operating state of the factory under control in a timely manner; to facilitate data analysis, the generated data is transmitted to a unified data management platform for analysis and management. The data collected in this scheme is the intelligent factory operation monitoring data held by the factory's unified data management platform.
The collected intelligent factory operation monitoring data is arranged in time order to obtain the intelligent factory data sequence, and each data in the sequence is called intelligent factory data, for example vibration data of an equipment engine or temperature data of equipment.
2. Obtaining a plurality of clusters according to the smart factory data sequence:
Each data in the intelligent factory data sequence is clustered using the K-Means clustering algorithm to obtain K clusters; in this scheme K is 10, and the mean of all data in each cluster is taken as the center data of that cluster.
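The clustering step can be illustrated with a minimal sketch. It assumes scalar readings and uses a plain Lloyd-style K-Means written with the standard library only; the function name `kmeans_1d`, the random initialisation, the iteration cap and the seed are illustrative assumptions, not details given in the patent.

```python
import random
from statistics import mean


def kmeans_1d(values, k=10, iterations=50, seed=0):
    """Illustrative K-Means on scalar readings (assumed 1-D data)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k readings drawn at random.
    centers = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each reading joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center becomes the mean of its members,
        # matching "the mean of all data in each cluster is the center data".
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    readings = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9]
    labels, centers = kmeans_1d(readings, k=3)
    print(labels, centers)
```

With real monitoring data, K would be 10 as stated in the scheme, and the sequence needs at least K readings for the sampling-based initialisation used here.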
Step S002: obtain the abnormal score of each intelligent factory data according to the clusters.
Intelligent factory operation data is mainly used to monitor and analyze abnormal operation of the intelligent factory. Abnormal operation is mainly reflected in abnormal data, and abnormal data has to be retrieved frequently during factory abnormality monitoring and analysis; the data is therefore scored for abnormality so that abnormal data can be retrieved conveniently during abnormality analysis.
First, the number of intelligent factory data contained in each of the K clusters is obtained, where K denotes the number of clusters. The clusters are then sorted by this number from small to large, and the first W clusters in the sorted sequence are selected as small clusters, with W = 3 in the present invention; the remaining clusters are large clusters.
The small clusters and the large clusters are then denoted separately [symbol images not available], where n denotes the number of small clusters and a corresponding symbol denotes the number of large clusters.
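The small/large split can be sketched as follows, assuming cluster labels are available as a list with one label per data item. The function name `split_small_large` and the tie-breaking behaviour (Python's stable sort) are assumptions for illustration.

```python
from collections import Counter


def split_small_large(labels, w=3):
    """Sort clusters by size (ascending); the first w are 'small', the rest 'large'."""
    sizes = Counter(labels)
    ordered = sorted(sizes, key=lambda c: sizes[c])  # stable sort: ties keep first-seen order
    return ordered[:w], ordered[w:], sizes


if __name__ == "__main__":
    labels = [0] * 5 + [1] * 1 + [2] * 2 + [3] * 7 + [4] * 3
    small, large, sizes = split_small_large(labels, w=3)
    print(small, large)  # small clusters 1, 2, 4; large clusters 0, 3
```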
For the i-th intelligent factory data in the intelligent factory data sequence, any two intelligent factory data in the cluster to which it belongs form a data pair. The time difference between the two intelligent factory data in each data pair is calculated, and the maximum of the time differences of all data pairs in that cluster is taken as the time span of the cluster to which the i-th intelligent factory data belongs.
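A minimal sketch of the time span computation, assuming each data item carries a numeric timestamp. The function name `cluster_time_span` and the choice to return 0 for a cluster with fewer than two members are assumptions; the source does not say how single-member clusters are handled.

```python
from itertools import combinations


def cluster_time_span(timestamps):
    """Maximum pairwise time difference among the members of one cluster."""
    if len(timestamps) < 2:
        return 0.0  # assumption: no pairs exist, so the span is taken as zero
    return max(abs(a - b) for a, b in combinations(timestamps, 2))


if __name__ == "__main__":
    print(cluster_time_span([3, 10, 4]))  # 7
```

The maximum pairwise difference equals `max(timestamps) - min(timestamps)`, so a linear-time version is possible; the pairwise form is kept here only because it mirrors the wording of the description.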
The number of data contained in the cluster to which the i-th intelligent factory data belongs is obtained and recorded as the first number of that cluster. The number of data contained in each cluster is obtained and recorded as a second number, and the maximum of the second numbers of all clusters is recorded as the maximum number. When the cluster to which the i-th intelligent factory data belongs is a large cluster, the Euclidean distance between the i-th intelligent factory data and the center data of its cluster is recorded as the first distance of the i-th intelligent factory data. When the cluster to which the i-th intelligent factory data belongs is a small cluster, the Euclidean distance between the i-th intelligent factory data and the center data of each large cluster is recorded as the first distance of the i-th intelligent factory data.
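The first distance can be sketched as below. For a small cluster the text refers to the distance to "each" large cluster center without saying how several distances reduce to one value; this sketch assumes the minimum (the nearest large cluster), as in standard CBLOF. The function name `first_distance` and the data layout (points as tuples, centers in a dict) are also assumptions.

```python
import math


def first_distance(point, own_cluster, centers, large_clusters):
    """Distance to own center (large cluster) or to the nearest large center (small cluster)."""
    if own_cluster in large_clusters:
        return math.dist(point, centers[own_cluster])
    # Assumption: for small clusters, take the nearest large-cluster center.
    return min(math.dist(point, centers[c]) for c in large_clusters)


if __name__ == "__main__":
    centers = {0: (0.0,), 1: (10.0,), 2: (4.0,)}
    large = {0, 1}
    print(first_distance((1.0,), 0, centers, large))  # 1.0
    print(first_distance((4.5,), 2, centers, large))  # 4.5
```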
According to the time span of the cluster to which the i-th intelligent factory data belongs, the first number, the maximum number, and the first distance of the i-th intelligent factory data, the abnormal score of the i-th intelligent factory data is obtained:
[Formula image not available]
wherein the terms of the formula are as follows.
The maximum number, and the first number of the cluster to which the i-th intelligent factory data belongs. A term formed from these two represents the relative size of the cluster to which the i-th intelligent factory data belongs: the larger its value, the fewer data that cluster contains and the more likely the cluster itself is abnormal, so the larger the abnormal score of the i-th intelligent factory data.
The time span of the cluster to which the i-th intelligent factory data belongs: the larger its value, the larger the time-series span of the data in that cluster, the weaker the influence relationship among the data, and the more likely the data in that cluster are abnormal, so the larger the abnormal score of the data.
The first distance of the i-th intelligent factory data: the larger its value, the farther the i-th intelligent factory data is from the cluster center, that is, the more it differs from the majority of the data, so the larger the abnormal score of the i-th intelligent factory data.
The abnormal score of the i-th intelligent factory data.
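Because the formula image is not available, the exact expression cannot be reproduced here. The sketch below is only one form that agrees with the stated monotonic relationships (score grows as the cluster gets relatively smaller, as the time span grows, and as the first distance grows). The product form, the ratio used for relative size, and the name `anomaly_score` are all assumptions and should not be read as the patented formula.

```python
def anomaly_score(n_max, n_i, time_span, distance):
    """ASSUMED form: (n_max / n_i) * time_span * distance. Not the patent's formula."""
    relative_size = n_max / n_i  # assumption: larger when the data's cluster is smaller
    return relative_size * time_span * distance


if __name__ == "__main__":
    # A point in a small cluster (10 of at most 100), span 5, distance 2.
    print(anomaly_score(100, 10, 5, 2))  # 100.0
```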
In obtaining the abnormal score of each intelligent factory data, the size of a cluster is first used to reflect how likely the cluster itself is to be abnormal, which highlights the influence of cluster size on data abnormality. The influence relationship among the data of the same cluster is then judged from the time-series span of the data contained in the cluster; that is, the influence of the time-series relationship on the abnormality judgment is taken into account. This gives a more accurate abnormality judgment, and hence a more accurate abnormal score, so that the factory data can be classified more accurately.
Step S003: obtain the second abnormal degree of each intelligent factory data according to the plurality of clusters and the abnormal score of each intelligent factory data.
1. Calculating the difference of each intelligent factory data relative to each time window:
Intelligent factory data may include data that is stable and unchanging as well as data that follows a certain trend. The final abnormal degree of the data therefore needs to be determined from how the data changes over the time series, and this abnormal degree serves as the basis for the final classification of abnormal data.
A time window of a set size is used; in this scheme the size is 40. The time window slides over the intelligent factory data sequence with a step of 1; each slide position corresponds to one time window, so a plurality of time windows is obtained during sliding.
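The sliding step can be sketched by enumerating window positions as half-open index ranges over the sequence. The function name `sliding_windows` and the `(start, end)` representation are assumptions for illustration.

```python
def sliding_windows(n, size=40, step=1):
    """Return (start, end) index pairs, end exclusive, for a sequence of length n."""
    return [(s, s + size) for s in range(0, n - size + 1, step)]


if __name__ == "__main__":
    windows = sliding_windows(100)
    print(len(windows), windows[0], windows[-1])  # 61 (0, 40) (60, 100)
```

A sequence shorter than the window size yields no windows under this sketch.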
All time windows that contain the i-th intelligent factory data are obtained and recorded as the plurality of belonging time windows of the i-th intelligent factory data;
the time difference between the i-th intelligent factory data and each data in its j-th belonging time window is calculated, all data in the j-th belonging time window are sorted by this time difference from small to large, and the first several data are taken as the plurality of neighboring data of the i-th intelligent factory data in its j-th belonging time window, the number of neighboring data being 10 in this scheme;
the standard deviation of all data in the j-th belonging time window of the i-th intelligent factory data is calculated and taken as the standard deviation of the j-th belonging time window of the i-th intelligent factory data.
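The three steps above can be sketched as follows. Several details are assumptions: windows are `(start, end)` index ranges with `end` exclusive, the data item itself is excluded from its own neighbors, ties in time difference keep sequence order, and the population standard deviation (`statistics.pstdev`) is used. All function names are illustrative.

```python
from statistics import pstdev


def belonging_windows(i, n, size=40):
    """All size-length windows (step 1) over a length-n sequence that contain index i."""
    first = max(0, i - size + 1)
    last = min(i, n - size)
    return [(s, s + size) for s in range(first, last + 1)]


def neighbours_in_window(i, window, times, m=10):
    """Indices of the m data in the window closest in time to index i (i itself excluded)."""
    start, end = window
    others = [p for p in range(start, end) if p != i]  # assumption: exclude i itself
    others.sort(key=lambda p: abs(times[p] - times[i]))  # stable sort keeps order on ties
    return others[:m]


def window_std(values, window):
    """Population standard deviation of all data in the window (assumed variant)."""
    start, end = window
    return pstdev(values[start:end])


if __name__ == "__main__":
    times = list(range(100))
    print(len(belonging_windows(50, 100)))                # 40
    print(neighbours_in_window(50, (30, 70), times, 4))   # [49, 51, 48, 52]
```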
According to the plurality of belonging time windows of the i-th intelligent factory data, and the plurality of neighboring data and the standard deviation of each belonging time window, the difference of the i-th intelligent factory data relative to each belonging time window is obtained:
[Formula image not available]
wherein the terms of the formula are as follows.
The i-th intelligent factory data, and the k-th neighboring data of the j-th belonging time window of the i-th intelligent factory data.
The difference between the i-th intelligent factory data and the k-th neighboring data of its j-th belonging time window: the larger its value, the more the value of the i-th intelligent factory data differs from the values of the data adjacent to it in the time series.
The normalized value of the abnormal score of the k-th neighboring data of the j-th belonging time window of the i-th intelligent factory data: the larger its value, the larger the abnormal score of that neighboring data, and so the less reference value that neighboring data has when it is used for abnormality analysis of the i-th intelligent factory data.
The number of neighboring data in the j-th belonging time window of the i-th intelligent factory data.
The standard deviation of the j-th belonging time window of the i-th intelligent factory data: the larger its value, the larger the differences among the data in the window, and so the smaller the difference of the i-th intelligent factory data relative to the window.
The difference of the i-th intelligent factory data relative to its j-th belonging time window.
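Since the formula image is not available, the sketch below shows only one expression that agrees with the stated relationships: neighbor differences are down-weighted by the neighbor's normalized abnormal score, averaged over the neighbors, and divided by the window standard deviation. The weighting `(1 - score)`, the small constant `eps` guarding against a zero standard deviation, and the name `window_difference` are assumptions, not the patent's formula.

```python
def window_difference(x_i, neighbour_values, neighbour_norm_scores, sigma, eps=1e-9):
    """ASSUMED form: mean of (1 - score_k) * |x_i - x_k|, divided by the window std."""
    m = len(neighbour_values)
    weighted = sum(
        (1.0 - s) * abs(x_i - x)  # assumption: abnormal neighbours count for less
        for x, s in zip(neighbour_values, neighbour_norm_scores)
    )
    return weighted / (m * (sigma + eps))  # eps is a hypothetical guard for sigma == 0


if __name__ == "__main__":
    # Second neighbour is far away but fully abnormal (score 1), so it is ignored.
    print(window_difference(10.0, [9.0, 20.0], [0.0, 1.0], sigma=1.0))
```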
2. Calculating a first degree of anomaly of each smart plant data:
In the time series, each intelligent factory data may fall within a plurality of belonging time windows, so the i-th intelligent factory data has a plurality of differences relative to window data. The first abnormal degree of the i-th intelligent factory data in the time series is determined from its differences relative to its plurality of belonging time windows:
[Formula image not available]
wherein the terms of the formula are as follows.
The difference of the i-th intelligent factory data relative to its j-th belonging time window: the larger its value, the greater the abnormal degree of the i-th intelligent factory data.
The number of belonging time windows of the i-th intelligent factory data.
The mean of the differences of the i-th intelligent factory data over its belonging time windows, which reflects as a whole the relative difference of the i-th intelligent factory data in the time series: the larger this mean, the greater the abnormal degree of the i-th intelligent factory data in the time series.
The standard deviation of the differences of the i-th intelligent factory data over all of its belonging time windows.
The first abnormal degree of the i-th intelligent factory data.
When the first abnormal degree is determined, the relative difference of each data item is computed from the differences between time-series data and from the relation of the data to the other data in its time window. This avoids flagging data as abnormal merely because trend changes produce large differences between data. The window computation also takes the abnormal scores of the other data into account, so that data with high abnormal scores in the window do not distort the abnormality judgment of the remaining data. Finally, the abnormal degree of the data on the time series is obtained from the relative differences in the several windows that contain the data; considering how those relative differences vary across windows further reflects the local abnormality of the data.
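A minimal sketch of this aggregation step follows. The source formula is an image, so the way the mean and the standard deviation are combined here, `mean * (1 + standard deviation)`, is an assumption that only preserves the stated behavior: a larger mean difference and a larger spread across windows both raise the first abnormal degree. The function name is hypothetical.

```python
import statistics


def first_abnormal_degree(window_differences):
    """First abnormal degree of one data item from its per-window differences.

    ASSUMED combination (not the patent's formula): the mean difference over
    all belonging time windows, scaled up by the spread of those differences.
    """
    if not window_differences:
        raise ValueError("the data item must belong to at least one time window")
    mean_difference = statistics.fmean(window_differences)
    # The spread across windows captures how locally the abnormality shows up.
    spread = statistics.pstdev(window_differences)
    return mean_difference * (1.0 + spread)
```

For example, three identical differences of 1.0 give a degree of 1.0, while differences of 0.5 and 1.5 have the same mean but a spread of 0.5 and give 1.5.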
3. Calculating the second abnormal degree of each intelligent factory data:
Combining the abnormal score of each intelligent factory data with its first abnormal degree, the second abnormal degree of the data is determined as:
[Formula image DEST_PATH_IMAGE047: second abnormal degree]
wherein the symbols (rendered as images in the original) denote the abnormal score of the given intelligent factory data and the first abnormal degree of the given intelligent factory data; the larger these values, the larger the second abnormal degree of the data.
This yields the second abnormal degree of each intelligent factory data. The anomaly analysis combines the CBLOF algorithm with the time-series variation of the data: it obtains an abnormal score for each data item and an abnormal degree on the time series, so that abnormality is judged from both the overall data distribution and the time series. The time-series abnormal degree incorporates the abnormal score, which strengthens the joint effect of the two measures. The resulting final abnormal degree serves as the basis for abnormality classification, so the data are classified more accurately.
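As an illustration only, the combination can be sketched as a plain product. The patent's formula is an image, so the product is an assumption; it merely satisfies the stated property that a larger abnormal score and a larger first abnormal degree both raise the second abnormal degree. Both inputs are assumed to be non-negative, and the function name is hypothetical.

```python
def second_abnormal_degrees(abnormal_scores, first_degrees):
    """Second abnormal degree of every data item.

    ASSUMED combination (not the patent's formula): the element-wise product
    of the cluster-based abnormal score and the time-series first abnormal degree.
    """
    if len(abnormal_scores) != len(first_degrees):
        raise ValueError("one abnormal score and one first degree per data item")
    return [score * degree for score, degree in zip(abnormal_scores, first_degrees)]
```

With a product, a data item needs both a high abnormal score and a high first abnormal degree to rank near the top, which is one way of reading the "joint effect" described above.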
Step S004: obtain an abnormal data set and a normal data set according to the second abnormal degree of each intelligent factory data, and store the abnormal data set and the normal data set in a distributed manner.
The intelligent factory data are arranged in descending order of second abnormal degree. The set formed by the leading intelligent factory data in this order (the cutoff is given as formula image DEST_PATH_IMAGE049 in the original) is taken as the abnormal data set, and the set formed by the remaining intelligent factory data in the intelligent factory data sequence is taken as the normal data set.
The abnormal data set and the normal data set are stored in a distributed manner so that abnormal data can be queried and retrieved quickly.
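The ordering and split in step S004 can be sketched as follows. The cutoff appears only as an image in the source, so it is passed in here as a hypothetical parameter `abnormal_count`; the function name is also an assumption, and the distributed storage itself is not modeled.

```python
def split_by_abnormal_degree(records, second_degrees, abnormal_count):
    """Split records into (abnormal, normal) sets by descending second abnormal degree.

    `abnormal_count` is a hypothetical parameter standing in for the cutoff
    that the source gives only as a formula image.
    """
    if len(records) != len(second_degrees):
        raise ValueError("one second abnormal degree per record")
    # Indices of the records, largest second abnormal degree first.
    order = sorted(range(len(records)), key=lambda i: second_degrees[i], reverse=True)
    abnormal = [records[i] for i in order[:abnormal_count]]
    normal = [records[i] for i in order[abnormal_count:]]
    return abnormal, normal
```

The two returned lists could then be written to separate storage partitions so that queries for abnormal data do not have to scan the normal data.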
In summary, the embodiment of the present invention provides an intelligent factory data classification method based on big data statistics. The method first reflects the likelihood that a cluster is itself abnormal from the size of the cluster, highlighting the influence of cluster size on data abnormality. It then judges the influence relationship among data in the same cluster from the time span of the data contained in the cluster, so that the time-series relationship of the data is taken into account and the abnormality judgment becomes more accurate. Next, the relative difference of each data item is determined from the differences between time-series data and from the relation of the data to the other data in its calculation window, which avoids flagging data as abnormal merely because trend changes produce large differences between data; the window calculation takes the abnormal scores of the other data into account so that abnormal data in the window do not distort the abnormality judgment of the remaining data. Finally, the time-series abnormal degree incorporates the abnormal score, strengthening the joint effect of the two measures, and the resulting final abnormal degree serves as the basis for abnormality classification, so the data are classified more accurately.
It should be noted that the order of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments. Specific embodiments have been described above. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit of the present invention.

Claims (3)

1. The intelligent factory data classification method based on big data statistics is characterized by comprising the following steps:
acquiring an intelligent factory data sequence formed by intelligent factory data, and acquiring a plurality of clusters according to the intelligent factory data sequence;
obtaining the time span of the cluster to which each smart factory data belongs according to each cluster, and obtaining the abnormal score of each smart factory data according to the clusters, the time span and the number of data contained in the clusters; obtaining a plurality of time windows according to the intelligent factory data sequence; obtaining the difference of each intelligent factory data relative to the time window according to the difference of the adjacent data in the time window and the abnormal score; obtaining a first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window; obtaining a second abnormal degree of each intelligent factory data according to the first abnormal degree and the abnormal score of the intelligent factory data;
obtaining an abnormal data set and a normal data set according to the intelligent factory data sequence and the second abnormal degree of each intelligent factory data, and performing distributed storage on the abnormal data set and the normal data set;
the method for obtaining the difference of each smart factory data relative to the time window according to the difference of the adjacent data in the time window and the abnormal score comprises the following steps:
acquiring a plurality of time windows of each smart factory data, calculating a time difference value between each smart factory data and each data in each time window, acquiring a plurality of adjacent data of each smart factory data in each time window according to the time difference value, acquiring a standard deviation of each time window of each smart factory data by using the data in each time window of each smart factory data, and acquiring a difference of each smart factory data relative to each time window of each smart factory data, namely a difference of each smart factory data relative to each time window, according to each adjacent data of each time window of each smart factory data, an abnormal score of each adjacent data and a standard deviation of each time window of each smart factory data;
the formula for obtaining the difference of each smart plant data relative to each belonging time window according to the adjacent data of each belonging time window of each smart plant data, the abnormal score of each adjacent data and the standard deviation of each belonging time window of each smart plant data is as follows:
[Formula image QLYQS_1: difference of each intelligent factory data relative to each belonging time window]
wherein the symbols (rendered as images in the original) denote, in order: the given intelligent factory data; an adjacent data of the given intelligent factory data in the given belonging time window; the normalized value of the abnormal score of that adjacent data; the number of adjacent data of the given intelligent factory data in the given belonging time window; the standard deviation of the given belonging time window of the given intelligent factory data; and the difference of the given intelligent factory data relative to the given belonging time window;
the method for obtaining a first anomaly degree of each smart plant data according to the difference of each smart plant data relative to each time window comprises the following steps:
acquiring a plurality of belonging time windows of each intelligent factory data, obtaining a standard deviation of each belonging time window of each intelligent factory data by using the data in that belonging time window, and obtaining a first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window and the standard deviation of each belonging time window of each intelligent factory data;
the formula for obtaining the first abnormal degree of each intelligent factory data according to the difference of each intelligent factory data relative to each time window and the standard deviation of each intelligent factory data belonging to each time window is as follows:
[Formula image QLYQS_21: first abnormal degree]
wherein the symbols (rendered as images in the original) denote, in order: the difference of the given intelligent factory data relative to each of its belonging time windows; the number of belonging time windows of the given intelligent factory data; the standard deviation of the differences of the given intelligent factory data over all time windows; and the first abnormal degree of the given intelligent factory data;
the method for obtaining the time span of the cluster to which the smart factory data belongs according to each cluster comprises the following steps:
and forming a data pair by any two intelligent factory data in the cluster to which the intelligent factory data belongs, calculating the time difference of the two intelligent factory data in each data pair, and obtaining the time span of the cluster to which the intelligent factory data belongs according to the time difference of all the data pairs in the cluster to which the intelligent factory data belongs.
2. The intelligent factory data classification method based on big data statistics as claimed in claim 1, wherein the method for obtaining the abnormal score of each intelligent factory data according to the clusters, the time span and the number of data contained in the clusters comprises:
acquiring the number of data contained in the cluster to which the intelligent factory data belongs, and recording the number of the data contained in the cluster to which the intelligent factory data belongs as the first number of the cluster to which the intelligent factory data belongs; acquiring the number of data contained in each cluster and recording the number as a second number, and acquiring the maximum value of the second number of all clusters and recording the maximum number; acquiring the distance between the data of each intelligent factory in each cluster and the center of the cluster to which the data belongs, and recording the distance as a first distance;
and obtaining the abnormal score of each intelligent factory data according to the time span, the first number, the maximum number and the first distance of the cluster to which each intelligent factory data belongs.
3. The intelligent plant data classification method based on big data statistics as claimed in claim 2, wherein the formula for obtaining the abnormal score of each intelligent plant data according to the time span, the first number, the maximum number and the first distance of the cluster to which each intelligent plant data belongs is as follows:
[Formula image QLYQS_31: abnormal score]
wherein the symbols (rendered as images in the original) denote, in order: the maximum number; the first number of the cluster to which the given intelligent factory data belongs; the time span of the cluster to which the given intelligent factory data belongs; the first distance of the given intelligent factory data; and the abnormal score of the given intelligent factory data.
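The cluster-level inputs named in claims 1 to 3 (first number, maximum number, time span, first distance) can be sketched as below. The claims say only that the time span is obtained from the time differences of all data pairs in the cluster; taking the largest pairwise difference is an assumption, as are the use of the arithmetic mean as the cluster center, the Euclidean distance, and the function name. The final abnormal score formula is an image in the source and is not reproduced.

```python
import math
from collections import defaultdict
from itertools import combinations


def cluster_quantities(points, timestamps, labels):
    """Per-item inputs for the abnormal score.

    Returns (first_number, max_number, time_span, first_distance), where the
    three lists hold one entry per data item.
    """
    members = defaultdict(list)
    for index, label in enumerate(labels):
        members[label].append(index)

    sizes, spans, centers = {}, {}, {}
    for label, indices in members.items():
        sizes[label] = len(indices)
        # ASSUMPTION: time span = largest time difference over all data pairs.
        gaps = [abs(timestamps[a] - timestamps[b]) for a, b in combinations(indices, 2)]
        spans[label] = max(gaps, default=0.0)
        # ASSUMPTION: cluster center = arithmetic mean of the member points.
        dims = len(points[indices[0]])
        centers[label] = tuple(
            sum(points[i][d] for i in indices) / len(indices) for d in range(dims)
        )

    max_number = max(sizes.values())
    first_number = [sizes[label] for label in labels]
    time_span = [spans[label] for label in labels]
    first_distance = [math.dist(points[i], centers[labels[i]]) for i in range(len(points))]
    return first_number, max_number, time_span, first_distance
```

The cluster labels would come from whatever clustering step produced the clusters; any integer or string labels work here.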
CN202211432355.9A 2022-11-16 2022-11-16 Intelligent factory data classification method based on big data statistics Active CN115510302B (en)


Publications (2)

Publication Number Publication Date
CN115510302A CN115510302A (en) 2022-12-23
CN115510302B true CN115510302B (en) 2023-04-07

Family

ID=84513736


Country Status (1)

Country Link
CN (1) CN115510302B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant