CN112004204B - High-dimensional data anomaly detection method based on layered processing in industrial Internet of things - Google Patents

Info

Publication number
CN112004204B
Authority
CN
China
Prior art keywords
data
abnormal
anomaly detection
industrial
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010805928.2A
Other languages
Chinese (zh)
Other versions
CN112004204A (en)
Inventor
韩光洁
屠隽弢
刘立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Campus of Hohai University
Priority to CN202010805928.2A
Publication of CN112004204A
Application granted
Publication of CN112004204B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/38 Services specially adapted for particular environments, situations or purposes for collecting sensor information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/11 Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/70 Services for machine-to-machine communication [M2M] or machine type communication [MTC]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Operations Research (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a high-dimensional data anomaly detection method based on layered processing in the industrial Internet of things. First, in the data preprocessing stage, trust selection is carried out on the data collected by monitoring equipment: a trust verification model is constructed from the spatio-temporal correlation of the data using mutual information, eliminating the disturbance caused by industrial noise and machine aging. Second, a single-source anomaly detection algorithm is executed to obtain a local anomaly detection result. The method takes into account the differences among heterogeneous devices in type, start time, and transmission time, and makes full use of the characteristics of time-series data in the industrial environment. Timestamped data is received and transmitted through a data buffer queue model, which effectively reduces the data transmission overhead and processing delay between networks of different granularity. Finally, a multi-source anomaly detection algorithm is run on the edge nodes and the data situation is analyzed to obtain a global data anomaly detection result. The method meets the low-load and low-delay requirements of Internet of things devices in the context of industrial big data, and improves the accuracy and reliability of data anomaly detection.

Description

High-dimensional data anomaly detection method based on hierarchical processing in industrial Internet of things
Technical Field
The invention belongs to the technical field of industrial Internet of things security protection, and particularly relates to a method for identifying and protecting sensitive industrial data in the sensing layer and the network layer of the industrial Internet of things.
Background
The fourth industrial revolution, accelerated by the Industrial Internet of Things (IIoT), has triggered a global boom. With the continuous development of the traditional sensor network architecture, IIoT has made great progress. Edge computing combines core functions such as networking, computing, storage, and applications into an open platform that eliminates redundant data, extracts key information, and relieves transmission pressure. By following the changes of edge objects in the edge intelligent sensing network, IIoT gains the capabilities of sensing, computing, deciding, and transmitting, so intelligent edge integration has wide application prospects in IIoT. IIoT supported by cognitive technology realizes semantic representation, sensing data association, and AI modeling in the network plane, and improves the comprehension capability of the network. However, traditional cognitive technology lacks a theoretical basis for high-level decisions and cannot recommend an optimal operation scheme based on IIoT data. With the rapid development of information science, machine learning based on intelligent computing methods plays an important role in edge-intelligent IIoT.
The industrial Internet of things network can be divided mainly into a sensing layer, a network layer, and an application layer. A large number of heterogeneous IIoT nodes are deployed in the sensing layer; they are responsible for collecting data from peripheral equipment and mainly perform sensing and specific lightweight computing tasks. The edge server in the network layer integrates machine learning, processes the data, and provides reliable judgments for high-level decision making. Machine learning mainly includes several processes: sensing, understanding, learning, judging, and reasoning. To obtain reliable decisions, data-driven machine learning algorithms place high demands on data quality.
Practical industrial Internet of things application scenarios usually involve many limitations. First, data quality plays a crucial role in event detection, and background noise and the irreversible aging of equipment in an industrial environment cause the monitored data to drift. In addition to severe resource limitations and environmental problems, IIoT devices are vulnerable to external attackers, which makes the collected industrial data unreliable and unable to meet detection requirements. Most traditional anomaly detection methods consider only ideal conditions and neglect the necessary screening of input data in the preprocessing stage, which leads to obvious detection errors and even inverted results. Second, industrial production has very high timeliness requirements. Equipment in the sensing layer often cannot execute complex calculations, so data is processed centrally and offline by an edge server, which causes high delay and high resource occupancy and leaves emergencies without timely and effective early warning. Meanwhile, the large-scale deployment of heterogeneous equipment produces massive high-dimensional data; because the sensor nodes differ in type, start time, and transmission period, the actual data flow is not a time-continuous matrix, which wastes channel resources.
Existing data detection methods generally cannot reflect the overall state of data anomalies: data is simply marked as 'normal' or 'abnormal', and analysis and prediction of the anomaly results are lacking. Data may be detected as abnormal because of equipment faults, operating state changes, external intrusion, emergencies, or transmission disturbance. The information value contained in these different anomaly results differs, and it directly influences the decisions made by the system. When key nodes in a sensitive area are damaged, important data may carry false information and the control center may lack the support of related data, so that the whole industrial production system stalls, or irreversible serious accidents and disasters occur, causing huge economic and property losses.
Therefore, a more complete high-dimensional industrial data anomaly detection method is needed, one that uses the computing capability of the IIoT equipment itself to the maximum extent, improves detection reliability and accuracy, reduces processing delay and communication cost, and provides strong data support for system decisions.
Disclosure of Invention
To address the above problems, the invention provides a high-dimensional data anomaly detection method based on layered processing in the industrial Internet of things. Under actual industrial environment conditions, a trusted data verification model based on a multivariate Gaussian distribution is constructed, comprehensively considering the differences of heterogeneous equipment in spatio-temporal distribution, noise interference, drift caused by equipment aging, and data falsification by unknown attackers. A time window T and a data buffer queue are added to remove the limitation caused by the uneven dynamic distribution of high-dimensional industrial data. Local and global anomaly information is obtained with single-source and multi-source anomaly detection algorithms respectively, achieving low-delay, high-precision detection and ultimately providing strong data support for system decisions.
To achieve the above technical purpose and effect, the invention is realized through the following technical scheme:
A high-dimensional data anomaly detection method based on hierarchical processing in an industrial Internet of things comprises the following steps:
(1) building trusted data verification model
Dividing the whole industrial production area into a plurality of sub-areas according to different work tasks; each subarea is provided with a plurality of heterogeneous devices for monitoring the running condition of the machine, each heterogeneous device is provided with a plurality of sensor nodes, and sensing data D of each type are collected to provide decision basis for the control system; by dividing a time window T, the equipment exchanges data information with a neighbor node in a communication range of the equipment in a wireless transmission mode; constructing a trusted data verification model, calculating the credibility of data, reducing noise and disturbance caused by instrument aging by using a real state updating mechanism, and obtaining data with high monitoring quality;
(2) local data anomaly detection
The heterogeneous equipment of the perception layer has light-weight computing capacity and performs local anomaly detection on data which passes trust verification; setting corresponding normal intervals by combining historical data samples according to the difference of working environments in different areas, and analyzing an abnormal detection result by adopting a fuzzy theory to obtain the data abnormal degree distribution condition of a single source; the time window T is added into the data buffer area queue, so that the problem of uneven dynamic distribution of high-dimensional industrial data is solved, the communication overhead in the data transmission process is reduced, and the detection efficiency is improved;
(3) global data anomaly detection
The edge server can obtain all monitoring data information in the area where the edge server is located, detect the local abnormal labeling data in the time window T again, execute a multi-source abnormal detection algorithm, and avoid result deviation caused by single data detection; and distinguishing isolated anomalies and aggregation anomalies, analyzing the variation trend of the anomalous data through the functional relation in the time domain, predicting the causes of the anomalies, retaining data containing sensitive information, effectively evaluating the global data anomaly result and finally transmitting the global data anomaly result to a control center.
In step (1), the whole area is divided into a plurality of sub-regions according to different tasks and is represented in set form as Z = {Z_1, Z_2, …, Z_e}, where e denotes the number assigned to each sub-region; the sensing devices within sub-region Z_e are each assigned a unique ID and a pair of encryption keys to maintain security during data collection.
The expression of the various types of sensing data collected by the heterogeneous equipment in step (1) is
Figure BDA0002629115940000031
with n ≤ W_m, where D_j represents the data matrix collected by device j, W_m represents the number of all data types, and X_n represents the data flow vector of a given attribute type, expressed as
Figure BDA0002629115940000032
Figure BDA0002629115940000033
where x_t represents the data point collected at time t.
In step (1), the credibility of the data collected by the heterogeneous equipment in different states is obtained by constructing a trusted data verification model; when the credibility of a data point falls below the tolerable confidence interval, the data point is judged untrustworthy and discarded. The credibility S_k is calculated by the formula
Figure BDA0002629115940000034
where S_k represents the trust value of sensing device k,
Figure BDA0002629115940000035
represents the actually observed data,
Figure BDA0002629115940000036
represents the estimated true value,
Figure BDA0002629115940000037
represents the mean of the data, and d represents the distance function.
The distance function d is calculated by the following formula:
Figure BDA0002629115940000041
where std_m represents the standard deviation of the actual observations.
In step (1), a real state updating mechanism is used to reduce noise and the interference caused by instrument aging; the real state value is updated as follows:
Figure BDA0002629115940000042
The expression of the normal data interval Tr in step (2) is
Figure BDA0002629115940000043
where
Figure BDA0002629115940000044
represents the lower bound of the type-i data collected by device j, and
Figure BDA0002629115940000045
represents the upper bound of the type-i data collected by device j. The anomaly degree is calculated as follows:
Figure BDA0002629115940000046
where 1 ≤ t ≤ T and T represents the length of the time window. Because industrial environments differ, an anomaly degree that only takes the two values 1 or 0 cannot reflect the actual situation, so the anomaly degree formula is rewritten as:
Figure BDA0002629115940000047
In step (2), the data buffer queue uploads the local data in the time window T to the edge server in queue form, where the processing delay ω of the data stored in the queue needs to satisfy the following condition:
Figure BDA0002629115940000048
where Cnt(Q_j) represents the maximum queue length of device j and p represents the processing delay of a single data item.
In step (3), the multi-source anomaly detection algorithm executed on the edge server adds a spatial dimension to the single-source anomaly detection algorithm. Its discriminant formula is:
Figure BDA0002629115940000049
Condition needs to satisfy the following:
Figure BDA0002629115940000051
where H(i, j) represents a highly correlated data set, Ψ represents the valid detection coefficient, m and n represent the IDs of the related heterogeneous devices, and t represents the time at which the data was acquired.
In step (3), isolated anomalies and aggregate anomalies are distinguished using the timestamps of the data. An isolated anomaly is characterized by normal data points on either side of the abnormal value, whereas an aggregate anomaly is characterized by abnormal values that appear consecutively. The cause of an anomaly is predicted by computing the partial derivatives of the data curve before and after the anomaly and locating the inflection points, and sensitive data information is retained.
The invention has the beneficial effects that:
Trust selection is carried out on the data in the preprocessing stage: a trust verification model based on a multivariate Gaussian distribution is applied according to the spatio-temporal correlation, eliminating the disturbance caused by industrial noise and machine aging and guaranteeing high-quality industrial data acquisition. Equipment in the sensing layer executes a lightweight single-source anomaly detection algorithm to obtain local anomaly results. A data buffer queue model is added when time-series data is received and sent, avoiding the problem of uneven dynamic distribution of industrial data. The multi-source anomaly detection algorithm run on the edge nodes makes reasonable inferences about the data situation and finally yields the global data anomaly detection result. The method effectively reduces processing delay and computational load, provides reliable data support for system decisions, and is of great significance for protecting the ecosystem and security of the whole IIoT.
Drawings
FIG. 1 is a diagram of a network model according to an embodiment of the present invention;
FIG. 2 is a data buffer queue model according to an embodiment of the present invention;
FIG. 3 is a diagram of a multi-source data anomaly analysis according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following detailed description of the principles of the invention is provided in connection with the accompanying drawings.
The method aims to solve the following problems under actual industrial conditions: guaranteeing the quality of the input monitoring data, avoiding disturbance caused by noise interference and equipment aging, and handling the uneven dynamic distribution and heterogeneity of high-dimensional industrial data. At the same time, it aims to improve the efficiency and precision of online learning, accurately predict the cause of an anomaly, and satisfy as far as possible the IIoT constraints of low time delay, low energy consumption, low resource occupancy, and wide-range reliable transmission. To this end, the invention provides a data anomaly detection method based on hierarchical processing in the industrial Internet of things, which comprises the following steps:
step one, building a trusted data verification model
As shown in fig. 1, the whole industrial production area is divided into a plurality of sub-areas according to different work tasks; each subarea is provided with a plurality of heterogeneous devices for monitoring the running condition of the machine, each heterogeneous device is provided with various sensors, and various types of sensing data D are collected to provide important decision basis for the control system; by dividing a time window T, the equipment exchanges data information with a neighbor node in a communication range of the equipment in a wireless transmission mode; and constructing a trusted data verification model, calculating the credibility of the data, and reducing the disturbance caused by noise and instrument aging by using a real state updating mechanism so as to obtain the data with high monitoring quality.
In order to effectively process industrial data collected in a large range, reduce processing time delay and calculation redundancy and improve detection efficiency, IIoT equipment in the same area can exchange data. Each sub-area is assigned a respective edge server. The edge servers belong to active equipment, are deployed in an IIoT network layer and process industrial data uploaded by equipment in a perception layer, and can communicate through ISDN gateways and feed back global abnormal results to a control center.
The whole industrial area is represented in set form as Z = {Z_1, Z_2, …, Z_e}, where e denotes the number assigned to each sub-area; the sensing devices within sub-area Z_e are each assigned a unique ID and a pair of encryption keys to maintain security during data collection.
The expression of each type of sensing data collected by the heterogeneous equipment is
Figure BDA0002629115940000061
with n ≤ W_m, where D_j represents the data matrix collected by device j, W_m represents the number of all data types, and X_n represents the data flow vector of a given attribute type, expressed as
Figure BDA0002629115940000062
where x_t represents the data point collected at time t. Perception-layer devices communicate wirelessly with the nodes within their single-hop range and transmit the data acquired within the time window T to each other.
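As an illustration of the data layout described above, the following Python sketch groups the timestamped readings of one device into per-attribute stream vectors within a time window T. The names `Reading` and `build_data_matrix` and the attribute labels are hypothetical and do not come from the patent; the patent's own expression is given in the figures.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Reading(NamedTuple):
    attribute: str  # data type, e.g. "temperature" (hypothetical label)
    t: int          # time slot inside the window, 1 <= t <= T
    value: float    # data point x_t


def build_data_matrix(readings: List[Reading], T: int) -> Dict[str, List[float]]:
    """Return D_j as a mapping attribute -> stream vector X_n, ordered by time.

    Readings whose time slot falls outside 1..T are ignored. Streams may have
    different lengths because sensors start and report at different times.
    """
    streams = defaultdict(list)
    for r in sorted(readings, key=lambda r: r.t):
        if 1 <= r.t <= T:
            streams[r.attribute].append(r.value)
    return dict(streams)


D_j = build_data_matrix(
    [
        Reading("temperature", 2, 41.5),
        Reading("temperature", 1, 40.0),
        Reading("vibration", 3, 0.12),
        Reading("temperature", 9, 99.0),  # outside the window, dropped
    ],
    T=5,
)
```

Storing ragged streams per attribute, rather than a dense matrix, reflects the point made in the background that the actual data flow is not a time-continuous matrix.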
An attack by an external attacker can destroy the authenticity of the original data and hijack an IIoT device so that it sends false information. Therefore, credibility must be calculated in the preprocessing stage to keep low-quality data out of the input. The credibility of the data collected by the heterogeneous equipment in different states is obtained by constructing a trusted data verification model; when the credibility of a data point falls below the tolerable confidence interval, the data point is judged untrustworthy and discarded. The credibility S_k is calculated as follows:
Figure BDA0002629115940000063
where S_k represents the trust value of sensing device k,
Figure BDA0002629115940000071
represents the actually observed data,
Figure BDA0002629115940000072
represents the estimated true value,
Figure BDA0002629115940000073
represents the mean of the data, and d represents the distance function.
The distance function d is calculated as follows:
Figure BDA0002629115940000074
where std_m represents the standard deviation of the actual observations.
A real state updating mechanism is used to reduce noise and the interference caused by instrument aging; the real state value is updated as follows:
Figure BDA0002629115940000075
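The flow of trust verification followed by a state update can be sketched in Python as below. The patent's exact formulas appear only in the figures, so every functional form here is an assumption made for illustration: the distance is taken as the absolute deviation scaled by the standard deviation, the trust value as exp(-d), and the state update as exponential smoothing with a hypothetical factor `alpha`. The threshold value is likewise hypothetical.

```python
import math
from statistics import pstdev
from typing import List, Tuple


def distance(observed: float, estimated: float, std: float) -> float:
    """Assumed distance d: absolute deviation scaled by the standard deviation."""
    return abs(observed - estimated) / std if std > 0 else 0.0


def trust_value(observed: float, estimated: float, std: float) -> float:
    """Assumed trust S_k in (0, 1]; equals 1 when observation matches estimate."""
    return math.exp(-distance(observed, estimated, std))


def filter_trusted(
    observations: List[float],
    initial_estimate: float,
    threshold: float = 0.2,  # hypothetical tolerable trust level
    alpha: float = 0.3,      # hypothetical smoothing factor for the state update
) -> Tuple[List[float], float]:
    """Keep observations whose trust reaches the threshold and update the state."""
    std = pstdev(observations)
    estimate = initial_estimate
    kept = []
    for x in observations:
        if trust_value(x, estimate, std) >= threshold:
            kept.append(x)
            # Assumed real-state update: move the estimate toward trusted data.
            estimate = (1 - alpha) * estimate + alpha * x
    return kept, estimate


kept, final_estimate = filter_trusted([10.0, 10.2, 9.9, 50.0, 10.1], 10.0)
```

Only trusted points feed the state update in this sketch, so a single falsified reading is discarded without shifting the estimate used to judge later readings.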
step two, local data anomaly detection
The equipment in the perception layer has light computing power and performs local anomaly detection on data which passes trust verification; due to the limitation of self conditions of the equipment, corresponding normal intervals are set according to the difference of working environments in different areas and by combining with historical data samples, and the abnormal detection results are analyzed by adopting a fuzzy theory to obtain the distribution condition of the abnormal degree of the data of a single source.
As shown in fig. 2, the heterogeneous devices are different in start time, monitoring period, and transmission time slot, and the original transmission mode cannot meet the requirement of time delay. By utilizing the time window T and adding a new data buffer queue, the problem of uneven dynamic distribution of high-dimensional industrial data is solved, the communication overhead in the data transmission process is reduced, and the detection efficiency is improved.
The expression of the normal data interval Tr is
Figure BDA0002629115940000076
where
Figure BDA0002629115940000077
represents the lower bound of the type-i data collected by device j, and
Figure BDA0002629115940000078
represents the upper bound of the type-i data collected by device j. The anomaly degree is calculated as follows:
Figure BDA0002629115940000079
where 1 ≤ t ≤ T and T represents the length of the time window. Because industrial environments differ, an anomaly degree that only takes the two values 1 or 0 cannot reflect the actual situation, so the anomaly degree formula is rewritten as:
Figure BDA00026291159400000710
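The difference between the two-valued and the graded anomaly degree can be sketched in Python as follows. The graded form used here, a linear ramp that grows with the distance outside the normal interval and saturates at 1, is an assumption chosen for illustration; the patent's rewritten formula is given only in the figure. The function names are hypothetical.

```python
def binary_degree(x: float, lower: float, upper: float) -> int:
    """Two-valued anomaly degree: 0 inside the normal interval, 1 outside."""
    return 0 if lower <= x <= upper else 1


def graded_degree(x: float, lower: float, upper: float) -> float:
    """Assumed graded anomaly degree in [0, 1].

    0 inside the normal interval; outside, the excess over the nearest bound
    is scaled by the interval width and capped at 1.
    """
    width = upper - lower
    if width <= 0:
        raise ValueError("upper bound must exceed lower bound")
    if x < lower:
        excess = lower - x
    elif x > upper:
        excess = x - upper
    else:
        return 0.0
    return min(1.0, excess / width)


# A reading slightly above the interval [20, 30] and one far above it.
slight = graded_degree(32.5, 20.0, 30.0)
severe = graded_degree(100.0, 20.0, 30.0)
```

Both readings get the same binary degree of 1, while the graded degree separates a marginal excursion from a severe one, which is the motivation the text gives for rewriting the formula.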
The data buffer queue uploads the local data in the time window T to the edge server in queue form, where the processing delay ω of the data stored in the queue needs to satisfy the following condition:
Figure BDA0002629115940000081
where Cnt(Q_j) represents the maximum queue length of device j and p represents the processing delay of a single data item.
Step three, global data anomaly detection
The edge server can obtain all monitoring data in its region. It first re-detects the data labeled as locally abnormal within the time window T by executing a multi-source anomaly detection algorithm, which avoids the result deviation caused by detection on a single data source. It then distinguishes isolated anomalies from aggregate anomalies, analyzes the variation trend of the abnormal data through its functional relation in the time domain, predicts the causes of the anomalies, retains the data containing sensitive information, evaluates the global data anomaly result, and finally transmits that result to the control center.
The multi-source anomaly detection algorithm executed on the edge server adds a spatial dimension to the single-source anomaly detection algorithm. Its discriminant formula is:
Figure BDA0002629115940000082
Condition needs to satisfy the following:
Figure BDA0002629115940000083
where H(i, j) represents a highly correlated data set, Ψ represents the valid detection coefficient, m and n represent the IDs of the related heterogeneous devices, and t represents the time at which the data was acquired.
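The idea of re-checking local flags against correlated devices can be sketched in Python as below. The discriminant and its condition are given only in the figures, so the rule used here is an assumption: a local anomaly flagged by device m at time t is confirmed when the fraction of its highly correlated devices n that also flag time t reaches Ψ. The function name and the input layout are hypothetical.

```python
from typing import Dict, Set


def confirm_anomalies(
    flags: Dict[str, Set[int]],       # device ID -> time slots flagged locally
    correlated: Dict[str, Set[str]],  # device ID -> IDs of correlated devices
    psi: float,                       # assumed meaning of Ψ: required agreement
) -> Dict[str, Set[int]]:
    """Return, per device, the locally flagged time slots that peers corroborate."""
    confirmed: Dict[str, Set[int]] = {}
    for device, times in flags.items():
        peers = correlated.get(device, set())
        confirmed[device] = set()
        if not peers:
            continue  # nothing to corroborate against, leave unconfirmed
        for t in times:
            agreeing = sum(1 for n in peers if t in flags.get(n, set()))
            if agreeing / len(peers) >= psi:
                confirmed[device].add(t)
    return confirmed


result = confirm_anomalies(
    flags={"a": {3, 7}, "b": {3}, "c": {3, 4}},
    correlated={"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}},
    psi=0.5,
)
```

In this example only time slot 3 is corroborated across devices, while the flags at slots 4 and 7 have no support from correlated devices and are therefore not confirmed by this rule.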
As shown in fig. 3, the anomalies are aggregate anomalies, also called continuous anomalies. An emergency event spreads and persists, causing abnormal changes in the data over a certain period; the data on the two sides of an inflection point show different trends, and after the event ends, the data still detected as abnormal converges rapidly to the normal region.
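Locating the points at which the trend changes can be sketched in Python with discrete first differences. This is an interpretation, not the patent's procedure: the sketch treats an "inflection point" as a sample where the sign of the first difference flips, which matches the description that the data on its two sides show different trends. The function name is hypothetical.

```python
from typing import List


def trend_change_points(values: List[float]) -> List[int]:
    """Return indices where the series switches between rising and falling.

    The first difference approximates the derivative; a sign flip between
    consecutive differences marks a change of trend at the shared sample.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    points = []
    for i in range(1, len(diffs)):
        if diffs[i - 1] * diffs[i] < 0:
            points.append(i)
    return points


# Data rises during an event, peaks, then converges back toward normal.
series = [1.0, 2.0, 4.0, 7.0, 5.0, 3.0, 2.0]
peaks = trend_change_points(series)
```

For the example series the only trend change is at index 3, the peak value 7.0, with a rising trend before it and a falling trend after it.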
Isolated anomalies and aggregate anomalies are distinguished using the timestamps of the data. An isolated anomaly is characterized by normal data points on either side of the abnormal value, whereas an aggregate anomaly is characterized by abnormal values that appear consecutively. The cause of an anomaly is predicted by computing the partial derivatives of the data curve before and after the anomaly and locating the inflection points, and sensitive data information is retained.
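The timestamp-based distinction can be sketched in Python by grouping anomalous time slots into consecutive runs: a run of length one has normal neighbors and is isolated, while a longer run is an aggregate anomaly. Integer time slots and the function name are assumptions made for this illustration.

```python
from typing import List, Tuple


def classify_anomalies(anomalous_slots: List[int]) -> List[Tuple[str, List[int]]]:
    """Group anomalous time slots into runs and label each run.

    Slots are assumed to be integer indices, so consecutive anomalies differ
    by exactly one.
    """
    runs: List[List[int]] = []
    for t in sorted(set(anomalous_slots)):
        if runs and t == runs[-1][-1] + 1:
            runs[-1].append(t)  # extends the current consecutive run
        else:
            runs.append([t])    # starts a new run
    return [("isolated" if len(run) == 1 else "aggregate", run) for run in runs]


labels = classify_anomalies([4, 10, 11, 12, 20])
```

Here slots 4 and 20 are isolated anomalies and slots 10 to 12 form one aggregate anomaly.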
In summary:
The invention discloses a high-dimensional data anomaly detection method based on layered processing in the industrial Internet of things. Under actual industrial environment conditions, sub-regions are divided according to different production tasks, and trusted nodes are selected by constructing a trusted data verification model based on a multivariate Gaussian distribution, which ensures the quality of the input data. The method comprehensively considers the differences of heterogeneous equipment in spatio-temporal distribution, noise interference, disturbance caused by equipment aging, and data tampering and IIoT device hijacking by unknown attackers. A time window T and a data buffer queue are added, reducing the communication overhead of data transmission and removing the limitation caused by the uneven dynamic distribution of high-dimensional industrial data. Using fuzzy evidence theory, anomaly detection algorithms are executed in the sensing layer and the network layer respectively to obtain local and global anomaly information, achieving low-delay, high-precision detection. Finally, strong data support is provided for system decisions, ensuring the normal operation of the whole industrial production line.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (1)

1. A high-dimensional data anomaly detection method based on hierarchical processing in an industrial Internet of things is characterized by comprising the following steps:
(1) building trusted data verification model
Dividing the whole industrial production area into a plurality of sub-areas according to different work tasks; each subarea is provided with a plurality of heterogeneous devices for monitoring the running condition of the machine, each heterogeneous device is provided with a plurality of sensor nodes, and sensing data D of each type are collected to provide decision basis for the control system; by dividing a time window T, the heterogeneous equipment exchanges data information with a neighbor node in a communication range of the heterogeneous equipment in a wireless transmission mode; constructing a trusted data verification model, calculating the credibility of data, and reducing noise and disturbance caused by instrument aging by using a real state updating mechanism to obtain data with high monitoring quality;
(2) local data anomaly detection
The heterogeneous equipment of the perception layer has light computing power and performs local anomaly detection on data which passes trust verification; setting corresponding normal data intervals by combining historical data samples according to the difference of working environments in different regions, and analyzing an abnormal detection result by adopting a fuzzy theory to obtain the distribution condition of the abnormal degree of the data of a single source; the time window T is added into the data buffer queue, so that the problem of uneven dynamic distribution of high-dimensional industrial data is solved, the communication overhead in the data transmission process is reduced, and the detection efficiency is improved;
(3) global data anomaly detection
The edge server can obtain all monitoring data information in the area where the edge server is located, detect the local abnormal labeling data in the time window T again, execute a multi-source abnormal detection algorithm, and avoid result deviation caused by single data detection; distinguishing isolated anomalies and aggregate anomalies, analyzing the variation trend of the abnormal data through the functional relation in the time domain, predicting the reasons of the anomalies, reserving the data containing sensitive information, effectively evaluating the global data anomaly result and finally transmitting the global data anomaly result to a control center;
in the step (1), a plurality of sub-regions are divided according to different tasks, and the whole region is represented in set form as Z = {Z_1, Z_2, …, Z_e}, where e denotes the index assigned to each sub-region; the sensing devices in sub-region Z_e are each given a unique ID and a paired encryption key, which maintains security during data collection;
in the step (1), expressions of various types of sensing data collected by heterogeneous equipment are
Figure FDA0003728046940000011
Figure FDA0003728046940000012
Wherein D_j represents the data matrix collected by device j, W_m represents the number of all data types, and X_n represents the data stream vector of a given attribute type, expressed as
Figure FDA0003728046940000013
Wherein x_t represents a data point collected at time t;
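The layout described above can be pictured as a matrix with one row per attribute stream and one column per time step. The formulas themselves are figures that are not reproduced in this text, so the following Python sketch is only an illustration; the function name `build_data_matrix` and the attribute labels are hypothetical.

```python
from typing import Dict, List


def build_data_matrix(streams: Dict[str, List[float]]) -> List[List[float]]:
    """Stack per-attribute streams X_n (one row each) into a matrix D_j.

    All streams must cover the same time steps, so that column t holds the
    data points x_t of every attribute at time t.
    """
    lengths = {len(values) for values in streams.values()}
    if len(lengths) > 1:
        raise ValueError("all streams must have the same number of time steps")
    # Sort the attribute names so that the row order is deterministic.
    return [list(streams[name]) for name in sorted(streams)]


# Hypothetical readings from one device over a three-step window.
d_j = build_data_matrix({
    "temperature": [70.1, 70.4, 70.2],
    "vibration": [0.02, 0.03, 0.02],
})
```

Here `len(d_j)` plays the role of the number of data types and each row is one data stream vector.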
in the step (1), the credibility of the data collected by the heterogeneous equipment in different states is obtained by constructing a credible data verification model; when the credibility of the data is lower than a tolerable confidence interval, the data point is judged to be untrustworthy and discarded, and the credibility S_k is calculated by the formula
Figure FDA0003728046940000021
Wherein S k Representing the trustworthy value of the sensing device k,
Figure FDA0003728046940000022
the data representing the actual observations that are being made,
Figure FDA0003728046940000023
representing the estimated actual value of the measured value,
Figure FDA0003728046940000024
represents the mean of the data, d represents the distance function;
the calculation formula of the distance function d is
Figure FDA0003728046940000025
Wherein std_m represents the standard deviation of the actual observed value;
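Because the credibility and distance formulas are figures that are not reproduced in this text, the following Python sketch should be read as one plausible instance, not as the claimed formulas. It assumes that the distance is the absolute deviation between the observation and the estimated actual value scaled by the standard deviation, and that the credibility is a Gaussian-shaped function of that distance. The function names and the threshold `min_credibility` are hypothetical.

```python
import math
from statistics import pstdev


def standardized_distance(observed: float, estimate: float, std: float) -> float:
    """Assumed distance d: absolute deviation scaled by the standard deviation."""
    if std <= 0:
        raise ValueError("std must be positive")
    return abs(observed - estimate) / std


def credibility(observed: float, estimate: float, std: float) -> float:
    """Assumed Gaussian-shaped score in (0, 1]; 1.0 means observed == estimate."""
    d = standardized_distance(observed, estimate, std)
    return math.exp(-0.5 * d * d)


def filter_trusted(samples, estimate, min_credibility=0.1):
    """Discard data points whose credibility is below the tolerated level."""
    std = pstdev(samples)  # population standard deviation of the observations
    return [x for x in samples if credibility(x, estimate, std) >= min_credibility]
```

For example, `filter_trusted([10.0, 10.2, 9.8, 10.1, 30.0], estimate=10.0)` keeps the four readings near 10 and discards the reading of 30.0.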
in the step (1), a real state updating mechanism is used to reduce noise and interference caused by instrument aging, and the real state value updating process is as follows:
Figure FDA0003728046940000026
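The update formula is a figure that is not reproduced in this text. As a rough illustration of how a running estimate of the real state can damp noise and slow drift, the Python sketch below uses simple exponential smoothing; this is a swapped-in, assumed technique, and the names `update_true_state`, `track` and the parameter `gain` are hypothetical.

```python
def update_true_state(estimate: float, observed: float, gain: float = 0.2) -> float:
    """Move the estimate a fraction `gain` of the way toward the new observation."""
    if not 0.0 <= gain <= 1.0:
        raise ValueError("gain must lie in [0, 1]")
    return estimate + gain * (observed - estimate)


def track(observations, initial: float, gain: float = 0.2):
    """Return the sequence of estimates after each observation is applied."""
    estimates = []
    estimate = initial
    for x in observations:
        estimate = update_true_state(estimate, x, gain)
        estimates.append(estimate)
    return estimates
```

A small gain makes the estimate react slowly, which suppresses noise; a large gain follows the observations closely.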
the expression of the data normal interval Tr in the step (2) is
Figure FDA0003728046940000027
Wherein
Figure FDA0003728046940000028
Represents the lower bound of class i data collected by device j,
Figure FDA0003728046940000029
represents the upper bound of class i data collected by device j; the calculation formula of the abnormal degree is
Figure FDA00037280469400000210
Wherein 1 ≤ t ≤ T, and T represents the length of the time window; because differences between industrial environments affect the calculation of the abnormal degree, a binary result of 1 or 0 cannot reflect the actual situation, so the abnormal degree calculation formula is rewritten as
Figure FDA0003728046940000031
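The two abnormal degree formulas are figures that are not reproduced in this text. The Python sketch below only illustrates the contrast between a binary verdict and a graded (fuzzy) degree. It assumes that 1 marks a point outside the normal interval, and that the graded degree grows linearly with the distance outside the interval until it saturates at 1. All function names are hypothetical.

```python
def binary_anomaly(x: float, lower: float, upper: float) -> int:
    """Assumed binary verdict: 1 outside the normal interval, 0 inside it."""
    return 0 if lower <= x <= upper else 1


def fuzzy_anomaly_degree(x: float, lower: float, upper: float) -> float:
    """Assumed graded degree in [0, 1].

    0.0 inside the interval; outside it, the excursion divided by the
    interval width, capped at 1.0.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    if lower <= x <= upper:
        return 0.0
    excess = lower - x if x < lower else x - upper
    return min(1.0, excess / (upper - lower))


def window_degree(values, lower: float, upper: float) -> float:
    """Mean graded degree over the points collected in one time window."""
    if not values:
        raise ValueError("window is empty")
    return sum(fuzzy_anomaly_degree(v, lower, upper) for v in values) / len(values)
```

With a normal interval of [20, 30], a reading of 35 and a reading of 50 both receive the binary verdict 1, while their graded degrees are 0.5 and 1.0 respectively.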
In the step (2), the local data in the time window T is uploaded to the edge server through a data buffer queue, where the processing delay ω of the data stored in the queue must satisfy the following condition:
Figure FDA0003728046940000032
wherein Cnt (Q) j ) Represents the maximum queue length of the device j, and p represents the processing delay of single data;
the multi-source abnormality detection algorithm executed on the edge server in the step (3) adds a spatial dimension to the single-source abnormality detection algorithm, and its discriminant is
Figure FDA0003728046940000033
subject to the following condition
Figure FDA0003728046940000034
Wherein H(i, j) represents a high-correlation data set, ψ represents the effective detection coefficient, m and n represent the IDs of the related heterogeneous devices, and t represents the time at which the data are acquired;
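The discriminant and its condition are figures that are not reproduced in this text. The Python sketch below illustrates the general idea of adding a spatial dimension: a local anomaly flag from one device is confirmed only if enough highly correlated devices flagged the same time step. Treating the effective detection coefficient as a minimum agreeing fraction is an assumption, and the function name and data structures are hypothetical.

```python
from typing import Dict, Set


def confirm_global_anomaly(
    device_id: str,
    t: int,
    local_flags: Dict[str, Set[int]],
    correlated: Dict[str, Set[str]],
    psi: float = 0.5,
) -> bool:
    """Cross-check a local anomaly at time t against correlated devices.

    local_flags maps a device ID to the time steps it flagged locally;
    correlated maps a device ID to the IDs of its highly correlated peers.
    """
    if t not in local_flags.get(device_id, set()):
        return False  # nothing was flagged locally at this time step
    peers = correlated.get(device_id, set())
    if not peers:
        return True  # no peers to cross-check, so keep the local verdict
    agreeing = sum(1 for peer in peers if t in local_flags.get(peer, set()))
    return agreeing / len(peers) >= psi
```

Requiring agreement from correlated sources is one way to avoid the result deviation that a single data source can cause.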
in the step (3), the isolated anomalies and the aggregation anomalies are distinguished by using the time stamps of the data: an isolated anomaly is characterized in that the data points near the abnormal value are normal data, and an aggregation anomaly is characterized in that abnormal values appear continuously; the reasons for the abnormal values are predicted by solving the partial derivative of the abnormal data curve and the positions of its inflection points, and the data containing sensitive information are reserved.
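The distinction drawn in step (3) can be sketched in Python as follows. The sketch assumes evenly spaced time stamps, so that list positions stand in for time, and it uses discrete second differences as a stand-in for the derivative analysis of the abnormal data curve. The function names are hypothetical.

```python
from typing import List, Tuple


def classify_anomalies(flags: List[int]) -> List[Tuple[int, int, str]]:
    """Group consecutive anomalous time steps into runs and label each run.

    Returns (start, end, label) with inclusive indices. A run of length one
    is "isolated" (its neighbours are normal); a longer run is "aggregated".
    """
    runs = []
    start = None
    for t, flag in enumerate(flags):
        if flag and start is None:
            start = t
        elif not flag and start is not None:
            runs.append((start, t - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return [(s, e, "isolated" if s == e else "aggregated") for s, e in runs]


def inflection_points(values: List[float]) -> List[int]:
    """Indices where the discrete second difference strictly changes sign."""
    second = [
        values[i + 1] - 2 * values[i] + values[i - 1]
        for i in range(1, len(values) - 1)
    ]
    # second[k] belongs to values[k + 1]; exact zeros are not counted.
    return [k + 1 for k in range(1, len(second)) if second[k - 1] * second[k] < 0]
```

For the flag sequence `[0, 1, 0, 0, 1, 1, 1, 0]`, the first function reports an isolated anomaly at index 1 and an aggregated anomaly spanning indices 4 to 6.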

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010805928.2A CN112004204B (en) 2020-08-12 2020-08-12 High-dimensional data anomaly detection method based on layered processing in industrial Internet of things

Publications (2)

Publication Number Publication Date
CN112004204A CN112004204A (en) 2020-11-27
CN112004204B true CN112004204B (en) 2022-09-23

Family

ID=73463595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010805928.2A Active CN112004204B (en) 2020-08-12 2020-08-12 High-dimensional data anomaly detection method based on layered processing in industrial Internet of things

Country Status (1)

Country Link
CN (1) CN112004204B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112770053B (en) * 2021-01-05 2023-04-07 珠海市横琴盈实科技研发有限公司 Internet of things equipment collaborative linkage method
CN112800110B (en) * 2021-01-22 2022-09-16 国家电网有限公司技术学院分公司 Weak sensitive data abnormity detection system of power internet of things sensor
CN114650166B (en) * 2022-02-07 2023-08-01 华东师范大学 Fusion anomaly detection system for open heterogeneous network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105764162A (en) * 2016-05-10 2016-07-13 江苏大学 Wireless sensor network abnormal event detecting method based on multi-attribute correlation
CN110336860A (en) * 2019-06-13 2019-10-15 河海大学常州校区 Key node data guard method based on multidimensional data processing in industrial Internet of Things
CN111371543A (en) * 2020-01-08 2020-07-03 中国科学院重庆绿色智能技术研究院 Internet of things equipment access control method based on double-block chain structure

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multi-Dimensional Data Fusion Intrusion Detection for Stealthy Attacks on Industrial Control Systems; A. Yang et al.; 2018 IEEE Global Communications Conference (GLOBECOM), 2018, pp. 1-7, doi: 10.1109/GLOCOM.2018.8648131; 20190221; full text *
WSN abnormal node localization algorithm based on random matrix theory; Lin Chao et al.; Computer Engineering; 20200115 (No. 01); full text *

Also Published As

Publication number Publication date
CN112004204A (en) 2020-11-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant