CN101516099B - Test method for sensor network anomaly - Google Patents
- Publication number
- CN101516099B
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Mobile Radio Communication Systems (AREA)
- Arrangements For Transmission Of Measured Signals (AREA)
Abstract
The invention relates to a method for detecting anomalies in a sensor network. The network is first divided into clusters. In each cluster, the cluster head collects the high-dimensional data sequences of all nodes in the cluster for each unit time, and a hidden Markov model (HMM) is built for each unit time as the node high-dimensional data transition model. The models are then classified using model similarity as the criterion. For each class, all high-dimensional data sequences from the unit times whose models fall in that class are pooled, and an HMM is built again to obtain a new node high-dimensional data transition model, which is used for anomaly detection within the cluster. The method makes full use of the temporal and spatial correlation of sensor network data, reduces data redundancy and communication overhead, and prolongs the service life of the sensor nodes while achieving anomaly detection.
Description
Technical field
The present invention relates to wireless sensor networks, and in particular to a method for anomaly detection in wireless sensor networks.
Background art
In a wireless sensor network, the large volume of raw sensor data is generally highly correlated in space or time, so transmitting all of it to the base-station node (sink) wastes energy and is unnecessary. The main idea of data fusion (also called data aggregation) is for the nodes along the transmission path to cooperate interactively and remove correlated redundancy, greatly reducing the data volume so that the same or a comparable amount of information is represented with less data.
Some research results already exist on data fusion based on the spatial or temporal correlation of sensor node data, but no theoretical system has been formed. In particular, spatial correlation and temporal correlation are exploited in isolation rather than being combined organically and systematically, which has hindered further improvement of data fusion efficiency.
Existing data fusion models treat each local region independently and offer no principled explanation or model of the relations between regions or of multi-level information fusion. In fact, correlation between data is hierarchical and weakens gradually as distance increases, and correlation also exists between regions. Existing research deterministically delimits a zone or boundary as a local region and assumes that only the data inside it are locally correlated.
Existing data fusion models focus first on the sensor data and apply only simple processing when modeling the detected events that the data reflect, a gap comparable to that between amplitude coding and model-based coding in speech coding. In addition, existing data processing models give insufficient support to higher-level application characteristics. For example, for the same temperature data, data fusion that requires topology information (such as computing contour lines) uses entirely different methods from data fusion that requires only statistical information, and there is no unified mathematical model. Research on unified fusion of multiple kinds of sensor data (for example, temperature and humidity together) has also yet to produce significant results. The theories and methods of such data processing often require the original low-dimensional data set to be converted into a high-dimensional one, yet current research is mainly confined to low-dimensional applications, and the analysis and modeling of high-dimensional data sets has not been fully developed. Finally, existing data processing models mainly use linear systems to analyze and handle the spatial and temporal correlation of data, for example using a linear prediction model to predict readings and thereby reduce data transmission; real systems are often more complex, and linear system models often lack generality.
Many existing outlier detection algorithms decide whether a reading is an outlier by comparing a sensor node's data with the data of its neighboring nodes, which takes a long time and cannot guarantee accuracy. Within a data fusion framework, outlier detection can be recast as an inference problem under the given framework: for a given sequence, infer the most probable corresponding state sequence according to the existing data fusion framework. Judging by the likelihood value that the existing model yields for this state sequence is then more practical.
Summary of the invention
The object of the present invention is to provide an anomaly detection method for wireless sensor networks that takes full account of the temporal and spatial correlation of the data, effectively reduces data redundancy and communication overhead, prolongs the lifetime of the sensor nodes, and achieves anomaly detection.
A method for detecting anomalies in a sensor network, in which the network is divided into clusters and each cluster performs anomaly detection as follows:
Step 1) The cluster head collects the high-dimensional data sequences of all nodes in the cluster in the i-th unit time and, using these sequences as training samples, builds the node high-dimensional data transition model of the i-th unit time with the HMM construction method, i = 1, 2, ..., N, where N is the number of unit times extracted;
Step 2) Using transition model similarity as the classification criterion, the node high-dimensional data transition models of the 1st, 2nd, ..., N-th unit times are classified;
Step 3) For the initial node high-dimensional data transition models belonging to the j-th class, all high-dimensional data sequences in their corresponding unit times are pooled to form a new training sample, and the HMM construction method is used to build the j-th class node high-dimensional data transition model, j = 1, 2, ..., N1, where N1 is the number of classes obtained in step 2);
Step 4) The j-th class node high-dimensional data transition models are used to perform anomaly detection on the data collected by all nodes in the cluster.
The technical effects of the present invention are as follows:
(1) Multidimensional data fusion. Data fusion is performed on high-dimensional data sets in the network (for example, containing temperature and humidity together). An HMM is used to model the high-dimensional data set, giving a data fusion model suited to general data sets of any dimension;
(2) Reduced redundancy and overhead. The invention makes full use of the temporal and spatial correlation of the data collected by the sensor network. First, a clustering algorithm exploits spatial correlation: the data collected by the nodes within a cluster differ little, so modeling is done per cluster, which reduces model redundancy and simplifies subsequent anomaly detection on the data collected by the nodes in the cluster. Second, temporal correlation is exploited by classifying the models, rebuilding new training samples and building new models, which further reduces model redundancy. Because an HMM models the transitions of the high-dimensional data set, member nodes need not send all collected data to the cluster head during anomaly detection; they send only the transition sequence (that is, the sampled sequence), which significantly reduces communication overhead.
Description of drawings
Fig. 1 is a schematic diagram of the wireless sensor network structure of the present invention;
Fig. 2 is the overall flow chart of the anomaly detection method of the present invention;
Fig. 3 is a schematic diagram of the clustering result for the wireless sensor network;
Fig. 4 is a schematic diagram of the training samples;
Fig. 5 is a schematic diagram of the anomaly detection results: Fig. 5(a) shows the two-dimensional data transitions of the nodes in the cluster on March 17; Fig. 5(b) shows the two-dimensional data transitions of the nodes in the cluster on March 18; Fig. 5(c) shows the two-dimensional data of the nodes in the cluster on March 17 (node 3 excluded; the thick line is the transition curve of node 4); Figs. 5(d), 5(e) and 5(f) show the two-dimensional data of the nodes in the cluster on March 18 (node 3 excluded; the thick lines are the transition curves of nodes 2, 7 and 9, respectively).
Embodiment
To show the purpose, technical solution and advantages of this patent more clearly, the method is described in detail below with reference to the accompanying drawings and a concrete example.
The present invention addresses anomaly detection in wireless sensor networks. It performs data fusion on the high-dimensional data set in the network (readings containing temperature and humidity together), extracts useful information, and finally produces an anomaly detection result.
Fig. 1 shows the layout of the sensor network used in the embodiment. The sensor data used in the embodiment were collected by the Intel Berkeley Research Lab from nodes 1 to 54 between February 28 and April 5, 2004. The unit time is an integer multiple of 12 hours; the example uses a unit time of one day (24 hours).
Fig. 2 is the flow chart of the embodiment, which comprises the following steps:
Step 1: Cluster the network according to spatial adjacency.
For the network configuration shown in Fig. 1, the network is clustered as follows. Each node has a status flag, initialized to 0, where 0 means the node has not yet determined its state and 1 means the node has determined that it is a cluster head or a member node. Each node has a level flag, initialized to ∞. Each node also has a parent flag identifying its parent node. Each node selects its temporary ID at random from (0, N^3), where N is the total number of nodes in the network.
Step 11: The nodes communicate with one another, and each node obtains information about its neighbors within h hops. h may take any value; in this experiment h = 2. Each node whose status flag is 0 compares its own ID with the IDs of the neighbors within h hops whose status flag is also 0. If its own ID is the largest, it confirms itself as a cluster head: it sets its level flag to 0, indicating that the cluster head is at layer 0, sets its status flag to 1, and proceeds to step 12.
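The election rule of step 11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the adjacency dictionary `adj`, the helper names `neighbors_within` and `elect_heads`, and the idea of evaluating one round from a snapshot of the status flags are not taken from the patent text.

```python
from collections import deque


def neighbors_within(adj, start, h):
    """Return the nodes reachable from `start` in at most h hops (start excluded)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == h:
            continue  # do not expand beyond h hops
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    seen.discard(start)
    return seen


def elect_heads(adj, temp_id, status, h=2):
    """One election round (hypothetical helper): an undecided node (status 0)
    becomes a cluster head when its temporary ID exceeds that of every
    undecided neighbor within h hops. Status flags are read, not modified."""
    heads = []
    for node in adj:
        if status[node] != 0:
            continue
        rivals = [n for n in neighbors_within(adj, node, h) if status[n] == 0]
        if all(temp_id[node] > temp_id[r] for r in rivals):
            heads.append(node)
    return heads
```

A caller would then set the level flag of each returned node to 0 and its status flag to 1 before running the similarity test of step 12.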
Step 12: Every node whose status flag is 0 computes its similarity to the cluster heads within h hops. In the present invention the similarity of two nodes is expressed by the correlation coefficient. Let $x_k$ and $x_l$ be the sequences of samples read by nodes $S_k$ and $S_l$ in one day. The correlation coefficient $r_{k,l}$ is defined as

$$r_{k,l} = \frac{E\big[(x_k - E(x_k))(x_l - E(x_l))\big]}{\sqrt{E\big[(x_k - E(x_k))^2\big]\,E\big[(x_l - E(x_l))^2\big]}}$$

where $E(\cdot)$ denotes the mean.
The more similar the two curves are, the closer the correlation coefficient is to 1. If the correlation coefficient is greater than the correlation threshold (0.97 in this experiment), the node sets its status flag to 1 and becomes a member node.
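As a rough illustration of the similarity test in step 12, the sketch below computes the standard correlation coefficient of two equal-length scalar sequences and compares it with the 0.97 threshold used in the experiment. The function names, the restriction to scalar readings, and the choice to return 0.0 for a constant sequence are assumptions made for this example.

```python
import math


def correlation(x, y):
    """Correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant sequence is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def is_similar(x, y, threshold=0.97):
    """True when the two daily sequences pass the correlation threshold."""
    return correlation(x, y) > threshold
```

With 12 two-hour means per day, `x` and `y` would each hold 12 values for the two nodes being compared.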
Step 14: Each member node computes its similarity to all cluster heads within one hop and finds the cluster head with the largest correlation coefficient. If this correlation coefficient is greater than the correlation threshold (for example, greater than 0.97), the member node sets its level flag to 1 and its parent flag to the ID of that cluster head.
Step 15: Each node with level = ∞ obtains information about all nodes with level ≠ ∞ within h hops and selects, among those with level ≤ h-1, the node s most similar to itself. It then sets its own level flag to the level flag of node s plus 1 and its parent flag to s.
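The level and parent assignment of step 15 might look like the following sketch. It assumes the cluster heads (level 0) and their one-hop members (level 1) are already marked; the dictionary `candidates` (nodes within h hops of each node), the `similarity` callback, and the repeat-until-stable loop are hypothetical names and choices for this example.

```python
import math


def assign_levels(level, parent, candidates, similarity, h=2):
    """Repeat step 15 until no further node can be attached.

    level[n]      -- current level flag (math.inf when unassigned)
    parent[n]     -- current parent flag (None when unassigned)
    candidates[n] -- nodes within h hops of n
    similarity    -- callable (a, b) -> float, e.g. a correlation coefficient
    """
    changed = True
    while changed:
        changed = False
        for node in level:
            if level[node] != math.inf:
                continue
            # Only nodes already placed at level <= h-1 may act as parent.
            options = [c for c in candidates[node] if level[c] <= h - 1]
            if not options:
                continue
            best = max(options, key=lambda c: similarity(node, c))
            level[node] = level[best] + 1
            parent[node] = best
            changed = True
    return level, parent
```

Because a parent must have level ≤ h-1, no node ends up deeper than level h, which matches the stated maximum hop count of a cluster.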
The above is the cluster head election process. At this point every node in the network belongs to exactly one cluster, and the maximum hop count of each cluster is h, where h can be set when the program starts. The result of the clustering is shown in Fig. 3.
Next, each cluster, centered on its cluster head, analyzes and processes the data in the cluster to obtain the anomaly detection result for that cluster.
Step 2: Each cluster collects samples and trains hidden Markov models (HMMs). The samples comprise the data sequences of all nodes in the cluster over N days. Each node records one data value every two hours (the mean of all data the node collected in those two hours), giving 12 values per day; the sequence formed by these 12 values is one sample. To keep training stable, the sequences of all nodes in the cluster are gathered at the cluster head for training. The parameters of the HMMs are computed iteratively with the Baum-Welch algorithm, including the initial probability vector, the state transition probability matrix, the parameters of the Gaussian mixture model, and the mixture matrix. Fig. 4 (the cluster formed by nodes 4, 6, 7, 8, 9, 10, 11, 12 and 13 on March 1) shows that the transition curves of the data within one cluster are essentially consistent. The training input consists of multiple sequences, and the Baum-Welch iteration maximizes the sum of the probabilities of occurrence of these sequences under the model.
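The sampling described in step 2 (one mean value per two-hour window, 12 values per day) can be sketched as follows. The input format, a list of `(hour_of_day, value)` pairs for a single day and a single scalar measurement, and the use of `None` for windows without readings are assumptions made for this example.

```python
def two_hour_means(readings, bin_hours=2, day_hours=24):
    """Average one day's readings into day_hours / bin_hours windows.

    readings -- iterable of (hour_of_day, value), with 0 <= hour_of_day < day_hours
    Returns a list of window means; a window with no readings yields None.
    """
    n_bins = day_hours // bin_hours
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for hour, value in readings:
        if not 0 <= hour < day_hours:
            raise ValueError("hour_of_day out of range")
        idx = int(hour // bin_hours)
        sums[idx] += value
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

For a two-dimensional sample (for example temperature and humidity), the same function could be applied to each measurement and the two 12-value lists paired up by window.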
Step 4: Given all transition sequences of the data sets of all nodes in any one cluster for certain days, the forward algorithm for hidden Markov models is used to compute the probability of occurrence of each sequence under each HMM trained in step 3, and the HMM that maximizes the sum of these probabilities is selected. If the probability of occurrence of a node's transition sequence under this model is less than the preset anomaly threshold, the data of that node are judged anomalous. The smaller the probability, the poorer the fit to the model and the more likely the data are anomalous. The anomaly threshold in this experiment is 0.1. Let $m_j$ denote the j-th HMM, let $P(i \mid m_j)$ be the probability that the sequence labeled i occurs under $m_j$, and let $T_i$ be the length of that sequence; $L(i \mid m_j) = \frac{1}{T_i} \log P(i \mid m_j)$ then denotes the length-normalized log-probability.
The data of March 17 and March 18 were used to check the accuracy of the four models trained in step 3, taking the cluster formed by nodes 4, 6, 7, 8, 9, 10, 11, 12 and 13 as an example. The fitting results are given in Table 1, whose entries are the values of $L(i \mid m_j)$ for the two days of data of the 9 nodes in this cluster under models 1, 2, 3 and 4. The table shows that the data of March 17 fit model 3 best and the data of March 18 fit model 4 best. The temperature readings of node 3 were abnormal on both days, above 100 degrees; the curves are shown in Figs. 5(a) and 5(b), where solid lines are temperature curves, dashed lines are voltage curves, and the bold lines are the temperature and voltage of node 3. For ease of observation, node 3 is then removed. On March 17, among the remaining nodes, node 4 has the smallest $L(i \mid m_j)$; its transition curve shows that the voltage of node 4 is higher than that of the other nodes, as shown in Fig. 5(c). On March 18, among the remaining nodes, nodes 2, 7 and 9 have values of $L(i \mid m_j)$ much smaller than those of the other nodes; their transition curves are shown in Figs. 5(d), 5(e) and 5(f), and they can be identified as anomalous nodes.
Table 1: Model fitting results for the data of 0317 and 0318 (March 17 and March 18)
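The scoring in step 4 can be illustrated with a small forward-algorithm sketch. It deliberately swaps in a discrete-observation HMM for brevity, whereas the description trains Gaussian-mixture HMMs on continuous readings; the model layout `(start, trans, emit)` and all function names are assumptions for this example, not the patent's implementation.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    start[s]    -- initial probability of state s
    trans[r][s] -- probability of moving from state r to state s
    emit[s][o]  -- probability of emitting symbol o in state s
    """
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return -math.inf  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def normalized_score(obs, model):
    """L(i|mj) = (1/Ti) * log P(i|mj)."""
    return log_likelihood(obs, *model) / len(obs)


def best_model(sequences, models):
    """Index of the model with the largest sum of sequence probabilities."""
    totals = [sum(math.exp(log_likelihood(seq, *m)) for seq in sequences)
              for m in models]
    return max(range(len(models)), key=totals.__getitem__)


def flag_anomalies(sequences, model, threshold=0.1):
    """Indices of sequences whose probability under `model` is below threshold."""
    return [i for i, seq in enumerate(sequences)
            if math.exp(log_likelihood(seq, *model)) < threshold]
```

In use, `best_model` would pick one of the class models for the day, and the per-node values of `normalized_score` under that model would correspond to the kind of entries Table 1 reports.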
In summary, the sensor network anomaly detection method based on hierarchical decision-making and hidden Markov models is effective.
Claims (3)
1. A method for detecting anomalies in a sensor network, in which the network is divided into clusters and each cluster performs anomaly detection as follows:
Step 1) The cluster head collects the high-dimensional data sequences of all nodes in the cluster in the i-th unit time and, using these sequences as training samples, builds the node high-dimensional data transition model of the i-th unit time with the HMM construction method, i = 1, 2, ..., N, where N is the number of unit times extracted;
Step 2) Using transition model similarity as the classification criterion, the node high-dimensional data transition models of the 1st, 2nd, ..., N-th unit times are classified;
Step 3) For the node high-dimensional data transition models belonging to the j-th class, all high-dimensional data sequences in their corresponding unit times are pooled to form a new training sample, and the HMM construction method is used to build the j-th class node high-dimensional data transition model, j = 1, 2, ..., N1, where N1 is the number of classes obtained in step 2);
Step 4) Given the transition sequences of the data sets of all nodes in the cluster for certain days, the hidden Markov forward algorithm is used to compute the probability of occurrence of each sequence under the node high-dimensional data transition model of each class j, and the node high-dimensional data transition model that maximizes the sum of these probabilities is selected; if the probability of occurrence of a node's transition sequence under this model is less than the preset anomaly threshold, the data of that node are judged anomalous;
The clustering of the network is carried out as follows:
Step 01: Each node whose identity has not been determined compares its own ID with the IDs of its neighbor nodes within h hops; if its own ID is the largest, it identifies itself as a cluster head and sets its level flag to 0;
Step 02: Each node whose identity has not been determined computes its similarity to the cluster heads within h hops; if a similarity greater than the correlation threshold exists, the node identifies itself as a member node;
Step 03: Steps 01 to 02 are repeated until the identity of every node has been determined;
Step 04: Each member node computes its similarity to all cluster heads within one hop and finds the cluster head T with the greatest similarity; if this maximum similarity is greater than the correlation threshold, the member node is confirmed as a child node of cluster head T and its level flag is set to 1;
Step 05: Each member node without a level flag computes its similarity to all member nodes within h hops that already have a level flag, and selects the member node S with the greatest similarity and a level flag ≤ h-1; the level flag of this member node is set to the level flag of member node S plus 1, and this member node is confirmed as a child node of member node S;
Step 06: Step 05 is repeated until the level flags of all member nodes have been assigned;
The similarity is calculated as follows: let $x_k$ and $x_l$ be the high-dimensional data sequences collected by nodes $S_k$ and $S_l$ in one unit time; the similarity $r_{k,l}$ is defined as:

$$r_{k,l} = \frac{E\big[(x_k - E(x_k))(x_l - E(x_l))\big]}{\sqrt{E\big[(x_k - E(x_k))^2\big]\,E\big[(x_l - E(x_l))^2\big]}}$$

where $E(\cdot)$ denotes the mean.
2. The method for detecting anomalies in a sensor network according to claim 1, wherein step 2) uses the K-means algorithm for classification.
3. The method for detecting anomalies in a sensor network according to claim 1, characterized in that the unit time is an integer multiple of 12 hours.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2009100615378A CN101516099B (en) | 2009-04-07 | 2009-04-07 | Test method for sensor network anomaly |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2009100615378A CN101516099B (en) | 2009-04-07 | 2009-04-07 | Test method for sensor network anomaly |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101516099A (en) | 2009-08-26 |
CN101516099B (en) | 2010-12-01 |
Family
ID=41040335
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2009100615378A Expired - Fee Related CN101516099B (en) | 2009-04-07 | 2009-04-07 | Test method for sensor network anomaly |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101516099B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101969655B (en) * | 2010-09-21 | 2013-01-23 | 浙江农林大学 | Single-layer linear network prediction based authentication method of network data of wireless sensor |
CN102340811B (en) * | 2011-11-02 | 2014-11-26 | 中国农业大学 | Method for carrying out fault diagnosis on wireless sensor networks |
CN105873129B (en) * | 2016-03-24 | 2018-12-18 | 中国人民解放军信息工程大学 | The sensor network missing values reconstructing method of multi-node collaboration |
CN105939524B (en) * | 2016-06-21 | 2019-08-16 | 南京大学 | A kind of wireless sensor network node event real-time predicting method |
CN107295536B (en) * | 2017-07-05 | 2020-07-07 | 苏州大学 | Parallel diagnosis test method |
CN107528722B (en) * | 2017-07-06 | 2020-10-23 | 创新先进技术有限公司 | Method and device for detecting abnormal point in time sequence |
US10841322B2 (en) | 2018-01-18 | 2020-11-17 | General Electric Company | Decision system and method for separating faults from attacks |
CN109374044B (en) * | 2018-09-30 | 2023-11-10 | 国际商业机器(中国)投资有限公司 | Intelligent automatic restoration method and device for multi-parameter environment monitoring equipment |
CN109406940B (en) * | 2018-10-16 | 2019-06-28 | 深圳供电局有限公司 | A kind of distributed feed line automatization system for power distribution network monitoring |
CN109768968B (en) * | 2018-12-19 | 2020-07-31 | 紫光云引擎科技(苏州)有限公司 | Data informatization acquisition and analysis system and method based on cloud computing |
CN112468487B (en) * | 2020-11-25 | 2022-03-18 | 清华大学 | Method and device for realizing model training and method and device for realizing node detection |
US20230092627A1 (en) * | 2021-09-21 | 2023-03-23 | International Business Machines Corporation | Distributed sensing and classification |
CN114244751B (en) * | 2021-11-22 | 2023-09-15 | 慧之安信息技术股份有限公司 | Wireless sensor network anomaly detection method and system |
CN116643951B (en) * | 2023-07-24 | 2023-10-10 | 青岛冠成软件有限公司 | Cold chain logistics transportation big data monitoring and collecting method |
Also Published As
Publication number | Publication date |
---|---|
CN101516099A (en) | 2009-08-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101516099B (en) | Test method for sensor network anomaly | |
Dao et al. | Identification failure data for cluster heads aggregation in WSN based on improving classification of SVM | |
Capozzoli et al. | Fault detection analysis using data mining techniques for a cluster of smart office buildings | |
CN111639237B (en) | Electric power communication network risk assessment system based on clustering and association rule mining | |
WO2018126984A2 (en) | Mea-bp neural network-based wsn abnormality detection method | |
CN104750861B (en) | A kind of energy-accumulating power station mass data cleaning method and system | |
CN110990461A (en) | Big data analysis model algorithm model selection method and device, electronic equipment and medium | |
CN108985380B (en) | Point switch fault identification method based on cluster integration | |
CN112381181B (en) | Dynamic detection method for building energy consumption abnormity | |
CN110335168B (en) | Method and system for optimizing power utilization information acquisition terminal fault prediction model based on GRU | |
Guo et al. | Feature selection based on Rough set and modified genetic algorithm for intrusion detection | |
CN104881735A (en) | System and method of smart power grid big data mining for supporting smart city operation management | |
CN108289285B (en) | Method for recovering and reconstructing lost data of ocean wireless sensor network | |
CN114861788A (en) | Load abnormity detection method and system based on DBSCAN clustering | |
CN115278741A (en) | Fault diagnosis method and device based on multi-mode data dependency relationship | |
CN116780781B (en) | Power management method for smart grid access | |
CN108734359B (en) | Wind power prediction data preprocessing method | |
CN105373620A (en) | Mass battery data exception detection method and system for large-scale battery energy storage power stations | |
CN110533253A (en) | A kind of scientific research cooperative Relationship Prediction method based on Heterogeneous Information network | |
Gu et al. | Application of fuzzy decision tree algorithm based on mobile computing in sports fitness member management | |
Chu et al. | Co-training based on semi-supervised ensemble classification approach for multi-label data stream | |
CN117078048A (en) | Digital twinning-based intelligent city resource management method and system | |
CN109376790A (en) | A kind of binary classification method based on Analysis of The Seepage | |
CN116629428A (en) | Building energy consumption prediction method based on feature selection and SSA-BiLSTM | |
Li et al. | A hybrid coevolutionary algorithm for designing fuzzy classifiers |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20101201 Termination date: 20160407 |