CN103561420A - Anomaly detection method based on data snapshot graphs - Google Patents

Publication number
CN103561420A
Authority
CN
China
Prior art keywords
event
data
node
sequence
diagram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310549381.4A
Other languages
Chinese (zh)
Other versions
CN103561420B (en)
Inventor
吕建华
张柏礼
魏巨巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN201310549381.4A
Publication of CN103561420A
Application granted
Publication of CN103561420B
Legal status: Expired - Fee Related

Abstract

The invention discloses an anomaly detection method based on data snapshot graphs. The method comprises: a first step of collecting and preprocessing detection data in the currently monitored area of a wireless sensor network to determine an event region; a second step of obtaining the data set related to the current event, abstracting the event data with a graph model and converting it into an event data snapshot graph; and a third step of querying an event pattern graph database with a graph similarity algorithm based on structural correlation, finding event pattern graphs similar to the event graph and determining the type of the current event. The event pattern graph database is a collection of event pattern graphs, and an event pattern graph is an event data snapshot graph that provides an abstract description of an event type. Event graphs can be obtained from domain expert knowledge or from data analysis. The method can be used to detect complex events, improves event detection efficiency and reduces the false alarm rate.

Description

Anomaly detection method based on data snapshot graphs
Technical field
The present invention relates to anomaly detection methods for wireless sensor networks, and in particular to an anomaly detection method based on data snapshot graphs.
Background
Current state of anomaly detection in wireless sensor networks
In a wireless sensor network, sensor node data can become abnormal for many reasons: the sensor node itself may have failed, the collected data may contain noise, or an abnormal event may have occurred in the network. Anomaly detection in a wireless sensor network identifies these abnormal data and reports them to the user so that the user can make an appropriate decision. Many users, however, want to know not only which sensor nodes have produced abnormal data but also which specific type of abnormal event caused the anomaly. This kind of anomaly detection, also called abnormal event detection or simply event detection, is of considerable practical importance. In a fire detection application, for example, when the sensor network data become abnormal, the abnormal data must be analysed to determine which abnormal event caused them, that is, whether a fire or some other event has occurred in the monitored area.
Wireless sensor networks are data-centric, and their data exhibit strong temporal correlation. If the data of a node at a given moment are regarded as a vertex of a data graph and the temporal correlations between data as its edges, a graph model describes event features quite naturally. Many studies and application examples have shown that graph models are well suited to describing complex events and can be applied to complex event detection in wireless sensor networks. If a database is built from a set of event graphs together with event-related information, complex event detection in a sensor network can be treated as a query-processing problem over a collection of graph data. When an event occurs, the related data are collected and an event query graph is built; querying the database for matching graph data then yields information about the current event, such as the event type, likely causes, possible future development and effective countermeasures. This information is an important basis for decision making.
The first task of event detection is to build a model of the event; a suitable event model is the basis for accurate event detection. Event detection in wireless sensor networks has been studied widely and in depth. Most techniques are threshold-based: whether an event has occurred depends on whether the measured value of a monitored attribute exceeds a predefined threshold. This approach offers limited decision support and can produce false alarms. For example, a node may exceed the threshold because an event has occurred, but also because of an equipment or transmission fault.
To address the shortcomings of threshold-based detection, event detection techniques based on contour maps have been proposed (W. Xue, Q. Luo, L. Chen and Y. Liu. Contour Map Matching for Event Detection in Sensor Networks [C]. In Proceedings of ACM SIGMOD, 2006; and Y. Liu, M. Li. Iso-Map: Energy-Efficient Contour Mapping in Wireless Sensor Networks [C]. In Proceedings of IEEE ICDCS, Toronto, Canada, June 2007). Contour-map techniques abstract an event in a sensor network region into a spatio-temporal model of the data sensed by the nodes and query for event occurrence by model matching, which can significantly improve detection efficiency. A contour map is also a graph model and can describe the spatio-temporal data features of an event effectively, but contour pattern maps are all derived from expert knowledge and therefore lack generality.
State of the art in similarity queries over graph data
A graph similarity query can be defined formally as follows: given a graph database $D = \{g_1, g_2, \ldots, g_n\}$ and a query graph $q$, the similarity query returns the set of graphs $\{g_i \mid g_i \in D$ and the similarity between $g_i$ and $q$ satisfies a given threshold$\}$.
For graph similarity queries, the key problem is finding a measure that quantifies the similarity of two graphs. Some researchers measure similarity with the graph edit distance (Graph Edit Distance), which adapts ideas from string matching: the alignment distance and edit distance of strings are used to construct an edit and alignment distance for graphs. Comparing two graphs requires three kinds of edit operation: insertion, deletion and modification. Methods based on graph edit distance compute similarity indirectly and have high computational complexity; the problem is NP-complete. Besides graph edit distance, the maximum common subgraph, that is, the largest part shared by two graphs, is also used to measure the similarity of two graph structures. Methods based on the maximum common subgraph compute similarity directly, but they rely on subgraph isomorphism and are therefore also computationally expensive. H. Bunke and K. Shearer (A graph distance metric based on maximal common subgraph. Pattern Recognition, 19:25-259, 1998) use the maximal common subgraph (Maximal Common Subgraph) to measure graph structure similarity.
Because computing the graph edit distance and finding the maximum common subgraph are both NP-complete problems, similarity queries based on these two methods generally first compute an upper or lower bound on the similarity of two graphs. Computing these bounds costs less time than computing the structural similarity directly, and the bounds can be used to filter out part of the non-result set. Grafil (X. Yan, F. Zhu, P. S. Yu, et al. Feature-based Similarity Search in Graph Structures [J]. ACM Transactions on Database Systems (TODS), 2006, 31(4): 1418-1453) is an algorithm for the subgraph similarity problem; a subgraph similarity query retrieves the set of graphs that share with a given query graph a common subgraph satisfying certain conditions. Grafil measures the similarity of two graphs by their maximum common subgraph and introduces the concept of the edge relaxation ratio. It extracts features from the graph database and builds a feature-graph matrix index. At query time it determines which substructure features are contained in the query graph and translates edge relaxation of the query graph into a reduction in the number of features the query graph contains. By computing the maximum number of features that can be lost after relaxing edges, it can filter out part of the non-result set in advance and so reduce the complexity of the problem.
Algorithms for computing the graph edit distance fall into two main classes: exact algorithms and approximate algorithms. Most exact algorithms (K. Riesen, S. Fankhauser, H. Bunke. Speeding up graph edit distance computation with a bipartite heuristic. In MLG '07; and M. Neuhaus, K. Riesen, and H. Bunke. Fast suboptimal algorithms for the computation of graph edit distance. In SSSSpR '06) are based on the well-known A* algorithm (P. Hart, N. Nilsson, B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. SSC, 4(2): 100-107, 1968). Exact algorithms, however, can usually handle only graphs with no more than 12 vertices, so many algorithms for computing upper and lower bounds on the edit distance have been proposed.
BLP (D. Justice, A. Hero. A Binary Linear Programming Formulation of the Graph Edit Distance [J]. IEEE Trans. Pattern Anal. Mach. Intell., 2006, 28(8): 1200-1214) gives a method for computing the edit distance between two unweighted labeled graphs, together with its upper and lower bounds, by formulating the minimization problem as a 0-1 integer linear program. An unweighted labeled graph is a graph with labels on its vertices and no weights on its edges. BLP treats the two graphs being compared as subgraphs of a larger graph represented by an edit grid; the edit operations between the two graphs never exceed this grid, because its size (length and width) is exactly the sum of the vertex counts of the two graphs. The paper proves that edit operations on a graph are equivalent to changes in the state of the edit grid, and that if the cost of an edit operation is a metric, the resulting edit distance is also a metric. Because the model is a 0-1 integer linear program, for which no polynomial-time algorithm is known, the variable domain is relaxed to the interval [0, 1]. The problem then becomes an ordinary linear program, which can be solved in polynomial time, for example by the interior-point method. Since the variable domain after relaxation is a superset of the domain before relaxation and the program is a minimization problem, the relaxed model yields a lower bound on the edit distance between the two graphs, and this lower bound can be used to filter out some database graphs.
Comparing Stars (Z. Zeng, A. K. T. Tung, J. Wang et al. Comparing Stars: On Approximating Graph Edit Distance [C]. In VLDB, 2009) also uses edit distance to measure the similarity between two graphs. It represents a graph as a set of star structures and computes upper and lower bounds on the edit distance of two graphs by comparing their corresponding sets of star structures; this computation can be completed in polynomial time.
Summary of the invention
Object of the invention: to overcome the deficiencies of the prior art, the present invention abstracts a wireless sensor network event into a data snapshot graph based on the event's snapshot data, and provides an anomaly detection method based on data snapshot graphs (Data Snapshot Graph Based Anomaly Detection Algorithm, DSG for short). Event graphs can be obtained from domain expert knowledge or from data analysis.
Technical solution: to achieve the above object, the present invention adopts the following technical solution:
An anomaly detection method based on data snapshot graphs, comprising the following steps:
(1) collect and preprocess the detection data in the currently monitored area of the wireless sensor network, and determine the event-related region;
(2) obtain the data set related to the current event, abstract the event data set with a graph model, and convert it into an event data snapshot graph;
(3) using a graph similarity search algorithm based on structural association degree, query the event pattern graph database for event pattern graphs similar to the event data snapshot graph of the current event, and determine the type of the current event.
The event pattern graph database is a set of event pattern graphs; an event pattern graph is an event data snapshot graph that serves as an abstract description of an event type.
An event pattern graph is obtained from domain expert knowledge or from data analysis and is an event graph based on a data snapshot. A data snapshot is the data set of all nodes in the sensor network at the moment an event occurs; the event graph built from this data set is the event snapshot graph at that moment and is also the event pattern graph of the event.
The graph similarity search algorithm based on structural association degree extracts basic structures from the graph data and, using the association degree between basic structures, converts the graph data into a sequence of basic structures. The graph similarity query is thereby converted into a sequence similarity query, which effectively reduces query complexity and makes the approach suitable for event detection applications.
In step (1), a node association graph is built from the physical correlation and data correlation of the sensor nodes, and the event region is determined from the node association graph. Node association graphs comprise the global node association graph and its subgraphs, and are built as follows:
The node association graph at time t is expressed formally as:
$G_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; and $f_v: V \to ID$ is the vertex labeling function. Graph vertices correspond one-to-one to sensor nodes: each node of the wireless sensor network forms one vertex of the node association graph.
Let $d(v_i)_t$ be the monitoring data of vertex $v_i$ at time t. The edge set E is constructed as follows: for any two vertices $v_1, v_2 \in V$, if the sensor nodes corresponding to $v_1$ and $v_2$ are single-hop communication neighbors, or if they are communication neighbors within k hops and there exist functions $f_1$ and $f_2$ such that $f_1(d(v_1)_t) = f_2(d(v_2)_t)$, then there is an edge $(v_1, v_2) \in E$.
The event-related region is determined as follows: at event detection time t, for any vertex $v_i \in V$, if $|d(v_i)_{t-1} - d(v_i)_t| / |d(v_i)_{t-1} + d(v_i)_t| \le e$, then $v_i$ is an event-related vertex, and the region containing all event-related vertices at time t is the event-related region. The constant e is a preset value, typically chosen between 2.5% and 5%.
The node association graph obtained once the event boundary has been determined is a subgraph of the global node association graph, defined as follows:
$Ge_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices, [formula image BDA0000410054960000051]; E is the edge set of the graph, [formula image BDA0000410054960000052]; ID is the set of vertex identifiers, [formula image BDA0000410054960000053]; and $f_v: V \to ID$ is the vertex labeling function. Graph vertices correspond one-to-one to sensor nodes.
In step (2), the event data set is abstracted with a graph model and converted into an event data snapshot graph. The data snapshot and the event data snapshot graph are defined as follows:
1) A wireless sensor network data snapshot S is defined as follows:
For a wireless sensor network N with k nodes $\{n_1, n_2, \ldots, n_k\}$, the data snapshot of N at time t is the set $\{d(n_1)_t, d(n_2)_t, \ldots, d(n_k)_t\}$.
2) The event data snapshot graph $Gs_t$ at time t is computed from the node association graph $Ge_t$ at time t according to the correlation of the node data. Formally:
$Gs_t = \langle V, E, ID, DV, f_v, g_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $DV = \{d(v_i)_t\}$ is the set of values monitored at time t by all sensor nodes in the event region; $f_v: V \to ID$ is the vertex labeling function, with graph vertices corresponding one-to-one to sensor nodes; and $g_v: V \to DV$ is the data mapping function of the vertices.
The vertex set of $Gs_t$ is identical to the vertex set of $Ge_t$ and comprises all sensor nodes in the event region.
The edge set $E(Gs_t)$ of the event data snapshot graph at time t is constructed as follows:
a) for any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_1)_t - d(v_2)_t) / (d(v_1)_t + d(v_2)_t) > e$, there is a directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$;
b) for any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_2)_t - d(v_1)_t) / (d(v_1)_t + d(v_2)_t) > e$, there is a directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$;
c) for any edge $(v_1, v_2) \in E(Ge_t)$, if $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, there are both a directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$ and a directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$.
The constant e is a preset value, typically chosen between 2.5% and 5%. The event data snapshot graph is a directed graph that describes the data state of each node in the event region of the wireless sensor network and the relationships between those data states.
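The three edge rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `snapshot_edges`, the dictionary-based inputs and the default e = 0.05 are hypothetical, readings are assumed positive so that the ratio is defined, and the boundary case where the ratio equals e exactly (which the rules leave open) is treated here as approximately equal.

```python
def snapshot_edges(undirected_edges, data, e=0.05):
    """Build the directed edge set E(Gs_t) from the edges of Ge_t.

    undirected_edges: iterable of (v1, v2) pairs taken from Ge_t
    data: dict mapping each vertex to its reading d(v)_t (assumed positive)
    e: relative-error threshold (hypothetical default of 5%)
    """
    directed = set()
    for v1, v2 in undirected_edges:
        d1, d2 = data[v1], data[v2]
        ratio = (d1 - d2) / (d1 + d2)
        if ratio > e:
            # rule a): v1 reads clearly higher, so the edge is <v2, v1>
            directed.add((v2, v1))
        elif -ratio > e:
            # rule b): v2 reads clearly higher, so the edge is <v1, v2>
            directed.add((v1, v2))
        else:
            # rule c): approximately equal readings give edges in both directions
            # (assumption: the boundary case |ratio| == e also lands here)
            directed.add((v1, v2))
            directed.add((v2, v1))
    return directed


readings = {"a": 30.0, "b": 20.0, "c": 20.5}
edges = snapshot_edges([("a", "b"), ("b", "c")], readings)
```

In this small example the pair (a, b) differs by a relative 20% and yields a single directed edge, while (b, c) differs by about 1.2% and yields edges in both directions.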
3) The event data snapshot graph may still be very large, which obscures the pattern features of the event and degrades event detection. The graph therefore needs to be simplified so that it describes the pattern features of the event more abstractly; simplification also reduces the size of the graph data and improves storage efficiency and processing performance. The invention simplifies the event data snapshot graph by merging sensor nodes according to the following rules:
a) the data of merged nodes must be approximately equal: for $v_1, v_2 \in V(Gs_t)$, if $\langle v_1, v_2 \rangle \in E(Gs_t)$ and $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, merge $v_1$ and $v_2$ into a new node;
b) when two or more approximately equal nodes are merged into a new node, all edges incident to those nodes are attached to the new node.
Simplifying the event data snapshot graph according to these merging rules removes redundant information and describes the data features more abstractly. The level of abstraction is determined by the merge error range e: the larger e is, the more nodes are merged and the simpler the event pattern; conversely, the smaller e is, the fewer nodes are merged and the more complex the event pattern. The constant e is a preset value, typically chosen between 2.5% and 5%.
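One possible way to apply the merging rules is a union-find pass over the directed edges, sketched below. The function name `merge_nodes`, the choice of the smallest label as the representative of a merged group, and the dropping of self-loops after merging are assumptions made for illustration; the text does not prescribe them.

```python
def merge_nodes(directed_edges, data, e=0.05):
    """Merge adjacent vertices with approximately equal readings (rules a and b).

    Returns (groups, merged_edges): groups maps a representative vertex to the
    list of original vertices it stands for; merged_edges is the simplified
    directed edge set with self-loops removed (an assumption of this sketch).
    """
    parent = {v: v for v in data}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v1, v2 in directed_edges:
        d1, d2 = data[v1], data[v2]
        if abs(d1 - d2) / (d1 + d2) < e:      # rule a): approximately equal
            r1, r2 = find(v1), find(v2)
            if r1 != r2:
                parent[max(r1, r2)] = min(r1, r2)

    groups = {}
    for v in data:
        groups.setdefault(find(v), []).append(v)

    # rule b): every edge of a merged vertex is re-attached to its representative
    merged_edges = {(find(a), find(b)) for a, b in directed_edges
                    if find(a) != find(b)}
    return groups, merged_edges


readings = {"a": 30.0, "b": 20.0, "c": 20.5}
groups, merged = merge_nodes({("b", "a"), ("b", "c"), ("c", "b")}, readings)
```

Raising `e` in this sketch makes more pairs qualify for merging, which matches the remark that a larger e gives a simpler event pattern.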
In step (3), the graph similarity algorithm based on structural association degree first extracts the structural feature sequence of the graph data according to structural association degree, converting the graph similarity query into a similarity query over structural feature sequences. It then searches the event pattern graph database for event pattern graphs similar to the event data snapshot graph and determines the type of the current event. The detailed procedure comprises the following steps:
1) The basic structures of graph data are defined as the cycle structure, the star structure and the line structure. Compared with other structure types such as frequent subgraphs and frequent subtrees, these three basic structures are easier to obtain and still capture the basic structural information of the graph. They are defined as follows:
Cycle structure: a set of vertices in the graph that forms a closed ring containing at least 3 edges. A cycle structure is denoted cycle(s), where $s = \{v \mid v \in V \wedge v$ lies on the ring$\}$. The ring must not contain other nested rings; that is, it is a simple cycle.
Star structure: a core vertex $v_0$ connected to several other vertices, no two of which are connected to each other, with $degress(v_0) \ge 3$. A star structure is denoted $star(v_0, s)$, where $s = \{v \mid v_0, v \in V \wedge v$ is a neighbor of $v_0\}$ and $degress(v_0)$ denotes the degree of vertex $v_0$.
Line structure: a structure formed by a chain of vertices connected end to end. A line structure is denoted line(s), where $s = \{v \mid v \in V \wedge degress(v) \le 2\}$ and $degress(v)$ denotes the degree of vertex v.
2) The basic structures are extracted as follows:
1. use depth-first traversal with backtracking to find all cycle structures in the graph;
2. compare any two cycle structures A and B; if A is a subset of B, that is, cycle B contains cycle A, delete B;
3. repeat step 2 until no cycle structure contains another, which yields all simple cycles;
4. compute the degree of each vertex in the graph; each vertex whose degree is at least 3 is taken as the core of a star structure;
5. compute the degree of each vertex in the graph; if a vertex has degree 1 and its adjacent vertex has degree at most 2, continue traversing adjacent vertices until a vertex with degree greater than 2 is reached; the vertices traversed form a line structure.
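The five extraction steps can be sketched as follows for a small undirected graph given as an adjacency dictionary. The function names and the representation of structures are hypothetical. Two choices are assumptions of this sketch rather than statements of the method: a line stops before the first vertex of degree greater than 2 and must contain at least two vertices, and cycles are compared by their vertex sets. The cycle enumeration is exhaustive and only intended for small graphs.

```python
def simple_cycles(adj):
    """Steps 1-3: find cycles by DFS with backtracking, then drop any cycle
    whose vertex set strictly contains the vertex set of another cycle."""
    cycles = set()

    def dfs(start, current, path, visited):
        for nxt in adj[current]:
            if nxt == start and len(path) >= 3:
                cycles.add(frozenset(path))
            elif nxt not in visited and nxt > start:
                # only walk through vertices larger than the start vertex,
                # so each cycle is discovered from its smallest vertex
                visited.add(nxt)
                dfs(start, nxt, path + [nxt], visited)
                visited.remove(nxt)

    for v in sorted(adj):
        dfs(v, v, [v], {v})
    return [c for c in cycles if not any(other < c for other in cycles)]


def stars(adj):
    """Step 4: every vertex of degree >= 3 is the core of a star."""
    return [(v, frozenset(adj[v])) for v in sorted(adj) if len(adj[v]) >= 3]


def lines(adj):
    """Step 5: walk from each degree-1 vertex through vertices of degree <= 2."""
    found, used_ends = [], set()
    for v in sorted(adj):
        if len(adj[v]) != 1 or v in used_ends:
            continue
        path, prev, cur = [v], None, v
        while True:
            nxt = [n for n in adj[cur] if n != prev]
            if not nxt or len(adj[nxt[0]]) > 2:
                break
            prev, cur = cur, nxt[0]
            path.append(cur)
        used_ends.add(path[-1])
        if len(path) >= 2:
            found.append(tuple(path))
    return found


# A triangle 1-2-3 with a tail 3-4-5-6 attached at vertex 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
```

For this example graph the sketch yields one cycle on {1, 2, 3}, one star centered on vertex 3, and one line 6-5-4.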
3) The structural feature sequence of graph data is extracted on the basis of structural association degree as follows:
The basic structures are ranked by importance, converting the graph structure data into a sequence of basic structures. The importance of each structure is measured by the association degree between structures:
Association: for any two basic structures $s_i$ and $s_j$ in a graph, if $cvNum(s_i, s_j) \ge 1$, then $s_i$ and $s_j$ are associated, written $incident(s_i, s_j) = 1$; if $cvNum(s_i, s_j) = 0$, then $incident(s_i, s_j) = 0$ and $s_i$ and $s_j$ are not associated. Formally:
$$incident(s_i, s_j) = \begin{cases} 1 & \text{if } cvNum(s_i, s_j) \ge 1 \\ 0 & \text{if } cvNum(s_i, s_j) = 0 \end{cases}$$
where $cvNum(s_i, s_j)$ denotes the number of vertices shared by structures $s_i$ and $s_j$, and $i \ne j$.
Association degree based on the number of associated structures: given a graph g containing N basic structures, the association degree of the i-th basic structure $s_i$ is:
$$sNum\_CD(s_i) = \sum_{j=1,\, j \ne i}^{N} incident(s_i, s_j)$$
where $1 \le i \le N$; it follows that $sNum\_CD(s_i) \le N - 1$. If a basic structure s is associated with k basic structures, its association degree is $sNum\_CD(s) = k$.
According to these definitions, the event data snapshot graph is converted into a basic structure sequence ordered by association degree.
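As a rough illustration of these definitions, the sketch below computes the association degree of each basic structure and orders the structures into a sequence. The encoding is hypothetical: each structure is a pair of a one-letter kind ("c" for cycle, "s" for star, "l" for line) and its vertex set, the sequence is the string of kinds sorted by decreasing association degree, and ties keep their input order.

```python
def association_degrees(structures):
    """sNum_CD(s_i): number of other structures sharing at least one vertex."""
    degrees = []
    for i, (_, verts_i) in enumerate(structures):
        k = sum(1 for j, (_, verts_j) in enumerate(structures)
                if j != i and verts_i & verts_j)   # cvNum(s_i, s_j) >= 1
        degrees.append(k)
    return degrees


def structure_sequence(structures):
    """Order structures by decreasing association degree (stable for ties)
    and return the kinds as a string (hypothetical one-letter encoding)."""
    degrees = association_degrees(structures)
    order = sorted(range(len(structures)), key=lambda i: -degrees[i])
    return "".join(structures[i][0] for i in order)


# Structures of a triangle 1-2-3 with a tail 3-4-5-6: the star's vertex set
# includes its core vertex 3 together with its neighbors.
structs = [("c", {1, 2, 3}), ("s", {1, 2, 3, 4}), ("l", {4, 5, 6})]
```

Here the star shares vertices with both the cycle and the line, so it has association degree 2 and comes first in the sequence.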
4) The similarity query algorithm over the structural feature sequences of graph data proceeds as follows:
The similarity of a source string S and a target string T is computed by edit distance. The edit distance is the minimum number, or minimum cost, of edit operations needed to transform S into T, where an edit operation deletes, inserts or replaces the character at some position of the string. Each operation has an associated cost, and the cost of a given sequence of operations equals the sum of the costs of its individual operations.
In the basic structure sequence of an event data snapshot graph, the earlier a structure appears, the larger its association degree, the greater its importance, and the more likely it is to represent the main features of the graph. The first structure in the sequence is the most important in the graph, so editing it should also cost the most. A monotonically decreasing exponential function $f(x) = a^{-x}$ is therefore defined as the cost function of each character edit operation.
Sequence edit distance similarity: given a sequence database $Set = \{s_1, s_2, \ldots, s_n\}$, a query sequence qStr and an edit distance threshold $\tau$, the sequence similarity query returns all sequences $s_i$ in Set that satisfy $SED(qStr, s_i) < \tau$.
Given a query sequence, the string edit distance between each sequence in the sequence database and the query sequence is computed, and the result is all sequences in the database whose edit distance to the query sequence is less than the given cost threshold $\tau$.
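A position-weighted edit distance of this kind can be sketched with the usual dynamic-programming table. Several details below are assumptions, not taken from the text: positions are counted from 0, deletions and substitutions are charged at the position in the source sequence, insertions at the position in the target sequence, and the base defaults to a = 2. The names `sed` and `similar_sequences` are hypothetical.

```python
def sed(source, target, a=2.0):
    """Weighted edit distance where an edit at position x costs a ** (-x)."""
    def cost(pos):
        return a ** (-pos)

    m, n = len(source), len(target)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + cost(i - 1)        # delete source[i-1]
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + cost(j - 1)        # insert target[j-1]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + cost(i - 1),                          # deletion
                dist[i][j - 1] + cost(j - 1),                          # insertion
                dist[i - 1][j - 1] + (0.0 if same else cost(i - 1)),   # substitution
            )
    return dist[m][n]


def similar_sequences(database, query, tau, a=2.0):
    """Return every sequence whose weighted edit distance to query is below tau."""
    return [s for s in database if sed(query, s, a) < tau]


matches = similar_sequences(["scl", "ccl", "scc"], "scl", tau=0.5)
```

With these assumptions, changing the first structure of "scl" costs 1.0 while changing the last costs 0.25, so a mismatch in the most important structure weighs more than one further down the sequence.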
Beneficial effects: in the anomaly detection method based on data snapshot graphs provided by the invention, event graphs can be obtained from domain expert knowledge or from data analysis, and the false alarm rate can be reduced.
Brief description of the drawings
Fig. 1 is a flow chart of the invention;
Fig. 2 is event pattern graph 1;
Fig. 3 is event pattern graph 2;
Fig. 4 is event pattern graph 3;
Fig. 5 shows the event detection performance when nodes are merged under different relative errors in temperature;
Fig. 6 shows the event detection performance when nodes are merged under different relative errors in humidity;
Fig. 7 shows the event detection performance when nodes are merged under different relative errors in oxygen content.
Detailed description of embodiments
The invention is further described below with reference to the accompanying drawings.
Principle of the scheme
In a wireless sensor network, the occurrence of an event is necessarily reflected in changes in the state of the data monitored by the sensor nodes, and the inherent characteristics of an event give rise to a data pattern specific to that event. If abstract features are extracted from the data to identify this pattern, then when the sensor network exhibits the pattern again, the occurrence of the corresponding event can be inferred from the similarity of the data patterns. Wireless sensor networks are data-centric, and their data exhibit strong temporal correlation. If the data of a node at a given moment are regarded as a vertex of a data graph and the temporal correlations between data as its edges, a graph model describes event features quite naturally. Event detection in wireless sensor networks can thus be converted into a similarity query problem over graph model data.
Fig. 1 shows an anomaly detection method based on data snapshot graphs (DSG), which comprises the following steps:
(1) collect and preprocess the detection data in the currently monitored area of the wireless sensor network, and determine the event-related region;
(2) obtain the data set related to the current event, abstract the event data set with a graph model, and convert it into an event data snapshot graph;
(3) using a graph similarity search algorithm based on structural association degree, query the event pattern graph database for event pattern graphs similar to the event data snapshot graph of the current event, and determine the type of the current event.
The event pattern graph database is a set of event pattern graphs; an event pattern graph is an event data snapshot graph that serves as an abstract description of an event type.
An event pattern graph is obtained from domain expert knowledge or from data analysis and is an event graph based on a data snapshot. A data snapshot is the data set of all nodes in the sensor network at the moment an event occurs; the event graph built from this data set is the event snapshot graph at that moment and is also the event pattern graph of the event.
The graph similarity search algorithm based on structural association degree extracts basic structures from the graph data and, using the association degree between basic structures, converts the graph data into a sequence of basic structures. The graph similarity query is thereby converted into a sequence similarity query, which effectively reduces query complexity and makes the approach suitable for event detection applications.
Based on domain expert knowledge: in some wireless sensor network applications, the data features of particular events are known, and this knowledge can be used to build event pattern graphs.
Based on data analysis: in many wireless sensor network applications, the data features of events show a certain regularity but are often hidden in large volumes of data. The invention uses graph data to describe the data features of events abstractly and builds event pattern graphs from them.
The detection data in the currently monitored area of the wireless sensor network are collected and preprocessed, and the event-related region is determined, as follows:
1) set up node associated diagram
Node associated diagram is used for being described in wireless sensor network, between sensor node at the incidence relation of moment t.The degree of association between node comprises two aspect information, is respectively: 1. physical interconnection degree: whether be the neighbor node of single-hop communication, and 2. data correlation degree: node detects between data whether have correlation.
T node associated diagram form is constantly expressed as:
G t=<V,E,ID,f v>
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $f_v: V \to ID$ is the vertex labeling function. Graph vertices correspond one-to-one with sensor nodes: each node of the wireless sensor network forms one vertex of the node association graph.
Let $d(v_i)_t$ denote the monitoring data of vertex $v_i$ at time t. The edge set E is constructed as follows: for any two vertices $v_1, v_2 \in V$, if the sensor nodes corresponding to $v_1$ and $v_2$ are single-hop communication neighbors, or if they are communication neighbors within k hops and there exist functions $f_1$ and $f_2$ such that $f_1(d(v_1)_t) = f_2(d(v_2)_t)$, then the edge $(v_1, v_2) \in E$ exists.
The constant k is a predefined value. The larger k is, the more complex the node association graph; the smaller k is, the simpler the graph. If k = 1, the node association graph is identical to the topology of the wireless sensor network. In general k = 2 can be chosen, so that the node association graph captures both the physical correlation and the data correlation between nodes without the graph structure becoming too complex. The choice of $f_1$ and $f_2$ depends on the data characteristics; the functions may describe either a quantitative or a qualitative correlation between the data.
By this definition the node association graph is an undirected graph that describes the association relations among the nodes of the wireless sensor network, covering both physical correlation and data correlation.
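The edge construction rule above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the names `within_k_hops`, `build_association_graph`, `links`, `readings` and `correlated` are hypothetical, and the predicate `correlated` stands in for the existence of the functions $f_1$ and $f_2$, which the text leaves application-specific.

```python
from collections import deque

def within_k_hops(links, start, k):
    """Return the nodes reachable from `start` in at most k hops (BFS)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if hops[u] == k:
            continue  # do not expand beyond k hops
        for v in links[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    del hops[start]
    return set(hops)

def build_association_graph(links, readings, k, correlated):
    """Undirected edge set of the node association graph.

    links:      dict node -> set of single-hop neighbors (network topology)
    readings:   dict node -> monitoring value d(v)_t
    correlated: hypothetical predicate standing in for "f1, f2 exist"
    """
    edges = set()
    for u in links:
        for v in within_k_hops(links, u, k):
            # Edge if single-hop neighbors, or k-hop neighbors with correlated data.
            if v in links[u] or correlated(readings[u], readings[v]):
                edges.add((min(u, v), max(u, v)))
    return edges

# Usage: a 4-node chain 0-1-2-3 with k = 2.
links = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
readings = {0: 10.0, 1: 20.0, 2: 10.5, 3: 30.0}
edges = build_association_graph(links, readings, 2, lambda a, b: abs(a - b) < 1.0)
```

With k = 1 the second condition can never add an edge, so the result reduces to the network topology, as the text states.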
2) Determine the event-related region
At event detection time t, for any vertex $v_i \in V$, if $|d(v_i)_{t-1} - d(v_i)_t| / |d(v_i)_{t-1} + d(v_i)_t| \le e$, then $v_i$ is an event-related vertex, and the region containing all event-related vertices at time t is the event-related region. The constant e is a preset value, typically chosen between 2.5% and 5%.
Once the event boundary has been determined, the node association graph is a subgraph of the global node association graph. This subgraph is defined as follows:
$Ge_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph [formula given as image BDA0000410054960000112 in the original publication]; ID is the set of vertex identifiers [formula given as image BDA0000410054960000113 in the original publication]; $f_v: V \to ID$ is the vertex labeling function, and graph vertices correspond one-to-one with sensor nodes.
The data set related to the current event is obtained, the event data set is abstracted with a graph model, and it is converted into an event data snapshot graph. The specific steps are as follows:
1) Obtain the data snapshot at time t
The wireless sensor network data snapshot S is defined as follows:
For a wireless sensor network N with k nodes $\{n_1, n_2, \ldots, n_k\}$, the data snapshot of N at time t is the set $\{d(n_1)_t, d(n_2)_t, \ldots, d(n_k)_t\}$.
2) Compute the event data snapshot graph
The event data snapshot graph $Gs_t$ at time t is computed from the node association graph $Ge_t$ at time t according to the correlation of the node data. It is formally expressed as:
$Gs_t = \langle V, E, ID, DV, f_v, g_v \rangle$
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $DV = \{d(v_i)_t\}$ is the set of monitoring values at time t of all sensor nodes in the event region; $f_v: V \to ID$ is the vertex labeling function, and graph vertices correspond one-to-one with sensor nodes; $g_v: V \to DV$ is the data mapping function of the vertices.
The vertex set of the event data snapshot graph $Gs_t$ at time t is identical to the vertex set of the node association graph $Ge_t$ at time t and contains all sensor nodes in the event region.
The edge set $E(Gs_t)$ of the event data snapshot graph at time t is constructed as follows:
A) For any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_1)_t - d(v_2)_t) / (d(v_1)_t + d(v_2)_t) > e$, then the directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$ exists;
B) For any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_2)_t - d(v_1)_t) / (d(v_1)_t + d(v_2)_t) > e$, then the directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$ exists;
C) For any edge $(v_1, v_2) \in E(Ge_t)$, if $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, then both directed edges $\langle v_2, v_1 \rangle \in E(Gs_t)$ and $\langle v_1, v_2 \rangle \in E(Gs_t)$ exist;
The constant e is a preset value, typically chosen between 2.5% and 5%. The event data snapshot graph is a directed graph that describes the data state of each node in the event region of the wireless sensor network and the relations between those data states.
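Rules A) to C) can be sketched in Python as below. This is a minimal sketch, not the patent's implementation: the function name `build_snapshot_graph` and its parameters are hypothetical, and the readings are assumed to be positive so that the denominator is nonzero.

```python
def build_snapshot_graph(assoc_edges, readings, e=0.05):
    """Directed edge set E(Gs_t) derived from the undirected edges of Ge_t.

    assoc_edges: iterable of undirected edges (v1, v2)
    readings:    dict vertex -> monitoring value d(v)_t (assumed positive)
    e:           preset relative-difference threshold (2.5% to 5% in the text)
    """
    directed = set()
    for v1, v2 in assoc_edges:
        d1, d2 = readings[v1], readings[v2]
        ratio = (d1 - d2) / (d1 + d2)
        if ratio > e:            # rule A: v1 clearly higher, edge points to v1
            directed.add((v2, v1))
        elif -ratio > e:         # rule B: v2 clearly higher, edge points to v2
            directed.add((v1, v2))
        elif abs(ratio) < e:     # rule C: approximately equal, edges both ways
            directed.add((v1, v2))
            directed.add((v2, v1))
        # A relative difference of exactly e matches none of the stated rules.
    return directed

# Usage: vertex 1 is much hotter than vertex 2 and about equal to vertex 3.
readings = {1: 100.0, 2: 50.0, 3: 101.0}
snapshot_edges = build_snapshot_graph([(1, 2), (1, 3)], readings, e=0.05)
```

Unidirectional edges therefore always point from the lower reading to the higher one, while approximately equal readings are linked in both directions.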
3) Simplify the event data snapshot graph
The event data snapshot graph may still be very large, which obscures the characteristics of the event pattern and degrades event detection. The graph therefore needs to be simplified so that it describes the pattern of the event more abstractly; simplification also reduces the size of the graph data and improves storage efficiency and processing performance. The present invention simplifies the event data snapshot graph by merging sensor nodes according to the following rules:
A) The data of merged nodes must be approximately equal: for $v_1, v_2 \in V(Gs_t)$, if $\langle v_1, v_2 \rangle \in E(Gs_t)$ and $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, then $v_1$ and $v_2$ are merged into one new node;
B) When two or more approximately equal nodes are merged into a new node, all edges associated with those nodes are attached to the new node.
Simplifying the event data snapshot graph by these merging rules removes redundant information and describes the data characteristics more abstractly. The level of abstraction is determined by the merge error range e: the larger e is, the more the data are merged and the simpler the event pattern; the smaller e is, the less the data are merged and the more complex the event pattern. The constant e is a preset value, typically chosen between 2.5% and 5%.
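The merging rules can be sketched with a union-find structure, as in the Python example below. The name `merge_nodes` and the union-find approach are assumptions made for illustration; the text does not prescribe a data structure. Note one consequence of this choice: merging is applied transitively, so a chain of pairwise approximately equal nodes ends up in one group even if its two ends differ by more than e.

```python
def merge_nodes(directed_edges, readings, e=0.05):
    """Merge approximately equal adjacent nodes of a snapshot graph.

    Returns (groups, merged_edges): groups maps a representative vertex to
    the set of original vertices merged into it; merged_edges holds the
    directed edges between representatives (rule B re-attaches edges).
    """
    parent = {v: v for v in readings}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v1, v2 in directed_edges:
        d1, d2 = readings[v1], readings[v2]
        if abs(d1 - d2) / (d1 + d2) < e:   # rule A: approximately equal data
            parent[find(v1)] = find(v2)

    groups = {}
    for v in readings:
        groups.setdefault(find(v), set()).add(v)
    # Drop edges that fall inside a merged group; keep the rest.
    merged_edges = {(find(u), find(v)) for u, v in directed_edges
                    if find(u) != find(v)}
    return groups, merged_edges

# Usage: vertices 1 and 3 are approximately equal and are merged.
readings = {1: 100.0, 2: 50.0, 3: 101.0}
groups, merged_edges = merge_nodes({(2, 1), (1, 3), (3, 1)}, readings, e=0.05)
```

In this example the three-vertex graph collapses to two nodes joined by a single edge pointing from the low-value node to the merged high-value node.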
A graph similarity algorithm based on structural association degree is used to query the event pattern graph database, find the event pattern graphs similar to the event graph, and determine the type of the current event. Similarity queries over graph data are expensive and therefore unsuitable for event detection scenarios with strict real-time requirements. The present invention extracts a structural feature sequence from the graph data based on structural association degree and converts the graph similarity query into a similarity query over structural feature sequences. The event pattern database stores the structural feature sequence corresponding to each event pattern graph. The scheme is as follows:
1) Extract the basic structures of the graph data
The basic structures of graph data are defined as the cycle structure, the star structure and the line structure. Compared with other structure types, such as frequent subgraphs and frequent subtrees, these three basic structures are easier to obtain and still capture the basic structural information of the graph. They are defined as follows:
Cycle structure: a set of vertices in the graph forms a closed ring containing at least 3 edges. The cycle structure is denoted cycle(s), s = {v | v ∈ V ∧ the vertices v form a ring}, where the ring must not have other rings nested inside it, that is, it is a simple ring;
Star structure: a core vertex $v_0$ in the graph is connected to several other vertices, no two of which are connected to each other, and degress($v_0$) ≥ 3. The star structure is denoted star($v_0$, s), s = {v | $v_0$, v ∈ V ∧ v is a neighbor of $v_0$}, where degress($v_0$) is the degree of vertex $v_0$;
Line structure: a structure formed by a chain of vertices connected end to end. The line structure is denoted line(s), s = {v | v ∈ V ∧ degress(v) ≤ 2}, where degress(v) is the degree of vertex v.
A graph may contain many cycle structures, sometimes nested within one another. Considering all of them would increase the complexity of the problem and would count some rings more than once. Since every cycle structure is composed of basic rings, the present invention considers only basic cycle structures. The approach is to find all rings in the graph by depth-first traversal with backtracking and then compare every pair of rings A and B: if A is a subset of B, ring B contains ring A, and B is deleted. What finally remains is the set of basic cycle structures.
To extract star and line structures, the degree of every vertex in the graph is computed first. Each vertex whose degree is at least 3 is taken as the center of a star structure. If a vertex has degree 1 and its adjacent vertex has degree at most 2, traversal continues through the adjacent vertices until a vertex of degree greater than 2 is reached; the vertices traversed form a line structure.
2) Extract the structural feature sequence of the graph data based on structural association degree
After all basic structures have been extracted, they are ranked by importance, which converts the graph data into a sequence of basic structures. The importance of each structure is measured by its association degree with the other structures:
Association: for any two basic structures $s_i$ and $s_j$ in a graph, if cvNum($s_i$, $s_j$) ≥ 1, then $s_i$ and $s_j$ are associated, written incident($s_i$, $s_j$) = 1; if cvNum($s_i$, $s_j$) = 0, then incident($s_i$, $s_j$) = 0 and the two structures are not associated. Formally:
$$\mathrm{incident}(s_i, s_j) = \begin{cases} 1 & \text{if } \mathrm{cvNum}(s_i, s_j) \ge 1 \\ 0 & \text{if } \mathrm{cvNum}(s_i, s_j) = 0 \end{cases}$$
where cvNum($s_i$, $s_j$) is the number of vertices shared by structures $s_i$ and $s_j$, and i ≠ j;
Association degree based on the number of associated structures: given a graph g containing N basic structures, the association degree of the i-th basic structure $s_i$ is:
$$\mathrm{sNum\_CD}(s_i) = \sum_{j=1,\, j \ne i}^{N} \mathrm{incident}(s_i, s_j)$$
where 1 ≤ i ≤ N; it follows that sNum_CD($s_i$) ≤ N - 1. If a basic structure s is associated with k basic structures, its association degree is sNum_CD(s) = k;
By these definitions the event data snapshot graph is converted into a sequence of basic structures ordered by association degree.
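The conversion from extracted structures to an ordered sequence can be sketched in Python as follows. The representation of a structure as a `(kind, vertex_set)` pair, the function names, and the tie-breaking rule (structures with equal association degree keep their extraction order) are all assumptions made for illustration; the text does not specify them.

```python
def cv_num(si, sj):
    """Number of vertices shared by two structures (cvNum in the text)."""
    return len(si & sj)

def association_degrees(structures):
    """sNum_CD for each structure: how many other structures share a vertex with it."""
    sets = [set(vertices) for _, vertices in structures]
    return [
        sum(1 for j, sj in enumerate(sets) if j != i and cv_num(si, sj) >= 1)
        for i, si in enumerate(sets)
    ]

def structure_sequence(structures):
    """Structure kinds ordered by decreasing association degree (stable on ties)."""
    degrees = association_degrees(structures)
    order = sorted(range(len(structures)), key=lambda i: -degrees[i])
    return [structures[i][0] for i in order]

# Usage: a cycle and a line both touch the star; a second line is isolated.
structures = [
    ("cycle", {1, 2, 3}),
    ("star", {3, 4, 5, 6}),
    ("line", {6, 7}),
    ("line", {8, 9}),
]
degrees = association_degrees(structures)
sequence = structure_sequence(structures)
```

Here the star shares vertices with two other structures, so it ranks first in the resulting sequence.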
3) Similarity query algorithm for structural feature sequences of graph data
The present invention computes the similarity between a source string S and a target string T by edit distance. The edit distance is the number, or the cost, of the minimum set of edit operations needed to transform S into T, where an edit operation deletes, inserts or replaces the character at some position of the string. Each operation has an associated cost, and the cost of a sequence of operations equals the sum of the costs of the individual operations in it.
In the basic structure sequence of an event data snapshot graph, a structure ranked earlier has a larger association degree and greater importance, and is more likely to represent a main feature of the graph. The first structure in the sequence is the most important one in the graph, so editing it should also cost the most. A monotonically decreasing exponential function $f(x) = a^{-x}$ is therefore defined as the cost of each single-character edit operation;
Sequence edit distance similarity: given a sequence database Set = $\{s_1, s_2, \ldots, s_n\}$, a query sequence qStr and an edit distance threshold τ, the sequence similarity query returns all sequences $s_i$ in Set that satisfy SED(qStr, $s_i$) < τ;
Given a query sequence, the edit distance between each sequence in the sequence database and the query sequence is computed, and all sequences in the database whose edit distance to the query sequence is less than the given cost threshold τ are returned. The pseudocode of the SED algorithm based on the weighted cost function is as follows:
[SED algorithm pseudocode, given as images BDA0000410054960000151 and BDA0000410054960000161 in the original publication]
The time complexity of the edit distance algorithm is O(mn) and its space complexity is O(mn); if the order of the edit operations does not need to be recorded, the space complexity is O(min(m, n)), where m and n are the lengths of the source string S and the target string T respectively.
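A position-weighted edit distance of this kind might look like the Python sketch below. It is not the patent's SED pseudocode, which is only available as an image; it is an illustration under stated assumptions: x in $f(x) = a^{-x}$ is taken to be the 1-based position of the edited element, a > 1, deletions are charged at the source position, insertions at the target position, and substitutions at the smaller of the two. The function names are hypothetical. The sketch keeps only two rows of the table, since the order of operations is not recorded.

```python
def weighted_edit_distance(src, dst, a=2.0):
    """Edit distance where an edit at 1-based position x costs a ** (-x)."""
    def cost(pos):
        return a ** (-pos)

    m, n = len(src), len(dst)
    prev = [0.0] * (n + 1)
    for j in range(1, n + 1):
        prev[j] = prev[j - 1] + cost(j)          # insert dst[0..j-1]
    for i in range(1, m + 1):
        cur = [prev[0] + cost(i)] + [0.0] * n    # delete src[0..i-1]
        for j in range(1, n + 1):
            same = src[i - 1] == dst[j - 1]
            substitute = prev[j - 1] + (0.0 if same else cost(min(i, j)))
            delete = prev[j] + cost(i)
            insert = cur[j - 1] + cost(j)
            cur[j] = min(substitute, delete, insert)
        prev = cur
    return prev[n]

def similar_sequences(database, query, tau, a=2.0):
    """Return all sequences whose weighted edit distance to `query` is below tau."""
    return [s for s in database if weighted_edit_distance(query, s, a) < tau]

# Usage: a mismatch in the first structure costs more than one in the last.
query = ["star", "cycle", "line"]
database = [
    ["star", "cycle", "line"],   # identical
    ["star", "cycle", "star"],   # differs in the last position
    ["line", "cycle", "line"],   # differs in the first position
]
matches = similar_sequences(database, query, tau=0.3)
```

With a = 2, a substitution in the first position costs 0.5 and one in the third position costs 0.125, so a threshold of 0.3 keeps the first two database sequences and rejects the third.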
Example algorithm implementation
When a data anomaly occurs in the sensor network (or a user issues a query), the event region is determined, the event graph is built, the event pattern graph database is queried, and the event type is determined from the returned results. The method of determining the event region and the method of querying the event graph are described below:
Determining the event region: in the first case, the user issues a query command and the event region is the one the user specifies; in the second case, a data anomaly occurs in the wireless sensor network, and the event region is the area covered by the sensor nodes showing the anomaly together with the other sensor nodes associated with them.
Querying the event graph: given an event graph, the event pattern graphs similar to it are found in the event pattern graph database and returned as the result.
Based on the above analysis, the DSG algorithm is given as follows:
[DSG algorithm pseudocode, given as images BDA0000410054960000162 and BDA0000410054960000171 in the original publication]
Experimental performance analysis
Event scenarios
Three particular events are defined according to practical application backgrounds: detecting a heat source in a region, detecting water flow in a region, and detecting a high-oxygen area in a region. The scenarios of these three events are introduced below:
Event scenario 1: based on the application background of fire prevention, a scenario of detecting a heat source in a region is designed. A heat source simulates the cause of a fire, and the effect of detecting the heat source in the region approximates the effect of fire detection in a real environment.
Event scenario 2: based on the application background of detecting seepage or water inrush in a tunnel, a scenario of detecting water flow in a region is designed. The effect of detecting water flow in the event region approximates the effect of detecting tunnel seepage or water inrush in a real environment.
Event scenario 3: based on the application background of detecting high-oxygen regions in a coal mine, a scenario of detecting a high-oxygen area in a region is designed. The effect of detecting the high-oxygen area in the event region approximates the effect of detecting high-oxygen regions in a real environment.
The simulated data needed for the experiments are synthesized from actually monitored heat source temperature data, relative humidity data of regions with flowing water, and coal mine oxygen concentration data. Simulation experiments are run for each of the three events and the event detection results are evaluated.
In the simulation experiments the wireless sensor network consists of 256 nodes distributed over a 16 × 16 spatial area, and this area represents the event region.
Description of the event pattern graphs
Event pattern graph 1 is an abstract description of the snapshot data of a fire at a given moment. The temperature is highest at the heat source and decreases with increasing distance from it; Fig. 2 approximately depicts this data pattern. Event pattern graph 1 thus approximates the data pattern of a fire with a small number of nodes and directed edges.
Event pattern graph 2 is an abstract description of the snapshot data of a region through which water flows at a given moment. The relative humidity is highest, and approximately equal, along the edge nearest the water flow, and is lower, and again approximately equal, in the area farther from the flow; Fig. 3 approximately depicts this data pattern. In the figure, the data of the 4 upper nodes are approximately equal to one another, as are the data of the 4 lower nodes, so the nodes within each group are connected by bidirectional edges; the data of the 4 lower nodes are higher than those of the 4 upper nodes, which is represented by unidirectional edges pointing from the node with the lower value to the node with the higher value. Event pattern graph 2 thus describes the data pattern of a region with flowing water using a small number of nodes and directed edges.
Event pattern graph 3 is an abstract description of the snapshot data of a high-oxygen distribution in a coal mine at a given moment. The oxygen concentration inside the high-oxygen area is high and essentially uniform, while the surrounding oxygen concentration is lower; Fig. 4 approximately depicts this data pattern. In the figure, the data of the 6 upper vertices are approximately equal, so they are all connected by bidirectional edges; the two lower vertices have lower oxygen concentration, so unidirectional edges point from them to the vertices with higher oxygen concentration. Event pattern graph 3 thus approximates the data pattern of a high-oxygen distribution.
The three event pattern graphs above each represent the data pattern of one particular event, but they can also represent the data patterns of other events. For example, event pattern graph 1 can also describe a gas leak: the gas concentration is highest at the center of the leak and decreases gradually with distance from the center, so event pattern graph 1 approximates the data pattern of a gas leak event as well. This shows that event pattern graphs are general and can describe the data patterns of complex events.
Experiments
Experiment 1: simulation of detecting a heat source event in a region. First, a data simulation program generates node snapshot data for 120 normal cases, where a normal case is a data state with no heat source in the region, including environments with stable temperature and environments with large temperature variation. Then the program generates node snapshot data for 120 cases containing a heat source. The center temperature of the simulated heat source is generated randomly, its center coordinates are chosen randomly within the event region, and its outward radiation pattern varies, for example with wind, without wind, or with an obstacle near the heat source. Parameters of the simulated data: the normal temperature range is [0 ℃, 40 ℃], the heat source center temperature range is [50 ℃, 100 ℃], and the overall temperature data width is 100 ℃. In the node association graph of the whole event region, each node is associated with its 8 surrounding nodes. The data pattern for detecting a heat source in the region is event pattern graph 1.
Fig. 5 shows the event detection results for node merging under different temperature relative errors. When nodes are merged with a small temperature relative error, the resulting event graph cannot correctly distinguish the normal case from the case with a heat source, and normal cases are falsely reported as containing a heat source. The reason is that with a small relative error few nodes are merged, many temperature levels remain between nodes, and the main trend of rising temperature is not reflected effectively, which causes a large number of false alarms. When nodes are merged with a larger temperature relative error, the normal case and the heat-source case can be distinguished effectively, but some heat-source cases are missed. With a larger relative error more nodes are merged, and the main trend of rising temperature is reflected effectively. In the normal case the temperature state in the resulting event graph is essentially uniform and shows no rising trend, whereas in the heat-source case the main rising trend is retained in the event graph, so the two cases can be told apart. For some heat-source cases, however, the heat source center temperature is relatively low and the temperature rise is not pronounced, so the temperature state in the resulting event graph changes little and the main rising trend is not reflected; these cases are missed and recall declines.
Experiment 2: simulation of detecting a water flow event in a region. First, a data simulation program generates node snapshot data for 120 normal cases, where a normal case is a data state with no water flow in the region, including environments with stable relative humidity and environments with large relative humidity variation. Then the program generates node snapshot data for 120 cases containing water flow. The simulated water enters the event region from different directions and either flows out in a straight line, flows out along a curve, or stops inside the event region without flowing out. Parameters of the simulated data: the normal relative humidity range is [30, 80], the relative humidity near the water flow exceeds the normal value by [5, 15], and the overall relative humidity data width is 50, where relative humidity is expressed in percent (%). In the node association graph of the whole event region, each node is associated with its 8 surrounding nodes. The data pattern of a water flow event in the region is event pattern graph 2.
Fig. 6 shows the event detection results for node merging under different humidity relative errors. When nodes are merged with a small humidity relative error, the resulting event graph essentially never contains the defined water flow pattern graph, whether the case is normal or contains water flow. The reason is that both the normal data and the data near the water flow show small variations, so with a small relative error the nodes cannot be merged effectively; the event graph then contains no water flow pattern graph and a large number of events are missed. As the humidity relative error increases, recall begins to rise, because the nodes near the water flow are merged effectively and reflect the water flow pattern. Precision, however, first falls and then rises. The variation of relative humidity in the normal case is concentrated in the range [2, 3], that is, a relative error in the range [4%, 6%], so event graphs of normal cases also contain the water flow pattern graph, which causes many false alarms on normal cases. As the humidity relative error increases further, the normal variation of relative humidity is smoothed out and the false alarm rate falls. When the humidity relative error becomes larger still, the data in the region become uniform, water flow events can no longer be detected, and missed detections increase.
Experiment 3: simulation of detecting a high-oxygen area in a region. First, a data simulation program generates node snapshot data for 120 normal cases, where a normal case is a data state with no high-oxygen area in the region, including environments with stable oxygen concentration and environments with large oxygen concentration variation. Then the program generates node snapshot data for 120 cases containing a high-oxygen area. The center coordinates of the simulated high-oxygen area are chosen randomly within the event region, and the size of the area varies. Parameters of the simulated data: the normal oxygen content range in the region is [15, 18], the oxygen content range of the high-oxygen area is [18, 21], and the overall oxygen content data width is 6, where oxygen content is expressed in %. In the node association graph of the whole event region, each node is associated with its 8 surrounding nodes. The data pattern of a high-oxygen area event in the region is event pattern graph 3.
Fig. 7 shows the event detection results for node merging under different oxygen content relative errors. As the oxygen content relative error increases, event recall rises steadily: more nodes are merged, the values of neighboring nodes become more nearly equal, and the data of the high-oxygen area become uniform and clearly higher than those of the normal area, matching the event pattern of a high-oxygen area, so the event is detected effectively. When the oxygen content relative error becomes too large, however, the data of the high-oxygen area and the normal area converge, the two can no longer be distinguished, and recall declines. Event precision stays at a high level throughout, because merging nodes in a normal case rarely produces the pattern of a region with uniform data that are higher than the surrounding data; the false alarm rate is therefore low and precision is high.
The experiments above show that, when the anomaly detection method based on data snapshot graphs builds the event graph, the data error range used for node merging directly affects the event detection results. Setting the merge error range reasonably is therefore the key factor in achieving efficient event detection.
The anomaly detection method based on data snapshot graphs models the snapshot data of a wireless sensor network as a graph, producing a data snapshot graph that abstractly describes the data characteristics of an event in the sensor network. The experiments show that, when modeling snapshot data as a graph, choosing a reasonable data error range for node merging is the key factor.
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make improvements and modifications without departing from the principles of the invention, and such improvements and modifications shall also fall within the scope of protection of the present invention.

Claims (4)

1. An anomaly detection method based on data snapshot graphs, characterized in that it comprises the following steps:
(1) collecting and preprocessing the detection data in the current monitored area of a wireless sensor network to determine the event-related region;
(2) obtaining the data set related to the current event, abstracting the event data set with a graph model, and converting the event data set into an event data snapshot graph;
(3) using a graph similarity search algorithm based on structural association degree to query an event pattern graph database, finding the event pattern graphs similar to the event data snapshot graph of the current event, and determining the type of the current event;
the event pattern graph database is a set of event pattern graphs, and each event pattern graph is an event data snapshot graph that abstractly describes an event type;
the event pattern graph is obtained from domain expert knowledge or from data analysis and is an event graph based on a data snapshot; the data snapshot is the data set of all nodes in the sensor network at the moment the event occurs, and the event graph built from this data set is the snapshot graph of the event at that moment, which is also the event pattern graph of the event;
the graph similarity search algorithm based on structural association degree extracts basic structures from the graph data, converts the graph data into a sequence of basic structures according to the association degree between those structures, and thereby converts the graph similarity query into a sequence similarity query.
2. The anomaly detection method based on data snapshot graphs according to claim 1, characterized in that: in step (1), a node association graph is built based on the physical correlation and the data correlation of the sensor nodes, and the event region is determined from the node association graph; the node association graph comprises the global node association graph and a subgraph of the global node association graph, and is built as follows:
the node association graph at time t is formally expressed as:
$G_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $f_v: V \to ID$ is the vertex labeling function, and graph vertices correspond one-to-one with sensor nodes; each node of the wireless sensor network forms one vertex of the node association graph;
let $d(v_i)_t$ denote the monitoring data of vertex $v_i$ at time t; the edge set E is constructed as follows: for any two vertices $v_1, v_2 \in V$, if the sensor nodes corresponding to $v_1$ and $v_2$ are single-hop communication neighbors, or if they are communication neighbors within k hops and there exist functions $f_1$ and $f_2$ such that $f_1(d(v_1)_t) = f_2(d(v_2)_t)$, then the edge $(v_1, v_2) \in E$ exists;
the event-related region is determined as follows: at event detection time t, for any vertex $v_i \in V$, if $|d(v_i)_{t-1} - d(v_i)_t| / |d(v_i)_{t-1} + d(v_i)_t| \le e$, then $v_i$ is an event-related vertex, and the region containing all event-related vertices at time t is the event-related region, where the constant e is a preset value;
once the event boundary has been determined, the node association graph is a subgraph of the global node association graph, defined as follows:
$Ge_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, containing all event-related vertices [formula given as image FDA0000410054950000021 in the original publication]; E is the edge set of the graph [formula given as image FDA0000410054950000022 in the original publication]; ID is the set of vertex identifiers [formula given as image FDA0000410054950000023 in the original publication]; $f_v: V \to ID$ is the vertex labeling function, and graph vertices correspond one-to-one with sensor nodes.
3. the method for detecting abnormality based on data snapshot figure according to claim 2, it is characterized in that: in described step (2), with graph model abstract event data collection, convert event data collection to event data snapshot plotting, described data snapshot and event data snapshot plotting are as follows:
1) wireless sense network data snapshot S is defined as follows:
For the wireless sense network N with k node, it comprises node is { n 1, n 2..., n k, N is set { d (n in the data snapshot of moment t 1) t, d (n 2) t..., d (n k) t;
2) The event data snapshot graph Gs_t at time t is computed from the node association graph Ge_t at time t according to the correlation of the node data; formally:
Gs_t = <V, E, ID, DV, f_v, g_v>
where: V is the vertex set of the graph and comprises all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; DV = {d(v_i)_t} is the set of values monitored at time t by all sensor nodes in the event area; f_v: V → ID is the vertex labeling function, and graph vertices correspond one-to-one with sensor nodes; g_v: V → DV is the vertex data mapping function;
The vertex set of the event data snapshot graph Gs_t at time t is identical to the vertex set of the node association graph Ge_t at time t and comprises all sensor nodes in the event area;
The edge set E(Gs_t) of the event data snapshot graph Gs_t at time t is constructed as follows:
A) for any edge (v_1, v_2) ∈ E(Ge_t), if (d(v_1)_t - d(v_2)_t) / (d(v_1)_t + d(v_2)_t) > e, there is a directed edge <v_2, v_1> ∈ E(Gs_t);
B) for any edge (v_1, v_2) ∈ E(Ge_t), if (d(v_2)_t - d(v_1)_t) / (d(v_1)_t + d(v_2)_t) > e, there is a directed edge <v_1, v_2> ∈ E(Gs_t);
C) for any edge (v_1, v_2) ∈ E(Ge_t), if |d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e, there are both a directed edge <v_2, v_1> ∈ E(Gs_t) and a directed edge <v_1, v_2> ∈ E(Gs_t);
where the constant e is a preset value; the event data snapshot graph is a directed graph that describes the data state of each node in the event area of the wireless sensor network and the relationships between those data states;
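As an illustration only, rules A) to C) can be sketched in Python as below. The function name `build_snapshot_edges`, the dictionary-based data layout, and the decision to skip a pair whose readings sum to zero are assumptions made for this sketch and are not part of the claim.

```python
def build_snapshot_edges(assoc_edges, data, e):
    """Derive the directed edge set E(Gs_t) from the undirected edges of Ge_t.

    assoc_edges: iterable of (v1, v2) pairs, the edges of Ge_t
    data:        dict mapping a vertex to its monitored value d(v)_t
    e:           the preset constant from the claim
    """
    directed = set()
    for v1, v2 in assoc_edges:
        total = data[v1] + data[v2]
        if total == 0:
            # Assumption: the ratio is undefined here, so the pair is skipped.
            continue
        if (data[v1] - data[v2]) / total > e:      # rule A
            directed.add((v2, v1))
        if (data[v2] - data[v1]) / total > e:      # rule B
            directed.add((v1, v2))
        if abs(data[v1] - data[v2]) / total < e:   # rule C
            directed.add((v2, v1))
            directed.add((v1, v2))
    return directed


readings = {"n1": 30.0, "n2": 10.0, "n3": 29.0}
edges = build_snapshot_edges([("n1", "n2"), ("n1", "n3")], readings, e=0.1)
```

With these hypothetical readings, n1 and n2 differ strongly, so only the edge from n2 to n1 is produced, while n1 and n3 are approximately equal and are linked in both directions.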
3) The event data snapshot graph is simplified by merging sensor nodes according to the following rules:
A) The data of the merged nodes must be approximately equal: for v_2, v_1 ∈ V(Gs_t), if <v_1, v_2> ∈ E(Gs_t) and |d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e, then v_2 and v_1 are merged into one new node;
B) When two or more approximately equal nodes are merged into one new node, all edges associated with those nodes become associated with the new node.
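The merging rules can be sketched with a union-find structure, as below. This is a minimal sketch under stated assumptions: the name `merge_similar_nodes` is hypothetical, merged nodes are represented as frozensets of the original vertices, and merging is applied transitively along chains of approximately equal neighbors, which the claim does not spell out.

```python
def merge_similar_nodes(vertices, directed_edges, data, e):
    """Merge adjacent vertices with approximately equal data (rule A) and
    re-attach their edges to the merged node (rule B)."""
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v1, v2 in directed_edges:
        total = data[v1] + data[v2]
        if total != 0 and abs(data[v1] - data[v2]) / total < e:
            parent[find(v1)] = find(v2)

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    label = {v: frozenset(groups[find(v)]) for v in vertices}

    # Edges inside a merged node disappear; all others point at merged nodes.
    merged_edges = {(label[a], label[b])
                    for a, b in directed_edges if label[a] != label[b]}
    return set(label.values()), merged_edges


readings = {"n1": 30.0, "n2": 10.0, "n3": 29.0}
snapshot_edges = {("n2", "n1"), ("n1", "n3"), ("n3", "n1")}
nodes, merged = merge_similar_nodes(readings.keys(), snapshot_edges, readings, 0.1)
```

In this hypothetical example n1 and n3 collapse into a single node, and the edge from n2 now points at that merged node.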
4. The anomaly detection method based on data snapshot graphs according to claim 1, characterized in that: in step (3), the graph similarity algorithm based on the structural association degree first extracts the structural feature sequence of the graph data on the basis of the structural association degree, thereby converting the graph similarity query into a similarity query over structural feature sequences; it then searches the event pattern graph database for event pattern graphs similar to the event data snapshot graph and judges the type of the current event. The detailed process comprises the following steps:
1) The basic structures of graph data are defined as the ring structure, the star structure, and the linear structure. Compared with other structure types such as frequent subgraphs and frequent subtrees, these three basic structures are easier to obtain and still capture the basic structural information of the graph. The three basic structures are defined as follows:
Ring structure: a set of vertices in the graph forms a closed ring with at least 3 edges; the ring structure is denoted cycle(s), s = {v | v ∈ V ∧ the vertices v form a ring}, where the closed ring must not have other rings nested in it, that is, it is a simple ring;
Star structure: a core vertex v_0 in the graph is connected to several other vertices, no two of which are connected to each other, and degress(v_0) ≥ 3; the star structure is denoted star(v_0, s), s = {v | v_0, v ∈ V ∧ v is a neighbor of v_0}, where degress(v_0) denotes the degree of node v_0;
Linear structure: a structure formed by a chain of vertices connected end to end; the linear structure is denoted line(s), s = {v | v ∈ V ∧ degress(v) ≤ 2}, where degress(v) denotes the degree of node v;
2) The basic structures are extracted as follows:
1. Use depth-first traversal with backtracking to find all ring structures in the graph;
2. Compare any two ring structures A and B; if A is a subset of B, then ring structure B contains ring structure A, and ring structure B is deleted;
3. Repeat step 2 until no ring structure contains another ring structure, which yields all simple rings;
4. Compute the degree of each vertex in the graph; each vertex whose degree is greater than or equal to 3 is taken as the core of a star structure;
5. Compute the degree of each vertex in the graph; if a vertex has degree 1 and its adjacent vertex has degree less than or equal to 2, continue traversing adjacent vertices until a vertex with degree greater than 2 is reached, which forms a linear structure;
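The five extraction steps can be sketched for a small undirected graph as follows. The function names, the adjacency-list representation, and the choice to leave the terminating high-degree vertex out of a linear structure are assumptions of this sketch; vertex identifiers are assumed to be mutually comparable, and rings are compared by their vertex sets.

```python
def find_simple_cycles(adj):
    """Steps 1-3: enumerate rings by DFS with backtracking, then drop every
    ring whose vertex set strictly contains the vertex set of another ring."""
    cycles = set()

    def dfs(start, current, path, visited):
        for nxt in adj[current]:
            if nxt == start and len(path) >= 3:
                cycles.add(frozenset(path))
            elif nxt not in visited and nxt > start:
                # Only visit vertices larger than start, so each ring is
                # discovered from its smallest vertex.
                visited.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    for start in adj:
        dfs(start, start, [start], {start})
    return [c for c in cycles if not any(other < c for other in cycles)]


def find_stars(adj):
    """Step 4: every vertex of degree >= 3 is the core of a star."""
    return {v: set(nbrs) for v, nbrs in adj.items() if len(nbrs) >= 3}


def find_lines(adj):
    """Step 5: walk from a degree-1 vertex through vertices of degree <= 2."""
    lines, seen = [], set()
    for v in adj:
        if len(adj[v]) != 1 or v in seen or len(adj[adj[v][0]]) > 2:
            continue
        path, prev, cur = [v], None, v
        while True:
            nxts = [n for n in adj[cur] if n != prev]
            if not nxts or len(adj[nxts[0]]) > 2:
                break  # end of chain, or a vertex of degree > 2 was reached
            prev, cur = cur, nxts[0]
            path.append(cur)
        seen.update(path)
        lines.append(path)
    return lines


# Hypothetical graph: triangle 1-2-3 with a tail 3-4-5.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
# Hypothetical graph: square 1-2-3-4 with diagonal 1-3 (one nested ring).
square = {1: [2, 3, 4], 2: [1, 3], 3: [1, 2, 4], 4: [1, 3]}
```

Exhaustive ring enumeration grows quickly with graph size, so this sketch is only meant for the small event-area graphs that remain after node merging.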
3) The structural feature sequence of graph data is extracted on the basis of the structural association degree as follows:
The basic structures are ordered by importance, which converts the graph-structured data into a sequence of basic structures; the importance of each structure is measured by the association degree between structures:
Association: for any two basic structures s_i and s_j in a graph, if cvNum(s_i, s_j) ≥ 1, then structure s_i and structure s_j are associated, denoted incident(s_i, s_j) = 1; if cvNum(s_i, s_j) = 0, then incident(s_i, s_j) = 0, meaning that structure s_i and structure s_j are not associated. The association is formally defined as:
$$incident(s_i, s_j) = \begin{cases} 1 & \text{if } cvNum(s_i, s_j) \geq 1 \\ 0 & \text{if } cvNum(s_i, s_j) = 0 \end{cases}$$
where cvNum(s_i, s_j) denotes the number of vertices shared by structure s_i and structure s_j, and i ≠ j;
Association degree based on the number of associated structures: given a graph g that contains N basic structures, the association degree of the i-th basic structure s_i is:
Figure FDA0000410054950000051
where 1 ≤ i ≤ N and sNum_CD(s_i) ≤ N-1; if a basic structure s is associated with k basic structures, the association degree of s is sNum_CD(s) = k;
According to the above definitions, the event data snapshot graph is converted into a sequence of basic structures ordered by association degree;
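A minimal sketch of this conversion is given below. It assumes that sNum_CD(s_i) is the count of other structures sharing at least one vertex with s_i, which follows from the statement that a structure associated with k structures has association degree k. The function name, the (kind, vertex set) representation, and breaking ties by input order are assumptions of the sketch.

```python
def association_sequence(structures):
    """Order basic structures by descending association degree.

    structures: list of (kind, vertex_set) pairs, e.g. ("star", {1, 2, 3, 4})
    returns:    list of (kind, degree) pairs, most associated structure first
    """
    def incident(a, b):
        # 1 if the two structures share at least one vertex, else 0.
        return 1 if len(a & b) >= 1 else 0

    degrees = [
        sum(incident(si, sj)
            for j, (_, sj) in enumerate(structures) if j != i)
        for i, (_, si) in enumerate(structures)
    ]
    # sorted() is stable, so structures with equal degree keep input order.
    order = sorted(range(len(structures)), key=lambda i: -degrees[i])
    return [(structures[i][0], degrees[i]) for i in order]


# Hypothetical structures from a triangle 1-2-3 with a tail 3-4-5.
sequence = association_sequence([
    ("cycle", {1, 2, 3}),
    ("star", {3, 1, 2, 4}),
    ("line", {5, 4}),
])
```

Here the star shares vertices with both other structures, so it ranks first with association degree 2.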
4) The similarity query algorithm over structural feature sequences of graph data proceeds as follows:
The similarity between a source string S and a target string T is computed by edit distance. The edit distance is the number, or cost, of the minimum edit operations needed to transform S into T, where an edit operation deletes, inserts, or replaces the character at some position of the string; each operation has an associated cost, and the cost of a given sequence of operations equals the sum of the costs of the individual operations in the sequence;
In the basic structure sequence of an event data snapshot graph, the earlier a structure ranks, the larger its association degree, the greater its importance, and the more likely it represents a main feature of the graph. The first structure in the sequence is the most important in the graph, so editing it should also cost the most; accordingly, a monotonically decreasing exponential function f(x) = a^(-x) is defined as the cost function of each single-character edit operation;
Sequence edit distance similarity: given a sequence database Set = {s_1, s_2, ..., s_n}, a query sequence qStr, and an edit distance threshold τ, the result of the sequence similarity query is all sequences s_i in Set that satisfy SED(qStr, s_i) < τ;
Given a query sequence, the edit distance between each sequence in the sequence database and the query sequence is computed, and all sequences in the database whose edit distance to the query sequence is less than the given cost threshold τ are returned.
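The position-weighted edit distance and the threshold query can be sketched with standard dynamic programming, as below. Several details are assumptions of this sketch rather than statements of the claim: positions are counted from 0, deletions and substitutions are charged at the source position, insertions at the target position, the base a defaults to 2, and each basic structure is encoded as one character (S for star, C for cycle, L for line).

```python
def sed(source, target, a=2.0):
    """Sequence edit distance with position-dependent cost f(x) = a ** (-x)."""
    def cost(pos):
        return a ** (-pos)

    m, n = len(source), len(target)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + cost(i - 1)          # delete source[i-1]
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + cost(j - 1)          # insert target[j-1]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = source[i - 1] == target[j - 1]
            d[i][j] = min(
                d[i - 1][j] + cost(i - 1),                         # deletion
                d[i][j - 1] + cost(j - 1),                         # insertion
                d[i - 1][j - 1] + (0.0 if same else cost(i - 1)),  # substitution
            )
    return d[m][n]


def similar_sequences(database, query, tau, a=2.0):
    """Return every sequence s in database with SED(query, s) < tau."""
    return [s for s in database if sed(query, s, a) < tau]


matches = similar_sequences(["SCL", "XCL", "SCX"], "SCL", tau=0.5)
```

Under these assumptions, changing the first structure of a sequence costs 1, while changing the third costs only 0.25, so a mismatch in the leading, most important structure pushes a candidate over the threshold sooner.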
CN201310549381.4A 2013-11-07 2013-11-07 Method for detecting abnormality based on data snapshot figure Expired - Fee Related CN103561420B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310549381.4A CN103561420B (en) 2013-11-07 2013-11-07 Method for detecting abnormality based on data snapshot figure

Publications (2)

Publication Number Publication Date
CN103561420A true CN103561420A (en) 2014-02-05
CN103561420B CN103561420B (en) 2016-06-08

Family

ID=50015537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310549381.4A Expired - Fee Related CN103561420B (en) 2013-11-07 2013-11-07 Method for detecting abnormality based on data snapshot figure

Country Status (1)

Country Link
CN (1) CN103561420B (en)

Also Published As

Publication number Publication date
CN103561420B (en) 2016-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160608

Termination date: 20191107
