CN114401136A - Rapid anomaly detection method for multiple attribute networks - Google Patents


Info

Publication number
CN114401136A
Authority
CN
China
Prior art keywords
abnormal
network
attribute
private
networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210042389.0A
Other languages
Chinese (zh)
Other versions
CN114401136B (en
Inventor
张欣悦
武南南
王文俊
张宁
孙英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202210042389.0A
Publication of CN114401136A
Application granted
Publication of CN114401136B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a rapid anomaly detection method for multiple attribute networks. An anomaly detection model is built locally over multiple private attribute networks; within each private network, a near-optimal anomaly query method detects anomalous subgraphs of a specific shape and aligns them with the anomalous subgraphs in a public network. The method comprises the following steps: initializing the upper-bound and lower-bound result sets as empty sets, setting the iteration number i = 0, and selecting the first m nodes as the upper-bound structure; decomposing the query graph Q into star structures; merging the upper bound (Figure DDA0003470790430000011) and lower bound (Figure DDA0003470790430000012) of the anomalous subgraph obtained in the previous iteration to obtain a temporary optimal structure S; when the anomaly scores of the upper bound (Figure DDA0003470790430000013) and lower bound (Figure DDA0003470790430000014) are close, returning the approximate anomaly query result (Figure DDA0003470790430000015); otherwise iterating until the stop condition is met. The method can uncover IP addresses that exhibit similar attack behavior at different times, so an attacked website can avoid the attack risk with high probability simply by blocking the IPs of the identified network segment.

Description

Rapid anomaly detection method for multiple attribute networks
Technical Field
The invention belongs to the field of federated anomaly detection across multiple attribute networks, and in particular relates to a rapid anomaly detection method for multiple attribute networks.
Background
The federated anomaly detection problem is to find associated anomalous subgraphs in multi-layer private attribute graph data. Anomalous subgraph detection has been widely applied to attack detection in computer networks, detection of public-opinion outbreaks in social networks, congestion detection in traffic networks, and many other applications.
Anomaly detection currently faces two major challenges. First, data held in isolation by most industries can be shared with other industries only to a limited extent, for reasons of data privacy and security. Second, conventional anomaly detection must process the entire network to identify an anomaly, while the volume of data generated each day in fields such as the Internet grows exponentially, so results cannot be obtained quickly; and once a result is obtained, few methods can mine the relationships among anomalous nodes or explain the cause of the anomalous event.
For the private graph attribute data of a multi-layer attribute network, a near-optimal federated anomaly detection method is generally adopted. The near-optimal anomaly query abstracts existing spatio-temporal data into a connected private attribute network (attribute graph) and matches it against a known behavior pattern to obtain the most relevant and most anomalous part of the network, so as to explore the anomalous connections between nodes and the cause of the anomaly within a single-layer network. Each private attribute network is then aligned with the anomalies on the public attribute network to mine what the anomalies have in common across these events.
Disclosure of Invention
To address the problems in the prior art, the invention mines anomalies with a specific structural pattern that exist across multiple private attribute networks while protecting privacy, which can guide the formulation of corresponding policies and uncover latent anomaly information. Given multiple computer attack networks covering different time periods, the method can uncover IP addresses that exhibit similar attack behavior at different times, and the related fixed network segment and attack pattern are obtained from the real records. An attacked website can therefore avoid the attack risk with high probability simply by blocking the IPs of that network segment.
The invention is implemented by the following technical scheme.
A rapid anomaly detection method for multiple attribute networks comprises the following steps:
1) constructing multiple attribute networks as required and calculating the anomaly attribute values of the network nodes according to the following formula:
Figure RE-GDA0003565469600000011
wherein n is the number of all nodes in the network; the attribute networks are $G^* = \{G_i\}$, $i \in \{0, 1, \ldots, N\}$, where $G_i = (V_i, E_i, P_i)$ denotes the i-th network and $V_i$, $E_i$, $P_i$ denote the node set, edge set and anomaly attribute set of $G_i$, respectively;
2) inputting the edge sets and anomaly attribute sets of a public network and multiple private networks, and presetting the parameters of the networks to be tested, the parameters comprising an anomaly threshold α and an alignment threshold σ; initializing the result set $U_i$ as the empty set and setting the iteration number i = 0;
3) each of the private networks $G_i$ downloading the public network locally and pre-aligning with it to obtain the set of alignment probability matrices $H_{ij}$;
4) obtaining the result of the previous iteration and detecting the near-optimal anomalous subgraph $S_j^*$ of the private network;
5) aligning with the public network to obtain the result set $U_j$ and uploading $U_j$ to the cloud;
6) merging, in the cloud, the results uploaded by the private networks, aligning them with the public network, and collecting all aligned anomalous subgraphs $U^*$ into the set $U_{i+1}$;
7) treating the nodes of the optimal anomalous subgraph $S_j^*$ in each private attribute network $G_j$ as normal nodes;
8) when $U_i = U_{i+1}$, returning the aligned anomalous subgraph set $S^*$ together with U as the aligned output:
Figure RE-GDA0003565469600000021
otherwise setting i = i + 1 and repeating steps 4) to 7) until the stop condition of 8) is satisfied.
Advantageous effects
The invention builds an anomaly detection model over attribute networks distributed across multiple local private data sets, where the overall network consists of multiple private attribute networks and one public attribute network. In each private attribute network, the detected anomalous subgraph of a specific shape is aligned with the anomalous subgraph in the public attribute network. Anomaly detection uses a near-optimal anomaly query method, which combines the anomaly computation of linear-time subset scanning with approximate querying based on an anomaly pattern. This overcomes the low speed, poor robustness, high data cost and unexplainable results of conventional anomaly detection algorithms: by knowing the anomalous behavior pattern in advance, the method mines results quickly and can analyze the cause of the anomaly. In the public attribute network, important public anomalous subgraphs are selected for anomalous subgraph alignment while data leakage is prevented.
Drawings
FIG. 1 is a flow chart of the method of the invention.
FIG. 2 is a schematic diagram of the concept of the rapid anomaly detection method for specific structures based on federated anomaly detection.
FIG. 3 is a schematic diagram of the near-optimal anomaly query method in a private network.
FIG. 4 is a schematic diagram of query graph settings suitable for different data scenarios.
FIG. 5 is a schematic diagram of the method applied to attack detection in private computer attribute networks.
FIG. 6 is a schematic diagram of a group of related anomalous IPs found by applying the method to multiple computer attack networks.
FIG. 7 is a general flow chart of the federated anomaly alignment algorithm of the invention.
FIG. 8 is a flow chart of the near-optimal anomaly detection algorithm of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in detail below with reference to the accompanying drawings and examples, which are illustrative only and do not limit the scope of the invention.
As shown in FIG. 1, the invention provides a rapid anomaly detection method for multi-layer attribute networks, comprising the following steps:
S1, constructing a public network, multiple private networks and a query graph pattern as required, and calculating the anomaly attribute values of the nodes.
Multiple attribute networks $G^* = \{G_i\}$, $i \in \{0, 1, \ldots, N\}$, are built as required, where $G_i = (V_i, E_i, P_i)$ denotes the i-th network and $V_i$, $E_i$, $P_i$ denote the node set, edge set and anomaly attribute set of $G_i$, respectively. N is the number of attribute networks. $G_0$ is the public network and the remaining networks are private networks. The multiple networks may be formed by dividing data from the same source into time slices, or directly from data of several different sources. The query graph is set as shown in FIG. 4, and different behavior patterns can use different query graph structures: for example, network attacks use a star or bipartite structure, while river pollution uses a linear structure.
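The time-slice construction mentioned above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the record layout `(source, target, timestamp)` and the function name `split_edges_by_time_slice` are assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def split_edges_by_time_slice(
    records: Iterable[Tuple[str, str, float]], slice_length: float
) -> Dict[int, Set[Edge]]:
    """Group (source, target, timestamp) records into one edge set per time slice.

    Hypothetical helper: each returned edge set plays the role of E_i for one network.
    """
    if slice_length <= 0:
        raise ValueError("slice_length must be positive")
    networks: Dict[int, Set[Edge]] = defaultdict(set)
    for source, target, timestamp in records:
        # Slice k covers timestamps in [k * slice_length, (k + 1) * slice_length).
        networks[int(timestamp // slice_length)].add((source, target))
    return dict(networks)
```

Each slice can then be given its own node anomaly attributes, computed only from the records that fall inside that slice.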
The anomaly attribute value of a node is generally defined as an empirical p-value. The empirical p-value of a node is evaluated under the null hypothesis $H_0$ that no network attack has occurred; if the resulting probability for an IP node is less than or equal to the anomaly threshold α, a network attack event is judged to have occurred. The empirical p-value of a node v for a feature d can be defined as:
Figure RE-GDA0003565469600000031
wherein I is an indicator function whose value is 1 if $f_d(v^{(t)}) \ge f_d(v)$ and 0 otherwise; $f_d(v^{(t)})$ is the anomaly feature (for example, the IP access volume) of node v before day t, and $f_d(v)$ is the anomaly feature on day t. The lower the empirical p-value, the more anomalous the node.
If the attribute values of v are not recorded over time, the attribute value of v itself is used as the observed value $c_v$ and the attribute values of the other nodes of the network are used as comparison values $c_i$; the p-value is then calculated according to the following equation.
Figure RE-GDA0003565469600000032
where N is the number of all nodes in the network.
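The two empirical p-value variants described above can be sketched in Python. The function names, the choice to normalize the temporal variant by the history length, and the choice to compare against the other nodes while dividing by N in the static variant are all assumptions; the patent's formula images are not reproduced here.

```python
from typing import Sequence


def empirical_p_value_temporal(history: Sequence[float], current: float) -> float:
    """Fraction of earlier observations f_d(v^(t)) that are >= the current value f_d(v).

    Assumption: the count is normalized by the number of historical observations.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    exceed = sum(1 for past in history if past >= current)
    return exceed / len(history)


def empirical_p_value_static(values: Sequence[float], index: int) -> float:
    """Fraction of nodes whose attribute c_i is >= the observed value c_v.

    Assumption: only the other nodes are compared, and the count is divided by N,
    the number of all nodes.
    """
    observed = values[index]
    exceed = sum(1 for i, c in enumerate(values) if i != index and c >= observed)
    return exceed / len(values)
```

For example, a node with 40 accesses today and a history of 10, 12, 50 and 11 accesses gets a temporal p-value of 0.25 under these assumptions.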
S2, setting an anomaly threshold α (typically 0.15) and an alignment threshold σ (typically 0.6), initializing the result set $U_i$ as the empty set, and setting the iteration number i = 0.
Nodes whose anomaly attribute value is less than or equal to α are marked as anomalous nodes. If the alignment probability of a node pair $(v_i, v_j)$ is greater than the alignment threshold σ, $v_i$ and $v_j$ are marked as closely related aligned nodes, where $v_i$ and $v_j$ must come from different networks. The edge sets and anomaly attribute sets of the public network and the private networks are taken as input. U is the set of anomalous aligned subgraphs of all private networks on the public network and must be initialized to the empty set.
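The two threshold tests can be sketched as below. The dictionary-based representations (`p_values` keyed by node, `H_ij` keyed by node pair) and the function names are assumptions made for illustration.

```python
from typing import Dict, Hashable, Set, Tuple

Node = Hashable


def mark_anomalous(p_values: Dict[Node, float], alpha: float = 0.15) -> Set[Node]:
    """Nodes whose anomaly attribute value (empirical p-value) is <= alpha."""
    return {v for v, p in p_values.items() if p <= alpha}


def aligned_pairs(
    H_ij: Dict[Tuple[Node, Node], float], sigma: float = 0.6
) -> Set[Tuple[Node, Node]]:
    """Node pairs (v_i, v_j) whose alignment probability is strictly greater than sigma."""
    return {pair for pair, prob in H_ij.items() if prob > sigma}
```

Note the asymmetry taken from the text: the anomaly test is inclusive (less than or equal to α), while the alignment test is strict (greater than σ).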
S3, each private network downloads the public network locally and pre-aligns with it to obtain the set of alignment probability matrices H.
This step is implemented with the CrossMNA multi-network alignment framework: the edge sets of the networks and the known anchor links (which identify the same entity in different attribute networks) are given as input, the training ratio is set to 1.0 and the number of training iterations to 400. The alignment anchor-link matrices $H_{ij}$ between networks are obtained and collected as H (FIG. 2). Other network alignment algorithms may also be used for this step.
$H_{ij}$ is the alignment probability matrix between the nodes of $G_i$ and $G_j$, with dimension $|V_i| \times |V_j|$. Each entry $H_{ij}(v_i, v_j)$ is the probability that nodes $v_i$ and $v_j$ are aligned anchor nodes and lies in [0, 1]; the larger the value, the higher the alignment probability, and a value of 1 indicates that the two nodes are known anchor nodes.
S4, obtaining the result of the previous iteration; each private network $G_j$ performs local alignment and detects its near-optimal anomalous subgraph $S_j^*$.
In this step, to evaluate the degree of anomaly of each private network's anomalous subgraph, a nonparametric graph scan statistic F is introduced as the anomaly score function:
Figure RE-GDA0003565469600000041
where (Figure RE-GDA0003565469600000042) is a statistical function, S is a connected subset of the vertices of G (that is, a connected subgraph), α is the anomaly threshold of a node (typically 0.15), $N_\alpha(S)$ is the number of anomalous nodes in S (those whose anomaly attribute value is less than or equal to α), and N(S) is the number of all nodes in S. In addition, to guarantee that the anomaly is maximized, the statistical function (Figure RE-GDA0003565469600000043) must satisfy two properties: its value (Figure RE-GDA0003565469600000044) increases monotonically with $N_\alpha(S)$, and decreases monotonically as the number of normal nodes $N(S) - N_\alpha(S)$ increases. The invention therefore uses the Higher Criticism (HC) statistic as (Figure RE-GDA0003565469600000045):
Figure RE-GDA0003565469600000046
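In the nonparametric scan statistics literature the Higher Criticism statistic is commonly written as $(N_\alpha - N\alpha) / \sqrt{N\alpha(1-\alpha)}$. The sketch below assumes that form, since the patent's formula image is not reproduced here; the function names are illustrative only.

```python
import math
from typing import Iterable


def higher_criticism(n_alpha: int, n: int, alpha: float) -> float:
    """Higher Criticism score for a subset with n nodes, n_alpha of them anomalous.

    Assumed form: (N_alpha - N * alpha) / sqrt(N * alpha * (1 - alpha)).
    """
    if n <= 0:
        raise ValueError("the subset must contain at least one node")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return (n_alpha - n * alpha) / math.sqrt(n * alpha * (1.0 - alpha))


def anomaly_score(p_values: Iterable[float], alpha: float = 0.15) -> float:
    """Score a subgraph S from the p-values of its nodes at a fixed threshold alpha."""
    p_list = list(p_values)
    n_alpha = sum(1 for p in p_list if p <= alpha)
    return higher_criticism(n_alpha, len(p_list), alpha)
```

With this form, adding an anomalous node raises the score and adding a normal node lowers it, which matches the two monotonicity properties required above.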
The steps for detecting the near-optimal anomalous subgraph of the private network are as follows:
4.1) Initialize the upper-bound and lower-bound result sets as empty sets and set the iteration number i = 0; input the edge set and anomaly feature set of the private attribute network and the edge set of the query graph.
4.2) Calculate the anomaly priority of the nodes. g() is the priority function: given an attribute graph G, it sorts the nodes of the graph by anomaly feature value, so that nodes with higher priority are more anomalous, and outputs the sorted nodes.
4.3) Select the first m nodes as the upper-bound structure.
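Steps 4.2) and 4.3) can be sketched as a sort followed by a slice. The function names and the tie-breaking rule (by node label) are assumptions added for determinism; the p-values in the usage note are hypothetical.

```python
from typing import Dict, List


def priority_order(p_values: Dict[str, float]) -> List[str]:
    """Sketch of g(): order nodes by ascending empirical p-value (most anomalous first).

    Ties are broken by node label, an assumption made only for reproducibility.
    """
    return sorted(p_values, key=lambda v: (p_values[v], v))


def select_upper_bound_nodes(p_values: Dict[str, float], m: int) -> List[str]:
    """Take the first m nodes in priority order as the initial upper-bound structure."""
    if m <= 0:
        raise ValueError("m must be positive")
    return priority_order(p_values)[:m]
```

For instance, with hypothetical p-values `{"v3": 0.01, "v6": 0.02, "v8": 0.05, "v7": 0.1, "v1": 0.9}` and m = 4, the selected nodes are v3, v6, v8 and v7.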
4.4) Decompose the query graph Q into star structures. For the upper-bound node set (Figure RE-GDA0003565469600000047), detect subgraphs isomorphic to the star structures to obtain, for each point, (Figure RE-GDA0003565469600000048). The MaxQ function takes the node set of the upper bound (Figure RE-GDA0003565469600000049) together with its neighbors, matches the part of the attribute graph G that is similar to the query graph Q, and converts it into the lower-bound structure (Figure RE-GDA00035654696000000410):
Figure RE-GDA00035654696000000411
The query graph Q is decomposed into star substructures in which each node acts as the central node or as a leaf node of a star. When the query graph has no attributes, only star structures with different numbers of leaves need to be retained. The function Star(v) returns the subgraph of the attribute graph G formed by node v and its first-order neighbors. (Figure RE-GDA00035654696000000412) is the star subgraph that maximizes the anomaly score function and is isomorphic to a star structure from the query graph decomposition. Following a greedy strategy, the subgraphs (Figure RE-GDA00035654696000000413) are spliced one by one, and argmin (Figure RE-GDA00035654696000000414) yields the part most similar to the query graph; the splicing result is used as the lower bound (Figure RE-GDA00035654696000000415).
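The star decomposition of a query graph can be sketched as follows: every node becomes the center of one star whose leaves are its neighbors. The function names and the edge-list input format are assumptions; the example query graph is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def decompose_into_stars(edges: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each node of an undirected query graph to the sorted leaves of its star."""
    adjacency: Dict[str, set] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return {center: sorted(leaves) for center, leaves in adjacency.items()}


def distinct_leaf_counts(stars: Dict[str, List[str]]) -> List[int]:
    """Leaf counts that need to be kept when the query graph carries no attributes."""
    return sorted({len(leaves) for leaves in stars.values()})
```

For a four-node query graph with edges a-b, a-c, a-d, b-c and b-d, the decomposition gives two stars with 3 leaves (centers a and b) and two stars with 2 leaves (centers c and d), so only the leaf counts 2 and 3 need to be retained.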
4.5) Merge the upper bound (Figure RE-GDA00035654696000000416) and lower bound (Figure RE-GDA00035654696000000417) of the anomalous subgraph obtained in the previous iteration to obtain the temporary optimal structure S.
4.6) Update the upper-bound result set (Figure RE-GDA00035654696000000418). In the step (Figure RE-GDA00035654696000000419), updating the upper-bound node set requires adding a not-yet-computed node such as $v_{(k+1)}$; $v_{(j)}$ is the highest-priority node in the retained set and $v_{(k)}$ is its lowest-priority node. The updated upper-bound node set contains m nodes.
4.7) When the anomaly scores of the upper bound (Figure RE-GDA00035654696000000420) and lower bound (Figure RE-GDA00035654696000000421) are close, return the approximate anomaly query result (Figure RE-GDA00035654696000000422); otherwise set i = i + 1 and iterate until the stop condition is met. The final result (Figure RE-GDA00035654696000000423) is the anomaly detection result that maximizes the objective function F over subgraphs of the attribute graph while remaining similar in structure to the query graph.
S5, aligning with the public network to obtain $U_j$ and uploading $U_j$ to the cloud.
S6, merging in the cloud the results uploaded by each private network, aligning them with the public network, and collecting all aligned anomalous subgraphs $U^*$ into the set $U_{i+1}$.
To obtain the alignment score for anomalous subgraph alignment between networks, a function Q is defined as the scoring function of anomaly alignment:
Figure RE-GDA00035654696000000424
where σ is the alignment threshold (set to 0.8 in this method), $N_\sigma(S, U)$ is the number of aligned nodes in S and U, N(S) is the number of all nodes in S, and N(U) is the number of all nodes in U. The alignment probability of a node is taken from the set of pre-alignment matrices $H = (H_{ij})$, where $H_{ij}$ is the alignment probability matrix between $G_i$ and $G_j$ and $i \ne j$ (see FIG. 2).
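The inputs to the alignment score can be gathered as sketched below. Only the identification of aligned nodes is shown, because the formula for Q is not reproduced in this text; the dictionary representation of H and the function name are assumptions.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable


def aligned_nodes(
    S: Iterable[Node],
    U: Iterable[Node],
    H: Dict[Tuple[Node, Node], float],
    sigma: float,
) -> Tuple[Set[Node], Set[Node]]:
    """Return the nodes of S and of U that take part in at least one aligned pair.

    A pair (s, u) counts as aligned when its probability in H exceeds sigma;
    pairs missing from H are treated as probability 0 (an assumption).
    """
    U_list = list(U)
    aligned_in_S: Set[Node] = set()
    aligned_in_U: Set[Node] = set()
    for s in S:
        for u in U_list:
            if H.get((s, u), 0.0) > sigma:
                aligned_in_S.add(s)
                aligned_in_U.add(u)
    return aligned_in_S, aligned_in_U
```

How these sets are turned into the count $N_\sigma(S, U)$ depends on the scoring formula, which should be taken from the patent's formula image rather than from this sketch.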
S7, treating all nodes of $G_j$ that belong to $S_j^*$ as normal nodes.
The invention does this by setting the anomaly attribute values of these nodes to 1.
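A minimal sketch of this masking step, assuming the anomaly attribute values are held in a dictionary keyed by node (the function name is illustrative):

```python
from typing import Dict, Hashable, Iterable

Node = Hashable


def mask_detected_nodes(
    p_values: Dict[Node, float], detected: Iterable[Node]
) -> Dict[Node, float]:
    """Return a copy of p_values in which every detected node is reset to 1.0 (normal)."""
    detected_set = set(detected)
    return {v: (1.0 if v in detected_set else p) for v, p in p_values.items()}
```

Because a value of 1 is above any anomaly threshold α, the masked nodes cannot be reported again in the next iteration.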
S8, when $U_i = U_{i+1}$, returning the aligned anomalous subgraph sets $S^*$ and U.
Otherwise, set i = i + 1 and repeat steps S4 to S7 until the stop condition of S8 is satisfied.
The final result $(S^*, U)$ is the set of anomalous subgraphs that maximizes the objective function and approximates the specified shape. The objective function of the overall method is defined as follows:
Figure RE-GDA0003565469600000051
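The control flow of steps S4 to S8 can be sketched as a loop that stops when the aligned result set no longer changes. This is a skeleton under stated assumptions, not the patent's algorithm: `detect_local` and `align_to_public` are hypothetical callbacks standing in for the near-optimal anomaly query and the alignment step, and `max_iterations` is a safety cap added here.

```python
from typing import Callable, Dict, FrozenSet, Hashable, List, Set, Tuple

Node = Hashable
Subgraph = FrozenSet[Node]


def federated_loop(
    private_p_values: Dict[str, Dict[Node, float]],
    detect_local: Callable[[Dict[Node, float]], Subgraph],
    align_to_public: Callable[[str, Subgraph], Subgraph],
    max_iterations: int = 100,
) -> Tuple[Dict[str, List[Subgraph]], Set[Subgraph]]:
    """Iterate detection, alignment, merging and masking until U_i == U_{i+1}."""
    # Work on copies so the caller's anomaly attribute values are left untouched.
    p_values = {name: dict(p) for name, p in private_p_values.items()}
    S_all: Dict[str, List[Subgraph]] = {name: [] for name in p_values}
    U_prev: Set[Subgraph] = set()
    for _ in range(max_iterations):
        U_next = set(U_prev)
        for name, p in p_values.items():
            S_j = detect_local(p)                 # S4: local detection
            if not S_j:
                continue
            U_j = align_to_public(name, S_j)      # S5: align with the public network
            if U_j:
                U_next.add(U_j)                   # S6: merge at the cloud
            S_all[name].append(S_j)
            for v in S_j:
                p[v] = 1.0                        # S7: treat detected nodes as normal
        if U_next == U_prev:                      # S8: stop when U is unchanged
            break
        U_prev = U_next
    return S_all, U_prev
```

In this sketch only the aligned subgraphs, not the private edge or attribute sets, are passed to the merging step, which mirrors the privacy constraint described in the text.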
The process by which each private network locally detects its optimal anomalous subgraph is as follows:
Given multiple attribute networks $G^* = \{G_i\}$, $i \in \{0, 1, \ldots, N\}$, where $G_i = (V_i, E_i, P_i)$ denotes the i-th network and $V_i$, $E_i$, $P_i$ denote the node set, edge set and anomaly attribute set of $G_i$, respectively, N is the number of attribute networks. The anomaly feature value of a node lies in [0, 1]; a smaller value means a more anomalous node, and a value larger than $\alpha_{max} = 0.15$ indicates a normal node.
The invention searches the private attribute network for the subgraph that contains the most anomalous nodes and is structurally similar to the query graph, and therefore sets the following objective function:
Figure RE-GDA0003565469600000052
That is, the subgraph returned by the near-optimal anomaly query should maximize the function value F subject to the constraint that S is isomorphic to Q. For a given attribute network G = (V, E, W), G denotes an attribute graph that contains (1) a node set V = [n] = {1, ..., n}; (2) an edge set (Figure RE-GDA0003565469600000053), where |E| = p, that is, the number of edges is p; and (3) a set of node anomaly attributes (Figure RE-GDA0003565469600000054), where the row vector (Figure RE-GDA0003565469600000055) holds the attribute values observed for vertex v ∈ V within the time span T. For a node subset (Figure RE-GDA0003565469600000056), (Figure RE-GDA0003565469600000057) retains only the row vectors of the nodes in S. If $V_S \subseteq V$, $E_S \subseteq E$, and $W_S$ is the restriction of W, the subgraph S of G is defined as (Figure RE-GDA0003565469600000058). Likewise, (Figure RE-GDA0003565469600000059) is set as the query graph. For time t and node v in the computer network, the number of access records of node v on day t is taken as the observed value (Figure RE-GDA00035654696000000510), and the average number of access logs of node v over the time period T before day t is taken as the expected value (Figure RE-GDA00035654696000000511). In addition, the data records distinguish normal access data from actual attack data, so the attack type, the attack time and the IP addresses of attacker and victim are known for attacks that actually occurred. The node anomaly feature value (empirical p-value) is calculated by comparing the observed value $c_v$ with the expected value $b_v$. To test the robustness of the algorithm, the empirical p-values of a percentage K ∈ {5, 10, 20} of the nodes in the network are flipped at random.
The general idea is to compute the upper-bound and lower-bound subgraph structures iteratively until the difference between their anomaly scores is smaller than a threshold. In the experiments, this threshold on the difference between the upper-bound and lower-bound anomaly scores is set to $\epsilon = 10^{-6}$. When the condition is met, a near-optimal solution has been found; the computation ends and the result is returned. Near-optimal anomaly detection in the private network comprises the following steps:
1) Root node selection. Given a private attribute graph G, a set of m root nodes must be selected to begin matching against the query graph, where m is the number of nodes in the query graph Q. The root node set is chosen so that (1) it contains as few normal nodes as possible, and (2) it contains as many anomalous nodes as possible, with those nodes being as anomalous as possible, which keeps the anomaly score of the whole subgraph high. With these two design goals in mind, the priority function g() first orders the nodes by ascending empirical p-value and selects the first m nodes as root nodes, for example {v3, v6, v8, v7}; in the first iteration the root node set enters the computation as the upper bound of the anomaly score function (see the specific implementation for the function definition).
Figure RE-GDA0003565469600000061
2) Constructing the upper bound of the anomaly score function. In the i-th iteration, the next step takes the result of the previous iteration (Figure RE-GDA0003565469600000062) and, by updating the set of m selected nodes, that is, by (Figure RE-GDA0003565469600000063), constructs the graph structure of the anomaly score upper bound (Figure RE-GDA0003565469600000064). The invention does not require (Figure RE-GDA0003565469600000065) to be a connected graph; the requirement is satisfied even if it contains nodes that are isolated in the attribute graph G. The structure (Figure RE-GDA0003565469600000066) has the same number of nodes as the root node set constructed in the first iteration, namely m nodes.
The nodes of (Figure RE-GDA0003565469600000067) consist of two parts. The first part is the node set S of vertices with higher anomaly values that must be retained, obtained from the intersection, on the attribute graph G, of (Figure RE-GDA0003565469600000068) and (Figure RE-GDA0003565469600000069) from the previous iteration. The second part consists of higher-priority vertices, taken in priority order, that can be added to the candidate anomaly score upper-bound node set (Figure RE-GDA00035654696000000610); in the algorithm this is written max g({v_(j), ..., v_(k), v_(k+1)}).
The node update proceeds in a compact iterative manner, written in the algorithm as max g({v_(j), ..., v_(k), v_(k+1)} − S, m − |S|). Assume that after iteration i − 1 the update has reached the k-th vertex v_(k) (v_(k) already exists in (Figure RE-GDA00035654696000000611)), and that the intersection computed from (Figure RE-GDA00035654696000000612) and (Figure RE-GDA00035654696000000613) requires |S| vertices to be retained. Within the retained node set S, let the j-th vertex v_(j) be the vertex with the highest priority and the k-th vertex v_(k) be the vertex with the lowest priority, the last one to be retained. The update selects m − |S| − 1 vertices in priority order after v_(j), and the not-yet-computed node v_(k+1) must also be selected into the updated node set (Figure RE-GDA00035654696000000614) to ensure that the next iteration does not fall into an infinite loop.
The invention applies an optimization when returning results: when the number of vertices in S equals that of the query graph Q and (Figure RE-GDA00035654696000000615) is a connected graph, S is returned directly as the result. If the number of nodes in the retained set S is less than m, (Figure RE-GDA00035654696000000616) is updated step by step in descending order.
3) Constructing the lower bound of the anomaly score function. Starting from the node set of the anomaly score upper-bound structure (Figure RE-GDA00035654696000000617), the near-optimal anomaly query algorithm constructs its lower-bound structure (Figure RE-GDA00035654696000000618). (Figure RE-GDA00035654696000000619) selects as roots the star subgraphs that match the star structures into which the query graph is decomposed. The matched star structures are then assembled into a subgraph approximating the query graph; the method uses heuristic search to construct (Figure RE-GDA0003565469600000071).
First, the decomposition of the query graph (Figure RE-GDA0003565469600000072) is introduced. This step decomposes the query graph Q into (Figure RE-GDA0003565469600000073), where m is the number of nodes of the query graph. In the decomposition, each node can serve both as the central node of one star structure and as a leaf node of several others. In the example of query graph decomposition given in FIG. 4, the query graph Q is decomposed into two star structures with 3 leaf nodes and two star structures with 2 leaf nodes; among star structures with the same number of nodes, only one needs to be kept during computation.
The Star function returns m star structures. It takes the vertices {v_(j), ..., v_(k-1), v_(k+1)} in (Figure RE-GDA0003565469600000074) as centers and their neighbors in the attribute graph G as leaf vertices, constructing m star subgraphs (Figure RE-GDA0003565469600000075). To bring the constructed result closer to the structure of the query graph, the condition (Figure RE-GDA0003565469600000076) is set: the m star structures returned by the Star function must be isomorphic to the decomposed query graph (Figure RE-GDA0003565469600000077). When the star structure centered on a vertex of {v_(j), ..., v_(k-1), v_(k+1)} has more leaf nodes than the decomposed subgraph (Figure RE-GDA0003565469600000078), only the most anomalous vertices are selected as leaves; otherwise all neighbors of the vertex v are accepted. (Figure RE-GDA0003565469600000079) must be matched against each decomposed subgraph (Figure RE-GDA00035654696000000710) and the results saved.
After the candidate subgraphs are obtained, the method combines the subgraphs (Figure RE-GDA00035654696000000711) one by one and takes the subgraph with the highest anomaly score as (Figure RE-GDA00035654696000000712). When two or more subgraphs have the same anomaly score, the one with the smallest graph edit distance to the query graph is selected as (Figure RE-GDA00035654696000000713).
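Two of the selection rules above, trimming a star to its most anomalous leaves and breaking score ties by graph edit distance, can be sketched as follows. The function names are illustrative, and `score` and `distance_to_query` are hypothetical callbacks supplied by the caller; no particular graph edit distance implementation is assumed.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple, TypeVar

C = TypeVar("C")


def trim_star(
    center: str,
    adjacency: Dict[str, Set[str]],
    p_values: Dict[str, float],
    leaf_count: int,
) -> Tuple[str, List[str]]:
    """Build the star around center, keeping at most leaf_count leaves.

    If the center has more neighbors than leaf_count, only the most anomalous
    ones (lowest p-value) are kept; otherwise all neighbors are accepted.
    """
    neighbors = sorted(adjacency.get(center, ()), key=lambda u: (p_values[u], u))
    return center, neighbors[:leaf_count]


def pick_lower_bound(
    candidates: Iterable[C],
    score: Callable[[C], float],
    distance_to_query: Callable[[C], float],
) -> C:
    """Choose the highest-scoring candidate; on ties prefer the smallest edit distance."""
    return max(candidates, key=lambda c: (score(c), -distance_to_query(c)))
```

Encoding the tie-break as a tuple key keeps the rule in one place: the anomaly score dominates, and the distance to the query graph only matters between equally scored candidates.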
After the private network performs approximate anomaly detection, the abnormal subgraph needs to be transmitted into the public attribute network and abnormal alignment is performed, and the specific method of the abnormal alignment is as follows:
and (4) public abnormal alignment, namely uploading the alignment result to the cloud end by each private network, executing abnormal alignment work again at the cloud end, and integrating to obtain an aligned abnormal sub-graph set without changing the alignment result. The aligned abnormal subgraph always contains the most aligned nodes and the least non-aligned nodes. And the node with the alignment probability larger than that is judged as the aligned node. The invention uses an alignment function to count the alignment score of the abnormal alignment subgraph (the function definition is shown in a specific embodiment).
Figure RE-GDA00035654696000000714
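As a small illustration of the node-level decision described above, the sketch below splits nodes into aligned and non-aligned sets by comparing their alignment probability with a threshold. It assumes the comparison value is the alignment threshold σ mentioned elsewhere in the description; the function name `split_by_alignment` and the dictionary input are hypothetical, and the alignment score function itself is not reproduced here because its definition is given only in the figure.

```python
from typing import Dict, Set, Tuple


def split_by_alignment(align_prob: Dict[str, float],
                       sigma: float) -> Tuple[Set[str], Set[str]]:
    """Partition nodes by alignment probability.

    align_prob -- assumed mapping: node -> probability of being aligned
    sigma      -- alignment threshold (assumption: strict '>' comparison)
    """
    aligned = {node for node, p in align_prob.items() if p > sigma}
    non_aligned = set(align_prob) - aligned
    return aligned, non_aligned
```

A subgraph that maximizes the first set while minimizing the second matches the stated preference for the most aligned and fewest non-aligned nodes.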
Because both anomaly detection and abnormal alignment must be optimized, the following objective function is set:
Figure RE-GDA00035654696000000715
The optimal aligned abnormal subgraph result (S*, U*) should maximize the value of this function.
Definition of the near-optimal federated anomaly detection algorithm. To obtain an optimal solution following the steps above, the invention further provides a federated anomaly detection algorithm spanning multiple attribute networks; the specific algorithm design is shown in the following figure. The algorithm is named Approximate Optimal Anomaly Max Query in Attributed Networks, AnomalyMaxQ for short. It runs under the framework of the FADMAN federated abnormal alignment method. The result set U_i is initialized as an empty set and the iteration number i as 0, and an abnormal threshold α and an alignment threshold σ are predefined. The edge sets and attribute sets of a public network and a plurality of private networks are input, and the aligned abnormal subgraph sets S* and U* are expanded over multiple iterations until the result set that maximizes the objective function is obtained.
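The control flow of this iteration can be sketched in Python under several assumptions. The local detector, the alignment step and the cloud-side merge are passed in as plain callables (`detect`, `align`, `merge`), which are hypothetical stand-ins and not the patented procedures; the stopping rule U_i = U_{i+1} and the step of treating already detected nodes as normal follow the steps listed in claim 2. The `max_iters` guard is an addition of this sketch.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple


def federated_loop(private_nets: Dict[str, object],
                   detect: Callable[[object, Set[str]], Set[str]],
                   align: Callable[[Set[str]], FrozenSet[str]],
                   merge: Callable[[FrozenSet[str], List[FrozenSet[str]]], FrozenSet[str]],
                   max_iters: int = 100) -> Tuple[FrozenSet[str], int]:
    """Iterate local detection and cloud-side merging until the result is stable."""
    result: FrozenSet[str] = frozenset()          # U_0 starts as the empty set
    treated_as_normal: Dict[str, Set[str]] = {name: set() for name in private_nets}

    for i in range(max_iters):
        uploads: List[FrozenSet[str]] = []
        for name, net in private_nets.items():
            # Local step: detect an abnormal subgraph, skipping nodes
            # that earlier rounds already reported.
            s_j = detect(net, treated_as_normal[name])
            # Align locally with the public network; only this result is uploaded.
            uploads.append(align(s_j))
            # Nodes found in this round count as normal in later rounds.
            treated_as_normal[name] |= set(s_j)
        next_result = merge(result, uploads)      # cloud-side integration
        if next_result == result:                 # U_i == U_{i+1}: stop
            return result, i
        result = next_result
    return result, max_iters
```

Only the aligned results leave each private network in this sketch, which mirrors the stated goal of avoiding direct data exchange; how the real method computes `detect`, `align` and `merge` is defined by the description and figures, not by this code.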
The method provided by the invention is suitable for federated anomaly detection in many scenarios, on the premise that privacy is protected and no data is exchanged directly. A few specific scenarios are briefly described here. 1. In a computer attack network, IPs and websites are the nodes, access behaviors are the edges, and access frequency is the abnormal attribute. The network is divided into several networks by time. If all of the networks have abnormal attributes, groups of abnormal IPs with similar attack behavior can be mined (as shown in FIG. 5); if the network for a certain time period has no abnormal attributes, the method can mine its anomalies by aligning the abnormal subgraphs of the other networks to it, and thereby predict the IP attack effect in that time period. 2. In an enterprise investment network, the invention can detect whether an enterprise carries the behavioral risk of false positive and money laundering, helping investors make investment decisions.
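The first scenario can be made concrete with a minimal sketch of how access records might be turned into time-sliced attribute networks. The record layout `(timestamp_seconds, source_ip, site)`, the function name and the choice to count accesses for both endpoints of an edge are assumptions made for illustration only; the description does not specify the data format.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[float, str, str]   # (timestamp in seconds, source IP, site), hypothetical


def build_time_sliced_networks(records: Iterable[Record],
                               slice_seconds: int) -> Dict[int, dict]:
    """Group access records into one small attribute network per time slice.

    Each network holds a set of (ip, site) edges and an access-frequency
    counter that plays the role of the abnormal attribute.
    """
    networks = defaultdict(lambda: {"edges": set(), "access_freq": Counter()})
    for timestamp, ip, site in records:
        net = networks[int(timestamp // slice_seconds)]
        net["edges"].add((ip, site))          # access behaviour = edge
        net["access_freq"][ip] += 1           # frequency attribute on the IP node
        net["access_freq"][site] += 1         # and on the site node (assumption)
    return dict(networks)
```

Each slice could then be treated as one private attribute network in the sense used above, with slices that lack abnormal attributes receiving aligned subgraphs from the others.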
On a computer traffic network data set, the invention demonstrates detection and analysis of the one-to-many and many-to-many attack modes. Although the IP addresses involved appear at different places and times, their attack behavior is similar. With this method, querying for a specific attack mode, for example with a star query graph or a bipartite query graph, yields groups of abnormal IPs, which helps the server proactively intercept attacks from a given IP segment. As shown in FIG. 6, the AnomalyMaxQ algorithm successfully discovers the attack network without including innocent nodes.
One-to-many attack mode. FIG. 6 shows a network attack detected by the algorithm. The red nodes represent IPs or sites that attack or are attacked in the real world, and the yellow areas represent the abnormal vertices computed by the method. The attack records are clearly found by the star query graph. The detection results show that, starting on March 10, 2015, an IP address x.x.223.66 from Jiangsu Province, China, attacked four other server sites yysj. The attack was identified as a FckEditorAttack attack. One-to-many and many-to-one patterns are also the most common forms of network attack.
Many-to-many attack mode. FIG. 6 shows a DedecsAttack-type network attack, detected by the query graph, that was launched jointly from abroad and from Jiangxi, China, on March 12, 2015. Because an attacker usually does not rely on a single IP address, this query graph, compared with the star structure, can discover groups of IPs attacking at the same time. Recording these IP addresses shows that they come from several fixed network segments and that the attack mode and location remain unchanged, which suggests that they may come from the same attack source. With this information, network attacks can be prevented by blocking the IPs of these fixed segments.
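The last observation, that detected attacker IPs cluster in a few fixed network segments, can be illustrated with Python's standard `ipaddress` module. The /24 prefix length, the `min_hosts` cut-off and the function name are assumptions of this sketch; the description does not state how segments are delimited. The addresses in the usage note are reserved documentation addresses, not data from the patent.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, List


def frequent_segments(attacker_ips: Iterable[str],
                      prefix_len: int = 24,
                      min_hosts: int = 2) -> Dict[str, List[str]]:
    """Group detected attacker IPs by network segment.

    Returns only the segments that contain at least `min_hosts`
    distinct attacker addresses, as candidates for blocking.
    """
    groups = defaultdict(set)
    for ip in attacker_ips:
        # strict=False masks the host bits, giving the enclosing segment.
        segment = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[segment].add(ip)
    return {str(segment): sorted(ips)
            for segment, ips in groups.items()
            if len(ips) >= min_hosts}
```

For example, `frequent_segments(["192.0.2.10", "192.0.2.77", "198.51.100.5"])` groups the first two addresses under `192.0.2.0/24` and drops the segment that has only one address.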
Practice shows that the method has a wide range of application and strong extensibility, and can be adapted to different scenarios to mine relevant abnormal information and potential abnormal information.
The present invention is not limited to the above-described embodiments. The foregoing description of the specific embodiments is intended to describe and illustrate the technical solutions of the present invention, and the above specific embodiments are merely illustrative and not restrictive. Those skilled in the art can make many changes and modifications to the invention without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (2)

1. A rapid anomaly detection method for multiple attribute networks, wherein an anomaly detection model is established based on data sets distributed over a plurality of local private attribute networks, the anomaly detection model comprising an attribute network composed of a plurality of private attribute networks and a public attribute network, and each private attribute network using a near-optimal anomaly query method to align a detected abnormal subgraph of a specific shape with an abnormal subgraph in the public attribute network, the method comprising the following steps:
initializing the upper-bound and lower-bound result sets as empty sets and the iteration number i as 0, and inputting the edge set and abnormal feature set of a private attribute network and the edge set of a query graph;
calculating the abnormal priority of the nodes from the edge set and abnormal feature set of the private attribute network and the edge set of the query graph;
selecting the first m nodes as an upper-bound structure;
for the upper-bound node set, decomposing the query graph Q into star structures
Figure FDA0003470790400000011
detecting the subgraphs isomorphic to the star structures to obtain each node in
Figure FDA0003470790400000012
merging the upper bound of the abnormal subgraph obtained in the previous iteration
Figure FDA0003470790400000013
and the lower bound
Figure FDA0003470790400000014
to obtain a temporary optimal structure S;
updating the upper bound result set
Figure FDA0003470790400000015
when the abnormal scores of the upper bound
Figure FDA0003470790400000016
and the lower bound
Figure FDA0003470790400000017
are close, returning the approximate abnormal query result
Figure FDA0003470790400000018
otherwise, setting the iteration number i = i + 1 and repeating until the stop condition is met.
2. An application of the rapid anomaly detection method for multiple attribute networks according to claim 1, characterized by comprising the following steps:
S1, constructing a multilayer attribute network as required and calculating the abnormal attribute values of the network nodes according to the following formula;
Figure FDA0003470790400000019
wherein: n is the number of all nodes in the network; the attribute network G* = {G_i}, i ∈ {0, 1, ..., N}; G_i = (V_i, E_i, P_i) denotes the i-th network, and V_i, E_i and P_i represent the node set, the edge set and the abnormal attribute set of G_i, respectively;
S2, inputting the edge sets and abnormal attribute sets of a public network and a plurality of private networks, and presetting the parameters of the networks to be tested, the parameters comprising an abnormal threshold α and an alignment threshold σ; initializing the result set U_i as an empty set and the iteration number i as 0;
S3, each of the plurality of private networks G_i downloading the public network locally and pre-aligning with it to obtain a set of alignment probability matrices H_ij;
S4, obtaining the result of the previous iteration and detecting the near-optimal abnormal subgraph S_j* of the private network;
S5, aligning with the public network to obtain a result set U_j, and uploading the result set U_j to the cloud;
S6, merging at the cloud the results uploaded by each private network, aligning them with the public network, and summarizing all aligned abnormal subgraphs U* as the result set U_{i+1};
S7, regarding the nodes of the optimal abnormal subgraph S_j* in each multi-layer private attribute network G_j as normal nodes;
S8, when U_i = U_{i+1}, returning the aligned abnormal subgraph set S* together with U and outputting the aligned abnormal subgraphs;
Figure FDA00034707904000000110
otherwise, setting the iteration number i = i + 1 and repeating S5 through S7 until the stop condition of S8 is satisfied.
CN202210042389.0A 2022-01-14 2022-01-14 Rapid anomaly detection method for multiple attribute networks Active CN114401136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210042389.0A CN114401136B (en) 2022-01-14 2022-01-14 Rapid anomaly detection method for multiple attribute networks

Publications (2)

Publication Number Publication Date
CN114401136A true CN114401136A (en) 2022-04-26
CN114401136B CN114401136B (en) 2023-05-05

Family

ID=81230469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210042389.0A Active CN114401136B (en) 2022-01-14 2022-01-14 Rapid anomaly detection method for multiple attribute networks

Country Status (1)

Country Link
CN (1) CN114401136B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115018280A (en) * 2022-05-24 2022-09-06 支付宝(杭州)信息技术有限公司 Risk graph pattern mining method, risk identification method and corresponding devices
CN115017371A (en) * 2022-06-01 2022-09-06 阿里巴巴(中国)有限公司 Target node determination method, storage medium, and program product
CN115277156A (en) * 2022-07-22 2022-11-01 福建师范大学 User identity privacy protection method for resisting neighbor attack in social network

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150047026A1 (en) * 2012-03-22 2015-02-12 Los Alamos National Security, Llc Anomaly detection to identify coordinated group attacks in computer networks
CN110795807A (en) * 2019-10-28 2020-02-14 天津大学 Complex network-based element abnormal structure detection model construction method
WO2020042024A1 (en) * 2018-08-29 2020-03-05 区链通网络有限公司 Node abnormality detection method and device based on graph algorithm and storage device
CN111737647A (en) * 2020-05-19 2020-10-02 北京明略软件系统有限公司 Method and device for detecting abnormal connected subgraph
CN112417303A (en) * 2020-12-09 2021-02-26 天津大学 Evolution algorithm for detecting multiple abnormal subgraphs from dynamic attribute graph
CN112422571A (en) * 2020-11-19 2021-02-26 天津大学 Method for carrying out exception alignment across multiple attribute networks
CN112507210A (en) * 2020-11-18 2021-03-16 天津大学 Interactive visualization method for event detection on attribute network
CN112528640A (en) * 2020-12-09 2021-03-19 天津大学 Automatic domain term extraction method based on abnormal subgraph detection
CN112650968A (en) * 2020-11-18 2021-04-13 天津大学 Abnormal subgraph detection method based on abnormal alignment model for multiple networks
US20210266748A1 (en) * 2018-08-29 2021-08-26 Chongqing University Of Posts And Telecommunications Improved KNN - Based 6LoWPAN Network Intrusion Detection Method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
XINYUE ZHANG, NANNAN WU, ZIXU ZHEN, WENJUN WANG: "ANOMALYMAXQ: Anomaly-Structured Maximization to Query in Attributed Networks", 《HTTPS://ARXIV.ORG/ABS/2108.07405》 *
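TANG Chenghua et al.: "Fuzzy clustering detection of anomalous intrusion behavior based on feature selection", 《Journal of Computer Research and Development》 *
LI Jieying et al.: "Research on an unsupervised anomaly detection method using an incremental robust principal component classifier", 《Computer Engineering and Applications》 *
ZHAO Qiqi et al.: "Subspace anomalous community detection method fusing node attributes and structural information", 《Computer Engineering》 *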

Also Published As

Publication number Publication date
CN114401136B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
CN114401136A (en) Rapid anomaly detection method for multiple attribute networks
CN108933793B (en) Attack graph generation method and device based on knowledge graph
CN105871882B (en) Network security risk analysis method based on network node fragility and attack information
CN115296924B (en) Network attack prediction method and device based on knowledge graph
Lin et al. Adversarial attacks on link prediction algorithms based on graph neural networks
CN112422571A (en) Method for carrying out exception alignment across multiple attribute networks
Haas et al. Efficient attack correlation and identification of attack scenarios based on network-motifs
CN111709022A (en) Hybrid alarm association method based on AP clustering and causal relationship
Hussain et al. Adversarial inter-group link injection degrades the fairness of graph neural networks
Liu et al. Multi-step attack scenarios mining based on neural network and Bayesian network attack graph
CN115114484A (en) Abnormal event detection method and device, computer equipment and storage medium
Paulo et al. Social network intelligence analysis to combat street gang violence
Pan et al. Overlapping community detection via leader-based local expansion in social networks
CN111159768B (en) Evaluation method for link privacy protection effect of social network
Shen et al. A hierarchical diffusion algorithm for community detection in social networks
Wang et al. [Retracted] Overlapping Community Detection Based on Node Importance and Adjacency Information
Soliman et al. Rank: Ai-assisted end-to-end architecture for detecting persistent attacks in enterprise networks
Krundyshev Neural network approach to assessing cybersecurity risks in large-scale dynamic networks
CN108366048A (en) A kind of network inbreak detection method based on unsupervised learning
Yang et al. A method of node importance measurement base on community structure in heterogeneous combat networks
Pan et al. A method of key links identification in command and control network based on bridging coefficient
CN114884688B (en) Federal anomaly detection method across multi-attribute networks
Prasad et al. DEFAD: ensemble classifier for DDOS enabled flood attack defense in distributed network environment
Hu et al. Research on automatic generation and analysis technology of network attack graph
Chaudhari et al. Harris Hawk Optimization-Based Distributed Denial of Service Attack Detection in IoT Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant