Disclosure of Invention
Aiming at the problems of the prior art, the invention provides a method and a system for detecting enterprise network anomalies based on a dynamic memory network. The adopted technical scheme is as follows: a method for detecting enterprise network anomalies based on a dynamic memory network comprises the following steps:
S1, preprocessing the original events, and recording and maintaining a multi-domain event database;
S2, converting the current event C and the historical events S into multi-bit digital vectors Q and F by field-level embedding and temporal-level encoding;
S3, retrieving the facts in F relevant to Q through an iterative attention process and aggregating them into an integrated memory M, the memory being initialized to the code vector of the current event, M_0 = Q;
S4, decoding the integrated memory M to obtain the distribution probability of the expected event;
S5, judging whether the current event is abnormal by comparing the current event with the predicted event.
The specific steps of S2 are as follows:
S201, presetting a recurrent continuous bag-of-words model, and calculating an embedded vector for each field of the current event C and the historical events S to obtain the corresponding field-level embedded vectors Q and F;
S202, feeding the field-level embedded vectors Q and F into a bidirectional gated recurrent unit (Bi-GRU), the encoded vectors of Q and F being represented as Q and F = [f_1, f_2, …, f_T].
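The Bi-GRU encoding of S202 can be sketched as follows (a minimal illustration with hypothetical toy dimensions and random weights, not the trained model of the invention):

```python
import numpy as np

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: update gate z, reset gate r, candidate state."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(Wz @ x + Uz @ h)
    r = sigmoid(Wr @ x + Ur @ h)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1 - z) * h + z * h_tilde

def bi_gru_encode(X, params_f, params_b):
    """Encode a sequence of field-level embeddings X (T x d) with a Bi-GRU:
    run a forward and a backward GRU and concatenate their states per step."""
    d_h = params_f[1].shape[0]
    hf, hb = np.zeros(d_h), np.zeros(d_h)
    fwd, bwd = [], []
    for x in X:                      # forward pass
        hf = gru_cell(x, hf, *params_f)
        fwd.append(hf)
    for x in X[::-1]:                # backward pass
        hb = gru_cell(x, hb, *params_b)
        bwd.append(hb)
    bwd = bwd[::-1]
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
d_in, d_h, T = 8, 4, 5               # toy sizes, not the paper's
make = lambda: tuple(rng.standard_normal(s) * 0.1
                     for s in [(d_h, d_in), (d_h, d_h)] * 3)
F = bi_gru_encode(rng.standard_normal((T, d_in)), make(), make())
print(F.shape)  # (5, 8): T steps, forward+backward states concatenated
```

Each row of F then plays the role of one encoded historical event f_t.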
The specific steps of S3 are as follows:
S301, for each iteration i, computing the correlation between the given F and Q with an attention vector A^i = [a^i_1, a^i_2, …, a^i_T];
S302, feeding the attention inputs into a two-layer feedforward neural network to capture the internal connection between the encoded vector f_t of a historical event, the previous memory M_{i-1}, and the current event Q, denoted Z_t;
S303, computing the episode E_att with an attention-modified GRU (attGRU) to obtain the final hidden state for the given attention vector, E^i_att = h^i_T;
S304, for each iteration i, updating the integrated memory M to M_i.
The S4 decodes the integrated memory M with a gated recurrent unit (GRU) combined with a fully connected layer (FCL), constructing a separate GRU_j + FCL_j network from the given historical events S to predict the j-th field of the expected event.
And during S3, the indices of the retrieved relevant facts are saved in a temporary storage mechanism, characterized by comprising the following steps:
S311, taking the historical events in F whose attention value exceeds the threshold λ as related events;
S312, when the final memory is obtained, the anomaly detection system puts the indices of the relevant facts into the temporary storage;
S313, the anomaly detection system queries the cache to obtain the related prior events;
S314, transmitting the anomaly detection early warning to a network security officer for analysis;
S315, when the current event is normal, the anomaly detection system automatically clears the cache.
An enterprise network anomaly detection system based on a dynamic memory network comprises a data preparation module, a representation layer module, a storage formation module, a temporary storage module, a prediction layer module and an anomaly detection module;
a data preparation module: preprocessing the original events, and recording and maintaining a multi-domain event database;
a representation layer module: converting the current event C and the historical events S into multi-bit digital vectors Q and F by field-level embedding and temporal-level encoding;
a storage formation module: retrieving the facts in F relevant to Q through the iterative attention process and aggregating them into the integrated memory M, the memory being initialized to the code vector of the current event, M_0 = Q;
a temporary storage module: storing the relevant facts retrieved in the storage formation module as related events;
a prediction layer module: decoding the integrated memory M to obtain the distribution probability of the expected event;
an anomaly detection module: judging whether the current event is abnormal by comparing the current event with the predicted event.
The representation layer module comprises a vector conversion module and a code conversion module;
the vector conversion module presets a recurrent continuous bag-of-words model to calculate the embedded vector of each field of the current event C and the historical events S, obtaining the corresponding field-level embedded vectors Q and F;
the transcoding module feeds the field-level embedded vectors Q and F into the bidirectional gated recurrent unit (Bi-GRU), the encoded vectors being represented as Q and F = [f_1, f_2, …, f_T].
The storage formation module comprises a correlation calculation module, an internal connection module, a state hiding module and a storage updating module;
A correlation calculation module: for each iteration i, computing the correlation between the given F and Q with an attention vector A^i = [a^i_1, a^i_2, …, a^i_T];
an internal connection module: feeding the attention inputs into a two-layer feedforward neural network to capture the internal connection between the encoded vector f_t of a historical event, the previous memory M_{i-1}, and the current event Q, denoted Z_t;
a state hiding module: computing the episode E_att with the attention-modified GRU (attGRU) to obtain the final hidden state for the given attention vector, E^i_att = h^i_T;
a storage update module: for each iteration i, updating the integrated memory M to M_i.
The invention has the following beneficial effects: the detection method implicitly encodes the workflow paths of the underlying system through the iterative attention process, so that related events can be provided as possibly important clues to potential malicious activity, which facilitates provenance tracking. At the same time, the method detects potential correlations between different events across different domains and reduces the dependency between adjacent events, which improves its detection sensitivity to network anomalies, safeguards the enterprise network, and reduces the security threats borne by it;
the data preparation module and the representation layer module of the detection system cooperate to implicitly encode the detected network signal as the current event; the storage formation module cooperates with the temporary storage module to extract the episodic feature values of the current event through the iterative attention process, forming the integrated memory M and saving it into the temporary storage module; the prediction layer module and the anomaly detection module compare the current event with the integrated memory M, thereby detecting the network signal of the current event. The detection system implicitly encodes the workflow paths of the underlying system through the iterative attention process and saves them via the temporary storage module, so that related events can be provided as possibly important clues to potential malicious activity, which facilitates provenance tracking; at the same time, the system detects potential correlations between different events across different domains, reduces the dependency between adjacent events, improves its detection sensitivity to network anomalies, safeguards the enterprise network, and reduces the security threats borne by it.
Detailed Description
The present invention is further described below in conjunction with the following figures and specific examples so that those skilled in the art may better understand the present invention and practice it, but the examples are not intended to limit the present invention.
The first embodiment is as follows:
A method for detecting enterprise network anomalies based on a dynamic memory network comprises the following steps:
S1, taking a new event as the current event C, normalizing it into a group of preset fields, taking the K most recent time windows from the database as the related context, and denoting the historical events by S;
S2, converting the current event C and the historical events S into multi-bit digital vectors Q and F by field-level embedding and temporal-level encoding; the specific steps are:
S201, presetting a recurrent continuous bag-of-words model, and calculating an embedded vector for each field of the current event C and the historical events S to obtain the corresponding field-level embedded vectors Q and F;
S202, feeding the field-level embedded vectors Q and F into the bidirectional gated recurrent unit (Bi-GRU), the encoded vectors being represented as Q and F = [f_1, f_2, …, f_T], where f_t denotes the encoded vector of the t-th historical event;
the recurrent continuous bag-of-words model can reuse hidden states, fusing a large amount of context information into the field-level embeddings. For fields with continuous values, such as timestamps, we divide the range of values into several segments in order to reduce the large number of continuous values to a smaller set of discrete intervals;
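The discretization of continuous fields described above can be illustrated with NumPy's `digitize`; the interval boundaries and timestamp values here are hypothetical examples, not values from the invention:

```python
import numpy as np

# Hypothetical example: reduce continuous timestamps (seconds of the day)
# to a small set of discrete intervals before field-level embedding.
boundaries = np.array([6*3600, 9*3600, 12*3600, 18*3600, 22*3600])
timestamps = np.array([3*3600, 8*3600, 13*3600, 23*3600])
bins = np.digitize(timestamps, boundaries)   # interval index per timestamp
print(bins.tolist())  # [0, 1, 3, 5]
```

Each event's timestamp field is then embedded via its interval index rather than its raw value.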
S3, retrieving the facts in F relevant to Q through the iterative attention process and aggregating them into the integrated memory M, the memory being initialized to the code vector of the current event, M_0 = Q; the specific steps are:
S301, for each iteration i, computing the correlation between the given F and Q with an attention vector A^i = [a^i_1, a^i_2, …, a^i_T], where a^i_t is the attention weight;
S302 double-layer feedforward neural network inputs reminding vector ztVector encoding/to capture historical eventstPrevious memory Mi-1And internal connection between the previous event Q:
wherein O is the product of elements, W
1,W
2,b
1,b
2Is a parameter to be learned;
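The attention scoring of S302 can be sketched as follows; this is a minimal illustration with toy dimensions and random weights, and the interaction-feature layout follows the standard dynamic-memory-network formulation, which is an assumption of this sketch:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_weights(F, Q, M_prev, W1, b1, W2, b2):
    """Score each historical encoding f_t against the current event Q and
    the previous memory M_prev with a two-layer feedforward network."""
    scores = []
    for f in F:
        # interaction features: element-wise products and absolute differences
        z = np.concatenate([f * Q, f * M_prev, np.abs(f - Q), np.abs(f - M_prev)])
        Z = W2 @ np.tanh(W1 @ z + b1) + b2    # scalar score Z_t
        scores.append(Z.item())
    return softmax(np.array(scores))           # attention weights a_1..a_T

rng = np.random.default_rng(1)
d, T, d_hid = 6, 4, 8                          # toy sizes, not the paper's
F = rng.standard_normal((T, d))                # encoded historical events
Q = rng.standard_normal(d)                     # encoded current event
M0 = Q.copy()                                  # memory initialized to Q
W1, b1 = rng.standard_normal((d_hid, 4 * d)) * 0.5, np.zeros(d_hid)
W2, b2 = rng.standard_normal((1, d_hid)), np.zeros(1)
a = attention_weights(F, Q, M0, W1, b1, W2, b2)
print(a.shape, round(float(a.sum()), 6))       # (4,) 1.0
```

The resulting weights sum to one over the T historical events and are consumed by the attGRU of S303.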
S303, an attention-modified GRU, denoted attGRU, computes the episode E_att; the attention weight a^i_t replaces the GRU update gate, so that

h^i_t = a^i_t h̃^i_t + (1 − a^i_t) h^i_{t-1}

where h̃^i_t is the GRU candidate state for input f_t, and the episode is defined as the final hidden state of the attGRU, E^i_att = h^i_T.
S304, for each iteration i, the integrated memory M is updated to:

M_i = GRU(E^i_att, M_{i-1})
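The attGRU episode computation of S303 and the memory update of S304 can be sketched as follows; the toy sizes, random weights, and stacked-gate weight layout are assumptions of this sketch:

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def attgru_episode(F, a, W, U):
    """attGRU: the attention weight a_t replaces the GRU update gate, so
    h_t = a_t * h_cand + (1 - a_t) * h_{t-1}; the episode is E_att = h_T.
    W stacks the reset-gate and candidate input weights, U the recurrent ones."""
    dh = U.shape[1]
    h = np.zeros(dh)
    for f_t, a_t in zip(F, a):
        r = sigmoid(W[:dh] @ f_t + U[:dh] @ h)              # reset gate
        h_cand = np.tanh(W[dh:] @ f_t + U[dh:] @ (r * h))   # candidate state
        h = a_t * h_cand + (1 - a_t) * h                    # a_t as update gate
    return h

def gru_step(x, h, W, U):
    """Plain GRU step (update gate z, reset gate r), used for the memory update."""
    dh = h.shape[0]
    z = sigmoid(W[:dh] @ x + U[:dh] @ h)
    r = sigmoid(W[dh:2*dh] @ x + U[dh:2*dh] @ h)
    h_cand = np.tanh(W[2*dh:] @ x + U[2*dh:] @ (r * h))
    return (1 - z) * h + z * h_cand

rng = np.random.default_rng(2)
d, T = 5, 4                                    # toy sizes, not the paper's
F = rng.standard_normal((T, d))                # encoded historical events
a = np.array([0.1, 0.6, 0.2, 0.1])             # toy attention weights
W_e = rng.standard_normal((2*d, d)) * 0.1
U_e = rng.standard_normal((2*d, d)) * 0.1
E = attgru_episode(F, a, W_e, U_e)             # episode E_att for iteration 1
Q = rng.standard_normal(d)                     # current event code, M_0 = Q
W_m = rng.standard_normal((3*d, d)) * 0.1
U_m = rng.standard_normal((3*d, d)) * 0.1
M1 = gru_step(E, Q, W_m, U_m)                  # memory update M_1 = GRU(E, M_0)
print(E.shape, M1.shape)
```

Repeating the attention, episode, and update steps for each iteration i yields the final integrated memory.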
During S3, the indices of the retrieved relevant facts are saved in a temporary storage mechanism; in the absence of explicit supervision, the operator sets the maximum number of iterations to r. The mechanism is implemented through the following steps:
S311, taking the historical events in F whose attention value exceeds the threshold λ as related events;
S312, when the final memory is obtained, the anomaly detection system puts the indices of the relevant facts into the temporary storage as a summary of the episodes generated by each iteration;
S313, the anomaly detection system queries the cache to obtain the related prior events;
S314, transmitting the anomaly detection early warning to a network security officer for analysis;
S315, when the current event is normal, the anomaly detection system automatically clears the cache;
temporarily storing only the indices of the related events avoids the slowdown caused by excessive stored content, thereby improving the detection efficiency and increasing the sensitivity of the detection method;
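The temporary storage mechanism of S311 to S315 can be sketched as follows; the class name, threshold value, and example events are hypothetical illustrations, not part of the invention:

```python
# Hypothetical sketch of the scratchpad that keeps only the *indices* of
# relevant historical events (attention weight above threshold lambda),
# rather than the events themselves, and is cleared when the event is normal.
class Scratchpad:
    def __init__(self, lam=0.2):
        self.lam = lam            # relevance threshold lambda (assumed value)
        self.indices = []

    def record(self, attention_weights):
        # S311/S312: keep indices of facts whose weight exceeds lambda
        self.indices = [t for t, a in enumerate(attention_weights) if a > self.lam]

    def query(self, history):
        # S313: resolve indices back to the related prior events
        return [history[t] for t in self.indices]

    def clear(self):
        # S315: clear the cache once the current event is judged normal
        self.indices = []

pad = Scratchpad(lam=0.2)
pad.record([0.05, 0.6, 0.25, 0.1])
related = pad.query(["login", "file_read", "upload", "logout"])
print(related)        # ['file_read', 'upload']
pad.clear()
print(pad.indices)    # []
```

Keeping indices instead of full events is what bounds the cache size and preserves detection speed.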
S4 decodes the integrated memory M with the gated recurrent unit GRU combined with the fully connected layer FCL, constructing a separate GRU_j + FCL_j network from the given historical events to predict the j-th field of the expected event; the specific steps are:
S401, representing the field u_j of the predicted expected event by GRU_j + FCL_j;
S402, calculating the conditional probability of the field u_j of the expected event:

y^j_t = tanh(W^(1)_j g^j_t + b^(1)_j)
P(u_j | S) = softmax(W^(2)_j y^j_t + b^(2)_j)

where g^j_t is the hidden state of GRU_j at time t, W^(1)_j, W^(2)_j, b^(1)_j and b^(2)_j are the weights and biases of FCL_j, y^j_t is the output of the first fully connected layer, and u_j is obtained through the softmax function at the end of the second fully connected layer;
S403, expressing the probability of the predicted event ê as:

P(ê | S) = Π_{j=1}^{n} P(u_j | S)

where n is the number of predefined fields of the event;
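The per-field decoding of S402 and the product over fields of S403 can be sketched as follows; toy sizes and random weights are used, and for brevity the GRU_j recursion is collapsed to a single readout of the final memory M, which is an assumption of this sketch:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def field_decoder(M, W1, b1, W2, b2):
    """FCL_j part of the decoder: two fully connected layers on top of the
    (here: single) GRU_j state, with a softmax over the field's vocabulary."""
    y = np.tanh(W1 @ M + b1)          # first fully connected layer
    return softmax(W2 @ y + b2)       # P(u_j | S)

rng = np.random.default_rng(3)
d, n_fields, vocab = 6, 3, 4          # toy sizes, not the paper's
M = rng.standard_normal(d)            # final integrated memory
probs = []
for j in range(n_fields):             # one GRU_j + FCL_j head per field
    W1, b1 = rng.standard_normal((d, d)), np.zeros(d)
    W2, b2 = rng.standard_normal((vocab, d)), np.zeros(vocab)
    probs.append(field_decoder(M, W1, b1, W2, b2))
# P(expected event) = product over the n independent field distributions
p_event = np.prod([p.max() for p in probs])
print(len(probs), 0.0 < p_event <= 1.0)
```

Each head outputs a proper distribution over its field's vocabulary, and the event probability is their product.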
the model training process minimizes the cross-entropy loss between the expected events and the observed events over the training event sequences, while avoiding overfitting with techniques such as L2 regularization, dropout and gradient noise injection;
S5, judging whether the current event is abnormal by comparing the currently observed event C with the predicted event ê; the specific steps are:
S501, setting a threshold k as the cutoff value in the prediction output;
S502, judging whether the current event C lies among the top-k predicted events;
S503, when the judgment of S502 is yes, the current event C is a normal event, and the system clears the cache;
S504, when the judgment of S502 is no, the system immediately raises an alarm;
S505, after the current event C that triggered the alarm is read and cached as a related historical event, the system clears the cache;
this completes one round of detection of a network signal;
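The top-k decision of S501 to S504 can be sketched as follows; the distributions and field vocabularies are hypothetical toy examples:

```python
import numpy as np

def is_anomalous(field_probs, observed_event, k=2):
    """S501-S504 sketch: the observed event is normal iff every observed
    field value lies among the top-k values of its predicted distribution."""
    for p, v in zip(field_probs, observed_event):
        top_k = np.argsort(p)[::-1][:k]     # indices of the k most likely values
        if v not in top_k:
            return True                     # some field is unexpected -> alarm
    return False

# toy distributions over a 4-value vocabulary for two fields
probs = [np.array([0.1, 0.5, 0.3, 0.1]), np.array([0.7, 0.1, 0.1, 0.1])]
print(is_anomalous(probs, observed_event=[1, 0], k=2))  # False: both in top-2
print(is_anomalous(probs, observed_event=[3, 0], k=2))  # True: value 3 not in top-2
```

A normal outcome clears the cache (S503), while an anomalous one raises the alarm before caching the event (S504/S505).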
the detection method of the invention implicitly encodes the workflow paths of the underlying system through the iterative attention process, so that related events can be provided as possibly important clues to potential malicious activity, which facilitates provenance tracking. At the same time, the method detects potential correlations between different events across different domains and reduces the dependency between adjacent events, which improves its detection sensitivity to network anomalies, safeguards the enterprise network, and reduces the security threats borne by it.
The second embodiment is as follows:
An enterprise network anomaly detection system based on a dynamic memory network comprises a data preparation module, a representation layer module, a storage formation module, a temporary storage module, a prediction layer module and an anomaly detection module;
a data preparation module: taking a new event as the current event C, normalizing it into a group of preset fields, taking the K most recent time windows from the database as the related context, and denoting the historical events by S;
the representation layer module comprises a vector conversion module and a code conversion module;
a vector conversion module: presetting a recurrent continuous bag-of-words model to calculate the embedded vector of each field of the current event C and the historical events S, obtaining the corresponding field-level embedded vectors Q and F;
a code conversion module: feeding the field-level embedded vectors Q and F into the bidirectional gated recurrent unit (Bi-GRU), the encoded vectors being represented as Q and F = [f_1, f_2, …, f_T], where f_t denotes the encoded vector of the t-th historical event;
the recurrent continuous bag-of-words model can reuse hidden states, fusing a large amount of context information into the field-level embeddings. For fields with continuous values, such as timestamps, we divide the range of values into segments to reduce the large number of continuous values to a smaller set of discrete intervals;
the storage formation module comprises a correlation calculation module, an internal connection module, a state hiding module and a storage updating module;
A correlation calculation module: for each iteration i, computing the correlation between the given F and Q with an attention vector A^i = [a^i_1, a^i_2, …, a^i_T], where a^i_t is the attention weight;
an internal connection module: a two-layer feedforward neural network takes interaction inputs z^i_t to capture the internal connection between the encoded vector f_t of a historical event, the previous memory M_{i-1}, and the current event Q:

z^i_t = [f_t ∘ Q; f_t ∘ M_{i-1}; |f_t − Q|; |f_t − M_{i-1}|]
Z^i_t = W_2 tanh(W_1 z^i_t + b_1) + b_2
a^i_t = exp(Z^i_t) / Σ_k exp(Z^i_k)

where ∘ denotes the element-wise product and W_1, W_2, b_1, b_2 are parameters to be learned;
a state hiding module: the attention-modified GRU (attGRU) computes the episode E_att to obtain the final hidden state for the given attention vector; the episode is defined as the final hidden state of the attGRU, E^i_att = h^i_T;
a storage update module: for each iteration i, the integrated memory M is updated to:

M_i = GRU(E^i_att, M_{i-1})
the temporary storage module comprises an event extraction module, an event unloading module, a priority check module, a temporary early warning module and a mechanism resetting module;
an event extraction module: taking the historical events in F whose attention value exceeds the threshold λ as related events;
an event unloading module: when the final memory is obtained, the anomaly detection system puts the indices of the relevant facts into the temporary storage;
a priority check module: the anomaly detection system queries the cache for the related prior events;
a temporary early warning module: transmitting the anomaly detection early warning to a network security officer for analysis;
a mechanism resetting module: when the current event is normal, the anomaly detection system automatically clears the cache;
the prediction layer module comprises a field representation module, a field prediction module and a prediction calculation module;
a field representation module: representing the field u_j of the predicted expected event by GRU_j + FCL_j;
a field prediction module: calculating the conditional probability of the field u_j of the expected event:

y^j_t = tanh(W^(1)_j g^j_t + b^(1)_j)
P(u_j | S) = softmax(W^(2)_j y^j_t + b^(2)_j)

where g^j_t is the hidden state of GRU_j at time t;
a prediction calculation module: expressing the probability of the predicted event ê as:

P(ê | S) = Π_{j=1}^{n} P(u_j | S)

the model training process minimizes the cross-entropy loss between the expected events and the observed events over the training event sequences, while avoiding overfitting with techniques such as L2 regularization, dropout and gradient noise injection;
the anomaly detection module comprises a threshold cutting module and a prediction judgment module;
a threshold cutting module: setting a threshold k as the cutoff value in the prediction output;
a prediction judgment module: judging whether the current event C lies among the top-k predicted events;
when the prediction judgment module judges that the current event C is a normal event, the system clears the cache;
when the prediction judgment module judges that the current event C is abnormal, the current event C is cached as a related historical event and the system then clears the cache;
The detection system of the invention implicitly encodes the workflow paths of the underlying system through the iterative attention process and saves them via the temporary storage module, so that related events can be provided as possibly important clues to potential malicious activity, which facilitates provenance tracking. At the same time, the system detects potential correlations between different events across different domains and reduces the dependency between adjacent events, which improves its detection sensitivity to network anomalies, safeguards the enterprise network, and reduces the security threats borne by it.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.