CN118074980A - Dynamic attack tracing system and method - Google Patents

Dynamic attack tracing system and method

Publication number
CN118074980A
CN118074980A CN202410209726.XA CN202410209726A CN118074980A CN 118074980 A CN118074980 A CN 118074980A CN 202410209726 A CN202410209726 A CN 202410209726A CN 118074980 A CN118074980 A CN 118074980A
Authority
CN
China
Prior art keywords
event
attack
weight
context
units
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410209726.XA
Other languages
Chinese (zh)
Inventor
李腾
蒋啸峰
谢亚轩
李德彪
马卓
马建峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Lianfei Intelligent Equipment Research Institute Co ltd
Xidian University
Original Assignee
Xi'an Lianfei Intelligent Equipment Research Institute Co ltd
Xidian University
Application filed by Xi'an Lianfei Intelligent Equipment Research Institute Co ltd and Xidian University
Priority to CN202410209726.XA
Publication of CN118074980A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a dynamic attack tracing system and method that mainly address the low tracing efficiency and poor accuracy of the prior art when handling complex multi-step threat attacks. The system comprises a model generation module, an attack positioning module and an attack tracking module. First, a behavior graph model is constructed from standardized event structure units, and events are compressed to optimize the model. Next, the attack events in the model are located: a parallel processing framework provides efficient attack analysis, statistical computation over an abstracted context environment propagates the relationships between events, and iterative computation completes a global context analysis that accurately distinguishes true attack actions from false ones. Finally, an attack path is generated from the attack analysis result using a dynamic marking technique, yielding a graph model annotated with the attack path. The invention supports high-precision analysis of long-chain, multi-branch attack paths under complex event relationships and achieves efficient attack tracing.

Description

Dynamic attack tracing system and method
Technical Field
The invention belongs to the technical field of network security and relates to attack investigation technology, in particular to a dynamic attack tracing system and method that can be used for context-based analysis of long-chain, multi-branch attack paths.
Background
Advanced persistent threat (APT) attacks are occurring with increasing frequency. Attackers launch destructive attacks against enterprises, governments, infrastructure, industrial sites and similar targets, posing a serious security threat. Compared with traditional attacks, APT attacks are complex and covert, and once a single host is compromised the whole intranet is at serious risk. Because of these characteristics, conventional threat alert systems struggle to investigate such attacks efficiently while keeping the false positive rate low, which seriously impairs the defender's judgment.
Recent research has found that the details of an APT attack are hidden in environmental information, and that contextual information can be used to analyze the causal relationships between events when investigating APT attacks in massive data. However, as network environments grow more complex, more factors must be considered when defending against attacks, which raises several technical challenges for APT investigation: 1) Using context presupposes knowing the associations between events, through which the attack semantics behind the events are analyzed; however, overly complex relationships cause dependency explosion between events and seriously degrade computing performance. 2) To avoid detection, attackers disguise themselves or hijack system applications to obscure their attack actions, so the technique must accurately identify true and false attack events among a large number of events. 3) The required data covers all events within the attack period, and this huge data volume places high demands on the efficiency of the tracing algorithm.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a dynamic attack tracing system and method that address the low tracing efficiency and poor accuracy of investigating complex multi-step threat attacks. The method can investigate the step-by-step strategy of a complex threat, overcome the dependency explosion problem among complex data, and effectively extract the chain of threat attack events from a massive event set.
The idea behind the invention is as follows. First, a behavior graph model is constructed from standardized event structure units; events are compressed and the model is optimized, reducing the volume of input data and hence the processing cost. Attack analysis is then performed to locate the attack events in the model: a parallel processing framework provides efficient analysis; statistical computation over an abstracted context environment propagates the relationships between events, overcoming the dependency explosion caused by complex associations; and iterative computation completes a global context analysis, covering every attack action along a path and accurately distinguishing true attack actions from false ones. Finally, attack tracing is performed: the attack path is generated from the attack analysis result using a dynamic marking technique, achieving efficient attack tracing.
The invention achieves the above purpose as follows:
A dynamic attack tracing system, comprising a model generation module, an attack positioning module and an attack tracking module;
The model generation module is used for generating a behavior graph model for attack investigation;
The attack positioning module is used for updating the event feature weights in the behavior graph model so that the computed weights approach the real attack feature values;
The attack tracking module is used for attack reconstruction: it extracts the computation results of the attack positioning module and generates the attack path using dynamic marking in place of a traditional graph search algorithm.
Furthermore, the model generation module extracts event units from the data of the standard event structure unit set and gradually constructs the behavior graph model from these event units; during construction, an event compression mechanism eliminates redundant events to optimize the model.
Furthermore, the attack positioning module performs the weight update computation by quantifying the influence of an event on its context environment and characterizing the feedback of the context environment on the event, and completes the correlation analysis among all events after multiple iterations.
A dynamic attack tracing method comprises the following steps:
(1) Collect activity log data of a computer system and extract event structure data from it; at the same time, construct an initial behavior graph model comprising an event unit set and an entity node set, both initially empty;
(2) Determine the causal direction of the event from the event type represented by the event structure data, and search the initial behavior graph model for the entity node corresponding to the entity ID recorded in the event structure data; if the node is found, lock it; if not, add the node to the entity node set of the initial behavior graph model and initialize it;
(3) Generate event context snapshots through the locked entity nodes, make the event merging judgment according to the context snapshots, and merge or update event units accordingly;
(4) Initialize the weights of all event units according to the following formula:
D = IM(type)
where IM represents the initialization matrix, D represents the event unit weight, and type represents the event type;
(5) Construct a parallel computing framework with entity nodes as cores: generate a corresponding computing core for each entity, together with a forward event unit processing set and a backward event unit processing set whose elements are ordered by the generation order of the event units; each computing core is allocated a forward and a backward output scale, a forward and a backward environment matrix, and a forward and a backward output result set; the environment weights of all event units are then computed as follows:
(5.1) Initially set the forward output scale to 0 and traverse the forward event unit processing set;
(5.2) Let the number of event types be N, the length of the weight vector be L, and the forward event unit being processed be E_A; the forward environment weight matrix M is updated using the weight D_A of E_A. The calculation formula is as follows:
where M_{i,j} denotes the element in row i and column j of M, i is an integer from 0 to N-1, and j is an integer from 0 to L-1; D_{A,j'} is the j'-th element of the weight D_A of E_A, with j' = j. If D_{A,j'} is greater than M_{i,j}, M_{i,j} is updated to D_{A,j'}; if D_{A,j'} is less than M_{i,j} but greater than both 0.6 and 0.9 × M_{i,j}, M_{i,j} is updated by multiplication; otherwise M_{i,j} is not updated;
(5.3) Suppose the forward environment weight of event unit E_B is to be output at the current scale value. If E_A satisfies the condition at the current value of the forward output scale, i.e. the event unit ID matches, execute step (5.4) to obtain the forward environment weight of E_B and output it to the forward output result set; otherwise, skip the output and go directly to step (5.5);
(5.4) Obtain the environment weight D_context according to the following formula:
where D_{context,g} is the g-th element of D_context; M_weak is a weakening matrix, one per event unit type code, used to weaken the influence of uncorrelated values on the event; T() is the type-code acquisition function, and T(E_B) returns the type code of E_B, which selects the forward weakening matrix corresponding to E_B; the element in row i and column j of the result matrix obtained with that weakening matrix is used, with g = g' = j. For each element of D_context, take the column of the result matrix whose column number equals the index of that element and record the maximum value of the column at the corresponding position of D_context. After all elements of D_context have been processed, insert D_context at the position corresponding to E_B in the forward output result set and increment the current scale by one;
(5.5) If the forward output scale has not reached its upper limit and untraversed event units remain, return to step (5.2) and continue traversing; if the forward output scale has not reached its upper limit but no untraversed event units remain, execute step (5.6); if the upper limit has been reached, execute step (5.7);
(5.6) Use the current environment matrix directly in the computation for all unprocessed scale values, obtaining a forward output result set that contains all forward environment weights;
(5.7) Traverse the backward event unit processing set using the backward output scale, following the procedure of steps (5.1)-(5.6), to obtain a backward output result set containing all backward environment weights;
(5.8) After all computing cores have finished processing, the environment weights of all event units are obtained;
(6) Update the weights of all event units from their environment weights; let the event unit whose weight is to be updated be E_B, and let D_{context,F} and D_{context,B} be its forward and backward environment weights. The weight update proceeds as follows:
(6.1) Compute the forward predicted weight D_{want,F} and the backward predicted weight D_{want,B} of E_B using the following formulas:
where M_TM is the weight conversion matrix, which converts each dimension of the environment weight into each dimension of the predicted weight; M_{TM,q,k} is the element in row q and column k of M_TM, and M_{TM,k',q'} is the element in row k' and column q' of M_TM; D_{want,F,o} is the o-th element of D_{want,F}, and D_{want,B,o'} is the o'-th element of D_{want,B}; D_{context,F,p} is the p-th element of D_{context,F}, and D_{context,B,p'} is the p'-th element of D_{context,B}; here o = q, p = k, o' = q' and p' = k', each ranging over [0, L-1]. Each element of the predicted weight is computed separately: L values are traversed for each element and the maximum of the results is taken. After all elements of the predicted weight have been processed, execute the next step;
(6.2) Update the actual weight from the predicted weight using the following formulas to obtain the new weight D_new of E_B:
Er_{B,r} = (D_{want,B,s} - D_t) WR
Er_{F,r'} = (D_{want,F,s'} - D_{t'}) WR
where D_{want,B,s} is the s-th element of D_{want,B}, and D_{want,F,s'} is the s'-th element of D_{want,F}; D_t, D_{t'} and D_x are the t-th, t'-th and x-th elements of the weight of E_B; Er_B is the backward difference weight and Er_{B,r} its r-th element; Er_F is the forward difference weight and Er_{F,r'} its r'-th element; WR is the weakening rate, used to correct the effect of the difference on the result; ST() is a normalization function whose return value is 0 or 1; D_{new,u} is the u-th element of D_new; Er_{B,v} is the v-th element of Er_B; Er_{F,w} is the w-th element of Er_F; here r = s = t, r' = s' = t' and u = v = w = x, each ranging over [0, L-1]. After every element of D_new has been updated position by position, execute the next step;
(7) Repeat steps (5)-(6) until the result is stable or the upper limit on the number of iterations is reached; once the loop termination condition is met, the weights of the event units no longer change, and the event units are divided into threat events and benign events according to their weight vectors;
(8) Input the event unit to be tracked, i.e. the threat meta event unit meta, and gradually generate the attack structure in the behavior graph model with meta at the center; let the preliminary attack path be THREATSET1 = {meta} and the set of event units to be tracked be ProcessSet = {meta};
(9) Search for the event units related to ProcessSet to construct the related event unit set RELATEDSET;
(10) If RELATEDSET is not empty, merge RELATEDSET into THREATSET1 to obtain the updated attack path THREATSET, set ProcessSet to RELATEDSET, update THREATSET1 with THREATSET, and return to step (9); otherwise, output THREATSET directly;
(11) Obtain the behavior graph model with the attack path, completing the attack tracing.
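The expansion loop of steps (8)-(10) can be sketched as below. This is a minimal illustration, not the patented implementation: the function name `trace_attack_path`, the callback `related`, and the toy `EDGES` table are hypothetical, and a plain adjacency lookup stands in for the dynamic marking the method actually uses to find related event units. Removing already-collected units from the related set is an added assumption that guarantees termination on cyclic graphs.

```python
from typing import Callable, Dict, Iterable, List, Set


def trace_attack_path(meta: str,
                      related: Callable[[Set[str]], Iterable[str]]) -> Set[str]:
    """Grow the attack path outward from the threat meta event unit."""
    threat_set = {meta}    # preliminary attack path (step 8)
    process_set = {meta}   # event units still to be tracked (step 8)
    while True:
        # Step (9): collect related event units; drop ones already on the path.
        related_set = set(related(process_set)) - threat_set
        if not related_set:            # step (10): nothing new, output the path
            return threat_set
        threat_set |= related_set      # merge into the attack path
        process_set = related_set      # track the newly found units next round


# Hypothetical toy relation between threat event units.
EDGES: Dict[str, List[str]] = {
    "e1": ["e2", "e3"], "e2": ["e4"], "e3": [], "e4": ["e1"], "e5": ["e1"],
}


def related_events(process_set: Set[str]) -> Set[str]:
    return {n for e in process_set for n in EDGES.get(e, [])}


path = trace_attack_path("e1", related_events)
```

In this toy relation, `e5` points at `e1` but is never reached from it, so it stays off the path.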
Compared with the prior art, the invention has the following advantages:
First, the invention converts discrete rules into continuous values and controls the computation of weight influence between events through a propagation algorithm. Compared with discrete rule-matching algorithms, this offers better compatibility and extensibility. For long-chain, multi-branch attack analysis, the behavior graph model construction method of the invention, combined with the weight update algorithm, adaptively adjusts the direction of path generation, overcomes the high processing cost caused by complex graph structures, and extracts long-chain attack paths efficiently.
Second, the invention designs a new weight update algorithm with a corresponding parallel processing framework, and redesigns the event computation flow, improving attack analysis efficiency in three respects: model compression, parallel processing and dynamic marking. Global computation of attack event associations through context overcomes the low efficiency and dependency explosion of traditional diffusion algorithms. Iterative computation enables suspiciousness analysis of attack events along long paths; forward and backward event flows are used to accurately distinguish true from false attack events and to provide a readily understandable path explanation for true attack events; finally, a complete attack path is output.
Drawings
FIG. 1 is a schematic diagram of the overall structure of the system of the present invention;
FIG. 2 is a flow chart of an implementation of the method of the present invention;
FIG. 3 is a flowchart illustrating the steps performed in the weight updating according to the present invention;
FIG. 4 is a graph comparing simulation results of the lateral contamination of the performance test of the present invention with those of the prior art.
Detailed Description
The invention will now be described in further detail with reference to the drawings and to specific embodiments.
Embodiment one: referring to FIG. 1, the dynamic attack tracing system provided by the invention comprises a model generation module, an attack positioning module and an attack tracking module;
The model generation module is used for generating a behavior graph model for attack investigation. It extracts event units from the data of the standard event structure unit set and gradually constructs the behavior graph model from these event units; during construction, an event compression mechanism eliminates redundant events to optimize the model.
The attack positioning module is used for updating the event feature weights in the behavior graph model so that the computed weights approach the real attack feature values. The weight update computation quantifies the influence of an event on its context environment and characterizes the feedback of the context environment on the event; by running multiple iterations, it completes a simulated computation of global influence diffusion, which enables global context correlation analysis of long-chain attacks.
The attack tracking module is used for attack reconstruction: it extracts the computation results of the attack positioning module and generates the attack path using dynamic marking in place of a traditional graph search algorithm, which improves the efficiency of attack path generation.
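The behavior graph model shared by the three modules might be represented roughly as follows. This is a sketch under assumptions: the class and field names are hypothetical, chosen to mirror the fields the method lists for entity nodes (entity ID, incoming and outgoing event unit sets, attributes, environment matrix) and event units (event ID, type, preceding and following entity nodes, event structure data set, weight). The example entity IDs are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventUnit:
    event_id: str
    event_type: int
    src_entity: str    # preceding entity node ID
    dst_entity: str    # following entity node ID
    records: List[dict] = field(default_factory=list)    # event structure data set
    weight: List[float] = field(default_factory=list)    # event unit weight vector


@dataclass
class EntityNode:
    entity_id: str
    in_events: List[str] = field(default_factory=list)    # incoming event unit IDs
    out_events: List[str] = field(default_factory=list)   # outgoing event unit IDs
    attributes: Dict[str, str] = field(default_factory=dict)
    env_matrix: List[List[float]] = field(default_factory=list)


@dataclass
class BehaviorGraph:
    events: Dict[str, EventUnit] = field(default_factory=dict)
    entities: Dict[str, EntityNode] = field(default_factory=dict)

    def entity(self, entity_id: str) -> EntityNode:
        # Find the entity node, or add and initialise it when it is missing.
        return self.entities.setdefault(entity_id, EntityNode(entity_id))

    def add_event(self, ev: EventUnit) -> None:
        # Register the event unit and link it to both of its entity nodes.
        self.events[ev.event_id] = ev
        self.entity(ev.src_entity).out_events.append(ev.event_id)
        self.entity(ev.dst_entity).in_events.append(ev.event_id)


graph = BehaviorGraph()
graph.add_event(EventUnit("e1", 0, "proc:bash", "file:/tmp/x"))
```

Keeping event unit IDs on each entity node, rather than only a global edge list, matches the entity-centered computing cores the method builds later.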
Embodiment two: referring to fig. 2, the dynamic attack tracing method provided by the invention specifically includes the following steps:
Step 1, collect activity log data of a computer system and extract event structure data from it; at the same time, construct an initial behavior graph model comprising an event unit set and an entity node set, both initially empty;
Step 2, determine the causal direction of the event from the event type represented by the event structure data, and search the initial behavior graph model for the entity node corresponding to the entity ID recorded in the event structure data; if the node is found, lock it; if not, add the node to the entity node set of the initial behavior graph model and initialize it. An entity node comprises at least an entity ID, sets of incoming and outgoing event units, attribute information, and an environment matrix;
Step 3, generate event context snapshots through the locked entity nodes, make the event merging judgment according to the context snapshots, and merge or update event units accordingly. In this embodiment: a new event unit is generated from the event structure data; the event units between the same entities as the new event unit form a same-entity event unit set; starting from the tail of this set, as many event units as there are event types are traversed, and whether the context has changed is determined by checking whether the context snapshots are consistent. If an event unit of the same type exists, the event units are merged and the merged event unit's information is updated; if not, the new event unit is added to the event unit set of the behavior graph model and its information other than the event structure data is initialized. In this embodiment, the event unit information comprises at least an event ID, a type, the preceding and following entity nodes, an event structure data set, and the event unit weight.
Step 4, initialize the weights of all event units according to the following formula:
D = IM(type)
where IM represents the initialization matrix, D represents the event unit weight, and type represents the event type;
Step 5, construct a parallel computing framework with entity nodes as cores: generate a corresponding computing core for each entity, together with a forward event unit processing set and a backward event unit processing set whose elements are ordered by the generation order of the event units; each computing core is allocated a forward and a backward output scale, a forward and a backward environment matrix, and a forward and a backward output result set. The environment weights of forward and backward event units are computed separately; data is processed in the forward and backward directions by the same procedure, yielding a forward output result set and a backward output result set. When each event unit is processed, it is first checked against the current value of the output scale for that direction; if it meets the criterion, data is written to the entry of that direction's output result set recorded at that scale value; otherwise processing continues with the next event unit, until the environment weights of all event units have been computed. The implementation steps are as follows:
(5.1) Initially set the forward output scale to 0 and traverse the forward event unit processing set;
(5.2) Let the number of event types be N, the length of the weight vector be L, and the forward event unit being processed be E_A; the forward environment weight matrix M is updated using the weight D_A of E_A. The calculation formula is as follows:
where M_{i,j} denotes the element in row i and column j of M, i is an integer from 0 to N-1, and j is an integer from 0 to L-1; D_{A,j'} is the j'-th element of the weight D_A of E_A, with j' = j. If D_{A,j'} is greater than M_{i,j}, M_{i,j} is updated to D_{A,j'}; if D_{A,j'} is less than M_{i,j} but greater than both 0.6 and 0.9 × M_{i,j}, M_{i,j} is updated by multiplication; otherwise M_{i,j} is not updated;
(5.3) Suppose the forward environment weight of event unit E_B is to be output at the current scale value. If E_A satisfies the condition at the current value of the forward output scale, i.e. the event unit ID matches, execute step (5.4) to obtain the forward environment weight of E_B and output it to the forward output result set; otherwise, skip the output and go directly to step (5.5);
(5.4) Obtain the environment weight D_context according to the following formula:
where D_{context,g} is the g-th element of D_context; M_weak is a weakening matrix, one per event unit type code, used to weaken the influence of uncorrelated values on the event; T() is the type-code acquisition function, and T(E_B) returns the type code of E_B, which selects the forward weakening matrix corresponding to E_B; the element in row i and column j of the result matrix obtained with that weakening matrix is used, with g = g' = j. For each element of D_context, take the column of the result matrix whose column number equals the index of that element and record the maximum value of the column at the corresponding position of D_context. After all elements of D_context have been processed, insert D_context at the position corresponding to E_B in the forward output result set and increment the current scale by one;
(5.5) If the forward output scale has not reached its upper limit and untraversed event units remain, return to step (5.2) and continue traversing; if the forward output scale has not reached its upper limit but no untraversed event units remain, execute step (5.6); if the upper limit has been reached, execute step (5.7);
(5.6) Use the current environment matrix directly in the computation for all unprocessed scale values, obtaining a forward output result set that contains all forward environment weights;
(5.7) Traverse the backward event unit processing set using the backward output scale, following the procedure of steps (5.1)-(5.6), to obtain a backward output result set containing all backward environment weights;
(5.8) After all computing cores have finished processing, the environment weights of all event units are obtained;
Step 6, update the weights of all event units from their environment weights; let the event unit whose weight is to be updated be E_B, and let D_{context,F} and D_{context,B} be its forward and backward environment weights. The weight update proceeds as follows:
(6.1) Compute the forward predicted weight D_{want,F} and the backward predicted weight D_{want,B} of E_B using the following formulas:
where M_TM is the weight conversion matrix, which converts each dimension of the environment weight into each dimension of the predicted weight; M_{TM,q,k} is the element in row q and column k of M_TM, and M_{TM,k',q'} is the element in row k' and column q' of M_TM; D_{want,F,o} is the o-th element of D_{want,F}, and D_{want,B,o'} is the o'-th element of D_{want,B}; D_{context,F,p} is the p-th element of D_{context,F}, and D_{context,B,p'} is the p'-th element of D_{context,B}; here o = q, p = k, o' = q' and p' = k', each ranging over [0, L-1]. Each element of the predicted weight is computed separately: L values are traversed for each element and the maximum of the results is taken. After all elements of the predicted weight have been processed, execute the next step;
(6.2) Update the actual weight from the predicted weight using the following formulas to obtain the new weight D_new of E_B:
Er_{B,r} = (D_{want,B,s} - D_t) WR
Er_{F,r'} = (D_{want,F,s'} - D_{t'}) WR
where D_{want,B,s} is the s-th element of D_{want,B}, and D_{want,F,s'} is the s'-th element of D_{want,F}; D_t, D_{t'} and D_x are the t-th, t'-th and x-th elements of the weight of E_B; Er_B is the backward difference weight and Er_{B,r} its r-th element; Er_F is the forward difference weight and Er_{F,r'} its r'-th element; WR is the weakening rate, used to correct the effect of the difference on the result; ST() is a normalization function whose return value is 0 or 1; D_{new,u} is the u-th element of D_new; Er_{B,v} is the v-th element of Er_B; Er_{F,w} is the w-th element of Er_F; here r = s = t, r' = s' = t' and u = v = w = x, each ranging over [0, L-1]. After every element of D_new has been updated position by position, execute the next step;
Step 7, repeat steps 5-6 until the result is stable or the upper limit on the number of iterations is reached; once the loop termination condition is met, the weights of the event units no longer change, and the event units are divided into threat events and benign events according to their weight vectors. The result is considered stable when the number of threat events stays unchanged, within a preset range, over multiple rounds of computation; a threat event is an event unit whose weight vector contains at least one element greater than 0.7.
Step 8, input the event unit to be tracked, i.e. the threat meta event unit meta, and gradually generate the attack structure in the behavior graph model with meta at the center; let the preliminary attack path be THREATSET1 = {meta} and the set of event units to be tracked be ProcessSet = {meta};
Step 9, search for the event units related to ProcessSet to construct the related event unit set RELATEDSET; the specific implementation steps are as follows:
(9.1) Let E_A be an event unit in ProcessSet. Find the entity nodes related to E_A and mark them: if an entity has not been marked in the current computation round, mark it directly; otherwise update its existing mark. A mark represents the range of selectable elements in the event unit set associated with the entity, and the mark is expanded through E_A;
(9.2) Traverse the event units in ProcessSet in order, performing step (9.1) on each;
(9.3) Extract all newly created or updated entity marks and compare each with its previous mark; if the mark has expanded and a new event unit within the expanded range is a threat event in the event unit set of the corresponding entity, add that event unit to RELATEDSET; in the first round there is no previous mark to compare against, so the event units are added directly.
Step 10, if RELATEDSET is not empty, merge RELATEDSET into THREATSET1 to obtain the updated attack path THREATSET, set ProcessSet to RELATEDSET, update THREATSET1 with THREATSET, and return to step 9; otherwise, output THREATSET directly;
Step 11, obtain the behavior graph model with the attack path, completing the attack tracing.
At this point, all steps of the method are complete, and the output is a behavior graph model with the attack path THREATSET.
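The environment matrix update of step (5.2) and the column-maximum extraction of step (5.4) can be illustrated as follows. The formulas themselves are not reproduced in the text, so this is only a sketch under explicit assumptions: the multiplier `boost` is a hypothetical parameter (the text says only that the element is multiplied), the choice of which row of M to update is left to the caller, and the weakening matrix is applied as an element-wise product, which is a guess. All function names are invented for the example.

```python
from typing import List

Matrix = List[List[float]]


def update_env_row(row: List[float], d_a: List[float], boost: float) -> None:
    """Update one row of the environment matrix in place from weight D_A."""
    for j, d in enumerate(d_a):
        if d > row[j]:
            row[j] = d                      # larger weight replaces the entry
        elif d < row[j] and d > 0.6 and d > 0.9 * row[j]:
            row[j] *= boost                 # ASSUMPTION: multiplier is not given
        # otherwise the entry is left unchanged


def context_weight(env: Matrix, weaken: Matrix) -> List[float]:
    """Apply the weakening matrix, then take the maximum of every column."""
    # ASSUMPTION: "processing" with the weakening matrix is element-wise.
    weakened = [[m * w for m, w in zip(m_row, w_row)]
                for m_row, w_row in zip(env, weaken)]
    return [max(r[g] for r in weakened) for g in range(len(env[0]))]


row = [0.5, 0.8, 0.8, 0.2]
update_env_row(row, [0.7, 0.75, 0.5, 0.1], boost=1.05)

d_context = context_weight([[0.7, 0.2], [0.4, 0.9]],
                           [[1.0, 0.5], [0.5, 1.0]])
```

In the example, only the first two entries of `row` change: the first is replaced by the larger weight and the second meets both the 0.6 and the 0.9 × M conditions.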
In the scheme of the invention, the model generation module builds the behavior graph model from the raw system behavior data and the event structure data; the attack positioning module analyzes the suspiciousness of every event unit in the model; and the attack tracking module finally connects all threatening event units to generate the attack path.
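The termination and classification rule of Step 7 can be sketched as below. The 0.7 threshold and the "threat count unchanged over multiple rounds" criterion come from the text; everything else is an assumption for illustration: the names `classify`, `is_stable` and `run_until_stable`, the default of three rounds, the zero tolerance, and the callback `update_round`, which is a stand-in for one pass of steps 5-6.

```python
from typing import Callable, Dict, List, Tuple

Weights = Dict[str, List[float]]
THREAT_THRESHOLD = 0.7   # from the text: any element above 0.7 marks a threat


def classify(weights: Weights) -> Tuple[List[str], List[str]]:
    """Split event units into (threat, benign) by their weight vectors."""
    threats = sorted(e for e, w in weights.items()
                     if any(x > THREAT_THRESHOLD for x in w))
    benign = sorted(e for e in weights if e not in threats)
    return threats, benign


def is_stable(counts: List[int], rounds: int = 3, tolerance: int = 0) -> bool:
    """Stable when the threat count varies by at most `tolerance` recently."""
    if len(counts) < rounds:
        return False
    recent = counts[-rounds:]
    return max(recent) - min(recent) <= tolerance


def run_until_stable(weights: Weights,
                     update_round: Callable[[Weights], Weights],
                     max_iters: int = 20) -> Tuple[List[str], List[str]]:
    counts: List[int] = []
    for _ in range(max_iters):              # upper limit on iterations
        weights = update_round(weights)     # stand-in for steps 5-6
        counts.append(len(classify(weights)[0]))
        if is_stable(counts):
            break
    return classify(weights)


def toy_update(weights: Weights) -> Weights:
    # Hypothetical update: scale every element by 1.1, capped at 1.0.
    return {e: [min(1.0, x * 1.1) for x in w] for e, w in weights.items()}


threats, benign = run_until_stable({"a": [0.1, 0.65], "b": [0.2, 0.3]}, toy_update)
```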
Embodiment III: referring to fig. 1-3, the overall structure of the attack tracing system provided by the invention is the same as that of the first embodiment, and the tracing method realizes the same steps as that of the second embodiment; specific examples are now given to further describe the functions implemented by each module in the system:
1) Model generation module:
The collected system activity records are converted into event structure data and delivered to the model generation module as input. The processing includes judging the generation direction of each event unit, searching for and locking the corresponding entity nodes, and generating the context snapshot of the event unit.
Before a new event unit is added to the graph model, event merging judgment is carried out: the set of other event units that share the same entity nodes as the new event unit is collected, and N events (N being the number of event types) are searched backwards from the tail of the set, comparing the context snapshots each time to analyze context fluctuation. If an event unit of the same type with an unchanged context is found, the two are merged; otherwise, the new event unit is added and initialized.
Weight initialization is then performed on the events in the graph model according to the formula D = IM(type), where IM is the initialization matrix, D is the event weight, and type is the event type.
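The merging judgment can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `EventUnit` class, its fields, and the `add_or_merge` function are hypothetical names, the context snapshot is modelled as a plain tuple, and "unchanged context" is taken to mean that two snapshots compare equal.

```python
from dataclasses import dataclass, field


@dataclass
class EventUnit:
    event_type: int                 # type code of the event unit
    snapshot: tuple                 # hypothetical context snapshot
    records: list = field(default_factory=list)  # merged event structure data


def add_or_merge(events, new_event, num_types):
    """Merge new_event into a recent same-type unit, or append it.

    events:    event units between the same entities, in generation order
    num_types: N, the number of event types (assumed >= 1); only the last
               N units of the set are inspected
    Returns True when a merge happened, False when the unit was appended.
    """
    for old in reversed(events[-num_types:]):
        same_type = old.event_type == new_event.event_type
        same_context = old.snapshot == new_event.snapshot
        if same_type and same_context:
            old.records.extend(new_event.records)  # update the merged unit
            return True
    events.append(new_event)  # no match: add as a new event unit
    return False
```

With N = 3, a second read event with an identical snapshot is folded into the first one, while the same event type under a changed snapshot is kept as a separate unit.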
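As a small illustration of D = IM(type), the initialization can be read as a row lookup in the initialization matrix. The matrix values, the three event types, and the function name `init_weight` are illustrative assumptions; the text does not give the contents of IM.

```python
# Hypothetical initialization matrix IM: one row per event type code,
# one column per dimension of the weight vector. The values are made up.
IM = [
    [0.1, 0.0, 0.0],  # type 0
    [0.0, 0.3, 0.0],  # type 1
    [0.0, 0.0, 0.8],  # type 2
]


def init_weight(event_type):
    """Return a fresh weight vector D = IM(type) for a new event unit."""
    return list(IM[event_type])  # copy, so later updates do not alter IM
```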
2) Attack positioning module:
After all event units have been built into the behavior graph model and the weights have been initialized, each event has its own weight vector, and the weight update operation begins. Fig. 3 shows the overall flow of the weight update, which is the core work of the attack positioning module: the context information of a target event is collected to generate an environment matrix, the matrix is used to generate the predicted weight of the event, and the old weight is updated with the predicted weight. Some of the steps in fig. 3 are processed in parallel in multiple execution units, and the results are output in a unified manner. The attack positioning module specifically executes the following steps:
(a) A parallel computing framework is constructed with the entity node as the core. Before updating, two output scales are generated in the framework and two entity environment matrices are maintained. Each entity is then delivered to a computing core for processing, and the processing results are output in a unified manner.
(b) The environment weight statistics of the events are computed in each computing core, as follows:
(b1) The entity maintains an entity environment matrix, and the computation starts from the set of events associated with the entity.
(b2) Assuming that the event to be output is $E_A$, the environment information statistics of $E_A$ are completed first: the weights of the associated events are collected and recorded into the environment weight matrix.
Here D is the weight of an associated event $E_B$ (an event associated with $E_A$), M is the quantified context matrix, and i is the type code of $E_B$. If the weight of $E_B$ is greater than the corresponding element of M, that element is updated directly to the weight of $E_B$; if the weight of $E_B$ is greater than 0.6 and close to the corresponding element of M, that element is multiplied; otherwise, the value of M is not updated. The rule is applied to each single element $M_{i,j}$ of M, where the subscripts i and j are the row number and column number of the element.
(b3) If the condition at the corresponding position of the output scale is satisfied after the calculation for $E_B$ has been performed, the output processing for that position is carried out: M is first processed with the weakening matrix $M_{weak}$, which weakens the influence of uncorrelated values on the event;
$D_{context}$ is the context weight after processing, and its value is inserted into the corresponding position of the output scale; N is the number of event types, and j is the subscript index of $D_{context}$.
(b4) If the output scale still has positions to be output, return to step (b1); otherwise, compute the remaining outputs directly and return all calculation results.
(c) After the environment statistics of all events have been obtained, the weight update begins:
(c1) The predicted weight $D_{want}$ that the environment matrix assigns to the event is computed from the environment weight, where j is the subscript index of $D_{want}$, L is the length of the weight vector, and $M_{TM}$ is the weight conversion matrix that converts each dimension of the environment weight into each dimension of the predicted weight. The forward and backward predicted weights of each event, $D_{want,F}$ and $D_{want,B}$ respectively, are obtained by this calculation.
(c2) The predicted weight is then applied to the actual weight through the following formulas:
$$Er_{B,i} = (D_{want,B,i} - D_{old,i})\,WR$$
$$Er_{F,i} = (D_{want,F,i} - D_{old,i})\,WR$$
$$D_{new,i} = \mathrm{MIN}(Er_{B,i},\, Er_{F,i}) \cdot \mathrm{ST}(D_{old,i}) + D_{old,i}$$
where Er is the difference between $D_{want}$ and $D_{old}$; $Er_F$ and $Er_B$ are the forward and backward differences, respectively; WR is the weakening ratio, which corrects the influence of the differences on the result; $D_{new}$ is the updated weight; $D_{old}$ is the old weight; i is the element index; and ST(·) is a normalization function whose return value is 0 or 1. $D_{old}$ is updated element by element to obtain $D_{new}$.
(d) Steps (b) and (c) are executed in a loop over all events until the result is stable or the upper limit of the number of iterations is reached.
Step (c) does not require a dedicated parallel computing framework: all events can be divided evenly among the computing cores, or computed serially and output directly.
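Steps (b) to (d) can be sketched in a few functions. This is an interpretation under explicit assumptions rather than the method as claimed. The formulas for the environment matrix, the context weight, and the predicted weight are not reproduced in this text, so the sketch assumes an element-wise product with $M_{weak}$, a max-product with $M_{TM}$, a multiplier `BOOST` of 1.1 capped at 1.0 for the "multiplied" case, and an ST(·) that returns 1 for a positive element. The 0.9 closeness factor is taken from step (5.2) of claim 4 and the 0.7 threat threshold from claim 9. All function names are hypothetical.

```python
BOOST = 1.1  # assumed factor; the text only says the element "is multiplied"


def update_environment(M, type_code, D):
    """Step (b2): fold the weight D of an associated event into row type_code of M."""
    row = M[type_code]
    for j, d in enumerate(D):
        if d > row[j]:
            row[j] = d  # a larger weight overwrites the element
        elif d > 0.6 and d > 0.9 * row[j]:
            row[j] = min(1.0, row[j] * BOOST)  # high, close weight reinforces (cap assumed)


def context_weight(M, M_weak):
    """Step (b3): weaken M element-wise (assumed), then take each column's maximum."""
    return [max(M[i][g] * M_weak[i][g] for i in range(len(M)))
            for g in range(len(M[0]))]


def predict(M_TM, D_context):
    """Step (c1): per output dimension, the largest converted context value (assumed)."""
    L = len(D_context)
    return [max(M_TM[q][k] * D_context[k] for k in range(L)) for q in range(L)]


def st(x):
    """Assumed stand-in for ST(.): 1 for a positive element, otherwise 0."""
    return 1 if x > 0 else 0


def update_weight(D_old, D_want_F, D_want_B, WR):
    """Step (c2): D_new,i = MIN(Er_B,i, Er_F,i) * ST(D_old,i) + D_old,i."""
    D_new = []
    for old, want_f, want_b in zip(D_old, D_want_F, D_want_B):
        er_f = (want_f - old) * WR
        er_b = (want_b - old) * WR
        D_new.append(min(er_b, er_f) * st(old) + old)
    return D_new


def count_threats(weights, threshold=0.7):
    """Count events with at least one weight element above the threshold."""
    return sum(1 for w in weights if max(w) > threshold)


def run_until_stable(weights, step, max_iters=5):
    """Step (d): repeat `step` until the threat count stops changing or max_iters is hit."""
    previous = count_threats(weights)
    for n in range(1, max_iters + 1):
        weights = step(weights)
        current = count_threats(weights)
        if current == previous:
            return weights, n
        previous = current
    return weights, max_iters
```

Here `step` is a caller-supplied function that performs one pass of steps (b) and (c) over all event weights. Taking the minimum of the forward and backward differences means an event's weight rises only as far as both directions of its context support.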
3) Attack tracking module:
The attack tracking module extracts the results obtained by the attack positioning module and realizes attack reconstruction. Unlike traditional search algorithms, this module generates the path based on dynamic marks.
The first step is to extract the starting point of the attack path, which can be a high-risk attack event, an attack entry, or another suspicious event. This event unit serves as the threat element event meta, and the attack path structure is generated step by step with it as the center. The preliminary attack path is THREATSET = {meta}, and the set of event units to be tracked is ProcessSet = {meta}.
The second step is to search for the event unit set RELATEDSET associated with ProcessSet, as follows: assuming $E_A$ is an event unit in ProcessSet, the event unit set associated with each entity related to $E_A$ (the snapshot set produced by the model generation module) is searched and the entity is marked; if the entity has not been marked in the current calculation round, it is marked directly; otherwise, the existing entity mark is updated. The event units in ProcessSet are traversed in sequence, the values of all marked entities are extracted and compared with the marks of the previous round (no comparison is made in the first round), and if a mark has expanded and a new event unit is a high-risk event, the new event unit is added to RELATEDSET.
Finally, whether RELATEDSET is empty is judged: if so, THREATSET is output directly; otherwise, RELATEDSET is merged into THREATSET, ProcessSet is set to RELATEDSET, and the procedure returns to the second step to search again for the event unit set RELATEDSET related to the updated ProcessSet, continuing the marking and adding until RELATEDSET is empty. The output is a graph model with the attack path THREATSET.
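The three steps of the tracking module can be sketched as a loop over ProcessSet. The sketch rests on assumptions that the text does not confirm: each entity keeps its event units in a time-ordered list, a mark is the lowest list index it covers (so the marked range runs from that index to the end of the list, which traces forward only), and a mark "expands" when that index decreases. The function and parameter names are hypothetical.

```python
def trace(meta, entity_events, event_entities, threat_events):
    """Grow the attack path from the threat element event `meta`.

    entity_events:  entity -> time-ordered list of event unit IDs
    event_entities: event unit ID -> entities related to that event
    threat_events:  set of event unit IDs classified as threats
    """
    threat_set = {meta}
    process_set = {meta}
    marks = {}  # entity -> lowest index covered by its mark

    while process_set:
        expanded = {}  # entity -> (bound before this round, new bound)
        for event in process_set:
            for entity in event_entities[event]:
                position = entity_events[entity].index(event)
                old = marks.get(entity, len(entity_events[entity]))
                if position < old:  # the mark expands through this event
                    start = expanded.get(entity, (old, old))[0]
                    expanded[entity] = (start, position)
                    marks[entity] = position

        related_set = set()
        for entity, (old, new) in expanded.items():
            # Only the newly covered part of the range is inspected.
            for event in entity_events[entity][new:old]:
                if event in threat_events and event not in threat_set:
                    related_set.add(event)

        threat_set |= related_set
        process_set = related_set  # empty set ends the loop
    return threat_set
```

Because only the expanded part of each mark is inspected in a round, event units that were already covered are not revisited, which is where this approach differs from a repeated graph search.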
The effects of the present invention are further described below in conjunction with simulation experiments:
1. Experimental conditions and content:
The test data are derived from an experimental attack case using Msfvenom (an attack tool) and the Linux audit tool auditd. The experimental environment is set up as follows: the attack takes place between two virtual machines, where (1) the attacker A runs Kali and (2) the victim B runs Ubuntu. A uploads a malicious file to host B and runs it; after obtaining a back door and downloading important files from host B to the local machine, A performs a persistence operation. The data source is the auditd daemon running on B, which continuously generates system call execution logs that match its built-in rules.
Test machine configuration: Windows 10 operating system, Intel Core i5-7300HQ CPU, NVIDIA GeForce GTX 1050, and 8 GB of memory. Because performance differs between machines, the time costs are used only for relative comparison, to show the degree of improvement achieved by the invention.
In the first experiment, an optimization strategy test is performed, in which some of the execution strategies in the scheme are changed to show their influence on the algorithm. Each group is run ten times and the results are averaged, with the number of iterations set to 5. The version with the optimization strategies enabled is used as the baseline, and the parallel strategy and the compression strategy are disabled in turn for testing. The results are shown in Table 1:
Table 1: Optimization strategy test results
In the second experiment, a performance comparison test is performed. The collected data set is polluted to further demonstrate the performance of the invention, where the percentage denotes the probability that each event generates a pollution event conforming to the continuity rule. The context-based long-chain multi-branch attack path analysis of the invention and a traditional rule-based diffusion analysis method are each applied to the experimental data set; both methods use the behavior graph model extracted by the model generation module of the invention, so that the test focuses on attack analysis performance. Each group is run three times and the results are averaged. The tracing results of the two methods on the attack case are shown in Table 2.
Table 2: Performance comparison test results
2. Analysis of experimental results:
Referring to Table 1, which shows the test results of the three versions of the invention in the optimization strategy test: in terms of performance, the strategies only optimize the calculation flow and the data and do not affect the precision of the scheme; the results fluctuate slightly because the calculation bases differ. In terms of cost, when the compression strategy is enabled without the parallel computing framework, the number of events does not change, but the processing time is slightly higher than with parallel processing because of the difference in calculation flow. When the compression strategy is removed, the number of events changes and the processing time increases greatly. The results also show that the processing cost is positively correlated with the number of events, growing at the O(N) level. This demonstrates that the optimization strategies of the invention have a clear optimizing effect on the calculation flow.
Referring to Table 2, the data set is polluted by appending, with a given probability, follow-up events to the events in each event chain, so as to simulate the camouflage actions of an attacker. According to the results in Table 2, the invention is not disturbed by the camouflage actions as the probability increases, and the accuracy of its results remains above 0.97. In contrast, the performance of the traditional diffusion algorithm drops sharply as the data pollution intensifies, from 0.76 to 0.23. Besides the longitudinal depth of the continued event chains, the influence of lateral branching is also important. In this experiment, multiple pollution events are generated laterally with the longitudinal generation probability fixed at 0.5%; as a result, the performance of the traditional method decreases, while the performance of the invention is not affected.
The simulation analysis proves the correctness and effectiveness of the method provided by the invention.
Matters not described in detail in the invention are within the common knowledge of a person skilled in the art.
The foregoing describes a preferred embodiment of the invention and is not intended to limit it. It will be apparent to those skilled in the art that various modifications and changes in form and detail may be made without departing from the principles and construction of the invention, and such modifications and changes based on the idea of the invention remain within the scope of the appended claims.

Claims (10)

1. A dynamic attack tracing system, comprising: the system comprises a model generation module, an attack positioning module and an attack tracking module;
The model generation module is used for generating a behavior graph model for attack investigation;
the attack positioning module is used for updating the event feature weight in the behavior graph model and converging the weight calculation to a real attack feature value;
the attack tracking module is used for carrying out attack reconstruction work, extracting convergence results in the attack positioning module, and generating an attack path by replacing a traditional graph searching algorithm with dynamic marks.
2. The system according to claim 1, wherein: the model generation module extracts event units according to data of the standard event structure unit set, gradually constructs a behavior graph model by using the event units, eliminates redundant events by using an event compression mechanism in the construction process, and realizes model optimization.
3. The system according to claim 1, wherein: the attack positioning module realizes weight update calculation by quantifying the influence of events on the context environment and characterizing the feedback of the context environment on events, and achieves calculation convergence after multiple iterations, thereby completing the relevance analysis among all events.
4. A dynamic attack tracing method, characterized by comprising the following steps:
(1) Collecting activity log data of a computer system, and extracting event structure data from the activity log data; simultaneously constructing an initial behavior graph model comprising an event unit set and an entity node set, wherein the event unit set and the entity node set are both empty;
(2) Completing causal direction judgment of an event through the event type represented by the event structure data, and searching the initial behavior graph model for the entity node corresponding to the entity ID recorded in the event structure data; if the node is found, locking it; if not, adding the node to the entity node set of the initial behavior graph model and initializing it;
(3) Generating event context snapshots through the locked entity nodes, carrying out event merging judgment according to the context snapshots, and merging or updating event units;
(4) Initializing the weights of all event units according to the following formula:
D=IM(type)
wherein IM represents an initialization matrix and D represents event cell weights; type represents an event type;
(5) Constructing a parallel computing framework with entity nodes as cores, generating a corresponding computing core for each entity, and generating a forward event unit processing set and a backward event unit processing set, wherein the elements of each processing set are ordered according to the generation order of the event units; each computing core is allocated a forward and a backward output scale, a forward and a backward environment matrix, and a forward and a backward output result set; the environment weights of all event units are calculated and acquired as follows:
(5.1) setting the scale of the forward output scale to 0 at the initial time, and traversing the forward event unit processing set;
(5.2) assuming that the number of event types is N, the vector length of the weight is L, and the forward event unit to be processed is $E_A$, updating the forward environment weight matrix M using the weight $D_A$ of $E_A$,
wherein $M_{i,j}$ represents the element in the i-th row and j-th column of M, i is an integer between 0 and N-1, and j is an integer between 0 and L-1; $D_{A,j'}$ is the j'-th element of the weight $D_A$ of $E_A$, and j' = j; if $D_{A,j'}$ is greater than $M_{i,j}$, the value of $M_{i,j}$ is updated to $D_{A,j'}$; if $D_{A,j'}$ is less than $M_{i,j}$, greater than 0.6, and greater than 0.9 × $M_{i,j}$, $M_{i,j}$ is multiplied; otherwise, the value of $M_{i,j}$ is not updated;
(5.3) assuming that the forward environment weight of the event unit $E_B$ needs to be output at the current scale; if $E_A$ meets the condition at the current scale of the forward output scale, namely the event unit ID is matched successfully, executing step (5.4) to acquire the forward environment weight of $E_B$ and output it to the forward output result set; otherwise, executing step (5.5) directly without output;
(5.4) obtaining the environment weight $D_{context}$,
wherein $D_{context,g}$ is the g-th element of $D_{context}$; $M_{weak}$ is a weakening matrix, the type code of each event unit corresponding to one weakening matrix used to weaken the influence of uncorrelated values on the event; T(·) is the type code acquisition function, and the return value of T($E_B$) is the type code of $E_B$, which selects the corresponding forward weakening matrix; the processing result is indexed by its i-th row and j-th column, with g = g' = j; the column of the result matrix whose column number equals the index of the element in $D_{context}$ is taken, and the maximum value in that column is recorded at the corresponding position of $D_{context}$; after all elements of $D_{context}$ have been processed, the value of $D_{context}$ is inserted at the position corresponding to $E_B$ in the forward output result set, and the current scale is incremented by one;
(5.5) if the forward output scale has not reached its upper limit and there are event units that have not been traversed, returning to step (5.2) to continue the traversal; if the forward output scale has not reached its upper limit but no untraversed event units remain, continuing with step (5.6); if the upper limit of the scale has been reached, executing step (5.7);
(5.6) using the current environment matrix directly in the calculation for all unprocessed scales to obtain a forward output result set containing all forward environment weights;
(5.7) traversing the backward event unit processing set with the backward output scale according to the processing of steps (5.1)-(5.6) to obtain a backward output result set containing all backward environment weights;
(5.8) after all computing cores have finished processing, obtaining the environment weights of all event units;
(6) Updating the weights of all event units; assuming the event unit whose weight is to be updated is $E_B$, and $D_{context,F}$ and $D_{context,B}$ are its forward and backward environment weights, the weight update is realized according to the following steps:
(6.1) calculating the forward predicted weight $D_{want,F}$ and the backward predicted weight $D_{want,B}$ of $E_B$,
wherein $M_{TM}$ is a weight conversion matrix that converts each dimension of the environment weight into each dimension of the predicted weight; $M_{TM,q,k}$ is the element in the q-th row and k-th column of $M_{TM}$, and $M_{TM,k',q'}$ is the element in the k'-th row and q'-th column of $M_{TM}$; $D_{want,F,o}$ represents the o-th element of $D_{want,F}$, and $D_{want,B,o'}$ the o'-th element of $D_{want,B}$; $D_{context,F,p}$ represents the p-th element of $D_{context,F}$, and $D_{context,B,p'}$ the p'-th element of $D_{context,B}$; here o = q, p = k, o' = q', and p' = k', each with value range [0, L-1]; each element of the predicted weight is calculated separately by traversing L times and taking the maximum value of the calculation results; after all elements of the predicted weight have been processed, the next step is executed;
(6.2) performing the update from the predicted weight to the actual weight through the following formulas to obtain the new weight $D_{new}$ of the updated $E_B$:
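$$Er_{B,r} = (D_{want,B,s} - D_t)\,WR$$
$$Er_{F,r'} = (D_{want,F,s'} - D_{t'})\,WR$$
$$D_{new,u} = \mathrm{MIN}(Er_{B,v},\, Er_{F,w}) \cdot \mathrm{ST}(D_x) + D_x$$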
wherein $D_{want,B,s}$ represents the s-th element of $D_{want,B}$, and $D_{want,F,s'}$ the s'-th element of $D_{want,F}$; $D_t$, $D_{t'}$, and $D_x$ represent the t-th, t'-th, and x-th elements of the weight of $E_B$, respectively; $Er_B$ represents the backward difference weight, and $Er_{B,r}$ is its r-th element; $Er_F$ represents the forward difference weight, and $Er_{F,r'}$ is its r'-th element; WR is the weakening rate used to correct the effect of the difference on the result; ST(·) is a normalization function whose return value is 0 or 1; $D_{new,u}$ is the u-th element of $D_{new}$; $Er_{B,v}$ is the v-th element of $Er_B$; $Er_{F,w}$ is the w-th element of $Er_F$; here r = s = t, r' = s' = t', and u = v = w = x, each with value range [0, L-1]; after every element of $D_{new}$ has been updated element by element, the next step is executed;
(7) Circularly executing the steps (5) - (6) until the result is stable or the upper limit of the iteration times is reached; if the cycle termination condition is reached, the weight of the event unit is not changed any more, and the event unit is divided into a threat event and a benign event according to the weight vector;
(8) Inputting the event unit to be tracked, namely the threat element event unit meta, and gradually generating an attack structure in the behavior graph model with meta as the center; letting the preliminary attack path be THREATSET1 = {meta} and the set of event units to be tracked be ProcessSet = {meta};
(9) Searching for the event units related to ProcessSet to construct the related event unit set RELATEDSET;
(10) If RELATEDSET is not empty, merging RELATEDSET into THREATSET1 to obtain the attack path THREATSET, setting ProcessSet to RELATEDSET, updating THREATSET1 with THREATSET, and returning to step (9); otherwise, outputting THREATSET directly;
(11) Obtaining a behavior graph model with an attack path, thereby realizing attack tracing.
5. A method according to claim 4, characterized in that: the entity node in the step (2) at least comprises an entity ID, an in-out event unit set, an attribute and environment matrix information;
6. A method according to claim 4, characterized in that: in the step (3), the event merging determination is performed according to the contextual snapshot, and the event units are merged or updated, specifically: generating new event units by using event structure data, forming event units among the same entities as the new event units into an event unit set among the same entities, traversing event units with the same number as the event types from the tail of the set, determining whether the context changes by comparing whether context snapshots are consistent, merging the event units with the same type if the event units with the same type exist, and updating the information of the event units after merging; if not, a new event element is added to the event element set of the behavior graph model and information for the event elements other than the event structure data is initialized.
7. A method according to claim 6, characterized in that: the event unit information at least comprises event IDs, types, front and back entity nodes, event structure data sets and event unit weights.
8. A method according to claim 4, characterized in that: the environmental weights of all event units are acquired in the step (5), the environmental weights of the forward event units and the backward event units are required to be calculated respectively, data in the forward direction and the backward direction are processed respectively, and the data processing processes in the two directions are the same, so that a forward output result set and a backward output result set are obtained; when each event unit is calculated, firstly judging whether the unit meets the current scale standard of the corresponding scale of the direction, if so, filling data into the current direction output result set recorded under the scale, otherwise, continuously calculating the next event unit.
9. A method according to claim 4, characterized in that: the result in the step (7) is stable, namely the number of threat events after multiple rounds of calculation is kept unchanged within a preset range, and the threat events are event units with a certain element value greater than 0.7 in the weight vector.
10. A method according to claim 4, characterized in that: searching for the event units related to ProcessSet in step (9) to construct the related event unit set RELATEDSET comprises the following steps:
(9.1) assuming $E_A$ is an event unit in ProcessSet, searching for the entity nodes related to $E_A$ and marking each entity: if the entity has not been marked in the current calculation round, it is marked directly; otherwise, the existing entity mark is updated; the mark represents the range of optional elements in the event unit set associated with the entity, and the mark is expanded through $E_A$;
(9.2) traversing the event units in ProcessSet in sequence and performing step (9.1) on each of them;
(9.3) extracting all newly marked or updated entity marks and comparing each with the mark of the previous round; if a mark has expanded and a new event unit within the range described by the expanded part is a threat event in the event unit set of the entity corresponding to the mark, adding the new event unit to RELATEDSET; in the first round, no comparison is made and the event units are added directly.
CN202410209726.XA 2024-02-26 2024-02-26 Dynamic attack tracing system and method Pending CN118074980A (en)


Publications (1)

Publication Number Publication Date
CN118074980A true CN118074980A (en) 2024-05-24

Family

ID=91095035



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination