CN112822220B - Multi-sample combination attack-oriented tracing method and device - Google Patents

Multi-sample combination attack-oriented tracing method and device

Info

Publication number
CN112822220B
Authority
CN
China
Prior art keywords
attack
tracing
samples
feature
sample
Prior art date
Legal status
Active
Application number
CN202110238343.1A
Other languages
Chinese (zh)
Other versions
CN112822220A (en
Inventor
薛晨龙
童志明
肖新光
Current Assignee
Antiy Technology Group Co Ltd
Original Assignee
Antiy Technology Group Co Ltd

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information
    • H04L63/126 Applying verification of the received information the source of the received data

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a tracing method and device for multi-sample combination attacks. The method comprises: detecting an attack trigger event; obtaining at least two attack samples to be traced; performing feature extraction on the at least two attack samples to be traced to obtain feature information of each attack sample to be traced; and obtaining the tracing results of the at least two attack samples to be traced according to the feature information of each attack sample to be traced and a pre-created target Bayesian tracing model. The method and device can improve tracing efficiency for multi-sample combination attacks.

Description

Multi-sample combination attack-oriented tracing method and device
Technical Field
The invention relates to the technical field of network security, in particular to a multi-sample combination attack-oriented tracing method and device.
Background
Unlike traditional network intrusions, an Advanced Persistent Threat (APT) often uses multiple attack samples to carry out a combined attack on a network information system. Because such combined attacks are high-risk, hard to detect, long-lasting, and clearly targeted, they pose a serious threat to network security. To restore network security, the attack must be traced as quickly as possible so that a solution can be found.
At present, most traceability analysis methods for malicious attack samples target a single sample and are not suitable for APT multi-sample combined attacks, and most existing traceability schemes have certain limitations. Because the attack samples in an APT may come from several attack organizations, such obfuscated attacks are currently analyzed and traced mainly by relying on the experience of professional analysts. Tracing analysis requires a large accumulation of data, and as APT attack techniques keep changing, massive amounts of feature data are generated. This makes tracing harder for analysts and lengthens the analysis, so tracing efficiency for multi-sample combination attacks is low.
In view of the above, it is desirable to provide a tracing method and apparatus for multi-sample combination attack to solve the above disadvantages.
Disclosure of Invention
The technical problem to be solved by the invention is how to improve tracing efficiency for multi-sample combination attacks. In view of the defects in the prior art, the invention provides a tracing method and device for multi-sample combination attacks.
In order to solve the above technical problem, in a first aspect, the present invention provides a tracing method for multi-sample combination attack, where the method includes:
detecting an attack trigger event;
obtaining at least two attack samples to be traced;
performing feature extraction on the at least two attack samples to be traced to obtain feature information of each attack sample to be traced;
and obtaining the tracing results of the at least two attack samples to be traced according to the feature information of each attack sample to be traced and a pre-created target Bayesian tracing model.
Optionally, obtaining the tracing results of the at least two attack samples to be traced according to the feature information of each attack sample to be traced and a pre-created target Bayesian tracing model includes:
screening the feature information of each attack sample to be traced to obtain a target feature subset corresponding to each attack sample to be traced;
and inputting the target feature subset of each attack sample to be traced into the target Bayesian tracing model to obtain the tracing result of the at least two attack samples to be traced.
Optionally, the method for creating the target Bayesian tracing model includes:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
performing feature extraction on the at least two groups of historical attack samples to obtain a feature set corresponding to each group of historical attack samples;
constructing a Bayesian tracing model;
and training the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model.
Optionally, the performing feature extraction on the at least two groups of historical attack samples to obtain a feature set corresponding to each group of historical attack samples includes:
for each set of historical attack samples, performing:
performing feature extraction on at least two attack samples included in the group of historical attack samples to obtain feature information corresponding to each attack sample;
screening the characteristic information of each attack sample to obtain a characteristic subset corresponding to the attack sample; the feature subset comprises feature information of each screened attack sample;
acquiring a real tracing result corresponding to the group of historical attack samples;
and combining the feature subset of each attack sample with the real tracing result of the group of historical attack samples to obtain the feature set corresponding to the group of historical attack samples.
Optionally, the feature set of each group of historical attack samples includes feature information as input and a real tracing result of the group of historical attack samples as output;
the training of the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model comprises the following steps:
giving preset probability distribution to parameters in the constructed Bayesian tracing model;
performing iterative training on the Bayes traceability model by using the feature set of each group of historical attack samples to obtain the optimized probability distribution of the parameters;
randomly sampling the feature set of each group of historical attack samples to obtain a test set;
obtaining a predicted tracing result for the test set according to the test set and the iteratively trained Bayesian tracing model, which includes the optimized probability distribution of the parameters;
judging whether the similarity between the predicted tracing result and the real tracing result that serves as the output in the test set is greater than a preset threshold;
if yes, obtaining the target Bayesian tracing model, wherein the parameters in the target Bayesian tracing model follow the optimized probability distribution.
Optionally, the feature information of each attack sample includes at least one feature corresponding to the attack sample;
the screening the characteristic information of each attack sample comprises:
for each attack sample, performing:
obtaining the information gain of each feature in the feature information by using a decision tree algorithm, and calculating to obtain the weight corresponding to each feature by using the information gain of each feature;
and screening out the features corresponding to the weights larger than the preset weight threshold value according to the obtained weight of each feature and the preset weight threshold value so as to obtain the feature subset corresponding to the attack sample.
Optionally, after obtaining the tracing results of the at least two attack samples to be traced, the method further includes:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
periodically updating the acquired at least two sets of historical attack samples;
and periodically training the target Bayes tracing model by using the updated at least two groups of historical attack samples to obtain an optimized Bayes tracing model.
In a second aspect, the present invention further provides a tracing apparatus for multi-sample combination attack, including:
the acquisition module is used for acquiring at least two attack samples to be traced when an attack trigger event is detected;
the characteristic extraction module is used for extracting the characteristics of the at least two attack samples to be traced acquired by the acquisition module to acquire the characteristic information of each attack sample to be traced;
and the tracing module is used for obtaining the tracing results of the at least two attack samples to be traced according to the feature information of each attack sample to be traced obtained by the feature extraction module and a pre-established target Bayesian tracing model.
In a third aspect, the present invention further provides a tracing apparatus for multi-sample combination attack, including: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine readable program to execute the tracing method for multi-sample combination attack provided by the first aspect or any possible implementation manner of the first aspect.
In a fourth aspect, the present invention further provides a computer-readable medium, where the computer-readable medium stores computer instructions, and when executed by a processor, the computer instructions cause the processor to execute the method for tracing to a multi-sample combination attack provided by the first aspect or any possible implementation manner of the first aspect.
The embodiments of the invention provide a tracing method and device for multi-sample combination attacks. With the pre-trained target Bayesian tracing model, multiple attack samples to be traced can be traced together, and tracing results such as the attack organizations to which the samples belong are determined. For the obfuscated attack techniques used in APTs, the target Bayesian tracing model can thus analyze the tracing result of the current attack samples automatically and quickly, which reduces analyst involvement, shortens tracing analysis time, and improves tracing efficiency for multi-sample combination attacks.
Drawings
Fig. 1 is a tracing method for multi-sample combination attack according to an embodiment of the present invention;
fig. 2 is another tracing method for multi-sample combination attack according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a device where a tracing apparatus for multi-sample combination attack is located according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a tracing apparatus facing multi-sample combination attack according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1, a tracing method for multi-sample combination attack provided in an embodiment of the present invention includes the following steps:
step 101: detecting an attack trigger event;
step 102: obtaining at least two attack samples to be traced;
step 103: performing feature extraction on at least two attack samples to be traced to obtain feature information of each attack sample to be traced;
step 104: and obtaining the tracing results of at least two attack samples to be traced according to the characteristic information of each attack sample to be traced and a pre-established target Bayes tracing model.
In the embodiment of the invention, when an attack trigger event is detected, at least two attack samples to be traced for the current attack trigger event are obtained, the features of the attack samples to be traced are extracted, the feature information corresponding to each attack sample to be traced is obtained, and the tracing results of the at least two attack samples to be traced are finally obtained according to a pre-trained target Bayesian tracing model. Therefore, through the pre-trained target Bayes traceability model, a plurality of attack samples to be traced can be traced, traceability results such as attack organizations and the like to which the plurality of attack samples belong are finally determined, and meanwhile, the problem that a single sample cannot be accurately traced and positioned is avoided. Therefore, the source tracing result of the current attack sample can be automatically and quickly analyzed by using the target Bayesian source tracing model for the APT-oriented obfuscation attack means, the participation of analysts is reduced, the source tracing analysis time is shortened, and the source tracing efficiency for multi-sample combination attack is improved.
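Steps 102 to 104 can be sketched as a small pipeline. This is a minimal illustration only: `trace_attack`, `toy_extract`, and `toy_model` are hypothetical names, and the feature extractor and model are stand-ins rather than anything described in the patent.

```python
from typing import Callable, Dict, List, Sequence

Features = Dict[str, float]


def trace_attack(
    samples: Sequence[bytes],
    extract: Callable[[bytes], Features],
    model: Callable[[List[Features]], Dict[str, float]],
) -> Dict[str, float]:
    """Run steps 102-104 once an attack trigger event has been detected."""
    # Step 102: the method works on at least two attack samples.
    if len(samples) < 2:
        raise ValueError("at least two attack samples are required")
    # Step 103: extract feature information from every sample.
    features = [extract(sample) for sample in samples]
    # Step 104: query the pre-created model with all samples together.
    return model(features)


# Hypothetical stand-ins, for illustration only.
def toy_extract(sample: bytes) -> Features:
    return {"size": float(len(sample))}


def toy_model(features: List[Features]) -> Dict[str, float]:
    total = sum(f["size"] for f in features)
    if total > 10:
        return {"org_A": 0.7, "org_B": 0.3}
    return {"org_A": 0.4, "org_B": 0.6}
```

Passing the extractor and the model as callables keeps the sketch independent of any particular feature format or model library.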
In the embodiment of the invention, the target Bayesian tracing model can be used to trace the attack samples as soon as an attack trigger event is detected. Alternatively, when a past attack event is reviewed, the target Bayesian tracing model can be used to trace at least one attack sample from that event.
In the embodiment of the present invention, the attack trigger event detected in step 101 may include, but is not limited to, any one of the following: abnormal network traffic is detected; an abnormal network connection is detected; the host's system resources are heavily occupied and the system stalls; or the network is found to be flooded with a large number of useless data packets.
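The trigger conditions listed above could be checked along the following lines. The metric names and every threshold value here are assumptions made for illustration; the patent does not specify how each condition is measured.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, not taken from the patent.
TRAFFIC_FACTOR = 3.0   # traffic above 3x the baseline counts as abnormal
RESOURCE_LIMIT = 0.95  # fraction of host resources in use
JUNK_LIMIT = 0.5       # fraction of packets classified as useless


@dataclass
class HostMetrics:
    traffic_bytes_per_s: float
    abnormal_connections: int
    resource_usage: float     # fraction in [0, 1]
    junk_packet_ratio: float  # fraction in [0, 1]


def trigger_reasons(m: HostMetrics, baseline_traffic: float) -> List[str]:
    """Return the trigger conditions that currently hold (empty if none)."""
    reasons = []
    if m.traffic_bytes_per_s > TRAFFIC_FACTOR * baseline_traffic:
        reasons.append("abnormal network traffic")
    if m.abnormal_connections > 0:
        reasons.append("abnormal network connection")
    if m.resource_usage >= RESOURCE_LIMIT:
        reasons.append("host resources heavily occupied")
    if m.junk_packet_ratio >= JUNK_LIMIT:
        reasons.append("network flooded with useless packets")
    return reasons
```

Any non-empty result would count as an attack trigger event in this sketch.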
In the embodiment of the present invention, for the tracing result of at least two attack samples to be traced, the tracing result includes an attack organization to which each attack sample belongs in the current attack trigger event, different attack samples belong to different attack organizations, and domain names or IP addresses used by attackers corresponding to different attack samples may also be different.
Optionally, in the tracing method for multi-sample combination attack shown in fig. 1, obtaining, in step 104, the tracing results of at least two attack samples to be traced according to the feature information of each attack sample to be traced and a pre-created target Bayesian tracing model includes:
screening the feature information of each attack sample to be traced to obtain a target feature subset corresponding to each attack sample to be traced;
and inputting the target characteristic subset of each attack sample to be traced into a target Bayesian tracing model to obtain the tracing results of at least two attack samples to be traced.
In the embodiment of the invention, the characteristic information of each attack sample to be traced is screened to obtain the target characteristic subset corresponding to each attack sample to be traced, and then the target characteristic subset of each attack sample to be traced is input into the target Bayesian tracing model to obtain the tracing results of at least two attack samples to be traced. Therefore, the source tracing analysis of the multiple attack samples is realized through the target Bayes source tracing model, the source tracing difficulty of the analysts is reduced, a large amount of data calculation and analysis are not required to be carried out by the analysts, and the source tracing efficiency and the intelligence of the multi-sample combined attack are improved.
In the embodiment of the invention, the feature information of each attack sample to be traced can be screened directly on the basis of the feature information obtained for that sample, without using a decision tree algorithm, so that feature screening also works for attack samples whose source is unknown.
For example, when an attack trigger event is detected, 4 attack samples to be traced are obtained. Feature extraction is performed on the 4 samples to obtain the feature information of each one, the feature information is screened to obtain the target feature subset of each sample, and the target feature subsets are input into the target Bayesian tracing model to obtain the tracing result. The tracing result is the probability that the 4 attack samples to be traced come from each attack organization: 30% for attack organization A, 10% for attack organization B, 5% for attack organization D, 2% for attack organization E, 0.15% for attack organization M, and 0.1% for attack organization N.
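A tracing result of this shape can be turned into a ranked shortlist for an analyst. The sketch below reuses the probabilities from the example; the function name `rank_organizations` and the `top_k` parameter are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_organizations(probs: Dict[str, float], top_k: int = 3) -> List[Tuple[str, float]]:
    """Sort attack organizations by probability, highest first."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Probabilities from the example above, written as fractions.
example_result = {
    "A": 0.30, "B": 0.10, "D": 0.05,
    "E": 0.02, "M": 0.0015, "N": 0.001,
}
```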
In the embodiment of the invention, the target Bayesian tracing model can be used to screen out attack samples that imitate other organizations. By analyzing the tracing results of the attack samples, the attack organization to which they actually belong can be located accurately, which avoids tracing errors caused by imitation and improves the accuracy of the tracing result.
Optionally, in the tracing method for multi-sample combination attack, before step 104, the method for creating the target Bayesian tracing model includes:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
performing feature extraction on at least two groups of historical attack samples to obtain feature sets corresponding to each group of historical attack samples;
constructing a Bayesian tracing model;
and training the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model.
In the embodiment of the present invention, before step 104, a Bayesian tracing model may be constructed by obtaining multiple groups of historical attack samples, and the Bayesian tracing model is trained with the feature sets obtained by performing feature extraction on each group of historical attack samples, so as to obtain the pre-created target Bayesian tracing model.
In the embodiment of the invention, the Bayesian tracing model is built on a Bayesian network. A Bayesian network handles uncertainty well: it expresses the relationships among features through conditional probabilities and can represent and fuse multi-source information according to the correlations between features. A tracing model based on it can therefore improve both the efficiency and the accuracy of tracing multi-sample combination attacks. It should be noted that the constructed Bayesian tracing model has a Bayesian model framework of at least two layers.
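To make the role of conditional probabilities concrete, the sketch below computes a posterior over attack organizations with a naive Bayes model, the simplest Bayesian network, in which features are assumed conditionally independent given the organization. This is a swapped-in simplification for illustration, not the multi-layer model the patent describes, and all probabilities in any usage are made up.

```python
from typing import Dict, List


def posterior(
    prior: Dict[str, float],
    likelihood: Dict[str, Dict[str, float]],
    observed: List[str],
) -> Dict[str, float]:
    """P(org | features) is proportional to P(org) times the product of P(feature | org)."""
    scores = {}
    for org, p in prior.items():
        for feature in observed:
            p *= likelihood[org][feature]
        scores[org] = p
    total = sum(scores.values())
    if total == 0:
        raise ValueError("observed features are impossible under every organization")
    # Normalize so the posterior sums to 1 over the organizations.
    return {org: score / total for org, score in scores.items()}
```

Features observed across several samples of one attack can simply be pooled into `observed`, which is one way evidence from multiple samples could be fused.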
Optionally, in the tracing method for multi-sample combined attack shown in fig. 1, the feature extraction is performed on at least two groups of historical attack samples to obtain a feature set corresponding to each group of historical attack samples, and the method includes:
for each set of historical attack samples, performing:
performing feature extraction on at least two attack samples included in the group of historical attack samples to obtain feature information corresponding to each attack sample;
screening the characteristic information of each attack sample to obtain a characteristic subset corresponding to the attack sample; the characteristic subset comprises characteristic information of each screened attack sample;
acquiring a real tracing result corresponding to the group of historical attack samples;
and combining the feature subset of each attack sample with the real source tracing result of the group of historical attack samples to obtain the feature set corresponding to the group of historical attack samples.
In the embodiment of the invention, for each group of historical attack samples comprising at least two attack samples, feature extraction is respectively carried out on each attack sample to obtain feature information corresponding to the attack sample, then the obtained feature information of the attack sample is screened to obtain a feature subset of the attack sample, a true tracing result corresponding to the group of historical attack samples is obtained, the feature subset of each attack sample corresponding to the group of historical attack samples and the obtained true tracing result are combined to obtain a feature set of the group of historical attack samples.
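The feature set of one group can be pictured as the screened feature subsets of its samples paired with the group's real tracing result. The sketch below assumes features are name-to-value mappings and that screening has already produced a set of feature names to keep; `GroupFeatureSet` and `build_feature_set` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class GroupFeatureSet:
    feature_subsets: List[Dict[str, float]]  # input: one screened subset per attack sample
    true_source: str                         # output: real tracing result of the group


def build_feature_set(
    sample_features: List[Dict[str, float]],
    keep: Set[str],
    true_source: str,
) -> GroupFeatureSet:
    """Screen each sample's features and attach the group's real tracing result."""
    if len(sample_features) < 2:
        raise ValueError("each historical group contains at least two attack samples")
    subsets = [
        {name: value for name, value in features.items() if name in keep}
        for features in sample_features
    ]
    return GroupFeatureSet(feature_subsets=subsets, true_source=true_source)
```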
It should be noted that the feature information corresponding to each attack sample includes a dynamic feature and a static feature of the attack sample.
In the embodiment of the invention, screening the feature information of the attack samples removes redundant features that have little influence on the tracing result and carry little information. This reduces the amount of computation in the subsequent tracing process and further improves the accuracy and reliability of tracing results for multi-sample combination attacks.
Optionally, in the tracing method for multi-sample combined attack shown in fig. 1, the feature set of each group of historical attack samples includes feature information as input and a real tracing result of the group of historical attack samples as output;
training the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model, and the method comprises the following steps:
giving preset probability distribution to parameters in the constructed Bayesian tracing model;
performing iterative training on the Bayes traceability model by using the feature set of each group of historical attack samples to obtain the optimized probability distribution of the parameters;
randomly sampling the feature set of each group of acquired historical attack samples to acquire a test set;
obtaining a predicted tracing result for the test set according to the test set and the iteratively trained Bayesian tracing model, which includes the optimized probability distribution of the parameters;
judging whether the similarity between the predicted tracing result and the real tracing result that serves as the output in the test set is greater than a preset threshold;
if yes, obtaining the target Bayesian tracing model, wherein the parameters in the target Bayesian tracing model follow the optimized probability distribution.
In the embodiment of the present invention, 90% of the feature sets of each set of historical attack samples obtained in step 101 may be randomly extracted as a training set, and the remaining 10% may be used as a test set. The training set comprises at least two groups of feature sets of historical attack samples, and the feature set of each group of historical attack samples comprises feature information used as input and real source tracing results of the group of historical attack samples used as output; wherein each group of historical attack samples corresponds to one historical attack event.
In the embodiment of the invention, the Bayesian tracing model is trained with the feature set of each group of historical attack samples as follows. First, a preset probability distribution is assigned to each parameter in the constructed Bayesian tracing model. The model is then iteratively trained with the feature sets in the training set to obtain a first Bayesian tracing model, in which each parameter follows an optimized probability distribution. Finally, the first Bayesian tracing model is validated with the test set: when the similarity between the predicted tracing result output by the first Bayesian tracing model for the test set and the real tracing result in the test set is greater than a preset threshold, the first Bayesian tracing model is determined to be the target Bayesian tracing model; otherwise, the iterative training with the feature set of each group of historical attack samples is repeated until the target Bayesian tracing model is determined.
In the embodiment of the present invention, for example, the test set includes 5 groups of historical attack samples. When the first Bayesian tracing model is validated with the test set, if the similarities between the predicted tracing result obtained by the model for each group of historical attack samples and the corresponding real tracing result are 80%, 90%, 95%, 99% and 85% respectively, and the preset threshold is 80%, the first Bayesian tracing model is determined to be the target Bayesian tracing model.
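The split, train, and validate loop can be sketched as follows. Two points are assumptions rather than statements from the patent: the per-group similarities are aggregated by their mean before being compared with the threshold, and training is capped at `max_rounds`. The `train_step` and `evaluate` callables are hypothetical placeholders for the actual model update and test-set evaluation.

```python
import random
from typing import Callable, List, Sequence, Tuple


def split_train_test(
    feature_sets: Sequence[object], test_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[object], List[object]]:
    """Randomly hold out a fraction of the feature sets as the test set."""
    items = list(feature_sets)
    random.Random(seed).shuffle(items)
    n_test = max(1, round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]


def passes_validation(similarities: Sequence[float], threshold: float) -> bool:
    """Assumption: compare the mean similarity with the threshold (strictly greater)."""
    return sum(similarities) / len(similarities) > threshold


def train_until_valid(
    train_step: Callable[[], None],
    evaluate: Callable[[], Sequence[float]],
    threshold: float,
    max_rounds: int = 100,
) -> int:
    """Repeat training until validation passes; return the number of rounds used."""
    for round_no in range(1, max_rounds + 1):
        train_step()
        if passes_validation(evaluate(), threshold):
            return round_no
    raise RuntimeError("validation threshold not reached within max_rounds")
```

With the similarities from the example (80%, 90%, 95%, 99%, 85%), the mean is 89.8%, which exceeds the 80% threshold under this aggregation.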
Specifically, for $s = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$, the posterior probability of the parameters in the Bayesian tracing model is expressed as
[Equation image BDA0002961140670000111]
where $P(c)$ represents the prior probability, $P(c \mid x, y)$ represents the likelihood estimate, and $P(y \mid x)$ represents the prior probability that the tracing result is $y$ given the feature $x$. Each parameter in the formula is first given a preset probability distribution, and the Bayesian tracing model is then iteratively trained with the training set so that the probability distribution of each parameter is continuously optimized.
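For orientation, Bayes' rule for a parameter $c$ given a feature $x$ and a tracing result $y$ can be written in the standard form below. That the original equation image has exactly this shape is an assumption. In this standard form the left-hand side is the posterior, the likelihood is the term $P(y \mid x, c)$ (a symbol that does not appear in the source text), and the prior $P(c)$ is taken to be independent of $x$.

```latex
P(c \mid x, y) = \frac{P(y \mid x, c)\, P(c)}{P(y \mid x)}
```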
Optionally, the feature information of each attack sample includes at least one feature corresponding to the attack sample;
screening the characteristic information of each attack sample, comprising the following steps:
for each attack sample, performing:
obtaining the information gain of each feature in the feature information by using a decision tree algorithm, and calculating the weight corresponding to each feature by using the information gain of each feature;
and screening out the features corresponding to the weights larger than the preset weight threshold value according to the obtained weight of each feature and the preset weight threshold value so as to obtain the feature subset corresponding to the attack sample.
In the embodiment of the invention, the characteristic information of each attack sample comprises at least one characteristic corresponding to the attack sample. Specifically, feature information of each attack sample is screened, firstly, a decision tree algorithm is utilized to obtain information gain of each feature in the feature information of the attack sample, so that a weight corresponding to each feature is obtained according to the information gain calculation of each feature, and features corresponding to the weights larger than a preset weight threshold are screened out to be reserved, namely, the reserved features form a feature subset of the attack sample. Therefore, through calculation of information gain and weight, the characteristics with high influence on the real tracing result are determined to be reserved, namely the characteristics of large information amount, high independence and small information overlapping degree are determined, the data processing difficulty in the tracing process can be further reduced, the operation speed is increased, and the efficiency of obtaining the tracing result and the accuracy of the tracing result are improved.
Specifically, the weight corresponding to each feature is calculated by the formula
$$w_i = \frac{G_i}{\sum_{j=1}^{m} G_j}$$
where $w_i$ denotes the weight of the i-th feature, $G_i$ denotes the information gain of the i-th feature, $m$ denotes the total number of features in the attack sample, and $\sum_{j=1}^{m} G_j$ denotes the sum of the information gains of the m features of the attack sample.
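The screening described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the weight is taken to be each feature's information gain divided by the sum of all gains, the information gain is computed as the entropy reduction a decision tree would use when splitting on a discrete feature, and the feature names (`packer`, `c2_tld`, `lang`), the labels, and the threshold value are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def screen_features(rows, labels, threshold):
    """Return (weights, kept); `kept` lists the features whose weight exceeds `threshold`."""
    names = list(rows[0].keys())
    gains = {n: information_gain([r[n] for r in rows], labels) for n in names}
    total_gain = sum(gains.values())
    if total_gain == 0:
        weights = {n: 0.0 for n in names}  # no feature is informative
    else:
        weights = {n: g / total_gain for n, g in gains.items()}
    kept = [n for n in names if weights[n] > threshold]
    return weights, kept


# Hypothetical labelled samples: two features separate the groups, one does not.
rows = [
    {"packer": "upx", "c2_tld": "ru", "lang": "en"},
    {"packer": "upx", "c2_tld": "ru", "lang": "zh"},
    {"packer": "none", "c2_tld": "cn", "lang": "en"},
    {"packer": "none", "c2_tld": "cn", "lang": "zh"},
]
labels = ["group_a", "group_a", "group_b", "group_b"]
weights, kept = screen_features(rows, labels, threshold=0.2)
```

In this toy data the `lang` feature carries no information about the label, so its weight is zero and it is dropped, while the two informative features share the total weight equally.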
Optionally, in the tracing method for multi-sample combination attack, after obtaining the tracing results of at least two attack samples to be traced in step 104, the tracing method further includes:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
periodically updating the acquired at least two groups of historical attack samples;
and periodically training the target Bayes traceability model by using at least two groups of updated historical attack samples to obtain an optimized Bayes traceability model.
In this embodiment of the invention, the current target Bayesian tracing model can be optimized at regular intervals: it is repeatedly retrained with the periodically updated historical attack samples, which strengthens the model's discriminative ability. As a result, novel, previously unknown attack techniques can also be traced to help locate the attacking organization or hacker, which improves the model's tracing capability and, in turn, tracing efficiency.
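A minimal sketch of such periodic retraining is given below. It assumes a caller-supplied `train` function and a fixed period in seconds; the class name `PeriodicRetrainer`, its methods, and the stand-in trainer are hypothetical and are not taken from the patent.

```python
from typing import Callable, List, Tuple

# One group: (feature subsets of the samples in the group, real tracing result).
Group = Tuple[List[dict], str]


class PeriodicRetrainer:
    """Keeps a history of labelled groups and retrains once per period."""

    def __init__(self, train: Callable[[List[Group]], object],
                 history: List[Group], period_s: float):
        self.train = train
        self.history = list(history)
        self.period_s = period_s
        self.model = train(self.history)  # initial target model
        self.last_trained = 0.0

    def add_groups(self, new_groups: List[Group]) -> None:
        """Update the stored historical attack samples."""
        self.history.extend(new_groups)

    def tick(self, now: float) -> bool:
        """Retrain if at least one full period has elapsed; return True if retrained."""
        if now - self.last_trained < self.period_s:
            return False
        self.model = self.train(self.history)
        self.last_trained = now
        return True


# Stand-in trainer for illustration: the "model" is just the number of groups seen.
g1 = ([{"packer": "upx"}, {"packer": "upx"}], "group_a")
g2 = ([{"packer": "none"}, {"packer": "none"}], "group_b")
g3 = ([{"packer": "upx"}, {"packer": "none"}], "group_a")
retrainer = PeriodicRetrainer(lambda groups: len(groups), [g1, g2], period_s=3600)
retrainer.add_groups([g3])
too_early = retrainer.tick(now=10.0)
model_before = retrainer.model
on_time = retrainer.tick(now=3600.0)
model_after = retrainer.model
```

Passing the clock value into `tick` rather than reading it inside the class keeps the sketch deterministic and easy to drive from any scheduler.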
To illustrate the technical solution and advantages of the present invention more clearly, the tracing method for multi-sample combination attack provided by an embodiment of the present invention is described in detail below with reference to fig. 2. The method specifically includes:
Step 201: acquire at least two groups of historical attack samples, and obtain a feature set corresponding to each group of historical attack samples.
Specifically, at least two groups of historical attack samples are obtained, wherein each group of historical attack samples comprises at least two attack samples;
for each set of historical attack samples, performing:
A1, performing feature extraction on the at least two attack samples included in the group of historical attack samples to obtain feature information corresponding to each attack sample;
A2, screening the feature information of each attack sample, wherein the feature information of each attack sample comprises at least one feature of that sample, and the screening comprises, for each attack sample: obtaining the information gain of each feature in the feature information by using a decision tree algorithm, and calculating the weight of each feature from its information gain; and retaining the features whose weights exceed the preset weight threshold, so as to obtain the feature subset corresponding to the attack sample, wherein the feature subset comprises the screened feature information of the attack sample;
A3, obtaining the real tracing result corresponding to the group of historical attack samples;
A4, combining the feature subset of each attack sample with the real tracing result of the group of historical attack samples to obtain the feature set corresponding to the group of historical attack samples.
Step 202: train to obtain a target Bayesian tracing model.
Specifically, a Bayesian tracing model is constructed, wherein the feature set of each group of historical attack samples comprises feature information serving as input and real tracing results of the group of historical attack samples serving as output;
giving preset probability distribution to parameters in the constructed Bayesian traceability model;
performing iterative training on the Bayes traceability model by using the feature set of each group of historical attack samples to obtain the optimized probability distribution of the parameters;
randomly sampling the feature set of each group of acquired historical attack samples to acquire a test set;
obtaining a predicted tracing result corresponding to the test set from the test set and the Bayesian tracing model obtained after the iterative training, which includes the optimized probability distribution of the parameters;
judging whether the similarity between the predicted tracing result and the real tracing result serving as output in the test set is greater than a preset threshold;
if yes, obtaining the target Bayesian tracing model, wherein the parameters in the target Bayesian tracing model follow the optimized probability distribution.
Step 203: detect an attack trigger event, and acquire at least two attack samples to be traced.
Specifically, feature extraction is performed on at least two attack samples to be traced, and feature information of each attack sample to be traced is obtained.
Step 204: obtain the tracing results of the at least two attack samples to be traced by using the target Bayesian tracing model.
Specifically, screening feature information of each attack sample to be traced to obtain a target feature subset corresponding to each attack sample to be traced to the source;
and inputting the target characteristic subset of each attack sample to be traced into a target Bayesian tracing model to obtain the tracing result of at least two attack samples to be traced.
Step 205: obtain an optimized Bayesian tracing model.
Specifically, after the tracing results of at least two attack samples to be traced are obtained, at least two groups of acquired historical attack samples are periodically updated, so that the target Bayes tracing model is periodically trained by using the at least two groups of updated historical attack samples, and the optimized Bayes tracing model is obtained.
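The training and tracing steps above can be illustrated with a small Bayesian model. The patent does not specify the model's form, so the sketch below assumes a categorical naive Bayes model whose parameters carry a symmetric Dirichlet prior (the "preset probability distribution") that is updated group by group into posterior counts (the "optimized probability distribution"). The class name `GroupNaiveBayes`, the feature names, and the group labels are hypothetical.

```python
import math
from collections import defaultdict


class GroupNaiveBayes:
    """Categorical naive Bayes with a symmetric Dirichlet prior (pseudo-count alpha)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = defaultdict(float)  # label -> number of groups
        self.value_counts = defaultdict(float)  # (label, feature, value) -> count
        self.feat_totals = defaultdict(float)   # (label, feature) -> count
        self.vocab = defaultdict(set)           # feature -> values seen so far

    def update(self, samples, label):
        """Fold one labelled group (a list of feature dicts) into the posterior counts."""
        self.class_counts[label] += 1
        for sample in samples:
            for feat, val in sample.items():
                self.value_counts[(label, feat, val)] += 1
                self.feat_totals[(label, feat)] += 1
                self.vocab[feat].add(val)

    def posterior(self, samples):
        """Posterior over labels given ALL samples of one attack, combined."""
        n_groups = sum(self.class_counts.values())
        n_classes = len(self.class_counts)
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log((count + self.alpha) / (n_groups + self.alpha * n_classes))
            for sample in samples:
                for feat, val in sample.items():
                    k = len(self.vocab.get(feat, ())) + 1  # extra slot for unseen values
                    num = self.value_counts.get((label, feat, val), 0.0) + self.alpha
                    den = self.feat_totals.get((label, feat), 0.0) + self.alpha * k
                    score += math.log(num / den)
            scores[label] = score
        top = max(scores.values())  # subtract the maximum for numerical stability
        norm = sum(math.exp(s - top) for s in scores.values())
        return {label: math.exp(s - top) / norm for label, s in scores.items()}


history = [
    ([{"packer": "upx", "c2_tld": "ru"}, {"packer": "upx", "c2_tld": "ru"}], "group_a"),
    ([{"packer": "none", "c2_tld": "cn"}, {"packer": "none", "c2_tld": "cn"}], "group_b"),
]
model = GroupNaiveBayes(alpha=1.0)
for samples, label in history:
    model.update(samples, label)

ambiguous = {"packer": "upx", "c2_tld": "cn"}
single = model.posterior([ambiguous])
combined = model.posterior([ambiguous, {"packer": "upx", "c2_tld": "ru"}])
best = max(combined, key=combined.get)
```

In this toy data the single ambiguous sample leaves the two groups tied, whereas adding the second sample from the same attack tips the posterior toward `group_a`, which mirrors the motivation for tracing several samples together.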
As shown in fig. 3 and fig. 4, an embodiment of the present invention provides a tracing apparatus for multi-sample combination attack. The apparatus embodiments may be implemented in software, in hardware, or in a combination of the two. At the hardware level, fig. 3 is a hardware structure diagram of the device in which the tracing apparatus for multi-sample combination attack provided by an embodiment of the present invention is located. In addition to the processor, memory, network interface, and nonvolatile memory shown in fig. 3, the device may include other hardware, such as a forwarding chip responsible for processing packets. Taking a software implementation as an example, as shown in fig. 4, the apparatus, as a logical entity, is formed when the CPU of the device in which it is located reads the corresponding computer program instructions from the nonvolatile memory into memory and runs them. The tracing apparatus for multi-sample combination attack provided by this embodiment comprises:
an obtaining module 401, configured to obtain at least two attack samples to be traced when an attack trigger event is detected;
a feature extraction module 402, configured to perform feature extraction on at least two attack samples to be traced, which are obtained by the obtaining module 401, to obtain feature information of each attack sample to be traced;
the tracing module 403 is configured to obtain a tracing result of at least two attack samples to be traced according to the feature information of each attack sample to be traced obtained by the feature extraction module 402 and a pre-created target bayesian tracing model.
Optionally, on the basis of the tracing apparatus facing multiple sample combination attack shown in fig. 4, the tracing module 403 is further configured to perform the following operations:
screening the characteristic information of each attack sample to be traced to obtain a target characteristic subset corresponding to each attack sample to be traced;
and inputting the target characteristic subset of each attack sample to be traced into a target Bayesian tracing model to obtain the tracing results of at least two attack samples to be traced.
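The cooperation of the modules above can be sketched as a small Python structure. This is a hypothetical arrangement for illustration only: each module is modeled as a plain callable, and the field names, the stub implementations, and the event identifier are assumptions rather than part of the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, str]


@dataclass
class TracingDevice:
    obtain: Callable[[str], List[bytes]]       # role of the obtaining module 401
    extract: Callable[[bytes], Features]       # role of the feature extraction module 402
    screen: Callable[[Features], Features]     # screening done by the tracing module 403
    model_posterior: Callable[[List[Features]], Dict[str, float]]  # target model

    def on_attack_event(self, event_id: str) -> str:
        """Run the obtain -> extract -> screen -> trace chain for one trigger event."""
        samples = self.obtain(event_id)
        if len(samples) < 2:
            raise ValueError("at least two attack samples are required")
        subsets = [self.screen(self.extract(s)) for s in samples]
        posterior = self.model_posterior(subsets)
        return max(posterior, key=posterior.get)


# Stub modules for illustration only.
device = TracingDevice(
    obtain=lambda event_id: [b"sample-1", b"sample-2"],
    extract=lambda raw: {"size": str(len(raw)), "noise": "x"},
    screen=lambda feats: {k: v for k, v in feats.items() if k != "noise"},
    model_posterior=lambda subsets: {"group_a": 0.9, "group_b": 0.1},
)
result = device.on_attack_event("evt-1")
```

Modeling each module as an injected callable keeps the sketch independent of any particular feature extractor or model, in line with the statement that the apparatus may be realized in software, hardware, or both.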
Optionally, on the basis of the tracing apparatus facing multi-sample combination attack shown in fig. 4, the tracing apparatus further includes a model building module, where the model building module is configured to perform the following operations:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
performing feature extraction on at least two groups of historical attack samples to obtain feature sets corresponding to each group of historical attack samples;
constructing a Bayesian tracing model;
and training the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model.
Optionally, on the basis of the tracing apparatus facing multi-sample combination attack shown in fig. 4, the model building module is further configured to perform the following operations:
for each set of historical attack samples, performing:
performing feature extraction on at least two attack samples included in the group of historical attack samples to obtain feature information corresponding to each attack sample;
screening the characteristic information of each attack sample to obtain a characteristic subset corresponding to the attack sample; the characteristic subset comprises characteristic information of each screened attack sample;
acquiring a real tracing result corresponding to the group of historical attack samples;
and combining the feature subset of each attack sample with the real source tracing result of the group of historical attack samples to obtain the feature set corresponding to the group of historical attack samples.
Optionally, on the basis of the tracing apparatus facing multi-sample combined attack shown in fig. 4, the feature set of each group of historical attack samples includes feature information as input and a real tracing result of the group of historical attack samples as output;
the model building module is further configured to perform the following operations:
giving preset probability distribution to parameters in the constructed Bayesian traceability model;
performing iterative training on the Bayes traceability model by using the feature set of each group of historical attack samples to obtain the optimized probability distribution of the parameters;
randomly sampling the feature set of each group of historical attack samples to obtain a test set;
obtaining a predicted tracing result corresponding to the test set from the test set and the Bayesian tracing model obtained after the iterative training, which includes the optimized probability distribution of the parameters;
judging whether the similarity between the predicted tracing result and the real tracing result serving as output in the test set is greater than a preset threshold;
if yes, obtaining a target Bayes tracing model; and the parameters in the target Bayes traceability model are optimized probability distribution.
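The acceptance check in the last four operations can be sketched as follows. The patent does not define the similarity measure, so this sketch assumes it is the fraction of test groups whose predicted tracing result equals the real one; the function name `accept_model`, the `test_fraction` parameter, and the stand-in predictor are hypothetical.

```python
import random


def accept_model(predict, feature_sets, test_fraction, threshold, seed=0):
    """Randomly sample a test set and accept the model if similarity > threshold.

    `feature_sets` is a list of (inputs, real_result) pairs; `predict` maps the
    inputs of one group to a predicted tracing result.
    """
    rng = random.Random(seed)
    k = max(1, round(len(feature_sets) * test_fraction))
    test_set = rng.sample(feature_sets, k)
    hits = sum(1 for inputs, truth in test_set if predict(inputs) == truth)
    similarity = hits / k  # assumed measure: share of exact matches
    return similarity > threshold, similarity


# Stand-in predictor that always answers "group_a".
always_a = lambda inputs: "group_a"
all_a = [([{"packer": "upx"}], "group_a")] * 4
all_b = [([{"packer": "none"}], "group_b")] * 4
accepted, sim_ok = accept_model(always_a, all_a, test_fraction=0.5, threshold=0.9)
rejected, sim_bad = accept_model(always_a, all_b, test_fraction=0.5, threshold=0.9)
```

As in the text, the test set is drawn from the same feature sets used for training, so the check measures fit to the historical data rather than performance on unseen attacks.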
Optionally, on the basis of the tracing apparatus facing multi-sample combination attack shown in fig. 4, the feature information of each attack sample includes at least one feature corresponding to the attack sample; the model building module is further used for executing the following operations:
screening the characteristic information of each attack sample, which comprises the following steps:
for each attack sample, performing:
obtaining the information gain of each feature in the feature information by using a decision tree algorithm, and calculating the weight corresponding to each feature by using the information gain of each feature;
and screening out the features corresponding to the weights larger than the preset weight threshold value according to the obtained weight of each feature and the preset weight threshold value so as to obtain the feature subset corresponding to the attack sample.
Optionally, on the basis of a tracing apparatus facing multi-sample combination attack shown in fig. 4, the apparatus further includes: an optimization module to perform the following operations:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
periodically updating the acquired at least two groups of historical attack samples;
and periodically training the target Bayes tracing model by using at least two updated groups of historical attack samples to obtain an optimized Bayes tracing model.
It can be understood that the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the tracing apparatus for multi-sample combination attack. In other embodiments of the present invention, the tracing apparatus for multi-sample combination attack may include more or fewer components than those shown, may combine or split some components, or may arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Because the information interaction and execution processes between the modules of the above apparatus are based on the same concept as the method embodiment of the present invention, reference may be made to the description of the method embodiment for details, which are not repeated here.
An embodiment of the present invention further provides a tracing apparatus for multi-sample combination attack, which comprises: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine-readable program to execute the tracing method for multi-sample combination attack in any embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable medium, where a computer instruction is stored on the computer-readable medium, and when the computer instruction is executed by a processor, the processor is enabled to execute a tracing method for multi-sample combination attack in any embodiment of the present invention.
Specifically, a system or an apparatus equipped with a storage medium on which software program codes that realize the functions of any of the above-described embodiments are stored may be provided, and a computer (or a CPU or MPU) of the system or the apparatus is caused to read out and execute the program codes stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a flexible disk, hard disk, magneto-optical disk, optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD + RW), magnetic tape, nonvolatile memory card, and ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform a part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion module connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion module to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
In summary, according to the tracing method and apparatus for multi-sample combination attack provided by the present invention, when an attack trigger event is detected, at least two attack samples to be traced are obtained for the current attack trigger event, feature extraction is performed on them to obtain the feature information of each attack sample to be traced, and the tracing result of the at least two attack samples to be traced is then obtained from a pre-trained target Bayesian tracing model. The method has the following beneficial effects: with the pre-trained target Bayesian tracing model, a plurality of attack samples can be traced together to determine a tracing result such as the attack organization to which they belong, which avoids the problem that a single sample cannot be accurately traced and located. For obfuscated APT attack techniques, the target Bayesian tracing model can therefore analyze the tracing result of the current attack samples automatically and quickly, which reduces analyst involvement, shortens tracing analysis time, and improves tracing efficiency for multi-sample combination attacks.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A tracing method facing multi-sample combined attack is characterized by comprising the following steps:
detecting an attack trigger event;
obtaining at least two attack samples to be traced;
performing feature extraction on the at least two attack samples to be traced to obtain feature information of each attack sample to be traced;
obtaining the tracing results of the at least two attack samples to be traced according to the characteristic information of each attack sample to be traced and a pre-established target Bayesian tracing model;
the method for creating the target Bayesian tracing model comprises the following steps:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
performing feature extraction on the at least two groups of historical attack samples to obtain feature sets corresponding to each group of historical attack samples;
constructing a Bayesian tracing model;
training the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model;
the feature set of each group of historical attack samples comprises feature information used as input and real tracing results of the group of historical attack samples used as output;
the training of the Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model comprises the following steps:
giving preset probability distribution to parameters in the constructed Bayesian tracing model;
performing iterative training on the Bayes traceability model by using the feature set of each group of historical attack samples to obtain the optimized probability distribution of the parameters;
randomly sampling the feature set of each group of historical attack samples to obtain a test set;
obtaining a predicted tracing result corresponding to the test set from the test set and the Bayesian tracing model obtained after the iterative training, which includes the optimized probability distribution of the parameters;
judging whether the similarity between the predicted tracing result and the real tracing result serving as output in the test set is greater than a preset threshold;
if yes, obtaining a target Bayes tracing model; and the parameters in the target Bayes tracing model are the optimized probability distribution.
2. The method according to claim 1, wherein obtaining the tracing results of the at least two attack samples to be traced according to the feature information of each attack sample to be traced and a pre-created target bayesian tracing model comprises:
screening the characteristic information of each attack sample to be traced to obtain a target characteristic subset corresponding to each attack sample to be traced to the source;
and inputting the target feature subset of each attack sample to be traced into the target Bayesian tracing model to obtain the tracing result of the at least two attack samples to be traced.
3. The method according to claim 1, wherein the performing feature extraction on the at least two groups of historical attack samples to obtain a feature set corresponding to each group of historical attack samples comprises:
for each set of historical attack samples, performing:
performing feature extraction on at least two attack samples included in the group of historical attack samples to obtain feature information corresponding to each attack sample;
screening the characteristic information of each attack sample to obtain a characteristic subset corresponding to the attack sample; the characteristic subset comprises characteristic information of each screened attack sample;
acquiring a real tracing result corresponding to the group of historical attack samples;
and combining the feature subset of each attack sample with the real source tracing result of the group of historical attack samples to obtain the feature set corresponding to the group of historical attack samples.
4. The method of claim 3, wherein the feature information of each attack sample comprises at least one feature corresponding to the attack sample;
the screening the characteristic information of each attack sample comprises:
for each attack sample, performing:
obtaining the information gain of each feature in the feature information by using a decision tree algorithm, and calculating the weight corresponding to each feature by using the information gain of each feature;
and screening out the features corresponding to the weights larger than the preset weight threshold value according to the obtained weight of each feature and the preset weight threshold value so as to obtain the feature subset corresponding to the attack sample.
5. The method according to any one of claims 1 to 4, further comprising, after the obtaining the tracing results of the at least two attack samples to be traced, the following steps:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
periodically updating the acquired at least two sets of historical attack samples;
and periodically training the target Bayes tracing model by using the updated at least two groups of historical attack samples to obtain an optimized Bayes tracing model.
6. A tracing device for multi-sample combination attack is characterized by comprising:
an acquisition module, configured to acquire at least two attack samples to be traced when an attack trigger event is detected;
the characteristic extraction module is used for extracting the characteristics of the at least two attack samples to be traced acquired by the acquisition module to obtain the characteristic information of each attack sample to be traced;
the tracing module is used for obtaining the tracing results of the at least two attack samples to be traced according to the feature information of each attack sample to be traced obtained by the feature extraction module and a pre-established target Bayesian tracing model;
further comprising a model building module for performing the following operations:
obtaining at least two groups of historical attack samples; wherein each group of historical attack samples comprises at least two attack samples;
performing feature extraction on at least two groups of historical attack samples to obtain feature sets corresponding to each group of historical attack samples;
constructing a Bayesian tracing model;
training a Bayes traceability model by using the feature set of each group of historical attack samples to obtain a target Bayes traceability model;
the feature set of each group of historical attack samples comprises feature information used as input and real source tracing results of the group of historical attack samples used as output;
the model building module is further configured to perform the following operations:
giving preset probability distribution to parameters in the constructed Bayesian tracing model;
performing iterative training on the Bayes tracing model by using the feature set of each group of historical attack samples to obtain optimized probability distribution of parameters;
randomly sampling the feature set of each group of historical attack samples to obtain a test set;
obtaining a predicted tracing result corresponding to the test set from the test set and the Bayesian tracing model obtained after the iterative training, which includes the optimized probability distribution of the parameters;
judging whether the similarity between the predicted tracing result and the real tracing result serving as output in the test set is greater than a preset threshold;
if yes, obtaining a target Bayes tracing model; and the parameters in the target Bayes traceability model are optimized probability distribution.
7. A tracing device for multi-sample combination attack is characterized by comprising: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor configured to invoke the machine readable program to perform the method of any of claims 1 to 5.
8. A computer-readable medium, characterized in that it has stored thereon computer instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 5.
CN202110238343.1A 2021-03-04 2021-03-04 Multi-sample combination attack-oriented tracing method and device Active CN112822220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110238343.1A CN112822220B (en) 2021-03-04 2021-03-04 Multi-sample combination attack-oriented tracing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110238343.1A CN112822220B (en) 2021-03-04 2021-03-04 Multi-sample combination attack-oriented tracing method and device

Publications (2)

Publication Number Publication Date
CN112822220A CN112822220A (en) 2021-05-18
CN112822220B true CN112822220B (en) 2023-02-28

Family

ID=75862834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110238343.1A Active CN112822220B (en) 2021-03-04 2021-03-04 Multi-sample combination attack-oriented tracing method and device

Country Status (1)

Country Link
CN (1) CN112822220B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254918B (en) * 2021-07-14 2021-10-12 杭州云信智策科技有限公司 Information processing method, electronic device, and computer-readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107241352A (en) * 2017-07-17 2017-10-10 浙江鹏信信息科技股份有限公司 A kind of net security accident classificaiton and Forecasting Methodology and system
CN108259462A (en) * 2017-11-29 2018-07-06 国网吉林省电力有限公司信息通信公司 Big data Safety Analysis System based on mass network monitoring data
CN109889476A (en) * 2018-12-05 2019-06-14 国网冀北电力有限公司信息通信分公司 A kind of network safety protection method and network security protection system
CN109922069A (en) * 2019-03-13 2019-06-21 中国科学技术大学 The multidimensional association analysis method and system that advanced duration threatens
CN111147504A (en) * 2019-12-26 2020-05-12 深信服科技股份有限公司 Threat detection method, apparatus, device and storage medium
CN111565205A (en) * 2020-07-16 2020-08-21 腾讯科技(深圳)有限公司 Network attack identification method and device, computer equipment and storage medium
CN111935192A (en) * 2020-10-12 2020-11-13 腾讯科技(深圳)有限公司 Network attack event tracing processing method, device, equipment and storage medium
CN112202759A (en) * 2020-09-28 2021-01-08 广州大学 APT attack identification and attribution method, system and storage medium based on homology analysis
CN112333196A (en) * 2020-11-10 2021-02-05 恒安嘉新(北京)科技股份公司 Attack event tracing method and device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10438207B2 (en) * 2015-04-13 2019-10-08 Ciena Corporation Systems and methods for tracking, predicting, and mitigating advanced persistent threats in networks

Non-Patent Citations (3)

Title
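A multi-step intrusion alert correlation model based on data mining; Yu Xiao et al.; Journal of Jilin University (Science Edition); 2013-09-26 (No. 05); pp. 1-6 *
An efficient multi-source threat situation assessment method based on hierarchical networks; Ren Shuai et al.; Journal of China Academy of Electronics and Information Technology; 2019-03-20 (No. 03); pp. 1-6 *
A survey of network attack source tracing techniques; Jiang Jianguo et al.; Journal of Cyber Security; 2018-01-15 (No. 01); pp. 1-20 *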

Also Published As

Publication number Publication date
CN112822220A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
EP3651043B1 (en) Url attack detection method and apparatus, and electronic device
US8375450B1 (en) Zero day malware scanner
CN106778241B (en) Malicious file identification method and device
CN112866023B (en) Network detection method, model training method, device, equipment and storage medium
CN108768883B (en) Network traffic identification method and device
CN106899440B (en) Network intrusion detection method and system for cloud computing
JP2022527511A (en) Guessing the time relationship for cybersecurity events
CN112866292B (en) Attack behavior prediction method and device for multi-sample combination attack
CN110557382A (en) Malicious domain name detection method and system by utilizing domain name co-occurrence relation
CN110602029A (en) Method and system for identifying network attack
CN109104421B (en) Website content tampering detection method, device, equipment and readable storage medium
CN105072214A (en) C&C domain name identification method based on domain name feature
CN111368289B (en) Malicious software detection method and device
JP2010097342A (en) Malfunction detection device and program
US10489715B2 (en) Fingerprinting and matching log streams
CN109783805B (en) Network community user identification method and device and readable storage medium
CN109600382B (en) Webshell detection method and device and HMM model training method and device
TW202240453A (en) Method and computer for learning correspondence between malicious behaviors and execution trace of malware and method for implementing neural network
CN112765660A (en) Terminal security analysis method and system based on MapReduce parallel clustering technology
CN115270954A (en) Unsupervised APT attack detection method and system based on abnormal node identification
CN112822220B (en) Multi-sample combination attack-oriented tracing method and device
CN112839061B (en) Tracing method and device based on regional characteristics
EP3913888A1 (en) Detection method for malicious domain name in domain name system and detection device
CN116962009A (en) Network attack detection method and device
JP2015018372A (en) Expression extraction model learning device, expression extraction model learning method and computer program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 150028 building 7, innovation and entrepreneurship square, science and technology innovation city, Harbin high tech Industrial Development Zone, Heilongjiang Province (No. 838, Shikun Road)

Applicant after: Antiy Technology Group Co., Ltd.

Address before: Room 506, 162 Hongqi Street, Nangang 17 building, high tech entrepreneurship center, high tech Industrial Development Zone, Songbei District, Harbin City, Heilongjiang Province

Applicant before: Harbin Antiy Technology Group Co., Ltd.

GR01 Patent grant