CN115987687A - Network attack evidence obtaining method, device, equipment and storage medium - Google Patents

Network attack evidence obtaining method, device, equipment and storage medium

Info

Publication number
CN115987687A
Authority
CN
China
Prior art keywords
attack
data
model
network
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310259847.0A
Other languages
Chinese (zh)
Other versions
CN115987687B (en)
Inventor
梅阳阳
韩伟红
顾钊铨
李树栋
林凯瀚
亓玉璐
马兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peng Cheng Laboratory
Priority to CN202310259847.0A
Publication of CN115987687A
Application granted
Publication of CN115987687B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention belongs to the field of network security and discloses a network attack evidence obtaining method, device, equipment and storage medium. The method comprises: performing network monitoring on the equipment to be subjected to evidence obtaining, and collecting a full-flow log of that equipment with a target packet capturing tool; encrypting the file of the full-flow log to obtain encrypted monitoring data; performing model training on an initial recognition model according to a preset feature screening mode and a sample acquisition log to obtain a target attack recognition model; performing attack identification according to the encrypted monitoring data and the target attack recognition model to determine the network attack type; and completing the network attack evidence obtaining of the equipment according to the network attack type. In this way the integrity and security of the data are protected and privacy disclosure is avoided. At the same time, attacks in the network can be accurately found and tracked in a real scenario, and the completeness of the network attack evidence obtaining process is ensured.

Description

Network attack evidence obtaining method, device, equipment and storage medium
Technical Field
The present invention relates to network security, and in particular, to a method, an apparatus, a device, and a storage medium for obtaining evidence of network attack.
Background
The internet has been widely integrated into the world's social and economic structures, providing everyday services to individuals and organizations, but it is also becoming a major source of conflict. Attackers use a variety of techniques, including denial of service attacks, distributed denial of service attacks, ransomware, and other botnet attacks, to compromise Internet of Things systems and their networks. Because of the anonymity of the internet, it is difficult to identify a culprit, or to pursue one under international law, without clear evidence.
One important application of network forensics is intrusion detection and the traceback analysis of attack events. Network forensics of attack events can help investigators understand how an attack was initiated, what was stolen, and how to prevent the attack from happening again. However, existing network forensics methods do not cover every phase of an investigation. Most of them focus only on data acquisition and disregard the rest of the forensic process. This easily leads to violations of user privacy, because user information is distributed among stakeholders. Although many network forensics methods have been proposed, the lack of standardization leaves them with problems such as unreasonable design and poor practicability. Different environments require different methods and tools, so these methods are difficult for professionals to adopt.
Deploying the existing IoTDots forensic model requires modifying applications, which raises user privacy problems. In addition, data integrity is not considered during data acquisition, so the data is at risk of malicious tampering during collection, transmission and storage. Mahmud Hossain et al. proposed the Probe-IoT model, a network forensics method that uses a public digital ledger to find evidence and is implemented with blockchain technology. The method provides an interface for evidence collection and can verify the integrity and authenticity of evidence during an investigation without violating user privacy. Similar models include FIF-IoT and Block4Forensic. These models can speed up an investigation because investigators do not need to go on site to collect data. However, they require dedicated equipment for collecting service data and the cooperation of multiple entities, which inevitably adds expenditure and power consumption, and later maintenance problems can degrade service. Separately, Jiang Xiaofeng proposed a deep-learning-based method for hierarchical network attack identification and position attack detection. The method uses an autoencoder to learn the behavior patterns of normal and abnormal traffic, and then uses the trained autoencoder to judge whether the traffic under test is normal or abnormal. However, traffic data collected from a network is often large, and server overhead can be reduced only by selecting appropriate features for learning. Moreover, the method does not consider the security risks to the data during collection.
Disclosure of Invention
The invention mainly aims to provide a network attack evidence obtaining method, a network attack evidence obtaining device, network attack evidence obtaining equipment and a storage medium, and aims to solve the technical problem of how to safely, efficiently and accurately obtain evidence of network attack behaviors in the prior art.
In order to achieve the above object, the present invention provides a network attack evidence obtaining method, which comprises:
network monitoring is carried out on equipment to be subjected to evidence obtaining, and a full-flow log of the equipment to be subjected to evidence obtaining is collected according to a target packet capturing tool;
encrypting the file of the full flow log to obtain encrypted monitoring data;
performing model training on the initial recognition model according to a preset feature screening mode and a sample acquisition log to obtain a target attack recognition model;
carrying out attack identification according to the encrypted monitoring data and the target attack recognition model, and determining the network attack type;
and completing the network attack evidence obtaining of the equipment to be subjected to evidence obtaining according to the network attack type.
Optionally, performing model training on the initial recognition model according to the preset feature screening mode and the sample acquisition log to obtain the target attack recognition model includes:
encrypting the file of the sample acquisition log to obtain sample encrypted data;
carrying out data preprocessing on the sample encrypted data to obtain sample processing data;
performing feature screening according to the sample processing data and the preset feature screening mode to obtain model training features;
and performing model training on the initial recognition model according to the model training features to obtain the target attack recognition model.
Optionally, performing feature screening according to the sample processing data and the preset feature screening mode to obtain the model training features includes:
acquiring a target characteristic variable and a preset feature threshold;
performing correlation calculation according to the target characteristic variable and the sample processing data to obtain a correlation coefficient value;
and performing feature screening on the sample processing data according to the correlation coefficient value and the preset feature threshold to obtain the model training features.
Optionally, performing model training on the initial recognition model according to the model training features to obtain the target attack recognition model includes:
determining a model training set and a model testing set according to the model training features, a preset attack type and the feature quantity of the model training features;
performing model training on the initial recognition model according to a model training set and a preset loss function to obtain a model to be tested;
performing model test on the model to be tested according to the model test set to obtain a test result;
and obtaining a target attack identification model according to the test result and the model to be tested.
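The source does not say how the model training set and model testing set are derived from the preset attack types. As a minimal sketch, assuming a per-class (stratified) hold-out is acceptable, the split could look like the following. The function name, the 20% test fraction and the fixed seed are illustrative assumptions, not values from the patent.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, test_fraction=0.2, seed=0):
    """Split (sample, label) pairs so every attack type is represented in both sets.

    test_fraction and seed are illustrative assumptions, not patent values.
    """
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Hold out at least one sample per class when the class has two or more.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test += [(s, label) for s in group[:n_test]]
        train += [(s, label) for s in group[n_test:]]
    return train, test
```

Splitting per attack type keeps rare attack classes from vanishing from the test set, which matters when the traffic is dominated by normal flows.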
Optionally, performing correlation calculation according to the target characteristic variable and the sample processing data to obtain the correlation coefficient value includes:
determining a target characteristic variable mean value according to the target characteristic variable;
determining a sample characteristic variable, a sample characteristic quantity and a sample variable mean value according to the sample processing data;
and performing correlation calculation according to the sample characteristic variables, the target characteristic variable mean value, the sample characteristic quantity and the sample variable mean value to obtain the correlation coefficient value.
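The quantities listed above (two variables, their means, and the number of samples) are the ingredients of a Pearson correlation coefficient. The source does not name the coefficient, so the sketch below assumes Pearson correlation and assumes that a feature is kept when the absolute coefficient reaches the preset feature threshold. The function names and the example column names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between one sample feature and the target variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def screen_features(columns, target, threshold):
    """Keep features whose |correlation| with the target meets the threshold.

    Keeping on |r| >= threshold is an assumption; the source only says that
    screening uses the coefficient value and a preset threshold.
    """
    kept = {}
    for name, values in columns.items():
        r = pearson(values, target)
        if abs(r) >= threshold:
            kept[name] = r
    return kept
```

For example, `screen_features({"pkt_rate": [1, 2, 3, 4], "noise": [1, -1, 1, -1]}, [0, 0, 1, 1], 0.5)` keeps only `pkt_rate`, because `noise` is uncorrelated with the target.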
Optionally, performing attack identification according to the encrypted monitoring data and the target attack recognition model to determine the network attack type includes:
performing missing-value detection on the encrypted monitoring data to obtain a missing detection result;
when the missing detection result indicates that the encrypted monitoring data has data missing, determining the missing type of the encrypted monitoring data;
performing data processing on the encrypted monitoring data according to the missing type to obtain first monitoring data;
normalizing the first monitoring data to obtain second monitoring data;
and inputting the second monitoring data to the target attack recognition model for attack identification, and determining the network attack type.
Optionally, normalizing the first monitoring data to obtain the second monitoring data includes:
when the data type of the first monitoring data is not a preset type, performing data mapping on the first monitoring data to obtain a conversion characteristic value;
determining conversion vector data according to the conversion characteristic value;
and carrying out data normalization processing on the conversion vector data to obtain second monitoring data.
In addition, in order to achieve the above object, the present invention further provides a network attack evidence obtaining apparatus, including:
the acquisition module is used for carrying out network monitoring on the equipment to be subjected to evidence obtaining and acquiring a full-flow log of the equipment to be subjected to evidence obtaining according to a target packet capturing tool;
the encryption module is used for encrypting the file of the full-flow log to obtain encrypted monitoring data;
the training module is used for carrying out model training on the initial recognition model according to a preset feature screening mode and the sample acquisition log to obtain a target attack recognition model;
the identification module is used for carrying out attack identification according to the encrypted monitoring data and the target attack recognition model and determining the network attack type;
and the completion module is used for completing the network attack evidence obtaining of the equipment to be subjected to evidence obtaining according to the network attack type.
In addition, to achieve the above object, the present invention further provides a network attack evidence obtaining device, which includes: a memory, a processor, and a network attack evidence obtaining program that is stored on the memory and can run on the processor, wherein the network attack evidence obtaining program is configured to implement the network attack evidence obtaining method described above.
In addition, to achieve the above object, the present invention further provides a storage medium, on which a network attack forensics program is stored, and the network attack forensics program, when executed by a processor, implements the network attack forensics method as described above.
The method performs network monitoring on the equipment to be subjected to evidence obtaining and collects a full-flow log of that equipment with a target packet capturing tool; encrypts the file of the full-flow log to obtain encrypted monitoring data; performs model training on the initial recognition model according to a preset feature screening mode and a sample acquisition log to obtain a target attack recognition model; performs attack identification according to the encrypted monitoring data and the target attack recognition model to determine the network attack type; and completes the network attack evidence obtaining of the equipment according to the network attack type. Because the full-flow log captured from the equipment is encrypted into encrypted monitoring data, the integrity and security of the data are protected. Attack identification is then performed on the encrypted monitoring data with the target attack recognition model, which is obtained by training the initial recognition model according to the preset feature screening mode and the sample acquisition log. The network attack type is thereby determined and the evidence obtaining is completed without leaking private information. At the same time, attacks in the network can be accurately found and tracked in a real scenario, and the completeness of the network attack evidence obtaining process is ensured.
Drawings
FIG. 1 is a schematic structural diagram of a network attack forensics device of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a network attack forensics method according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a network attack forensics method according to the present invention;
FIG. 4 is a schematic model diagram of a network attack forensics method according to an embodiment of the invention;
FIG. 5 is a flowchart illustrating a model training process according to an embodiment of the network attack forensics method of the present invention;
FIG. 6 is a block diagram of a first embodiment of a network attack evidence obtaining apparatus according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a network attack evidence obtaining device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the network attack evidence obtaining device may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication among these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not limit the network attack evidence obtaining device, which may include more or fewer components than those shown, combine some components, or arrange the components differently.
As shown in fig. 1, the memory 1005, which is a storage medium, may include an operating system, a network communication module, a user interface module, and a network attack evidence obtaining program.
In the network attack evidence obtaining device shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the network attack evidence obtaining device of the present invention may be arranged in the network attack evidence obtaining device, and the network attack evidence obtaining device invokes the network attack evidence obtaining program stored in the memory 1005 through the processor 1001 and executes the network attack evidence obtaining method provided by the embodiment of the present invention.
An embodiment of the present invention provides a network attack forensics method, and referring to fig. 2, fig. 2 is a schematic flow diagram of a first embodiment of a network attack forensics method according to the present invention.
The network attack evidence obtaining method comprises the following steps:
Step S10: performing network monitoring on the equipment to be subjected to evidence obtaining, and collecting the full-flow log of that equipment with a target packet capturing tool.
It should be noted that this embodiment is executed by a terminal device, which may be an intelligent terminal such as a computer, a tablet or a mobile phone. The terminal device is equipped with a network attack evidence obtaining system that supports the whole forensic process of identification, collection, preservation, inspection, analysis and presentation: it defines the source of the evidence, collects and stores the data, processes the data, finds and analyzes the relevant evidence, and finally obtains the attack evidence and determines the attacker who initiated the attack.
It can be understood that the equipment to be subjected to evidence obtaining is an Internet of Things device that needs to be investigated to determine whether it is under a network attack. The target packet capturing tool is a network packet capturing tool capable of full-traffic collection, including but not limited to Wireshark, tcpdump, snifferpro and the like.
In a specific implementation, to ensure that the data is complete, full-traffic collection is performed on the target network. The equipment to be subjected to evidence obtaining is placed in the network under investigation so that it can be monitored, and the target packet capturing tool captures its network packets into a pcap file. This pcap file is the full-flow log of the equipment.
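As a minimal sketch of the capture step, assuming tcpdump is the chosen packet capturing tool, the helper below only builds the command line; it does not start a capture. The interface name `eth0` and the output file name are hypothetical placeholders, and running the resulting command normally requires elevated privileges.

```python
import shlex


def build_capture_command(interface, output_path, packet_limit=None):
    """Return a tcpdump argument list that writes captured packets to a pcap file."""
    command = [
        "tcpdump",
        "-i", interface,     # interface on which the monitored device's traffic is visible
        "-s", "0",           # snapshot length 0: use tcpdump's default length, not a short prefix
        "-w", output_path,   # write raw packets to this pcap file (the full-flow log)
    ]
    if packet_limit is not None:
        command += ["-c", str(packet_limit)]  # optional: stop after this many packets
    return command


# Hypothetical interface and file name, shown as a shell-ready string.
print(shlex.join(build_capture_command("eth0", "device_under_investigation.pcap")))
```

Building the argument list separately from executing it keeps the capture parameters easy to log alongside the evidence, and the list can be passed unchanged to `subprocess.run`.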
Step S20: encrypting the file of the full-flow log to obtain encrypted monitoring data.
It should be noted that the key to network forensics is recording the activities of attackers, and forensic analysis of those records is valuable only if the security and integrity of the data are ensured. However, user information is distributed among stakeholders, which easily leads to violations of user privacy, and the data is at risk of malicious tampering during collection, transmission and storage. Therefore, to ensure the security of the data, the collected full-flow log is encrypted with a preset encryption algorithm before transmission, and the resulting encrypted full-flow log is then stored. This encrypted full-flow log is the encrypted monitoring data. In this embodiment, the preset encryption algorithm may be SHA-256, which is strictly a cryptographic hash function that makes tampering detectable rather than a cipher that keeps data confidential, or it may be another algorithm; this embodiment does not limit the choice.
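Since SHA-256 is the algorithm the embodiment names, the sketch below shows the integrity side of this step: record a digest of the pcap file when it is collected, then recompute it later to detect tampering. This is an illustration under that assumption only. It does not encrypt anything, and confidentiality would need a separate cipher that the source does not specify. The function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large pcap files never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, recorded_digest):
    """True if the file still matches the digest recorded at collection time."""
    return sha256_of_file(path) == recorded_digest
```

A typical use is to call `sha256_of_file` right after capture, store the digest separately from the log, and call `is_unmodified` before the log is analyzed or presented as evidence.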
Step S30: performing model training on the initial recognition model according to a preset feature screening mode and the sample acquisition log to obtain a target attack recognition model.
It should be noted that the preset feature screening mode is a preset way of mining features from multi-source data, and the sample acquisition log is a sample full-flow log used for model training. The sample acquisition log comes from various sources and includes full-flow logs of various structure types. The initial recognition model is a deep neural network model, and training it according to the preset feature screening mode and the sample acquisition log yields the target attack recognition model. The target attack recognition model can perform attack identification on normalized traffic data to determine the network attack type present in that data.
Step S40: performing attack identification according to the encrypted monitoring data and the target attack recognition model, and determining the network attack type.
It should be noted that, the encrypted monitoring data is subjected to normalization processing to obtain processed encrypted monitoring data, and the processed encrypted monitoring data is input to the target attack recognition model, so that the target attack recognition model outputs the network attack type in the processed encrypted monitoring data.
It can be understood that, to ensure the accuracy of the identification result, the encrypted monitoring data needs to be preprocessed. Further, performing attack identification according to the encrypted monitoring data and the target attack recognition model to determine the network attack type includes: performing missing-value detection on the encrypted monitoring data to obtain a missing detection result; determining the missing type of the encrypted monitoring data when the missing detection result indicates that the encrypted monitoring data has data missing; performing data processing on the encrypted monitoring data according to the missing type to obtain first monitoring data; normalizing the first monitoring data to obtain second monitoring data; and inputting the second monitoring data to the target attack recognition model for attack identification, and determining the network attack type.
In a specific implementation, a missing-value check is performed on the encrypted monitoring data, and its outcome, whether or not data is missing, is the missing detection result. When the missing detection result shows that data is missing, the missing type of the encrypted monitoring data is determined. Missing types include but are not limited to missing completely at random, missing at random, and missing not at random.
It should be noted that when the missing type of the encrypted monitoring data is missing not at random, the encrypted monitoring data needs to be processed by interpolating the missing part or deleting it. The encrypted monitoring data after this processing is the first monitoring data.
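The two remedies named above, interpolation and deletion, can be sketched as follows. The sketch does not classify the missing mechanism, which generally cannot be decided from the data alone. The 50% deletion threshold, mean imputation for numeric fields and most-frequent-value imputation for text fields are all assumptions made for illustration, and the field names in the usage example are hypothetical.

```python
from collections import Counter


def handle_missing(records, drop_above=0.5):
    """Fill or delete missing values (None) in a list of flow records (dicts).

    A field missing in more than drop_above of the records is deleted;
    otherwise its gaps are filled. Threshold and fill rules are assumptions.
    """
    fields = sorted({key for record in records for key in record})
    cleaned = [dict(record) for record in records]  # leave the input untouched
    for field in fields:
        present = [r[field] for r in records if r.get(field) is not None]
        missing_ratio = 1 - len(present) / len(records)
        if missing_ratio == 0:
            continue
        if missing_ratio > drop_above or not present:
            for row in cleaned:
                row.pop(field, None)  # delete the whole feature
            continue
        if all(isinstance(v, (int, float)) for v in present):
            fill = sum(present) / len(present)  # mean of the observed values
        else:
            fill = Counter(present).most_common(1)[0][0]  # most frequent value
        for row in cleaned:
            if row.get(field) is None:
                row[field] = fill
    return cleaned
```

For instance, with four records in which `ttl` is present only once, `ttl` is deleted, while a single missing `bytes` value is replaced by the mean of the other three.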
It can be understood that most collected encrypted monitoring data is imbalanced, and the data types within it are not uniform: they include both numerical and text types, and text data cannot be used directly in calculations. Therefore, once the first monitoring data is obtained, it needs to be uniformly converted into the numerical type to obtain the second monitoring data.
In specific implementation, after the second monitoring data is obtained, the second monitoring data is input to the target attack identification model for attack identification, so that the network attack type is obtained.
It should be noted that, in order to ensure the accuracy of the data normalization processing, further, the normalizing the first monitoring data to obtain second monitoring data includes: when the data type of the first monitoring data is not a preset type, performing data mapping on the first monitoring data to obtain a conversion characteristic value; determining conversion vector data according to the conversion characteristic value; and carrying out data normalization processing on the conversion vector data to obtain second monitoring data.
It can be understood that the first monitoring data is analyzed and checked, and when its data type is not the preset type, the DictVectorizer class in sklearn is used to map the first monitoring data to a numeric vector space to obtain its feature vector values. These feature vector values are the conversion characteristic values. In this embodiment, the preset type is the numerical type.
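To make the mapping concrete, here is a pure-Python stand-in that mimics how scikit-learn's `DictVectorizer` treats dictionaries: string values become one-hot columns named `key=value`, numeric values pass through under their key, and columns are ordered by name. It is an illustration rather than the library's code, and the field names are hypothetical. With scikit-learn installed, the library equivalent would be `DictVectorizer(sparse=False).fit_transform(records)`.

```python
def fit_feature_names(records):
    """Collect sorted column names: 'key=value' for strings, 'key' for numbers."""
    names = set()
    for record in records:
        for key, value in record.items():
            names.add(f"{key}={value}" if isinstance(value, str) else key)
    return sorted(names)


def transform(records, feature_names):
    """Map each record to a numeric row aligned with feature_names."""
    index = {name: i for i, name in enumerate(feature_names)}
    rows = []
    for record in records:
        row = [0.0] * len(feature_names)
        for key, value in record.items():
            if isinstance(value, str):
                name, amount = f"{key}={value}", 1.0  # one-hot for text values
            else:
                name, amount = key, float(value)      # numbers pass through
            if name in index:  # values not seen during fitting are ignored
                row[index[name]] = amount
        rows.append(row)
    return rows
```

Fitting the column names once on the training logs and reusing them for monitored traffic keeps the model's input layout stable between training and identification.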
In a specific implementation, after the conversion characteristic value is obtained, it is used as a new feature, and the conversion vector data is obtained from this new feature. This preserves the implicit information in the target data set as far as possible and also anonymizes private information (such as IP addresses). The conversion vector data is then normalized so that the range of each feature is reduced to [0,1], ensuring that the neural network model is not biased towards a particular class. Finally, the normalized second monitoring data is exported. In this embodiment, the data normalization process is as follows: for the input sequence $\{x_1, x_2, \ldots, x_n\}$, the normalized calculation formula that maps $x_i$ to [0,1] is expressed as:
[Formula image: Figure SMS_1]
If the first monitoring data is of the preset type, data normalization processing is performed on it directly to obtain the second monitoring data.
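The normalization formula itself is an image that did not survive in this text. A common way to map a sequence to [0,1] is min-max scaling, and the sketch below assumes that this is what is meant:

$$x_i' = \frac{x_i - \min_j x_j}{\max_j x_j - \min_j x_j}$$

Returning 0.0 for a constant column, where the denominator would be zero, is a further assumption.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers to [0, 1] with min-max scaling.

    Assumes min-max scaling is the intended formula; the source formula
    is not available in the text.
    """
    low, high = min(values), max(values)
    if high == low:
        # A constant column has no spread; map it to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    span = high - low
    return [(v - low) / span for v in values]
```

For example, `min_max_normalize([2, 4, 6, 10])` yields `[0.0, 0.25, 0.5, 1.0]`. The function would be applied to each feature column separately.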
Step S50: completing the network attack evidence obtaining of the equipment to be subjected to evidence obtaining according to the network attack type.
It should be noted that the network attack suffered by the equipment to be subjected to evidence obtaining is determined from the identified network attack type, and the activities of the attacker are recorded based on the determined network attack type and the full-flow log, which completes the network attack evidence obtaining of the equipment.
In this embodiment, network monitoring is performed on the equipment to be subjected to evidence obtaining, and a full-flow log of that equipment is collected with a target packet capturing tool; the file of the full-flow log is encrypted to obtain encrypted monitoring data; model training is performed on the initial recognition model according to a preset feature screening mode and a sample acquisition log to obtain a target attack recognition model; attack identification is performed according to the encrypted monitoring data and the target attack recognition model to determine the network attack type; and the network attack evidence obtaining of the equipment is completed according to the network attack type. Because the full-flow log captured from the equipment is encrypted into encrypted monitoring data, the integrity and security of the data are protected. Attack identification is then performed on the encrypted monitoring data with the target attack recognition model, which is obtained by training the initial recognition model according to the preset feature screening mode and the sample acquisition log. The network attack type is thereby determined and the evidence obtaining is completed without leaking private information. At the same time, attacks in the network can be accurately found and tracked in a real scenario, and the completeness of the network attack evidence obtaining process is ensured.
Referring to fig. 3, fig. 3 is a flowchart illustrating a network attack forensics method according to a second embodiment of the invention.
Based on the first embodiment, the step S30 in the network attack evidence obtaining method of this embodiment includes:
Step S31: encrypting the file of the sample acquisition log to obtain sample encrypted data.
It should be noted that, to ensure the security of the data, all sample acquisition logs are encrypted with a preset encryption algorithm before transmission, and the resulting encrypted sample acquisition logs are then stored. These encrypted sample acquisition logs are the sample encrypted data. In this embodiment, the preset encryption algorithm may be SHA-256, which is strictly a cryptographic hash function that makes tampering detectable rather than a cipher that keeps data confidential, or it may be another algorithm; this embodiment does not limit the choice.
Step S32: performing data preprocessing on the sample encrypted data to obtain sample processing data.
It should be noted that, because the collected sample encrypted data comes from various sources and has inconsistent structures, it is preprocessed by handling missing data and normalizing the data. The preprocessed data is the sample processing data.
It is understood that missing data are handled as follows: a missing-value check is performed on the sample encrypted data, and if data are missing, the missing mechanism is determined, namely missing completely at random, missing at random, or missing not at random. If the missingness is not random, the missing values are interpolated or the feature is deleted.
The data normalization processing is as follows. Most of the acquired sample encrypted data suffer from data imbalance, so the data set needs to be normalized before the model is trained. In addition, the data types of the acquired sample encrypted data are not uniform: they include numeric and text types, and text data cannot be used directly in calculations, so all data need to be converted to numeric types. First, the acquired data set is analyzed and checked, and non-numeric data undergo feature vectorization. In this embodiment, the DictVectorizer method in sklearn maps the non-numeric data in the sample encrypted data to a numeric vector space, and the resulting feature vector values are taken as new features. This preserves as much of the implicit information in the sample encrypted data as possible and also anonymizes private information (e.g., IP addresses). Next, the numeric data are normalized so that each feature falls in the range [0,1], which ensures that the neural network model is not biased toward a particular class. Finally, the normalized sample processing data are derived.
The data set is normalized as follows: for the input sequence $\{x_1, x_2, \ldots, x_n\}$, the value $x_i$ is mapped to $[0,1]$ by the min-max normalization formula:
$$x_i' = \frac{x_i - \min_{1 \le j \le n} x_j}{\max_{1 \le j \le n} x_j - \min_{1 \le j \le n} x_j}$$
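The vectorization and normalization steps described above can be sketched in plain Python. This is a dependency-free stand-in that mirrors the `name=value` one-hot convention of sklearn's DictVectorizer and is not the sklearn implementation itself. The function names, the sample field names (`proto`, `bytes`) and the choice to map a constant column to 0.0 are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple, Union

Value = Union[str, int, float]


def vectorize(records: Sequence[Dict[str, Value]]) -> Tuple[List[str], List[List[float]]]:
    """One-hot encode string-valued fields and pass numeric fields through."""
    names = sorted({
        f"{key}={value}" if isinstance(value, str) else key
        for record in records
        for key, value in record.items()
    })
    index = {name: position for position, name in enumerate(names)}
    rows: List[List[float]] = []
    for record in records:
        row = [0.0] * len(names)
        for key, value in record.items():
            if isinstance(value, str):
                row[index[f"{key}={value}"]] = 1.0   # text value -> indicator feature
            else:
                row[index[key]] = float(value)        # numeric value kept as-is
        rows.append(row)
    return names, rows


def min_max(column: Sequence[float]) -> List[float]:
    """Scale one feature column to [0, 1]; a constant column maps to 0.0 (assumption)."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


records = [
    {"proto": "tcp", "bytes": 100},
    {"proto": "udp", "bytes": 300},
    {"proto": "tcp", "bytes": 200},
]
feature_names, matrix = vectorize(records)
scaled_bytes = min_max([row[feature_names.index("bytes")] for row in matrix])
```

Because text values become indicator columns rather than being stored verbatim, the model never operates on the raw strings, which is the sense in which vectorization can help anonymize fields such as addresses.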
Step S33: performing feature screening according to the sample processing data and a preset feature screening mode to obtain model training features.
It should be noted that the sample processing data contain a large number of features. To reduce the number of features and the computational overhead, a preset feature screening mode is used to mine the content of the sample processing data and construct a new feature set; selecting appropriate features and dividing the training and test data sets reasonably helps the classification model achieve better accuracy. In this embodiment, the preset feature screening mode is based on the Pearson correlation coefficient method, which measures the correlation between features well.
It can be understood that, in order to obtain a model training feature beneficial to model training based on sample processing data and a preset feature screening manner, further, performing feature screening according to the sample processing data and the preset feature screening manner to obtain a model training feature includes: acquiring a target characteristic variable and a preset characteristic threshold; performing correlation calculation according to the target characteristic variable and the sample processing data to obtain a correlation coefficient value; and performing feature screening on the sample processing data according to the correlation coefficient value and the preset feature threshold value to obtain model training features.
In specific implementation, the target characteristic variables refer to various preset target characteristics under different attack types, the preset characteristic threshold refers to a characteristic screening threshold, and correlation conditions between the network traffic characteristics and the target characteristic variables in the sample processing data are quantitatively calculated according to a Pearson correlation coefficient method to obtain correlation coefficient values. And selecting the flow characteristic of which the correlation coefficient value with the target characteristic variable in the sample processing data is greater than a preset characteristic threshold value as a model training characteristic of model training. In this embodiment, the preset characteristic threshold is 0.4, and may be other values, which is not limited in this embodiment.
It should be noted that, further, performing correlation calculation according to the target characteristic variable and the sample processing data to obtain a correlation coefficient value includes: determining a target characteristic variable mean according to the target characteristic variable; determining sample characteristic variables, a sample characteristic quantity and a sample variable mean according to the sample processing data; and performing correlation calculation according to the sample characteristic variables, the target characteristic variable mean, the sample characteristic quantity and the sample variable mean to obtain the correlation coefficient value.
It is understood that the target characteristic variable mean $\bar{y}$ is determined from the target characteristic variable $y$. The network traffic characteristics in the sample processing data, and the number of those characteristics, are determined from the sample processing data: the network traffic characteristics are the sample characteristic variables $x$, their number is the sample characteristic quantity $n$, and the sample variable mean $\bar{x}$ is determined from them. Correlation calculation is performed according to the sample characteristic variables, the target characteristic variable mean, the sample characteristic quantity and the sample variable mean to obtain the correlation coefficient value $r$:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$.
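The Pearson-based screening described above can be sketched as follows. The snippet uses the standard Pearson correlation coefficient and keeps features whose coefficient with the target variable exceeds the preset threshold of 0.4 named in this embodiment. The function names, the sample feature names and the choice to return 0.0 for a constant column are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return covariance / math.sqrt(spread_x * spread_y)


def screen_features(
    features: Dict[str, Sequence[float]],
    target: Sequence[float],
    threshold: float = 0.4,
) -> List[str]:
    """Keep the features whose correlation with the target exceeds the threshold."""
    return [name for name, column in features.items() if pearson(column, target) > threshold]


target = [0, 0, 1, 1]                      # hypothetical target characteristic variable
features = {
    "pkt_rate": [1, 2, 3, 4],              # rises with the target
    "noise": [1, -1, 1, -1],               # unrelated to the target
}
selected = screen_features(features, target)
```

The comparison follows the text literally and keeps only coefficients greater than the threshold; a variant that also keeps strongly negative correlations would compare the absolute value instead.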
Step S34: performing model training on the initial recognition model according to the model training features to obtain a target attack recognition model.
It should be noted that, after obtaining model training features through data mining, model training is performed on the initial recognition model based on the model training features to obtain a target attack recognition model, and further, model training is performed on the initial recognition model according to the model training features to obtain the target attack recognition model, which includes: determining a model training set and a model testing set according to the model training characteristics, a preset attack type and the characteristic quantity of the model training characteristics; performing model training on the initial recognition model according to a model training set and a preset loss function to obtain a model to be tested; performing model test on the model to be tested according to the model test set to obtain a test result; and obtaining a target attack identification model according to the test result and the model to be tested.
It can be understood that, to address data imbalance, for each preset attack type a preset number of model training features is randomly selected to form the model training set, and the remaining model training features of that attack type form the model test set. If the number of model training features for a preset attack type is less than the preset number, the training set for that type is determined according to a preset weight. For example, in this embodiment the preset number is 5000 and the preset weight is 0.9. Preset attack type A has 8000 model training features, so its training set contains 5000 of them and its test set contains the remaining 3000. Preset attack type B has 4000 model training features, so its training set contains 4000 × 0.9 = 3600 of them and its test set contains the remaining 400.
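The per-attack-type split can be sketched as below, using the preset number 5000 and preset weight 0.9 from the example. The function name `split_by_attack_type`, the fixed random seed and the use of integer placeholders as samples are assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple


def split_by_attack_type(
    samples: Dict[str, List[int]],
    preset_number: int = 5000,
    preset_weight: float = 0.9,
    seed: int = 0,
) -> Tuple[Dict[str, List[int]], Dict[str, List[int]]]:
    """Split each attack type's samples into a training part and a test part."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    train: Dict[str, List[int]] = {}
    test: Dict[str, List[int]] = {}
    for attack_type, items in samples.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        if len(shuffled) < preset_number:
            # Too few samples: take a weighted share instead of the preset number.
            n_train = round(len(shuffled) * preset_weight)
        else:
            n_train = preset_number
        train[attack_type] = shuffled[:n_train]
        test[attack_type] = shuffled[n_train:]
    return train, test


samples = {"A": list(range(8000)), "B": list(range(4000))}
train_set, test_set = split_by_attack_type(samples)
sizes = {name: (len(train_set[name]), len(test_set[name])) for name in samples}
```

With these inputs the sizes reproduce the figures in the example: 5000 and 3000 for type A, 3600 and 400 for type B.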
In a specific implementation, the predetermined loss function is
Figure SMS_8
Wherein $y_i$ is the output for model training feature $x_i$, and $y_i \in \{0,1\}$. The initial recognition model is a deep neural network model. A deep neural network generalizes well: beyond the data it was trained on, it can detect with high precision new data that are similar to the training data. In this embodiment, 10-fold cross-validation and 100 iterations are adopted, and a 7-layer deep neural network model is constructed to recognize network attacks. The model comprises an input layer, an output layer and 5 hidden layers, as shown in fig. 4. The input layer receives the data and feeds them to the network, the hidden layers are used to optimize the network, and the number of neurons matches the attributes of the inputs and outputs. Each node is connected to the nodes of the next layer, forming a fully connected neural network, and the multiple hidden layers allow the features of the data to be separated better. Furthermore, in this embodiment softmax is used as the activation function that determines the activity of the hidden-layer neurons, and it can also be used for classification at the output layer. Meanwhile, the Adam optimization method, which is based on stochastic gradient descent, is adopted to improve training efficiency, and a multi-class cross-entropy function is used as the preset loss function to train the best-performing neural network.
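For orientation, the softmax function and a multi-class cross-entropy loss can be written in a few lines of plain Python. This sketch uses the textbook definitions, which may differ in notation from the formula of this embodiment; the function names and the clipping constant `eps` are assumptions, and a real 7-layer network would normally be built with a deep learning framework.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(one_hot: Sequence[int], probabilities: Sequence[float], eps: float = 1e-12) -> float:
    """Multi-class cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(
        label * math.log(max(probability, eps))  # clip to avoid log(0)
        for label, probability in zip(one_hot, probabilities)
    )


uniform = softmax([0.0, 0.0])               # two equally likely classes
loss_uniform = cross_entropy([1, 0], uniform)
confident = softmax([10.0, 0.0, 0.0])       # strongly favours the first class
loss_confident = cross_entropy([1, 0, 0], confident)
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability falls, which is what the optimizer minimizes during training.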
It can be understood that model training is performed on the initial identification model according to the model training set and the preset loss function to obtain a model to be tested, model testing is performed on the model to be tested through the model testing set to obtain a testing result of the model to be tested, and when the testing result of the model to be tested is that the model to be tested meets the attack identification requirement, the model to be tested is taken as a target attack identification model.
In a specific implementation, as shown in fig. 5, full-flow acquisition is performed with network packet capturing tools such as Wireshark, tcpdump and Sniffer Pro to obtain a large number of sample acquisition logs. The sample acquisition logs are encrypted, transmitted and stored to obtain the sample encrypted data. The sample encrypted data are then decrypted and passed through data preprocessing and feature mining to obtain model training features usable for model training. The initial recognition model is trained on these features, which finally yields a target attack recognition model usable for network attack recognition.
In the embodiment, the sample acquisition log is subjected to file encryption to obtain sample encrypted data; carrying out data preprocessing on the sample encrypted data to obtain sample processing data; performing characteristic screening according to the sample processing data and a preset characteristic screening mode to obtain model training characteristics; and performing model training on the initial recognition model according to the model training characteristics to obtain a target attack recognition model. By the method, the sample collection logs are encrypted and preprocessed to obtain sample processing data, feature screening is carried out based on the sample processing data and a preset feature screening mode to obtain model training features with strong correlation with the detected target, model training is carried out on the initial recognition model based on the model training features to obtain the target attack recognition model, the performance of the target attack recognition model is guaranteed, and a strong foundation is laid for subsequent network attack evidence obtaining.
In addition, referring to fig. 6, an embodiment of the present invention further provides a network attack forensics apparatus, where the network attack forensics apparatus includes:
an acquisition module 10, configured to perform network monitoring on the equipment to be subjected to evidence obtaining and to collect the full-flow log of that equipment with a target packet capturing tool;
an encryption module 20, configured to encrypt the file of the full-flow log to obtain encrypted monitoring data;
a training module 30, configured to perform model training on the initial recognition model according to the preset feature screening mode and the sample acquisition log to obtain a target attack recognition model;
an identification module 40, configured to perform attack identification according to the encrypted monitoring data and the target attack recognition model and to determine the network attack type;
and a completion module 50, configured to complete network attack evidence obtaining for the equipment to be subjected to evidence obtaining according to the network attack type.
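The way the five modules hand data to one another can be sketched as a small pipeline object. This is only a structural illustration: the class name `ForensicsPipeline`, the callable signatures and the stub implementations (including the byte-reversal placeholder, which is not real encryption) are all assumptions and do not come from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Model = Callable[[bytes], str]


@dataclass
class ForensicsPipeline:
    acquire: Callable[[str], bytes]            # acquisition module 10
    encrypt: Callable[[bytes], bytes]          # encryption module 20
    train: Callable[[], Model]                 # training module 30
    identify: Callable[[Model, bytes], str]    # identification module 40
    complete: Callable[[str, str], Dict[str, str]]  # completion module 50

    def run(self, device: str) -> Dict[str, str]:
        full_flow_log = self.acquire(device)
        monitoring_data = self.encrypt(full_flow_log)
        model = self.train()
        attack_type = self.identify(model, monitoring_data)
        return self.complete(device, attack_type)


# Stub modules used only to show the data flow.
pipeline = ForensicsPipeline(
    acquire=lambda device: b"raw-log:" + device.encode(),
    encrypt=lambda log: log[::-1],  # placeholder transform, NOT real encryption
    train=lambda: (lambda data: "ddos" if b"host" in data[::-1] else "benign"),
    identify=lambda model, data: model(data),
    complete=lambda device, attack: {"device": device, "attack_type": attack},
)
report = pipeline.run("host-1")
```

Keeping each module behind a plain callable makes it straightforward to replace a stub with a real packet capture, cipher or trained classifier without changing the flow.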
In this embodiment, network monitoring is performed on the equipment to be subjected to evidence obtaining, and a full-flow log of that equipment is collected with a target packet capturing tool; the full-flow log is file-encrypted to obtain encrypted monitoring data; an initial recognition model is trained according to a preset feature screening mode and a sample acquisition log to obtain a target attack recognition model; attack identification is performed according to the encrypted monitoring data and the target attack recognition model to determine the network attack type; and network attack evidence obtaining for the equipment is completed according to the network attack type. In this way, encrypting the full-flow log captured from the equipment protects the integrity and security of the data. The target attack recognition model, obtained by training the initial recognition model according to the preset feature screening mode and the sample acquisition log, then performs attack identification on the encrypted monitoring data and determines the network attack type, which completes the network attack evidence obtaining for the equipment. Privacy is not leaked, attacks in the network can be accurately found and tracked in a real scene, and the integrity of the network attack evidence obtaining process is guaranteed.
In an embodiment, the training module 30 is further configured to perform file encryption on the sample acquisition log to obtain sample encrypted data;
carrying out data preprocessing on the sample encrypted data to obtain sample processing data;
performing characteristic screening according to the sample processing data and a preset characteristic screening mode to obtain model training characteristics;
and performing model training on the initial recognition model according to the model training characteristics to obtain a target attack recognition model.
In an embodiment, the training module 30 is further configured to obtain a target feature variable and a preset feature threshold;
performing correlation calculation according to the target characteristic variable and the sample processing data to obtain a correlation coefficient value;
and performing feature screening on the sample processing data according to the correlation coefficient value and the preset feature threshold value to obtain model training features.
In an embodiment, the training module 30 is further configured to determine a model training set and a model testing set according to the model training features, a preset attack type, and the feature quantity of the model training features;
performing model training on the initial recognition model according to a model training set and a preset loss function to obtain a model to be tested;
performing model test on the model to be tested according to the model test set to obtain a test result;
and obtaining a target attack identification model according to the test result and the model to be tested.
In an embodiment, the training module 30 is further configured to determine a target feature variable mean according to the target feature variable;
determining sample characteristic variables, sample characteristic quantities and sample variable mean values according to the sample processing data;
and performing correlation calculation according to the sample characteristic variable, the target characteristic variable mean value, the sample characteristic quantity and the sample variable mean value to obtain the correlation coefficient value.
In an embodiment, the identification module 40 is further configured to perform missing-data detection on the encrypted monitoring data to obtain a missing detection result;
determining the missing type of the encrypted monitoring data when the missing detection result indicates that the encrypted monitoring data has data missing;
performing data processing on the encrypted monitoring data according to the missing type to obtain first monitoring data;
normalizing the first monitoring data to obtain second monitoring data;
and inputting the second monitoring data to the target attack identification model for attack identification, and determining the network attack type.
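The steps performed by the identification module can be sketched as a short preprocessing chain in front of the model. The sketch simplifies the "missing type" handling to column-mean imputation, which is an assumption, and the function names, the toy data and the threshold-based stand-in model are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Row = Sequence[Optional[float]]


def has_missing(rows: Sequence[Row]) -> bool:
    """Missing-data detection: True if any value is absent."""
    return any(value is None for row in rows for value in row)


def impute_column_means(rows: Sequence[Row]) -> List[List[float]]:
    """Fill each missing value with the mean of its column (assumed strategy)."""
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[c] if row[c] is None else float(row[c]) for c in range(n_cols)] for row in rows]


def normalize_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Min-max scale every column to [0, 1]; constant columns map to 0.0."""
    scaled_cols = []
    for column in zip(*rows):
        low, span = min(column), max(column) - min(column)
        scaled_cols.append([0.0 if span == 0 else (v - low) / span for v in column])
    return [list(row) for row in zip(*scaled_cols)]


def identify(rows: Sequence[Row], model: Callable[[List[float]], str]) -> List[str]:
    """Missing-data handling, then normalization, then model inference."""
    first = impute_column_means(rows) if has_missing(rows) else [[float(v) for v in r] for r in rows]
    second = normalize_columns(first)
    return [model(row) for row in second]


monitoring_rows = [[1.0, None], [3.0, 10.0], [None, 20.0]]
labels = identify(monitoring_rows, lambda row: "attack" if row[0] > 0.75 else "normal")
```

Here `first` and `second` correspond to the first and second monitoring data of this embodiment; a trained target attack recognition model would take the place of the lambda.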
In an embodiment, the identification module 40 is further configured to perform data mapping on the first monitoring data to obtain a conversion characteristic value when the data type of the first monitoring data is not a preset type;
determining conversion vector data according to the conversion characteristic value;
and carrying out data normalization processing on the conversion vector data to obtain second monitoring data.
Since the present apparatus employs all technical solutions of all the above embodiments, at least all the beneficial effects brought by the technical solutions of the above embodiments are achieved, and are not described in detail herein.
In addition, an embodiment of the present invention further provides a storage medium, where a network attack forensics program is stored on the storage medium, and when the network attack forensics program is executed by a processor, the steps of the network attack forensics method described above are implemented.
Since the storage medium adopts all technical solutions of all the embodiments, at least all the beneficial effects brought by the technical solutions of the embodiments are achieved, and no further description is given here.
It should be noted that the above-mentioned work flows are only illustrative and do not limit the scope of the present invention, and in practical applications, those skilled in the art may select some or all of them according to actual needs to implement the purpose of the solution of the present embodiment, and the present invention is not limited herein.
In addition, the technical details that are not described in detail in this embodiment may refer to the network attack forensics method provided in any embodiment of the present invention, and are not described herein again.
Furthermore, it should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better embodiment. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g. Read Only Memory (ROM)/RAM, magnetic disk, optical disk), and includes several instructions for enabling a terminal device (e.g. a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A network attack evidence obtaining method is characterized by comprising the following steps:
network monitoring is carried out on equipment to be subjected to evidence obtaining, and a full-flow log of the equipment to be subjected to evidence obtaining is collected according to a target packet capturing tool;
encrypting the file of the full-flow log to obtain encrypted monitoring data;
performing model training on the initial recognition model according to a preset feature screening mode and a sample acquisition log to obtain a target attack recognition model;
carrying out attack identification according to the encrypted monitoring data and the target attack identification model, and determining the network attack type;
and completing the network attack evidence obtaining of the equipment to be subjected to evidence obtaining according to the network attack type.
2. The network attack evidence obtaining method according to claim 1, wherein the model training of the initial recognition model according to the preset feature screening mode and the sample collection log to obtain the target attack recognition model comprises:
encrypting the file of the sample acquisition log to obtain sample encrypted data;
carrying out data preprocessing on the sample encrypted data to obtain sample processing data;
performing feature screening according to the sample processing data and a preset feature screening mode to obtain model training features;
and performing model training on the initial recognition model according to the model training characteristics to obtain a target attack recognition model.
3. The network attack evidence obtaining method according to claim 2, wherein the performing feature screening according to the sample processing data and a preset feature screening manner to obtain model training features comprises:
acquiring a target characteristic variable and a preset characteristic threshold;
performing correlation calculation according to the target characteristic variable and the sample processing data to obtain a correlation coefficient value;
and performing feature screening on the sample processing data according to the correlation coefficient value and the preset feature threshold value to obtain model training features.
4. The network attack evidence obtaining method according to claim 2, wherein the performing model training on the initial recognition model according to the model training features to obtain a target attack recognition model comprises:
determining a model training set and a model testing set according to the model training characteristics, a preset attack type and the characteristic quantity of the model training characteristics;
performing model training on the initial recognition model according to a model training set and a preset loss function to obtain a model to be tested;
performing model test on the model to be tested according to the model test set to obtain a test result;
and obtaining a target attack identification model according to the test result and the model to be tested.
5. The network attack forensics method according to claim 3, wherein the performing correlation calculation according to the target characteristic variable and the sample processing data to obtain a correlation coefficient value includes:
determining a target characteristic variable mean value according to the target characteristic variable;
determining sample characteristic variables, sample characteristic quantities and sample variable mean values according to the sample processing data;
and performing correlation calculation according to the sample characteristic variable, the target characteristic variable mean value, the sample characteristic quantity and the sample variable mean value to obtain the correlation coefficient value.
6. The method for obtaining evidence of network attack according to claim 1, wherein the determining the type of network attack by performing attack recognition according to the encrypted monitoring data and the target attack recognition model comprises:
performing missing-data detection on the encrypted monitoring data to obtain a missing detection result;
when the missing detection result indicates that the encrypted monitoring data has data missing, determining the missing type of the encrypted monitoring data;
performing data processing on the encrypted monitoring data according to the missing type to obtain first monitoring data;
normalizing the first monitoring data to obtain second monitoring data;
and inputting the second monitoring data to the target attack identification model for attack identification, and determining the network attack type.
7. The method of claim 6, wherein the normalizing the first monitoring data to obtain second monitoring data comprises:
when the data type of the first monitoring data is not a preset type, performing data mapping on the first monitoring data to obtain a conversion characteristic value;
determining conversion vector data according to the conversion characteristic value;
and carrying out data normalization processing on the conversion vector data to obtain second monitoring data.
8. A cyber attack forensics apparatus, comprising:
the acquisition module is used for carrying out network monitoring on the equipment to be subjected to evidence obtaining and acquiring a full-flow log of the equipment to be subjected to evidence obtaining according to a target packet capturing tool;
the encryption module is used for encrypting the file of the full-flow log to obtain encrypted monitoring data;
the training module is used for carrying out model training on the initial recognition model according to a preset feature screening mode and the sample acquisition log to obtain a target attack recognition model;
the identification module is used for carrying out attack identification according to the encrypted monitoring data and the target attack identification model and determining the network attack type;
and the completion module is used for completing the network attack evidence obtaining of the equipment to be subjected to evidence obtaining according to the network attack type.
9. A cyber attack forensics apparatus, the apparatus comprising: a memory, a processor, and a cyber attack forensics program stored on the memory and executable on the processor, the cyber attack forensics program configured to implement the cyber attack forensics method of any of claims 1 to 7.
10. A storage medium having a cyber attack forensic program stored thereon, the cyber attack forensic program when executed by a processor implementing the cyber attack forensic method according to any of claims 1 to 7.
CN202310259847.0A 2023-03-17 2023-03-17 Network attack evidence obtaining method, device, equipment and storage medium Active CN115987687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310259847.0A CN115987687B (en) 2023-03-17 2023-03-17 Network attack evidence obtaining method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115987687A true CN115987687A (en) 2023-04-18
CN115987687B CN115987687B (en) 2023-05-26

Family

ID=85968492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310259847.0A Active CN115987687B (en) 2023-03-17 2023-03-17 Network attack evidence obtaining method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115987687B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140149327A1 (en) * 2012-10-23 2014-05-29 Icf International Method and apparatus for monitoring network traffic
US20180165597A1 (en) * 2016-12-08 2018-06-14 Resurgo, Llc Machine Learning Model Evaluation in Cyber Defense
CN113014529A (en) * 2019-12-19 2021-06-22 北京数安鑫云信息技术有限公司 Network attack identification method, device, medium and equipment
CN113469234A (en) * 2021-06-24 2021-10-01 成都卓拙科技有限公司 Network flow abnormity detection method based on model-free federal meta-learning
CN113723440A (en) * 2021-06-17 2021-11-30 北京工业大学 Encrypted TLS application traffic classification method and system on cloud platform
CN113923026A (en) * 2021-10-11 2022-01-11 广州大学 Encrypted malicious flow detection model based on TextCNN and construction method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
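Sun Bo, Sun Yufang, Zhang Xiangfeng, Liang Bin: "Research and implementation of a protection mechanism for an electronic data evidence collection system", Acta Electronica Sinica *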

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116886379A (en) * 2023-07-21 2023-10-13 鹏城实验室 Network attack reconstruction method, model training method and related devices
CN116886379B (en) * 2023-07-21 2024-05-14 鹏城实验室 Network attack reconstruction method, model training method and related devices

Also Published As

Publication number Publication date
CN115987687B (en) 2023-05-26

Tangi et al. A novel mechanism for development of intrusion detection system with BPNN
Selvam et al. An Improving Intrusion Detection Model Based on Novel CNN Technique Using Recent CIC-IDS Datasets
Wu et al. Research on the Impact of Attacks on Security Characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant