WO2022123623A1 - Information processing device, information processing method, and information processing program
- Publication number
- WO2022123623A1 (PCT/JP2020/045452)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
Description
- This disclosure relates to attack detection technology.
- Rule-based detection techniques have long been well known as techniques for monitoring cyber attacks. Rule-based detection techniques use rules (signatures) that detect known fraudulent patterns, or rules that detect attack techniques or attacker behavior. However, as attacks become more sophisticated and unknown attacks increase, it becomes difficult to define rules in advance, which places a burden on SOC (Security Operation Center) staff. In addition, the rules must be adjusted manually for each system to be monitored, and rule-based detection technology is approaching its limits. Therefore, an advanced detection technique that does not require rules to be defined in advance, or that automatically determines the boundary between normal and abnormal, is desired. Artificial Intelligence (hereinafter abbreviated as AI), such as machine learning, can be considered a technique for realizing such an advanced detection technique.
- AI learns from data of multiple classes prepared in advance and automatically finds the boundaries that separate the classes. If a large amount of data can be prepared for each class, AI can find the boundaries properly. If AI can be applied to the monitoring of cyber attacks, it is expected to replace the rule definition and updating that has been done by staff with specialized knowledge and skills.
- In network security, however, it is difficult to prepare the large amount of data for each class that matters most to AI. Attacks are rare, and it is very difficult to prepare a large amount of attack data for learning. Therefore, the attack data must be augmented in some way and used for learning. Moreover, attackers' capabilities are improving day by day.
- Patent Document 1 discloses a technique for automatically generating a sophisticated attack sample designed to have characteristics similar to those in a normal state for evaluation of a security product.
- Here, a false detection is called a false positive (FP), and a detection omission is called a false negative (FN).
- A technique for automatically generating FNs is also disclosed. In that technique, the attack feature vector is changed so as to cross the decision boundary of a normal model that has learned the behavior of normal data.
- Both techniques also generate attacks in a simulated environment so that the attacks have features corresponding to the feature vectors that cross the boundary. According to both techniques, a realistic and sophisticated attack can be generated by confirming the environment, constraints on the attack, and the presence or absence of an attack function.
- In these techniques, the feature vector is modified so as to straddle the decision boundary of the detection system in the feature space, and a sample that avoids detection is searched for. For this reason, it has been difficult to apply these techniques efficiently to attack detection techniques whose feature space is non-linear and high-dimensional.
- In general, the higher the accuracy of an attack detection system, the higher the dimensionality and non-linearity of its feature space, and the more difficult it is to convert a representation in the feature space back into information in the real space. Therefore, it is difficult to obtain a sample that exists in the real space from an attack sample found in the feature space that avoids detection, and the search is expected to be ad hoc and inefficient.
- The main purpose of this disclosure is therefore to enable an efficient search for attacks that can actually exist in the real space and that avoid detection.
- The information processing device according to this disclosure has an extraction unit that extracts a true positive access, which is an access known to be an access for attack purposes and determined by the attack detection system to be an access for attack purposes, and a correction unit that corrects the characteristics of the true positive access by using the characteristics of a true negative access, which is an access known to be a normal access and determined by the attack detection system to be a normal access.
- FIG. 1 is a diagram showing a hardware configuration example of the attack log generation device according to Embodiment 1.
- FIG. A flowchart showing an operation example of the attack log generation device according to Embodiment 1.
- An access that is known to be a normal access and that the attack detection system determines to be a normal access is referred to as a true negative access.
- An access that is known to be an attack access and that the attack detection system determines to be an attack access is called a true positive access. An attack access is an access made for the purpose of attack.
- An access that is known to be an attack access but that the attack detection system mistakenly determines to be a normal access is called a false negative access. A false negative access is an access that causes a detection omission.
- Conversely, an access that is known to be a normal access but is erroneously determined by the attack detection system to be an attack access is called a false positive access. A false positive access is an access that causes a false detection.
- a normal log showing the characteristics of normal access and an attack log showing the characteristics of attack access are used.
- the normal log contains multiple fields, and each field describes a value that represents the characteristics of normal access.
- the attack log contains the same plurality of fields as the normal log, and each field describes a value representing the characteristics of attack access.
- the log of the true negative access is extracted from the normal log, and the log of the true positive access is extracted from the attack log.
- the characteristics of true negative access are used to modify the characteristics of true positive access.
- the characteristics of the true positive access are modified so that the attack detection system determines that the modified true positive access, which is the true positive access after the characteristics are modified, is the normal access.
- the characteristics of the true positive access are modified so that the true positive access becomes a false negative access.
- an attack sample of the false negative access that can avoid the detection by the attack detection system is obtained.
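The outline above (extract true positives and true negatives, then correct the true positive's characteristics using a true negative's) can be sketched as follows. This is a hedged illustration only: the log format, the sum-threshold detector, and the blending rule are invented assumptions, not the patent's concrete method.

```python
# Sketch of the extraction and correction units described above.
# Logs are dicts with a known "label" and a numeric feature list (assumed).
def extract(logs, detect):
    """Split labeled logs into true positives and true negatives."""
    true_pos = [l for l in logs if l["label"] == "attack" and detect(l) == "attack"]
    true_neg = [l for l in logs if l["label"] == "normal" and detect(l) == "normal"]
    return true_pos, true_neg

def correct(true_pos_log, true_neg_log, weight=0.5):
    """Blend the true positive's features toward the true negative's."""
    mixed = [(1 - weight) * a + weight * n
             for a, n in zip(true_pos_log["features"], true_neg_log["features"])]
    return {"label": "attack", "features": mixed}

# Toy detector: flags an access as attack when its feature sum exceeds 1.0.
detect = lambda log: "attack" if sum(log["features"]) > 1.0 else "normal"
logs = [{"label": "attack", "features": [0.9, 0.9]},
        {"label": "normal", "features": [0.1, 0.1]}]
tp, tn = extract(logs, detect)
corrected = correct(tp[0], tn[0])
print(detect(corrected))  # prints "normal": the corrected access evades the detector
```

The corrected true positive thus becomes a false negative access, which is the attack sample the embodiment aims to obtain.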
- FIG. 1 shows a hardware configuration example of the attack log generation device 100 according to the present embodiment.
- FIG. 2 shows an example of the functional configuration of the attack log generation device 100 according to the present embodiment.
- the attack log generation device 100 according to this embodiment is a computer.
- the attack log generation device 100 corresponds to an information processing device.
- the operation procedure of the attack log generation device 100 corresponds to the information processing method.
- the program that realizes the operation of the attack log generation device 100 corresponds to an information processing program.
- the attack log generation device 100 includes a processor 901, a main storage device 902, an auxiliary storage device 903, a keyboard 904, a mouse 905, and a display device 906 as hardware.
- The auxiliary storage device 903 stores programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 shown in FIG. 2. These programs are loaded from the auxiliary storage device 903 into the main storage device 902.
- FIG. 3 schematically shows a state in which the processor 901 is executing the programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106.
- The detection avoidance attack log DB (Database) 111, the normal log DB 112, the attack log DB 113, the normal log statistical information DB 114, the true negative normal log DB 115, the neighborhood true negative normal log DB 116, and the true negative normal log tendency DB 117 shown in FIG. 2 are implemented by the main storage device 902 or the auxiliary storage device 903.
- the keyboard 904 and the mouse 905 receive instructions from the user of the attack log generation device 100.
- the display device 906 displays various information to the user of the attack log generation device 100.
- the attack log generation device 100 may include a communication device.
- the normal classification unit 101 extracts true negative access. More specifically, the normal classification unit 101 extracts a true negative normal log determined to be normal by the detection unit 102 from the normal log in the normal log DB 112.
- The normal log describes the characteristics of a normal access in multiple fields. That is, the normal log defines a normal access. Therefore, the true negative access is defined by the normal log extracted by the normal classification unit 101 (hereinafter referred to as the true negative normal log).
- the normal classification unit 101 stores the extracted true negative normal log in the true negative normal log DB 115.
- the normal classification unit 101 extracts true positive access from the attack log.
- the normal classification unit 101 extracts a true positive attack log determined to be an attack by the detection unit 102 from the attack log in the attack log DB 113.
- In the attack log, the characteristics of the attack access generated by the attack generation unit 103 are described in a plurality of fields. That is, the attack log defines the attack access. Therefore, the true positive access is defined by the attack log extracted by the normal classification unit 101 (hereinafter referred to as the true positive attack log).
- the normal classification unit 101 outputs the extracted true positive attack log to the neighborhood extraction unit 104.
- the normal classification unit 101 corresponds to the extraction unit. Further, the process performed by the normal classification unit 101 corresponds to the extraction process.
- the detection unit 102 functions as an attack detection system. More specifically, the detection unit 102 detects attack access by using machine learning. As described above, the normal log determined to be normal by the detection unit 102 is stored in the true negative normal log DB 115 as a true negative normal log by the normal classification unit 101. Further, the attack log determined to be an attack by the detection unit 102 is output to the neighborhood extraction unit 104 as a true positive attack log by the normal classification unit 101.
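For illustration only, the role of the detection unit 102 can be sketched with a toy model. The patent does not fix a particular machine learning algorithm; the nearest-centroid rule and the vectors below are assumptions standing in for a trained detection model.

```python
from math import dist

# Toy stand-in for the detection unit 102: a nearest-centroid classifier
# over feature vectors. Centroids and inputs are illustrative assumptions.
class Detector:
    def __init__(self, normal_centroid, attack_centroid):
        self.normal = normal_centroid
        self.attack = attack_centroid

    def classify(self, feature_vector):
        """Return 'normal' or 'attack' for one log's feature vector."""
        d_normal = dist(feature_vector, self.normal)
        d_attack = dist(feature_vector, self.attack)
        return "normal" if d_normal <= d_attack else "attack"

detector = Detector(normal_centroid=[0.0, 0.0], attack_centroid=[1.0, 1.0])
print(detector.classify([0.1, 0.2]))  # prints "normal": closer to the normal centroid
print(detector.classify([0.9, 0.8]))  # prints "attack": closer to the attack centroid
```

In the embodiment's terms, logs the detector classifies as normal become true negative normal log candidates, and attack logs it classifies as attack become true positive attack logs.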
- the attack generation unit 103 generates an attack access according to the attack scenario. Then, the attack generation unit 103 stores a log representing the characteristics of the attack access as an attack log in the attack log DB 113.
- the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 are collectively referred to as a correction unit 107.
- The correction unit 107 modifies the characteristics of the true positive access extracted by the normal classification unit 101 by using the characteristics of the true negative access extracted by the normal classification unit 101. More specifically, the correction unit 107 modifies the characteristics of the true positive access so that the detection unit 102 determines that the corrected true positive access, which is the true positive access after its characteristics are corrected, is a normal access. Further, when the detection unit 102 determines that the corrected true positive access is an attack access, the correction unit 107 corrects the characteristics of the corrected true positive access again by using the characteristics of the true negative access. The process performed by the correction unit 107 corresponds to the correction process.
- The neighborhood extraction unit 104 extracts, from the true negative normal log DB 115, a true negative normal log in the vicinity of the true positive attack log extracted by the normal classification unit 101 (hereinafter referred to as a neighborhood true negative normal log). More specifically, the neighborhood extraction unit 104 extracts, as the neighborhood true negative normal log, the true negative normal logs in the true negative normal log DB 115 whose characteristics are similar to those of the true positive attack log.
- the tendency extraction unit 105 calculates the importance (feature impact) of each of the plurality of features of the true positive attack log.
- the tendency extraction unit 105 calculates the importance of each of the plurality of features of the true positive attack log so that the features having a high degree of distinction between the true negative access and the true positive access become more important. Further, the tendency extraction unit 105 selects a feature whose importance matches the selection condition from a plurality of features of the true positive attack log.
- the feature correction unit 106 corrects the characteristics of the true positive attack log selected by the tendency extraction unit 105 by using the corresponding characteristics of the true negative normal log. Then, the feature correction unit 106 stores the corrected true positive attack log as a detection avoidance attack log in the detection avoidance attack log DB 111.
- the detection avoidance attack log DB111 stores the detection avoidance attack log.
- the normal log DB 112 stores the normal log.
- the attack log DB 113 stores the attack log.
- The normal log statistical information DB 114 stores statistical information about the normal log (hereinafter referred to as normal log statistical information).
- the true negative normal log DB 115 stores the true negative normal log.
- the neighborhood true negative normal log DB 116 stores the neighborhood true negative normal log extracted by the neighborhood extraction unit 104.
- The true negative normal log tendency DB 117 stores the tendency of the neighborhood true negative normal log extracted by the neighborhood extraction unit 104 (hereinafter referred to as the true negative normal log tendency).
- The normal classification unit 101 extracts a true negative normal log from the normal log (step S1_1). Specifically, the detection unit 102 analyzes a large amount of normal logs stored in advance in the normal log DB 112 and determines whether the access defined in each normal log corresponds to a normal access or an attack access. Then, the normal classification unit 101 extracts the normal logs determined by the detection unit 102 to be normal accesses as true negative normal logs and stores them in the true negative normal log DB 115.
- the attack generation unit 103 executes the attack and generates an attack log (step S1_2). That is, the attack generation unit 103 makes an attack access and generates an attack log showing the characteristics of the attack access. Then, the attack generation unit 103 stores the generated attack log in the attack log DB 113.
- the detection unit 102 analyzes the attack log and determines whether the access defined in the attack log corresponds to normal access or attack access (step S1_3).
- If the detection unit 102 determines that the access defined in the attack log corresponds to a normal access (NO in step S1_3), the process proceeds to step S1_8.
- In step S1_8, the access determined by the detection unit 102 to correspond to a normal access is an attack access (false negative access) that can avoid detection by the detection unit 102, so the normal classification unit 101 stores the corresponding attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log.
- On the other hand, if the detection unit 102 determines that the access defined in the attack log corresponds to an attack access (YES in step S1_3), the process proceeds to step S1_4. That is, the access determined by the detection unit 102 to correspond to an attack access is an attack access (true positive access) detected by the detection unit 102. Therefore, the characteristics of the true positive attack log need to be modified so that detection by the detection unit 102 can be avoided.
- In step S1_4, the neighborhood extraction unit 104 extracts the neighborhood true negative normal log. That is, the neighborhood extraction unit 104 extracts, from the true negative normal log DB 115, the true negative normal log in the vicinity of the attack log (true positive attack log) obtained in step S1_3. Details of step S1_4 will be described later.
- the tendency extraction unit 105 calculates the tendency of the characteristics of the neighborhood true negative normal log acquired in step S1_4 (step S1_5).
- the feature correction unit 106 corrects the true positive attack log so that the true positive attack log includes the tendency of the characteristics of the near true negative normal log (step S1_6). That is, the feature correction unit 106 corrects each field of the true positive attack log so that the tendency of the characteristics of the neighborhood true negative normal log calculated by the tendency extraction unit 105 is included.
- The detection unit 102 determines whether the access (corrected true positive access) defined in the true positive attack log after correction by the feature correction unit 106 (corrected true positive attack log) corresponds to a normal access or an attack access (step S1_7).
- If the detection unit 102 determines that the corrected true positive access corresponds to a normal access (NO in step S1_7), the normal classification unit 101 stores the corrected true positive attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log (step S1_8). Since the detection unit 102 cannot detect the corrected true positive access based on the corrected true positive attack log as an attack, the corrected true positive access is an attack access that can avoid detection by the attack detection system.
- On the other hand, if the detection unit 102 determines that the corrected true positive access corresponds to an attack access (YES in step S1_7), the process returns to step S1_6, and the feature correction unit 106 further corrects the corrected true positive attack log by using the characteristics of the neighborhood true negative normal log (step S1_6).
- the above is a rough flow of the operation of the attack log generation device 100.
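The loop of steps S1_3 through S1_8 can be written as the following hedged sketch. An illustrative threshold detector and a simple "move toward the tendency" update stand in for the detection unit 102 and the feature correction unit 106; the vectors, step size, and iteration cap are all assumptions.

```python
# Sketch of the correction loop: nudge a true positive attack log toward the
# tendency (here, the mean) of nearby true negative normal logs until the
# detector no longer flags it, or an iteration budget runs out.
def generate_evasive_log(attack_vec, neighbor_vecs, is_attack, step=0.3, max_iter=20):
    dims = len(attack_vec)
    # Tendency of the neighborhood true negative normal logs (mean per feature).
    tendency = [sum(v[i] for v in neighbor_vecs) / len(neighbor_vecs) for i in range(dims)]
    vec = list(attack_vec)
    for _ in range(max_iter):
        if not is_attack(vec):              # step S1_7: detector says "normal"
            return vec                      # step S1_8: keep as detection avoidance log
        # step S1_6: move each feature toward the true negative tendency
        vec = [f + step * (t - f) for f, t in zip(vec, tendency)]
    return None                             # could not evade within the budget

# Toy detector: anything past 0.5 on the first axis counts as an attack.
is_attack = lambda v: v[0] > 0.5
evasive = generate_evasive_log([0.9, 0.9], [[0.1, 0.2], [0.2, 0.1]], is_attack)
print(evasive is not None and not is_attack(evasive))  # prints True: the log now evades
```

The repeated correction mirrors the YES branch of step S1_7 returning to step S1_6.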
- the details of the operations of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 will be described.
- the normal classification unit 101 causes the detection unit 102 to determine the normality / abnormality of a large amount of normal logs prepared in advance in the normal log DB 112.
- the detection unit 102 determines whether the normal log is normal or abnormal. That is, the detection unit 102 determines whether the feature described in the normal log corresponds to the feature of normal access or the feature of attack access.
- The normal classification unit 101 extracts the normal logs determined to be normal by the detection unit 102 as true negative normal logs. Then, the normal classification unit 101 stores the extracted true negative normal logs in the true negative normal log DB 115. At this time, for each item of categorical data in the normal log (domain, method, status code, etc.), the normal classification unit 101 calculates the appearance frequency and the percentile of each unique value.
- the normal classification unit 101 stores a dictionary composed of a pair of a unique value and a percentile in the normal log statistical information DB 114 as normal log statistical information.
- the percentile is an index that indicates the percentage of unique values that appear in order from the smallest.
- the normal classification unit 101 may store a dictionary composed of a pair of a unique value and an appearance frequency in the normal log statistical information DB 114 as normal log statistical information.
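A minimal sketch of this statistical information for one categorical field might look as follows. The exact percentile convention is an assumption (here, cumulative share of appearances with unique values ordered from least to most frequent), since the text only names the index.

```python
from collections import Counter

# Build, for one categorical field (e.g. HTTP method), a dictionary mapping
# each unique value to its appearance frequency and percentile, in the spirit
# of the normal log statistical information DB 114. Data is illustrative.
def category_statistics(values):
    counts = Counter(values)
    total = len(values)
    ordered = sorted(counts, key=counts.get)  # least frequent first
    cumulative = 0
    stats = {}
    for value in ordered:
        cumulative += counts[value]
        stats[value] = {"frequency": counts[value],
                        "percentile": 100.0 * cumulative / total}
    return stats

methods = ["GET"] * 7 + ["POST"] * 2 + ["PUT"]
stats = category_statistics(methods)
print(stats["PUT"]["percentile"])   # prints 10.0: rarest value, lowest percentile
print(stats["GET"]["frequency"])    # prints 7
```

Either the (unique value, percentile) pairs or the (unique value, frequency) pairs from such a dictionary could serve as the stored normal log statistical information.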
- the detection unit 102 acquires a log (normal log or attack log) from the normal classification unit 101. Then, the detection unit 102 extracts the feature from the log and converts the extracted feature into an expression (feature vector) for inputting to the machine learning algorithm. Then, the detection unit 102 applies the feature vector to the trained detection model. As a result, the detection unit 102 infers the class to which the log belongs.
- a method of learning a detection model using training data given teacher information (label) indicating which class of normal access or attack access the training data belongs to is called supervised learning. When supervised learning is used, the detection unit 102 uses the learned detection model to infer whether the feature vector belongs to the normal access or attack access class.
- a method of learning a detection model using only normal data as training data without preparing teacher information is called unsupervised learning. When unsupervised learning is used, the detection unit 102 uses the learned detection model to infer whether or not the feature vector belongs to the normal access class.
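The unsupervised case (learning a detection model from normal data alone) can be illustrated with a stdlib-only sketch. The mean-plus-radius model below is an invented stand-in, not the patent's detection model; it only shows the shape of "infer whether the feature vector belongs to the normal access class".

```python
# Unsupervised sketch: learn a region from normal feature vectors only, then
# classify new vectors as inside (normal) or outside (not normal).
def train_unsupervised(normal_vectors):
    """Remember the mean of the normal data and a radius around it."""
    dims = len(normal_vectors[0])
    mean = [sum(v[i] for v in normal_vectors) / len(normal_vectors) for i in range(dims)]
    radius = max(sum((a - b) ** 2 for a, b in zip(v, mean)) ** 0.5
                 for v in normal_vectors)

    def is_normal(vec):
        # Allow some slack (1.5x) beyond the farthest training point.
        return sum((a - b) ** 2 for a, b in zip(vec, mean)) ** 0.5 <= radius * 1.5

    return is_normal

is_normal = train_unsupervised([[0.1, 0.1], [0.2, 0.0], [0.0, 0.2]])
print(is_normal([0.1, 0.15]))  # prints True: inside the learned normal region
print(is_normal([3.0, 3.0]))   # prints False: inferred not to belong to the normal class
```

In the supervised case, the same interface would instead return one of two class labels learned from labeled normal and attack data.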
- FIG. 4 shows an example of the internal configuration of the attack generation unit 103.
- the attack generation unit 103 includes a simulated environment 1031, an attack execution unit 1032, an attack module 1033, an attack scenario DB 1034, and a log collection unit 1035.
- FIG. 5 shows a configuration example of the simulated environment 1031.
- the simulated environment 1031 is a virtual environment that simulates the business network of a company or an organization.
- The simulated environment 1031 is composed of, for example, a proxy server, a firewall, a file server, an AD (Active Directory) server, an in-house Web server, user terminals, a stepping-stone terminal, and a pseudo-Internet.
- the pseudo-Internet includes an attacker's Command and Control server.
- The attack module 1033 includes a plurality of basic modules that realize each step of the Cyber Kill Chain: reconnaissance, weaponization, delivery, exploitation, installation, command and control, and actions on objectives. Reconnaissance is a step of collecting target information (email addresses, etc.) from public information. Weaponization is the step of generating exploit kits, malware, and the like for attacks. Delivery is a step of sending the target an email with malware attached or containing a malicious link, directly accessing the target system, and the like. Exploitation is a step such as causing the target to execute an attack file such as malware or to access a malicious link. Installation is the step in which the exploitation succeeds and the target is infected with malware.
- Command and control is a step in which the malware and a C&C server become able to communicate with each other and the C&C server operates the target remotely.
- Intrusion spread is a step in which another computer is invaded by using a local password hash or the like.
- Execution of the purpose is a step in which the attacker's purpose, such as information theft, falsification, data destruction, or service outage, is carried out.
- the attack module 1033 is a program that realizes these functions.
- the attack scenario DB 1034 stores the attack scenario.
- the attack scenario is information in which combinations and parameters (for example, communication frequency, communication destination domain, infected terminal, etc.) of the attack module 1033 are defined according to a general targeted attack.
- a large number of attack scenarios are prepared in the attack scenario DB 1034 in order to have variations in the attack.
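For illustration, one such scenario record might combine modules and parameters as below. This is a hypothetical sketch: the field names, module names, and values are invented, not the patent's schema for the attack scenario DB 1034.

```python
# Hypothetical attack scenario record: a combination of attack modules and
# parameters (communication frequency, destination domain, infected terminal),
# in the spirit of the attack scenario DB 1034. All names and values are assumed.
attack_scenario = {
    "modules": ["delivery", "installation", "command_and_control"],
    "parameters": {
        "communication_frequency_per_hour": 6,
        "destination_domain": "c2.example.test",
        "infected_terminal": "user-pc-01",
    },
}

def validate_scenario(scenario):
    """Check that a scenario names at least one module and defines parameters."""
    return bool(scenario["modules"]) and "parameters" in scenario

print(validate_scenario(attack_scenario))  # prints True
```

Varying the module combinations and parameter values across many such records is one way to obtain the attack variation the text describes.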
- the attack execution unit 1032 selects one attack scenario stored in the attack scenario DB 1034. Then, the attack execution unit 1032 executes the attack module 1033 on the simulated environment 1031 according to the selected attack scenario.
- the log collection unit 1035 collects logs on the simulated environment 1031 when an attack is executed, and stores the collected logs as attack logs in the attack log DB 113.
- the attack log includes, for example, a proxy server log, an AD server log, a file server log, a firewall log, and the like.
- FIG. 6 shows an example of the internal configuration of the neighborhood extraction unit 104.
- the neighborhood extraction unit 104 includes a feature extraction unit 1041, a feature expression unit 1042, and a neighborhood calculation unit 1043. Further, the neighborhood extraction unit 104 uses a normal log DB 112, a true negative normal log DB 115, and a near true negative normal log DB 116.
- the feature extraction unit 1041 extracts specified features from x (assumed to be one) true positive attack logs and y true negative normal logs. y is a number sufficiently larger than x.
- the feature expression unit 1042 converts the features extracted from the true positive attack log and the true negative normal log into a format (feature vector) that can be easily processed by the machine learning algorithm.
- The feature expression unit 1042 converts categorical data such as domains, methods, and status codes by using, for example, One-hot encoding or Frequency Encoding described in the following reference. Reference: Steve T. K. Jan, et al., "Throwing Darts in the Dark? Detecting Bots with Limited Data using Neural Data Augmentation," IEEE Security & Privacy 2020 (https://people.cs.vt.edu/vbimal/pdf.)
- the feature expression unit 1042 normalizes or standardizes the numerical data. By normalizing or standardizing the numerical data, the feature expression unit 1042 makes the size of the numerical data uniform among the types of features.
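These two conversions can be sketched as follows. The field values are illustrative, and frequency encoding plus min-max normalization are just two of the options the text names (one-hot encoding and standardization being alternatives).

```python
# Sketch of the feature expression unit: frequency encoding for categorical
# fields and min-max normalization for numeric fields. Data is illustrative.
def frequency_encode(values):
    """Replace each categorical value with its relative frequency."""
    total = len(values)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return [counts[v] / total for v in values]

def min_max_normalize(numbers):
    """Scale numeric data into [0, 1] so feature magnitudes are comparable."""
    lo, hi = min(numbers), max(numbers)
    if hi == lo:
        return [0.0 for _ in numbers]
    return [(n - lo) / (hi - lo) for n in numbers]

print(frequency_encode(["GET", "GET", "POST", "GET"]))  # prints [0.75, 0.75, 0.25, 0.75]
print(min_max_normalize([100, 300, 500]))               # prints [0.0, 0.5, 1.0]
```

After such conversions, each log becomes a numeric feature vector suitable for a machine learning algorithm.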
- the neighborhood calculation unit 1043 uses the feature vector of the true positive attack log and the feature vector of the true negative normal log to specify K0 neighborhood true negative normal logs in the vicinity of each of the true positive attack logs. Let K 1 be the total number of true negative normal logs in the vicinity of x true positive attack logs. K 1 ⁇ K 0 . Then, the neighborhood calculation unit 1043 stores the specified K1 neighborhood true negative normal log in the neighborhood true negative normal log DB 116.
- the neighborhood calculation unit 1043 uses, for example, the KNN (K-nearest neighbor) method to identify K 1 neighborhood true negative normal logs.
- the feature or feature expression used by the neighborhood calculation unit 1043 to identify the K 1 neighborhood true negative normal logs may be different from the feature or feature expression used by the detection unit 102. Further, the neighborhood calculation unit 1043 uses a Euclidean distance or the like as a distance scale.
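- As a rough illustration of the neighborhood calculation, the following sketch collects, for each true positive attack vector, the K 0 nearest true negative normal vectors by Euclidean distance and keeps their union (the K 1 neighborhood logs, so K 1 ≥ K 0 ). The vectors and K 0 are invented sample values.

```python
# Sketch of the neighborhood calculation (neighborhood calculation unit
# 1043): per attack vector, take the k0 nearest normal vectors by
# Euclidean distance; the union over all attack vectors gives the K1
# neighborhood logs. All data values are invented.

def euclidean(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def neighborhood(attack_vecs, normal_vecs, k0):
    """Indices of normal vectors near at least one attack vector."""
    near = set()
    for a in attack_vecs:
        ranked = sorted(range(len(normal_vecs)),
                        key=lambda i: euclidean(a, normal_vecs[i]))
        near.update(ranked[:k0])
    return sorted(near)

normals = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
attacks = [[0.5, 0.5]]
k1_indices = neighborhood(attacks, normals, k0=2)
```

A KNN implementation from a library could replace the hand-rolled ranking; the sketch only shows the distance-based selection named in the text.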
- FIG. 7 shows an example of the internal configuration of the tendency extraction unit 105.
- the tendency extraction unit 105 is composed of a feature extraction unit 1051, a feature expression unit 1052, an importance calculation unit 1053, and a tendency calculation unit 1054. Further, the tendency extraction unit 105 uses the neighborhood true negative normal log DB 116 and the true negative normal log tendency DB 117.
- the feature extraction unit 1051 acquires K1 neighborhood true negative normal logs from the neighborhood true negative normal log DB 116. Further, the feature extraction unit 1051 acquires x (temporarily one) true positive attack logs from the neighborhood extraction unit 104, for example. Then, the feature extraction unit 1051 extracts the specified features from the K1 neighborhood true negative normal log and the x true positive attack log, similarly to the feature extraction unit 1041.
- similar to the feature expression unit 1042, the feature expression unit 1052 converts the features extracted from the K 1 neighborhood true negative normal logs and the x true positive attack logs into a format (feature vector) that can be easily processed by a machine learning algorithm.
- the feature or feature expression converted into the feature vector by the feature expression unit 1052 may be different from the feature or feature expression used by the detection unit 102.
- the importance calculation unit 1053 uses the feature vectors obtained by the feature expression unit 1052 to learn a classifier (C 1 ) that discriminates between the K 1 neighborhood true negative normal logs and the x true positive attack logs.
- the importance calculation unit 1053 calculates, for each of the feature vectors of the true positive attack log, the importance of the feature (feature impact), which is the degree to which the classifier (C 1 ) distinguishes between the K 1 neighborhood true negative normal logs and the x (for example, 1) true positive attack logs.
- the importance calculation unit 1053 calculates the importance of each feature vector so that the feature having a high degree of distinction between the neighborhood true negative normal log and the true positive attack log becomes important. Then, the importance calculation unit 1053 extracts the features F 11 to F 1 n1 of the top n 1 items having high importance. n 1 is 1 or more.
- the importance calculation unit 1053 calculates the importance using, for example, a random forest.
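- The patent names a random forest for this step (e.g. scikit-learn's RandomForestClassifier, whose feature_importances_ attribute gives per-feature importance). As a dependency-free stand-in only, the following sketch scores each feature by how far apart the class means are relative to the feature's observed range and takes the top n 1 features; it is a proxy for the "degree of distinction", not a random forest, and the data values are invented.

```python
# Stand-in for the importance calculation (importance calculation unit
# 1053): score each feature by separation of class means relative to its
# range, then keep the top n1 features. NOT a random forest; a
# dependency-free proxy with invented data.

def importance_scores(normal_rows, attack_rows):
    scores = []
    for f in range(len(normal_rows[0])):
        n_vals = [row[f] for row in normal_rows]
        a_vals = [row[f] for row in attack_rows]
        spread = max(n_vals + a_vals) - min(n_vals + a_vals) or 1.0
        n_mean = sum(n_vals) / len(n_vals)
        a_mean = sum(a_vals) / len(a_vals)
        scores.append(abs(n_mean - a_mean) / spread)
    return scores

def top_features(scores, n1):
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:n1]

normal_rows = [[0.1, 5.0], [0.2, 5.1], [0.15, 4.9]]  # K1 neighborhood logs
attack_rows = [[0.9, 5.0]]                           # x = 1 attack log
ranked = top_features(importance_scores(normal_rows, attack_rows), n1=1)
```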
- the tendency calculation unit 1054 acquires statistical information about the features F11 to F1n1 in the K1 neighborhood true negative normal log.
- the tendency calculation unit 1054 acquires the median (med 1 ) and the mode (mod 1 ) of the percentile values of the category data as statistical information for a feature which is category data. Further, the tendency calculation unit 1054 acquires the average (μ 1 ) and standard deviation (σ 1 ) of the numerical data as statistical information for a feature which is numerical data. Then, the tendency calculation unit 1054 stores the statistical information in the true negative normal log tendency DB 117.
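- Using Python's standard statistics module, the statistical information described above can be sketched as follows; the percentile and numerical values are invented.

```python
# Sketch of the tendency calculation (tendency calculation unit 1054):
# median/mode of the percentile values for a categorical feature,
# mean/standard deviation for a numerical one. Values are invented.
from statistics import mean, median, mode, pstdev

cat_percentiles = [10, 20, 20, 30]   # invented percentile values
num_values = [2.0, 4.0, 6.0]         # invented numerical feature values

tendency = {
    "med1": median(cat_percentiles),   # median of the categorical feature
    "mod1": mode(cat_percentiles),     # mode of the categorical feature
    "mu1": mean(num_values),           # mean of the numerical feature
    "sigma1": pstdev(num_values),      # population standard deviation
}
```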
- FIG. 8 shows an example of the internal configuration of the feature correction unit 106.
- the feature correction unit 106 includes a data correction unit 1061 and a verification unit 1062.
- the feature correction unit 106 uses the detection avoidance attack log DB 111.
- FIG. 9 shows an operation example of the data correction unit 1061 and the verification unit 1062.
- the data correction unit 1061 confirms whether or not there is an unconfirmed feature among the features F 11 to F 1n1 (step S2_1).
- the data correction unit 1061 selects the unconfirmed feature F 1i (i is any one of 1 to n1) (step S2_2).
- the operation of the data correction unit 1061 will be described by taking the case where the feature F 1i is the feature F 11 as an example.
- the data correction unit 1061 then acquires the actual value of feature F11 from the corresponding field in the true positive attack log (step S2_3). Then, the data correction unit 1061 generates the list 11 (step S2_4).
- the list 11 includes correction values obtained by modifying the actual value of the feature F 11 of the true positive attack log based on the actual values of the feature F 11 of the nearby true negative normal logs. That is, the list 11 includes a plurality of correction values that reflect the values of the feature F 11 of the K 1 neighborhood true negative normal logs. The method of generating the list 11 will be described later.
- the data correction unit 1061 also performs the processes of steps S2_2 to S2_4 for the other features F 12 to F 1n1 .
- when the processing of steps S2_2 to S2_4 has been performed on all of the features F 11 to F 1n1 (NO in step S2_1), the data correction unit 1061 combines all of the correction values included in the list 11 to the list 1n1 of the features F 11 to F 1n1 , and generates an attack log (corrected true positive attack log) corresponding to each combination (step S2_5).
- the number of correction values included in each of the list 11 to the list 1n1 is r 1j (j is 1 to n1).
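- Step S2_5 can be sketched with itertools.product: every correction value in each list is combined, yielding r 11 × ... × r 1n1 corrected logs, while fields without a list keep their actual values. The field names and values are illustrative, not from the patent.

```python
# Sketch of step S2_5: Cartesian product of the correction-value lists
# (one list per important feature), emitting one corrected true positive
# attack log per combination. Names and values are invented.
from itertools import product

base_log = {"size": 900, "interval": 5, "status": 200}  # true positive log
lists = {"size": [700, 500], "interval": [10, 20, 30]}  # list11, list12

corrected_logs = []
for combo in product(*lists.values()):
    log = dict(base_log)               # untouched fields keep actual values
    log.update(zip(lists.keys(), combo))
    corrected_logs.append(log)
```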
- the verification unit 1062 verifies each modified true positive attack log (step S2_6). Specifically, the verification unit 1062 causes the detection unit 102 to determine whether the access defined in each modified true positive attack log corresponds to a normal access or an attack access.
- the verification unit 1062 stores the corrected true positive attack log determined to be normal access by the detection unit 102 as the detection avoidance attack log in the detection avoidance attack log DB 111 (step S2_7).
- the verification unit 1062 creates a detection avoidance attack log by the same method for all true positive attack logs.
- next, a method of generating the lists (list 11 to list 1n1 ) shown in step S2_4 of FIG. 9 will be described with reference to FIGS. 10 and 11.
- an example of generating the list 11 for the feature F 11 will be described.
- the data correction unit 1061 determines whether the feature F 11 is categorical data or numerical data (step S3_1).
- Category data is a domain, method, status code, etc.
- Numerical data is request size, time interval, etc.
- when the feature F 11 is categorical data, the data correction unit 1061 acquires the percentile value of the category data of the feature F 11 from the dictionary of the normal log statistical information, and sets the acquired percentile value to cat 11 (step S3_2). Further, the data correction unit 1061 refers to the mode mod 11 as the statistical information of the feature F 11 in the K 1 neighborhood true negative normal logs from the true negative normal log tendency DB 117 (step S3_2).
- the data correction unit 1061 compares the value of cat 11 with the value of mod 11 (step S3_3).
- when the value of cat 11 is larger than the value of mod 11 , the data correction unit 1061 updates the value of cat 11 so that it approaches the value of mod 11 by δ 11 (becomes smaller), and adds the updated value of cat 11 to the list 11 (step S3_6). If the value of cat 11 is already described in the list 11 , the data correction unit 1061 overwrites the value of cat 11 already described with the new value of cat 11 .
- δ 11 is a specified value.
- the data correction unit 1061 repeats the process of step S3_6 while the value of cat 11 is equal to or greater than the value of mod 11 (YES in step S3_5). When the value of cat 11 becomes less than the value of mod 11 (NO in step S3_5), the process proceeds to step S3_9.
- if the value of cat 11 is equal to the value of mod 11 (NO in step S3_3, YES in step S3_4), the process proceeds to step S3_9.
- when the value of cat 11 is smaller than the value of mod 11 (YES in step S3_3, YES in step S3_7), the data correction unit 1061 updates the value of cat 11 so that it approaches the value of mod 11 by δ 11 (becomes larger), and adds the updated value of cat 11 to the list 11 (step S3_8). If the value of cat 11 is already described in the list 11 , the data correction unit 1061 overwrites the value of cat 11 already described with the new value of cat 11 . The data correction unit 1061 repeats the process of step S3_8 while the value of cat 11 is equal to or less than the value of mod 11 (YES in step S3_7). When the value of cat 11 becomes larger than the value of mod 11 (NO in step S3_7), the process proceeds to step S3_9.
- in step S3_9, the data correction unit 1061 confirms the list 11 .
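- A minimal sketch of this categorical branch, under the assumption that the loop steps cat 11 toward mod 11 by a fixed δ 11 and records each intermediate value; the sample values are invented.

```python
# Sketch of the categorical branch of Fig. 10 (steps S3_3 to S3_9): step
# cat11 toward the mode mod11 of the neighborhood logs by delta11,
# recording every intermediate value in list11. Values invented.

def step_toward_mode(cat11, mod11, delta11):
    values = []
    if cat11 > mod11:
        while cat11 >= mod11:          # loop of step S3_6 (decrease)
            cat11 -= delta11
            values.append(cat11)
    elif cat11 < mod11:
        while cat11 <= mod11:          # loop of step S3_8 (increase)
            cat11 += delta11
            values.append(cat11)
    return values                      # equal case: empty list

list11 = step_toward_mode(cat11=50, mod11=20, delta11=10)
```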
- when the feature F 11 is numerical data, the data correction unit 1061 sets the value of the numerical data of the feature F 11 to num 11 (step S3_1). Further, the data correction unit 1061 refers to the mean μ 11 and the standard deviation σ 11 as the statistical information of the feature F 11 in the K 1 neighborhood true negative normal logs from the true negative normal log tendency DB 117 (step S3_10).
- the data correction unit 1061 compares the value of num 11 with the value of μ 11 (step S3_11). When the value of num 11 is larger than the value of μ 11 (NO in step S3_11, NO in step S3_12, YES in step S3_13), the data correction unit 1061 updates the value of num 11 so that it approaches the value of μ 11 by Δ 11 (becomes smaller), and adds the updated value of num 11 to the list 11 (step S3_14). If the value of num 11 is already described in the list 11 , the data correction unit 1061 overwrites the value of num 11 already described with the new value of num 11 . Note that Δ 11 is a specified value.
- This ⁇ 11 may be the same value as ⁇ 11 used when the feature F 11 is categorical data, or may be a different value.
- the data correction unit 1061 repeats the process of step S3_14 while the value of num 11 is equal to or greater than the value of (μ 11 − γ 11 ) (YES in step S3_13).
- γ 11 is also a specified value. It is conceivable that γ 11 is defined from statistical values related to the feature F 11 , such as 3 × σ 11 . When the value of num 11 becomes less than (μ 11 − γ 11 ) (NO in step S3_13), the process proceeds to step S3_17.
- if the value of num 11 is equal to the value of μ 11 (NO in step S3_11, YES in step S3_12), the process proceeds to step S3_17.
- when the value of num 11 is smaller than the value of μ 11 (YES in step S3_11), the data correction unit 1061 updates the value of num 11 so that it approaches the value of μ 11 by Δ 11 (becomes larger), and adds the updated value of num 11 to the list 11 (step S3_16).
- if the value of num 11 is already described in the list 11 , the data correction unit 1061 overwrites the value of num 11 already described with the new value of num 11 .
- the data correction unit 1061 repeats the process of step S3_16 while the value of num 11 is equal to or less than the value of (μ 11 + γ 11 ) (YES in step S3_15). When the value of num 11 becomes larger than (μ 11 + γ 11 ) (NO in step S3_15), the process proceeds to step S3_17.
- in step S3_17, the data correction unit 1061 confirms the list 11 .
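- A minimal sketch of the numerical branch, under the assumption that num 11 is stepped toward μ 11 by Δ 11 and that stepping stops once num 11 leaves the band of width γ 11 around μ 11 (e.g. γ 11 = 3 × σ 11 ); the sample values are invented.

```python
# Sketch of the numerical branch of Fig. 11 (steps S3_11 to S3_17): step
# num11 toward the mean mu11 by delta11, stopping once num11 leaves the
# band [mu11 - gamma11, mu11 + gamma11]. Sample values invented.

def step_toward_mean(num11, mu11, delta11, gamma11):
    values = []
    if num11 > mu11:
        while num11 >= mu11 - gamma11:   # loop of step S3_14 (decrease)
            num11 -= delta11
            values.append(num11)
    elif num11 < mu11:
        while num11 <= mu11 + gamma11:   # loop of step S3_16 (increase)
            num11 += delta11
            values.append(num11)
    return values

list11 = step_toward_mean(num11=10.0, mu11=4.0, delta11=2.0, gamma11=3.0)
```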
- after that, the data correction unit 1061 generates the list 12 to the list 1n1 by the same procedure for the features F 12 to F 1n1 . After the generation of the list 11 to the list 1n1 is completed, the data correction unit 1061 performs step S2_5 of FIG. 9.
- although FIG. 10 shows an example of using the mode of the category data, other statistical information such as the median may be used instead of the mode.
- as described above, in the present embodiment, the characteristics of the true positive access are corrected using the characteristics of the true negative access so that the attack detection system determines that the corrected true positive access, which is the true positive access after the characteristics are corrected, is a normal access. Therefore, according to the present embodiment, it is possible to obtain an attack sample of false negative access that can avoid detection by the attack detection system. Therefore, according to the present embodiment, it is possible to efficiently search for an attack that can actually exist in the real space and avoid detection.
- in the present embodiment, a log item is modified in the real space, and the modified log item is converted into a feature vector. Then, using the feature vector obtained by the conversion, it is confirmed whether or not the attack avoids detection in the feature space. Done naively, however, this becomes an ad hoc search. Therefore, in the present embodiment, a true negative normal log in the vicinity of the true positive attack log is specified in the newly generated feature space, and the true positive attack log is corrected so as to have characteristic values often found in the true negative normal log. By doing so, an ad hoc search can be prevented.
- that is, the true positive attack log is corrected so that it has the values of the features often found in the true negative normal logs in the vicinity of the true positive attack log, and an attack log that can avoid detection is efficiently generated.
- in the present embodiment, an example of efficiently generating an attack log that can avoid detection by using a normal log of false positive access (hereinafter referred to as a false positive normal log) in addition to the true negative normal log will be described.
- the false positive normal log is a normal log in which the characteristics of false positive access are described in a plurality of fields.
- the false positive access is an access that is known to be a normal access, but is erroneously determined by the attack detection system to be an attack access. False positive access is an access that causes a false positive in the attack detection system.
- FIG. 12 shows a functional configuration example of the attack log generation device 100 according to the present embodiment.
- a false positive normal log DB 118, a nearby false positive normal log DB 119, and a false positive normal log tendency DB 120 are added as compared with FIG. 2.
- the false positive normal log DB 118 stores the false positive normal log.
- the neighborhood false positive normal log DB 119 stores a near false positive normal log which is a false positive normal log in the vicinity of the true positive attack log.
- the false positive normal log tendency DB 120 stores the statistical information of the false positive normal log (hereinafter referred to as false positive normal log statistical information).
- the false positive normal log DB 118, the near false positive normal log DB 119, and the false positive normal log tendency DB 120 are realized by, for example, the main storage device 902 or the auxiliary storage device 903.
- the normal classification unit 101 classifies the normal log stored in the normal log DB 112 into a true negative normal log and a false positive normal log.
- the correction unit 107 corrects the characteristics of the true positive access by using the characteristics of the true negative access and the characteristics of the false positive access. That is, the correction unit 107 corrects the characteristics of the true positive attack log by using the characteristics of the neighborhood true negative normal log and the characteristics of the neighborhood false positive normal log. More specifically, the correction unit 107 corrects the characteristics of the true positive attack log by using the characteristics of the neighborhood true negative normal log after eliminating, among the characteristics of the neighborhood true negative normal log, the characteristics that overlap with the characteristics of the neighborhood false positive normal log. Also in this embodiment, when the detection unit 102 (attack detection system) determines that the corrected true positive access defined in the corrected true positive attack log is an attack access, the correction unit 107 corrects the characteristics of the corrected true positive attack log using the characteristics of the neighborhood true negative normal log and the characteristics of the neighborhood false positive normal log. Since the other components shown in FIG. 12 are the same as those shown in FIG. 2, the description thereof will be omitted.
- the normal classification unit 101 classifies the normal log into a true negative normal log and a false positive normal log (step S4_1). Specifically, the normal classification unit 101 causes the detection unit 102 to analyze a large amount of normal logs stored in advance in the normal log DB 112, and the detection unit 102 determines whether the access defined in each normal log corresponds to a normal access or an attack access. Then, the normal classification unit 101 classifies the normal logs determined by the detection unit 102 to be normal accesses as true negative normal logs. Further, the normal classification unit 101 classifies the normal logs determined by the detection unit 102 to be attack accesses as false positive normal logs. Then, the normal classification unit 101 stores the true negative normal logs in the true negative normal log DB 115, and stores the false positive normal logs in the false positive normal log DB 118.
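- The classification in step S4_1 can be sketched as follows. The detector here is an invented threshold rule standing in for the detection unit 102 (the attack detection system); the log fields and values are also invented.

```python
# Sketch of step S4_1 (normal classification unit 101): each normal log
# is judged by the detector; logs judged normal become true negatives,
# logs judged as attacks become false positives. The detector below is an
# invented stand-in rule, not the patent's detection unit 102.

def detector_says_attack(log):
    return log["size"] > 800           # invented stand-in rule

normal_logs = [{"size": 100}, {"size": 900}, {"size": 300}]
true_negative_db = [l for l in normal_logs if not detector_says_attack(l)]
false_positive_db = [l for l in normal_logs if detector_says_attack(l)]
```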
- the attack generation unit 103 executes the attack and generates an attack log (step S4_2). That is, the attack generation unit 103 makes an attack access and generates an attack log showing the characteristics of the attack access. Then, the attack generation unit 103 stores the generated attack log in the attack log DB 113.
- the detection unit 102 analyzes the attack log and determines whether the access defined in the attack log corresponds to normal access or attack access (step S4_3).
- if the detection unit 102 determines that the access defined in the attack log corresponds to a normal access (NO in step S4_3), the process proceeds to step S4_8.
- in step S4_8, since the access determined by the detection unit 102 to correspond to a normal access is an attack access (false negative access) that can avoid detection by the detection unit 102, the normal classification unit 101 stores the corresponding attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log.
- if the detection unit 102 determines that the access defined in the attack log corresponds to an attack access (YES in step S4_3), the process proceeds to step S4_4.
- next, the neighborhood extraction unit 104 extracts the true negative normal logs and the false positive normal logs in the vicinity of the attack log (true positive attack log) obtained in step S4_3 from the true negative normal log DB 115 and the false positive normal log DB 118 (step S4_4).
- the tendency extraction unit 105 calculates the tendency of the characteristics of the neighborhood true negative normal log and the neighborhood false positive normal log extracted in step S4_4 (step S4_5).
- the feature correction unit 106 corrects the true positive attack log so that the true positive attack log includes many of the tendencies of the characteristics of the neighborhood true negative normal log but does not include the tendencies of the characteristics of the neighborhood false positive normal log (step S4_6). That is, the feature correction unit 106 corrects each field of the true positive attack log by using the features of the neighborhood true negative normal log after eliminating, among the features of the neighborhood true negative normal log, the features that overlap with the features of the neighborhood false positive normal log.
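- A minimal sketch of this overlap elimination, assuming the tendencies are held as per-feature dictionaries; the feature names and values are invented for illustration.

```python
# Sketch of the overlap elimination in step S4_6: correct only the fields
# whose features are important for the true negative neighborhood but are
# NOT shared with the false positive neighborhood. Names/values invented.

tn_tendency = {"size": 500, "interval": 30, "status": 200}  # TN features
fp_tendency = {"status": 200}                               # FP features

usable = {k: v for k, v in tn_tendency.items() if k not in fp_tendency}

attack_log = {"size": 900, "interval": 5, "status": 404, "method": "POST"}
corrected = dict(attack_log)
corrected.update(usable)        # "status" overlaps with FP, so untouched
```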
- the detection unit 102 determines whether the access (corrected true positive access) defined in the true positive attack log after correction by the feature correction unit 106 (corrected true positive attack log) corresponds to a normal access or an attack access (step S4_7).
- if the detection unit 102 determines that the corrected true positive access corresponds to a normal access (NO in step S4_7), the normal classification unit 101 stores the corrected true positive attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log (step S4_8). Since the detection unit 102 cannot detect the corrected true positive access based on the corrected true positive attack log as an attack, the corrected true positive access is an attack access that can avoid detection by the attack detection system. Therefore, the normal classification unit 101 stores the corrected true positive attack log as the detection avoidance attack log in the detection avoidance attack log DB 111.
- if the detection unit 102 determines that the corrected true positive access corresponds to an attack access (YES in step S4_7), the process returns to step S4_6. Then, the feature correction unit 106 further corrects the corrected true positive attack log by using the characteristics of the neighborhood true negative normal log and the characteristics of the neighborhood false positive normal log (step S4_6).
- the above is a rough flow of the operation of the attack log generation device 100 according to the present embodiment.
- the details of the operations of the normal classification unit 101, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 according to the present embodiment will be described.
- the normal classification unit 101 causes the detection unit 102 to determine the normality / abnormality of a large amount of normal logs prepared in advance in the normal log DB 112.
- the detection unit 102 determines whether the normal log is normal or abnormal. That is, the detection unit 102 determines whether the feature described in the normal log corresponds to the feature of normal access or the feature of attack access.
- the normal classification unit 101 extracts the normal log determined to be normal in the determination by the detection unit 102 as a true negative normal log. Then, the normal classification unit 101 stores the extracted true negative normal log in the true negative normal log DB 115. Further, the normal classification unit 101 extracts the normal log determined to be abnormal in the determination by the detection unit 102 as a false positive normal log.
- the normal classification unit 101 stores the extracted false positive normal log in the false positive normal log DB 118. Further, the normal classification unit 101 stores the normal log statistical information in the normal log statistical information DB 114 as in the first embodiment. Since the procedure for generating and storing the normal log statistical information is as shown in the first embodiment, the description thereof will be omitted.
- FIG. 14 shows an example of the internal configuration of the neighborhood extraction unit 104.
- the neighborhood extraction unit 104 is composed of a feature extraction unit 1041, a feature expression unit 1042, and a neighborhood calculation unit 1043, as in the first embodiment.
- the neighborhood extraction unit 104 uses a normal log DB 112, a true negative normal log DB 115, a near true negative normal log DB 116, a false positive normal log DB 118, and a near false positive normal log DB 119.
- the feature extraction unit 1041 extracts the specified features from the true positive attack log, the true negative normal log, and the false positive normal log.
- the feature expression unit 1042 converts the features extracted from x true positive attack logs, y 0 true negative normal logs, and y 1 false positive normal logs into a format (feature vector) that can be easily processed by a machine learning algorithm. y 0 and y 1 are numbers sufficiently larger than x. Since the conversion method to the feature vector is as shown in the first embodiment, the description thereof will be omitted.
- the neighborhood calculation unit 1043 uses the feature vector of the true positive attack log and the feature vectors of the true negative normal logs to specify K 0 neighborhood true negative normal logs in the vicinity of each of the true positive attack logs. Let K 1 be the total number of true negative normal logs in the vicinity of the x true positive attack logs (K 1 ≥ K 0 ). Then, the neighborhood calculation unit 1043 stores the specified K 1 neighborhood true negative normal logs in the neighborhood true negative normal log DB 116. Further, the neighborhood calculation unit 1043 uses the feature vector of the true positive attack log and the feature vectors of the false positive normal logs to specify K 0 neighborhood false positive normal logs in the vicinity of each of the true positive attack logs. Let K 2 be the total number of false positive normal logs in the vicinity of the x true positive attack logs.
- the neighborhood calculation unit 1043 stores the specified K 2 neighborhood false positive normal logs in the neighborhood false positive normal log DB 119.
- the neighborhood calculation unit 1043 can use, for example, the KNN method as in the first embodiment.
- FIG. 15 shows an example of the internal configuration of the tendency extraction unit 105.
- the tendency extraction unit 105 is composed of a feature extraction unit 1051, a feature expression unit 1052, an importance calculation unit 1053, and a tendency calculation unit 1054.
- the tendency extraction unit 105 uses the neighborhood true negative normal log DB 116, the true negative normal log tendency DB 117, the neighborhood false positive normal log DB 119, and the false positive normal log tendency DB 120.
- the feature extraction unit 1051 acquires the K 1 neighborhood true negative normal logs from the neighborhood true negative normal log DB 116. Further, the feature extraction unit 1051 acquires the x (assumed to be one) true positive attack logs from, for example, the neighborhood extraction unit 104. The feature extraction unit 1051 extracts the specified features from the K 1 neighborhood true negative normal logs and the x true positive attack logs, as in the first embodiment. Further, the feature extraction unit 1051 acquires the K 2 neighborhood false positive normal logs from the neighborhood false positive normal log DB 119. Then, the feature extraction unit 1051 extracts the specified features from the K 2 neighborhood false positive normal logs and the x true positive attack logs.
- the feature expression unit 1052 converts the features extracted from the K 1 neighborhood true negative normal logs and the x true positive attack logs into a format (feature vector) that can be easily processed by the machine learning algorithm. Further, the feature expression unit 1052 converts the features extracted from the K 2 neighborhood false positive normal logs and the x true positive attack logs into a format (feature vector) that can be easily processed by the machine learning algorithm.
- as in the first embodiment, the importance calculation unit 1053 learns the classifier (C 1 ), calculates the importance of the features, and extracts the top n 1 features F 11 to F 1n1 having high importance. Further, the importance calculation unit 1053 uses the feature vectors obtained by the feature expression unit 1052 to learn a classifier (C 2 ) that discriminates between the K 2 neighborhood false positive normal logs and the x true positive attack logs. The importance calculation unit 1053 calculates, for each of the feature vectors of the attack log, the importance of the feature, which is the degree to which the classifier (C 2 ) distinguishes between the K 2 neighborhood false positive normal logs and the x (for example, 1) true positive attack logs.
- the importance calculation unit 1053 calculates the importance of each feature vector so that a feature having a high degree of distinction between the neighborhood false positive normal logs and the true positive attack logs becomes important. Then, the importance calculation unit 1053 extracts the features F 21 to F 2n2 of the top n 2 items having high importance. n 2 is 1 or more. The importance calculation unit 1053 calculates the importance using, for example, a random forest.
- the tendency calculation unit 1054 acquires statistical information about the features F 11 to F 1n1 in the K 1 neighborhood true negative normal logs. The tendency calculation unit 1054 stores the statistical information in the true negative normal log tendency DB 117. Further, the tendency calculation unit 1054 acquires statistical information about the features F 21 to F 2n2 in the K 2 neighborhood false positive normal logs. The tendency calculation unit 1054 stores the statistical information in the false positive normal log tendency DB 120. Similar to the first embodiment, the tendency calculation unit 1054 acquires the median (med 2 ) and the mode (mod 2 ) of the percentile values of the category data as statistical information for a feature which is category data. Further, the tendency calculation unit 1054 acquires the average (μ 2 ) and standard deviation (σ 2 ) of the numerical data as statistical information for a feature which is numerical data.
- FIG. 16 shows an example of the internal configuration of the feature correction unit 106.
- the feature correction unit 106 is composed of a data correction unit 1061 and a verification unit 1062, as in the first embodiment.
- the feature correction unit 106 uses the detection avoidance attack log DB 111, the true negative normal log tendency DB 117, and the false positive normal log tendency DB 120.
- FIG. 17 shows an operation example of the data correction unit 1061 and the verification unit 1062.
- since steps S5_1 to S5_4 are the same as steps S2_1 to S2_4 in FIG. 9, description thereof will be omitted. Further, the method of generating the list 1i in step S5_4 is as shown in FIGS. 10 and 11.
- the data correction unit 1061 performs the same processing for the features F 12 to F 1n1 as for the feature F 11 to generate the list 12 to the list 1n1 .
- the data correction unit 1061 confirms whether or not there is an unconfirmed feature among the features F 21 to F 2n1 (step S5_5).
- the data correction unit 1061 selects the unconfirmed feature F 2i (i is any one of 1 to n2) (step S5_6).
- the operation of the data correction unit 1061 will be described by taking the case where the feature F 2i is the feature F 21 as an example.
- the data correction unit 1061 then acquires the actual value of feature F 21 from the corresponding field in the true positive attack log (step S5_7). Then, the data correction unit 1061 generates the list 21 (step S5_8).
- the list 21 includes correction values obtained by modifying the actual value of the feature F 21 of the true positive attack log based on the actual values of the feature F 21 of the neighborhood false positive normal logs. That is, the list 21 includes a plurality of correction values that reflect the values of the feature F 21 of the K 2 neighborhood false positive normal logs. The method of generating the list 21 will be described later.
- the data correction unit 1061 also performs the processes of steps S5_6 to S5_8 for the other features F 22 to F 2n2 .
- next, the data correction unit 1061 merges the list 11 to the list 1n1 of the features F 11 to F 1n1 and the list 21 to the list 2n2 of the features F 21 to F 2n2 (step S5_9).
- the method of merging will be described later.
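- Although the merge method itself is defined later, one heavily hedged reading is sketched below: for each feature, keep the correction values that both approach the true negative tendency (list 1i ) and move away from the false positive tendency (list 2i ), i.e. the intersection of the two candidate lists. This is an assumption for illustration, not the patent's definition.

```python
# Hypothetical merge of step S5_9 (the patent defines the real one
# later): keep, per feature, the correction values present in both
# candidate lists. The input lists are invented sample values.

def merge_lists(list1i, list2i):
    """Intersection of the two lists, preserving the order of list1i."""
    allowed = set(list2i)
    return [v for v in list1i if v in allowed]

merged = merge_lists([40, 30, 20, 10], [30, 10, 99])
```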
- the data correction unit 1061 combines all the correction values included in the merged list 11 to list 1n1 and generates an attack log (corrected true positive attack log) corresponding to each combination (step S5_10). Fields that do not correspond to the features F 11 to F 1n1 and the features F 21 to F 2n2 retain the actual values of the true positive attack log.
- the verification unit 1062 verifies each corrected true-positive attack log (step S5_11). Specifically, the verification unit 1062 causes the detection unit 102 to determine whether the access defined by each corrected true-positive attack log corresponds to a normal access or an attack access.
- the verification unit 1062 stores a corrected true-positive attack log determined by the detection unit 102 to be a normal access in the detection avoidance attack log DB 111 as a detection avoidance attack log (step S5_12).
- the verification unit 1062 creates detection avoidance attack logs by the same method for all true-positive attack logs.
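The combination step S5_10 and the verification loop of steps S5_11 to S5_12 can be pictured as follows. This is a minimal sketch, not the patented implementation; the detector function `detect` and the field names are hypothetical stand-ins.

```python
from itertools import product

def generate_evasive_logs(true_positive_log, value_lists, detect):
    """Combine candidate corrected values per feature (step S5_10) and keep
    the combinations the detector classifies as normal (steps S5_11-S5_12)."""
    features = list(value_lists)            # features with correction lists
    evasive = []
    # Every combination of corrected values, one value per feature.
    for combo in product(*(value_lists[f] for f in features)):
        corrected = dict(true_positive_log)  # other fields keep actual values
        corrected.update(zip(features, combo))
        if detect(corrected) == "normal":    # detection unit says "normal access"
            evasive.append(corrected)        # detection-avoiding attack log
    return evasive

# Toy detector: flags any log whose request size exceeds 1000 (illustration only).
detect = lambda log: "attack" if log["size"] > 1000 else "normal"
log = {"size": 5000, "domain": "evil.example"}
out = generate_evasive_logs(log, {"size": [800, 1200]}, detect)
```

A real detection unit would be a trained model rather than a threshold, but the exhaustive product over the merged lists is the same.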
- the data correction unit 1061 determines whether the feature F21 is categorical data or numerical data (step S6_1).
- categorical data includes domains, methods, status codes, and the like. Numerical data includes request sizes, time intervals, and the like.
- the data correction unit 1061 acquires the percentile value of the categorical data of the feature F21 from the dictionary of the normal log statistical information, and sets the acquired percentile value as cat21 (step S6_2). Further, the data correction unit 1061 refers to the mode mod21 as statistical information of the feature F21 in the K2 neighborhood false-positive normal logs from the false positive normal log tendency DB 120 (step S6_2).
- the data correction unit 1061 compares the value of cat21 with the value of mod21 (step S6_3).
- when the value of cat21 is equal to or greater than the value of mod21 (NO in step S6_3), the data correction unit 1061 compares the value of cat21 with (mod21 + τ21) (step S6_4). τ21 is a prescribed value.
- in step S6_5, the data correction unit 1061 updates the value of cat21 so that it moves away from (increases beyond) the value of mod21 by Δ21 at a time, and adds the updated value of cat21 to list21 (step S6_5). If a value of cat21 is already recorded in list21, the data correction unit 1061 overwrites the recorded value with the new value of cat21. Δ21 is a prescribed value.
- the data correction unit 1061 repeats the process of step S6_5 while the value of cat21 is equal to or less than the value of (mod21 + τ21) (YES in step S6_4). When the value of cat21 becomes larger than (mod21 + τ21) (NO in step S6_4), the process proceeds to step S6_8.
- when the value of cat21 is smaller than the value of mod21 (YES in step S6_3), the data correction unit 1061 compares the value of cat21 with (mod21 - τ21) (step S6_6).
- the data correction unit 1061 updates the value of cat21 so that it moves away from (decreases below) the value of mod21 by Δ21 at a time, and adds the updated value of cat21 to list21 (step S6_7). If a value of cat21 is already recorded in list21, the data correction unit 1061 overwrites the recorded value with the new value of cat21. The data correction unit 1061 repeats the process of step S6_7 while the value of cat21 is equal to or greater than the value of (mod21 - τ21) (YES in step S6_6). When the value of cat21 becomes smaller than (mod21 - τ21) (NO in step S6_6), the process proceeds to step S6_8.
- in step S6_8, the data correction unit 1061 finalizes list21.
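The loop of steps S6_3 to S6_7 amounts to walking outward from the mode in increments of Δ21 and collecting each value until the distance τ21 is exceeded. A hedged sketch under that reading (the function name and loop ordering are our assumptions; the text leaves the exact check order open):

```python
def step_away_candidates(cat, mod, delta, tau):
    """Move cat away from the mode mod in steps of delta, collecting each
    updated value until it is more than tau away (steps S6_3 to S6_7)."""
    out = []
    if cat >= mod:                 # step upward, away from the mode
        cat += delta
        while cat <= mod + tau:    # loop guard of step S6_4
            out.append(cat)
            cat += delta
    else:                          # step downward, away from the mode
        cat -= delta
        while cat >= mod - tau:    # loop guard of step S6_6
            out.append(cat)
            cat -= delta
    return out

# Percentile 50 stepping away from mode 40 in steps of 5, within distance 20.
vals = step_away_candidates(cat=50, mod=40, delta=5, tau=20)
```

Each collected value becomes one entry of list21; duplicates would simply overwrite, as the text describes.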
- the data correction unit 1061 sets the value of the numerical data of the feature F21 as num21 (step S6_1). Further, the data correction unit 1061 refers to the mean μ21 and the standard deviation σ21 as statistical information of the feature F21 in the K2 neighborhood false-positive normal logs from the false positive normal log tendency DB 120 (step S6_9).
- the data correction unit 1061 compares the value of num21 with the value of μ21 (step S6_10).
- when the value of num21 is equal to or greater than the value of μ21 (NO in step S6_10), the data correction unit 1061 compares the value of num21 with (μ21 + τ21) (step S6_11).
- the data correction unit 1061 updates the value of num21 so that it moves away from (increases beyond) the value of μ21 by Δ21 at a time, and adds the updated value of num21 to list21 (step S6_12). If a value of num21 is already recorded in list21, the data correction unit 1061 overwrites the recorded value with the new value of num21.
- Δ21 is a prescribed value; it may be the same value as the Δ21 used when the feature F21 is categorical data, or a different value. τ21 is also a prescribed value and may be defined from statistics related to the feature F21, for example, 3 × σ21.
- the data correction unit 1061 repeats the process of step S6_12 while the value of num21 is equal to or less than (μ21 + τ21) (YES in step S6_11).
- when the value of num21 becomes larger than (μ21 + τ21) (NO in step S6_11), the process proceeds to step S6_15.
- when the value of num21 is smaller than the value of μ21 (YES in step S6_10), the data correction unit 1061 compares the value of num21 with (μ21 - τ21) (step S6_13).
- the data correction unit 1061 updates the value of num21 so that it moves away from (decreases below) the value of μ21 by Δ21 at a time, and adds the updated value of num21 to list21 (step S6_14). If a value of num21 is already recorded in list21, the data correction unit 1061 overwrites the recorded value with the new value of num21.
- the data correction unit 1061 repeats the process of step S6_14 while the value of num21 is equal to or greater than (μ21 - τ21) (YES in step S6_13).
- when the value of num21 becomes less than (μ21 - τ21) (NO in step S6_13), the process proceeds to step S6_15.
- in step S6_15, the data correction unit 1061 finalizes list21.
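For numerical data the same outward walk is taken from the mean μ21, with the stopping distance τ21 derived from the standard deviation (the text suggests 3 × σ21 as one choice). A minimal sketch under those assumptions:

```python
def numeric_candidates(num, mean, std, delta):
    """Generate candidate values for a numerical feature by stepping away from
    the mean of the neighborhood false-positive normal logs (steps S6_10 to
    S6_14), stopping once |value - mean| exceeds tau = 3 * std."""
    tau = 3 * std                              # stopping distance from the text
    out, step = [], delta if num >= mean else -delta
    num += step                                # first update moves away from the mean
    while abs(num - mean) <= tau:
        out.append(num)
        num += step
    return out

# Request size 12.0 stepping away from mean 10.0 (std 2.0) in steps of 1.0.
c = numeric_candidates(num=12.0, mean=10.0, std=2.0, delta=1.0)
```

The resulting candidates populate list21 for a numerical feature, mirroring the categorical case.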
- after that, the data correction unit 1061 generates list22 to list2n2 by the same procedure for the features F22 to F2n2.
- FIG. 18 shows an example using the mode of the categorical data, but other statistics such as the median may be used instead of the mode.
- the data correction unit 1061 searches for features common to the features F11 to F1n1 and the features F21 to F2n2.
- suppose here that the feature F11 and the feature F23 are common.
- when F11 and F23 are categorical data, the data correction unit 1061 refers to the mode (mod23) corresponding to F23 in the false positive normal log tendency DB 120.
- the minimum and maximum values of the elements of list23, the list of the feature F23, are denoted min(list23) and max(list23), respectively.
- when mod23 is smaller than min(list23), the data correction unit 1061 deletes from list11 the elements that are equal to or greater than mod23 - α and equal to or less than min(list23) + α.
- when mod23 is larger than max(list23), the data correction unit 1061 deletes from list11 the elements that are equal to or greater than max(list23) - α and equal to or less than mod23 + α.
- α is a prescribed value.
- when F11 and F23 are numerical data, the data correction unit 1061 refers to the mean (μ23) corresponding to F23 in the false positive normal log tendency DB 120.
- the minimum and maximum values of the elements of list23, the list of the feature F23, are denoted min(list23) and max(list23), respectively.
- when μ23 is smaller than min(list23), the data correction unit 1061 deletes from list11 the elements that are equal to or greater than μ23 - β and equal to or less than min(list23) + β.
- when μ23 is larger than max(list23), the data correction unit 1061 deletes from list11 the elements that are equal to or greater than max(list23) - β and equal to or less than μ23 + β.
- β is a prescribed value and may be defined from statistics related to F23, for example, 3 × σ23.
- the data correction unit 1061 simply merges (concatenates) list1i of a feature F1i and list2i of a feature F2i that are not common.
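The merge of step S5_9 for a feature common to both sides can be pictured as pruning the true-negative candidate list so that surviving values do not drift back into the band around the false-positive statistics, then concatenating. A sketch under the numerical-data reading; the function name and the band construction are our assumptions:

```python
def merge_lists(list_tn, list_fp, mu, beta):
    """Merge the true-negative candidate list with the false-positive one for
    a common feature: elements of list_tn falling in the pruning band between
    mu (false-positive mean) and the range of list_fp are removed, then the
    lists are simply concatenated (step S5_9)."""
    lo, hi = min(list_fp), max(list_fp)
    if mu < lo:                       # prune [mu - beta, lo + beta]
        band = (mu - beta, lo + beta)
    elif mu > hi:                     # prune [hi - beta, mu + beta]
        band = (hi - beta, mu + beta)
    else:
        band = None                   # mean inside the range: nothing to prune
    kept = [v for v in list_tn if band is None or not (band[0] <= v <= band[1])]
    return kept + list_fp             # simple concatenation afterwards

merged = merge_lists([1, 5, 9], [8, 10], mu=12, beta=1)
```

Non-common features skip the pruning and are concatenated directly, as the text states.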
- as described above, in this embodiment, the features of the true-positive attack log are corrected by using the features of the neighborhood true-negative normal logs after excluding the features that overlap with the features of the neighborhood false-positive normal logs. Therefore, compared with the first embodiment, it is possible to obtain attack samples of false-negative accesses that evade detection more skillfully.
- although the first and second embodiments have been described above, the two embodiments may be combined. Alternatively, one of the two embodiments may be partially implemented, or the two embodiments may be partially combined. The configurations and procedures described in these two embodiments may be changed as necessary.
- the processor 901 shown in FIG. 1 is an IC (Integrated Circuit) that performs processing.
- the processor 901 is a CPU (Central Processing Unit), a DSP (Digital Signal Processor), or the like.
- the main storage device 902 shown in FIG. 1 is a RAM (Random Access Memory).
- the auxiliary storage device 903 shown in FIG. 1 is a ROM (Read Only Memory), a flash memory, an HDD (Hard Disk Drive), or the like.
- the OS (Operating System) is also stored in the auxiliary storage device 903, and at least a part of the OS is executed by the processor 901.
- the processor 901 executes the programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 while executing at least a part of the OS.
- when the processor 901 executes the OS, task management, memory management, file management, communication control, and the like are performed.
- at least one of the information, data, signal values, and variable values indicating the processing results of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 is stored in at least one of the main storage device 902, the auxiliary storage device 903, and a register or cache memory in the processor 901.
- the programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 may be stored in a portable recording medium such as a magnetic disk, flexible disk, optical disc, compact disc, Blu-ray (registered trademark) disc, or DVD.
- a portable recording medium storing the programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 may be distributed.
- the attack log generation device 100 may be realized by a processing circuit.
- the processing circuit is, for example, a logic IC (Integrated Circuit), a GA (Gate Array), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array).
- the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 are each realized as a part of the processing circuit.
- in this specification, the superordinate concept of the processor and the processing circuit is referred to as "processing circuitry". That is, the processor and the processing circuit are each specific examples of the "processing circuitry".
- 100 attack log generation device, 101 normal classification unit, 102 detection unit, 103 attack generation unit, 104 neighborhood extraction unit, 105 tendency extraction unit, 106 feature correction unit, 107 correction unit, 111 detection avoidance attack log DB, 112 normal log DB, 113 attack log DB, 114 normal log statistical information DB, 115 true negative normal log DB, 116 neighborhood true negative normal log DB, 117 true negative normal log tendency DB, 118 false positive normal log DB, 119 neighborhood false positive normal log DB, 120 false positive normal log tendency DB, 901 processor, 902 main storage device, 903 auxiliary storage device, 904 keyboard, 905 mouse, 1031 simulated environment, 1032 attack execution unit, 1033 attack module, 1034 attack scenario DB, 1035 log collection unit, 1041 feature extraction unit, 1042 feature representation unit, 1043 neighborhood calculation unit, 1051 feature extraction unit, 1052 feature representation unit, 1053 importance calculation unit, 1054 tendency calculation unit, 1061 data correction unit, 1062 verification unit.
Abstract
Description
Meanwhile, in the field of security monitoring, a shortage of staff with specialized knowledge has become the norm. A technology is therefore needed that enables even a small staff to detect cyber attacks with high accuracy and efficiency.
However, with the growing sophistication of attacks and the increase in unknown attacks, defining rules in advance has become difficult, troubling SOC (Security Operation Center) staff. In addition, rules must be adjusted manually for each monitored system, and rule-based detection technology is approaching its limits. An advanced detection technology is therefore desired that does not require rules to be defined in advance, or that automatically determines the boundary between normal and abnormal. Artificial Intelligence (hereinafter abbreviated as AI), such as machine learning, is a candidate for realizing such advanced detection. AI learns from data of multiple classes prepared in advance and automatically finds the boundaries that separate the classes. If a large amount of data can be prepared for each class, AI can find the boundaries appropriately. If AI can be applied to the monitoring of cyber attacks, it is expected to take over the definition and updating of rules that has so far been performed by staff with specialized knowledge and skills.
In network security, however, there is the problem that it is difficult to prepare large amounts of per-class data, which is the most important ingredient for AI. Attacks occur rarely, and it is very difficult to prepare a large amount of attack data for learning. Attack data must therefore be augmented in some way and used for learning.
Moreover, attackers' capabilities are increasing daily. Today's attackers thoroughly investigate and understand information about the target organization, and then launch attacks so as not to be noticed by attack detection systems. Insider attacks are also on the rise, and sophisticated attacks that exploit information about the target organization are expected to increase. Attack data therefore also needs to be sophisticated enough to cover attacks that are skillfully designed and developed to have features closely resembling the normal state in order to evade detection.
As a technology for solving the problem that it is difficult to prepare large amounts of anomalous (attack) data that evades detection, there exists a technology that automatically generates many detection-evading attack samples, leading to improved accuracy of attack detection technology.
Patent Literature 1 discloses a technology that, for evaluating security products, automatically generates sophisticated attack samples crafted to have features closely resembling the normal state. Patent Literature 2 discloses a technology that, for evaluating security products, automatically generates false positives (FP), in which a normal event that should not be detected is detected, and false negatives (FN), in which an event that should be detected is not detected.
Both technologies modify the feature vector of an attack so that it crosses the decision boundary of a normal model that has learned the behavior of normal data, and generate the attack in a simulated environment so that it has the features corresponding to the feature vector that crossed the boundary. According to both technologies, realistic and sophisticated attacks can be generated by checking constraints of the environment and the attack, the presence or absence of attack capabilities, and so on.
In general, the more accurate an attack detection system becomes, the more ultra-high-dimensional and nonlinear its feature space is, and the more difficult it becomes to invert a representation in the feature space back into real-space information. It is therefore difficult to obtain a sample that exists in the real space from a detection-evading attack sample discovered in the feature space, and the search is expected to be ad hoc and inefficient.
An information processing device according to the present disclosure includes an extraction unit that extracts a true-positive access, which is known to be an access for attack purposes and which an attack detection system has determined to be an access for attack purposes,
and a correction unit that corrects features of the true-positive access by using features of a true-negative access, which is known to be a normal access and which the attack detection system has determined to be a normal access.
*** Overview ***
An access that is known to be an attack access and that the attack detection system has determined to be an attack access is called a true-positive access.
An access that is known to be an attack access for attack purposes but that the attack detection system has erroneously determined to be a normal access is called a false-negative access. A false-negative access is an access that causes a missed detection.
An access that is known to be a normal access but that the attack detection system has erroneously determined to be an attack access is called a false-positive access. A false-positive access is an access that causes a false detection.
In this embodiment, logs of true-negative accesses are extracted from the normal logs, and logs of true-positive accesses are extracted from the attack logs. Further, in this embodiment, the features of the true-positive accesses are corrected using the features of the true-negative accesses. The features of a true-positive access are corrected so that the attack detection system determines that the corrected true-positive access, that is, the true-positive access after its features have been corrected, is a normal access. In other words, the features of the true-positive access are corrected so that the true-positive access becomes a false-negative access. In this way, by correcting the features of true-positive accesses, this embodiment obtains attack samples of false-negative accesses that can evade detection by the attack detection system.
FIG. 1 shows an example of the hardware configuration of the attack log generation device 100 according to this embodiment. FIG. 2 shows an example of the functional configuration of the attack log generation device 100 according to this embodiment.
The attack log generation device 100 according to this embodiment is a computer. The attack log generation device 100 corresponds to an information processing device. The operating procedure of the attack log generation device 100 corresponds to an information processing method. A program that realizes the operation of the attack log generation device 100 corresponds to an information processing program.
The auxiliary storage device 903 stores programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 shown in FIG. 2.
These programs are loaded from the auxiliary storage device 903 into the main storage device 902. The processor 901 then executes these programs to perform the operations of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 described later.
FIG. 3 schematically represents a state in which the processor 901 is executing the programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106.
The detection avoidance attack log DB (Database) 111, the normal log DB 112, the attack log DB 113, the normal log statistical information DB 114, the true-negative normal log DB 115, the neighborhood true-negative normal log DB 116, and the true-negative normal log tendency DB 117 shown in FIG. 2 are realized by the main storage device 902 or the auxiliary storage device 903.
The keyboard 904 and the mouse 905 accept instructions from the user of the attack log generation device 100. The display device 906 displays various information to the user of the attack log generation device 100.
Although not shown in FIG. 1, the attack log generation device 100 may include a communication device.
The normal classification unit 101 extracts true-positive accesses from the attack logs. More specifically, the normal classification unit 101 extracts, from the attack logs in the attack log DB 113, the attack logs that the detection unit 102 has determined to be attacks. In an attack log, the features of an attack access generated by the attack generation unit 103 are described in a plurality of fields. That is, an attack log defines an attack access. Therefore, a true-positive attack log extracted by the normal classification unit 101 (hereinafter referred to as a true-positive attack log) defines a true-positive access. The normal classification unit 101 outputs the extracted true-positive attack logs to the neighborhood extraction unit 104.
The normal classification unit 101 corresponds to the extraction unit. The processing performed by the normal classification unit 101 corresponds to the extraction processing.
More specifically, the detection unit 102 detects attack accesses by using machine learning. As described above, the normal logs determined to be normal by the detection unit 102 are stored by the normal classification unit 101 in the true-negative normal log DB 115 as true-negative normal logs. The attack logs determined to be attacks by the detection unit 102 are output by the normal classification unit 101 to the neighborhood extraction unit 104 as true-positive attack logs.
The correction unit 107 corrects the features of the true-positive accesses extracted by the normal classification unit 101 by using the features of the true-negative accesses extracted by the normal classification unit 101. More specifically, the correction unit 107 corrects the features of a true-positive access so that the detection unit 102 determines that the corrected true-positive access, that is, the true-positive access after its features have been corrected, is a normal access. When the detection unit 102 determines that the corrected true-positive access is an attack access, the correction unit 107 corrects the features of the corrected true-positive access by using the features of the true-negative accesses.
The processing performed by the correction unit 107 corresponds to the correction processing.
More specifically, the neighborhood extraction unit 104 extracts, as neighborhood true-negative normal logs, the true-negative normal logs in the true-negative normal log DB 115 that have features close to the features of the true-positive attack log.
The tendency extraction unit 105 selects, from the plurality of features of the true-positive attack log, features whose importance matches a selection condition.
The feature correction unit 106 stores the corrected true-positive attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log.
Next, an operation example of the attack log generation device 100 according to this embodiment is described with reference to FIG. 3.
Specifically, the detection unit 102 analyzes a large number of normal logs accumulated in advance in the normal log DB 112 and determines whether the access defined by each normal log corresponds to a normal access or an attack access. The normal classification unit 101 then extracts the normal logs that the detection unit 102 has determined to be normal accesses as true-negative normal logs.
The normal classification unit 101 stores the extracted true-negative normal logs in the true-negative normal log DB 115.
That is, the attack generation unit 103 performs attack accesses and generates attack logs indicating the features of the attack accesses. The attack generation unit 103 then stores the generated attack logs in the attack log DB 113.
Details of step S1_4 are described later.
That is, the feature correction unit 106 corrects each field of the true-positive attack log so that it contains more of the feature tendencies of the neighborhood true-negative normal logs calculated by the tendency extraction unit 105.
Since the detection unit 102 cannot detect the corrected true-positive access based on the corrected true-positive attack log as an attack, the corrected true-positive access is an attack access that can evade detection by the attack detection system. Therefore, the normal classification unit 101 stores the corrected true-positive attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log.
The operations of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 are described in detail below.
The normal classification unit 101 extracts the normal logs determined to be normal by the detection unit 102 as true-negative normal logs, and stores the extracted true-negative normal logs in the true-negative normal log DB 115. At this time, for the categorical data of the normal logs (domain, method, status code, etc.), the normal classification unit 101 calculates, for each piece of categorical data, the frequency of occurrence and the percentile of each unique value. The normal classification unit 101 then stores a dictionary consisting of pairs of unique values and percentiles in the normal log statistical information DB 114 as normal log statistical information. The percentile is an index indicating, when the frequencies of occurrence of the unique values are arranged in ascending order, at what percentage position the unique value falls. Instead of pairs of unique values and percentiles, the normal classification unit 101 may store a dictionary consisting of pairs of unique values and frequencies of occurrence in the normal log statistical information DB 114 as the normal log statistical information.
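The unique-value/percentile dictionary described here can be sketched in a few lines. This is only an illustration of the statistic with made-up field values; the exact handling of tied frequencies is an assumption, as the text leaves it unspecified.

```python
from collections import Counter

def percentile_dict(values):
    """Map each unique categorical value to the percentile of its frequency:
    frequencies are sorted in ascending order, and each value receives the
    percentage position of its frequency within that ordering."""
    freq = Counter(values)
    ordered = sorted(freq.values())      # frequencies, ascending
    n = len(ordered)
    # percentile = (ascending rank of the value's frequency) / count * 100
    return {v: 100.0 * (ordered.index(f) + 1) / n for v, f in freq.items()}

# Hypothetical "method" field of some normal proxy logs.
d = percentile_dict(["GET", "GET", "GET", "POST", "HEAD"])
```

The variant that stores frequencies instead of percentiles would simply keep `freq` as the dictionary.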
A method that trains a detection model using learning data accompanied by supervision information (labels) indicating whether each sample belongs to the normal access class or the attack access class is called supervised learning. When supervised learning is used, the detection unit 102 uses the trained detection model to infer whether a feature vector belongs to the normal access class or the attack access class.
A method that trains a detection model using only normal data as learning data, without supervision information, is called unsupervised learning. When unsupervised learning is used, the detection unit 102 uses the trained detection model to infer whether or not a feature vector belongs to the normal access class.
As shown in FIG. 4, the attack generation unit 103 consists of a simulated environment 1031, an attack execution unit 1032, attack modules 1033, an attack scenario DB 1034, and a log collection unit 1035.
The simulated environment 1031 is a virtual environment that simulates the business network of a company or organization.
The simulated environment 1031 consists of, for example, a proxy server, a firewall, a file server, an AD (Active Directory) server, an in-house Web server, user terminals, a stepping-stone terminal, and a pseudo Internet. The pseudo Internet includes the attacker's Command and Control server.
Reconnaissance is the step of collecting target information (e-mail addresses, etc.) from public information and other sources.
Weaponization is the step of generating exploit kits, malware, and the like for the attack.
Delivery is the step of, for example, sending the target an e-mail with malware attached or an e-mail containing a malicious link, or directly accessing the target's system.
Exploitation is the step of, for example, causing the target to execute an attack file such as malware, or causing the target to access a malicious link.
Installation is the step of making the exploitation succeed and infecting the target with malware.
Command & Control (C&C) is the step in which the malware and the C&C server become able to communicate and the C&C server operates the target remotely.
Lateral movement is the step in which the C&C server intrudes into other computers by using local password hashes.
Actions on objectives is the step in which the attacker's objectives, such as information theft, tampering, data destruction, and service disruption, are carried out.
The attack modules 1033 are programs that realize these functions.
An attack scenario is information in which a combination of attack modules 1033 and parameters (for example, communication frequency, destination domain, infected terminal, etc.) are defined in line with a typical targeted attack. Many attack scenarios are prepared in the attack scenario DB 1034 to give variety to the attacks.
The attack logs include, for example, proxy server logs, AD server logs, file server logs, and firewall logs.
The neighborhood extraction unit 104 consists of a feature extraction unit 1041, a feature representation unit 1042, and a neighborhood calculation unit 1043. The neighborhood extraction unit 104 uses the normal log DB 112, the true-negative normal log DB 115, and the neighborhood true-negative normal log DB 116.
Reference: Steve T.K. Jan et al., "Throwing Darts in the Dark? Detecting Bots with Limited Data using Neural Data Augmentation," Security & Privacy 2020 (https://people.cs.vt.edu/vbimal/publications/syntheticdata-sp20.pdf)
The neighborhood calculation unit 1043 identifies the K1 neighborhood true-negative normal logs by using, for example, the KNN (K-nearest neighbor) method. The features or feature representations used by the neighborhood calculation unit 1043 to identify the K1 neighborhood true-negative normal logs may differ from the features or feature representations used by the detection unit 102. The neighborhood calculation unit 1043 uses, for example, the Euclidean distance as the distance measure.
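The neighborhood search described here is a plain K-nearest-neighbor query under Euclidean distance. A dependency-free sketch (the vectors are made up; a library implementation of KNN would serve equally well):

```python
def knn(query, points, k):
    """Return the indices of the k points nearest to query under Euclidean
    distance, as the neighborhood calculation unit 1043 does with KNN."""
    dist = lambda p: sum((a - b) ** 2 for a, b in zip(query, p)) ** 0.5
    return sorted(range(len(points)), key=lambda i: dist(points[i]))[:k]

# Feature vectors of true-negative normal logs; query is a true-positive
# attack log's vector (illustrative values only).
normal_vecs = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.5, 0.5)]
idx = knn((0.12, 0.22), normal_vecs, k=2)   # K1 = 2 nearest normal logs
```

The logs at the returned indices would be stored as the neighborhood true-negative normal logs.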
The tendency extraction unit 105 consists of a feature extraction unit 1051, a feature representation unit 1052, an importance calculation unit 1053, and a tendency calculation unit 1054. The tendency extraction unit 105 uses the neighborhood true-negative normal log DB 116 and the true-negative normal log tendency DB 117.
Like the feature extraction unit 1041, the feature extraction unit 1051 extracts prescribed features from the K1 neighborhood true-negative normal logs and the x true-positive attack logs.
The importance calculation unit 1053 then extracts the top n1 features F11 to F1n1 with the highest importance. n1 is 1 or more. The importance calculation unit 1053 calculates the importance by using, for example, a random forest.
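The importance here is the degree to which a feature separates the two sets of logs. The text suggests random-forest importances; as a dependency-free stand-in, a per-feature separability score (mean gap normalized by spread) conveys the same ranking idea — this substitute metric is our assumption, not the patented method:

```python
def separability(normal_col, attack_col):
    """Crude per-feature importance: gap between the class means, normalized
    by the overall spread. A random forest's impurity-based importances (as
    the text suggests) would serve the same ranking purpose."""
    mean = lambda xs: sum(xs) / len(xs)
    spread = max(normal_col + attack_col) - min(normal_col + attack_col) or 1.0
    return abs(mean(normal_col) - mean(attack_col)) / spread

# Feature A barely differs between classes; feature B differs strongly.
imp_a = separability([1.0, 1.2, 0.9], [1.1])
imp_b = separability([10.0, 12.0, 11.0], [99.0])
ranked = sorted([("A", imp_a), ("B", imp_b)], key=lambda t: -t[1])
```

Taking the top n1 entries of `ranked` corresponds to selecting the features F11 to F1n1.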
The tendency calculation unit 1054 stores the statistical information in the true-negative normal log tendency DB 117.
The feature correction unit 106 consists of a data correction unit 1061 and a verification unit 1062. The feature correction unit 106 uses the detection avoidance attack log DB 111.
The operation of the data correction unit 1061 is described below taking the case where the feature F1i is the feature F11 as an example.
The data correction unit 1061 then generates list11 (step S2_4). List11 contains corrected values obtained by modifying the actual value of the feature F11 in the true-positive attack log with the actual values of the feature F11 in the neighborhood true-negative normal logs. That is, list11 contains a plurality of corrected values reflecting the values of the feature F11 of the K1 neighborhood true-negative normal logs.
The method of generating list11 is described later.
Specifically, the verification unit 1062 causes the detection unit 102 to determine whether the access defined by each corrected true-positive attack log corresponds to a normal access or an attack access.
Categorical data includes domains, methods, status codes, and the like. Numerical data includes request sizes, time intervals, and the like.
When the value of cat11 is larger than the value of mod11 (NO in step S3_3, NO in step S3_4, YES in step S3_5), the data correction unit 1061 updates the value of cat11 so that it approaches the value of mod11 (decreases) by Δ11 at a time, and adds the updated value of cat11 to list11 (step S3_6). If a value of cat11 is already recorded in list11, the data correction unit 1061 overwrites the recorded value with the new value of cat11. Δ11 is a prescribed value.
The data correction unit 1061 repeats the process of step S3_6 while the value of cat11 is equal to or greater than the value of mod11 (YES in step S3_5).
When the value of cat11 becomes less than the value of mod11 (NO in step S3_4), the process proceeds to step S3_9.
The data correction unit 1061 repeats the process of step S3_8 while the value of cat11 is equal to or less than the value of mod11 (NO in step S3_7).
When the value of cat11 becomes larger than the value of mod11 (YES in step S3_7), the process proceeds to step S3_9.
When the value of num11 is larger than the value of μ11 (NO in step S3_11, NO in step S3_12, YES in step S3_13), the data correction unit 1061 updates the value of num11 so that it approaches the value of μ11 (decreases) by Δ11 at a time, and adds the updated value of num11 to list11 (step S3_14). If a value of num11 is already recorded in list11, the data correction unit 1061 overwrites the recorded value with the new value of num11.
Δ11 is a prescribed value. This Δ11 may be the same value as the Δ11 used when the feature F11 is categorical data, or a different value.
The data correction unit 1061 repeats the process of step S3_14 while the value of num11 is equal to or greater than the value of (μ11 - τ11) (YES in step S3_13). τ11 is also a prescribed value. τ11 may be defined from statistics related to the feature F11, for example, 3 × σ11.
When the value of num11 becomes less than (μ11 - τ11) (NO in step S3_13), the process proceeds to step S3_17.
The data correction unit 1061 repeats the process of step S3_16 while the value of num11 is equal to or less than the value of (μ11 + τ11) (YES in step S3_15).
When the value of num11 becomes larger than (μ11 + τ11) (NO in step S3_15), the process proceeds to step S3_17.
As described above, in this embodiment, the features of a true-positive access are corrected by using the features of the true-negative accesses so that the attack detection system determines that the corrected true-positive access, that is, the true-positive access after its features have been corrected, is a normal access. According to this embodiment, it is therefore possible to obtain attack samples of false-negative accesses that can evade detection by the attack detection system. Consequently, detection-evading attacks that can actually exist in the real space can be searched for efficiently.
In the first embodiment, attack logs that can evade detection are generated efficiently by correcting the true-positive attack log so that it has feature values frequently seen in the true-negative normal logs in the neighborhood of the true-positive attack log. The second embodiment describes an example in which, in addition to the true-negative normal logs, false-positive normal logs (hereinafter referred to as false-positive normal logs) are also used to efficiently generate attack logs that can evade detection.
A false-positive normal log is a normal log in which the features of a false-positive access are described in a plurality of fields. As described above, a false-positive access is an access that is known to be a normal access but that the attack detection system has erroneously determined to be an attack access. A false-positive access is an access that causes a false detection by the attack detection system.
Matters not described below are the same as in the first embodiment.
FIG. 12 shows an example of the functional configuration of the attack log generation device 100 according to this embodiment.
Compared with FIG. 2, FIG. 12 adds a false-positive normal log DB 118, a neighborhood false-positive normal log DB 119, and a false-positive normal log tendency DB 120.
The neighborhood false-positive normal log DB 119 stores neighborhood false-positive normal logs, which are false-positive normal logs in the neighborhood of the true-positive attack log.
The false-positive normal log tendency DB 120 stores statistical information of the false-positive normal logs (hereinafter referred to as false-positive normal log statistical information).
The false-positive normal log DB 118, the neighborhood false-positive normal log DB 119, and the false-positive normal log tendency DB 120 are realized by, for example, the main storage device 902 or the auxiliary storage device 903.
In this embodiment as well, when the detection unit 102 (attack detection system) determines that the corrected true-positive access defined by the corrected true-positive attack log is an attack access, the correction unit 107 corrects the features of the corrected true-positive attack log by using the features of the neighborhood true-negative normal logs and the features of the neighborhood false-positive normal logs.
The other components shown in FIG. 12 are the same as those shown in FIG. 2, and their description is omitted.
An operation example of the attack log generation device 100 according to this embodiment is described with reference to FIG. 13.
Specifically, the detection unit 102 analyzes a large number of normal logs accumulated in advance in the normal log DB 112 and determines whether the access defined by each normal log corresponds to a normal access or an attack access. The normal classification unit 101 then classifies the normal logs that the detection unit 102 has determined to be normal accesses as true-negative normal logs, and classifies the normal logs that the detection unit 102 has determined to be attack accesses as false-positive normal logs.
The normal classification unit 101 stores the true-negative normal logs in the true-negative normal log DB 115 and the false-positive normal logs in the false-positive normal log DB 118.
That is, the attack generation unit 103 performs attack accesses and generates attack logs indicating the features of the attack accesses. The attack generation unit 103 then stores the generated attack logs in the attack log DB 113.
That is, the feature correction unit 106 corrects each field of the true-positive attack log by using the features of the neighborhood true-negative normal logs after excluding the features that overlap with the features of the neighborhood false-positive normal logs.
Since the detection unit 102 cannot detect the corrected true-positive access based on the corrected true-positive attack log as an attack, the corrected true-positive access is an attack access that can evade detection by the attack detection system. Therefore, the normal classification unit 101 stores the corrected true-positive attack log in the detection avoidance attack log DB 111 as a detection avoidance attack log.
The operations of the normal classification unit 101, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 according to this embodiment are described in detail below.
The normal classification unit 101 extracts the normal logs determined to be normal by the detection unit 102 as true-negative normal logs and stores them in the true-negative normal log DB 115. The normal classification unit 101 also extracts the normal logs determined to be abnormal by the detection unit 102 as false-positive normal logs and stores them in the false-positive normal log DB 118.
As in the first embodiment, the normal classification unit 101 stores the normal log statistical information in the normal log statistical information DB 114. The procedures for generating and storing the normal log statistical information are as described in the first embodiment, and their description is omitted.
As in the first embodiment, the neighborhood extraction unit 104 consists of the feature extraction unit 1041, the feature representation unit 1042, and the neighborhood calculation unit 1043. In this embodiment, the neighborhood extraction unit 104 uses the normal log DB 112, the true-negative normal log DB 115, the neighborhood true-negative normal log DB 116, the false-positive normal log DB 118, and the neighborhood false-positive normal log DB 119.
The neighborhood calculation unit 1043 also uses the feature vectors of the true-positive attack logs and the feature vectors of the false-positive normal logs to identify the K0 neighborhood false-positive normal logs of each true-positive attack log. Let K2 be the total number of false-positive normal logs in the neighborhoods of the x true-positive attack logs; K2 ≥ K0. The neighborhood calculation unit 1043 then stores the identified K2 neighborhood false-positive normal logs in the neighborhood false-positive normal log DB 119.
As a method of identifying neighborhoods, the neighborhood calculation unit 1043 can use, for example, the KNN method, as in the first embodiment.
As in the first embodiment, the tendency extraction unit 105 consists of the feature extraction unit 1051, the feature representation unit 1052, the importance calculation unit 1053, and the tendency calculation unit 1054. In this embodiment, the tendency extraction unit 105 uses the neighborhood true-negative normal log DB 116, the true-negative normal log tendency DB 117, the neighborhood false-positive normal log DB 119, and the false-positive normal log tendency DB 120.
As in the first embodiment, the feature extraction unit 1051 extracts prescribed features from the K1 neighborhood true-negative normal logs and the x true-positive attack logs.
Further, the feature extraction unit 1051 acquires the K2 neighborhood false-positive normal logs from the neighborhood false-positive normal log DB 119. The feature extraction unit 1051 also acquires, for example, x true-positive attack logs (assume one here) from the neighborhood extraction unit 104.
The feature extraction unit 1051 then extracts prescribed features from the K2 neighborhood false-positive normal logs and the x true-positive attack logs.
Further, the feature representation unit 1052 converts the features extracted from the K2 false-positive normal logs and the x true-positive attack logs into a format (feature vectors) that is easy to process with machine learning algorithms.
Further, the importance calculation unit 1053 uses the feature vectors obtained by the feature representation unit 1052 to train a classifier (C2) that distinguishes the K2 neighborhood false-positive normal logs from the x true-positive attack logs. The importance calculation unit 1053 calculates, for each feature of the feature vectors of the true-positive attack logs, a feature importance that is the degree to which the classifier (C2) distinguishes the K2 neighborhood false-positive normal logs from the x (for example, one) true-positive attack logs. The importance calculation unit 1053 calculates the importance of each feature so that a feature with a higher degree of distinguishing the neighborhood false-positive normal logs from the true-positive attack logs has a higher importance.
The importance calculation unit 1053 then extracts the top n2 features F21 to F2n2 with the highest importance. n2 is 1 or more. The importance calculation unit 1053 calculates the importance by using, for example, a random forest.
Further, the tendency calculation unit 1054 acquires statistical information about the features F21 to F2n2 in the K2 neighborhood false-positive normal logs. The tendency calculation unit 1054 stores the statistical information in the false-positive normal log tendency DB 120.
As in the first embodiment, for a feature that is categorical data, the tendency calculation unit 1054 acquires the median (med2) and the mode (mod2) of the percentiles of the categorical data as statistical information. For a feature that is numerical data, the tendency calculation unit 1054 acquires the mean (μ2) and the standard deviation (σ2) of the numerical data as statistical information.
As in the first embodiment, the feature correction unit 106 consists of the data correction unit 1061 and the verification unit 1062. In this embodiment, the feature correction unit 106 uses the detection avoidance attack log DB 111, the true-negative normal log tendency DB 117, and the false-positive normal log tendency DB 120.
The data correction unit 1061 then generates list21 (step S5_8). List21 contains corrected values obtained by modifying the actual value of the feature F21 in the true-positive attack log with the actual values of the feature F21 in the neighborhood false-positive normal logs. That is, list21 contains a plurality of corrected values reflecting the values of the feature F21 of the K2 neighborhood false-positive normal logs.
The method of generating list21 is described later.
Specifically, the verification unit 1062 causes the detection unit 102 to determine whether the access defined by each corrected true-positive attack log corresponds to a normal access or an attack access.
Categorical data includes domains, methods, status codes, and the like. Numerical data includes request sizes, time intervals, and the like.
The data correction unit 1061 repeats the process of step S6_5 while the value of cat21 is equal to or less than the value of (mod21 + τ21) (YES in step S6_4).
When the value of cat21 becomes larger than (mod21 + τ21) (NO in step S6_4), the process proceeds to step S6_8.
The data correction unit 1061 repeats the process of step S6_7 while the value of cat21 is equal to or greater than the value of (mod21 - τ21) (YES in step S6_6).
When the value of cat21 becomes smaller than (mod21 - τ21) (NO in step S6_6), the process proceeds to step S6_8.
The data correction unit 1061 repeats the process of step S6_12 while the value of num21 is equal to or less than (μ21 + τ21) (YES in step S6_11).
The data correction unit 1061 repeats the process of step S6_14 while the value of num21 is equal to or greater than (μ21 - τ21) (YES in step S6_13).
When F11 and F23 are numerical data, the data correction unit 1061 refers to the mean (μ23) corresponding to F23 in the false-positive normal log tendency DB 120. The minimum and maximum values of the elements of list23, the list of the feature F23, are denoted min(list23) and max(list23), respectively. When μ23 is smaller than min(list23), the data correction unit 1061 deletes from list11 the elements that are equal to or greater than μ23 - β and equal to or less than min(list23) + β. When μ23 is larger than max(list23), the data correction unit 1061 deletes from list11 the elements that are equal to or greater than max(list23) - β and equal to or less than μ23 + β. β is a prescribed value and may be defined from statistics related to F23, for example, 3 × σ23.
The data correction unit 1061 simply merges (concatenates) list1i of a feature F1i and list2i of a feature F2i that are not common.
In this embodiment as well, attack samples of false-negative accesses that can evade detection by the attack detection system can be obtained. Further, in this embodiment, the features of the true-positive attack log are corrected by using the features of the neighborhood true-negative normal logs after excluding the features that overlap with the features of the neighborhood false-positive normal logs. Therefore, compared with the first embodiment, attack samples of false-negative accesses that evade detection more skillfully can be obtained.
Alternatively, one of these two embodiments may be partially implemented.
Alternatively, these two embodiments may be partially combined.
The configurations and procedures described in these two embodiments may be changed as necessary.
Finally, a supplementary description of the hardware configuration of the attack log generation device 100 is given.
The processor 901 shown in FIG. 1 is an IC (Integrated Circuit) that performs processing.
The processor 901 is a CPU (Central Processing Unit), a DSP (Digital Signal Processor), or the like.
The main storage device 902 shown in FIG. 1 is a RAM (Random Access Memory).
The auxiliary storage device 903 shown in FIG. 1 is a ROM (Read Only Memory), a flash memory, an HDD (Hard Disk Drive), or the like.
At least a part of the OS is executed by the processor 901.
The processor 901 executes the programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 while executing at least a part of the OS.
When the processor 901 executes the OS, task management, memory management, file management, communication control, and the like are performed.
At least one of the information, data, signal values, and variable values indicating the processing results of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 is stored in at least one of the main storage device 902, the auxiliary storage device 903, and a register or cache memory in the processor 901.
The programs that realize the functions of the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 may be stored in a portable recording medium such as a magnetic disk, flexible disk, optical disc, compact disc, Blu-ray (registered trademark) disc, or DVD. A portable recording medium storing these programs may be distributed.
The attack log generation device 100 may be realized by processing circuits. The processing circuits are, for example, logic ICs (Integrated Circuit), GAs (Gate Array), ASICs (Application Specific Integrated Circuit), or FPGAs (Field-Programmable Gate Array).
In this case, the normal classification unit 101, the detection unit 102, the attack generation unit 103, the neighborhood extraction unit 104, the tendency extraction unit 105, and the feature correction unit 106 are each realized as a part of the processing circuits.
In this specification, the superordinate concept of the processor and the processing circuit is referred to as "processing circuitry".
That is, the processor and the processing circuit are each specific examples of the "processing circuitry".
Claims (12)
- An information processing device comprising: an extraction unit to extract a true-positive access that is known to be an access for attack purposes and that an attack detection system has determined to be an access for attack purposes; and a correction unit to correct features of the true-positive access by using features of a true-negative access that is known to be a normal access and that the attack detection system has determined to be a normal access.
- The information processing device according to claim 1, wherein the correction unit corrects the features of the true-positive access so that the attack detection system determines that a corrected true-positive access, which is the true-positive access after its features have been corrected, is a normal access.
- The information processing device according to claim 1, wherein, when the attack detection system determines that a corrected true-positive access, which is the true-positive access after its features have been corrected, is an access for attack purposes, the correction unit corrects features of the corrected true-positive access by using the features of the true-negative access.
- The information processing device according to claim 1, wherein the correction unit extracts, as the true-negative access, an access having features close to the features of the true-positive access from among accesses that are known to be normal accesses and that the attack detection system has determined to be normal accesses, and corrects the features of the true-positive access by using the features of the extracted true-negative access.
- The information processing device according to claim 1, wherein, when the true-positive access has a plurality of features, the correction unit selects a feature matching a selection condition from the plurality of features, and corrects the selected feature by using the features of the true-negative access.
- The information processing device according to claim 5, wherein the correction unit calculates, for each of the plurality of features, a feature importance that is a degree to which the true-positive access and the true-negative access are distinguished, and selects, from the plurality of features, a feature whose importance matches the selection condition.
- The information processing device according to claim 6, wherein the correction unit calculates the importance of each of the plurality of features so that a feature with a higher degree of distinguishing the true-positive access from the true-negative access has a higher importance.
- The information processing device according to claim 1, wherein the correction unit corrects the features of the true-positive access by using features of a false-positive access, which is known to be a normal access but which the attack detection system has erroneously determined to be an access for attack purposes, and the features of the true-negative access.
- The information processing device according to claim 8, wherein, when the attack detection system determines that a corrected true-positive access, which is the true-positive access after its features have been corrected, is an access for attack purposes, the correction unit corrects features of the corrected true-positive access by using the features of the false-positive access and the features of the true-negative access.
- The information processing device according to claim 8, wherein the correction unit corrects the features of the true-positive access by using the features of the true-negative access after excluding, from the features of the true-negative access, features that overlap with the features of the false-positive access.
- An information processing method comprising: extracting, by a computer, a true-positive access that is known to be an access for attack purposes and that an attack detection system has determined to be an access for attack purposes; and correcting, by the computer, features of the true-positive access by using features of a true-negative access that is known to be a normal access and that the attack detection system has determined to be a normal access.
- An information processing program that causes a computer to execute: an extraction process of extracting a true-positive access that is known to be an access for attack purposes and that an attack detection system has determined to be an access for attack purposes; and a correction process of correcting features of the true-positive access by using features of a true-negative access that is known to be a normal access and that the attack detection system has determined to be a normal access.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022553624A JP7170955B1 (ja) | 2020-12-07 | 2020-12-07 | 情報処理装置、情報処理方法及び情報処理プログラム |
PCT/JP2020/045452 WO2022123623A1 (ja) | 2020-12-07 | 2020-12-07 | 情報処理装置、情報処理方法及び情報処理プログラム |
CN202080107438.3A CN116569168A (zh) | 2020-12-07 | 2020-12-07 | 信息处理装置、信息处理方法和信息处理程序 |
DE112020007653.9T DE112020007653T5 (de) | 2020-12-07 | 2020-12-07 | Informationsverarbeitungseinrichtung, informationsverarbeitungsverfahren und informationsverarbeitungsgsprogramm |
US18/297,035 US20230262075A1 (en) | 2020-12-07 | 2023-04-07 | Information processing device, information processing method, and computer readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2020/045452 WO2022123623A1 (ja) | 2020-12-07 | 2020-12-07 | 情報処理装置、情報処理方法及び情報処理プログラム |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/297,035 Continuation US20230262075A1 (en) | 2020-12-07 | 2023-04-07 | Information processing device, information processing method, and computer readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022123623A1 true WO2022123623A1 (ja) | 2022-06-16 |
Family
ID=81974303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2020/045452 WO2022123623A1 (ja) | 2020-12-07 | 2020-12-07 | 情報処理装置、情報処理方法及び情報処理プログラム |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230262075A1 (ja) |
JP (1) | JP7170955B1 (ja) |
CN (1) | CN116569168A (ja) |
DE (1) | DE112020007653T5 (ja) |
WO (1) | WO2022123623A1 (ja) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160381042A1 (en) * | 2015-06-29 | 2016-12-29 | Fortinet, Inc. | Emulator-based malware learning and detection |
WO2018100718A1 (ja) * | 2016-12-01 | 2018-06-07 | 三菱電機株式会社 | 評価装置、セキュリティ製品の評価方法および評価プログラム |
US20190080089A1 (en) * | 2017-09-11 | 2019-03-14 | Intel Corporation | Adversarial attack prevention and malware detection system |
WO2019073557A1 (ja) * | 2017-10-11 | 2019-04-18 | 三菱電機株式会社 | サンプルデータ生成装置、サンプルデータ生成方法およびサンプルデータ生成プログラム |
JP2019197549A (ja) * | 2013-06-24 | 2019-11-14 | サイランス・インコーポレイテッドCylance Inc. | 機械学習を使用した生成的マルチモデルマルチクラス分類および類似度分析のための自動システム |
Also Published As
Publication number | Publication date |
---|---|
JP7170955B1 (ja) | 2022-11-14 |
CN116569168A (zh) | 2023-08-08 |
JPWO2022123623A1 (ja) | 2022-06-16 |
DE112020007653T5 (de) | 2023-08-03 |
US20230262075A1 (en) | 2023-08-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Khraisat et al. | A critical review of intrusion detection systems in the internet of things: techniques, deployment strategy, validation strategy, attacks, public datasets and challenges | |
US20230291755A1 (en) | Enterprise cybersecurity ai platform | |
Wang et al. | A Host‐Based Anomaly Detection Framework Using XGBoost and LSTM for IoT Devices | |
Demertzis et al. | A bio-inspired hybrid artificial intelligence framework for cyber security | |
Manhas et al. | Implementation of intrusion detection system for internet of things using machine learning techniques | |
Malik et al. | [Retracted] An Improved Deep Belief Network IDS on IoT‐Based Network for Traffic Systems | |
Atefi et al. | Anomaly analysis for the classification purpose of intrusion detection system with K-nearest neighbors and deep neural network | |
Natarajan | Cyber secure man-in-the-middle attack intrusion detection using machine learning algorithms | |
Kheddar et al. | Deep transfer learning for intrusion detection in industrial control networks: A comprehensive review | |
CN112884204B (zh) | Network security risk event prediction method and device | |
Abirami et al. | Building an ensemble learning based algorithm for improving intrusion detection system | |
Eid et al. | Comparative study of ML models for IIoT intrusion detection: impact of data preprocessing and balancing | |
Liu et al. | A Hybrid Supervised Learning Approach for Intrusion Detection Systems | |
Soliman et al. | Rank: Ai-assisted end-to-end architecture for detecting persistent attacks in enterprise networks | |
Khoulimi et al. | An Overview of Explainable Artificial Intelligence for Cyber Security | |
Al-Zoubi et al. | A feature selection technique for network intrusion detection based on the chaotic crow search algorithm | |
WO2022123623A1 (ja) | Information processing device, information processing method, and information processing program | |
Loaiza et al. | Utility of artificial intelligence and machine learning in cybersecurity | |
Patil et al. | Learning to detect phishing web pages using lexical and string complexity analysis | |
Kejia et al. | A classification model based on svm and fuzzy rough set for network intrusion detection | |
Shah | Understanding and study of intrusion detection systems for various networks and domains | |
Redhu et al. | Deep learning-powered malware detection in cyberspace: a contemporary review | |
Bella et al. | Healthcare Intrusion Detection using Hybrid Correlation-based Feature Selection-Bat Optimization Algorithm with Convolutional Neural Network: A Hybrid Correlation-based Feature Selection for Intrusion Detection Systems. | |
Ansah et al. | Enhancing Network Security Through Proactive Anomaly Detection: A Comparative Study of Auto-Encoder Models and K-Nearest Neighbours Algorithm | |
Srinivasan | Keylogger Malware Detection Using Machine Learning Model for Platform-Independent Devices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20965009 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022553624 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202080107438.3 Country of ref document: CN |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20965009 Country of ref document: EP Kind code of ref document: A1 |