WO2023151257A1

WO2023151257A1 - Method and apparatus for simulating cyber kill chain, storage medium and electronic device

Info

Publication number: WO2023151257A1
Application number: PCT/CN2022/113829
Authority: WO
Inventors: 唐杰; 吴龙平; 莫建平; 余凯
Original assignee: 三六零科技集团有限公司
Priority date: 2022-02-11
Filing date: 2022-08-22
Publication date: 2023-08-17
Also published as: CN116633567A

Abstract

A method and apparatus for simulating a cyber kill chain, a storage medium and an electronic device. The method comprises: acquiring historical cyberattack events, representing the events in the form of graph data, and constructing an initial graph representation learning model; and acquiring enterprise target asset information and inputting the information into the initial graph representation learning model for training, so as to generate a kill chain path model, the kill chain path model being input to an intruder simulation system to evaluate enterprise security. The method computes a simulated cyber kill chain for an enterprise environment by means of a knowledge graph, and adapts both to different enterprise environments and to continually changing asset information within the same enterprise environment.

Description

Method, device, storage medium and electronic equipment for simulating attack kill chain

Technical field

The invention relates to the technical field of security detection, in particular to a method, device, storage medium and electronic equipment for simulating an attack kill chain.

Background

At present, enterprise environment security detection and security device capability assessment basically rely on two methods: manual penetration testing and automated intrusion and attack simulation (BAS). Manual penetration testing can meet an enterprise's short-term detection needs, but it has many deficiencies in familiarity with the enterprise environment, later-stage delivery, work efficiency, standardization, and the controllability of behavior and data. Automated intrusion and attack simulation (BAS) can run automated simulated attacks against the target environment using full vulnerability detection, a full TTP content library, and preset scenarios. However, even when the relevant asset mapping has been done for the user environment, this method still has limitations and disadvantages: for example, it attacks assets as single points and cannot simulate an attack kill chain that matches the real enterprise environment based on the degree of correlation between assets.

Summary of the invention

The purpose of the embodiments of the present invention is to provide a method, device, storage medium, and electronic device for simulating an attack kill chain, which compute the simulated attack kill chain in an enterprise environment through a knowledge graph and adapt to different enterprise environments or to the ever-changing asset information within the same enterprise environment.

In order to achieve the above object, one aspect of the present invention provides a method for simulating an attack kill chain, including:

obtaining historical attack events and representing them in the form of graph data to build an initial graph representation learning model;

obtaining the target asset information of the enterprise and inputting it into the initial graph representation learning model for training to generate a kill chain path model;

wherein the kill chain path model is used as input to an intruder simulation system to evaluate enterprise security.

Optionally, representing the historical attack events in the form of graph data and constructing an initial graph representation learning model includes:

extracting attack knowledge from the historical attack events and associating it with the kill chain;

representing the historical attack events associated with the kill chain in the form of graph data, and constructing the initial graph representation learning model.

Optionally, extracting attack knowledge from the historical attack events and associating it with the kill chain includes:

performing training semantic analysis on the historical attack events to extract attack knowledge as a data set corpus;

taking the TTP information in the attack knowledge and the enterprise target asset information as entities to form a training set corpus, and establishing a relationship with the kill chain.

Optionally, performing training semantic analysis on the historical attack events to extract attack knowledge as a data set corpus includes:

performing text preprocessing and deep-level sentence segmentation on the historical attack events;

performing, using natural language processing technology, sentence semantic dependency analysis on the historical attack events after the deep-level sentence segmentation;

preparing a data set corpus from the historical attack events after the sentence semantic dependency analysis, and extracting attack knowledge.
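The deep-level sentence segmentation step can be illustrated with a minimal Python sketch. It assumes that segmentation means splitting at sentence-ending punctuation and then at coordinating conjunctions; the name `split_deep` and the conjunction list are hypothetical and are not taken from this disclosure. Splitting on a bare "and" can also break noun phrases, so a practical system would likely need a syntactic check.

```python
import re

# Sentence-ending punctuation, Chinese and Western. A period only counts
# when followed by whitespace or end of text, so file names stay intact.
SENTENCE_END = re.compile(r"[。！？!?；;]|\.(?=\s|$)")
# Assumed list of coordinating conjunctions that join separate actions.
CONJUNCTION = re.compile(r"\b(?:and then|then|and)\b")


def split_deep(text: str) -> list[str]:
    """Split text into clauses that should each describe one action."""
    clauses = []
    for sentence in SENTENCE_END.split(text):
        for clause in CONJUNCTION.split(sentence):
            clause = clause.strip(" ,，")
            if clause:
                clauses.append(clause)
    return clauses
```

Each returned clause is intended to describe one attacker action, which is the property the later dependency analysis relies on.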

Optionally, obtaining enterprise target asset information and inputting it into the initial graph representation learning model for training to generate a kill chain path model includes:

acquiring the enterprise target asset information and mapping it to the entities representing enterprise target asset information in the initial graph representation learning model;

generating the kill chain path model by performing, through knowledge computation, time series analysis on the entities representing TTP information in the initial graph representation learning model and the kill chain scope in which they are located.

Optionally, the enterprise target asset information includes enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information;

acquiring the enterprise target asset information and mapping it to the entities representing enterprise target asset information in the initial graph representation learning model includes:

converting the enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information into a knowledge graph standard storage format, and enumerating and mapping it against the entities in the initial graph representation learning model that represent enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information.

Another aspect of the present invention also provides a device for simulating an attack kill chain, including:

a knowledge graph construction module, used to obtain historical attack events and represent them in the form of graph data to build an initial graph representation learning model;

a kill chain path generation module, used to obtain the target asset information of the enterprise, input it into the initial graph representation learning model for training, and generate a kill chain path model.

Optionally, the historical attack events are represented in the form of graph data, and an initial graph representation learning model is constructed, including:

Another aspect of the present invention also provides a storage medium for storing a computer program for executing the above-mentioned method for simulating an attack kill chain.

Another aspect of the present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the processor implements the above-mentioned method for simulating an attack kill chain when executing the computer program.

In the embodiments of the present invention, attack knowledge is extracted from the historical attack events and associated with the kill chain, the historical attack events associated with the kill chain are represented in the form of graph data, and an initial graph representation learning model is constructed and displayed in the form of a knowledge graph. Then, for different enterprise environments, enterprise target asset information is obtained and input into the initial graph representation learning model for training to generate a kill chain path model, which is input into the intruder simulation system to evaluate enterprise security. Unlike prior-art automated attack and simulated intrusion products, which can only simulate attacks against single-point assets or associate simulated attacks inefficiently, the method of the present invention computes the simulated attack kill chain in the enterprise environment through a knowledge graph and can accurately infer the attack kill chain paths that may occur. This makes simulated intrusion and attack more intelligent, improves the effectiveness of simulated attacks, and adapts to different enterprise environments as well as to the ever-changing asset information within the same enterprise environment.

Description of drawings

FIG. 1 is a schematic flow diagram of a method for simulating an attack kill chain provided by an embodiment of the present invention;

Fig. 2 is the specific flowchart of step S1;

Fig. 3 is the specific flowchart of step S2;

Fig. 4 is a schematic structural diagram of the device for simulating an attack kill chain of the present invention;

Wherein: 400 - device for simulating an attack kill chain;

401 - knowledge graph construction module;

402 - kill chain path generation module;

Fig. 5 is a schematic structural diagram of an electronic device;

Wherein: 500 - electronic device;

501 - storage medium;

502 - processor.

Detailed description

In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

It should be noted that references in this specification to "one embodiment", "an embodiment", "an example embodiment" and the like mean that the described embodiment may include specific features, structures or characteristics, but not every embodiment must include those specific features, structures or characteristics. Furthermore, such expressions do not necessarily refer to the same embodiment. Further, when specific features, structures or characteristics are described in conjunction with an embodiment, whether or not explicitly stated, it is within the knowledge of those skilled in the art to combine such features, structures or characteristics into other embodiments.

In addition, some terms are used in the description and the following claims to refer to specific components or parts, and those skilled in the art should understand that manufacturers may use different nouns or terms to refer to the same component or part. This description and the subsequent claims do not distinguish components or parts by differences in name, but by differences in function. "Includes" and "comprises" as used throughout the specification and the following claims are open-ended terms and should be interpreted as "including but not limited to".

Embodiment 1 of the present invention provides a method for simulating an attack kill chain. FIG. 1 shows a flow chart of the steps in this embodiment.

A method for simulating an attack kill chain may include the following steps:

S1. Obtain historical attack events and represent them in the form of graph data to build an initial graph representation learning model.

In a specific implementation, FIG. 2 is the detailed flowchart corresponding to step S1, which includes:

S11. Extract attack knowledge from the historical attack events, and associate it with the kill chain.

In a specific implementation, extracting attack knowledge from the historical attack events and associating it with the kill chain includes: performing training semantic analysis on the historical attack events to extract attack knowledge as a data set corpus; and taking the TTP information in the attack knowledge and the enterprise target asset information as entities to form a training set corpus, and establishing a relationship with the kill chain.

S12. Represent the historical attack events associated with the kill chain in the form of graph data, construct an initial graph representation learning model, and define each node of the graph data to represent TTP information and the enterprise target asset information.
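As an illustration of the graph data described in S12, the following Python sketch stores TTP nodes and asset nodes with typed edges between them. It covers only the data structure, not the representation learning itself. The class names, the `targets` relation, the event keys, and the kill chain phase labels are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str        # "ttp" or "asset"
    phase: str = ""  # kill chain phase; only meaningful for TTP nodes


@dataclass
class AttackGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> Node
    edges: set = field(default_factory=set)    # (source_id, relation, target_id)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source_id: str, relation: str, target_id: str) -> None:
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before linking them")
        self.edges.add((source_id, relation, target_id))

    def neighbors(self, source_id: str, relation: str) -> list:
        return sorted(t for s, r, t in self.edges if s == source_id and r == relation)


def build_graph(events) -> AttackGraph:
    """Build a graph from records with hypothetical keys 'ttp', 'phase', 'asset'."""
    graph = AttackGraph()
    for event in events:
        graph.add_node(Node(event["ttp"], "ttp", event["phase"]))
        graph.add_node(Node(event["asset"], "asset"))
        graph.add_edge(event["ttp"], "targets", event["asset"])
    return graph
```

Keeping the kill chain phase as an attribute of each TTP node is one simple way to record the association with the kill chain; a separate phase node linked by an edge would work equally well.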

S2. Obtain the target asset information of the enterprise, input it into the initial graph representation learning model for training, and generate a kill chain path model.

In a specific implementation, FIG. 3 is the detailed flowchart corresponding to step S2, which includes:

S21. Acquire enterprise target asset information, and map it to entities representing enterprise target asset information in the initial graph representation learning model;

S22. Through knowledge calculation, time series analysis is performed on the entity representing the TTP information in the initial graph representation learning model and the kill chain range where it is located, to generate the kill chain path model.
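One simplified reading of steps S21 and S22 is sketched below in Python: TTPs whose target asset exists in the enterprise are kept, grouped by kill chain phase, and combined in phase order into candidate paths. This stands in for the time series analysis only loosely. The phase list follows the commonly cited seven-phase kill chain, which this disclosure does not specify, and `candidate_paths` is a hypothetical name.

```python
from itertools import product

# Assumed phase ordering; the disclosure does not name a specific kill chain.
PHASES = [
    "reconnaissance", "weaponization", "delivery", "exploitation",
    "installation", "command and control", "actions on objectives",
]


def candidate_paths(ttps, enterprise_assets):
    """Return phase-ordered TTP paths restricted to assets the enterprise has.

    ttps: iterable of (ttp_name, phase, asset) tuples.
    enterprise_assets: set of asset names found in the target environment.
    """
    by_phase = {}
    for name, phase, asset in ttps:
        if asset in enterprise_assets:
            by_phase.setdefault(phase, []).append((name, asset))
    # Keep only phases that have at least one applicable TTP, in chain order.
    ordered = [by_phase[p] for p in PHASES if p in by_phase]
    if not ordered:
        return []
    # One path picks exactly one applicable TTP per covered phase.
    return [list(path) for path in product(*ordered)]
```

Because TTPs that target absent assets are dropped before any combination is formed, the number of candidate paths grows only with the TTPs that actually apply to the environment.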

In this embodiment, the generated kill chain path model can be sent to the BAS intruder simulation system, and the security of the user environment is evaluated according to the attack results.

In this embodiment, attack knowledge is extracted from the historical attack events and associated with the kill chain, the historical attack events associated with the kill chain are represented in the form of graph data, and an initial graph representation learning model is constructed and displayed in the form of a knowledge graph. Then, for different enterprise environments, enterprise target asset information is obtained and input into the initial graph representation learning model for training to generate a kill chain path model, which is input into the intruder simulation system to evaluate enterprise security. Unlike prior-art automated attack and simulated intrusion products, which can only simulate attacks against single-point assets or associate simulated attacks inefficiently, the method of the present invention computes the simulated attack kill chain in the enterprise environment through a knowledge graph and can accurately infer the attack kill chain paths that may occur. This makes simulated intrusion and attack more intelligent, improves the effectiveness of simulated attacks, and adapts to different enterprise environments as well as to the ever-changing asset information within the same enterprise environment. It should be noted that, although the timing analysis may yield multiple candidate kill chains because multiple assets can be combined, the method is far more efficient than blind enumeration and invalid association. The kill chain also keeps evolving and iterating with the constantly changing enterprise target asset information, such as new vulnerability information and new attack tactics, techniques, and procedures (TTPs).

Embodiment 2 of the present invention provides a method for simulating an attack kill chain, which may include the following steps:

In a specific implementation, the historical attack events include structured, semi-structured, and unstructured data. In this embodiment, training semantic analysis is performed on the historical attack events through data processing steps such as text preprocessing, deep-level sentence segmentation, semantic dependency analysis of the target TTP language, vocabulary tokenization, synonym expansion, and model training and prediction, so that attack knowledge is extracted from the historical attack events as a data set corpus. The TTP information in the attack knowledge and the enterprise target asset information are then taken as entities to form a training set corpus, and a relationship is established with the kill chain.

In this embodiment, text preprocessing is performed on the historical attack events to reduce input randomness and the input dimensionality of the algorithm, which improves performance. For example, the text of a historical attack event "A Trojan horse program that, after it runs, releases the normal Tencent TP program TPHelper.exe and the malicious TPHelperBase.dll in the %TEMP% directory to carry out DLL hijacking." becomes, after text preprocessing, "A Trojan horse program that, after it runs, releases a normal Tencent TP program EXE file and a malicious DLL file in a specific directory to carry out DLL hijacking."

Then, deep-level sentence segmentation is performed on the preprocessed historical attack events, so that each resulting sentence to be analyzed can express TTP information independently. Segmentation is performed at the punctuation marks that end Chinese sentences and at the coordinating conjunctions that appear in the text. Next, natural language processing technology is used to perform sentence semantic dependency analysis on the segmented historical attack events: the complex and varied descriptions are standardized and unified, and the tools and methods used by the attackers involved, the spatial location, the scope of implementation, and the effect achieved are output in a standardized form.

A data set corpus is created from the historical attack events after sentence semantic dependency analysis, attack knowledge is extracted, and synonym expansion is performed on the high-frequency keywords in the data set corpus to improve the recall of subsequent model predictions. For example, "Trojan horse collects account names in the domain." is expanded to "Trojan horse collects user names in the domain.", "Trojan horse collects user accounts in the domain.", "Trojan horse harvests user login names in the domain.", and so on. Finally, the TTP information in the data set corpus and the enterprise target asset information are taken as entities to form the training set corpus, and a relationship is established with the kill chain.
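The text preprocessing and synonym expansion steps above can be sketched in Python as follows. The regular-expression rules and the synonym table are illustrative assumptions modeled on the two examples in this embodiment; they are not the rules actually used by the described method.

```python
import re

# Illustrative normalization rules (assumptions, not taken from the source).
NORMALIZATION_RULES = [
    (re.compile(r"(?:the\s+)?%\w+%\s+directory", re.IGNORECASE), "a specific directory"),
    (re.compile(r"\b[\w-]+\.exe\b", re.IGNORECASE), "EXE file"),
    (re.compile(r"\b[\w-]+\.dll\b", re.IGNORECASE), "DLL file"),
]

# Hypothetical synonym table for high-frequency keywords.
SYNONYMS = {
    "account names": ["user names", "user accounts", "user login names"],
}


def normalize(text: str) -> str:
    """Replace environment-specific details with generic placeholders."""
    for pattern, replacement in NORMALIZATION_RULES:
        text = pattern.sub(replacement, text)
    return text


def expand_synonyms(sentence: str) -> list[str]:
    """Return the sentence plus one variant per known synonym."""
    variants = [sentence]
    for term, alternatives in SYNONYMS.items():
        if term in sentence:
            variants.extend(sentence.replace(term, alt) for alt in alternatives)
    return variants
```

Normalization shrinks the vocabulary the model has to handle, while synonym expansion adds paraphrases to the corpus so that differently worded reports of the same TTP are still matched.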

S21. Acquire enterprise target asset information, and map it to entities representing enterprise target asset information in the initial graph representation learning model.

In a specific implementation, the enterprise target asset information mainly includes enterprise hardware configuration information, enterprise software configuration information, and/or information on the industry to which the enterprise belongs. The enterprise hardware configuration information mainly covers all hardware equipment, such as servers, hosts, gateways, switches, and routers in the DMZ and office areas, and servers and industrial control equipment in the production area. The enterprise software configuration information mainly covers all software systems operated and used, such as business systems, OA systems, ERP systems, and common employee tools, as well as all software systems and service systems installed on the above hardware to provide or support services. The industry information identifies the industry to which the enterprise belongs, such as finance, tobacco, or government, and this information is easy to obtain. By sorting out the version information, security patch information, vulnerability information, port information, and protocols that exist in or are involved in the hardware and software services, together with the industry information of the enterprise, the enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information is converted into the standard storage format of the knowledge graph, and is enumerated and mapped against the entities in the initial graph representation learning model that represent enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information.
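The conversion and mapping described above can be illustrated with a short Python sketch. The disclosure does not name the standard storage format of the knowledge graph, so subject-predicate-object triples are assumed here. The asset keys, predicate names, and function names are hypothetical.

```python
def asset_to_triples(asset: dict) -> list:
    """Convert one asset record into (subject, predicate, object) triples.

    Assumed keys: 'name', 'kind', and optionally 'version', 'ports',
    'vulnerabilities'.
    """
    name = asset["name"]
    triples = [(name, "is_a", asset["kind"])]
    if "version" in asset:
        triples.append((name, "has_version", asset["version"]))
    for port in asset.get("ports", []):
        triples.append((name, "exposes_port", str(port)))
    for vuln in asset.get("vulnerabilities", []):
        triples.append((name, "has_vulnerability", vuln))
    return triples


def split_by_known_entities(triples, known_entities):
    """Separate triples whose object already exists as a graph entity."""
    matched = [t for t in triples if t[2] in known_entities]
    unmatched = [t for t in triples if t[2] not in known_entities]
    return matched, unmatched
```

Triples that match an existing entity link the enterprise's assets to the TTPs learned from historical events, while unmatched triples indicate asset details the graph has no knowledge about yet.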

It should be noted that the method embodiments are described as a series of combined actions for simplicity, but those skilled in the art should know that the embodiments of the present invention are not limited by the described sequence of actions, because according to the embodiments of the present invention certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.

The foregoing embodiments of the present invention can be applied to terminal equipment that simulates an attack kill chain. The terminal equipment can include palmtop computers, desktop computers, signature terminals that provide users with electronic signatures, mobile phones, PDAs (Personal Digital Assistants), and so on, which is not limited in this embodiment of the present invention. The terminal can support operating systems such as Windows, Android, iOS, and Windows Phone.

Referring to Figure 4, a device 400 for simulating an attack kill chain is shown. It can be applied to terminal equipment such as computers, and the method for simulating an attack kill chain can be implemented through it. The device includes at least a knowledge graph construction module 401 and a kill chain path generation module 402, specifically:

A device 400 for simulating an attack kill chain, comprising:

a knowledge graph construction module 401, used to obtain historical attack events and represent them in the form of graph data to construct an initial graph representation learning model;

a kill chain path generation module 402, used to obtain the target asset information of the enterprise, input it into the initial graph representation learning model for training, and generate a kill chain path model.

Through natural language processing technology, sentence semantic dependency analysis is performed on the historical attack events after the deep sentence segmentation of the text;

The present invention also provides a storage medium for storing a computer program for executing the method for simulating an attack kill chain as described in FIGS. 1-3. For example, computer program instructions, when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of the computer. The program instructions for invoking the method of the present invention may be stored in a fixed or removable storage medium, and/or transmitted through broadcast or other data streams in signal-bearing media, and/or stored in a running storage medium.

Here, an embodiment according to the present invention includes an electronic device 500 as shown in FIG. 5, which in some implementations includes a storage medium 501 for storing computer programs and a processor 502 for executing computer programs, wherein, when the computer program is executed by the processor, the electronic device is triggered to execute the methods and/or technical solutions of the foregoing embodiments. The electronic device 500 may be a terminal device such as a mobile phone or a computer.

It should be noted that the software program of the present invention can be executed by a processor to realize the above steps or functions. Likewise, the software program (including associated data structures) of the present invention can be stored in a computer-readable recording medium such as RAM, a magnetic or optical drive, a floppy disk, and the like. The method for simulating an attack kill chain according to the present invention can be implemented on a computer as a computer-implemented method, and the executable code for the method, or parts thereof, can be stored on a computer program product. Examples of computer program products include memory devices, optical storage devices, integrated circuits, servers, online software, and the like. In some embodiments, a computer program product comprises non-transitory program code means stored on a computer readable medium for performing the method according to the invention when said program product is executed on a computer.

To sum up, the method for simulating an attack kill chain provided by the present invention extracts attack knowledge from the historical attack events and associates it with the kill chain, represents the historical attack events associated with the kill chain in the form of graph data, and constructs an initial graph representation learning model that is displayed in the form of a knowledge graph. Then, for different enterprise environments, enterprise target asset information is obtained and input into the initial graph representation learning model for training to generate a kill chain path model, which is input into the intruder simulation system to evaluate enterprise security. Unlike prior-art automated attack and simulated intrusion products, which can only simulate attacks against single-point assets or associate simulated attacks inefficiently, the method of the present invention computes the simulated attack kill chain in the enterprise environment through a knowledge graph and can accurately infer the attack kill chain paths that may occur. This makes simulated intrusion and attack more intelligent, improves the effectiveness of simulated attacks, and adapts to different enterprise environments as well as to the ever-changing asset information within the same enterprise environment. It should be noted that, although the timing analysis may yield multiple candidate kill chains because multiple assets can be combined, the method is far more efficient than blind enumeration and invalid association. The kill chain also keeps evolving and iterating with the constantly changing enterprise target asset information, such as new vulnerability information and new attack tactics, techniques, and procedures (TTPs).

Certainly, the present invention may also have various other embodiments. Without departing from the spirit and essence of the present invention, those skilled in the art can make various corresponding changes and modifications according to the present invention, but these corresponding changes and modifications shall fall within the scope of protection of the appended claims of the present invention.

The present invention discloses A1, a method for simulating the attack kill chain, comprising:

A2. According to the method described in A1, the historical attack events are represented in the form of graph data, and an initial graph representation learning model is constructed, including:

A3. According to the method described in A2, the attack knowledge is extracted from the historical attack events and associated with the kill chain, including:

A4. According to the method described in A3, the training semantic analysis is performed on the historical attack event, and the attack knowledge is extracted as a data set corpus, including:

A5. According to the method described in A3, the acquisition of enterprise target asset information is input to the initial graph representation learning model for training to generate a kill chain path model, including:

A6. According to the method described in A5,

The enterprise target asset information includes enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information;

The present invention also discloses B7, a device for simulating an attack kill chain, including:

The kill chain path generation module is used to obtain the target asset information of the enterprise, input it to the initial graph representation learning model for training, and generate a kill chain path model,

B8. According to the device described in B7, the historical attack event is represented in the form of graph data, and an initial graph representation learning model is constructed, including:

B9. According to the device described in B8, the attack knowledge is extracted from the historical attack events and associated with the kill chain, including:

B10. According to the device described in B9, the training semantic analysis is performed on the historical attack events, and the attack knowledge is extracted as a data set corpus, including:

B11. According to the device described in B9, the acquisition of enterprise target asset information is input to the initial graph representation learning model for training to generate a kill chain path model, including:

B12. The device according to B7,

The present invention also discloses C13, a storage medium for storing a computer program for executing the method for simulating an attack kill chain described in any one of A1 to A6.

The present invention also discloses D14, an electronic device, including a memory, a processor, and a computer program stored on the memory and operable on the processor, wherein the processor implements the method for simulating an attack kill chain described in any one of A1-A6 when executing the computer program.

Claims

A method for simulating an attack kill chain, characterized by comprising:

Obtain historical attack events, represent them in the form of graph data, and build an initial graph representation learning model;

Obtain the target asset information of the enterprise, input it into the initial graph representation learning model for training, and generate a kill chain path model,

Wherein, the kill chain path model is used to input into an intruder simulation system to evaluate enterprise security.
The method according to claim 1, wherein said historical attack events are represented in the form of graph data, and an initial graph representation learning model is constructed, comprising:

Extract attack knowledge from the historical attack events and associate it with the kill chain;

The historical attack events associated with the kill chain are represented in the form of graph data, and the initial graph representation learning model is constructed.
The method according to claim 2, wherein said extracting attack knowledge from said historical attack events and associating with the kill chain further comprises:

Perform training semantic analysis on the historical attack events, extract attack knowledge, and use it as a data set corpus;

Taking the TTP information in the attack knowledge and the enterprise target asset information as entities, as a training set corpus, and establishing a relationship with the kill chain.
The method according to claim 3, wherein said performing training semantic analysis on said historical attack events to extract attack knowledge as a data set corpus further includes:

Carry out text preprocessing and text deep-level sentence segmentation on the historical attack events;

Using natural language processing technology, the sentence semantic dependency analysis is performed on the historical attack events after the deep sentence segmentation of the text;

A data set corpus is prepared for the historical attack events after sentence semantic dependency analysis, and attack knowledge is extracted.
The method according to claim 3, characterized in that said obtaining enterprise target asset information, inputting it into said initial graph representation learning model for training, and generating a kill chain path model, further comprising:

Acquiring the target asset information of the enterprise, and mapping it to an entity representing the target asset information of the enterprise in the initial graph representation learning model;

The kill chain path model is generated by performing time series analysis on the entity representing the TTP information in the initial graph representation learning model and the kill chain scope where it is located in the knowledge calculation.
The method according to claim 5, characterized in that,

The enterprise target asset information includes enterprise hardware configuration information, enterprise software configuration information and/or enterprise industry information;

The acquiring the target asset information of the enterprise and mapping it to the entity representing the target asset information of the enterprise in the initial graph representation learning model includes:

Convert the enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information into a knowledge graph standard storage format, and enumerate and map it against the entities in the initial graph representation learning model that represent enterprise hardware configuration information, enterprise software configuration information, and/or enterprise industry information.
A device for simulating an attack kill chain, characterized by comprising:

The knowledge map building block is used to obtain historical attack events and represent them in the form of graph data to build an initial graph representation learning model;

The kill chain path generation module is used to obtain the target asset information of the enterprise, input it to the initial graph representation learning model for training, and generate a kill chain path model,

The kill chain path model is used as input to an intruder simulation system to evaluate enterprise security.
The device according to claim 7, wherein the historical attack events are represented in the form of graph data, and an initial graph representation learning model is constructed, including:

Extract attack knowledge from the historical attack events and associate it with the kill chain;

The historical attack events associated with the kill chain are represented in the form of graph data, and an initial graph representation learning model is constructed.
A storage medium, characterized by being used for storing a computer program for executing the method for simulating an attack kill chain according to any one of claims 1-6.
An electronic device, comprising a memory, a processor, and a computer program stored in the memory and operable on the processor, characterized in that the processor implements the method for simulating an attack kill chain according to any one of claims 1-6 when executing the computer program.