CN116827564A - Threat event identification method and related device

Info

Publication number: CN116827564A
Application number: CN202210278736.XA
Authority: CN
Inventors: 高云鹏; 王仲宇; 吴朱亮; 谢于明
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2022-03-21
Filing date: 2022-03-21
Publication date: 2023-09-29

Abstract

The application discloses a threat event identification method applied to a server in a network. The server constructs a knowledge graph based on an acquired first alarm event and the alarm events generated in a preceding period of time, thereby aggregating the relationships between each alarm event and its attack source and attack target. The server then extracts, from the knowledge graph, the attack features that associate the first alarm event with other alarm events, and inputs them into a pre-trained threat identification model to obtain a threat identification result for the first alarm event. Because attackers tend to attack a network frequently and continuously, constructing a knowledge graph allows the attack features of an alarm event to be extracted effectively, so that whether the alarm event is a threat event can be identified accurately based on those features. This avoids manual disposal of alarm events and improves the efficiency of alarm event disposal.

Description

Threat event identification method and related device

Technical Field

The application relates to the technical field of network security, in particular to a threat event identification method and a related device.

Background

In recent years, network security problems have become increasingly prominent. Attackers use a variety of means to conduct network attacks, and network security incidents emerge one after another. To ensure network security, enterprises generally deploy firewalls at the network egress to defend against external attacks. The firewall inspects network messages and matches them against attack behavior features. When it finds that the internal network is under attack, it directly discards the messages, thereby blocking communication between the external attacker and internal devices and protecting the internal network.

Generally, the attack detection mode adopted by a firewall achieves high detection accuracy only for attack behaviors with obvious attack features. For most other attack behaviors, it cannot guarantee high detection accuracy. Therefore, for the suspected attack behaviors that it cannot detect with high accuracy, the firewall generally generates an alarm event and uploads the alarm event to the management platform.

Currently, whether an alarm event uploaded by a firewall is a genuine threat event can only be determined by security operators. However, because suspicious attack events occur continuously in the network, the firewall uploads alarm events continuously, and relying on security operators to identify them results in high disposal cost and low disposal efficiency.

Disclosure of Invention

The application provides a threat event identification method. The method constructs a knowledge graph based on an acquired first alarm event and the alarm events generated in a preceding period of time, thereby aggregating the relationships among each alarm event, its attack source, and its attack target. The attack features that associate the first alarm event with other alarm events are then extracted from the knowledge graph and input into a pre-trained threat identification model to obtain a threat identification result for the first alarm event.

A first aspect of the present application provides a threat event identification method, which is applied to a network device in a network, such as a server or a controller. Taking a server as an example, the server acquires a first alarm event sent by a firewall and constructs a knowledge graph based on the first alarm event and a plurality of alarm events in a target time period. The target time period is a preset time period before the first alarm event occurs. The knowledge graph records the relationships between each alarm event in an alarm event set and its attack source and attack target, where the alarm event set includes the first alarm event and the plurality of alarm events. That is, the entities in the knowledge graph include alarm events, attack sources, and attack targets, and the connection relationships between the entities include the relationship between an attack source and an alarm event and the relationship between an alarm event and an attack target. The server then acquires attack features related to the first alarm event based on the knowledge graph and inputs the attack features into a threat identification model to obtain a threat identification result. The threat identification result indicates the probability that the first alarm event is a threat event, and the threat identification model is a machine learning model obtained through training in advance.

In this solution, because attackers tend to attack a network frequently and continuously, constructing a knowledge graph allows the attack features of an alarm event to be extracted effectively, so that whether the alarm event is a threat event can be identified accurately based on those features. This avoids manual disposal of alarm events and improves the efficiency of alarm event disposal.

In addition, the knowledge graph is constructed from only part of the information related to each alarm event (namely, its attack source and attack target). Compared with searching the many fields of every alarm event for related attack features, acquiring the attack features from the knowledge graph is considerably more efficient, which in turn improves the efficiency of threat identification.

Optionally, the attack features include one or more features in a first feature set. The first feature set includes: the distribution, over a plurality of different time periods, of the number of first-type alarm events in the alarm event set, where the attack source of the first-type alarm events is a first attack source, and the first attack source is the attack source of the first alarm event; the distribution, over the plurality of different time periods, of the number of second-type alarm events in the alarm event set, where the attack target of the second-type alarm events is a first attack target, and the first attack target is the attack target of the first alarm event; the distribution, over the plurality of different time periods, of the number of third-type alarm events in the alarm event set, where the attack source of the third-type alarm events is a second attack source, and the second attack source includes an attack source that attacks the first attack target; the number of fourth-type alarm events in the alarm event set, where the attack source of the fourth-type alarm events is the first attack source, and the fourth-type alarm events and the first alarm event correspond to the same attacked port; and the number of attack targets corresponding to the first-type alarm events. The target time period includes the plurality of different time periods.
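The last two features of the first feature set can be illustrated with a minimal sketch. The dictionary keys (`src`, `dst`, `dst_port`), the function names, and the sample addresses are assumptions made for illustration and do not appear in the application. The sketch also assumes that the first alarm event itself is counted, since the alarm event set includes it.

```python
def same_source_same_port_count(alarm_set, first_alarm):
    """Number of fourth-type alarm events: same attack source and same attacked port."""
    return sum(
        1
        for alarm in alarm_set
        if alarm["src"] == first_alarm["src"]
        and alarm["dst_port"] == first_alarm["dst_port"]
    )


def distinct_target_count(alarm_set, first_alarm):
    """Number of attack targets hit by first-type alarm events (same attack source)."""
    return len({alarm["dst"] for alarm in alarm_set if alarm["src"] == first_alarm["src"]})


# Hypothetical alarm event set; the first entry plays the role of the first alarm event.
first_alarm = {"src": "203.0.113.7", "dst": "10.0.0.5", "dst_port": 3389}
alarm_set = [
    first_alarm,
    {"src": "203.0.113.7", "dst": "10.0.0.6", "dst_port": 3389},
    {"src": "203.0.113.7", "dst": "10.0.0.6", "dst_port": 22},
    {"src": "198.51.100.9", "dst": "10.0.0.5", "dst_port": 3389},
]

port_feature = same_source_same_port_count(alarm_set, first_alarm)
target_feature = distinct_target_count(alarm_set, first_alarm)
```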

In this solution, the server extracts, based on the knowledge graph, one or more attack features that reflect how threatening the alarm event is, so that whether the alarm event is a threat event can be identified accurately based on these features, which improves the efficiency of alarm event disposal.

Optionally, the knowledge graph is further used for recording attribute information of each alarm event in the alarm event set. The attribute information includes one or more of: the type of the alarm event, its five-tuple, the application generating the alarm event, the vulnerability exploited by the alarm event, and the firewall detecting the alarm event. The attack features further include one or more features in a second feature set. The second feature set includes: the type of the first alarm event, the number of fifth-type alarm events in the alarm event set, the number of alarm events having the same five-tuple as the first alarm event, the application generating the first alarm event, the attack direction of the first alarm event, the number of vulnerabilities exploited by the alarm events having the same attack source as the first alarm event, and the number of alarm events detected by the firewall that detected the first alarm event. The attack source of the fifth-type alarm events is the same as the attack source of the first alarm event, and the fifth-type alarm events and the first alarm event belong to the same alarm type. The five-tuple includes a source address, a destination address, a source port, a destination port, and a transport protocol.

Optionally, the server obtains a threat degree of the first alarm event, where the threat degree is obtained based on the source address corresponding to the first alarm event. The server then inputs the attack features and the threat degree into the threat identification model to obtain the threat identification result.

In this solution, the threat degree related to the source address of the alarm event is used as one of the inputs of the threat identification model, which can effectively improve the accuracy of the threat identification result.

Optionally, after the knowledge graph is constructed based on the first alarm event, the server acquires the second alarm event. The server constructs a new knowledge graph based on the second alarm event and the knowledge graph, and acquires attack characteristics related to the second alarm event based on the new knowledge graph. The server inputs the attack characteristics related to the second alarm event into a threat identification model to obtain a threat identification result corresponding to the second alarm event.
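The reuse of an existing knowledge graph for a second alarm event can be sketched as follows. This is only an illustrative sketch: the dictionary layout, the function names, and the two example features are assumptions, not the data structure of the application.

```python
from collections import defaultdict


def empty_graph():
    """Hypothetical minimal graph: alarm ids indexed by attack source and by attack target."""
    return {"by_source": defaultdict(list), "by_target": defaultdict(list)}


def add_alarm(graph, alarm_id, source, target):
    """Attach one alarm event to the existing graph instead of rebuilding the graph."""
    graph["by_source"][source].append(alarm_id)
    graph["by_target"][target].append(alarm_id)


graph = empty_graph()

# First alarm event: build the graph and read a feature from it.
add_alarm(graph, "first", "198.51.100.9", "10.0.0.5")
features_first = {"same_source_alarms": len(graph["by_source"]["198.51.100.9"])}

# Second alarm event: only the new entity and its two edges are added.
add_alarm(graph, "second", "198.51.100.9", "10.0.0.8")
features_second = {
    "same_source_alarms": len(graph["by_source"]["198.51.100.9"]),
    "same_target_alarms": len(graph["by_target"]["10.0.0.8"]),
}
```

In this sketch the cost of handling the second alarm event is limited to two index updates and two lookups, which mirrors the efficiency argument made for the knowledge graph.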

In this solution, attack features related to an alarm event are acquired by constructing a knowledge graph. When a new alarm event is acquired, the already constructed knowledge graph can be reused to construct the new knowledge graph, so that the attack features related to the new alarm event can be extracted quickly from it. Compared with extracting attack features from a large number of alarm event fields each time, using the knowledge graph greatly improves the efficiency of extracting attack features and, in turn, the efficiency of threat identification for alarm events.

Optionally, after obtaining the threat identification result of the first alarm event, the server displays the first alarm event and the threat identification result, so that the security operator can obtain the first alarm event and the threat identification result of the first alarm event. The server then obtains a disposition instruction for the first alarm event and sends an alarm disposition policy to the firewall that detected the first alarm event based on the disposition instruction, the alarm disposition policy being used to instruct the firewall on how to process messages related to the first alarm event.

Optionally, if the disposition instruction is to block the messages related to the first alarm event, the server sends an access control list (ACL) to the firewall, where the ACL is used to indicate that messages related to the attack source of the first alarm event are to be blocked.

If the disposition instruction is to allow the messages related to the first alarm event to pass, the server sends a whitelist to the firewall, where the whitelist is used to indicate that messages related to the attack source of the first alarm event are normal messages.
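The mapping from a disposition instruction to an alarm disposition policy can be sketched as below. The instruction strings (`"block"`, `"allow"`), the function name, and the dictionary layout of the policy are hypothetical; the application does not specify a concrete policy format.

```python
def build_disposition_policy(instruction, attack_source_ip):
    """Translate an operator's disposition instruction into a policy for the firewall."""
    if instruction == "block":
        # ACL entry: messages related to the attack source are to be blocked.
        return {"kind": "acl", "action": "deny", "source": attack_source_ip}
    if instruction == "allow":
        # Whitelist entry: messages related to the attack source are treated as normal.
        return {"kind": "whitelist", "action": "permit", "source": attack_source_ip}
    raise ValueError(f"unknown disposition instruction: {instruction!r}")


block_policy = build_disposition_policy("block", "203.0.113.7")
allow_policy = build_disposition_policy("allow", "203.0.113.7")
```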

Optionally, the threat identification model is trained based on a gradient boosting decision tree (GBDT) algorithm or an extreme gradient boosting (XGBoost) algorithm.

A second aspect of the present application provides a threat event identification apparatus. The device comprises an acquisition module and a processing module. The acquisition module is used for acquiring the first alarm event. The processing module is used for constructing a knowledge graph based on the first alarm event and a plurality of alarm events in the target time period. The processing module is further used for acquiring attack features related to the first alarm event based on the knowledge graph, inputting the attack features into the threat identification model, and obtaining a threat identification result.

The target time period is a preset time period before the first alarm event occurs. The knowledge graph records the relationships between each alarm event in the alarm event set and its attack source and attack target, where the alarm event set includes the first alarm event and the plurality of alarm events. The threat identification result indicates the probability that the first alarm event is a threat event, and the threat identification model is a machine learning model obtained through training in advance.

Optionally, the knowledge graph is further used for recording attribute information of each alarm event in the alarm event set. The attribute information includes one or more of: the type of the alarm event, its five-tuple, the application generating the alarm event, the vulnerability exploited by the alarm event, and the firewall detecting the alarm event. The attack features further include one or more features in the second feature set. The second feature set includes: the type of the first alarm event, the number of fifth-type alarm events in the alarm event set, the number of alarm events having the same five-tuple as the first alarm event, the application generating the first alarm event, the attack direction of the first alarm event, the number of vulnerabilities exploited by the alarm events having the same attack source as the first alarm event, and the number of alarm events detected by the firewall that detected the first alarm event. The attack source of the fifth-type alarm events is the same as the attack source of the first alarm event, and the fifth-type alarm events and the first alarm event belong to the same alarm type. The five-tuple includes a source address, a destination address, a source port, a destination port, and a transport protocol.

Optionally, the acquisition module is further configured to acquire a threat degree of the first alarm event, and the processing module is further configured to input the attack features and the threat degree into the threat identification model to obtain the threat identification result. The threat degree is obtained based on the source address corresponding to the first alarm event.

Optionally, the acquisition module is further configured to acquire a second alarm event. The processing module is further configured to construct a new knowledge graph based on the second alarm event and the knowledge graph. The processing module is further configured to acquire attack features related to the second alarm event based on the new knowledge graph, and to input the attack features related to the second alarm event into the threat identification model to obtain a threat identification result corresponding to the second alarm event.

Optionally, the apparatus further comprises a display module and a sending module. The display module is configured to display the first alarm event and the threat identification result. The acquisition module is further configured to acquire a disposition instruction for the first alarm event. The sending module is configured to send, according to the disposition instruction, an alarm disposition policy to the firewall that detected the first alarm event. The alarm disposition policy is used to instruct the firewall on how to process messages related to the first alarm event.

Optionally, if the disposition instruction is to block the messages related to the first alarm event, the sending module is further configured to send an access control list (ACL) to the firewall, where the ACL is used to indicate that messages related to the attack source of the first alarm event are to be blocked. Alternatively, if the disposition instruction is to allow the messages related to the first alarm event to pass, the sending module is further configured to send a whitelist to the firewall, where the whitelist is used to indicate that messages related to the attack source of the first alarm event are normal messages.

Optionally, the threat identification model is trained based on the GBDT algorithm or the XGBoost algorithm.

A third aspect of the application provides a network device comprising a processor and a memory. The memory is for storing program code and the processor is for invoking the program code in the memory to cause the network device to perform the method as in any of the embodiments of the first aspect.

A fourth aspect of the application provides a computer readable storage medium storing instructions which, when run on a computer, cause the computer to perform a method as in any one of the embodiments of the first aspect.

A fifth aspect of the application provides a computer program product which, when run on a computer, causes the computer to perform the method as in any of the embodiments of the first aspect.

A sixth aspect of the application provides a chip comprising one or more processors. Some or all of the processors are configured to read and execute computer instructions stored in a memory to perform the method in any possible implementation of any of the foregoing aspects. Optionally, the chip further comprises the memory. Optionally, the chip further comprises a communication interface, and the processor is connected to the communication interface. The communication interface is used for receiving data and/or information to be processed; the processor acquires the data and/or information from the communication interface, processes it, and outputs a processing result through the communication interface. Optionally, the communication interface is an input-output interface or a bus interface. The method provided by the application may be implemented by one chip or by a plurality of chips working together.

The solutions provided in the second aspect to the sixth aspect are used to implement, or cooperate to implement, the method provided in the first aspect, and can therefore achieve the same or corresponding beneficial effects as the first aspect. Details are not repeated here.

Drawings

FIG. 1 is a schematic diagram of a network deployment scenario provided in an embodiment of the present application;

FIG. 2 is a schematic flow chart of a threat event identification method according to an embodiment of the application;

FIG. 3 is a schematic structural diagram of a knowledge graph according to an embodiment of the present application;

FIG. 4 is a schematic diagram of another knowledge graph according to an embodiment of the present application;

FIG. 5 is a schematic flow chart of a threat identification method according to an embodiment of the application;

FIG. 6 is a schematic diagram of a treatment result sent by an operation platform treatment module to a threat identification treatment module according to an embodiment of the application;

FIG. 7 is a training schematic diagram of a threat identification model provided by an embodiment of the application;

FIG. 8 is a schematic structural diagram of a threat event identification apparatus according to an embodiment of the application;

FIG. 9 is a schematic structural diagram of a network device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the accompanying drawings. Evidently, the described embodiments are only some, rather than all, of the embodiments of the present application. A person of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided in the embodiments of the application are also applicable to similar technical problems.

The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

The word "exemplary" is used herein to mean "serving as an example, instance, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

The embodiment of the application provides a threat event identification method which is used for improving the treatment efficiency of an alarm event. The embodiment of the application also provides a corresponding threat event identification device, a computer readable storage medium and the like. For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.

Referring to FIG. 1, FIG. 1 is a schematic diagram of a network deployment scenario according to an embodiment of the present application. As shown in FIG. 1, the network architecture includes a threat event identification device and a plurality of networks to be protected, which may include, for example, a data center network, a campus network, and an enterprise network. Each network to be protected includes a firewall, switching devices, and terminal devices. The threat event identification device is connected to the firewall in each network to be protected and is used to identify the alarm events reported by the firewall. For ease of understanding, the devices in the network architecture are described in detail below.

In the network to be protected, the terminal equipment is a direct target of the attack initiated by the attack source, namely, the terminal equipment is a destination equipment of the attack message sent by the attack source. The terminal device includes a server, a personal computer, a notebook computer, a smart phone, a tablet computer, an internet of things device, and other physical devices. Optionally, the terminal device includes a virtualization device disposed on the physical device, for example, the terminal device includes a Virtual Machine (VM) disposed on the server and used for providing the business service.

The firewall is a network security device deployed between the network to be protected and the external network. It is used for detecting attack behavior from the external network and taking corresponding defensive measures. For example, during network operation, a firewall can perform tasks such as virus detection, intrusion detection, uniform resource locator (URL) filtering, domain name system (DNS) filtering, and mail filtering.

Generally, a feature library of attack messages is pre-established on the firewall. The firewall matches every passing message against the feature library, determines any matching message to be an attack, and discards it. This detection mode achieves high detection accuracy only for attack behaviors with obvious attack features. For most other attack behaviors, it cannot guarantee high detection accuracy. Therefore, for the suspected attack behaviors that it cannot detect with high accuracy, the firewall generally generates an alarm event and uploads the alarm event to the management platform.

The network devices deployed between the firewall and the terminal devices are packet forwarding devices (also commonly referred to as switching devices) for forwarding traffic between the external network and the terminal devices in the network to be protected and traffic between different terminal devices in the internal network. Illustratively, the network devices include packet forwarding devices such as switches, gateways, routers, and the like. Optionally, the network device is implemented as a virtualized device deployed on a hardware device. For example, the network device includes a VM, virtual router or virtual switch running a program for sending messages.

The threat event identification device is the execution body of the threat event identification method provided by the embodiment of the application. Specifically, the threat event identification device constructs a knowledge graph based on an acquired alarm event and the alarm events generated in a preceding period of time, thereby aggregating the relationships between each alarm event and its attack source and attack target. The threat event identification device then extracts, from the knowledge graph, the attack features that associate the alarm events with one another, and identifies them with a pre-trained threat identification model to obtain a threat identification result for the alarm event. Illustratively, the threat event identification device includes a server, a server cluster, or a VM deployed on a server. The threat event identification device may be deployed in a public cloud, a private cloud, or a hybrid cloud.

It can be understood that the threat event identification method provided by the embodiment of the application may also be executed by other devices, for example, a controller or a network management device. The following description uses a server as the execution body of the method as an example.

The above describes a scenario where the threat event identification method provided by the embodiment of the present application is applied, and the specific implementation process of the threat event identification method provided by the embodiment of the present application will be described in detail below.

Referring to FIG. 2, FIG. 2 is a flowchart of a threat event identification method according to an embodiment of the application. As shown in FIG. 2, the threat event identification method includes the following steps 201-203.

Step 201, a first alarm event is acquired.

In this embodiment, a firewall in a network detects a packet passing through the firewall in real time, so as to detect an attack or a suspected attack in the network. When the firewall detects the suspected attack, the firewall reports a first alarm event to the server, wherein the first alarm event comprises relevant information of the suspected attack detected by the firewall. That is, the first alarm event acquired by the server is reported by the firewall after the firewall detects the suspected attack.

Illustratively, the first alarm event includes the following plurality of fields.

Field 1, the IP address and port number of the attack source. The attack source is the object that launches the attack.

Field 2, the IP address and port number of the attack target. The attack target is the object being attacked.

Field 3, the area where the attack source is located, such as an internal network or an external network. The firewall is deployed between the internal network and the external network.

Field 4, the area where the attack target is located, such as an internal network or an external network.

Field 5, the name or number of the suspected attack, e.g., a remote desktop protocol (RDP) local account brute force attempt or a suspected distributed reflection denial of service (DRDoS) attack attempt.

Field 6, the identifier of the firewall that detected the suspected attack.

Field 7, the time at which the suspected attack occurred.

Field 8, the protocol type of the message corresponding to the suspected attack, e.g., RDP, file transfer protocol (FTP), Telnet, or secure shell protocol (SSH).

Field 9, the action taken by the firewall for the suspected attack, such as blocking or alarm only.

It will be appreciated that the above-described plurality of fields is one possible example provided by the present embodiment. In practical applications, the first alarm event may further include other fields, or include fewer fields than the multiple fields, which are not limited in this embodiment.
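The nine example fields can be pictured as a simple record type. The sketch below is illustrative only: the class name, attribute names, and sample values are assumptions, and an actual alarm event may carry more or fewer fields.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AlarmEvent:
    src_ip: str            # field 1: IP address of the attack source
    src_port: int          # field 1: port number of the attack source
    dst_ip: str            # field 2: IP address of the attack target
    dst_port: int          # field 2: port number of the attack target
    src_zone: str          # field 3: area of the attack source ("internal" / "external")
    dst_zone: str          # field 4: area of the attack target
    attack_name: str       # field 5: name or number of the suspected attack
    firewall_id: str       # field 6: identifier of the detecting firewall
    occurred_at: datetime  # field 7: time the suspected attack occurred
    protocol: str          # field 8: protocol type of the message
    action: str            # field 9: action taken by the firewall


# Hypothetical alarm event reported by a firewall.
example = AlarmEvent(
    src_ip="203.0.113.7", src_port=51544,
    dst_ip="10.0.0.5", dst_port=3389,
    src_zone="external", dst_zone="internal",
    attack_name="RDP local account brute force attempt",
    firewall_id="fw-01",
    occurred_at=datetime(2022, 3, 21, 9, 30),
    protocol="RDP", action="alarm only",
)
```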

Step 202, constructing a knowledge graph based on the first alarm event and a plurality of alarm events in a target time period, wherein the target time period is a preset time period before the first alarm event occurs, and the knowledge graph is used for recording the relationship between an attack source and an attack target and each alarm event in an alarm event set, and the alarm event set comprises the first alarm event and the plurality of alarm events.

Because the firewall reports the alarm event to the server in real time, the server also acquires other alarm events reported by the firewall before the server acquires the first alarm event. In this way, the server can acquire a plurality of alarm events within the target time period, that is, a plurality of alarm events within a preset time period before the occurrence of the first alarm event, based on the occurrence time of the first alarm event. For example, the target time period is one day or 12 hours before the first alarm event occurs, and the server acquires all alarm events in one day or 12 hours before the first alarm event occurs, so as to construct a knowledge graph.
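Selecting the alarm events that fall within the target time period can be sketched as follows. The function name, the `(alarm_id, occurred_at)` tuple layout, and the 12-hour default are assumptions chosen to match the example above.

```python
from datetime import datetime, timedelta


def events_in_target_period(history, first_event_time, window=timedelta(hours=12)):
    """Return the earlier alarm events that occurred within `window` before the first alarm event.

    `history` is a list of (alarm_id, occurred_at) tuples (hypothetical layout).
    """
    start = first_event_time - window
    return [event for event in history if start <= event[1] < first_event_time]


history = [
    ("a", datetime(2022, 3, 20, 20, 0)),  # more than 12 hours earlier, excluded
    ("b", datetime(2022, 3, 21, 1, 0)),
    ("c", datetime(2022, 3, 21, 11, 0)),
]
first_event_time = datetime(2022, 3, 21, 12, 0)
selected = events_in_target_period(history, first_event_time)
selected_ids = [alarm_id for alarm_id, _ in selected]
```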

In addition, since the alarm event reported by the firewall includes the attack source and the attack target corresponding to the alarm event, the server may construct a knowledge graph by using the alarm event, the attack source of the alarm event, and the attack target of the alarm event as entities in the knowledge graph.

That is, the entities in the knowledge graph include an alarm event, an attack source and an attack target, and the connection relationship between the entities in the knowledge graph includes a relationship (initiation) between the attack source and the alarm event and a relationship (attack) between the alarm event and the attack target.

Referring to fig. 3, fig. 3 is a schematic structural diagram of a knowledge graph according to an embodiment of the present application.

As shown in fig. 3, the entities in the knowledge graph include attack source 1, attack source 2, attack source 3, attack target 1, attack target 2, attack target 3, attack target 4, and alarm event 1 to alarm event 7.

The alarm events initiated by attack source 1 include alarm event 1 to alarm event 4. Alarm event 1 to alarm event 3 attack the same attack target, namely attack target 1, while alarm event 4 attacks attack target 2.

The alarm events initiated by attack source 2 include alarm event 5 and alarm event 6. Alarm event 5 attacks attack target 1, while alarm event 6 attacks attack target 3.

The alarm events initiated by attack source 3 include alarm event 7, and alarm event 7 attacks attack target 4.
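Illustratively, the graph of fig. 3 can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation of the embodiment: the identifiers (`ALARMS`, `build_graph`, `initiates`, `attacks`) and the use of plain dictionaries in place of a graph database are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical alarm records for fig. 3: (alarm event, attack source, attack target).
ALARMS = [
    ("alarm_1", "source_1", "target_1"),
    ("alarm_2", "source_1", "target_1"),
    ("alarm_3", "source_1", "target_1"),
    ("alarm_4", "source_1", "target_2"),
    ("alarm_5", "source_2", "target_1"),
    ("alarm_6", "source_2", "target_3"),
    ("alarm_7", "source_3", "target_4"),
]


def build_graph(alarms):
    """Return the two relationship types of the graph as adjacency maps."""
    initiates = defaultdict(set)  # attack source -> alarm events it initiated
    attacks = {}                  # alarm event -> attack target it attacks
    for event, source, target in alarms:
        initiates[source].add(event)
        attacks[event] = target
    return initiates, attacks


initiates, attacks = build_graph(ALARMS)
print(sorted(initiates["source_1"]))  # alarm events initiated by attack source 1
print(attacks["alarm_5"])             # attack target of alarm event 5
```

Keeping the "initiation" and "attack" relationships in separate maps lets later feature extraction start from one alarm event and reach only the entities connected to it.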

Step 203, acquiring attack characteristics related to the first alarm event based on the knowledge graph.

In this embodiment, the server obtains the attack feature related to the first alarm event based on other entities in the knowledge graph that have a connection relationship with the first alarm event. That is, the server can obtain the attack feature related to the first alarm event based on the entity having the connection relation with the first alarm event in the knowledge graph, without concern about other entities having no connection relation with the first alarm event.

Illustratively, the attack signature associated with the first alarm event includes one or more of the following features.

Feature 1, the number distribution of first type alarm events over a plurality of different time periods of the target time period. The attack source of the first type of alarm event is a first attack source, and the first attack source is the attack source of the first alarm event. That is, the first type of alarm event refers to all alarm events initiated by the attack source of the first alarm event. For example, in the knowledge graph shown in fig. 3, assuming that the first alarm event is alarm event 1, the first type of alarm event includes all alarm events initiated by attack source 1, namely alarm event 1 to alarm event 4.

Because all alarm events in the knowledge graph occur in the target time period, the server can acquire the quantity distribution situation of the first type alarm events in a plurality of different time periods of the target time period. The quantity distribution condition of the first type of alarm events in a plurality of different time periods can reflect the time characteristics of the attack behavior initiated by the attack source.

For example, assuming that the target time period is the 24 hours before the first alarm event occurs, the target time period can be divided into 24 different time periods, each with a duration of 1 hour. The server then determines the number of first type alarm events in each of the 24 time periods, thereby obtaining the number distribution of first type alarm events over the plurality of different time periods of the target time period. Illustratively, in the case where the target time period is divided into time period 1, time period 2, ..., time period 24, the number distribution of first type alarm events is specifically: time period 1 includes 25 first type alarm events, time period 2 includes 55 first type alarm events, ..., and time period 24 includes 100 first type alarm events.

Optionally, in the case where the target time period includes a large number of different time periods, feature 1 may comprise a large number of values. Thus, after obtaining the number distribution of first type alarm events over the plurality of different time periods of the target time period, the server may further condense the feature based on the number distribution. For example, the server determines the number of first type alarm events in each of the time periods and takes the maximum, minimum and average of these numbers as feature 1.
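Illustratively, the hourly counting and the condensed form of feature 1 can be sketched as follows. The function names, the sample timestamps, and the choice to count only events strictly before the first alarm event are assumptions of this sketch, not details given in the embodiment.

```python
from datetime import datetime, timedelta
from statistics import mean


def hourly_distribution(event_times, first_event_time, hours=24):
    """Count events in each 1-hour slot of the `hours` before first_event_time."""
    start = first_event_time - timedelta(hours=hours)
    counts = [0] * hours
    for t in event_times:
        # Assumption: the window is half-open, [start, first_event_time).
        if start <= t < first_event_time:
            slot = int((t - start).total_seconds() // 3600)
            counts[slot] += 1
    return counts


def summarize(counts):
    """Condense the distribution into (maximum, minimum, average)."""
    return max(counts), min(counts), mean(counts)


first_event_time = datetime(2022, 3, 21, 12, 0, 0)
# Hypothetical occurrence times of first type alarm events, as minutes before the first alarm event.
times = [first_event_time - timedelta(minutes=m) for m in (5, 20, 50, 70, 1430)]
counts = hourly_distribution(times, first_event_time)
print(counts)
print(summarize(counts))
```

The same two helpers could be reused for the second and third types of alarm events, since only the set of events being counted changes.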

Feature 2, the number distribution of second type alarm events over a plurality of different time periods of the target time period. The attack target of the second type of alarm event is a first attack target, and the first attack target is the attack target of the first alarm event. That is, the second type of alarm event refers to all alarm events directed at the attack target of the first alarm event. For example, in the knowledge graph shown in fig. 3, assuming that the first alarm event is alarm event 1, the second type of alarm event includes all alarm events directed at attack target 1, namely alarm event 1 to alarm event 3 and alarm event 5.

The number distribution condition of the second type of alarm event in a plurality of different time periods can reflect the time characteristics of the attack behavior of the attack source aiming at the same attack target.

The manner in which the server determines the number distribution of the second type of alarm events over the plurality of different time periods of the target time period is similar to the manner in which the server determines feature 1. For details, refer to the description of feature 1, which is not repeated here.

Feature 3, number distribution of third type alert events over a plurality of different time periods of the target time period. The attack source of the third type of alarm event is a second attack source, wherein the second attack source comprises an attack source for attacking the first attack target, namely the third type of alarm event refers to all alarm events initiated by the attack source for attacking the first attack target. For example, in the knowledge graph shown in fig. 3, assuming that the first alarm event is alarm event 1, the second attack source includes attack source 1 and attack source 2, and the third type of alarm event includes all alarm events initiated by attack source 1 and attack source 2, that is, alarm event 1-alarm event 6.

The manner in which the server determines the number distribution of the third type of alarm events over the plurality of different time periods of the target time period is similar to the manner in which the server determines feature 1. For details, refer to the description of feature 1, which is not repeated here.
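Illustratively, selecting the second and third types of alarm events from the graph of fig. 3 can be sketched as follows. The record layout and function names are assumptions of this sketch. Counting the selected events per time period would then proceed as for feature 1.

```python
# Hypothetical records mirroring fig. 3: alarm event -> (attack source, attack target).
ALARMS = {
    1: ("source_1", "target_1"), 2: ("source_1", "target_1"),
    3: ("source_1", "target_1"), 4: ("source_1", "target_2"),
    5: ("source_2", "target_1"), 6: ("source_2", "target_3"),
    7: ("source_3", "target_4"),
}


def second_type(alarms, first_event):
    """Alarm events whose attack target is the attack target of first_event."""
    target = alarms[first_event][1]
    return {e for e, (_, t) in alarms.items() if t == target}


def third_type(alarms, first_event):
    """Alarm events initiated by any attack source that attacked the first attack target."""
    target = alarms[first_event][1]
    second_sources = {s for s, t in alarms.values() if t == target}
    return {e for e, (s, _) in alarms.items() if s in second_sources}


print(sorted(second_type(ALARMS, 1)))  # [1, 2, 3, 5]
print(sorted(third_type(ALARMS, 1)))   # [1, 2, 3, 4, 5, 6]
```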

Feature 4, the number of fourth type alarm events, wherein the attack source of the fourth type of alarm event is the first attack source, and the fourth type of alarm event and the first alarm event correspond to the same attacked port. The attacked port refers to the port on which the attack target corresponding to the alarm event is attacked. Because the same attack target may include a plurality of different ports, and the destination port is specified in the attack message sent by the attack source, the server can determine the attacked port corresponding to each alarm event based on the destination port in the attack message corresponding to the alarm event.

Feature 5, the number of attack targets corresponding to the first type of alarm event.

For example, in the knowledge graph shown in fig. 3, the first type of alarm event includes alarm event 1-alarm event 4, and the number of attack targets corresponding to alarm event 1-alarm event 4 is 2.
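Illustratively, features 4 and 5 can be sketched as simple counts. The port numbers below are hypothetical, since fig. 3 does not show ports. Matching on the port number alone, regardless of which attack target owns the port, and counting the first alarm event itself are both assumptions of this sketch.

```python
# Hypothetical records: alarm event -> (attack source, attack target, attacked port).
ALARMS = {
    1: ("source_1", "target_1", 3389),
    2: ("source_1", "target_1", 3389),
    3: ("source_1", "target_1", 22),
    4: ("source_1", "target_2", 3389),
    5: ("source_2", "target_1", 3389),
}


def feature_4(alarms, first_event):
    """Number of alarm events with the same attack source and the same attacked port."""
    source, _, port = alarms[first_event]
    return sum(1 for s, _, p in alarms.values() if s == source and p == port)


def feature_5(alarms, first_event):
    """Number of distinct attack targets hit by the attack source of first_event."""
    source = alarms[first_event][0]
    return len({t for s, t, _ in alarms.values() if s == source})


print(feature_4(ALARMS, 1))  # 3: alarm events 1, 2 and 4
print(feature_5(ALARMS, 1))  # 2: attack targets 1 and 2
```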

It will be appreciated that the various attack features described above are one possible example provided by the present embodiment. In practical applications, the attack feature related to the first alarm event may further include other attack features, which is not limited in this embodiment.

Optionally, the knowledge graph constructed by the server is further used for recording attribute information of the first alarm event and the plurality of alarm events. Specifically, the attribute information of the alarm event includes one or more of a type, a five tuple, an application generating the alarm event, a vulnerability utilized by the alarm event, and a firewall detecting the alarm event. The five-tuple comprises a source address, a destination address, a source port, a destination port and a transmission protocol indicated in a message corresponding to the alarm event.

In the case that the knowledge graph also records attribute information of the first alarm event and the plurality of alarm events, the attack feature related to the first alarm event further includes one or more of the following features: the type of the first alarm event, the number of fifth type alarm events, the number of alarm events having the same five-tuple as the first alarm event, the application program generating the first alarm event, the attack direction of the first alarm event, the number of vulnerabilities utilized by the alarm events having the same attack source as the first alarm event, and the number of alarm events detected by the firewall that detected the first alarm event.

The attack source of the fifth type of alarm event is the same as the attack source of the first alarm event, and the fifth type of alarm event and the first alarm event belong to the same alarm type. The alarm type of the alarm event refers to an attack type of a suspected attack behavior corresponding to the alarm event, and the alarm type may include, for example, RDP brute force cracking, structured query language (Structured Query Language, SQL) injection attempt, and the like.

For example, in the knowledge graph shown in fig. 3, assuming that the first alarm event is alarm event 1 and the alarm type of alarm event 1 is RDP brute force cracking, the fifth type of alarm event includes those of alarm event 1 to alarm event 4 whose alarm type is RDP brute force cracking.
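Illustratively, two of the attribute-based counts can be sketched as follows. The `Alarm` record, the addresses (taken from documentation address ranges) and the alarm types assigned to each event are hypothetical. Both counts include the first alarm event itself, which is an assumption of this sketch.

```python
from collections import namedtuple

# Hypothetical attribute record; five_tuple is
# (source address, destination address, source port, destination port, protocol).
Alarm = namedtuple("Alarm", ["source", "alarm_type", "five_tuple"])

RDP = "RDP brute force cracking"
SQLI = "SQL injection attempt"
ALARMS = {
    1: Alarm("source_1", RDP, ("192.0.2.10", "198.51.100.1", 50000, 3389, "TCP")),
    2: Alarm("source_1", RDP, ("192.0.2.10", "198.51.100.1", 50000, 3389, "TCP")),
    3: Alarm("source_1", SQLI, ("192.0.2.10", "198.51.100.1", 50001, 80, "TCP")),
    4: Alarm("source_1", RDP, ("192.0.2.10", "198.51.100.2", 50002, 3389, "TCP")),
    5: Alarm("source_2", RDP, ("192.0.2.20", "198.51.100.1", 50003, 3389, "TCP")),
}


def fifth_type_count(alarms, first_event):
    """Alarm events with the same attack source and the same alarm type."""
    first = alarms[first_event]
    return sum(1 for a in alarms.values()
               if a.source == first.source and a.alarm_type == first.alarm_type)


def same_five_tuple_count(alarms, first_event):
    """Alarm events whose five-tuple equals that of first_event."""
    first = alarms[first_event]
    return sum(1 for a in alarms.values() if a.five_tuple == first.five_tuple)


print(fifth_type_count(ALARMS, 1))       # 3: alarm events 1, 2 and 4
print(same_five_tuple_count(ALARMS, 1))  # 2: alarm events 1 and 2
```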

The attack direction of the first alarm event includes, for example, an attack from an external network to an internal network, an attack from an internal network to an external network, or an attack from an internal network to an internal network. For any one alarm event, the number of vulnerabilities utilized by the alarm event may be 0, 1 or more.

Referring to fig. 4, fig. 4 is a schematic diagram of another knowledge graph according to an embodiment of the application.

As shown in fig. 4, for alarm event 1, the knowledge graph also records the type of alarm event 1, the attacked port of alarm event 1, the application program generating alarm event 1, the vulnerability utilized by alarm event 1, and the firewall detecting alarm event 1.

It should be noted that, due to limited space, fig. 4 draws the corresponding attribute entities, such as the type and the attacked port, only for alarm event 1. In practical applications, the knowledge graph may record attribute information of all alarm events or of only some alarm events, which is not limited herein.

Step 204, inputting the attack features related to the first alarm event into a threat identification model to obtain a threat identification result.

The threat identification result is used for indicating the probability value that the first alarm event is a threat event. For example, the threat identification result indicates that the probability value of the first alert event being a threat event is 0.5 or 0.8.

In addition, the threat identification model is a machine learning model obtained through pre-training. For example, the server acquires alarm events reported by the firewall in a period of time and threat identification results marked by operators for the alarm events; then, the server trains the machine learning model by taking the alarm events and threat recognition results corresponding to the alarm events as training data, so as to obtain the threat recognition model. In the training process of the threat identification model, the input of the threat identification model is attack characteristics related to each alarm event, such as the characteristics 1-5.

In this embodiment, aiming at the characteristic that an attacker frequently and continuously attacks in the network, the attack features of the alarm event are extracted by constructing a knowledge graph, so that the attack features of the alarm event can be effectively extracted, thereby realizing the accurate identification of whether the alarm event is a threat event based on the attack features, avoiding the process of manually handling the alarm event, and improving the handling efficiency of the alarm event. In addition, because the knowledge graph is constructed based on a part of information related to the alarm event (namely, the attack source and the attack target of the alarm event), compared with searching the attack characteristics related to the alarm event in a plurality of fields of each alarm event, the efficiency of acquiring the attack characteristics of the alarm event can be effectively improved by acquiring the attack characteristics of the alarm event based on the knowledge graph, so that the threat identification efficiency is improved.

Optionally, the threat identification model is trained based on a gradient boosting decision tree (Gradient Boosting Decision Tree, GBDT) algorithm or an extreme gradient boosting (eXtreme Gradient Boosting, XGBoost) algorithm. Specifically, the GBDT algorithm and the XGBoost algorithm are both iterative decision tree algorithms, and the XGBoost algorithm is an improvement of the GBDT algorithm. A model produced by either algorithm consists of a plurality of decision trees, and the predicted values of all the decision trees are accumulated to obtain the final result. For details of the two algorithms, refer to their descriptions in the prior art. This embodiment applies them to training the threat identification model, and they are not described in detail herein.
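Illustratively, the accumulation of tree predictions can be sketched with three hand-written stand-in trees. The trees, the feature names and the leaf scores are hypothetical and are not learned from data. Passing the accumulated score through a logistic function to obtain a probability is also an assumption of this sketch, corresponding to a typical binary classification setup rather than to anything stated in the embodiment.

```python
import math


# Hypothetical stand-ins for learned decision trees: each maps a feature
# dictionary to a leaf score.
def tree_1(x):
    return 0.8 if x["same_source_count"] > 50 else -0.4


def tree_2(x):
    return 0.6 if x["target_count"] > 3 else -0.2


def tree_3(x):
    return 0.3 if x["threat_level"] > 70 else -0.3


TREES = [tree_1, tree_2, tree_3]


def predict_probability(x, trees=TREES):
    """Accumulate the predictions of all trees, then squash to a probability."""
    raw_score = sum(tree(x) for tree in trees)
    return 1.0 / (1.0 + math.exp(-raw_score))  # logistic link (assumption)


suspicious = {"same_source_count": 100, "target_count": 5, "threat_level": 90}
benign = {"same_source_count": 2, "target_count": 1, "threat_level": 10}
print(round(predict_probability(suspicious), 3))
print(round(predict_probability(benign), 3))
```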

Optionally, in the case that the server continuously monitors the attack event and the alarm event in the network, the server can acquire the threat level of the partial IP address based on the source of the attack event, so that the threat level of the alarm event is determined when the alarm event having the same IP address is acquired later. For example, in the case that the source IP addresses of a certain number of attack events acquired by the server are the same IP address or are in the same IP address segment, the server can determine that the threat level of a certain IP address or a certain IP address segment is higher, so as to establish a mapping relationship between the IP address or the IP address segment and the threat level.

Illustratively, the server obtains a threat level of the first alarm event, the threat level being derived from the source address corresponding to the first alarm event. For example, once the server has established the mapping relationship between IP addresses or IP address segments and threat levels, the server queries the mapping table recording the mapping relationship based on the source address of the first alarm event, so as to obtain the threat level of the first alarm event. The threat level of the first alarm event may be a value from 1 to 100, or may be a score from 0% to 100%, which is not limited in this embodiment.

After the threat degree of the first alarm event is obtained, the server inputs the attack characteristics and the threat degree related to the first alarm event into a threat identification model to obtain a threat identification result. Wherein the input of the threat identification model in the training phase also comprises the threat level of the alarm event.
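Illustratively, the lookup of the threat level by source address can be sketched with the Python standard library `ipaddress` module. The table contents, the default level of 0 for unknown addresses, and the rule that the most specific matching segment wins are all assumptions of this sketch.

```python
import ipaddress

# Hypothetical mapping table: IP address or IP address segment -> threat level (1-100).
THREAT_TABLE = {
    ipaddress.ip_network("203.0.113.7/32"): 95,   # a single IP address
    ipaddress.ip_network("203.0.113.0/24"): 60,   # an IP address segment
    ipaddress.ip_network("198.51.100.0/24"): 30,
}


def threat_level(source_ip, table=THREAT_TABLE, default=0):
    """Return the threat level of the most specific entry containing source_ip."""
    addr = ipaddress.ip_address(source_ip)
    matches = [(net.prefixlen, level) for net, level in table.items() if addr in net]
    return max(matches)[1] if matches else default


def model_input(attack_features, source_ip):
    """Append the threat level to the attack features before model inference."""
    return list(attack_features) + [threat_level(source_ip)]


print(threat_level("203.0.113.7"))   # 95, the exact address entry wins
print(threat_level("203.0.113.20"))  # 60, matched by the segment entry
print(threat_level("192.0.2.1"))     # 0, no entry
```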

The above describes how the server implements threat identification for the first alarm event when the first alarm event is acquired. In practical applications, the server continuously acquires the alarm event reported by the firewall, so how the server efficiently performs threat identification on the continuously acquired alarm event will be described below.

Optionally, after the knowledge graph is constructed and obtained and the threat identification result of the first alarm event is obtained based on the method described in steps 201 to 204, the server may also continuously receive other alarm events reported by the firewall. In this case, the server reconstructs a new knowledge-graph based on the already constructed knowledge-graph and the newly acquired alarm event, and performs threat identification on the newly acquired alarm event based on the new knowledge-graph.

Illustratively, the server obtains the second alert event after constructing the knowledge-graph based on the first alert event. Similar to the first alarm event, the second alarm event may also be an alarm event reported by a firewall.

Then, the server constructs a new knowledge graph based on the second alarm event and the knowledge graph, and acquires attack features related to the second alarm event based on the new knowledge graph. Because the original knowledge graph is established by the server based on the first alarm event and a plurality of alarm events in the target time period, the server can establish a new knowledge graph by adding the content related to the second alarm event into the knowledge graph on the basis of the established knowledge graph. After obtaining the new knowledge-graph, the server may obtain attack features related to the second alarm event from the new knowledge-graph according to the manner described in step 203.

Finally, the server inputs the attack features related to the second alarm event into the threat identification model to obtain a threat identification result corresponding to the second alarm event.
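Illustratively, adding a second alarm event to an already constructed graph can be sketched as follows. The `KnowledgeGraph` class and its method names are hypothetical. The sketch does not remove alarm events that fall outside the target time period of the second alarm event, a point on which the description above gives no detail.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Alarm events linked to their attack source and attack target."""

    def __init__(self):
        self.source_of = {}                # alarm event -> attack source
        self.target_of = {}                # alarm event -> attack target
        self.initiated = defaultdict(set)  # attack source -> alarm events

    def add_alarm(self, event, source, target):
        """Add one alarm event without rebuilding the existing entities."""
        self.source_of[event] = source
        self.target_of[event] = target
        self.initiated[source].add(event)

    def first_type(self, event):
        """All alarm events initiated by the attack source of `event`."""
        return set(self.initiated[self.source_of[event]])


graph = KnowledgeGraph()
for record in [("alarm_1", "source_1", "target_1"),
               ("alarm_2", "source_1", "target_1")]:
    graph.add_alarm(*record)

# A second alarm event arrives later and is merged into the existing graph.
graph.add_alarm("alarm_8", "source_1", "target_2")
print(sorted(graph.first_type("alarm_8")))  # ['alarm_1', 'alarm_2', 'alarm_8']
```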

The process of threat identification for the continuously acquired alarm event by the server is introduced above, and the process of disposal after the server obtains the threat identification result of the alarm event will be introduced below.

Optionally, after the step 204, that is, after the server obtains the threat identification result of the first alarm event, the server displays the first alarm event and the threat identification result, so that the security operator can obtain the first alarm event and the threat identification result of the first alarm event. The threat identification result displayed by the server can be used as a reference for the judgment of the security operator. After the security operator obtains the first alarm event and the threat identification result displayed by the server, the security operator can further judge the accuracy of the threat identification result obtained by the server, and issue a treatment instruction for the first alarm event at the server. Thus, the server is able to obtain the disposition instruction for the first alarm event. For example, when the security operator refers to the threat identification result displayed by the server and determines that the first alarm event is an attack event, the security operator issues a treatment instruction to block the message related to the first alarm event at the server. For another example, when the security operator refers to the threat identification result displayed by the server and determines that the first alarm event is a false alarm event, the security operator issues a disposition instruction on the server to allow the message related to the first alarm event to pass through.

After acquiring the treatment instruction for the first alarm event, the server sends an alarm treatment policy to the firewall detecting the first alarm event according to the treatment instruction, wherein the alarm treatment policy is used for indicating a mode of the firewall to process the message related to the first alarm event.

Illustratively, if the disposition instruction is blocking the message associated with the first alarm event, the server sends an access control list (Access Control Lists, ACL) to the firewall, the ACL indicating to block the message associated with the source of attack of the first alarm event. Briefly, by specifying the IP address of the source of the attack in the ACL, the firewall is able to filter the received message according to the ACL, similarly to the blacklist, thereby discarding the message whose source address is the IP address specified in the ACL.

If the disposition instruction is to allow the message related to the first alarm event to pass, the server sends a whitelist to the firewall, the whitelist indicating that messages related to the attack source of the first alarm event are normal messages. In this way, after receiving the whitelist, the firewall allows messages related to the attack source of the first alarm event (that is, messages subsequently sent from the source address of the first alarm event) to pass through, and no longer reports the corresponding alarm events to the server.
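Illustratively, the effect of the two alarm handling policies on the firewall side can be sketched as follows. The function name, the returned action strings and the representation of the ACL and the whitelist as sets of source addresses are assumptions of this sketch. An actual firewall would apply these policies in its own rule engine.

```python
def handle_message(source_ip, acl_blocked, whitelist):
    """Return the action a firewall might take for a message from source_ip.

    acl_blocked and whitelist are hypothetical sets of source IP addresses
    delivered by the server as alarm handling policies.
    """
    if source_ip in acl_blocked:
        return "drop"                # blocked like a blacklist entry
    if source_ip in whitelist:
        return "pass_without_alarm"  # treated as normal, no alarm reported
    return "inspect"                 # ordinary attack detection applies


acl_blocked = {"203.0.113.7"}
whitelist = {"198.51.100.25"}
for ip in ("203.0.113.7", "198.51.100.25", "192.0.2.1"):
    print(ip, handle_message(ip, acl_blocked, whitelist))
```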

Optionally, in some cases, after obtaining the threat identification result, the server may also send a corresponding alarm handling policy to the firewall based on the threat identification result, without sending the alarm handling policy after obtaining the handling instruction.

The execution process of the threat event identification method provided by the embodiment of the application is described above. For ease of understanding, the process of performing threat event identification based on the threat identification model and the process of training the threat identification model are described in detail below with reference to specific examples.

Referring to fig. 5, fig. 5 is a flowchart of a threat identification method according to an embodiment of the application. In fig. 5, the security information providing module, threat identification handling module, and operation platform handling module are located on the same server or on different servers. As shown in fig. 5, the threat identification method of fig. 5 includes the following steps 501-507.

In step 501, the firewall reports an alert event to the threat identification disposal module.

When the firewall detects a suspected attack, the firewall reports an alarm event to the threat identification disposal module according to the detected suspected attack. The alarm event reported by the firewall includes a plurality of fields, for example several tens of fields such as the alarm event level, the alarm event type, the number of times alarm events with the same five-tuple occurred in one day, the source IP address of the alarm event, the source port of the alarm event, the geographic location of the source IP address of the alarm event, the attack direction of the alarm event, the protocol of the alarm event, the application program generating the alarm event, the time of occurrence of the alarm event, and the firewall generating the alarm event.

At step 502, the security information providing module provides relevant security information of the source IP address of the alert event to the threat identification handling module.

Because the alarm event reported by the firewall includes the source IP address of the alarm event, the threat identification handling module may send the source IP address of the alarm event to the security information providing module. After obtaining the source IP address, the security information providing module provides the threat identification handling module with the relevant security information of the source IP address of the alarm event. The relevant security information of the source IP address of the alarm event includes, for example, the threat level of the source IP address, the geographic location of the source IP address, and the domain name system (Domain Name System, DNS) server corresponding to the source IP address.

Illustratively, the security information provided by the security information providing module may be related to the source IP address of the alarm event as shown in table 1.

TABLE 1

In step 503, the threat identification processing module extracts the features of the alarm event, and performs threat identification on the alarm event based on the threat identification model.

In this embodiment, step 503 is similar to steps 202-204 described above, and please refer to steps 202-204 described above, which will not be repeated here.

Step 504, the threat identification handling module sends the threat identification result to the operation platform handling module.

After obtaining the threat identification result based on the threat identification model, the threat identification processing module sends the threat identification result to the operation platform processing module, so that the threat identification result corresponding to the alarm event can be displayed to the security operator.

Optionally, since the threat identification result obtained based on the threat identification model is used to indicate a probability value that the alarm event is a threat event, the threat identification processing module can further determine the type of the alarm event based on the probability value. For example, the threat identification treatment module presets two thresholds, threshold 1 and threshold 2, respectively, wherein threshold 1 is less than threshold 2. When the probability value of the alarm event being the threat event is smaller than or equal to a threshold value 1, the threat identification processing module identifies the alarm event as a false alarm event; when the probability value of the alarm event being the threat event is greater than a threshold value 2, the threat identification processing module identifies the alarm event as the threat event; when the probability value of an alarm event being a threat event is greater than a threshold 1 and less than a threshold 2, the threat identification handling module identifies the alarm event as an uncertain event. And the threat identification handling module carries the identification result of the alarm event in the threat identification result sent to the operation platform handling module.
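Illustratively, the two-threshold classification can be sketched as follows. The threshold values 0.3 and 0.7 are hypothetical. A probability exactly equal to threshold 2 is not covered by the three cases above, and this sketch assigns it to the uncertain class as an assumption.

```python
def classify(probability, threshold_1=0.3, threshold_2=0.7):
    """Map a threat probability to one of the three identification results."""
    if not threshold_1 < threshold_2:
        raise ValueError("threshold 1 must be less than threshold 2")
    if probability <= threshold_1:
        return "false alarm event"
    if probability > threshold_2:
        return "threat event"
    # Remaining case: threshold_1 < probability <= threshold_2.
    return "uncertain event"


for p in (0.1, 0.3, 0.5, 0.8):
    print(p, classify(p))
```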

Illustratively, the threat identification results sent by the threat identification handling module to the operations platform handling module may be as shown in table 2.

TABLE 2

In step 505, the operation platform disposition module sends the disposition result to the threat identification disposition module.

After the threat identification result is obtained by the operation platform disposal module, the operation platform disposal module displays the threat identification result to the security operator, so that the security operator can refer to the threat identification result given by the threat identification disposal module to carry out final disposal on the alarm event. Generally, when determining that the alarm event is a threat event, the security operator may determine whether to send down a blacklist to block a message related to the alarm event based on information such as a client authorization condition and an area where a source IP address of the alarm event is located.

In this way, after the security operator performs final handling of the alarm event, the operation platform handling module sends the handling result decided by the security operator to the threat identification handling module. The handling result is, for example, one of the following. Result 1: the alarm event is a false alarm, and messages related to the alarm event are allowed to pass through. Result 2: the alarm event is a threat event, and a blacklist needs to be issued to block messages related to the alarm event. Result 3: the alarm event is a threat event, but no blacklist is issued to block messages related to the alarm event. For example, referring to fig. 6, fig. 6 is a schematic diagram of a treatment result sent by an operation platform treatment module to a threat identification treatment module according to an embodiment of the application.

In step 506, the threat identification treatment module sends the treatment results to the firewall.

If the treatment result obtained by the threat identification treatment module is the result 1, the threat identification treatment module issues a white list to the firewall to allow the message related to the attack source of the alarm event to pass through and instruct the firewall to no longer report the corresponding alarm event. If the treatment result obtained by the threat identification treatment module is the result 2, the threat identification treatment module issues an ACL to the firewall to block the message related to the attack source of the alarm event. If the treatment result obtained by the threat identification treatment module is the result 3 described above, the threat identification treatment module does not send any treatment result to the firewall.
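Illustratively, the mapping from the three treatment results to what is sent to the firewall can be sketched as follows. The dictionary layout of the policy and the function name are hypothetical. Only the result numbers and their meaning come from the description above.

```python
def firewall_policy(result, attack_source_ip):
    """Return the policy to send to the firewall, or None when nothing is sent."""
    if result == 1:
        # False alarm: allow the attack source and stop reporting its alarms.
        return {"policy": "whitelist", "source_ip": attack_source_ip}
    if result == 2:
        # Threat event to be blocked: issue an ACL for the attack source.
        return {"policy": "acl_block", "source_ip": attack_source_ip}
    if result == 3:
        # Threat event that is not blocked: nothing is sent to the firewall.
        return None
    raise ValueError(f"unknown treatment result: {result}")


print(firewall_policy(1, "198.51.100.25"))
print(firewall_policy(2, "203.0.113.7"))
print(firewall_policy(3, "203.0.113.8"))
```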

In step 507, the threat identification treatment module uses the treatment result as a tag for the alert event to continuously train the threat identification model.

In this embodiment, in the process that the threat identification treatment module continuously performs threat identification on the alarm event through the threat identification model, the threat identification treatment module may also periodically collect the historical alarm event, and continuously train the threat identification model based on the treatment result of the operation platform treatment module for the historical alarm event.

Illustratively, the alarm events reported by the firewall are stored on the server, so at the time of model training the threat identification handling module in the server selects the alarm events of N months for model training. For example, in the case that the server receives approximately 1.5 thousand alarm events per day, the threat identification processing module selects more than 100 thousand alarm events as training samples to train the threat identification model.

During the training process, the threat identification module extracts the attack features corresponding to each training sample in the manner described in steps 202-203 above. In addition, the threat identification module labels the attack features of each alarm event according to the handling result of the operation platform handling module for that alarm event, using one of two labels: 1 (false alarm) or 0 (threat event). The threat identification module then takes these labeled attack features as the input of model training.
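Illustratively, turning handling results into training labels can be sketched as follows. The constant and function names and the sample feature vectors are hypothetical. The mapping of result 1 to label 1 and of results 2 and 3 to label 0 follows from the result definitions in step 505 and the two labels named above.

```python
# Handling results as numbered in step 505.
FALSE_ALARM, THREAT_BLOCKED, THREAT_NOT_BLOCKED = 1, 2, 3


def label_for(result):
    """Return 1 for a false alarm and 0 for a threat event."""
    if result == FALSE_ALARM:
        return 1
    if result in (THREAT_BLOCKED, THREAT_NOT_BLOCKED):
        return 0
    raise ValueError(f"unknown handling result: {result}")


def build_training_set(samples):
    """Split (attack_features, handling_result) pairs into features and labels."""
    features, labels = [], []
    for attack_features, result in samples:
        features.append(list(attack_features))
        labels.append(label_for(result))
    return features, labels


# Hypothetical attack feature vectors paired with handling results.
samples = [([120, 4, 90], 2), ([3, 1, 10], 1), ([45, 2, 70], 3)]
features, labels = build_training_set(samples)
print(labels)  # [0, 1, 0]
```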

Illustratively, during the training process, the input of the threat identification model may be as shown in table 3.

TABLE 3

Wherein, the attack number in table 3 represents the occurrence times of the same five-tuple of attack events as the alarm event in one day; an application represents an application that generates an alarm event, such as an application name or a protocol of the application; the first type number represents the number of first type alarm events; the second type number represents the number of alarm events of the second type.

After the input of the threat identification model is obtained, the threat identification treatment module feeds the input data into the threat identification model and trains the model with the XGBoost algorithm to generate a classification model. Referring to fig. 7, fig. 7 is a schematic diagram illustrating training of a threat identification model according to an embodiment of the application. As shown in fig. 7, the threat identification model constructed by the XGBoost algorithm includes a plurality of decision trees, and the predicted values of all the decision trees are accumulated to obtain the threat identification result.

Referring to fig. 8, fig. 8 is a schematic structural diagram of a threat event identification apparatus according to an embodiment of the application. As shown in fig. 8, the threat event identification apparatus includes: an acquisition module 801, a processing module 802, a display module 803 and a sending module 804. The acquisition module 801 is configured to obtain a first alarm event. The processing module 802 is configured to construct a knowledge graph based on the first alarm event and the plurality of alarm events in the target time period. The processing module 802 is further configured to obtain an attack feature related to the first alarm event based on the knowledge graph, and input the attack feature into the threat identification model to obtain a threat identification result. The target time period is a preset time period before the first alarm event occurs, and the knowledge graph is used for recording the relationship between each alarm event in the alarm event set and its attack source and attack target, wherein the alarm event set comprises the first alarm event and the plurality of alarm events. The threat identification result is used to indicate a probability value that the first alarm event is a threat event. The threat identification model is a machine learning model obtained through pre-training.

Optionally, the attack feature comprises one or more features in a first feature set. The first feature set includes: the number distribution of first-type alarm events over a plurality of different time periods, where the attack source of the first-type alarm events is a first attack source, and the first attack source is the attack source of the first alarm event; the number distribution of second-type alarm events over the plurality of different time periods, where the attack target of the second-type alarm events is a first attack target, and the first attack target is the attack target of the first alarm event; the number distribution of third-type alarm events over the plurality of different time periods, where the attack source of the third-type alarm events is a second attack source, and the second attack source includes an attack source that attacks the first attack target; the number of fourth-type alarm events, where the attack source of the fourth-type alarm events is the first attack source, and the fourth-type alarm events and the first alarm event correspond to the same attacked port; and the number of attack targets corresponding to the first-type alarm events. The target time period includes the plurality of different time periods.
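The first feature set can be sketched as follows, under stated assumptions. The `Alarm` record and function names are hypothetical; the "plurality of different time periods" is modelled here as look-back windows of one hour and one day ending at the first alarm event, which is only one possible choice; and the third-type alarm events are read as all alarm events whose attack source has attacked the first attack target, which is one interpretation of the wording above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    source: str       # attack source
    target: str       # attack target
    dst_port: int     # attacked port
    timestamp: float  # seconds


def distribution(alarms, now, lookbacks):
    """Count, per look-back length, the alarms with timestamp in [now - lb, now]."""
    return [sum(1 for a in alarms if now - lb <= a.timestamp <= now) for lb in lookbacks]


def first_feature_set(first, alarms, lookbacks=(3600, 86400)):
    """Compute the five features; `alarms` is the alarm event set, including `first`."""
    first_type = [a for a in alarms if a.source == first.source]
    second_type = [a for a in alarms if a.target == first.target]
    second_sources = {a.source for a in second_type}  # sources attacking the first target
    third_type = [a for a in alarms if a.source in second_sources]
    fourth_type = [a for a in first_type if a.dst_port == first.dst_port]
    now = first.timestamp
    return {
        "first_type_distribution": distribution(first_type, now, lookbacks),
        "second_type_distribution": distribution(second_type, now, lookbacks),
        "third_type_distribution": distribution(third_type, now, lookbacks),
        "fourth_type_count": len(fourth_type),
        "first_type_target_count": len({a.target for a in first_type}),
    }


first = Alarm("A", "X", 443, 100000.0)
alarms = [
    first,
    Alarm("A", "Y", 443, 99000.0),
    Alarm("A", "Z", 22, 50000.0),
    Alarm("B", "X", 80, 99500.0),
    Alarm("B", "W", 80, 20000.0),
    Alarm("C", "W", 80, 99900.0),
]
features = first_feature_set(first, alarms)
```

In this example, source A attacks three distinct targets and source B also attacks target X, so the first-type and third-type distributions differ even though both include the first alarm event itself.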

Optionally, the knowledge graph is further used for recording attribute information of each alarm event in the alarm event set. The attribute information includes one or more of a type, a five-tuple, an application generating the alarm event, a vulnerability utilized by the alarm event, and a firewall detecting the alarm event. The attack feature further includes one or more features in a second feature set. The second feature set includes: the type of the first alarm event, the number of alarm events having the same five-tuple as the first alarm event, the application generating the first alarm event, the attack direction of the first alarm event, the number of vulnerabilities utilized by alarm events having the same attack source as the first alarm event, and the number of alarm events detected by the firewall that detects the first alarm event. The five-tuple includes a source address, a destination address, a source port, a destination port, and a transport protocol.
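A minimal sketch of the second feature set, derived from the attribute information recorded for each alarm event, is given below. The record layout, field names, alarm type strings and vulnerability labels are hypothetical placeholders; the sketch also assumes that the attack source is the source address of the five-tuple and that the counts include the first alarm event itself, neither of which is fixed by the text above.

```python
from collections import namedtuple

# five_tuple = (source address, destination address, source port,
#               destination port, transport protocol)
Alarm = namedtuple(
    "Alarm", "alarm_type five_tuple application direction vulnerability firewall"
)


def second_feature_set(first, alarms):
    """Derive the second feature set; `alarms` is the alarm event set, including `first`."""
    source = first.five_tuple[0]  # assumption: attack source == source address
    same_source_vulns = {
        a.vulnerability
        for a in alarms
        if a.five_tuple[0] == source and a.vulnerability is not None
    }
    return {
        "type": first.alarm_type,
        "same_five_tuple_count": sum(1 for a in alarms if a.five_tuple == first.five_tuple),
        "application": first.application,
        "direction": first.direction,
        "vulnerability_count": len(same_source_vulns),
        "same_firewall_count": sum(1 for a in alarms if a.firewall == first.firewall),
    }


flow = ("192.0.2.1", "198.51.100.5", 40000, 80, "tcp")
first = Alarm("sql-injection", flow, "HTTP", "inbound", "VULN-A", "fw1")
alarms = [
    first,
    Alarm("sql-injection", flow, "HTTP", "inbound", "VULN-A", "fw1"),
    Alarm("port-scan", ("192.0.2.1", "198.51.100.6", 40001, 22, "tcp"), "SSH", "inbound", None, "fw2"),
    Alarm("webshell", ("192.0.2.1", "198.51.100.5", 40002, 80, "tcp"), "HTTP", "inbound", "VULN-B", "fw1"),
    Alarm("brute-force", ("192.0.2.9", "198.51.100.5", 50000, 22, "tcp"), "SSH", "inbound", "VULN-C", "fw1"),
]
features = second_feature_set(first, alarms)
```

Categorical features such as the type and the application would normally be encoded numerically before being input into a tree-based model; that step is omitted here.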

Optionally, the acquisition module 801 is further configured to acquire a threat degree of the first alarm event, and the processing module 802 is further configured to input the attack feature and the threat degree into the threat identification model, so as to obtain the threat identification result. The threat degree is obtained based on a source address corresponding to the first alarm event.

Optionally, the acquisition module 801 is further configured to acquire a second alarm event. The processing module 802 is further configured to construct a new knowledge graph based on the second alarm event and the knowledge graph. The processing module 802 is further configured to obtain an attack feature related to the second alarm event based on the new knowledge graph. The processing module 802 is further configured to input the attack feature related to the second alarm event into the threat identification model, so as to obtain a threat identification result corresponding to the second alarm event.

Optionally, the display module 803 is configured to display the first alarm event and the threat identification result; the acquisition module 801 is further configured to acquire a handling instruction for the first alarm event; and the sending module 804 is configured to send, according to the handling instruction, an alarm handling policy to the firewall that detects the first alarm event. The alarm handling policy is used to indicate the manner in which the firewall processes messages related to the first alarm event.

Optionally, if the handling instruction is to block the message related to the first alarm event, the sending module 804 is further configured to send an access control list (ACL) to the firewall, where the ACL is used to indicate blocking of messages related to the attack source of the first alarm event. Alternatively, if the handling instruction is to allow the message related to the first alarm event to pass, the sending module 804 is further configured to send a white list to the firewall, where the white list is used to indicate that messages related to the attack source of the first alarm event are normal messages.
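The two branches handled by the sending module 804 can be sketched as a simple dispatch. The instruction strings `"block"` and `"allow"`, the function name `build_policy` and the dictionary layout of the policy are hypothetical; the actual message format exchanged with a firewall is not specified in this embodiment and depends on the firewall in use.

```python
def build_policy(instruction: str, attack_source: str) -> dict:
    """Map a handling instruction to the alarm handling policy sent to the firewall."""
    if instruction == "block":
        # ACL entry: block messages related to the attack source.
        return {"kind": "acl", "action": "deny", "source": attack_source}
    if instruction == "allow":
        # White list entry: treat messages from the attack source as normal.
        return {"kind": "whitelist", "source": attack_source}
    raise ValueError(f"unsupported handling instruction: {instruction!r}")


acl_policy = build_policy("block", "192.0.2.1")
whitelist_policy = build_policy("allow", "192.0.2.1")
```

Rejecting unknown instructions, rather than silently ignoring them, is a design choice of this sketch so that a mistyped instruction cannot leave an alarm event unhandled without notice.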

Fig. 9 is a schematic structural diagram of a network device according to an embodiment of the present application. As shown in fig. 9, the network device 900 is equipped with the threat event identification apparatus described above. Network device 900 is implemented by a general bus architecture.

The network device 900 includes at least one processor 901, a communication bus 902, a memory 903, and at least one communication interface 904.

Optionally, processor 901 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing aspects of the present application, such as an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.

Communication bus 902 is used to transfer information between the components described above. The communication bus 902 may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, the bus is represented in the figure by only one bold line, but this does not mean that there is only one bus or only one type of bus.

Memory 903 is optionally a read-only memory (ROM) or another type of static storage device that can store static information and instructions. Alternatively, memory 903 is a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. Alternatively, memory 903 is an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. Optionally, the memory 903 is separate and coupled to the processor 901 via the communication bus 902. Optionally, the memory 903 and the processor 901 are integrated.

The communication interface 904 uses any transceiver-like apparatus to communicate with other devices or communication networks. The communication interface 904 includes a wired communication interface. Optionally, the communication interface 904 further comprises a wireless communication interface. The wired communication interface is, for example, an Ethernet interface. The Ethernet interface is an optical interface, an electrical interface, or a combination thereof. The wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, a combination thereof, or the like.

In a specific implementation, as an embodiment, processor 901 includes one or more CPUs, such as CPU0 and CPU1 shown in fig. 9.

In a specific implementation, as an embodiment, the network device 900 includes a plurality of processors, such as processor 901 and processor 905 shown in fig. 9. Each of these processors is a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor herein refers to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).

In some embodiments, the memory 903 is used to store the program code 99 that performs aspects of the present application, and the processor 901 executes the program code 99 stored in the memory 903. That is, the network device 900 implements the above-described method embodiments by the processor 901 and the program code 99 in the memory 903.

In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and the description of each embodiment focuses on its differences from the other embodiments.

"A refers to B" means that A is the same as B or that A is a simple variation of B.

The terms first and second and the like in the description and in the claims of embodiments of the application, are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order of the objects, and should not be interpreted to indicate or imply relative importance. For example, a first speed limiting channel and a second speed limiting channel are used to distinguish between different speed limiting channels, rather than to describe a particular order of speed limiting channels, nor should the first speed limiting channel be understood to be more important than the second speed limiting channel.

In the embodiments of the present application, unless otherwise indicated, "at least one" means one or more, and "a plurality" means two or more.

The above-described embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), a semiconductor medium (e.g., a solid state disk (SSD)), or the like.

The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims

1. A method of threat event identification, comprising:

acquiring a first alarm event;

constructing a knowledge graph based on the first alarm event and a plurality of alarm events in a target time period, wherein the target time period is a preset time period before the first alarm event occurs, the knowledge graph is used for recording the relationship between an attack source and an attack target and each alarm event in an alarm event set, and the alarm event set comprises the first alarm event and the plurality of alarm events;

acquiring attack characteristics related to the first alarm event based on the knowledge graph;

inputting the attack characteristics into a threat identification model to obtain a threat identification result, wherein the threat identification result is used for indicating a probability value that the first alarm event is a threat event, and the threat identification model is a machine learning model obtained through training in advance.

2. The method of claim 1, wherein the attack feature comprises one or more features in a first feature set, the first feature set comprising:

the number distribution, over a plurality of different time periods, of first-type alarm events in the alarm event set, wherein the attack source of the first-type alarm events is a first attack source, the first attack source is the attack source of the first alarm event, and the target time period comprises the plurality of different time periods;

the number distribution, over the plurality of different time periods, of second-type alarm events in the alarm event set, wherein the attack target of the second-type alarm events is a first attack target, and the first attack target is the attack target of the first alarm event;

the number distribution, over the plurality of different time periods, of third-type alarm events in the alarm event set, wherein the attack source of the third-type alarm events is a second attack source, and the second attack source comprises an attack source that attacks the first attack target;

the number of fourth-type alarm events in the alarm event set, wherein the attack source of the fourth-type alarm events is the first attack source, and the fourth-type alarm events and the first alarm event correspond to the same attacked port;

and the number of attack targets corresponding to the first-type alarm events.

3. The method according to claim 1 or 2, wherein the knowledge graph is further used for recording attribute information of each alarm event in the alarm event set, the attribute information including one or more of a type, a five-tuple, an application program generating the alarm event, a vulnerability utilized by the alarm event, and a firewall detecting the alarm event;

the attack feature further includes one or more features in a second feature set, the second feature set including: the type of the first alarm event, the number of fifth-type alarm events in the alarm event set, the number of alarm events having the same five-tuple as the first alarm event, the application program generating the first alarm event, the attack direction of the first alarm event, the number of vulnerabilities utilized by alarm events having the same attack source as the first alarm event, and the number of alarm events detected by the firewall that detects the first alarm event, wherein the attack source of the fifth-type alarm events is the same as the attack source of the first alarm event, the fifth-type alarm events belong to the same alarm type as the first alarm event, and the five-tuple comprises a source address, a destination address, a source port, a destination port and a transport protocol.

4. A method according to any one of claims 1-3, characterized in that the method further comprises:

acquiring threat degree of the first alarm event, wherein the threat degree is obtained based on a source address corresponding to the first alarm event;

the inputting the attack characteristics into a threat identification model to obtain a threat identification result comprises:

inputting the attack characteristics and the threat degree into the threat identification model to obtain the threat identification result.

5. The method according to any one of claims 1-4, further comprising:

acquiring a second alarm event;

constructing a new knowledge graph based on the second alarm event and the knowledge graph;

acquiring attack characteristics related to the second alarm event based on the new knowledge graph;

and inputting the attack characteristics related to the second alarm event into the threat identification model to obtain a threat identification result corresponding to the second alarm event.

6. The method according to any one of claims 1-5, further comprising:

displaying the first alarm event and the threat identification result;

acquiring a handling instruction for the first alarm event;

and sending an alarm handling policy to a firewall detecting the first alarm event according to the handling instruction, wherein the alarm handling policy is used for indicating the manner in which the firewall processes a message related to the first alarm event.

7. The method of claim 6, wherein the sending an alert handling policy to a firewall that detected the first alert event according to the handling instructions comprises:

if the handling instruction is to block the message related to the first alarm event, sending an Access Control List (ACL) to the firewall, wherein the ACL is used for indicating to block the message related to the attack source of the first alarm event;

or if the handling instruction is that the message related to the first alarm event is allowed to pass, sending a white list to the firewall, wherein the white list is used for indicating that the message related to the attack source of the first alarm event is a normal message.

8. The method of any of claims 1-7, wherein the threat identification model is trained based on a gradient boosting decision tree (GBDT) algorithm or an extreme gradient boosting (XGBoost) algorithm.

9. A threat event identification apparatus, comprising:

the acquisition module is used for acquiring a first alarm event;

the processing module is used for constructing a knowledge graph based on the first alarm event and a plurality of alarm events in a target time period, wherein the target time period is a preset time period before the first alarm event occurs, the knowledge graph is used for recording the relation between an attack source and an attack target and each alarm event in an alarm event set, and the alarm event set comprises the first alarm event and the plurality of alarm events;

the processing module is further configured to obtain an attack feature related to the first alarm event based on the knowledge graph;

the processing module is further configured to input the attack feature into a threat identification model to obtain a threat identification result, where the threat identification result is used to indicate a probability value that the first alarm event is a threat event, and the threat identification model is a machine learning model obtained through training in advance.

10. The apparatus of claim 9, wherein the attack feature comprises one or more features in a first feature set, the first feature set comprising:

the number distribution of first-type alarm events over a plurality of different time periods, wherein the attack source of the first-type alarm events is a first attack source, the first attack source is the attack source of the first alarm event, and the target time period comprises the plurality of different time periods;

the number distribution of second-type alarm events over the plurality of different time periods, wherein the attack target of the second-type alarm events is a first attack target, and the first attack target is the attack target of the first alarm event;

the number distribution of third-type alarm events over the plurality of different time periods, wherein the attack source of the third-type alarm events is a second attack source, and the second attack source comprises an attack source that attacks the first attack target;

the number of fourth-class alarm events, wherein the attack source of the fourth-class alarm event is the first attack source, and the fourth-class alarm event and the first alarm event correspond to the same attacked port;

11. The apparatus according to claim 9 or 10, wherein the knowledge graph is further configured to record attribute information of each alarm event in the set of alarm events, where the attribute information includes one or more of a type, a five tuple, an application that generates the alarm event, a vulnerability utilized by the alarm event, and a firewall that detects the alarm event;

12. The apparatus according to any one of claims 9-11, wherein,

the acquisition module is further configured to acquire a threat level of the first alarm event, where the threat level is obtained based on a source address corresponding to the first alarm event;

the processing module is further configured to input the attack feature and the threat degree into the threat identification model, so as to obtain the threat identification result.

13. The apparatus according to any one of claims 9-12, wherein,

the acquisition module is further used for acquiring a second alarm event;

the processing module is further configured to construct a new knowledge graph based on the second alarm event and the knowledge graph;

the processing module is further configured to obtain an attack feature related to the second alarm event based on the new knowledge graph;

the processing module is further configured to input the attack feature related to the second alarm event into the threat identification model, so as to obtain a threat identification result corresponding to the second alarm event.

14. The apparatus of any one of claims 9-13, further comprising a display module and a sending module;

the display module is used for displaying the first alarm event and the threat identification result;

the acquisition module is further configured to acquire a handling instruction for the first alarm event;

the sending module is configured to send an alarm handling policy to a firewall that detects the first alarm event according to the handling instruction, where the alarm handling policy is used to indicate the manner in which the firewall processes a message related to the first alarm event.

15. The apparatus of claim 14, wherein the sending module is further configured to:

if the handling instruction is blocking the message related to the first alarm event, sending an Access Control List (ACL) to the firewall, wherein the ACL is used for indicating to block the message related to the attack source of the first alarm event;

16. The apparatus of any one of claims 9-15, wherein the threat identification model is trained based on a GBDT algorithm or an XGBoost algorithm.

17. A network device comprising a processor and a memory, the memory for storing program code, the processor for invoking the program code in the memory to cause the network device to perform the method of any of claims 1-8.

18. A computer readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1-8.

19. A computer program product comprising program code which, when run on a computer, causes the computer to perform the method of any of claims 1-8.