CN113987476A - Method and device for determining defect host, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113987476A
Authority
CN
China
Prior art keywords
host
detected
outlier
network behavior
detection result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111245887.7A
Other languages
Chinese (zh)
Inventor
肖海燕
顾涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Security Technologies Co Ltd
Original Assignee
New H3C Security Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Security Technologies Co Ltd
Priority to CN202111245887.7A
Publication of CN113987476A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of the application provide a method and a device for determining a lost host (that is, a compromised host), an electronic device, and a storage medium. The scheme is as follows: acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; determining an outlier detection result of each host to be detected by using a preset outlier detection algorithm, where the outlier detection result of each host to be detected indicates whether the network behavior characteristic data of that host is an outlier among the network behavior characteristic data of all hosts to be detected; acquiring an event detection result of each host to be detected, where the event detection result indicates whether a security event occurred on the host to be detected within the preset time length; and determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected. Applying the technical scheme provided by the embodiments of the application reduces the influence of misjudged security events on the determination of the lost host and improves the accuracy of the determined lost host.

Description

Method and device for determining defect host, electronic equipment and storage medium
Technical Field
The application relates to the technical field of security protection, and in particular to a method and a device for determining a lost host (that is, a compromised host), an electronic device, and a storage medium.
Background
A security event refers to any event that attempts to change the security state of an information system (e.g., changing access control measures, changing security levels, or changing user passwords). While a user accesses the network through a host, the security management platform can detect the access behavior of the host and determine whether a security event has occurred on the host. When a security event is detected on the host, the host is determined to be a lost host, so that security protection, such as issuing a warning, can be performed for the lost host.
In the related art, false alarms may occur during security event detection, so the determined lost hosts may be wrong, which affects the accuracy of the determined lost hosts.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and a device for determining a lost host, an electronic device, and a storage medium, so as to reduce the influence of misjudged security events on the determination of the lost host and improve the accuracy of the determined lost host. The specific technical scheme is as follows:
the embodiment of the application provides a method for determining a lost host, which comprises the following steps:
acquiring network behavior characteristic data of each host to be detected in the target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of a preset group division index;
determining an outlier detection result of each host to be detected by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates whether the network behavior characteristic data of that host is an outlier among the network behavior characteristic data of all hosts to be detected;
acquiring an event detection result of each host to be detected; the event detection result is used for indicating whether a security event occurred on the host to be detected within the preset time length;
and determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
Optionally, the network behavior feature data includes at least one of: uplink traffic, downlink traffic, session number, access Internet Protocol (IP) number, and access area number.
Optionally, the step of determining the outlier detection result of each host to be detected by using a preset outlier detection algorithm includes:
determining a first outlier in the network behavior characteristic data of each host to be detected by using an isolated forest algorithm;
determining a second outlier in the network behavior characteristic data of each host to be detected by using a boxplot algorithm;
determining a target host in the target group as an outlier host, wherein the target host is the host to be detected corresponding to each piece of network behavior characteristic data in the intersection of the first outliers and the second outliers;
and determining each host to be detected in the target group except the target host as a non-outlier host.
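The intersection step above can be sketched as follows. This is a minimal illustration, assuming the two detectors have already produced their sets of flagged hosts; the function name `combine_outliers` and the host names are hypothetical and do not come from the application.

```python
def combine_outliers(all_hosts, first_outliers, second_outliers):
    """Split hosts into outlier hosts and non-outlier hosts.

    first_outliers:  hosts flagged by the isolated forest step (assumed precomputed)
    second_outliers: hosts flagged by the boxplot step (assumed precomputed)
    """
    # A host is an outlier host only if both detectors flagged it.
    outlier_hosts = set(first_outliers) & set(second_outliers)
    # Every other host in the target group is a non-outlier host.
    non_outlier_hosts = set(all_hosts) - outlier_hosts
    return outlier_hosts, non_outlier_hosts


hosts = ["h1", "h2", "h3", "h4"]
outliers, normal = combine_outliers(hosts, {"h1", "h2"}, {"h2", "h3"})
```

Requiring agreement between the two detectors trades some recall for fewer false outliers, which matches the stated goal of reducing misjudgment.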
Optionally, the step of determining a first outlier in the network behavior feature data of each host to be detected by using an isolated forest algorithm includes:
constructing a preset number of isolated trees according to the network behavior characteristic data of each host to be detected to obtain an isolated forest model;
calculating the anomaly score of the network behavior characteristic data of each host to be detected by using the isolated forest model;
and determining the network behavior characteristic data with the anomaly score larger than a first anomaly threshold value as a first outlier.
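The passage above does not state how the anomaly score is computed. For orientation only, the score commonly used with isolation forests is given below; whether the application uses exactly this form is an assumption. Here h(x) is the path length of sample x in one isolated tree, E[h(x)] is its average over all trees, n is the number of samples used to build each tree, and H(i) is the harmonic number.

```latex
s(x, n) = 2^{-\frac{E[h(x)]}{c(n)}}, \qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n}, \qquad
H(i) \approx \ln(i) + 0.5772156649
```

Under this definition, scores close to 1 indicate samples that are isolated after very few splits, so comparing the score against a first anomaly threshold selects the most easily isolated samples.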
Optionally, the step of determining a second outlier in the network behavior feature data of each host to be detected by using a boxplot algorithm includes:
for each behavior category in the network behavior characteristic data, comparing the network behavior characteristic data of each host to be detected in the behavior category with a second anomaly threshold corresponding to the behavior category; the second anomaly threshold corresponding to each behavior category is determined according to the quartiles of the network behavior characteristic data, in that behavior category, of the hosts to be detected in the target group;
and if the network behavior characteristic data of a host to be detected in any behavior category is larger than the second anomaly threshold corresponding to that behavior category, determining the network behavior characteristic data of the host to be detected as a second outlier.
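A minimal sketch of the boxplot step follows. The text only says the threshold is determined from the quartiles; the upper fence Q3 + 1.5 × IQR used here is the conventional boxplot choice and is an assumption, as are the function names, the category names, and the sample values.

```python
import statistics


def upper_fence(values, k=1.5):
    """Upper boxplot fence Q3 + k * IQR (k = 1.5 is an assumed, conventional value)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def boxplot_outliers(features):
    """features: {host: {category: value}}.

    Returns the hosts whose value exceeds the fence in at least one category.
    """
    categories = list(next(iter(features.values())).keys())
    # One threshold per behavior category, computed over the whole target group.
    thresholds = {
        c: upper_fence([f[c] for f in features.values()]) for c in categories
    }
    return {
        host
        for host, f in features.items()
        if any(f[c] > thresholds[c] for c in categories)
    }


features = {
    "h1": {"uplink": 10, "sessions": 5},
    "h2": {"uplink": 12, "sessions": 6},
    "h3": {"uplink": 11, "sessions": 5},
    "h4": {"uplink": 13, "sessions": 7},
    "h5": {"uplink": 12, "sessions": 6},
    "h6": {"uplink": 100, "sessions": 6},
}
second_outliers = boxplot_outliers(features)
```

Only the upper fence is checked because the text compares data against the threshold with "larger than"; a lower fence could be added if unusually low values were also of interest.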
Optionally, the step of determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected includes:
and for each host to be detected in the target group, if the outlier detection result indicates that the network behavior characteristic data of the host to be detected is an outlier and the event detection result indicates that the host to be detected has a security event within the preset time, determining that the host to be detected is a lost host.
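The decision rule above is a logical AND of the two results for the same preset time length. A minimal sketch, with hypothetical names and sample data:

```python
def lost_hosts(outlier_result, event_result):
    """Both arguments map host -> bool for the same preset time length."""
    return {
        host
        for host, is_outlier in outlier_result.items()
        # A host is lost only if it is an outlier AND had a security event.
        if is_outlier and event_result.get(host, False)
    }


outlier_result = {"h1": True, "h2": True, "h3": False}
event_result = {"h1": True, "h2": False, "h3": True}
flagged = lost_hosts(outlier_result, event_result)
```

In this sample, h3 has a security event but normal network behavior relative to its group, so it is treated as a likely false alarm and is not flagged.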
Optionally, before determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected, the method further includes:
acquiring outlier detection results and event detection results of each host to be detected in the target group within a plurality of continuous preset time lengths;
the step of determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected includes:
for each host to be detected in the target group and for each preset time length, if the outlier detection result corresponding to the preset time length indicates that the network behavior characteristic data of the host to be detected is an outlier, and the event detection result corresponding to the preset time length indicates that a security event occurred on the host to be detected, determining the host to be detected as a candidate host within the preset time length;
and if the number of times the host to be detected is determined as a candidate host within the plurality of preset time lengths is greater than a preset count threshold, determining that the host to be detected is a lost host.
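The multi-window variant can be sketched as a counter over consecutive preset time lengths. The data layout (one pair of result dictionaries per window) and all names are assumptions made for illustration.

```python
from collections import Counter


def lost_hosts_over_windows(windows, count_threshold):
    """windows: list of (outlier_result, event_result) pairs, one per preset time length.

    Each result maps host -> bool. A host is a candidate in a window when it is
    both an outlier and has a security event in that window.
    """
    candidate_counts = Counter()
    for outlier_result, event_result in windows:
        for host, is_outlier in outlier_result.items():
            if is_outlier and event_result.get(host, False):
                candidate_counts[host] += 1
    # Strictly greater than the threshold, as stated in the text.
    return {h for h, c in candidate_counts.items() if c > count_threshold}


windows = [
    ({"h1": True, "h2": True}, {"h1": True, "h2": True}),
    ({"h1": True, "h2": False}, {"h1": True, "h2": True}),
    ({"h1": True, "h2": True}, {"h1": True, "h2": False}),
]
result = lost_hosts_over_windows(windows, count_threshold=2)
```

Here h1 is a candidate in all three windows and h2 in only one, so with a threshold of 2 only h1 is reported.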
An embodiment of the present application further provides a device for determining a lost host, where the device includes:
the first acquisition module is used for acquiring network behavior characteristic data of each host to be detected in the target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of a preset group division index;
the first determining module is used for determining an outlier detection result of each host to be detected by utilizing a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates that: whether the network behavior characteristic data of the host to be detected is an outlier in each network behavior characteristic data;
the second acquisition module is used for acquiring the event detection result of each host to be detected; the event detection result is used for indicating whether a security event occurred on the host to be detected within the preset time length;
and the second determining module is used for determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
Optionally, the network behavior feature data includes at least one of: uplink traffic, downlink traffic, session number, access IP number, and access zone number.
Optionally, the first determining module is specifically configured to determine, by using an isolated forest algorithm, a first outlier in the network behavior feature data of each host to be detected;
determining a second outlier in the network behavior characteristic data of each host to be detected by using a boxplot algorithm;
determining a target host in the target group as an outlier, wherein the target host is a host to be detected corresponding to each network behavior characteristic data in the intersection of the first outlier and the second outlier;
and determining each host to be detected in the target group except the target host as a non-outlier host.
Optionally, the first determining module is specifically configured to construct a preset number of isolated trees according to the network behavior feature data of each host to be detected, so as to obtain an isolated forest model; calculating the abnormal score of the network behavior characteristic data of each host to be detected by using the isolated forest model; and determining the network behavior characteristic data with the anomaly score larger than a first anomaly threshold value as a first outlier.
Optionally, the first determining module is specifically configured to, for each behavior category in the network behavior feature data, compare the network behavior feature data of each host to be detected in the behavior category with a second anomaly threshold corresponding to the behavior category; the second anomaly threshold corresponding to each behavior category is determined according to the quartiles of the network behavior feature data, in that behavior category, of the hosts to be detected in the target group; and if the network behavior feature data of a host to be detected in any behavior category is larger than the second anomaly threshold corresponding to that behavior category, determine the network behavior feature data of the host to be detected as a second outlier.
Optionally, the second determining module is specifically configured to, for each host to be detected in the target group, determine that the host to be detected is a lost host if the outlier detection result indicates that the network behavior feature data of the host to be detected is an outlier and the event detection result indicates that the host to be detected has a security event within the preset time.
Optionally, the apparatus further comprises:
a third obtaining module, configured to obtain outlier detection results and event detection results of each host to be detected in the target group within a plurality of consecutive preset durations before determining a lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected;
the second determining module is specifically configured to, for each host to be detected in the target group and for each preset time length, determine the host to be detected as a candidate host within the preset time length if the outlier detection result corresponding to the preset time length indicates that the network behavior feature data of the host to be detected is an outlier and the event detection result corresponding to the preset time length indicates that a security event occurred on the host to be detected;
and if the number of times the host to be detected is determined as a candidate host within the plurality of preset time lengths is greater than a preset count threshold, determine that the host to be detected is a lost host.
Embodiments of the present application further provide an electronic device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the machine-executable instructions causing the processor to implement the steps of any of the above-described methods for determining a lost host.
Embodiments of the present application further provide a machine-readable storage medium storing machine-executable instructions executable by a processor, the machine-executable instructions causing the processor to implement the steps of any of the above-described methods for determining a lost host.
Embodiments of the present application further provide a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the above-described methods for determining a lost host.
In the technical scheme provided by the embodiment of the application, the network behavior characteristic data of each host to be detected in the target group is acquired, and the outlier detection result of each host to be detected is determined by using the preset outlier detection algorithm, so that the lost host in the target group is determined according to the outlier detection result and the event detection result of each host to be detected within the preset time.
Compared with the related technology, in the process of determining the lost host, whether the network behavior characteristic data of the host is an outlier in the target group where the host is located is comprehensively considered in addition to whether the host has a security event, namely, the deviation condition of the network behavior characteristic of the host relative to the network behavior characteristics of other hosts in the group where the host is located is also considered, so that the influence of misjudgment of the security event on the determination of the lost host is effectively reduced, and the accuracy of the determined lost host is improved.
Of course, it is not necessary for any product or method of the present application to achieve all of the above-described advantages at the same time.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first flowchart illustrating a method for determining a lost host according to an embodiment of the present application;
Fig. 2 is a second flowchart illustrating a method for determining a lost host according to an embodiment of the present application;
Fig. 3 is a third flowchart illustrating a method for determining a lost host according to an embodiment of the present application;
Fig. 4 is a fourth flowchart illustrating a method for determining a lost host according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a boxplot according to an embodiment of the present application;
Fig. 6 is a fifth flowchart illustrating a method for determining a lost host according to an embodiment of the present application;
Fig. 7 is a sixth flowchart illustrating a method for determining a lost host according to an embodiment of the present application;
Fig. 8 is a schematic diagram of the number of times a host is determined to be lost, and the corresponding loss probability, over a plurality of preset durations according to an embodiment of the present application;
Fig. 9-a is a schematic diagram of a first structure of a lost host determination apparatus according to an embodiment of the present application;
Fig. 9-b is a schematic diagram of a second structure of a lost host determination apparatus according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the related art, when a lost host is determined, a security management platform, such as a situation awareness platform, a Security Information and Event Management (SIEM) system, or a Security Operations Center (SOC), may be used to generate security events. When the security management platform generates a security event for a host, the host is determined to be a lost host. However, detected security events often include false positives.
For example, a user may manually trigger a local scan on a host, and the host then scans its locally stored data. Because the scan function was triggered, the security management platform may determine that the scan triggered a security event. Although the security event is a misjudgment, the host is still determined to be a lost host, which affects the accuracy of the determined lost host.
In order to solve the problems in the related art, an embodiment of the present application provides a method for determining a lost host. Fig. 1 is a first flowchart illustrating a method for determining a lost host according to an embodiment of the present application. The method comprises the following steps.
Step S101, acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
Step S102, determining an outlier detection result of each host to be detected by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates whether the network behavior characteristic data of that host is an outlier among the network behavior characteristic data of all hosts to be detected.
Step S103, acquiring an event detection result of each host to be detected; the event detection result is used for indicating whether a security event occurred on the host to be detected within the preset time length.
And step S104, determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
By the method shown in fig. 1, the network behavior characteristic data of each host to be detected in the target group is obtained, and the outlier detection result of each host to be detected is determined by using the preset outlier detection algorithm, so that the lost host in the target group is determined according to the outlier detection result and the event detection result of each host to be detected within the preset time duration.
Compared with the related technology, in the process of determining the lost host, whether the network behavior characteristic data of the host is an outlier in the target group where the host is located is comprehensively considered in addition to whether the host has a security event, namely, the deviation condition of the network behavior characteristic of the host relative to the network behavior characteristics of other hosts in the group where the host is located is also considered, so that the influence of misjudgment of the security event on the determination of the lost host is effectively reduced, and the accuracy of the determined lost host is improved.
The embodiments of the present application are described below through specific examples. For convenience of description, an electronic device is taken as the execution subject; the electronic device may be any detection device and is not specifically limited here.
For step S101, network behavior characteristic data of each host to be detected in the target group within a preset time length is acquired; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
In this embodiment, the electronic device may divide the plurality of hosts to be detected into a plurality of groups according to a preset group division index. The host to be detected in each group has the same index value of the preset group division index.
The preset group division index may be based on feature information of a user of the host and feature information of the host. For example, the preset group division index may be a department where a user of the host is located, an area where the host is located, and the like.
For ease of understanding, the grouping of all hosts in a company is exemplified.
When the preset group index is the department where the user of the host is located, the electronic device may determine the department where the user of each host is located according to the personnel organization structure of the company, so as to divide the hosts corresponding to the users of the same department into the same group, thereby obtaining a plurality of groups. Such as a financial staff group, a research and development staff group, a testing staff group, an auditor group, etc.
When the preset group index is an area where the host is located, such as a floor, the electronic device can divide the hosts on the same floor into the same group to obtain a plurality of groups. Such as a first floor host group, a second floor host group, etc.
In the embodiments of the present application, the index value shared by the hosts to be detected in each group depends on the preset group division index that is used. Here, the division manner of the group is not particularly limited.
The electronic equipment can obtain at least one group after the group division is carried out on the multiple hosts to be detected. And each group comprises one or more hosts to be detected. Here, the number of groups obtained by the division and the number of hosts to be detected included in each group are not particularly limited.
The target group may be any one of the groups obtained by the division.
In an optional embodiment, in order to reduce the difference between different hosts when accessing the network and reduce the influence on the accuracy of the outliers determined in the later stage, the number of hosts to be detected included in the target group may be greater than a preset number threshold.
The preset number threshold may be set according to experience of a user. For example, the preset number threshold may be 10. I.e. the number of hosts to be detected included in the target group is at least 10. Here, the preset number threshold is not particularly limited.
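The group division and the size filter can be sketched as follows. The host records, the field names (`name`, `department`), and the function name are hypothetical. The comparison uses "at least" to match the example of at least 10 hosts; the text also says "greater than", so the exact comparison is an assumption.

```python
from collections import defaultdict


def group_hosts(hosts, index_key, min_size=10):
    """Group hosts by a preset group division index and drop groups that are too small.

    hosts:     list of dicts describing hosts (hypothetical layout)
    index_key: the group division index, e.g. "department" or "floor"
    min_size:  preset number threshold; groups with fewer hosts are discarded
    """
    groups = defaultdict(list)
    for host in hosts:
        # Hosts with the same index value land in the same group.
        groups[host[index_key]].append(host["name"])
    return {value: names for value, names in groups.items() if len(names) >= min_size}


hosts = [
    {"name": "a", "department": "finance"},
    {"name": "b", "department": "finance"},
    {"name": "c", "department": "research"},
]
groups = group_hosts(hosts, "department", min_size=2)
```

Any of the returned groups can then serve as the target group for outlier detection.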
For each host to be detected in the target group, the electronic device may obtain log information of the host to be detected within a preset time period, and extract network behavior characteristic data of the host to be detected from the obtained log information.
In an optional embodiment, the network behavior feature data includes at least one of: uplink traffic, downlink traffic, session number, access IP number, and access zone number.
When the network behavior feature data includes the uplink traffic, the electronic device may calculate, according to log information of the host to be detected within a preset time period, a sum of the uplink traffic of each log of the host to be detected within the preset time period, as the uplink traffic of the host to be detected.
When the network behavior feature data includes the downlink traffic, the electronic device may calculate, according to log information of the host to be detected within a preset time period, a sum of the downlink traffic of each log of the host to be detected within the preset time period, as the downlink traffic of the host to be detected.
When the network behavior feature data includes the session number, the electronic device may count a sum of the session logs of the host to be detected within a preset time period according to log information of the host to be detected within the preset time period, and the sum is used as the session number of the host to be detected.
When the network behavior feature data includes the access IP number, the electronic device may obtain, according to log information of the host to be detected within a preset time period, a destination IP accessed by the host to be detected within the preset time period, and perform deduplication processing on the obtained destination IP. And the electronic equipment counts the number of the target IP remained after the duplicate removal processing, and the number is used as the number of the access IP of the host to be detected.
When the network behavior feature data includes the number of access areas, the electronic device may obtain, according to log information of the host to be detected within a preset time period, a country where a target IP accessed by the host to be detected within the preset time period is located, and perform deduplication processing on the obtained country. And the electronic equipment counts the number of the remaining countries after the duplication elimination as the number of the access areas of the host to be detected.
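The five feature computations described above can be sketched in one pass over a host's logs for a single preset time length. The log field names (`up_bytes`, `down_bytes`, `dst_ip`, `dst_country`) are hypothetical, and the sketch assumes one log record per session; real log formats will differ.

```python
def extract_features(logs):
    """logs: list of dicts for one host within one preset time length.

    Field names are hypothetical; one record is assumed to represent one session.
    """
    return {
        # Sums of per-log traffic.
        "uplink_traffic": sum(r["up_bytes"] for r in logs),
        "downlink_traffic": sum(r["down_bytes"] for r in logs),
        # Number of session logs.
        "session_count": len(logs),
        # Deduplicated destination IPs and destination countries.
        "access_ip_count": len({r["dst_ip"] for r in logs}),
        "access_area_count": len({r["dst_country"] for r in logs}),
    }


logs = [
    {"up_bytes": 100, "down_bytes": 900, "dst_ip": "203.0.113.5", "dst_country": "CN"},
    {"up_bytes": 50, "down_bytes": 200, "dst_ip": "203.0.113.5", "dst_country": "CN"},
    {"up_bytes": 25, "down_bytes": 400, "dst_ip": "198.51.100.7", "dst_country": "US"},
]
host_features = extract_features(logs)
```

Using sets for the destination IPs and countries performs the deduplication step directly.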
In this embodiment, the network behavior feature data may further include the number of Domain Name System (DNS) requests, the number of Uniform Resource Locator (URL) requests, and the like. Here, the network behavior feature data is not particularly limited.
The preset time duration may be set according to user requirements; for example, the preset time duration may be 1 hour, such as 10:00 to 11:00. Here, the preset time duration is not particularly limited.
For step S102, an outlier detection result of each host to be detected is determined by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates whether the network behavior characteristic data of that host is an outlier among the network behavior characteristic data of all hosts to be detected.
In this step, the electronic device may perform outlier detection by using a preset outlier detection algorithm according to the network behavior feature data of each host to be detected in the target group within a preset time duration, so as to obtain an outlier detection result of each host to be detected.
In an optional embodiment, for the outlier detection result of the host to be detected, the outlier detection result may indicate: the network behavior feature data of the host to be detected is an outlier in all the network behavior feature data included in the target group. At this time, the electronic device may determine the host to be detected as an outlier host.
In another optional embodiment, for the outlier detection result of the host to be detected, the outlier detection result may indicate: the network behavior feature data of the host to be detected is not an outlier in all the network behavior feature data included in the target group. At this time, the electronic device may determine that the host to be detected is not an outlier host, i.e., is a non-outlier host.
The preset outlier detection algorithm includes, but is not limited to, an isolated forest (isolation forest) algorithm, a box plot algorithm, a Support Vector Machine (SVM) outlier detection algorithm, and a 3-sigma anomaly detection algorithm. Here, the preset outlier detection algorithm is not particularly limited.
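The application names the 3-sigma algorithm without detailing it. As a rough sketch of the general 3-sigma rule only, the example below flags a host whose value lies more than three standard deviations from the group mean. The function name `three_sigma_outliers`, the use of the population standard deviation, and the sample data are assumptions for illustration.

```python
import statistics

def three_sigma_outliers(values_by_host):
    """Return hosts whose value lies more than 3 standard deviations from the mean."""
    values = list(values_by_host.values())
    mean = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation of the group
    return sorted(
        host for host, value in values_by_host.items()
        if abs(value - mean) > 3 * sigma
    )

# Hypothetical session numbers: twenty similar hosts and one very active host.
session_counts = {f"host-{i}": 10 for i in range(20)}
session_counts["host-20"] = 100
flagged = three_sigma_outliers(session_counts)
```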
For step S103, an event detection result of each host to be detected is obtained; the event detection result indicates whether a security event occurred on the host to be detected within the preset time period.
In this step, the security management platform may perform security event detection on each host to be detected in real time, so as to determine whether a security event occurs in each host to be detected, and obtain an event detection result of each host to be detected. The electronic device can obtain the event detection result of each host to be detected within the preset time length from the security management platform.
The event detection result can be obtained by detecting and analyzing the host to be detected through detection modes such as correlation analysis, User and Entity Behavior Analysis (UEBA) detection and the like. Here, the method of determining the event detection result is not particularly limited.
In the embodiment of the present application, the step S103 may be executed before/after the step S101 or the step S102 is executed, or may be executed simultaneously with the step S101 or the step S102. Here, the execution sequence of step S103 and step S101 or step S102 is not particularly limited.
For step S104, the lost hosts in the target group are determined according to the outlier detection result and the event detection result corresponding to each host to be detected.
In this step, for each host to be detected in the target group, the electronic device may determine whether the host to be detected is a lost host according to its outlier detection result and event detection result. The manner of determining that a host to be detected is a lost host is described below and is not detailed here.
In an optional embodiment, in step S102, determining the outlier detection result of each host to be detected by using the preset outlier detection algorithm may specifically be:
The electronic device uses the isolated forest algorithm to determine outliers in the network behavior feature data of the hosts to be detected in the target group, and determines the outlier detection result of each host to be detected in the target group based on the outliers. The determination of the outliers is described below and is not detailed here.
In another optional embodiment, in step S102, determining the outlier detection result of each host to be detected by using the preset outlier detection algorithm may specifically be:
The electronic device uses the box plot algorithm to determine outliers in the network behavior feature data of the hosts to be detected in the target group, and determines the outlier detection result of each host to be detected in the target group based on the outliers. The determination of the outliers is described below and is not detailed here.
In another optional embodiment, in order to further improve the accuracy of the determined outlier detection result, based on the method shown in fig. 1, the embodiment of the present application further provides a method for determining a lost host. Fig. 2 is a second flowchart of the method for determining a lost host according to an embodiment of the present application. The method comprises the following steps.
Step S201, acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
Step S201 is the same as step S101.
Step S202, determining a first outlier in the network behavior characteristic data of each host to be detected by using an isolated forest algorithm.
The above-mentioned first outlier can be determined by referring to the following description, and is not specifically described herein.
Step S203, determining a second outlier in the network behavior characteristic data of each host to be detected by using a box plot algorithm.
The above-mentioned second outlier can be determined by referring to the following description, and is not specifically described herein.
In the embodiment of the present application, the execution sequence of step S202 and step S203 is not particularly limited.
Step S204, determining the target host in the target group as an outlier host, where the target host is the host to be detected corresponding to each piece of network behavior feature data in the intersection of the first outlier and the second outlier.
In this step, after determining the first outlier and the second outlier, the electronic device may determine network behavior feature data included in an intersection of the first outlier and the second outlier, and determine the host to be detected corresponding to the network behavior feature data as an outlier host.
For ease of understanding, suppose that the network behavior feature data included in the first outlier is data 1 to data 3, and the network behavior feature data included in the second outlier is data 2 to data 4, where data 1 to data 4 are the network behavior feature data corresponding to host 1 to host 4, respectively.
According to the network behavior feature data included in the first outlier and the second outlier, the electronic device may determine the network behavior feature data included in the intersection of the first outlier and the second outlier as data 2 and data 3. At this time, the electronic device may determine that the host 2 corresponding to the data 2 and the host 3 corresponding to the data 3 are target hosts in the target group. That is, the electronic device may determine that the hosts 2 and 3 in the target group are outlier hosts.
In step S205, each host to be detected in the target group except the target host is determined as a non-outlier host.
In this step, after determining the target hosts in the target group as the outlier hosts, the electronic device may determine each host to be detected in the target group other than the target hosts as a non-outlier host.
Still taking host 2 and host 3 as an example, after determining that host 2 and host 3 are outlier hosts in the target group, the electronic device may determine that all hosts in the target group other than host 2 and host 3 are non-outlier hosts. That is, host 1 and host 4 above are both non-outlier hosts.
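The host 1 to host 4 example above can be sketched with plain set operations. The function name `split_hosts` is hypothetical; the host labels follow the example in the text.

```python
def split_hosts(target_group, first_outlier_hosts, second_outlier_hosts):
    """Return (outlier hosts, non-outlier hosts) using the intersection of both results."""
    outlier_hosts = first_outlier_hosts & second_outlier_hosts
    non_outlier_hosts = target_group - outlier_hosts
    return outlier_hosts, non_outlier_hosts

target_group = {"host 1", "host 2", "host 3", "host 4"}
first_outlier_hosts = {"host 1", "host 2", "host 3"}   # hosts of data 1 to data 3
second_outlier_hosts = {"host 2", "host 3", "host 4"}  # hosts of data 2 to data 4

outlier_hosts, non_outlier_hosts = split_hosts(
    target_group, first_outlier_hosts, second_outlier_hosts
)
```

Taking the intersection means a host is treated as an outlier host only when both algorithms agree, which trades some recall for fewer false positives.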
The above steps S202 to S205 are refinements of the above step S102.
In the embodiment of the present application, after the first outlier and the second outlier are determined, the hosts to be detected (i.e., the target hosts) corresponding to the network behavior feature data in the intersection of the first outlier and the second outlier are determined as the outlier hosts in the target group. In this way, the outlier detection results determined by different preset outlier detection algorithms are combined, which improves the accuracy of the determined outlier hosts and non-outlier hosts, and therefore the accuracy of the outlier detection results and of the lost hosts determined based on them.
Step S206, acquiring an event detection result of each host to be detected; the event detection result indicates whether a security event occurred on the host to be detected within the preset time period.
Step S207, determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
The above-described step S206 to step S207 are the same as the above-described step S103 to step S104.
In the embodiment shown in fig. 2, the determination of the outlier hosts and non-outlier hosts in the target group is described only by taking the combination of the outlier detection results determined by the isolated forest algorithm and the box plot algorithm as an example. In addition, the electronic device may also determine the outlier hosts and non-outlier hosts in the target group by combining at least two of the outlier detection results corresponding to the isolated forest algorithm, the box plot algorithm, the SVM outlier detection algorithm, and the 3-sigma anomaly detection algorithm. The specific determination manner may refer to the method shown in fig. 2 and is not detailed here.
In an optional embodiment, based on the method shown in fig. 2, an embodiment of the present application further provides a method for determining a lost host. Fig. 3 is a third flowchart of the method for determining a lost host according to an embodiment of the present application. The method comprises the following steps.
Step S301, acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
Step S301 is the same as step S201.
Step S302, constructing a preset number of isolated trees according to the network behavior characteristic data of each host to be detected, and obtaining an isolated forest model.
In the embodiment of the application, the electronic device can randomly construct a preset number of isolated trees according to the network behavior characteristic data of each host to be detected and the behavior categories included in the network behavior characteristic data to obtain the isolated forest model.
For ease of understanding, the construction of an isolated tree is described in conjunction with Table 1.
TABLE 1
Host | Upstream traffic | Downstream traffic
Host A | A1 | A2
Host B | B1 | B2
Host C | C1 | C2
As can be seen from table 1, the target group includes 3 hosts to be detected shown in table 1, that is, host a, host B, and host C, and the behavior categories included in the network behavior feature data of each host to be detected include: upstream traffic and downstream traffic.
When constructing the root node of the isolated tree, the electronic device may randomly select one behavior category and fill the root node with the network behavior feature data corresponding to that category. For example, if the upstream traffic is selected, the data filled into the root node are A1, B1, and C1 in Table 1.
When constructing the child nodes (i.e., the left child node and the right child node) of the root node of the isolated tree, the electronic device randomly generates a piece of cutting data, such as P1, and compares the cutting data with each piece of network behavior feature data in the root node to obtain a comparison result, for example, A1 < P1, B1 < P1, and C1 > P1. At this time, the electronic device may fill A1 and B1 into the left child node of the root node and C1 into the right child node of the root node. By analogy, the two child nodes of the left child node of the root node are further determined, thereby obtaining isolated tree A. Each isolated tree in the isolated forest model can be constructed in the same manner as isolated tree A, which is not detailed here.
In an optional embodiment, when constructing each isolated tree in the isolated forest model, the electronic device may end the construction process when every leaf node contains the network behavior feature data of only one host to be detected, thereby obtaining an isolated tree.
In another optional embodiment, when each isolated tree in the isolated forest model is constructed, the electronic device may further determine whether to end the construction process of constructing the isolated tree according to the height of the currently constructed isolated tree, so as to obtain an isolated tree. For example, the electronic device may end the construction process of constructing the isolated tree when the height of the isolated tree is 5 (i.e., the isolated tree includes a root node and four layers of leaf nodes), so as to obtain an isolated tree.
In the embodiment of the present application, the condition for each isolated tree to end the building process is not particularly limited.
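The construction described above can be sketched as a small recursive function. This is an illustration under stated assumptions, not the application's implementation: a behavior category is re-selected at random at every split (as in the standard isolation forest, whereas the text only describes the root), values equal to the cut go to the right child, the root counts as height 1, and the numeric values standing in for A1 to C2 are invented.

```python
import random

def build_isolation_tree(rows, categories, rng, height=1, max_height=5):
    """Split hosts with random cuts until each is isolated or max_height is reached.

    rows maps host name -> {behavior category -> value}.
    """
    if len(rows) <= 1 or height >= max_height:
        return {"hosts": sorted(rows)}  # leaf node
    category = rng.choice(categories)   # randomly selected behavior category
    values = [features[category] for features in rows.values()]
    cut = rng.uniform(min(values), max(values))  # randomly generated cutting data
    left = {h: f for h, f in rows.items() if f[category] < cut}
    right = {h: f for h, f in rows.items() if f[category] >= cut}
    if not left or not right:
        return {"hosts": sorted(rows)}  # the cut separated nothing; stop here
    return {
        "category": category,
        "cut": cut,
        "left": build_isolation_tree(left, categories, rng, height + 1, max_height),
        "right": build_isolation_tree(right, categories, rng, height + 1, max_height),
    }

def leaves(tree):
    """Yield the host lists stored in the leaf nodes of a tree."""
    if "hosts" in tree:
        yield tree["hosts"]
    else:
        yield from leaves(tree["left"])
        yield from leaves(tree["right"])

# Hypothetical values in place of A1..C2 from Table 1.
group = {
    "Host A": {"upstream": 120.0, "downstream": 900.0},
    "Host B": {"upstream": 150.0, "downstream": 950.0},
    "Host C": {"upstream": 4000.0, "downstream": 300.0},
}
tree = build_isolation_tree(group, ["upstream", "downstream"], random.Random(0))
leaf_hosts = list(leaves(tree))
```

Both end conditions from the text appear here: a node with a single host becomes a leaf, and `max_height=5` stops growth at a root plus four further layers.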
In the embodiment of the present application, the larger the preset number, the more isolated trees the isolated forest model includes and the higher the accuracy of the outliers determined based on the isolated forest model, but also the larger the amount of computation in the outlier determination process. Therefore, in order to balance the accuracy of the determined outliers against the amount of computation, the preset number may be determined according to user requirements or user experience. For example, the preset number may be 100. Here, the preset number is not particularly limited.
Step S303, calculating the anomaly score of the network behavior feature data of each host to be detected by using the isolated forest model.
In this step, after the isolated forest model is constructed, for each host to be detected in the target group, the electronic device may input the network behavior feature data of the host to be detected into the isolated forest model to obtain the anomaly score output by the isolated forest model, i.e., the anomaly score of the host to be detected.
In the embodiment of the present application, the anomaly score may range from 0 to 1. The larger the anomaly score of the host to be detected, the higher the probability that the network behavior feature data of the host to be detected is an outlier; the smaller the anomaly score of the host to be detected, the lower the probability that the network behavior feature data of the host to be detected is an outlier.
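The application does not state how the anomaly score is computed. For reference, the standard isolation forest formulation defines a score with this 0 to 1 range as follows, where h(x) is the path length of sample x in one isolated tree, E[h(x)] is its average over all trees, and n is the number of samples used to build each tree. Whether the application uses exactly this formula is an assumption.

```latex
s(x, n) = 2^{-\frac{E[h(x)]}{c(n)}}, \qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n}, \qquad
H(i) \approx \ln(i) + 0.5772156649
```

Samples that are isolated after few cuts have a short average path length, so their score approaches 1; samples that need many cuts have a score well below 0.5.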
After determining the anomaly score of the network behavior feature data of each host to be detected in the target group, the electronic device may compare the anomaly score of the network behavior feature data of the host to be detected with the first anomaly threshold to obtain a comparison result. The comparison result is one of the following two cases.
Case one: the anomaly score of the network behavior feature data of the host to be detected is greater than the first anomaly threshold.
Case two: the anomaly score of the network behavior feature data of the host to be detected is less than or equal to the first anomaly threshold.
The first anomaly threshold may be set according to user requirements or experience. For example, the first anomaly threshold may be 0.8. Here, the first abnormality threshold is not particularly limited.
Step S304, the network behavior feature data with the abnormal score larger than the first abnormal threshold value is determined as a first outlier.
In this step, for each host to be detected in the target group, when the comparison result between the anomaly score of the network behavior feature data of the host to be detected and the first anomaly threshold is case one above, that is, the anomaly score of the network behavior feature data of the host to be detected is greater than the first anomaly threshold, the electronic device may determine that the network behavior feature data of the host to be detected is an outlier, and record the outlier as the first outlier.
In an optional embodiment, for each host to be detected in the target group, when the comparison result between the anomaly score of the network behavior feature data of the host to be detected and the first anomaly threshold is case two above, that is, the anomaly score of the network behavior feature data of the host to be detected is less than or equal to the first anomaly threshold, the electronic device may determine that the network behavior feature data of the host to be detected is not an outlier, i.e., is a non-outlier.
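The comparison with the first anomaly threshold can be sketched as follows. The threshold 0.8 is the example value from the text; the function name `first_outliers` and the scores are hypothetical.

```python
FIRST_ANOMALY_THRESHOLD = 0.8  # example value given in the text

def first_outliers(scores, threshold=FIRST_ANOMALY_THRESHOLD):
    """Return hosts whose anomaly score is strictly greater than the threshold."""
    return {host for host, score in scores.items() if score > threshold}

# Hypothetical anomaly scores output by an isolated forest model.
scores = {"Host A": 0.91, "Host B": 0.80, "Host C": 0.42}
first = first_outliers(scores)
```

Host B sits exactly on the threshold and therefore falls under case two (less than or equal), so it is not recorded as a first outlier.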
Through steps S302 to S304, the electronic device can accurately determine the hosts to be detected in the target group whose network behavior feature data belongs to the first outlier.
The above steps S302 to S304 are refinements of the above step S202.
Step S305, determining a second outlier in the network behavior characteristic data of each host to be detected by using a box plot algorithm.
Step S306, determining the target host in the target group as an outlier host, where the target host is the host to be detected corresponding to each piece of network behavior feature data in the intersection of the first outlier and the second outlier.
Step S307, determine each host to be detected in the target group except the target host as a non-outlier host.
Step S308, obtaining an event detection result of each host to be detected; the event detection result indicates whether a security event occurred on the host to be detected within the preset time period.
Step S309, determining a lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
The above-described steps S305 to S309 are the same as the above-described steps S203 to S207.
In an optional embodiment, based on the method shown in fig. 2, an embodiment of the present application further provides a method for determining a lost host. Fig. 4 is a fourth flowchart of the method for determining a lost host according to an embodiment of the present application. The method comprises the following steps.
Step S401, acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
Step S402, determining a first outlier in the network behavior feature data of each host to be detected by using an isolated forest algorithm.
The above-described steps S401 to S402 are the same as the above-described steps S201 to S202.
Step S403, for each behavior category in the network behavior feature data, comparing the network behavior feature data of each host to be detected in the behavior category with a second anomaly threshold corresponding to the behavior category; the second anomaly threshold corresponding to each behavior category is determined according to the quartiles of the network behavior feature data, in that behavior category, of the hosts to be detected in the target group.
In this step, for each host to be detected in the target group, the electronic device may compare the network behavior feature data corresponding to each behavior category of the host to be detected with the second anomaly threshold corresponding to that behavior category to obtain a comparison result. The comparison result is one of the following two cases.
In the first case, the network behavior feature data corresponding to each behavior category of the host to be detected is less than or equal to the second abnormal threshold corresponding to the behavior category.
In case two, the network behavior feature data corresponding to at least one behavior class of the host to be detected is greater than the second anomaly threshold corresponding to the behavior class.
In the embodiment of the present application, the quartile includes an upper quartile and a lower quartile.
In an optional embodiment, the second anomaly threshold is the upper quartile plus 1.5 times the interquartile range of the network behavior feature data of all the hosts to be detected in the target group.
For ease of understanding, the second anomaly threshold is described with reference to fig. 5, taking one behavior category, such as the uplink traffic above, as an example. Fig. 5 is a schematic diagram of a box plot provided in an embodiment of the present application.
The electronic device may sort the uplink traffic of each host to be detected in the target group in descending order to obtain the box plot shown in fig. 5, where Max is the maximum uplink traffic and Min is the minimum uplink traffic.
The electronic device divides all the uplink traffic values into four equal parts, which yields the three division points Q1, M, and Q2 shown in fig. 5, where M is the median of all the uplink traffic, Q1 is the lower quartile of all the uplink traffic, and Q2 is the upper quartile of all the uplink traffic.
The second anomaly threshold corresponding to the uplink traffic may be expressed as Q2 + 1.5 × (Q2 - Q1), where Q2 - Q1 is the interquartile range.
In the embodiment of the present application, the second anomaly threshold corresponding to each behavior category depends on the network behavior feature data, in that category, of the hosts to be detected in the target group, and may therefore differ between behavior categories. Here, the second anomaly threshold corresponding to each behavior category is not particularly limited.
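The threshold Q2 + 1.5 × (Q2 - Q1) can be sketched with the Python standard library. The choice of `statistics.quantiles` with `method="inclusive"` is an assumption, since the application does not say how the quartiles are interpolated; the uplink traffic values are invented.

```python
import statistics

def second_anomaly_threshold(values):
    """Return upper quartile + 1.5 * interquartile range for one behavior category."""
    # q1 = lower quartile, m = median, q2 = upper quartile (names follow fig. 5)
    q1, m, q2 = statistics.quantiles(values, n=4, method="inclusive")
    return q2 + 1.5 * (q2 - q1)

uplink_traffic = [10, 20, 30, 40, 500]  # hypothetical uplink traffic of five hosts
threshold = second_anomaly_threshold(uplink_traffic)
```

For these values the quartiles are 20, 30, and 40, so the interquartile range is 20 and the threshold is 70; only the value 500 exceeds it.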
Step S404, if the network behavior feature data of the host to be detected in any behavior category is greater than the second anomaly threshold corresponding to the behavior category, determining the network behavior feature data of the host to be detected as a second outlier.
In this step, for each host to be detected in the target group, when the comparison result between the network behavior feature data corresponding to the behavior categories of the host to be detected and the second anomaly thresholds corresponding to the behavior categories is case two above, that is, the network behavior feature data corresponding to at least one behavior category of the host to be detected is greater than the second anomaly threshold corresponding to that behavior category, the electronic device may determine the network behavior feature data of the host to be detected as an outlier, and record the outlier as the second outlier.
Through the steps S403 to S404, the electronic device may accurately determine, by using a box plot algorithm, the host to be detected whose network behavior feature data in the target group belongs to the second outlier.
The above steps S403 to S404 are refinements of the above step S203.
In an optional embodiment, for each host to be detected in the target group, when the comparison result between the network behavior feature data corresponding to the behavior categories of the host to be detected and the second anomaly thresholds corresponding to the behavior categories is case one above, that is, the network behavior feature data corresponding to each behavior category of the host to be detected is less than or equal to the second anomaly threshold corresponding to that behavior category, the electronic device may determine the network behavior feature data of the host to be detected as a non-outlier.
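Steps S403 and S404 can be sketched together: a threshold is computed per behavior category, and a host is recorded as a second outlier as soon as any one of its categories exceeds the corresponding threshold. The quartile method (`method="inclusive"`), the function names, and the data are assumptions for illustration.

```python
import statistics

def category_thresholds(hosts, categories):
    """Compute upper quartile + 1.5 * IQR for every behavior category."""
    thresholds = {}
    for category in categories:
        values = [features[category] for features in hosts.values()]
        q1, _median, q2 = statistics.quantiles(values, n=4, method="inclusive")
        thresholds[category] = q2 + 1.5 * (q2 - q1)
    return thresholds

def second_outliers(hosts, categories):
    """Return hosts with at least one category value above its threshold."""
    thresholds = category_thresholds(hosts, categories)
    return {
        host for host, features in hosts.items()
        if any(features[c] > thresholds[c] for c in categories)
    }

# Hypothetical per-host feature data for two behavior categories.
hosts = {
    "host 1": {"uplink": 10, "downlink": 100},
    "host 2": {"uplink": 20, "downlink": 110},
    "host 3": {"uplink": 30, "downlink": 120},
    "host 4": {"uplink": 40, "downlink": 130},
    "host 5": {"uplink": 500, "downlink": 140},
}
second = second_outliers(hosts, ["uplink", "downlink"])
```

Here host 5 is normal in downlink traffic but far above the uplink threshold, which is enough to record it as a second outlier.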
Step S405, determining the target host in the target group as an outlier host, where the target host is the host to be detected corresponding to each piece of network behavior feature data in the intersection of the first outlier and the second outlier.
Step S406, determine each host to be detected in the target group except the target host as a non-outlier host.
Step S407, acquiring an event detection result of each host to be detected; the event detection result indicates whether a security event occurred on the host to be detected within the preset time period.
Step S408, determining a lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
The above-described steps S405 to S408 are the same as the above-described steps S204 to S207.
In an optional embodiment, based on the method shown in fig. 1, an embodiment of the present application further provides a method for determining a lost host. Fig. 6 is a fifth flowchart of the method for determining a lost host according to an embodiment of the present application. The method comprises the following steps.
Step S601, acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
Step S602, determining an outlier detection result of each host to be detected by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates whether the network behavior feature data of the host to be detected is an outlier among all the network behavior feature data.
Step S603, obtaining an event detection result of each host to be detected; the event detection result indicates whether a security event occurred on the host to be detected within the preset time period.
The above steps S601 to S603 are the same as the above steps S101 to S103.
Step S604, for each host to be detected in the target group, if the outlier detection result indicates that the network behavior feature data of the host to be detected is an outlier and the event detection result indicates that the host to be detected has a security event within a preset time, determining that the host to be detected is a lost host.
In this embodiment, after obtaining the outlier detection result and the event detection result of each host to be detected in the target group within the preset time period, the electronic device may determine, for each host to be detected in the target group, whether the outlier detection result indicates that the network behavior feature data of the host within the preset time period is an outlier, and whether the event detection result indicates that a security event occurred on the host within the preset time period. The possible combinations are shown in Table 2.
TABLE 2
Case | Event detection result | Outlier detection result
1 | Security event occurred | Outlier
2 | No security event occurred | Non-outlier
3 | Security event occurred | Non-outlier
4 | No security event occurred | Outlier
For each host to be detected in the target group, when the event detection result of the host to be detected within the preset time period indicates that a security event occurred and the outlier detection result indicates that its network behavior feature data is an outlier, that is, when case 1 shown in Table 2 is satisfied, the electronic device may determine the host to be detected as a lost host.
The above step S604 is a refinement of the above step S104.
In an optional embodiment, for each host to be detected in the target group, if the outlier detection result indicates that the network behavior feature data of the host to be detected is not an outlier, or the event detection result indicates that the host to be detected does not generate a security event within a preset time period, it is determined that the host to be detected is not a lost host.
Specifically, for each host to be detected in the target group, when the event detection result of the host to be detected in the preset time duration indicates that no security event has occurred, or the outlier detection result indicates that the network behavior characteristic data thereof is not an outlier, that is, when any one of the conditions 2, 3, and 4 shown in table 2 is satisfied, the electronic device may determine that the host to be detected is not a lost host.
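The decision over the four cases of Table 2 reduces to a logical AND of the two detection results, as the following sketch shows. The function name `is_lost_host` is hypothetical.

```python
def is_lost_host(has_security_event, is_outlier):
    """A host is a lost host only when both detection results are positive."""
    return has_security_event and is_outlier

# Cases 1 to 4 of Table 2 as (event detection result, outlier detection result).
table_2 = {
    1: (True, True),
    2: (False, False),
    3: (True, False),
    4: (False, True),
}
decisions = {case: is_lost_host(*results) for case, results in table_2.items()}
```

Only case 1 yields a lost host; cases 2, 3, and 4 all yield a host that is not determined to be lost.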
In another optional embodiment, a security event may be misjudged, which may cause errors in the event detection results of the hosts to be detected. For example, a host that has been lost may be determined to have had no security event, or a host that has not been lost may be determined to have had a security event. Therefore, for each host to be detected in the target group, if the outlier detection result indicates that the network behavior feature data of the host to be detected is not an outlier, or the event detection result indicates that no security event occurred on the host to be detected within the preset time period, the electronic device may leave the host to be detected unprocessed. That is, the electronic device does not determine the host to be detected as a lost host.
In an optional embodiment, for a host to be detected whose outlier detection result indicates an outlier but on which no security event occurred, the electronic device may perform further detection on the host in order to determine whether it is a lost host, for example, by detecting whether the host has been maliciously attacked. Here, the detection method for the host to be detected is not particularly limited.
Through the steps S603 to S604, the electronic device can accurately determine the lost host in the target group according to the event detection result and the outlier detection result of each host to be detected in the target group within the preset time duration, so that the influence of misjudgment of the security event on the determination of the lost host is reduced, and the accuracy of the determined lost host is improved.
In an optional embodiment, based on the method shown in fig. 1, an embodiment of the present application further provides a method for determining a lost host. Fig. 7 is a sixth flowchart of the method for determining a lost host according to an embodiment of the present application. The method comprises the following steps.
Step S701, acquiring network behavior characteristic data of each host to be detected in a target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index.
Step S702, determining an outlier detection result of each host to be detected by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates whether the network behavior characteristic data of the host to be detected is an outlier among all the network behavior characteristic data.
Step S703, obtaining the event detection result of each host to be detected; the event detection result is used for indicating whether a security event occurs on the host to be detected within the preset time length.
The above steps S701 to S703 are the same as the above steps S101 to S103.
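As an illustration of how a target group might be formed in step S701, the sketch below groups hosts by the value of a group division index. The index name `department` and the host records are assumptions made for the example; the steps above do not fix a particular index.

```python
from collections import defaultdict


def group_hosts(hosts, index_name):
    """Group host addresses by the value of one group division index."""
    groups = defaultdict(list)
    for host in hosts:
        groups[host[index_name]].append(host["ip"])
    return dict(groups)


# Hypothetical host inventory; "department" stands in for the preset index.
hosts = [
    {"ip": "10.0.0.1", "department": "finance"},
    {"ip": "10.0.0.2", "department": "finance"},
    {"ip": "10.0.1.1", "department": "research"},
]
groups = group_hosts(hosts, "department")
print(groups)
```

Each resulting list contains hosts with the same index value and can be treated as one target group for the subsequent outlier detection.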
Step S704, obtaining the outlier detection result and the event detection result of each host to be detected in the target group within a plurality of consecutive preset durations.
In this step, the electronic device may repeatedly perform the above steps S701 to S703 to obtain an outlier detection result and an event detection result of each host to be detected in the target group within a plurality of consecutive preset durations.
For example, with a preset duration of one hour, the electronic device may obtain the outlier detection result and the event detection result of each host to be detected in the target group for each hour in the period from 9:00 to 18:00.
In the embodiment of the application, in consideration of the limitation of the outlier detection result and the event detection result of the host to be detected within a single preset time, the electronic device determines the lost host in the target group by acquiring the outlier detection result and the event detection result of the host to be detected within a plurality of continuous preset times.
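The consecutive preset durations in the example above (hourly windows from 9:00 to 18:00) could be enumerated as in the following sketch. The function name and the calendar date are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def consecutive_windows(start, end, duration):
    """Split [start, end] into back-to-back windows of equal duration."""
    windows = []
    t = start
    while t + duration <= end:
        windows.append((t, t + duration))
        t += duration
    return windows


day = datetime(2024, 1, 1)  # arbitrary date, used only for illustration
windows = consecutive_windows(
    day.replace(hour=9), day.replace(hour=18), timedelta(hours=1)
)
for begin, finish in windows:
    print(begin.strftime("%H:%M"), "to", finish.strftime("%H:%M"))
```

Steps S701 to S703 would then be repeated once per window, yielding one outlier detection result and one event detection result per host per window.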
Step S705, for each host to be detected in the target group, in each preset time period, if the outlier detection result corresponding to the preset time period indicates that the network behavior feature data of the host to be detected is an outlier, and the event detection result of the preset time period indicates that the host to be detected has a security event, determining the host to be detected as a candidate host within the preset time period.
In this step, for each host to be detected in the target group, when the outlier detection result of the host to be detected within a certain preset duration indicates that the network behavior feature data of the host to be detected is an outlier, and the event detection result within that preset duration indicates that a security event occurred on the host to be detected, the electronic device may determine that the host to be detected has been lost once. At this time, the electronic device may determine the host to be detected as a candidate host within the preset duration.
Step S706, for each host to be detected in the target group, if the number of times that the host to be detected is determined to be the candidate host within the plurality of consecutive preset durations is greater than a preset number threshold, determining that the host to be detected is the lost host.
In this step, for each host to be detected in the target group, when the number of times that the host to be detected is determined to be the candidate host within the consecutive preset durations is greater than a preset number threshold, that is, the number of times that the host to be detected is lost within the consecutive preset durations is greater than a preset number threshold, the electronic device may determine that the host to be detected is the lost host.
In the embodiment of the application, the accuracy of the determined lost host can be further improved by obtaining the event detection result and the outlier detection result of each host to be detected in the target group within a plurality of continuous preset durations.
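Steps S705 and S706 can be sketched as a counting procedure. The data layout (one dictionary per preset duration, mapping a host to its two boolean results) and the threshold value are assumptions made for the example.

```python
from collections import Counter


def lost_hosts_over_windows(window_results, count_threshold):
    """Count, per host, the durations in which it is a candidate host.

    window_results: one dict per preset duration, mapping
    host -> (is_outlier, has_security_event).
    """
    candidate_counts = Counter()
    for results in window_results:
        for host, (is_outlier, has_event) in results.items():
            if is_outlier and has_event:  # step S705: candidate in this duration
                candidate_counts[host] += 1
    # Step S706: strictly greater than the preset number threshold.
    return sorted(h for h, n in candidate_counts.items() if n > count_threshold)


window_results = [
    {"hostA": (True, True), "hostB": (True, True), "hostC": (False, True)},
    {"hostA": (True, True), "hostB": (True, False), "hostC": (False, False)},
    {"hostA": (True, True), "hostB": (False, False), "hostC": (True, False)},
]
lost = lost_hosts_over_windows(window_results, count_threshold=2)
print(lost)
```

Here `hostA` is a candidate in three durations and exceeds the illustrative threshold of 2, while `hostB` is a candidate only once and `hostC` never.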
For ease of understanding, the curve shown in FIG. 8 was obtained through a number of experiments. Fig. 8 is a schematic diagram of the relationship between the number of times a host is lost within a plurality of preset durations and the probability that the host is lost, according to the embodiment of the present disclosure.
In the curve shown in fig. 8, the horizontal axis is the number of times that the host to be detected is determined as the candidate host, and the vertical axis is the probability that the host is lost. As the number of times that the host is lost increases, the probability that the host is lost also increases. Therefore, the more times a host is lost within the plurality of preset durations, the higher the accuracy of determining that host as a lost host.
For example, in fig. 8, when the number of times the host is lost is 1, the probability that the host is lost is 55%. That is, when the host is lost once within the plurality of preset durations, the accuracy of determining the host as a lost host can reach 55%. When the number of times the host is lost is 20, the probability that the host is lost is 100%. That is, when the host is lost 20 times within the plurality of preset durations, the accuracy of determining the host as a lost host can reach 100%.
In an optional embodiment, the preset number threshold may be set according to a user requirement. For example, the preset number threshold may be 20. Here, the preset number threshold is not particularly limited.
Based on the same inventive concept, according to the above method for determining a lost host provided in the embodiment of the present application, the embodiment of the present application further provides a device for determining a lost host. As shown in fig. 9-a, fig. 9-a is a schematic diagram of a first structure of a lost host determination apparatus according to an embodiment of the present application. The apparatus includes the following modules.
A first obtaining module 901, configured to obtain network behavior feature data of each host to be detected in the target group within a preset time; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of the preset group division index;
a first determining module 902, configured to determine an outlier detection result of each host to be detected by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates that: whether the network behavior characteristic data of the host to be detected is an outlier in each network behavior characteristic data;
a second obtaining module 903, configured to obtain an event detection result of each host to be detected; the event detection result is used for indicating whether a security event occurs on the host to be detected within the preset time length;
the second determining module 904 is configured to determine a lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
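One way to picture how modules 901 to 904 cooperate is the following sketch, in which each module is represented by a pluggable function. The class name, the callables, and the stand-in detection logic are all hypothetical; the apparatus itself is not limited to this structure.

```python
class LostHostDeterminer:
    """Wires pluggable stages in the order of modules 901 to 904."""

    def __init__(self, get_features, detect_outliers, get_events):
        self.get_features = get_features        # role of first obtaining module 901
        self.detect_outliers = detect_outliers  # role of first determining module 902
        self.get_events = get_events            # role of second obtaining module 903

    def determine(self, group):
        features = self.get_features(group)
        outlier_hosts = self.detect_outliers(features)
        events = self.get_events(group)
        # Role of second determining module 904: combine both results.
        return [h for h in group if h in outlier_hosts and events.get(h, False)]


# Hypothetical stand-ins so that the sketch runs end to end.
sessions = {"h1": 20, "h2": 400, "h3": 22}
determiner = LostHostDeterminer(
    get_features=lambda group: {h: sessions[h] for h in group},
    detect_outliers=lambda features: {h for h, v in features.items() if v > 100},
    get_events=lambda group: {"h2": True, "h3": True},
)
lost = determiner.determine(["h1", "h2", "h3"])
print(lost)
```

The fixed cutoff used for `detect_outliers` is only a placeholder; any preset outlier detection algorithm could be supplied in its place.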
Optionally, the network behavior feature data includes at least one of: uplink traffic, downlink traffic, session number, access IP number, and access zone number.
Optionally, the first determining module 902 may be specifically configured to determine, by using an isolated forest algorithm, a first outlier in the network behavior feature data of each host to be detected;
determining a second outlier in the network behavior characteristic data of each host to be detected by using a boxplot algorithm;
determining a target host in the target group as an outlier, wherein the target host is a host to be detected corresponding to each network behavior characteristic data in the intersection of the first outlier and the second outlier;
and determining each host to be detected in the target group except the target host as a non-outlier host.
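The combination of the two outlier sets can be sketched with set operations. The function and host names are hypothetical; the inputs are assumed to be the hosts whose data was flagged by the isolated forest algorithm and by the boxplot algorithm respectively.

```python
def combine_outliers(all_hosts, first_outliers, second_outliers):
    """Target hosts lie in the intersection; every other host is a non-outlier."""
    target_hosts = set(first_outliers) & set(second_outliers)
    non_outlier_hosts = set(all_hosts) - target_hosts
    return target_hosts, non_outlier_hosts


all_hosts = ["h1", "h2", "h3", "h4"]
target, non_outliers = combine_outliers(
    all_hosts,
    first_outliers=["h2", "h3"],   # flagged by the isolated forest algorithm
    second_outliers=["h3", "h4"],  # flagged by the boxplot algorithm
)
print(sorted(target), sorted(non_outliers))
```

Requiring agreement between both algorithms means a host flagged by only one of them (`h2` or `h4` here) is treated as a non-outlier host.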
Optionally, the first determining module 902 may be specifically configured to construct a preset number of isolated trees according to the network behavior feature data of each host to be detected, so as to obtain an isolated forest model; calculate the anomaly score of the network behavior feature data of each host to be detected by using the isolated forest model; and determine the network behavior feature data whose anomaly score is larger than a first anomaly threshold as a first outlier.
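The following is a simplified, pure-Python sketch of an isolation-forest style score, intended only to make the three operations above concrete. It is not the implementation of the application: it builds every tree on the full group without subsampling, and the feature values (uplink traffic, number of sessions), the number of trees, and the threshold of 0.6 are assumptions chosen for the example.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaf size."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(points, n_trees=100, seed=0):
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    rng = random.Random(seed)
    n = len(points)  # assumes n > 1
    max_depth = math.ceil(math.log2(n))
    forest = [build_tree(points, 0, max_depth, rng) for _ in range(n_trees)]
    scores = []
    for p in points:
        mean_path = sum(path_length(p, tree) for tree in forest) / n_trees
        scores.append(2.0 ** (-mean_path / c_factor(n)))
    return scores


# Illustrative (uplink traffic, session count) pairs; the last host deviates.
data = [(100, 20), (110, 22), (95, 19), (105, 21),
        (98, 18), (102, 23), (108, 20), (5000, 400)]
scores = anomaly_scores(data)
FIRST_ANOMALY_THRESHOLD = 0.6  # assumed value
first_outliers = [i for i, s in enumerate(scores) if s > FIRST_ANOMALY_THRESHOLD]
print(first_outliers)
```

A point that is separated from the rest after very few random splits has a short average path and therefore a high score, which is why the deviating host stands out from the other seven.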
Optionally, the first determining module 902 may be specifically configured to, for each behavior category in the network behavior feature data, compare the network behavior feature data of each host to be detected in the behavior category with a second anomaly threshold corresponding to the behavior category; the second anomaly threshold corresponding to each behavior category is determined according to the quartiles of the network behavior feature data, in that behavior category, of the hosts to be detected in the target group; and if the network behavior feature data of the host to be detected in any behavior category is larger than the second anomaly threshold corresponding to the behavior category, determine the network behavior feature data of the host to be detected as a second outlier.
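A boxplot-style check of this kind can be sketched as below. The text above only states that the second anomaly threshold is determined from the quartiles; the specific upper fence Q3 + 1.5 × (Q3 − Q1) used here is the conventional boxplot rule and is an assumption, as are the category names and sample values.

```python
import statistics


def boxplot_thresholds(features, k=1.5):
    """Per behavior category, compute the assumed upper fence Q3 + k * IQR."""
    thresholds = {}
    categories = next(iter(features.values())).keys()
    for category in categories:
        values = [f[category] for f in features.values()]
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        thresholds[category] = q3 + k * (q3 - q1)
    return thresholds


def second_outliers(features, thresholds):
    """A host is flagged if ANY behavior category exceeds its threshold."""
    return [host for host, f in features.items()
            if any(f[category] > thresholds[category] for category in thresholds)]


# Illustrative feature data for one target group.
uplink = [100, 110, 95, 105, 98, 102, 108, 5000]
sessions = [20, 22, 19, 21, 18, 23, 20, 400]
features = {f"h{i + 1}": {"uplink": u, "sessions": s}
            for i, (u, s) in enumerate(zip(uplink, sessions))}

thresholds = boxplot_thresholds(features)
flagged = second_outliers(features, thresholds)
print(thresholds, flagged)
```

Because the thresholds are derived from the group's own quartiles, they adapt to each target group rather than relying on a fixed global limit.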
Optionally, the second determining module 904 may be specifically configured to, for each host to be detected in the target group, determine that the host to be detected is a lost host if the outlier detection result indicates that the network behavior feature data of the host to be detected is an outlier and the event detection result indicates that the host to be detected has a security event within a preset time period;
and if the outlier detection result indicates that the network behavior characteristic data of the host to be detected is not an outlier or the event detection result indicates that the host to be detected does not generate a security event within a preset time period, determining that the host to be detected is not a lost host.
Optionally, as shown in fig. 9-b, the apparatus for determining a lost host may further include:
a third obtaining module 905, configured to obtain an outlier detection result and an event detection result of each host to be detected in the target group within a plurality of consecutive preset durations before determining a lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected;
the second determining module 904 may be specifically configured to, for each host to be detected in the target group, within each preset time duration, determine the host to be detected as a candidate host within the preset time duration if an outlier detection result corresponding to the preset time duration indicates that network behavior feature data of the host to be detected is an outlier, and an event detection result of the preset time duration indicates that a security event occurs in the host to be detected;
if the number of times that the host to be detected is determined as the candidate host within the plurality of preset durations is greater than a preset number threshold, determining that the host to be detected is a lost host;
and if the number of times that the host to be detected is determined as the candidate host within the plurality of preset durations is not greater than the preset number threshold, determining that the host to be detected is not a lost host.
By the device provided by the embodiment of the application, the network behavior characteristic data of each host to be detected in the target group is obtained, the outlier detection result of each host to be detected is determined by using the preset outlier detection algorithm, and therefore the lost host in the target group is determined according to the outlier detection result and the event detection result of each host to be detected within the preset time.
Compared with the related technology, in the process of determining the lost host, whether the network behavior characteristic data of the host is an outlier in the target group where the host is located is comprehensively considered in addition to whether the host has a security event, namely, the deviation condition of the network behavior characteristic of the host relative to the network behavior characteristics of other hosts in the group where the host is located is also considered, so that the influence of misjudgment of the security event on the determination of the lost host is effectively reduced, and the accuracy of the determined lost host is improved.
Based on the same inventive concept, according to the method for determining a lost host provided in the embodiment of the present application, the embodiment of the present application further provides an electronic device. As shown in fig. 10, the electronic device includes a processor 1001 and a machine-readable storage medium 1002, where the machine-readable storage medium 1002 stores machine-executable instructions capable of being executed by the processor 1001. The machine-executable instructions cause the processor 1001 to implement any of the steps shown in fig. 1 to 4 and fig. 6 to 7 described above.
In an alternative embodiment, as shown in fig. 10, the electronic device may further include: a communication interface 1003 and a communication bus 1004; the processor 1001, the machine-readable storage medium 1002, and the communication interface 1003 complete communication with each other through the communication bus 1004, and the communication interface 1003 is used for communication between the electronic device and other devices.
Based on the same inventive concept, according to the above-mentioned method for determining a lost host provided in the embodiment of the present application, the embodiment of the present application further provides a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions capable of being executed by a processor. The machine-executable instructions cause the processor to implement any of the steps shown in fig. 1 to 4 and fig. 6 to 7 described above.
Based on the same inventive concept, according to the above-mentioned method for determining a lost host provided in the embodiments of the present application, the embodiments of the present application further provide a computer program product containing instructions, which, when run on a computer, causes the computer to perform any one of the steps shown in fig. 1 to 4 and fig. 6 to 7.
The communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc.
The machine-readable storage medium may include a RAM (Random Access Memory) and an NVM (Non-Volatile Memory), such as at least one disk memory. Additionally, the machine-readable storage medium may be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the embodiments of the lost host determination apparatus, the electronic device, the machine-readable storage medium, and the computer program product, etc., since they are substantially similar to the embodiments of the lost host determination method, the description is relatively simple, and relevant points can be found in the partial description of the embodiments of the lost host determination method.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (16)

1. A method for determining a lost host, the method comprising:
acquiring network behavior characteristic data of each host to be detected in the target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of a preset group division index;
determining an outlier detection result of each host to be detected by using a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates that: whether the network behavior characteristic data of the host to be detected is an outlier in each network behavior characteristic data;
acquiring an event detection result of each host to be detected; the event detection result is used for indicating whether the host to be detected has a security event within the preset time length;
and determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
2. The method of claim 1, wherein the network behavior feature data comprises at least one of: uplink traffic, downlink traffic, number of sessions, number of accessed Internet Protocol (IP) addresses, and number of accessed areas.
3. The method according to claim 1 or 2, wherein the step of determining the outlier detection result of each host to be detected by using a predetermined outlier detection algorithm comprises:
determining a first outlier in the network behavior characteristic data of each host to be detected by using an isolated forest algorithm;
determining a second outlier in the network behavior characteristic data of each host to be detected by using a boxplot algorithm;
determining a target host in the target group as an outlier, wherein the target host is a host to be detected corresponding to each network behavior characteristic data in the intersection of the first outlier and the second outlier;
and determining each host to be detected in the target group except the target host as a non-outlier host.
4. The method according to claim 3, wherein the step of determining the first outlier in the network behavior feature data of each host to be detected by using an isolated forest algorithm comprises:
constructing a preset number of isolated trees according to the network behavior characteristic data of each host to be detected to obtain an isolated forest model;
calculating the anomaly score of the network behavior characteristic data of each host to be detected by using the isolated forest model;
and determining the network behavior characteristic data with the anomaly score larger than a first anomaly threshold value as a first outlier.
5. The method according to claim 3, wherein the step of determining the second outlier in the network behavior feature data of each host to be detected by using a boxplot algorithm comprises:
for each behavior category in the network behavior characteristic data, comparing the network behavior characteristic data of each host to be detected in the behavior category with a second anomaly threshold corresponding to the behavior category; the second anomaly threshold corresponding to each behavior category is determined according to the quartiles of the network behavior characteristic data, in that behavior category, of the hosts to be detected in the target group;
and if the network behavior characteristic data of the host to be detected in any behavior category is larger than the second anomaly threshold corresponding to the behavior category, determining the network behavior characteristic data of the host to be detected as a second outlier.
6. The method according to claim 1, wherein the step of determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected comprises:
and for each host to be detected in the target group, if the outlier detection result indicates that the network behavior characteristic data of the host to be detected is an outlier and the event detection result indicates that the host to be detected has a security event within the preset time, determining that the host to be detected is a lost host.
7. The method according to claim 1, wherein before determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected, further comprising:
acquiring outlier detection results and event detection results of each host to be detected in the target group within a plurality of continuous preset time lengths;
the step of determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected includes:
for each host to be detected in the target group, within each preset time length, if an outlier detection result corresponding to the preset time length indicates that network behavior characteristic data of the host to be detected is an outlier, and an event detection result of the preset time length indicates that a security event occurs in the host to be detected, determining the host to be detected as a candidate host within the preset time length;
and if the number of times that the host to be detected is determined as the candidate host within the plurality of preset time lengths is greater than a preset number threshold, determining that the host to be detected is a lost host.
8. A lost host determination apparatus, the apparatus comprising:
the first acquisition module is used for acquiring network behavior characteristic data of each host to be detected in the target group within a preset time length; the target group comprises all to-be-detected hosts which are determined according to a preset group division index; each host to be detected has the same index value of a preset group division index;
the first determining module is used for determining an outlier detection result of each host to be detected by utilizing a preset outlier detection algorithm; the outlier detection result of each host to be detected indicates that: whether the network behavior characteristic data of the host to be detected is an outlier in each network behavior characteristic data;
the second acquisition module is used for acquiring the event detection result of each host to be detected; the event detection result is used for indicating whether the host to be detected has a security event within the preset time length;
and the second determining module is used for determining the lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected.
9. The apparatus of claim 8, wherein the network behavior feature data comprises at least one of: uplink traffic, downlink traffic, number of sessions, number of accessed Internet Protocol (IP) addresses, and number of accessed areas.
10. The apparatus according to claim 8 or 9, wherein the first determining module is specifically configured to determine a first outlier in the network behavior feature data of each host to be detected by using an isolated forest algorithm;
determining a second outlier in the network behavior characteristic data of each host to be detected by using a boxplot algorithm;
determining a target host in the target group as an outlier, wherein the target host is a host to be detected corresponding to each network behavior characteristic data in the intersection of the first outlier and the second outlier;
and determining each host to be detected in the target group except the target host as a non-outlier host.
11. The apparatus according to claim 10, wherein the first determining module is specifically configured to construct a preset number of isolated trees according to the network behavior feature data of each host to be detected, so as to obtain an isolated forest model; calculate the anomaly score of the network behavior feature data of each host to be detected by using the isolated forest model; and determine the network behavior feature data whose anomaly score is larger than a first anomaly threshold as a first outlier.
12. The apparatus according to claim 10, wherein the first determining module is specifically configured to, for each behavior category in the network behavior feature data, compare the network behavior feature data of each host to be detected in the behavior category with a second anomaly threshold corresponding to the behavior category; the second anomaly threshold corresponding to each behavior category is determined according to the quartiles of the network behavior feature data, in that behavior category, of the hosts to be detected in the target group; and if the network behavior feature data of the host to be detected in any behavior category is larger than the second anomaly threshold corresponding to the behavior category, determine the network behavior feature data of the host to be detected as a second outlier.
13. The apparatus according to claim 8, wherein the second determining module is specifically configured to determine, for each host to be detected in the target group, that the host to be detected is a lost host if the outlier detection result indicates that the network behavior feature data of the host to be detected is an outlier and the event detection result indicates that the host to be detected has a security event within the preset time duration.
14. The apparatus of claim 8, further comprising:
a third obtaining module, configured to obtain outlier detection results and event detection results of each host to be detected in the target group within a plurality of consecutive preset durations before determining a lost host in the target group according to the outlier detection result and the event detection result corresponding to each host to be detected;
the second determining module is specifically configured to determine, for each host to be detected in the target group, within each preset time period, if an outlier detection result corresponding to the preset time period indicates that network behavior feature data of the host to be detected is an outlier, and an event detection result of the preset time period indicates that a security event occurs in the host to be detected, the host to be detected is a candidate host within the preset time period;
and if the number of times that the host to be detected is determined as the candidate host within the plurality of preset durations is greater than a preset number threshold, determining that the host to be detected is a lost host.
15. An electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to carry out the method steps of any one of claims 1 to 7.
16. A machine-readable storage medium having stored thereon machine-executable instructions executable by a processor, the processor being caused by the machine-executable instructions to carry out the method steps of any one of claims 1 to 7.
CN202111245887.7A 2021-10-26 2021-10-26 Method and device for determining defect host, electronic equipment and storage medium Pending CN113987476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111245887.7A CN113987476A (en) 2021-10-26 2021-10-26 Method and device for determining defect host, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111245887.7A CN113987476A (en) 2021-10-26 2021-10-26 Method and device for determining defect host, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113987476A true CN113987476A (en) 2022-01-28

Family

ID=79741393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111245887.7A Pending CN113987476A (en) 2021-10-26 2021-10-26 Method and device for determining defect host, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113987476A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114500122A (en) * 2022-04-18 2022-05-13 国家计算机网络与信息安全管理中心江苏分中心 Specific network behavior analysis method and system based on multi-source data fusion
CN114500122B (en) * 2022-04-18 2022-07-01 国家计算机网络与信息安全管理中心江苏分中心 Specific network behavior analysis method and system based on multi-source data fusion
CN115118464A (en) * 2022-06-10 2022-09-27 深信服科技股份有限公司 Method and device for detecting defect host, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination