CN109413071B - Abnormal flow detection method and device - Google Patents

Publication number
CN109413071B
Authority
CN
China
Prior art keywords
flow
historical
characteristic
future
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811291061.2A
Other languages
Chinese (zh)
Other versions
CN109413071A (en)
Inventor
顾成杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Security Technologies Co Ltd
Original Assignee
New H3C Security Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Security Technologies Co Ltd
Priority to CN201811291061.2A
Publication of CN109413071A
Application granted
Publication of CN109413071B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • H04L63/1416Event detection, e.g. attack signature detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the application provides an abnormal flow detection method and device, wherein the method comprises the following steps: extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, and sequencing the historical characteristic variables in the flow characteristic vectors according to time; predicting future values corresponding to historical flow data under each characteristic dimension at future time by adopting a regression algorithm according to the flow characteristic vectors under each characteristic dimension; acquiring the weight of each feature dimension; determining a future flow value corresponding to the historical flow data at a future moment according to the weight of each characteristic dimension and the future value under each characteristic dimension; and acquiring actual flow data at a future moment, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data. By applying the technical scheme provided by the embodiment of the application, the flexibility of abnormal flow detection is improved, and the misjudgment rate of the abnormal flow detection is reduced.

Description

Abnormal flow detection method and device
Technical Field
The present application relates to the field of network security technologies, and in particular, to a method and an apparatus for detecting abnormal traffic.
Background
With the rapid development and wide application of computer and internet technologies, the system security of a computer network is more and more threatened by computer viruses, hacker attacks and other aspects, and network abnormality is often caused. In order to improve the security of the network, how to detect abnormal traffic in the network becomes a key of research.
At present, when abnormal traffic is detected, network management personnel set a traffic threshold value according to experience. And if the actual flow is larger than the flow threshold value, determining that the flow is abnormal. And if the actual flow is not larger than the flow threshold, determining that the flow is normal. The set flow threshold value is a fixed value, the time law and the service characteristics are not considered, the flexibility is low, and the misjudgment rate is high.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and an apparatus for detecting abnormal traffic, so as to improve the flexibility of abnormal traffic detection and reduce its misjudgment rate. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present application provides an abnormal traffic detection method, where the method includes:
obtaining historical flow data;
extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, wherein the historical characteristic variables in the flow characteristic vectors are sorted according to time;
predicting future values corresponding to the historical flow data in each feature dimension at the future time by adopting a regression algorithm according to the flow feature vectors in each feature dimension;
acquiring the weight of each feature dimension;
determining a future flow value corresponding to the historical flow data at the future moment according to the weight of each characteristic dimension and the future value under each characteristic dimension;
and acquiring actual flow data at the future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
In a second aspect, an embodiment of the present application provides an abnormal traffic detection apparatus, including:
the first acquisition unit is used for acquiring historical flow data;
the extraction unit is used for extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, and the historical characteristic variables in the flow characteristic vectors are sorted according to time;
the prediction unit is used for predicting future values corresponding to the historical flow data under each characteristic dimension at the future moment by adopting a regression algorithm according to the flow characteristic vectors under each characteristic dimension;
the second acquisition unit is used for acquiring the weight of each feature dimension;
the first determining unit is used for determining a future flow value corresponding to the historical flow data at the future moment according to the weight of each characteristic dimension and a future numerical value under each characteristic dimension;
and the second determining unit is used for acquiring actual flow data at the future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
In a third aspect, embodiments of the present application provide a network device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, where the machine-executable instructions cause the processor to implement any of the method steps provided in the first aspect.
In a fourth aspect, embodiments of the present application provide a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement any of the method steps provided in the first aspect.
In the abnormal flow detection method and device provided by the embodiment of the application, the future flow value is determined according to the flow characteristic vector under each characteristic dimension, and the characteristics in the flow characteristic vector are sorted according to time. Because the factors such as the time law of the traffic data, the service characteristics and the like are considered when determining the future traffic value, the determined future traffic value can change along with the change of time and service instead of being fixed and unchangeable. Further, when it is determined whether or not the actual flow data is abnormal based on the future flow value, it is possible to improve the flexibility of abnormal flow detection and reduce the erroneous determination rate of abnormal flow detection. Of course, it is not necessary for any product or method of the present application to achieve all of the above-described advantages at the same time.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first schematic flow chart of an abnormal traffic detection method according to an embodiment of the present application;
Fig. 2 is a second schematic flow chart of an abnormal traffic detection method according to an embodiment of the present application;
Fig. 3 is a third schematic flow chart of an abnormal traffic detection method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an abnormal traffic detection apparatus according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a network device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, when abnormal traffic is detected, network management personnel set a traffic threshold according to experience. If the actual traffic is larger than the traffic threshold, the traffic is determined to be abnormal. If the actual traffic is not larger than the traffic threshold, the traffic is determined to be normal. The traffic threshold is a fixed value. To improve network security, network administrators may set the traffic threshold low, which produces a large amount of useless false alarm information. To avoid that false alarm information, they may instead set the traffic threshold high, in which case abnormal traffic caused by malicious attacks may go undetected. In existing abnormal traffic detection, therefore, the traffic threshold is set by a network administrator as needed, without considering the time law or the service characteristics, so flexibility is low and the misjudgment rate is high.
In order to improve the flexibility of abnormal flow detection and reduce the misjudgment rate of abnormal flow detection, the embodiment of the application provides an abnormal flow detection method. The method can be applied to any network device. Network devices include, but are not limited to, routers, switches, and the like.
In the method, network equipment acquires historical flow data; extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, and sequencing the historical characteristic variables in the flow characteristic vectors according to time; predicting future values corresponding to historical flow data under each characteristic dimension at future time by adopting a regression algorithm according to the flow characteristic vectors under each characteristic dimension; acquiring the weight of each feature dimension; determining a future flow value corresponding to the historical flow data at a future moment according to the weight of each characteristic dimension and the future value under each characteristic dimension; and acquiring actual flow data at a future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
Therefore, in the technical scheme provided by the embodiment of the application, the future flow value is determined according to the flow characteristic vector under each characteristic dimension, and the characteristics in the flow characteristic vector are sorted according to time. Because the factors such as the time law of the traffic data, the service characteristics and the like are considered when determining the future traffic value, the determined future traffic value can change along with the change of time and service instead of being fixed and unchangeable. Further, when it is determined whether or not the actual flow data is abnormal based on the future flow value, it is possible to improve the flexibility of abnormal flow detection and reduce the erroneous determination rate of abnormal flow detection.
The present application will be described in detail below with reference to specific examples. For ease of understanding, the following description takes a network device as the execution subject.
Referring to fig. 1, fig. 1 is a first flowchart illustrating an abnormal traffic detection method according to an embodiment of the present application. The method comprises the following steps.
Step 101, obtaining historical flow data.
In the embodiment of the application, the network device may count the traffic at each moment through the traffic counting tool. Specifically, the network device obtains historical traffic data within a preset time period before the current time. In one example, the network device may obtain historical traffic data within a preset time period before the current time based on traffic mirroring, NetStream, and the like.
In an optional embodiment, the network device obtains a large amount of historical traffic data and groups it according to the message information of the traffic data to obtain multiple groups of historical traffic data. Abnormal traffic detection is then performed on each group of historical traffic data separately. In one example, the message information may be triplet (3-tuple) information, quintuple (5-tuple) information, or septuple (7-tuple) information.
For example, the network device obtains historical traffic data 1-10. The message information of the historical traffic data 1-3 is the message information 1, the message information of the historical traffic data 4-7 is the message information 2, and the message information of the historical traffic data 8-10 is the message information 3. The network equipment can divide the historical flow data 1-3 into a group of historical flow data, divide the historical flow data 4-7 into a group of historical flow data, and divide the historical flow data 8-10 into a group of historical flow data to obtain 3 groups of historical flow data.
In the embodiment of the present application, the traffic data with the same message information has a relatively high correlation, and the traffic data with different message information has a relatively low correlation. In implementation, abnormal flow detection can be respectively carried out on multiple groups of historical flow data with different message information, and accuracy of abnormal flow prediction is improved.
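The grouping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout, the field names, and the choice of a 5-tuple as the message information are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical record layout: each traffic record is a dict carrying a
# 5-tuple plus a byte count. The field names are assumptions for illustration.
def five_tuple(record):
    """Return the message information (here, a 5-tuple) of a traffic record."""
    return (record["src_ip"], record["dst_ip"],
            record["src_port"], record["dst_port"], record["protocol"])

def group_by_message_info(records, key=five_tuple):
    """Group traffic records that share the same message information."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return dict(groups)

records = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "src_port": 5000,
     "dst_port": 80, "protocol": "TCP", "bytes": 120},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "src_port": 5000,
     "dst_port": 80, "protocol": "TCP", "bytes": 300},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.9", "src_port": 6000,
     "dst_port": 53, "protocol": "UDP", "bytes": 64},
]
groups = group_by_message_info(records)
```

Each resulting group would then go through steps 102 to 106 independently.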
Step 102, extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension. The historical characteristic variables in the flow characteristic vector are sorted according to time.
In the embodiment of the present application, the feature dimensions include, but are not limited to: a total traffic value, an uplink traffic value, a downlink traffic value, a number of bytes, a number of messages, a source IP (Internet Protocol) address value, a destination IP address value, a number of connections, a number of concurrent connections, a number of newly established connections, a number of disconnected connections, a TCP (Transmission Control Protocol) proportion, a UDP (User Datagram Protocol) proportion, a number of ICMP (Internet Control Message Protocol) messages, a number of HTTP (Hypertext Transfer Protocol) messages, a number of DNS (Domain Name System) messages, and a number of FTP (File Transfer Protocol) messages. The characteristic value of the historical flow data at a given historical moment under a feature dimension is the historical characteristic variable of the historical flow data at that moment under that feature dimension.
After the network equipment acquires the historical traffic data, extracting historical characteristic variables of the historical traffic data at each historical moment under a plurality of characteristic dimensions. For each feature dimension, the network equipment combines the historical feature variables under the feature dimension according to the sequence of the historical time corresponding to the historical feature variables to obtain the flow feature vector under each feature dimension.
For example, in feature dimension 1, the historical feature variables of the historical traffic data extracted by the network device include: {10:00, 5}, {10:01, 9}, {10:02, 10}, {10:03, 4}, {10:04, 7}. From the historical feature variables extracted at each historical moment under feature dimension 1, the network device obtains the flow feature vector under feature dimension 1, which can be {5, 9, 10, 4, 7}.
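As a small illustration of this step, the sketch below orders (time, value) samples for one feature dimension and keeps only the values. The function name and the use of zero-padded "HH:MM" strings as timestamps are assumptions for the example.

```python
def build_feature_vector(samples):
    """Sort (time, value) samples by time and return only the values.

    Times are assumed to be zero-padded "HH:MM" strings within one day,
    so plain string comparison gives chronological order.
    """
    return [value for _, value in sorted(samples, key=lambda s: s[0])]

# Samples for feature dimension 1, deliberately given out of order.
samples = [("10:03", 4), ("10:00", 5), ("10:02", 10), ("10:04", 7), ("10:01", 9)]
vector = build_feature_vector(samples)
print(vector)  # [5, 9, 10, 4, 7]
```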
Step 103, predicting future values corresponding to the historical flow data in each feature dimension at the future time by adopting a regression algorithm according to the flow feature vectors in each feature dimension.
For each feature dimension, the network device trains a regression model by using the flow feature vector under the feature dimension through regression algorithms such as GBDT (Gradient Boosting Decision Tree) and random forest, and predicts a future value corresponding to historical flow data under the feature dimension at a future moment based on the regression model obtained by training.
For example, the current time is 11:00. The network device uses a regression algorithm and the extracted flow feature vectors under each feature dimension to predict the future values corresponding to the historical flow data under each feature dimension at 11:01.
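The per-dimension prediction flow can be sketched as below. The text names GBDT and random forest as regression algorithms; to stay dependency-free, this sketch swaps in an ordinary least-squares linear trend over the time index, which is only a stand-in to show how one model per feature dimension yields one future value per dimension. The function names and the second feature vector are assumptions.

```python
def fit_linear_trend(vector):
    """Fit value = slope * index + intercept by ordinary least squares."""
    n = len(vector)
    if n < 2:
        raise ValueError("need at least two historical values")
    mean_x = (n - 1) / 2
    mean_y = sum(vector) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(vector))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_next(vector):
    """Predict the value at the next time step (index len(vector))."""
    slope, intercept = fit_linear_trend(vector)
    return slope * len(vector) + intercept

# One time-ordered feature vector per feature dimension (illustrative data).
feature_vectors = {
    "dimension_1": [5, 9, 10, 4, 7],
    "dimension_2": [1, 2, 3, 4, 5],
}
future_values = {name: predict_next(v) for name, v in feature_vectors.items()}
```

A tree-based regressor would replace `predict_next` while leaving the surrounding loop unchanged.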
Step 104, acquiring the weight of each feature dimension.
In the embodiment of the present application, the weight of each feature dimension may be set by a network administrator according to experience, or may be determined by using a regression algorithm according to a flow feature vector under each feature dimension.
In one embodiment, the network device determines the weight of each feature dimension using a linear regression algorithm.
For example, the network device determines the weight of each feature dimension using the following formula (1).
$$Q_1 = \sum_{i=1}^{n} k_i v_i \qquad (1)$$
where $Q_1$ is the flow value of the historical flow data at a historical moment, $n$ is the number of feature dimensions, $k_i$ is the weight of the $i$-th feature dimension, and $v_i$ is the value of the historical flow data at that historical moment under the $i$-th feature dimension.
Substituting n pieces of historical flow data into formula (1) yields n equations, from which the weight of each feature dimension can be determined.
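One way to read this step: assuming formula (1) is the weighted sum Q_1 = k_1 v_1 + ... + k_n v_n, as the symbol definitions suggest, n historical samples give n linear equations in the n unknown weights. The sketch below solves that system by Gauss-Jordan elimination. The function name and sample data are hypothetical, and with more than n samples a least-squares fit would be the natural alternative.

```python
def solve_weights(feature_rows, traffic_values):
    """Solve sum_i k_i * v_i = Q for the weights k_i.

    feature_rows[j][i] is v_i at historical moment j; traffic_values[j]
    is the traffic value Q_1 at that moment. Exactly n samples are assumed.
    """
    n = len(traffic_values)
    if len(feature_rows) != n or any(len(row) != n for row in feature_rows):
        raise ValueError("need n samples of n feature values each")
    # Augmented matrix [V | Q].
    a = [[float(v) for v in row] + [float(q)]
         for row, q in zip(feature_rows, traffic_values)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("samples are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, n + 1):
                    a[r][c] -= factor * a[col][c]
    return [a[i][n] / a[i][i] for i in range(n)]

# Illustrative data generated from weights 0.2, 0.3, 0.5.
feature_rows = [[10, 5, 14], [8, 6, 12], [9, 4, 15]]
traffic_values = [10.5, 9.4, 10.5]
weights = solve_weights(feature_rows, traffic_values)
```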
Step 105, determining a future flow value corresponding to the historical flow data at the future time according to the weight of each characteristic dimension and the future value under each characteristic dimension.
And the network equipment performs weighting processing on the future numerical values under the characteristic dimensions according to the weight of each characteristic dimension to obtain the future flow value corresponding to the historical flow data at the future time.
In an alternative embodiment, the network device may determine the future flow value $Q_0$ corresponding to the historical flow data at the future time using the following formula (2).
$$Q_0 = \sum_{i=1}^{n} k_i w_i \qquad (2)$$
where $n$ is the number of feature dimensions, $k_i$ is the weight of the $i$-th feature dimension, and $w_i$ is the future value in the $i$-th feature dimension.
For example, 5 feature dimensions are currently set, the weight of feature dimension 1 is 0.1, the weight of feature dimension 2 is 0.2, the weight of feature dimension 3 is 0.3, the weight of feature dimension 4 is 0.2, and the weight of feature dimension 5 is 0.2. The network equipment determines that the future value under the characteristic dimension 1 is 10, the future value under the characteristic dimension 2 is 5, the future value under the characteristic dimension 3 is 14, the future value under the characteristic dimension 4 is 8, and the future value under the characteristic dimension 5 is 12, so that the future flow value corresponding to the historical flow data at the future moment can be determined to be Q0=0.1*10+0.2*5+0.3*14+0.2*8+0.2*12=10.2。
In addition, in an alternative embodiment, the network device may determine the future flow value $Q_0$ corresponding to the historical flow data at the future time using the following formula (3).
$$Q_0 = (1 + T) \sum_{i=1}^{n} k_i w_i \qquad (3)$$
where $n$ is the number of feature dimensions, $k_i$ is the weight of the $i$-th feature dimension, $w_i$ is the future value in the $i$-th feature dimension, and $T$ is a preset fault-tolerance threshold with $0 \le T \le 1$. The preset fault-tolerance threshold may be a system default value or may be set by a network administrator according to experience.
For example, the preset fault tolerance threshold T is 0.5, 5 feature dimensions are currently set, the weight of the feature dimension 1 is 0.1, the weight of the feature dimension 2 is 0.2, the weight of the feature dimension 3 is 0.3, the weight of the feature dimension 4 is 0.2, and the weight of the feature dimension 5 is 0.2. The network device determines that the future value in the characteristic dimension 1 is 10, the future value in the characteristic dimension 2 is 5, the future value in the characteristic dimension 3 is 14, the future value in the characteristic dimension 4 is 8, and the future value in the characteristic dimension 5 is 12, and then determines that the future flow value corresponding to the historical flow data at the future time is:
Q0=(1+0.5)*(0.1*10+0.2*5+0.3*14+0.2*8+0.2*12)=15.3。
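The two worked examples above can be reproduced with a short sketch. It assumes the future flow value is the weighted sum of the per-dimension future values, optionally scaled by (1 + T), which is what the arithmetic in the examples implies; the function and parameter names are hypothetical.

```python
def future_traffic_value(weights, future_values, tolerance=0.0):
    """Return (1 + T) * sum(k_i * w_i); T = 0 gives the plain weighted sum."""
    if len(weights) != len(future_values):
        raise ValueError("one weight is required per feature dimension")
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("the fault-tolerance threshold T must be in [0, 1]")
    weighted_sum = sum(k * w for k, w in zip(weights, future_values))
    return (1.0 + tolerance) * weighted_sum

weights = [0.1, 0.2, 0.3, 0.2, 0.2]   # weights of feature dimensions 1-5
future_values = [10, 5, 14, 8, 12]    # predicted future values per dimension

q0_plain = future_traffic_value(weights, future_values)
q0_tolerant = future_traffic_value(weights, future_values, tolerance=0.5)
print(round(q0_plain, 6), round(q0_tolerant, 6))  # 10.2 15.3
```

A larger T raises the predicted value and therefore makes the later comparison with the actual value more lenient.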
Step 106, acquiring actual flow data at a future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
In the embodiment of the application, the actual flow data corresponding to the historical flow data is the actual flow data which is the same as the message information of the historical flow data. After acquiring actual flow data corresponding to the historical flow data at a future moment, the network equipment compares the determined future flow value with the actual flow value of the actual flow data, and determines whether the actual flow data is abnormal according to a comparison result.
For example, the current time is 11:00. Once the network device obtains the actual flow data corresponding to the historical flow data at 11:01, it determines whether the actual flow data is abnormal according to the future flow value predicted for 11:01 and the actual flow value of the actual flow data at 11:01.
In an optional embodiment, if the determined future traffic value is smaller than the actual traffic value of the actual traffic data, the network device determines that the actual traffic data is abnormal. And if the determined future flow value is not less than the actual flow value of the actual flow data, the network equipment determines that the actual flow data is normal.
In another alternative embodiment, a difference threshold is preset in the network device. After obtaining the actual flow data corresponding to the historical flow data at the future time, the network device subtracts the determined future flow value from the actual flow value of the actual flow data to obtain a flow difference. The network device compares the flow difference with the preset difference threshold. If the flow difference is greater than the preset difference threshold, the network device may determine that the actual flow data is abnormal. If the flow difference is not greater than the preset difference threshold, the network device may determine that the actual flow data is normal.
For example, let the preset difference threshold be $\theta$, the determined future flow value be $H_n$, and the actual flow value of the actual flow data be $H_a$, with $\Delta h = H_a - H_n$. If $\Delta h > \theta$, the network device may determine that the actual flow data is abnormal. Otherwise, the network device may determine that the actual flow data is normal.
Using a preset difference threshold to determine whether the actual flow data is abnormal allows the actual flow value to fluctuate within a certain range above the future flow value. When the actual flow value stays within this range, the change can be regarded as temporary and the actual flow data as normal. This makes abnormal flow detection more robust and more adaptable.
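Both decision rules can be expressed with one comparison, as in the sketch below. The function and parameter names are hypothetical; a threshold of 0 reproduces the first rule (abnormal when the future value is smaller than the actual value), and a positive threshold gives the tolerant variant.

```python
def is_abnormal(actual_value, future_value, difference_threshold=0.0):
    """Flag traffic as abnormal when actual - future exceeds the threshold."""
    if difference_threshold < 0:
        raise ValueError("the difference threshold must be non-negative")
    return (actual_value - future_value) > difference_threshold

# Predicted future value 10.2, as in the weighted-sum example.
print(is_abnormal(12.0, 10.2))                            # True
print(is_abnormal(12.0, 10.2, difference_threshold=2.0))  # False
print(is_abnormal(13.0, 10.2, difference_threshold=2.0))  # True
```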
In the abnormal flow detection method provided by the embodiment of the application, the future flow value is determined according to the flow characteristic vector under each characteristic dimension, and the characteristics in the flow characteristic vector are sorted according to time. Because the factors such as the time law of the traffic data, the service characteristics and the like are considered when determining the future traffic value, the determined future traffic value can change along with the change of time and service instead of being fixed and unchangeable. Further, when it is determined whether or not the actual flow data is abnormal based on the future flow value, it is possible to improve the flexibility of abnormal flow detection and reduce the erroneous determination rate of abnormal flow detection.
In order to improve the efficiency of detecting abnormal traffic, in an alternative embodiment, reference is made to a second flowchart of the abnormal traffic detecting method shown in fig. 2. The method may include the following steps.
Step 201, obtaining historical flow data.
Step 201 is the same as step 101.
Step 202, selecting an optimal feature dimension subset from the plurality of feature dimensions by using a preset feature dimension selection method.
In the embodiment of the present application, the feature dimension selection method includes, but is not limited to, a genetic algorithm, a rough set, a tabu search, and the like. The network equipment selects an optimal feature dimension subset from a plurality of feature dimensions by adopting a preset feature dimension selection method, removes redundant feature dimensions and weakly related feature dimensions from the plurality of feature dimensions, and obtains the optimal feature dimension subset.
On the basis of complete historical flow data, the network equipment removes redundant characteristic dimensions and maintains the flow detection precision. In addition, the characteristic dimension participating in the subsequent steps 203-208 is reduced, and the detection efficiency of the abnormal flow is improved.
Step 203, extracting historical characteristic variables of the historical flow data at each historical moment under each characteristic dimension included in the characteristic dimension subset.
Step 204, for the extracted historical characteristic variables under each characteristic dimension, combining them in the order of their corresponding historical moments to obtain the flow characteristic vector under that characteristic dimension. The historical characteristic variables in each flow characteristic vector are therefore ordered by time.
Step 205, predicting future values corresponding to the historical flow data in each feature dimension at the future time by adopting a regression algorithm according to the flow feature vectors in each feature dimension.
Step 206, acquiring the weight of each feature dimension.
Step 207, determining a future flow value corresponding to the historical flow data at the future time according to the weight of each characteristic dimension and the future value under each characteristic dimension.
Step 208, acquiring actual flow data at the future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal according to the future flow value and the actual flow value of the actual flow data.
Steps 205-208 are the same as steps 103-106.
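As a non-limiting illustration of steps 203 to 205, the following sketch builds a time-ordered flow feature vector for each feature dimension and extrapolates one step ahead. The embodiment only specifies "a regression algorithm"; ordinary least-squares linear regression over equally spaced historical moments is an assumption of this example, as are the function names and the sample data layout.

```python
def build_feature_vectors(samples, dimensions):
    """Steps 203-204: one time-ordered vector per feature dimension.

    `samples` is a list of (timestamp, {dimension: value}) pairs in any order
    (hypothetical layout for this sketch).
    """
    ordered = sorted(samples, key=lambda sample: sample[0])
    return {d: [values[d] for _, values in ordered] for d in dimensions}


def predict_next(vector):
    """Step 205 (assumed form): fit y = a + b*t on t = 0..m-1, return y at t = m."""
    m = len(vector)
    if m < 2:
        return float(vector[-1])  # not enough points to fit a slope
    t_mean = (m - 1) / 2
    y_mean = sum(vector) / m
    sxx = sum((t - t_mean) ** 2 for t in range(m))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(vector))
    slope = sxy / sxx
    return y_mean + slope * (m - t_mean)


# Illustrative samples, deliberately out of time order.
samples = [
    (3, {"bytes": 130, "new_connections": 7}),
    (1, {"bytes": 110, "new_connections": 5}),
    (2, {"bytes": 120, "new_connections": 6}),
]
vectors = build_feature_vectors(samples, ["bytes", "new_connections"])
future_values = {d: predict_next(v) for d, v in vectors.items()}
# vectors["bytes"] == [110, 120, 130]; future_values["bytes"] == 140.0
```

A linear fit is used here only because it is short and easy to verify by hand. Any regression model that consumes the time-ordered vector could take its place.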
In order to improve the efficiency of detecting abnormal traffic, an alternative embodiment is provided. Refer to fig. 3, which is a third schematic flow chart of the abnormal traffic detection method. The method may include the following steps.
Step 301, obtaining historical flow data.
Step 302, extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, wherein the historical characteristic variables in the flow characteristic vectors are sorted according to time.
Step 303, predicting future values corresponding to the historical flow data in each feature dimension at the future time by adopting a regression algorithm according to the flow feature vectors in each feature dimension.
Step 304, acquiring the weight of each feature dimension.
Step 305, determining a future flow value corresponding to the historical flow data at the future time according to the weight of each characteristic dimension and the future value under each characteristic dimension.
Step 306, obtaining actual flow data at a future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal according to the future flow value and the actual flow value of the actual flow data.
Steps 301-306 are the same as steps 101-106.
Step 307, acquiring actual characteristic variables of the actual flow data under each characteristic dimension in the case that the actual flow data is determined to be abnormal.
The actual characteristic variable of the actual flow data under a characteristic dimension is the characteristic variable of the actual flow data in that characteristic dimension.
Step 308, for each feature dimension, determine whether the actual feature variable in the feature dimension is greater than the future value in the feature dimension. If so, go to step 309. If not, go to step 310.
The future values for each feature dimension are predicted in step 303.
Step 309, determine that the actual flow data is abnormal in the characteristic dimension.
Step 310, determining that the actual flow data is normal in the characteristic dimension.
For example, the characteristic dimensions include the number of bytes, and the predicted future value of the number of bytes is 100. If the actual characteristic variable of the actual flow data under the number of bytes is 90, then 90<100, and the actual flow data is determined to be normal in the characteristic dimension of the number of bytes. If the actual characteristic variable of the actual flow data under the number of bytes is 150, then 150>100, and the actual flow data is determined to be abnormal in the characteristic dimension of the number of bytes.
It should be appreciated that step 308 may also determine, for each feature dimension, whether the difference between the actual feature variable in that feature dimension and the future value in that feature dimension is greater than a preset threshold. If so, go to step 309. If not, go to step 310. The preset threshold may be set empirically or according to actual needs, and is not particularly limited herein.
According to this technical scheme, the different characteristic dimensions are analyzed separately to determine whether the actual flow data is abnormal in each of them, so that network management personnel can accurately locate the cause of the abnormality and resolve it in time.
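As a non-limiting illustration of steps 308 to 310, the following sketch compares the actual characteristic variable with the predicted future value in each feature dimension. The function name and the dictionary layout are assumptions of this example. A threshold of 0 reproduces the plain "greater than" comparison of step 308, and a positive threshold reproduces the variant that compares the difference with a preset threshold.

```python
def abnormal_dimensions(actual, predicted, threshold=0.0):
    """Return the feature dimensions in which the actual value exceeds the
    predicted future value by more than `threshold` (steps 308-309).
    Dimensions not returned are treated as normal (step 310)."""
    return [d for d in predicted if actual[d] - predicted[d] > threshold]


predicted = {"bytes": 100}

# Byte-count example from the description: 90 < 100 is normal, 150 > 100 is abnormal.
normal_case = abnormal_dimensions({"bytes": 90}, predicted)      # []
abnormal_case = abnormal_dimensions({"bytes": 150}, predicted)   # ["bytes"]

# Threshold variant: a difference of 50 does not exceed a hypothetical threshold of 60.
tolerated_case = abnormal_dimensions({"bytes": 150}, predicted, threshold=60)  # []
```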
Corresponding to the above abnormal traffic detection method embodiment, the embodiment of the present application further provides an abnormal traffic detection apparatus. Referring to fig. 4, fig. 4 is a schematic structural diagram of an abnormal traffic detection apparatus provided in an embodiment of the present application, where the apparatus includes:
a first obtaining unit 401, configured to obtain historical traffic data;
an extracting unit 402, configured to extract historical feature variables of historical flow data at each historical time under multiple feature dimensions to obtain flow feature vectors under each feature dimension, where the historical feature variables in the flow feature vectors are sorted according to time;
a prediction unit 403, configured to predict future values corresponding to historical flow data in each feature dimension at a future time by using a regression algorithm according to the flow feature vectors in each feature dimension;
a second obtaining unit 404, configured to obtain weights of the feature dimensions;
a first determining unit 405, configured to determine, according to the weight of each feature dimension and a future value under each feature dimension, a future flow value corresponding to historical flow data at a future time;
a second determining unit 406, configured to obtain actual traffic data at the future time, where the actual traffic data has the same message information as the historical traffic data, and determine whether the actual traffic data is abnormal according to the future traffic value and the actual traffic value of the actual traffic data.
Optionally, the extracting unit 402 may be specifically configured to:
selecting an optimal characteristic dimension subset from a plurality of characteristic dimensions by adopting a preset characteristic dimension selection method;
extracting historical characteristic variables of the historical flow data at each historical moment under each characteristic dimension included by the characteristic dimension subset;
and, for the extracted historical characteristic variables under each characteristic dimension, combining the historical characteristic variables under that characteristic dimension in the order of their corresponding historical moments to obtain the flow characteristic vector under each characteristic dimension.
Optionally, the second obtaining unit 404 may specifically be configured to:
determining the weight of each characteristic dimension by adopting a machine learning algorithm according to the flow characteristic vector under each characteristic dimension.
Optionally, the first determining unit 405 may be specifically configured to:
determining a future flow value Q_0 corresponding to the historical flow data at the future time by using the following formula:
[Formula image GDA0002967747610000121, not reproduced in this text]
where n is the number of the plurality of characteristic dimensions, k_i is the weight of the i-th characteristic dimension, w_i is the future value under the i-th characteristic dimension, T is a preset fault-tolerant threshold value, and 0 ≤ T ≤ 1.
Optionally, the second determining unit 406 may be specifically configured to:
subtracting the future flow value from the actual flow value of the actual flow data to obtain a flow difference value;
if the flow difference value is larger than a preset difference value threshold value, determining that the actual flow data is abnormal;
and if the flow difference value is not larger than the preset difference value threshold value, determining that the actual flow data is normal.
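As a non-limiting illustration of the decision performed by the second determining unit 406, the following sketch subtracts the future flow value from the actual flow value and compares the result with a preset difference threshold. The function name and the numeric values are assumptions of this example.

```python
def is_abnormal(actual_flow_value, future_flow_value, difference_threshold):
    """Return True when the actual flow exceeds the predicted future flow
    value by more than the preset difference threshold."""
    flow_difference = actual_flow_value - future_flow_value
    return flow_difference > difference_threshold


# Hypothetical values: predicted future flow value 1000, threshold 200.
spike = is_abnormal(1300, 1000, 200)     # difference 300 > 200  -> True
steady = is_abnormal(1150, 1000, 200)    # difference 150 <= 200 -> False
drop = is_abnormal(400, 1000, 200)       # difference -600       -> False
```

Because the difference is signed, this test is one-sided: actual flow far below the predicted value is never reported as abnormal.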
Optionally, the extracting unit 402 may be further configured to obtain actual feature variables of the actual flow data in each feature dimension under the condition that it is determined that the actual flow data is abnormal;
the second determining unit 406 may be further configured to, for each feature dimension, determine that the actual flow data is abnormal in the feature dimension if the actual feature variable in the feature dimension is greater than the future value in the feature dimension.
In the abnormal flow detection device provided in the embodiment of the present application, the future flow value is determined according to the flow feature vector under each feature dimension, and the features in the flow feature vector are sorted according to time. Because the factors such as the time law of the traffic data, the service characteristics and the like are considered when determining the future traffic value, the determined future traffic value can change along with the change of time and service instead of being fixed and unchangeable. Further, when it is determined whether or not the actual flow data is abnormal based on the future flow value, it is possible to improve the flexibility of abnormal flow detection and reduce the erroneous determination rate of abnormal flow detection.
Corresponding to the foregoing abnormal traffic detection method embodiment, an embodiment of the present application further provides a network device, as shown in fig. 5, including a processor 501 and a machine-readable storage medium 502, where the machine-readable storage medium 502 stores machine-executable instructions that can be executed by the processor 501. The processor 501 is caused by machine executable instructions to implement any of the steps of the above-described abnormal traffic detection method. The abnormal flow detection method comprises the following steps:
obtaining historical flow data;
extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, wherein the historical characteristic variables in the flow characteristic vectors are sorted according to time;
predicting future values corresponding to historical flow data under each characteristic dimension at future time by adopting a regression algorithm according to the flow characteristic vectors under each characteristic dimension;
acquiring the weight of each feature dimension;
determining a future flow value corresponding to the historical flow data at a future moment according to the weight of each characteristic dimension and the future value under each characteristic dimension;
and acquiring actual flow data at a future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
In the technical scheme provided by the embodiment of the application, the future flow value is determined according to the flow characteristic vectors in time series under each characteristic dimension, and the characteristics in the flow characteristic vectors are sorted according to time. The factors such as the time rule of the traffic data, the service characteristics and the like are considered when the future traffic value is determined, so that the determined future traffic value can change along with the change of time and service instead of being fixed, and when whether the actual traffic data is abnormal or not is determined according to the future traffic value, the flexibility of abnormal traffic detection can be improved, and the misjudgment rate of abnormal traffic detection is reduced.
Optionally, as shown in fig. 5, the network device may further include a communication interface 503 and a communication bus 504. The processor 501, the machine-readable storage medium 502, and the communication interface 503 communicate with each other through the communication bus 504, and the communication interface 503 is used for communication between the network device and other devices.
Corresponding to the above abnormal traffic detection method embodiment, an embodiment of the present application further provides a machine-readable storage medium, which stores machine-executable instructions, and when the machine-readable storage medium is called and executed by a processor, the machine-executable instructions cause the processor to implement any step of the above abnormal traffic detection method.
The communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the bus is represented by a single thick line in FIG. 5, but this does not indicate that there is only one bus or one type of bus.
The machine-readable storage medium may include RAM (Random Access Memory), and may also include NVM (Non-Volatile Memory), such as at least one disk memory. Additionally, the machine-readable storage medium may be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the embodiments of the abnormal traffic detection apparatus, the network device, and the machine-readable storage medium, since they are substantially similar to the embodiments of the abnormal traffic detection method, the description is relatively simple, and relevant points can be referred to the partial description of the embodiments of the abnormal traffic detection method.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (14)

1. An abnormal traffic detection method, characterized in that the method comprises:
obtaining historical flow data;
extracting historical characteristic variables of the historical flow data at each historical moment under a plurality of characteristic dimensions to obtain flow characteristic vectors under each characteristic dimension, wherein the historical characteristic variables in the flow characteristic vectors are sorted according to time, and the plurality of characteristic dimensions comprise: a total flow value, an uplink flow value, a downlink flow value, the number of bytes, a source IP address value, a destination IP address value, the number of newly-built connections, the number of disconnected connections, a TCP (Transmission Control Protocol) ratio, a UDP (User Datagram Protocol) ratio, an ICMP (Internet Control Message Protocol) message number, an HTTP message number, a DNS (Domain Name System) message number and an FTP (File Transfer Protocol) message number;
predicting future values corresponding to the historical flow data in each feature dimension at the future time by adopting a regression algorithm according to the flow feature vectors in each feature dimension;
acquiring the weight of each feature dimension;
determining a future flow value corresponding to the historical flow data at the future moment according to the weight of each characteristic dimension and the future value under each characteristic dimension;
and acquiring actual flow data at the future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
2. The method according to claim 1, wherein the step of extracting historical feature variables of the historical flow data at each historical time under a plurality of feature dimensions to obtain a flow feature vector under each feature dimension comprises:
selecting an optimal characteristic dimension subset from a plurality of characteristic dimensions by adopting a preset characteristic dimension selection method;
extracting historical characteristic variables of the historical flow data at each historical moment under each characteristic dimension included by the characteristic dimension subset;
and aiming at the extracted historical characteristic variables under each characteristic dimension, combining the historical characteristic variables under the characteristic dimension according to the sequence of the historical moments corresponding to the historical characteristic variables to obtain the flow characteristic vector under each characteristic dimension.
3. The method of claim 1, wherein the step of obtaining the weight of each feature dimension comprises:
and determining the weight of each characteristic dimension by adopting a machine learning algorithm according to the flow characteristic vector under each characteristic dimension.
4. The method according to claim 1, wherein the step of determining the future flow value corresponding to the historical flow data at the future time according to the weight of each characteristic dimension and the future value of each characteristic dimension comprises:
determining a future flow value Q_0 corresponding to the historical flow data at the future time by using the following formula:
[Formula image 735481DEST_PATH_IMAGE001, not reproduced in this text]
where n is the number of the plurality of feature dimensions, k_i is the weight of the i-th feature dimension, w_i is the future value under the i-th feature dimension, T is a preset fault-tolerant threshold value, and 0 ≤ T ≤ 1.
5. The method of claim 1, wherein the step of determining whether the actual flow data is abnormal based on the future flow value and an actual flow value of the actual flow data comprises:
subtracting the future flow value from the actual flow value of the actual flow data to obtain a flow difference value;
if the flow difference value is larger than a preset difference threshold value, determining that the actual flow is abnormal;
and if the flow difference value is not larger than the preset difference threshold value, determining that the actual flow is normal.
6. The method according to any one of claims 1 to 5, wherein in case it is determined that the actual flow data is abnormal, further comprising:
acquiring actual characteristic variables of the actual flow data under each characteristic dimension;
for each feature dimension, if the actual feature variable under the feature dimension is larger than the future value under the feature dimension, determining that the actual flow data is abnormal under the feature dimension.
7. An abnormal flow detection apparatus, characterized in that the apparatus comprises:
the first acquisition unit is used for acquiring historical flow data;
an extraction unit, configured to extract historical feature variables of the historical flow data at each historical time under multiple feature dimensions, to obtain a flow feature vector under each feature dimension, where the historical feature variables in the flow feature vector are sorted according to time, and where the multiple feature dimensions include: a total flow value, an uplink flow value, a downlink flow value, the number of bytes, a source IP address value, a destination IP address value, the number of newly-built connections, the number of disconnected connections, a TCP (Transmission Control Protocol) ratio, a UDP (User Datagram Protocol) ratio, an ICMP (Internet Control Message Protocol) message number, an HTTP message number, a DNS (Domain Name System) message number and an FTP (File Transfer Protocol) message number;
the prediction unit is used for predicting future values corresponding to the historical flow data under each characteristic dimension at the future moment by adopting a regression algorithm according to the flow characteristic vectors under each characteristic dimension;
the second acquisition unit is used for acquiring the weight of each feature dimension;
the first determining unit is used for determining a future flow value corresponding to the historical flow data at the future moment according to the weight of each characteristic dimension and a future numerical value under each characteristic dimension;
and the second determining unit is used for acquiring actual flow data at the future moment, wherein the actual flow data and the historical flow data have the same message information, and determining whether the actual flow data is abnormal or not according to the future flow value and the actual flow value of the actual flow data.
8. The apparatus according to claim 7, wherein the extraction unit is specifically configured to:
selecting an optimal characteristic dimension subset from a plurality of characteristic dimensions by adopting a preset characteristic dimension selection method;
extracting historical characteristic variables of the historical flow data at each historical moment under each characteristic dimension included by the characteristic dimension subset;
and aiming at the extracted historical characteristic variables under each characteristic dimension, combining the historical characteristic variables under the characteristic dimension according to the sequence of the historical moments corresponding to the historical characteristic variables to obtain the flow characteristic vector under each characteristic dimension.
9. The apparatus according to claim 7, wherein the second obtaining unit is specifically configured to:
and determining the weight of each characteristic dimension by adopting a machine learning algorithm according to the flow characteristic vector under each characteristic dimension.
10. The apparatus according to claim 7, wherein the first determining unit is specifically configured to:
determining a future flow value Q_0 corresponding to the historical flow data at the future time by using the following formula:
[Formula image 902151DEST_PATH_IMAGE003, not reproduced in this text]
where n is the number of the plurality of feature dimensions, k_i is the weight of the i-th feature dimension, w_i is the future value under the i-th feature dimension, T is a preset fault-tolerant threshold value, and 0 ≤ T ≤ 1.
11. The apparatus according to claim 7, wherein the second determining unit is specifically configured to:
subtracting the future flow value from the actual flow value of the actual flow data to obtain a flow difference value;
if the flow difference value is larger than a preset difference threshold value, determining that the actual flow is abnormal;
and if the flow difference value is not larger than the preset difference threshold value, determining that the actual flow is normal.
12. The apparatus according to any one of claims 7 to 11,
the extracting unit is further configured to acquire actual feature variables of the actual flow data in each feature dimension under the condition that it is determined that the actual flow data is abnormal;
the second determining unit is further configured to, for each feature dimension, determine that the actual flow data is abnormal in the feature dimension if the actual feature variable in the feature dimension is greater than a future value in the feature dimension.
13. A network device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to carry out the method steps of any one of claims 1 to 6.
14. A machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to carry out the method steps of any one of claims 1 to 6.
CN201811291061.2A 2018-10-31 2018-10-31 Abnormal flow detection method and device Active CN109413071B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811291061.2A CN109413071B (en) 2018-10-31 2018-10-31 Abnormal flow detection method and device

Publications (2)

Publication Number Publication Date
CN109413071A CN109413071A (en) 2019-03-01
CN109413071B true CN109413071B (en) 2021-08-06

Family

ID=65470812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811291061.2A Active CN109413071B (en) 2018-10-31 2018-10-31 Abnormal flow detection method and device

Country Status (1)

Country Link
CN (1) CN109413071B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109768995B (en) * 2019-03-06 2021-08-13 国网甘肃省电力公司电力科学研究院 Network flow abnormity detection method based on cyclic prediction and learning
CN109802973A (en) 2019-03-15 2019-05-24 北京百度网讯科技有限公司 Method and apparatus for detection flows
CN110784458B (en) * 2019-10-21 2023-04-18 新华三信息安全技术有限公司 Flow abnormity detection method and device and network equipment
CN110941797B (en) * 2019-11-07 2023-04-07 中信银行股份有限公司 Operation index monitoring and trend prediction system based on service index
CN111181923A (en) * 2019-12-10 2020-05-19 中移(杭州)信息技术有限公司 Flow detection method and device, electronic equipment and storage medium
CN112101400A (en) * 2019-12-19 2020-12-18 国网江西省电力有限公司电力科学研究院 Industrial control system abnormality detection method, equipment, server and storage medium
CN111556057B (en) * 2020-04-29 2022-11-04 绿盟科技集团股份有限公司 Flow abnormity detection method and device, electronic equipment and storage medium
CN116723059B (en) * 2023-08-10 2023-10-20 湖南润科通信科技有限公司 Security analysis system for network information

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003058250A (en) * 2001-08-09 2003-02-28 Toshiba Corp Control device
CN101155085A (en) * 2006-09-29 2008-04-02 中兴通讯股份有限公司 Method and device for real-time flux prediction and real-time flux monitoring and early warning
CN102932264A (en) * 2012-11-02 2013-02-13 曙光云计算技术有限公司 Method and device for judging flow overflowing
CN104202329A (en) * 2014-09-12 2014-12-10 北京神州绿盟信息安全科技股份有限公司 DDoS (distributed denial of service) attack detection method and device
CN106411597A (en) * 2016-10-14 2017-02-15 广东工业大学 Network traffic abnormality detection method and system
CN107729952A (en) * 2017-11-29 2018-02-23 新华三信息安全技术有限公司 A kind of traffic flow classification method and device
CN107786542A (en) * 2017-09-26 2018-03-09 杭州安恒信息技术有限公司 Methods of marking and device based on big data intellectual analysis malice IP
CN107832866A (en) * 2017-09-26 2018-03-23 晶赞广告(上海)有限公司 A kind of method for predicting and device, storage medium, terminal
CN108234496A (en) * 2018-01-05 2018-06-29 宝牧科技(天津)有限公司 A kind of method for predicting based on neural network

Also Published As

Publication number Publication date
CN109413071A (en) 2019-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant