CN113973013A - Network flow detection method, system and related components - Google Patents

Network flow detection method, system and related components

Info

Publication number
CN113973013A
Authority
CN
China
Prior art keywords
matrix
delay
time
target
overtime
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111241230.3A
Other languages
Chinese (zh)
Other versions
CN113973013B (en)
Inventor
梁艾青
范渊
刘博�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Original Assignee
DBAPPSecurity Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DBAPPSecurity Co Ltd
Priority to CN202111241230.3A
Publication of CN113973013A
Application granted
Publication of CN113973013B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application discloses a network traffic detection method, a system and related components. The method comprises the following steps: vectorizing the network traffic data by taking time as a unit to obtain a network traffic vector corresponding to each time; taking any one of all the times as a target time, constructing a delay subsequence and a timeout subsequence of the target time, constructing a delay trajectory matrix and a timeout trajectory matrix, and performing RPCA (robust principal component analysis) low-rank recovery to obtain a delay low-rank matrix and a timeout low-rank matrix; selecting a target matrix from the delay low-rank matrix, calculating its feature vectors and obtaining a feature vector hyperplane; selecting the principal component with the highest contribution degree from the timeout low-rank matrix as a target vector; and calculating the distance from the target vector to the feature vector hyperplane to determine the traffic anomaly level corresponding to the target time. Because RPCA is used for low-rank recovery, sparse large noise in the original network traffic data is eliminated, so the whole detection method is highly robust and the determined change points are more accurate.

Description

Network flow detection method, system and related components
Technical Field
The present invention relates to the field of network security, and in particular, to a method, a system, and a related component for detecting network traffic.
Background
Currently, in the field of network security management, daily observation of network traffic is a fundamental and important task. Conventional methods for daily observation of network traffic include Principal Component Analysis (PCA) and time-series Change-Point Detection (CPD) based on Singular Spectrum Transform (SST).
PCA is extremely sensitive to large noise and sharp peaks, so when it is used to analyze network traffic containing sparse large noise, the interference of the large noise cannot be eliminated from the analysis result, and the robustness is poor. When CPD is used to analyze network traffic, because infinite estimates exist between time intervals, solving the partial differential equations over time intervals with a limited number of samples is non-trivial, which makes processing difficult; in practical applications, even when SST is combined, the robustness of the observation result is still poor because of the sparse large noise.
Therefore, how to provide a solution to the above technical problems is a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present invention provides a method, a system and related components for detecting network traffic with high robustness, which avoid sparse and large noise interference. The specific scheme is as follows:
a network traffic detection method comprises the following steps:
acquiring current network flow data;
vectorizing the network traffic data by taking time as a unit to obtain a network traffic vector corresponding to each time;
constructing a delay subsequence and an overtime subsequence of the target moment by taking any one moment of all the moments as the target moment and using all the network traffic vectors;
constructing a delay track matrix and an overtime track matrix by respectively utilizing the delay subsequence and the overtime subsequence;
respectively carrying out RPCA (robust principal component analysis) low-rank recovery on the delay track matrix and the overtime track matrix to obtain a delay low-rank matrix and an overtime low-rank matrix;
selecting a target matrix with rows and columns both being a first preset value from the time-delay low-rank matrix, calculating a characteristic vector of the target matrix, and obtaining a characteristic vector hyperplane according to all the characteristic vectors;
selecting a principal component with the highest contribution degree from the overtime low-rank matrix as a target vector;
and calculating the distance from the target vector to the hyperplane of the characteristic vector to determine the flow abnormity grade corresponding to the target moment.
Preferably, the step of performing RPCA low rank recovery on the delay trajectory matrix and the timeout trajectory matrix respectively to obtain a delay low rank matrix and a timeout low rank matrix specifically includes:
and respectively optimizing the RPCA low-rank recovery of the delay trajectory matrix and the overtime trajectory matrix by an inexact augmented Lagrange multiplier (IALM) algorithm to obtain a delay low-rank matrix and an overtime low-rank matrix.
Preferably, the process of acquiring the current network traffic data specifically includes:
acquiring current network flow data according to the setting of a user terminal;
the user terminal sets the time sequence window size and the data set type of the network flow data, wherein the data set type comprises a domain name request number, and/or an address access number, and/or a network session number, and/or an in-band flow value.
Preferably, the process of constructing the delay subsequence and the timeout subsequence at the target time by using any one of all the times as the target time and using all the network traffic vectors includes:
constructing a delay subsequence and an overtime subsequence of the target moment by taking any one moment of all the moments as the target moment and using all the network traffic vectors;
the delay subsequence comprises all the network traffic vectors from a first delay starting time to the target time, and the timeout subsequence comprises all the network traffic vectors from the first timeout starting time to a first timeout ending time;
the number of all the network traffic vectors corresponding to the time from the first delay starting point to the target time and the number of all the network traffic vectors corresponding to the time from the first timeout starting point to the first timeout ending point are both second preset values; the first timeout start time lags behind the target time by a preset time length.
Preferably, the process of constructing the delay trajectory matrix and the timeout trajectory matrix by using the delay subsequence and the timeout subsequence respectively includes:
constructing a delay track matrix by using a delay subsequence corresponding to each time from the second delay starting time to the target time; the number of all delay subsequences corresponding to the time from the second delay starting point to the target time is a third preset value;
constructing an overtime track matrix by utilizing an overtime subsequence corresponding to each time from the target time to a second overtime end time; and the number of all overtime subsequences corresponding to the target time to the second overtime end point is a fourth preset value.
Preferably, the calculating a distance from the target vector to the feature vector hyperplane to determine a flow anomaly level corresponding to the target time includes:
according to the formula
Figure BDA0003319296040000031
calculating a flow abnormal value; wherein β is the target vector, H_r is the feature vector hyperplane, and cp(t_1) is the flow anomaly value corresponding to the target time t_1;
and determining the flow abnormity grade corresponding to the target moment according to the flow abnormity value.
Preferably, the network traffic detection method further includes:
and visually displaying the network traffic data and the traffic abnormal grade corresponding to the network traffic data.
Correspondingly, the present application also discloses a network flow detection system, including:
the acquisition module is used for acquiring current network flow data;
the preprocessing module is used for vectorizing the network traffic data by taking time as a unit to obtain a network traffic vector corresponding to each time;
a sequence module, configured to construct a delay subsequence and an overtime subsequence of the target time by using any one of all the times as a target time and using all the network traffic vectors;
the matrix module is used for constructing a delay track matrix and an overtime track matrix by respectively utilizing the delay subsequence and the overtime subsequence;
the low-rank module is used for performing RPCA (robust principal component analysis) low-rank recovery on the delay track matrix and the overtime track matrix respectively to obtain a delay low-rank matrix and an overtime low-rank matrix;
the hyperplane module is used for selecting a target matrix with rows and columns both being a first preset value from the delay low-rank matrix, calculating a feature vector of the target matrix, and obtaining a feature vector hyperplane according to all the feature vectors;
the target vector module is used for selecting a principal component with the highest contribution degree from the overtime low-rank matrix as a target vector;
and the calculation module is used for calculating the distance from the target vector to the hyperplane of the characteristic vector so as to determine the flow abnormity grade corresponding to the target moment.
Correspondingly, this application still discloses a network flow detection device, includes:
a memory for storing a computer program;
a processor for implementing the steps of the network traffic detection method according to any of the above when executing the computer program.
Accordingly, the present application also discloses a readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the network traffic detection method according to any one of the above.
The application discloses a network flow detection method, which comprises the following steps: acquiring current network flow data; vectorizing the network traffic data by taking time as a unit to obtain a network traffic vector corresponding to each time; constructing a delay subsequence and an overtime subsequence of the target moment by taking any one moment of all the moments as the target moment and using all the network traffic vectors; constructing a delay track matrix and an overtime track matrix by respectively utilizing the delay subsequence and the overtime subsequence; respectively carrying out RPCA (robust principal component analysis) low-rank recovery on the delay track matrix and the overtime track matrix to obtain a delay low-rank matrix and an overtime low-rank matrix; selecting a target matrix with rows and columns both being a first preset value from the delay low-rank matrix, calculating a characteristic vector of the target matrix, and obtaining a characteristic vector hyperplane according to all the characteristic vectors; selecting a principal component with the highest contribution degree from the overtime low-rank matrix as a target vector; and calculating the distance from the target vector to the hyperplane of the characteristic vector to determine the flow abnormity grade corresponding to the target moment. According to the method, RPCA low-rank recovery is used to obtain the delay low-rank matrix and the overtime low-rank matrix; in this process sparse large noise in the original network flow data is eliminated while the mutation point is detected, so that the whole detection method has high robustness and the determined mutation point is more accurate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart illustrating steps of a network traffic detection method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of sub-steps of a network traffic detection method according to an embodiment of the present invention;
Fig. 3 is a diagram illustrating sub-steps of a network traffic detection method according to an embodiment of the present invention;
Figs. 4a to 4c are graphs comparing the results of different methods according to the embodiment of the present invention;
Figs. 5a and 5b are graphs comparing the discrete effect between abnormal score classes according to different methods in the embodiment of the present invention;
Fig. 6 is a graph of ESR scores for various methods in accordance with an embodiment of the present invention;
Fig. 7 is a structural distribution diagram of a network traffic detection system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, traditional principal component analysis (PCA) is extremely sensitive to large noise and sharp peaks, so its analysis result cannot exclude the interference of large noise, and its robustness is poor. When CPD is used to analyze network traffic, because infinite estimates exist between time intervals, solving the partial differential equations over time intervals with a limited number of samples is non-trivial, which makes processing difficult; in practical applications, even when SST is combined, the robustness of the observation result is still poor because of the sparse large noise.
According to the method, the RPCA low-rank recovery is selected to obtain the time-delay low-rank matrix and the overtime low-rank matrix, sparse large noise in original network flow data is eliminated in the process, the mutation point is detected, the whole detection method has high robustness, and the determined mutation point is more accurate.
The embodiment of the invention discloses a network flow detection method, which is shown in figure 1 and comprises the following steps:
S1: acquiring current network flow data;
S2: vectorizing the network traffic data by taking time as a unit to obtain a network traffic vector corresponding to each time;
specifically, all network traffic data is divided by taking time as a unit to obtain the network traffic data corresponding to each time, and vectorization processing is then performed on that data, so that the network traffic vector corresponding to each time t is y(t);
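As a minimal sketch of step S2, the following assumes the raw data arrives as (timestamp, metric, value) records and that one traffic vector holds one summed value per metric. The record layout, the metric names in `METRICS` and the 60-second time unit are illustrative assumptions, not part of the method as described.

```python
from collections import defaultdict

# Hypothetical metric names, loosely following the data set types named in
# the document (domain name requests, address accesses, sessions, traffic).
METRICS = ("domain_requests", "address_accesses", "sessions", "inbound_traffic")


def vectorize(records, unit_seconds=60):
    """Group raw records into time buckets and sum each metric per bucket.

    records: iterable of (timestamp_seconds, metric_name, value) tuples.
    Returns (times, vectors) where vectors[i] plays the role of y(times[i]).
    Buckets with no records are simply absent (no zero filling is done).
    """
    buckets = defaultdict(lambda: [0.0] * len(METRICS))
    for timestamp, metric, value in records:
        bucket = int(timestamp // unit_seconds)
        buckets[bucket][METRICS.index(metric)] += value
    times = sorted(buckets)
    return times, [buckets[t] for t in times]


records = [(5, "sessions", 2), (30, "sessions", 1), (65, "domain_requests", 4)]
times, vectors = vectorize(records)
```

Each entry of `vectors` is a plain list so that later steps can stack them into matrices with whatever numeric library is in use.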
S3: constructing a delay subsequence and an overtime subsequence of the target moment by taking any moment in all moments as the target moment and using all network flow vectors;
this step, when carried out in practice, can generally be carried out in the following manner:
constructing a delay subsequence and an overtime subsequence of the target moment by taking any moment in all moments as the target moment and using all network flow vectors;
the time-delay subsequence comprises all network traffic vectors from a first time-delay starting moment to a target moment, and the time-out subsequence comprises all network traffic vectors from the first time-out starting moment to a first time-out terminal moment;
the number of all network traffic vectors corresponding to the time from the first time delay starting point to the target time and the number of all network traffic vectors corresponding to the time from the first timeout starting point to the first timeout ending point are second preset values; the first timeout start time lags the target time by a preset time length.
It can be understood that the delay subsequence and the timeout subsequence are necessarily formed by the same number of network traffic vectors, but the start point and end point of the time range of the network traffic vectors need not be exactly as described above; they may be set according to actual needs, and the aggregation rule of the time range may also be adjusted according to actual needs.
Referring to fig. 2, assume that the target time is t = t_1 and the second preset value is w, so that the first delay start time is t_1 - w, and that the first timeout start time lags behind the target time by a preset time length g. If the aggregation rule at this time is left-open and right-closed, the delay subsequence of the target time t_1 is v(t_1 - 1) = [y(t_1 - w), ..., y(t_1 - 1)]^T, and the timeout subsequence of the target time t_1 is r(t_1 + g) = [y(t_1 + g), ..., y(t_1 + g - w)]^T.
It is understood that the above representation of the delay subsequence and the timeout subsequence is only an example, and in practical applications, the representation method and the time range of each sequence can be adjusted according to requirements.
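A minimal sketch of the two windows, assuming the traffic vectors are stored in a list `y` indexed by time. The delay window takes the w samples that end just before the target time. For the timeout window the sketch assumes the w samples starting at t1 + g and running forward, which is one reading of "from the first timeout start time to the first timeout end time"; the function names and this forward direction are assumptions, and the slice should be adjusted if a different range is intended.

```python
def delay_subsequence(y, t1, w):
    """Return [y(t1 - w), ..., y(t1 - 1)], the w samples just before t1."""
    if t1 - w < 0:
        raise ValueError("not enough history before the target time")
    return y[t1 - w:t1]


def timeout_subsequence(y, t1, g, w):
    """Return w samples starting at t1 + g (assumed forward-running window)."""
    start = t1 + g
    if start + w > len(y):
        raise ValueError("not enough samples after the target time")
    return y[start:start + w]


series = list(range(20))                      # stand-in for y(0), ..., y(19)
v = delay_subsequence(series, t1=10, w=3)     # samples at times 7, 8, 9
r = timeout_subsequence(series, t1=10, g=2, w=3)  # samples at times 12, 13, 14
```

Both windows have the same length w, which matches the requirement that the two subsequences contain the same number of network traffic vectors.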
S4: constructing a delay track matrix and an overtime track matrix by respectively utilizing the delay subsequence and the overtime subsequence;
this step, when carried out in practice, can generally be carried out in the following manner:
constructing a delay track matrix by using a delay subsequence corresponding to each time from the second delay starting point time to the target time; the number of all delay subsequences corresponding to the time from the second delay starting point to the target time is a third preset value;
constructing an overtime track matrix by utilizing an overtime subsequence corresponding to each time from the target time to the second overtime end time; and the number of all overtime subsequences from the target moment to the second overtime end point moment is a fourth preset value.
Similarly, the time ranges of the delay trajectory matrix and the timeout trajectory matrix, the lengths and the end points of the time ranges can be set according to actual requirements, and the time ranges are not limited herein.
Specifically, the third preset value may be n, so that the second delay start time is t_1 - n. If the aggregation rule of the time range is left-open and right-closed when the matrix is constructed, the delay trajectory matrix of the target time t_1 is H(t_1) = [v(t_1 - n), ..., v(t_1 - 1)]^T. Similarly, if the fourth preset value is m, the second timeout end time is t_1 + m. If the aggregation rule of the time range is left-open and right-closed when the matrix is constructed, the timeout trajectory matrix of the target time t_1 is G(t_1) = [r(t_1 + g), ..., r(t_1 + g + m - 2), r(t_1 + g + m - 1)].
It is understood that the above representation of the delay trajectory matrix and the timeout trajectory matrix is only an example, and in practical applications, the representation method and the time range of each matrix may be adjusted according to requirements.
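The stacking of subsequences into trajectory matrices can be sketched as follows for a scalar series, using NumPy. The sketch assumes v(s) holds the w samples ending at time s and r(s) holds the w samples starting at time s; the second convention in particular is an assumption. Function names are hypothetical.

```python
import numpy as np


def delay_trajectory_matrix(y, t1, w, n):
    """Stack v(t1 - n), ..., v(t1 - 1) as rows, giving an n x w matrix.

    v(s) is taken to be [y(s - w + 1), ..., y(s)].
    """
    if t1 - n - w + 1 < 0:
        raise ValueError("not enough history before the target time")
    rows = [y[s - w + 1:s + 1] for s in range(t1 - n, t1)]
    return np.array(rows, dtype=float)


def timeout_trajectory_matrix(y, t1, w, g, m):
    """Stack r(t1 + g), ..., r(t1 + g + m - 1) as columns, giving w x m.

    r(s) is assumed to be [y(s), ..., y(s + w - 1)].
    """
    if t1 + g + m - 1 + w > len(y):
        raise ValueError("not enough samples after the target time")
    cols = [y[s:s + w] for s in range(t1 + g, t1 + g + m)]
    return np.array(cols, dtype=float).T


y = np.arange(30.0)
H = delay_trajectory_matrix(y, t1=15, w=4, n=5)
G = timeout_trajectory_matrix(y, t1=15, w=4, g=2, m=3)
```

Note that H stores subsequences as rows (because of the transpose in its definition) while G stores them as columns, so one of the two has to be transposed before the two are compared in the same w-dimensional space.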
S5: respectively to delay trace matrix H (t)1) And timeout trace matrix G (t)1) Performing RPCA low-rank recovery to obtain a delay low-rank matrix AHSum time-out low rank matrix aG
It can be understood that the RPCA low-rank recovery action eliminates the large noise in the original network traffic data, and improves the robustness of the whole detection method and the accuracy of abnormal point detection.
S6: from the time-delayed low rank matrix H (t)1) Selecting a target matrix with rows and columns both having a first preset value l, and calculating a feature vector H of the target matrix1,……,HlAnd obtaining a feature vector hyperplane H according to all feature vectorsr=span{H1,H2,......,Hl};
S7: selecting a principal component with the highest contribution degree from the overtime low-rank matrix as a target vector beta;
S8: calculating the distance from the target vector to the hyperplane of the characteristic vector to determine the flow abnormity grade corresponding to the target moment.
Wherein, this step specifically includes:
according to the formula
Figure BDA0003319296040000071
calculating a flow abnormal value; wherein β is the target vector, H_r is the feature vector hyperplane, and cp(t_1) is the flow anomaly value corresponding to the target time t_1;
and determining the flow abnormity grade corresponding to the target time according to the flow abnormity value.
Specifically, in this embodiment, reference may be made to fig. 3 for processing and transforming the data in steps S4-S8.
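Steps S6 to S8 can be illustrated with a short NumPy sketch. Because the formula image is not reproduced in this text, the sketch follows the usual SST convention as an assumption: the hyperplane H_r is spanned by the top l left singular vectors of the delay low-rank matrix, β is the top left singular vector of the timeout low-rank matrix, and the score is the Euclidean distance from β to its orthogonal projection onto H_r. Both inputs are expected with subsequences as columns (transpose a matrix that stores them as rows). The function name `change_score` is hypothetical.

```python
import numpy as np


def change_score(A_H, A_G, l):
    """Distance from the dominant direction of A_G to the span of the
    top-l left singular vectors of A_H (0 = inside, 1 = orthogonal)."""
    U, _, _ = np.linalg.svd(A_H, full_matrices=False)
    basis = U[:, :l]                       # orthonormal basis of H_r
    Ug, _, _ = np.linalg.svd(A_G, full_matrices=False)
    beta = Ug[:, 0]                        # unit-norm target vector
    residual = beta - basis @ (basis.T @ beta)
    return float(np.linalg.norm(residual))


# Past behaviour lives in the plane spanned by the first two axes of R^4.
past = np.array([[1.0, 0.0, 1.0],
                 [0.0, 1.0, 1.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0]])
future_same = np.array([[1.0, 2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
future_new = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [0.0, 0.0]])

score_same = change_score(past, future_same, l=2)
score_new = change_score(past, future_new, l=2)
```

Since β has unit norm, the score lies between 0 and 1, which makes it convenient to compare against a fixed detection threshold.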
Therefore, the RPCA low-rank recovery is selected to obtain the time-delay low-rank matrix and the overtime low-rank matrix, sparse loud noise in original network flow data is eliminated in the process, and the mutation point is detected at the same time, so that the whole detection method has high robustness, and the determined mutation point is more accurate.
The embodiment of the invention discloses a specific network traffic detection method, and compared with the previous embodiment, the embodiment further explains and optimizes the technical scheme.
Specifically, the process of performing RPCA low rank recovery on the delay trajectory matrix and the timeout trajectory matrix respectively to obtain a delay low rank matrix and a timeout low rank matrix includes:
and optimizing the RPCA low-rank recovery of the delay trajectory matrix and the timeout trajectory matrix, respectively, by an Inexact Augmented Lagrange Multiplier (IALM) algorithm to obtain a delay low-rank matrix and a timeout low-rank matrix.
Specifically, the pseudo code corresponding to the optimization process of the inexact augmented Lagrange multiplier algorithm is as follows:
Figure BDA0003319296040000081
according to the pseudo codes, the delay trajectory matrix and the timeout trajectory matrix which need to be subjected to RPCA low-rank recovery in the embodiment can be optimized.
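Since the pseudo code figure is not reproduced in this text, the following is a sketch of the commonly published form of inexact ALM for RPCA, which splits a matrix D into a low-rank part A and a sparse part E by alternating singular value thresholding and elementwise soft thresholding. The default weight 1/sqrt(max(m, n)), the step parameters and the stopping rule are conventional choices assumed here, not values taken from the patent.

```python
import numpy as np


def shrink(x, tau):
    """Elementwise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500, rho=1.5):
    """Decompose D into low-rank A and sparse E with D ~= A + E."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    norm_two = np.linalg.norm(D, 2)
    norm_fro = np.linalg.norm(D, "fro")
    if norm_fro == 0.0:
        return D.copy(), np.zeros_like(D)
    Y = D / max(norm_two, np.abs(D).max() / lam)   # dual variable
    mu = 1.25 / norm_two
    mu_max = mu * 1e7
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # Low-rank update: threshold the singular values.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * shrink(s, 1.0 / mu)) @ Vt
        # Sparse update: threshold the entries.
        E = shrink(D - A + Y / mu, lam / mu)
        residual = D - A - E
        Y = Y + mu * residual
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(residual, "fro") / norm_fro < tol:
            break
    return A, E


rng = np.random.default_rng(0)
low_rank = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
spikes = np.zeros((60, 60))
mask = rng.random((60, 60)) < 0.05
spikes[mask] = rng.choice([-10.0, 10.0], size=int(mask.sum()))
A, E = rpca_ialm(low_rank + spikes)
```

In the detection method, the same routine would be applied once to the delay trajectory matrix and once to the timeout trajectory matrix, keeping the low-rank part of each.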
The embodiment of the invention discloses a specific network traffic detection method, and compared with the previous embodiment, the embodiment further explains and optimizes the technical scheme.
The process of acquiring the current network traffic data specifically includes:
acquiring current network flow data according to the setting of a user terminal;
the user terminal sets the time sequence window size and the data set type of the network traffic data, wherein the data set type comprises the domain name request number, the address access number, the network session number and the in-band in-bound traffic value.
Specifically, the time series window size may be set at the user terminal to 30 days, 7 days, or 1 day.
Further, in addition to the time-series window size and the data set type of the network traffic data, the user terminal settings further include an abnormal value detection threshold, which provides a criterion for determining the traffic anomaly level corresponding to the traffic anomaly value.
Further, the network traffic detection method further includes:
and visually displaying the network traffic data and the traffic abnormal grade corresponding to the network traffic data.
It can be understood that, according to the network traffic detection method in the foregoing embodiment, each time can be calculated as a target time, so that an abnormal traffic level or an abnormal traffic value corresponding to each time can be obtained. After all the relevant data for each time are acquired, the data may be output to the user terminal, specifically, the data content output to the user terminal includes a timestamp sequence, all the network traffic data, an abnormal value detection threshold, an abnormal traffic class, an abnormal traffic value, and a feature vector matrix of an abnormal point at all times, where the definition of the abnormal point is specifically: when the abnormal flow value corresponding to a certain moment exceeds the abnormal value detection threshold, the moment is determined as an abnormal point; according to the difference value of the abnormal flow value and the abnormal value detection threshold value, the abnormal flow grade of the abnormal point can be further determined, for example, the abnormal flow grade is divided into no abnormality, general abnormality, moderate abnormality and serious abnormality; the eigenvector matrix of the abnormal point mainly comprises data such as the number of principal components, the contribution ratio of the principal components, a principal component eigenvalue list and the like.
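The mapping from an abnormal flow value to an abnormal flow grade can be sketched as below. The four grade names come from the example in the text; the equal-width bands controlled by `step` are a hypothetical choice, since the text only says that the grade depends on the difference between the value and the threshold.

```python
LEVELS = (
    "no abnormality",
    "general abnormality",
    "moderate abnormality",
    "serious abnormality",
)


def anomaly_level(score, threshold, step=0.1):
    """Grade a score by how far it exceeds the detection threshold.

    A time is an abnormal point only when score > threshold; the band
    width `step` is an illustrative assumption.
    """
    excess = score - threshold
    if excess <= 0:
        return LEVELS[0]
    if excess <= step:
        return LEVELS[1]
    if excess <= 2 * step:
        return LEVELS[2]
    return LEVELS[3]


scores = [0.20, 0.35, 0.45, 0.90]
levels = [anomaly_level(s, threshold=0.3) for s in scores]
```

A list such as `levels`, paired with the timestamps, is the kind of per-time output that could then be passed to the user terminal for display.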
Further, when the user terminal displays the data, the flow rate abnormality level can be displayed through visualization, specifically, through visualization means such as a line graph, a bar graph, a lane graph, and color change.
The embodiment of the invention discloses a specific network traffic detection method, and compared with the previous embodiment, the embodiment further explains and optimizes the technical scheme.
Specifically, in this embodiment, the network traffic detection method in the embodiment of the present application, which is referred to as RPCA-SST for short below, is compared with the conventional SST method in different directions:
referring to fig. 4a to 4c, the three graphs respectively show, for three different segments of data, the network traffic data input and the detection results of the SST method and of the RPCA-SST method of the present application for outliers and sparse large noise;
referring to fig. 5a and 5b, the two graphs respectively show, for two different segments of data, the network traffic data input and a comparison of the dispersion between anomaly score levels of the SST method, the RSST method, the RuLSIF method, and the RPCA-SST method of the present application;
see fig. 6 for the ESR scores of the various algorithms in this embodiment. It is understood that the Equal Sampling Rate (ESR) score is a threshold-free quality metric for examining the performance of each algorithm. For two discrete probability distributions p(t) and q(t) defined over the region 1 ≤ t ≤ T, the equal sampling (ES) measure is defined as:
Figure BDA0003319296040000101
where c is the tolerance parameter and h is the distance scaling function defined on the region [-c, c]. ES(p, q, c, 1) is an overview of recall and precision, based on which ESR is defined as:
ESR(p, q, c, 1) = ω × ES(p, q, c, 1) + (1 - ω) × ES(q, p, c, 1);
where 0 ≤ ω ≤ 1 is a weight parameter that trades off between recall and precision. The higher the ESR score, the better the CPD performance. The specific derivation and calculation analysis can be found in Y. Mohammad and T. Nishida, "On comparing SSA-based change point discovery algorithms," in System Integration (SII), 2011 IEEE/SICE International Symposium on. IEEE, 2011, pp. 938-945, where this definition corresponds to formula (14), and will not be described herein again.
Through the test comparison of various network traffic detection methods in different directions and by different means, it is obvious that the traffic detection method RPCA-SST disclosed by the application can avoid sparse and large noise, has higher robustness and more accurate abnormal value detection result, and has accuracy far higher than that of other traditional methods.
Correspondingly, the present application also discloses a network traffic detection system, as shown in fig. 7, including:
the acquisition module 1 is configured to acquire current network traffic data;
the preprocessing module 2 is configured to vectorize the network traffic data in units of time to obtain a network traffic vector corresponding to each time;
the sequence module 3 is configured to take any one of all the times as a target time and construct a delay subsequence and a timeout subsequence of the target time from all the network traffic vectors;
the matrix module 4 is configured to construct a delay trajectory matrix and a timeout trajectory matrix from the delay subsequences and the timeout subsequences, respectively;
the low-rank module 5 is configured to perform RPCA low-rank recovery on the delay trajectory matrix and the timeout trajectory matrix, respectively, to obtain a delay low-rank matrix and a timeout low-rank matrix;
the hyperplane module 6 is configured to select, from the delay low-rank matrix, a target matrix whose numbers of rows and columns both equal a first preset value, calculate the feature vectors (eigenvectors) of the target matrix, and obtain a feature vector hyperplane from all the feature vectors;
the target vector module 7 is configured to select the principal component with the highest contribution degree from the timeout low-rank matrix as a target vector;
and the calculation module 8 is configured to calculate the distance from the target vector to the feature vector hyperplane, so as to determine the traffic anomaly level corresponding to the target time.
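The roles of the hyperplane module and the target vector module can be sketched as follows. This is one possible reading, not the applicant's implementation: it assumes the first preset value r is the number of leading directions kept, and it uses the left singular vectors of each low-rank matrix (which are the eigenvectors of the matrix multiplied by its own transpose), as is conventional in SST. The function names and the parameter `r` are hypothetical.

```python
import numpy as np

def feature_hyperplane(delay_low_rank, r):
    """Return an orthonormal basis (columns) of the r leading left singular
    directions of the delay low-rank matrix; shape is (rows, r)."""
    A = np.asarray(delay_low_rank, dtype=float)
    if not 1 <= r <= min(A.shape):
        raise ValueError("r must be between 1 and min(A.shape)")
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :r]

def target_vector(timeout_low_rank):
    """Return the leading left singular vector of the timeout low-rank
    matrix, i.e. the direction with the largest singular value."""
    A = np.asarray(timeout_low_rank, dtype=float)
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, 0]
```

Both outputs have unit-norm columns, so a later distance computation between the target vector and the spanned subspace does not depend on the scale of the traffic values.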
In this embodiment of the application, RPCA low-rank recovery is used to obtain the delay low-rank matrix and the timeout low-rank matrix. This step removes sparse large noise from the original network traffic data before change points are detected, so the detection method as a whole is highly robust and the change points it determines are more accurate.
In some specific embodiments, the low-rank module is specifically configured to: solve the RPCA low-rank recovery of the delay trajectory matrix and of the timeout trajectory matrix, respectively, with the inexact augmented Lagrange multiplier (IALM) algorithm, to obtain a delay low-rank matrix and a timeout low-rank matrix.
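A minimal sketch of RPCA solved with an inexact augmented Lagrange multiplier scheme is given below. It follows the commonly published form of the algorithm (alternating singular value thresholding for the low-rank part and soft thresholding for the sparse part); the function name `rpca_ialm`, the default weight `lam`, the growth factor `rho`, and the stopping tolerance are assumptions for illustration and are not taken from this application.

```python
import numpy as np

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500, rho=1.5):
    """Split M into a low-rank part L and a sparse part S with M ~ L + S."""
    M = np.asarray(M, dtype=float)
    norm_M = np.linalg.norm(M, "fro")
    if norm_M == 0.0:
        return np.zeros_like(M), np.zeros_like(M)
    if lam is None:
        lam = 1.0 / np.sqrt(max(M.shape))   # common default weight (assumption)
    spec = np.linalg.norm(M, 2)             # largest singular value
    Y = M / max(spec, np.abs(M).max() / lam)  # initial multiplier
    mu, mu_max = 1.25 / spec, 1.25e7 / spec
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values by 1/mu.
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: entrywise soft thresholding by lam/mu.
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # Multiplier and penalty updates.
        Z = M - L - S
        Y = Y + mu * Z
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(Z, "fro") <= tol * norm_M:
            break
    return L, S
```

In the setting described here, `M` would be the delay or timeout trajectory matrix, `L` the corresponding low-rank matrix passed on to the later steps, and `S` the sparse large noise that is discarded.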
In some specific embodiments, the acquisition module is specifically configured to: acquire current network traffic data according to the settings of a user terminal; the settings of the user terminal include the time-series window size and the data set type of the network traffic data, wherein the data set type comprises a domain name request number, and/or an address access number, and/or a network session number, and/or an in-band traffic value.
In some specific embodiments, the sequence module is specifically configured to:
taking any one of all the times as the target time, and constructing a delay subsequence and a timeout subsequence of the target time from all the network traffic vectors;
the delay subsequence comprises all the network traffic vectors from a first delay start time to the target time, and the timeout subsequence comprises all the network traffic vectors from a first timeout start time to a first timeout end time;
the number of network traffic vectors from the first delay start time to the target time and the number of network traffic vectors from the first timeout start time to the first timeout end time are both equal to a second preset value; the first timeout start time lags behind the target time by a preset time length.
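The two subsequences described above can be sketched as sliding windows over the series of traffic vectors. The sketch assumes the series is indexed by integer time steps; `w` stands for the second preset value (window length) and `lag` for the preset time length by which the timeout window trails the target time. All names are hypothetical.

```python
import numpy as np

def delay_subsequence(series, t, w):
    """Return the w samples ending at the target index t (inclusive)."""
    x = np.asarray(series)
    start = t - w + 1                      # first delay start time
    if start < 0 or t >= len(x):
        raise ValueError("window does not fit inside the series")
    return x[start:t + 1]

def timeout_subsequence(series, t, w, lag):
    """Return the w samples starting lag steps after the target index t."""
    x = np.asarray(series)
    start = t + lag                        # first timeout start time
    end = start + w                        # one past the first timeout end time
    if t < 0 or lag < 0 or end > len(x):
        raise ValueError("window does not fit inside the series")
    return x[start:end]
```

For a series 0, 1, ..., 9 with t = 5 and w = 3, the delay subsequence is the samples at indices 3 to 5, and with lag = 1 the timeout subsequence is the samples at indices 6 to 8.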
In some specific embodiments, the matrix module is specifically configured to:
constructing the delay trajectory matrix from the delay subsequence corresponding to each time from a second delay start time to the target time; the number of delay subsequences from the second delay start time to the target time is a third preset value;
constructing the timeout trajectory matrix from the timeout subsequence corresponding to each time from the target time to a second timeout end time; the number of timeout subsequences from the target time to the second timeout end time is a fourth preset value.
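The construction of the two trajectory matrices can be sketched by stacking consecutive subsequences as columns, which yields Hankel-structured matrices for a scalar series. The sketch assumes one scalar traffic value per time step; `w` is the subsequence length (second preset value), `n` the number of delay subsequences (third preset value), `m` the number of timeout subsequences (fourth preset value), and `lag` the preset time length. All names are hypothetical.

```python
import numpy as np

def delay_trajectory_matrix(series, t, w, n):
    """Columns are the delay subsequences of times t-n+1 .. t; shape (w, n)."""
    x = np.asarray(series, dtype=float)
    first = t - n + 1                      # second delay start time
    if first - w + 1 < 0 or t >= len(x):
        raise ValueError("not enough samples before the target time")
    cols = [x[s - w + 1:s + 1] for s in range(first, t + 1)]
    return np.column_stack(cols)

def timeout_trajectory_matrix(series, t, w, m, lag):
    """Columns are the timeout subsequences of times t .. t+m-1; shape (w, m)."""
    x = np.asarray(series, dtype=float)
    last = t + m - 1                       # second timeout end time
    if t < 0 or lag < 0 or last + lag + w > len(x):
        raise ValueError("not enough samples after the target time")
    cols = [x[s + lag:s + lag + w] for s in range(t, last + 1)]
    return np.column_stack(cols)
```

Each column is shifted by one time step relative to its neighbour, so the delay matrix summarizes the recent past of the target time and the timeout matrix summarizes what follows it.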
In some specific embodiments, the calculation module is specifically configured to:
according to the formula
Figure BDA0003319296040000121
calculating a traffic anomaly value; wherein β is the target vector, H_r is the feature vector hyperplane, and cp(t_1) is the traffic anomaly value corresponding to the target time t_1;
and determining the traffic anomaly level corresponding to the target time according to the traffic anomaly value.
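The formula itself is contained in the figure and is not reproduced in this text, so the following is only a sketch under stated assumptions. It computes the Euclidean distance from the normalized target vector β to the subspace spanned by an orthonormal basis of the hyperplane, which is one direct reading of "distance from the target vector to the hyperplane"; the figure may define or normalize the score differently. The thresholds that map the score to a level are hypothetical values chosen for illustration.

```python
from bisect import bisect_right

import numpy as np

def anomaly_score(beta, hyperplane_basis):
    """Distance from the unit-normalized beta to span(hyperplane_basis).

    hyperplane_basis must have orthonormal columns. The result lies in
    [0, 1]: 0 when beta lies in the subspace, 1 when it is orthogonal to it.
    """
    b = np.asarray(beta, dtype=float)
    norm = np.linalg.norm(b)
    if norm == 0.0:
        raise ValueError("beta must be non-zero")
    b = b / norm
    U = np.asarray(hyperplane_basis, dtype=float)
    residual = b - U @ (U.T @ b)           # part of beta outside the subspace
    return float(np.linalg.norm(residual))

def anomaly_level(score, thresholds=(0.3, 0.6)):
    """Map a score to a level 0, 1, 2, ... using hypothetical thresholds."""
    return bisect_right(thresholds, score)
```

A score near 0 means the traffic after the target time follows the same dominant pattern as the traffic before it, while a score near 1 indicates a change in pattern at the target time.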
In some specific embodiments, the network traffic detection system further includes:
a visualization module, configured to visually display the network traffic data and the traffic anomaly level corresponding to the network traffic data.
Correspondingly, the embodiment of the present application further discloses a network traffic detection device, including:
a memory for storing a computer program;
a processor for implementing the steps of the network traffic detection method according to any of the above embodiments when executing the computer program.
Correspondingly, the embodiment of the present application further discloses a readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the network traffic detection method according to any of the above embodiments.
The specific content of the network traffic detection method in the embodiment of the present application may refer to the description in the above embodiment, and is not described herein again.
The network traffic detection device and the readable storage medium in the embodiments of the present application have the same technical effects as the network traffic detection method in the embodiments above, and are not described herein again.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The network traffic detection method, system and related components provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and the implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method and the core idea of the present invention. Meanwhile, a person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A network traffic detection method is characterized by comprising the following steps:
acquiring current network traffic data;
vectorizing the network traffic data in units of time to obtain a network traffic vector corresponding to each time;
taking any one of all the times as a target time, and constructing a delay subsequence and a timeout subsequence of the target time from all the network traffic vectors;
constructing a delay trajectory matrix and a timeout trajectory matrix from the delay subsequences and the timeout subsequences, respectively;
performing RPCA (robust principal component analysis) low-rank recovery on the delay trajectory matrix and the timeout trajectory matrix, respectively, to obtain a delay low-rank matrix and a timeout low-rank matrix;
selecting, from the delay low-rank matrix, a target matrix whose numbers of rows and columns both equal a first preset value, calculating the feature vectors (eigenvectors) of the target matrix, and obtaining a feature vector hyperplane from all the feature vectors;
selecting the principal component with the highest contribution degree from the timeout low-rank matrix as a target vector;
and calculating the distance from the target vector to the feature vector hyperplane to determine the traffic anomaly level corresponding to the target time.
2. The method for detecting network traffic according to claim 1, wherein the step of performing RPCA low rank recovery on the delay trajectory matrix and the timeout trajectory matrix respectively to obtain a delay low rank matrix and a timeout low rank matrix specifically includes:
solving the RPCA low-rank recovery of the delay trajectory matrix and of the timeout trajectory matrix, respectively, with the inexact augmented Lagrange multiplier (IALM) algorithm, to obtain a delay low-rank matrix and a timeout low-rank matrix.
3. The method according to claim 1, wherein the process of acquiring the current network traffic data specifically includes:
acquiring current network traffic data according to the settings of a user terminal;
the settings of the user terminal include the time-series window size and the data set type of the network traffic data, wherein the data set type comprises a domain name request number, and/or an address access number, and/or a network session number, and/or an in-band traffic value.
4. The method according to claim 1, wherein the process of constructing the delay subsequence and the timeout subsequence of the target time by using all the network traffic vectors and taking any one of all the times as the target time includes:
taking any one of all the times as the target time, and constructing a delay subsequence and a timeout subsequence of the target time from all the network traffic vectors;
the delay subsequence comprises all the network traffic vectors from a first delay start time to the target time, and the timeout subsequence comprises all the network traffic vectors from a first timeout start time to a first timeout end time;
the number of network traffic vectors from the first delay start time to the target time and the number of network traffic vectors from the first timeout start time to the first timeout end time are both equal to a second preset value; the first timeout start time lags behind the target time by a preset time length.
5. The method according to claim 4, wherein the process of constructing the delay trajectory matrix and the timeout trajectory matrix by using the delay subsequence and the timeout subsequence, respectively, comprises:
constructing the delay trajectory matrix from the delay subsequence corresponding to each time from a second delay start time to the target time; the number of delay subsequences from the second delay start time to the target time is a third preset value;
constructing the timeout trajectory matrix from the timeout subsequence corresponding to each time from the target time to a second timeout end time; the number of timeout subsequences from the target time to the second timeout end time is a fourth preset value.
6. The method according to any one of claims 1 to 5, wherein the step of calculating the distance from the target vector to the hyperplane of the feature vector to determine the traffic anomaly level corresponding to the target time includes:
according to the formula
Figure FDA0003319296030000021
calculating a traffic anomaly value; wherein β is the target vector, H_r is the feature vector hyperplane, and cp(t_1) is the traffic anomaly value corresponding to the target time t_1;
and determining the traffic anomaly level corresponding to the target time according to the traffic anomaly value.
7. The network traffic detection method of claim 6, further comprising:
visually displaying the network traffic data and the traffic anomaly level corresponding to the network traffic data.
8. A network traffic detection system, comprising:
an acquisition module, configured to acquire current network traffic data;
a preprocessing module, configured to vectorize the network traffic data in units of time to obtain a network traffic vector corresponding to each time;
a sequence module, configured to take any one of all the times as a target time and construct a delay subsequence and a timeout subsequence of the target time from all the network traffic vectors;
a matrix module, configured to construct a delay trajectory matrix and a timeout trajectory matrix from the delay subsequences and the timeout subsequences, respectively;
a low-rank module, configured to perform RPCA (robust principal component analysis) low-rank recovery on the delay trajectory matrix and the timeout trajectory matrix, respectively, to obtain a delay low-rank matrix and a timeout low-rank matrix;
a hyperplane module, configured to select, from the delay low-rank matrix, a target matrix whose numbers of rows and columns both equal a first preset value, calculate the feature vectors (eigenvectors) of the target matrix, and obtain a feature vector hyperplane from all the feature vectors;
a target vector module, configured to select the principal component with the highest contribution degree from the timeout low-rank matrix as a target vector;
and a calculation module, configured to calculate the distance from the target vector to the feature vector hyperplane, so as to determine the traffic anomaly level corresponding to the target time.
9. A network traffic detection device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the network traffic detection method according to any of claims 1 to 7 when executing the computer program.
10. A readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the network traffic detection method according to any one of claims 1 to 7.
CN202111241230.3A 2021-10-25 2021-10-25 Network traffic detection method, system and related components Active CN113973013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111241230.3A CN113973013B (en) 2021-10-25 2021-10-25 Network traffic detection method, system and related components

Publications (2)

Publication Number Publication Date
CN113973013A true CN113973013A (en) 2022-01-25
CN113973013B CN113973013B (en) 2024-02-02

Family

ID=79588339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111241230.3A Active CN113973013B (en) 2021-10-25 2021-10-25 Network traffic detection method, system and related components

Country Status (1)

Country Link
CN (1) CN113973013B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9361797B1 (en) * 2014-12-11 2016-06-07 Here Global B.V. Detecting road condition changes from probe data
CN106301950A (en) * 2016-09-07 2017-01-04 中国联合网络通信集团有限公司 A kind of OD stream quantitative analysis method and analytical equipment
CN107070867A (en) * 2017-01-03 2017-08-18 湖南大学 Exception of network traffic quick determination method based on multilayer local sensitivity Hash table
CN110166464A (en) * 2019-05-27 2019-08-23 北京信息科技大学 A kind of detection method and system of content center network interest extensive aggression
US10430690B1 (en) * 2018-04-20 2019-10-01 Sas Institute Inc. Machine learning predictive labeling system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周伯阳等: "基于多尺度低秩模型的电力无线接入网异常流量检测方法", 电子学报, no. 08 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant