WO2021027697A1 - Traffic anomaly detection method, model training method, and apparatus - Google Patents

Traffic anomaly detection method, model training method, and apparatus

Info

Publication number
WO2021027697A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
time series
time sequence
sub
type
Prior art date
Application number
PCT/CN2020/107627
Other languages
English (en)
French (fr)
Inventor
张彦芳
李刚
薛莉
林玮
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP20852228.4A (published as EP4009590A4)
Publication of WO2021027697A1
Priority to US17/669,638 (published as US20220166681A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/026 Capturing of monitoring data using flow identification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/064 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/067 Generation of reports using time frame reporting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H04L43/087 Jitter
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Definitions

  • This application relates to the field of machine learning and, more specifically, to a traffic anomaly detection method, a model training method, and an apparatus.
  • In the field of machine learning, anomaly detection refers to detecting models, data, or events that do not conform to predictions. Traditionally, professionals study historical data and then identify the abnormal points. Data sources include applications, processes, operating systems, devices, and networks. As computing systems grow more complex, manual analysis can no longer cope with the difficulty of anomaly detection.
  • This application provides a traffic anomaly detection method, a model training method, and an apparatus, which can improve the accuracy with which a model detects anomalies in network traffic data.
  • A traffic anomaly detection method is provided, including: obtaining a target time series, where the target time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; obtaining a target parameter of the target time series according to the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor represents a wave-like variation around the long-term trend of the target time series, and the jitter density represents the deviation of the actual values of the target time series from the target values within a target time; determining, according to the target parameter, a first type to which the target time series belongs from a plurality of types, where each of the plurality of types corresponds to a parameter set and the target parameter belongs to the parameter set corresponding to the first type; and detecting anomalies in the target time series according to the first-category decision model corresponding to the first type, where each of the plurality of types corresponds to a decision model, and the decision model is used for traffic anomaly detection.
  • Obtaining the target parameter of the target time series according to the target time series includes: decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a first sub-time series that includes the N periodic components and a second sub-time series that includes the N residual components; and obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series.
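The decomposition step can be illustrated with a minimal sketch. It assumes an additive model (element = trend + periodic + residual), a period that is known in advance, and a moving-average trend; the source does not name a specific decomposition algorithm, and the function name `decompose` and the `period` parameter are hypothetical.

```python
import numpy as np

def decompose(series, period):
    """Additive decomposition sketch: series = trend + periodic + residual.

    Assumptions (not from the source): the period is known, period <= len(series),
    and the trend is estimated with a moving average over one period.
    """
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Pad with edge values so the moving average returns exactly n points.
    left = period // 2
    right = period - 1 - left
    padded = np.pad(x, (left, right), mode="edge")
    trend = np.convolve(padded, np.ones(period) / period, mode="valid")
    detrended = x - trend
    # Periodic component: mean detrended value at each phase of the period.
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    phase_means -= phase_means.mean()
    periodic = phase_means[np.arange(n) % period]
    # Residual component: whatever the trend and periodic parts do not explain.
    residual = x - trend - periodic
    return trend, periodic, residual
```

In this sketch, `periodic` plays the role of the first sub-time series and `residual` the role of the second sub-time series; `trend` corresponds to the series of N trend components.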
  • Obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining, according to the first sub-time series, whether the period factor exists in the target time series.
  • Determining, according to the first sub-time series, whether the period factor exists in the target time series includes: when the N periodic components in the first sub-time series exist, determining that the period factor exists in the target time series; when the N periodic components in the first sub-time series do not exist, determining that the period factor does not exist in the target time series.
  • The method further includes: determining, according to a first mapping relationship and the first type to which the target time series belongs, a second-category decision model corresponding to the first type, where the first mapping relationship includes the correspondence between the plurality of types and a plurality of second-category decision models; and detecting anomalies in the target time series according to the second sub-time series and the second-category decision model corresponding to the first type, where the second-category decision model is an N-sigma model.
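An N-sigma model applied to the residual components can be sketched as follows. This is an illustration only: the function name `n_sigma_anomalies` and the default `n = 3.0` are assumptions, and the source does not state which value of N each type uses.

```python
import numpy as np

def n_sigma_anomalies(values, n=3.0):
    """Return the indices whose distance from the mean exceeds n standard deviations."""
    v = np.asarray(values, dtype=float)
    mean, std = v.mean(), v.std()
    if std == 0.0:
        return []  # a constant series has no outliers under this rule
    return np.flatnonzero(np.abs(v - mean) > n * std).tolist()
```

For example, a residual series of twenty zeros followed by a single value of 10 flags only the last index.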
  • Obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining the jitter density of the target time series according to the second sub-time series.
  • Determining the jitter density of the target time series according to the second sub-time series includes: determining R of the target time series according to the following formula:
  • R is the jitter density;
  • r_n can be determined according to the following formula:
  • C_n is the n-th element in the second sub-time series;
  • x_n is the n-th element in the target time series;
  • N is determined according to the following formula:
  • T is the length of the target time series;
  • W is the window length of the joining window;
  • the ⁇ is the first preset value.
  • Determining, according to the target parameter, the first type to which the target time series belongs from the plurality of types includes: determining, according to the target parameter, a first parameter set to which the target parameter belongs from a plurality of parameter sets; and determining, according to a third mapping relationship and the first parameter set, the first type to which the target time series belongs from the plurality of types, where the third mapping relationship includes the correspondence between the plurality of parameter sets and the plurality of types.
  • The above types may include: periodic, aperiodic, stationary, glitch, periodic stationary, periodic glitch, aperiodic stationary, or aperiodic glitch.
  • The periodic and aperiodic types can be determined according to the period factor. Specifically, when the period factor exists, the type is periodic; when the period factor does not exist, the type is aperiodic.
  • The stationary and glitch types can be determined according to the jitter density. Specifically, when the jitter density is greater than a second preset value, the type is the glitch type; when the jitter density is less than or equal to the second preset value, the type is the stationary type.
  • According to the target parameter, the first parameter set to which the target parameter belongs is determined from the plurality of parameter sets, and the first type of the target time series is then determined from the plurality of types according to the third mapping relationship and the first parameter set. This yields the type of the target time series and completes its classification.
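The classification rules above can be sketched as a small function. The function name `classify`, its arguments, and the default threshold of 0.1 (standing in for the second preset value) are hypothetical; the source does not give a concrete value.

```python
from typing import Optional

def classify(has_period_factor: Optional[bool] = None,
             jitter_density: Optional[float] = None,
             jitter_threshold: float = 0.1) -> str:
    """Map the target parameter (period factor and/or jitter density) to a type.

    jitter_threshold stands in for the "second preset value"; 0.1 is an
    illustrative assumption, not a value taken from the source.
    """
    parts = []
    if has_period_factor is not None:
        parts.append("periodic" if has_period_factor else "aperiodic")
    if jitter_density is not None:
        # Strictly greater than the threshold means glitch; otherwise stationary.
        parts.append("glitch" if jitter_density > jitter_threshold else "stationary")
    if not parts:
        raise ValueError("at least one target parameter is required")
    return " ".join(parts)
```

When both parameters are supplied, the result is one of the four combined types; when only one is supplied, the result is one of the four single-parameter types.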
  • Detecting anomalies in the target time series according to the first-category decision model corresponding to the first type includes: determining a third sub-time series that includes the N trend components; dividing a second time series into M subsequences of a target length, where M is a positive integer and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm; calculating the matrix profile (MP) values of the M subsequences of the target length, where these MP values form an MP time series; and detecting anomalies in the target time series according to the MP time series and the N-sigma algorithm.
  • First, the second time series is divided into M subsequences of the target length, where the second time series is either the third sub-time series or a series formed from the third sub-time series by the PLR algorithm. Next, the matrix profile (MP) values of the M subsequences of the target length are calculated. Finally, anomalies in the target time series are detected, which improves the accuracy of traffic anomaly detection.
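The matrix-profile step followed by N-sigma detection can be sketched as follows. This is a naive illustration under several assumptions that are not in the source: the M subsequences are taken as overlapping sliding windows, distances are z-normalized Euclidean distances, matches within half a window of the subsequence itself are excluded, and only unusually large MP values are flagged. The names `matrix_profile` and `mp_anomalies` are hypothetical.

```python
import numpy as np

def matrix_profile(series, m):
    """Naive matrix profile: for each length-m subsequence, the z-normalized
    Euclidean distance to its nearest neighbor outside an exclusion zone."""
    x = np.asarray(series, dtype=float)
    count = len(x) - m + 1  # M subsequences of target length m
    if count < 2:
        raise ValueError("series is too short for the target length")
    subs = np.lib.stride_tricks.sliding_window_view(x, m)
    std = subs.std(axis=1, keepdims=True)
    std[std == 0.0] = 1.0  # constant windows normalize to all zeros
    z = (subs - subs.mean(axis=1, keepdims=True)) / std
    excl = max(1, m // 2)  # skip trivial matches close to the window itself
    profile = np.full(count, np.inf)
    for i in range(count):
        dist = np.linalg.norm(z - z[i], axis=1)
        dist[max(0, i - excl):min(count, i + excl + 1)] = np.inf
        profile[i] = dist.min()
    return profile

def mp_anomalies(series, m, n=3.0):
    """Flag subsequence start indices whose MP value exceeds mean + n * std."""
    mp = matrix_profile(series, m)
    std = mp.std()
    if std == 0.0:
        return []
    return np.flatnonzero(mp > mp.mean() + n * std).tolist()
```

A subsequence with a large MP value has no close match anywhere else in the series, which is why the MP time series is a natural input for an N-sigma test. The quadratic loop is for clarity only.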
  • The method further includes: determining, according to a second mapping relationship and the first type to which the target time series belongs, the first-category decision model corresponding to the first type, where the second mapping relationship includes the correspondence between the plurality of types and a plurality of first-category decision models.
  • The first-category decision model corresponding to the first type is determined according to the second mapping relationship and the first type to which the target time series belongs, so that a corresponding decision model can be selected for the type of the target time series, which improves the accuracy of traffic anomaly detection.
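A mapping relationship between types and decision models can be sketched as a lookup table. Everything concrete here is an assumption for illustration: the table name `MODEL_BY_TYPE`, the use of N-sigma detectors as the decision models, and the per-type values of N.

```python
import statistics
from typing import Callable, Dict, List, Sequence

Detector = Callable[[Sequence[float]], List[int]]

def make_n_sigma(n: float) -> Detector:
    """Build an N-sigma detector with a fixed multiplier n."""
    def detect(values: Sequence[float]) -> List[int]:
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        return [i for i, v in enumerate(values)
                if std > 0 and abs(v - mean) > n * std]
    return detect

# Hypothetical mapping relationship: each type has its own decision model.
MODEL_BY_TYPE: Dict[str, Detector] = {
    "periodic stationary": make_n_sigma(3.0),
    "periodic glitch": make_n_sigma(4.0),
    "aperiodic stationary": make_n_sigma(3.0),
    "aperiodic glitch": make_n_sigma(4.0),
}

def detect_for_type(series_type: str, values: Sequence[float]) -> List[int]:
    """Look up the decision model for the type and run it on the series."""
    try:
        model = MODEL_BY_TYPE[series_type]
    except KeyError:
        raise ValueError(f"no decision model for type {series_type!r}") from None
    return model(values)
```

Keeping the mapping in a table separates classification from detection, so a model can be retuned for one type without affecting the others.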
  • a method for detecting traffic anomaly including: acquiring a target time series; the target time series includes N elements, and the N elements correspond to N times, wherein among the N elements Each element of is the traffic data received at the corresponding time; according to the target time sequence, the target parameter of the target time series is obtained, and the target parameter includes a period factor and/or jitter density, wherein the period The factor is used to represent a wave-shaped change around a long-term trend presented in the target time series, and the jitter density is used to represent the deviation of the actual value of the target time series from the target value within the target time;
  • the first parameter set to which the target parameter belongs is determined in the parameter set; the abnormal condition of the target time series is detected according to the first type of judgment model corresponding to the first parameter set, wherein the multiple parameter sets Each parameter set in is corresponding to a type of judgment model, and the judgment model is used for traffic abnormality detection.
  • the model detects traffic anomaly on the target time series, so the accuracy of traffic anomaly detection can be improved.
  • Obtaining the target parameter of the target time series according to the target time series includes: decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a first sub-time series that includes the N periodic components and a second sub-time series that includes the N residual components; and obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series.
  • Obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining, according to the first sub-time series, whether the period factor exists in the target time series.
  • Determining, according to the first sub-time series, whether the period factor exists in the target time series includes: when the N periodic components in the first sub-time series exist, determining that the period factor exists in the target time series; when the N periodic components in the first sub-time series do not exist, determining that the period factor does not exist in the target time series.
  • The method further includes: detecting anomalies in the target time series according to the second sub-time series and the second-category decision model corresponding to the first parameter set, where the second-category decision model is an N-sigma model.
  • The method further includes: determining, according to a fourth mapping relationship and the first parameter set to which the target parameter belongs, the second-category decision model corresponding to the first parameter set, where the fourth mapping relationship includes the correspondence between the plurality of parameter sets and the plurality of second-category decision models.
  • Obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining the jitter density of the target time series according to the second sub-time series.
  • Determining the jitter density of the target time series according to the second sub-time series includes: determining the jitter density of the target time series according to the following formula:
  • R is the jitter density;
  • r_n can be determined according to the following formula:
  • C_n is the n-th element in the second sub-time series;
  • x_n is the n-th element in the target time series;
  • N is determined according to the following formula:
  • T is the length of the target time series;
  • W is the window length of the joining window;
  • the ⁇ is the first preset value.
  • Detecting anomalies in the target time series according to the first-category decision model corresponding to the first parameter set includes: determining a third sub-time series that includes the N trend components; dividing a second time series into M subsequences of a target length, where M is a positive integer and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm; calculating the matrix profile (MP) values of the M subsequences of the target length, where these MP values form an MP time series; and detecting anomalies in the target time series according to the MP time series and the N-sigma algorithm.
  • First, the second time series is divided into M subsequences of the target length, where the second time series is either the third sub-time series or a series formed from the third sub-time series by the PLR algorithm. Next, the matrix profile (MP) values of the M subsequences of the target length are calculated. Finally, anomalies in the target time series are detected according to the MP time series and the N-sigma algorithm, which improves the accuracy of traffic anomaly detection.
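Forming a second time series from the trend components with a piecewise linear representation can be sketched as follows. The source does not say which PLR variant is used; this sketch assumes the simplest one, fixed-length segments each replaced by its least-squares line, and the names `plr_fixed` and `segment_len` are hypothetical.

```python
import numpy as np

def plr_fixed(series, segment_len):
    """Piecewise linear representation with fixed-length segments.

    Each segment is replaced by its least-squares line. Adaptive variants
    (sliding-window, top-down, bottom-up) choose breakpoints from the data;
    the fixed segmentation here is an illustrative simplification.
    """
    x = np.asarray(series, dtype=float)
    out = np.empty_like(x)
    for start in range(0, len(x), segment_len):
        seg = x[start:start + segment_len]
        if len(seg) < 2:
            out[start:start + len(seg)] = seg  # a single point is kept as-is
            continue
        t = np.arange(len(seg))
        slope, intercept = np.polyfit(t, seg, 1)
        out[start:start + len(seg)] = slope * t + intercept
    return out
```

The output has the same length as the input, so it can be passed directly to the subsequence division and matrix profile steps.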
  • The method further includes: determining, according to a fifth mapping relationship and the first parameter set to which the target parameter belongs, the first-category decision model corresponding to the first parameter set, where the fifth mapping relationship includes the correspondence between the plurality of parameter sets and the plurality of first-category decision models.
  • The first-category decision model corresponding to the first parameter set is determined, so that the decision model corresponding to the target time series can be obtained, which improves the accuracy of traffic anomaly detection.
  • A traffic anomaly detection method is provided, including: obtaining a target time series, where the target time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a third sub-time series that includes the N trend components; dividing a second time series into M subsequences of a target length, where M is a positive integer and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm; calculating the matrix profile (MP) values of the M subsequences of the target length, where these MP values form an MP time series; and detecting anomalies in the MP time series according to a third-category decision model.
  • First, the target time series is obtained and the second time series is divided into M subsequences of the target length, where the second time series is either the third sub-time series or a series formed from the third sub-time series by the PLR algorithm, and the third sub-time series is the time series formed by the trend components decomposed from each of the N elements in the target time series. Next, the matrix profile (MP) values of the M subsequences of the target length are calculated. Finally, anomalies in the target time series are detected according to the third-category decision model, which improves the accuracy of traffic anomaly detection.
  • The method further includes: determining a second sub-time series that includes the N residual components; and detecting anomalies in the target time series according to the second sub-time series and the third-category decision model.
  • The target length is specified by a communication protocol.
  • The third-category decision model is an N-sigma model.
  • A traffic pattern classification method is provided, including: obtaining a target time series, where the target time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; obtaining a target parameter of the target time series according to the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor represents a wave-like variation around the long-term trend of the target time series, and the jitter density represents the deviation of the actual values of the target time series from the target values within a target time; and classifying the target time series according to the target parameter.
  • Obtaining the target parameter of the target time series according to the target time series includes: decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a first sub-time series that includes the N periodic components and a second sub-time series that includes the N residual components; and obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series.
  • Obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining, according to the first sub-time series, whether the period factor exists in the target time series.
  • Determining, according to the first sub-time series, whether the period factor exists in the target time series includes: when the N periodic components in the first sub-time series exist, determining that the period factor exists in the target time series; when the N periodic components in the first sub-time series do not exist, determining that the period factor does not exist in the target time series.
  • Classifying the target time series according to the target parameter includes: when the period factor exists, determining that the target time series is periodic; when the period factor does not exist, determining that the target time series is aperiodic.
  • Obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining the jitter density of the target time series according to the second sub-time series.
  • Determining the jitter density of the target time series according to the residual components of the target time series includes: determining R of the target time series according to the following formula:
  • R is the jitter density;
  • r_n can be determined according to the following formula:
  • C_n is the n-th element in the second sub-time series;
  • x_n is the n-th element in the target time series;
  • N is determined according to the following formula:
  • T is the length of the target time series;
  • W is the window length of the joining window;
  • the ⁇ is the first preset value.
  • Classifying the target time series according to the target parameter includes: when the jitter density is greater than a second preset value, determining that the target time series is of the glitch type; when the jitter density is less than or equal to the second preset value, determining that the target time series is of the stationary type.
  • A method for training a traffic anomaly detection model is provided, including: obtaining a first time series, where the first time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; obtaining a first type of the first time series according to an original classification model of the first time series; performing, according to the first-category decision model corresponding to the first type, traffic anomaly detection on the first time series of the first type to obtain first data, where the first data is the anomalous points of the first time series; obtaining second data, where the second data is the original anomalous points of the first time series; and adjusting the parameters of the first-category decision model according to the first data and the second data to obtain a first target decision model.
  • Multiple first time series may be obtained, and the first target decision model can be trained on them.
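Adjusting a decision model's parameters by comparing detected anomalies with original anomalies can be sketched as a small search. This assumes the decision model is an N-sigma detector whose only parameter is the multiplier n, and that candidate values are scored by mean F1 over the training series; the source states neither the model parameters nor the adjustment rule, and the names `detect`, `f1_score`, and `fit_n` are hypothetical.

```python
import numpy as np

def detect(values, n):
    """N-sigma detection: the set of indices more than n std devs from the mean."""
    v = np.asarray(values, dtype=float)
    std = v.std()
    if std == 0.0:
        return set()
    return set(np.flatnonzero(np.abs(v - v.mean()) > n * std).tolist())

def f1_score(detected, labeled):
    """F1 between detected anomalies and original (labeled) anomalies."""
    tp = len(detected & labeled)
    if tp == 0:
        return 0.0
    precision = tp / len(detected)
    recall = tp / len(labeled)
    return 2 * precision * recall / (precision + recall)

def fit_n(samples, candidates=(2.0, 2.5, 3.0, 3.5, 4.0)):
    """Pick the multiplier n with the best mean F1.

    samples: list of (series, labeled_anomaly_indices) pairs.
    """
    def mean_f1(n):
        return sum(f1_score(detect(s, n), set(l)) for s, l in samples) / len(samples)
    return max(candidates, key=mean_f1)
```

With several labeled series in `samples`, the selected multiplier reflects all of them, which mirrors training the target decision model on multiple first time series.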
  • The first type of the first time series is periodic, aperiodic, glitch, stationary, periodic stationary, aperiodic stationary, periodic glitch, or aperiodic glitch.
  • A method for training a traffic anomaly detection model is provided, including: obtaining a first time series, where the first time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; obtaining a first parameter set of the first time series according to an original parameter model of the first time series; performing, according to the first-category decision model corresponding to the first parameter set, traffic anomaly detection on the first time series to obtain fourth data, where the fourth data is the anomalous points of the first time series; obtaining second data, where the second data is the original anomalous points of the first time series; and adjusting the parameters of the first-category decision model according to the second data and the fourth data to obtain a first target decision model.
  • Multiple first time series may be obtained, and the first target decision model can be trained on them.
  • The first type of the first time series is periodic, aperiodic, glitch, stationary, periodic stationary, aperiodic stationary, periodic glitch, or aperiodic glitch.
  • A method for training a traffic anomaly detection model is provided, including: obtaining a first time series, where the first time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; processing the first time series to obtain a third sub-time series, where the third sub-time series is the time series composed of the trend components decomposed from each of the N elements in the first time series; performing, according to a fourth-category decision model, traffic anomaly detection on the first time series to obtain third data, where the third data is the anomalous points in the first time series; obtaining second data, where the second data is the original anomalous points in the first time series; and adjusting the parameters of the fourth-category decision model according to the second data and the third data to obtain a second target decision model.
  • Multiple first time series may be obtained, and the second target decision model can be trained on them.
  • A method for training a traffic pattern classification model is provided, including: obtaining a first time series, where the first time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; obtaining a first type of the first time series according to an original classification model of the first time series; obtaining the original type of the first time series; and adjusting the parameters of the original classification model of the first time series according to the original type and the first type of the first time series to obtain a target classification model of the first time series.
  • The first type of the first time series is periodic, aperiodic, glitch, stationary, periodic stationary, aperiodic stationary, periodic glitch, or aperiodic glitch.
  • A traffic anomaly detection apparatus is provided, including a memory for storing a program and a processor for executing the program stored in the memory. When the processor executes the program stored in the memory, the processor is configured to: obtain a target time series, where the target time series includes N elements corresponding to N time instants, and each of the N elements is the traffic data received at the corresponding time instant; obtain a target parameter of the target time series according to the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor represents a wave-like variation around the long-term trend of the target time series, and the jitter density represents the deviation of the actual values of the target time series from the target values within a target time; determine, according to the target parameter, a first type to which the target time series belongs from a plurality of types, where each of the plurality of types corresponds to a parameter set and the target parameter belongs to the parameter set corresponding to the first type; and detect anomalies in the target time series according to the first-category decision model corresponding to the first type, where each of the plurality of types corresponds to a decision model, and the decision model is used for traffic anomaly detection.
  • The processor is further configured to: decompose each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determine a first sub-time series that includes the N periodic components and a second sub-time series that includes the N residual components; and obtain the target parameter of the target time series according to the first sub-time series or the second sub-time series.
  • The processor is further configured to determine, according to the first sub-time series, whether the period factor exists in the target time series.
  • The processor is further specifically configured to: when the N periodic components in the first sub-time series exist, determine that the period factor exists in the target time series; when the N periodic components in the first sub-time series do not exist, determine that the period factor does not exist in the target time series.
  • The processor is further configured to: determine, according to a first mapping relationship and the first type to which the target time series belongs, a second-category decision model corresponding to the first type, where the first mapping relationship includes the correspondence between the plurality of types and a plurality of second-category decision models; and detect anomalies in the target time series according to the second sub-time series and the second-category decision model corresponding to the first type, where the second-category decision model is an N-sigma model.
  • the processor is further configured to: determine the jitter density of the target time sequence according to the second sub-time sequence.
  • the processor is further specifically configured to determine R of the target time series according to the following formula:
  • the R is the jitter density
  • r_n can be determined according to the following formula:
  • C_n is the n-th element in the second sub-time sequence
  • x_n is the n-th element in the target time sequence
  • the N is determined according to the following formula:
  • the T is the length of the target time series
  • the W is the window length of the joining window
  • the ⁇ is the first preset value.
  • the processor is further configured to: determine, according to the target parameter, the first parameter set to which the target parameter belongs from the multiple parameter sets; and determine, according to the third mapping relationship and the first parameter set, the first type to which the target time series belongs from the multiple types, where the third mapping relationship includes the correspondence between the multiple parameter sets and the multiple types.
  • the above-mentioned types may include: periodic, aperiodic, stable, glitch, periodic stabilization, periodic glitch, aperiodic stabilization, or aperiodic glitch.
  • the periodic and aperiodic types can be determined according to the periodic factor. Specifically, when the periodic factor exists, the type is periodic; when the periodic factor does not exist, the type is aperiodic.
  • the stable type and the glitch type can be determined according to the jitter density. Specifically, when the jitter density is greater than the second preset value, the type is the glitch type; when the jitter density is less than or equal to the second preset value, the type is the stable type.
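The two rules above can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name `classify_traffic_pattern`, its parameters, and the default `second_preset_value` of 0.5 are hypothetical and do not come from the application, which does not give a concrete value for the second preset value.

```python
from typing import Optional


def classify_traffic_pattern(
    has_period_factor: Optional[bool] = None,
    jitter_density: Optional[float] = None,
    second_preset_value: float = 0.5,  # hypothetical threshold, not from the source
) -> str:
    """Combine the period-factor rule and the jitter-density rule into a type name."""
    parts = []
    if has_period_factor is not None:
        # Period factor present -> periodic, absent -> aperiodic.
        parts.append("periodic" if has_period_factor else "aperiodic")
    if jitter_density is not None:
        # Strictly greater than the preset value -> glitch, otherwise stable.
        parts.append("glitch" if jitter_density > second_preset_value else "stable")
    if not parts:
        raise ValueError("at least one target parameter is required")
    return " ".join(parts)


print(classify_traffic_pattern(True, 0.9))
print(classify_traffic_pattern(False, 0.2))
```

Passing only one of the two parameters yields one of the four single-word types, which mirrors the "and/or" wording used for the target parameters.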
  • the processor is further specifically configured to: determine a third sub-time series including the N trend components; divide a second time series into M subsequences of target length, where M is a positive integer, and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm; calculate the matrix profile (MP) values of the M subsequences of target length, where these MP values constitute an MP time series; and detect the abnormal situation of the target time series according to the MP time series and the N-sigma algorithm.
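The subsequence step can be illustrated with a deliberately simplified sketch. It assumes non-overlapping subsequences and plain Euclidean distance, whereas matrix profile implementations commonly use z-normalized distances over sliding windows; the application does not say which variant is intended. The function names `split_into_subsequences` and `nearest_neighbor_profile` are hypothetical.

```python
import math


def split_into_subsequences(series, target_length):
    """Split a series into M non-overlapping subsequences; a trailing remainder is dropped."""
    m = len(series) // target_length
    return [series[i * target_length:(i + 1) * target_length] for i in range(m)]


def nearest_neighbor_profile(subsequences):
    """For each subsequence, return the distance to its nearest other subsequence."""
    profile = []
    for i, a in enumerate(subsequences):
        best = math.inf
        for j, b in enumerate(subsequences):
            if i == j:
                continue  # skip the trivial self-match
            best = min(best, math.dist(a, b))
        profile.append(best)
    return profile


# Hypothetical trend values: a repeating shape with one distorted segment.
trend = [1, 2, 3, 4] * 3 + [1, 2, 30, 4] + [1, 2, 3, 4] * 2
profile = nearest_neighbor_profile(split_into_subsequences(trend, 4))
print(profile)
```

A subsequence that resembles no other one receives a large profile value, so the resulting profile series is what a threshold rule such as N-sigma would then examine.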
  • the processor is further configured to: determine, according to the second mapping relationship and the first type to which the target time series belongs, the first type of determination model corresponding to the first type, where the second mapping relationship includes the correspondence between the multiple types and multiple first-type determination models.
  • a flow abnormality detection device, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a target time sequence, where the target time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; obtain, according to the target time series, the target parameters of the target time series, where the target parameters include a period factor and/or a jitter density, the period factor represents a wave-shaped change around a long-term trend in the target time series, and the jitter density indicates the deviation of the actual value of the target time series from the target value within the target time; determine, from a plurality of parameter sets, the first parameter set to which the target parameter belongs; and detect the abnormal situation of the target time series according to the first type of determination model corresponding to the first parameter set, where each parameter set in the plurality of parameter sets corresponds to a type of determination model, and the determination model is used for flow abnormality detection.
  • the processor is further configured to: decompose each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determine a first sub-time sequence including the N periodic components and a second sub-time sequence including the N residual components; and obtain the target parameter of the target time series according to the first sub-time sequence or the second sub-time sequence.
  • the processor is further configured to: determine whether the period factor exists in the target time sequence according to the first sub-time sequence.
  • the processor is further specifically configured to: in the case where the N periodic components in the first sub-time sequence exist, determine that the period factor exists in the target time sequence; in the case where the N periodic components in the first sub-time sequence do not exist, determine that the period factor does not exist in the target time sequence.
  • the processor is further configured to: detect the abnormal situation of the target time series according to the second sub-time sequence and the second type of judgment model corresponding to the first parameter set, where the second type of judgment model is the N-sigma model.
  • the processor is further configured to: determine, according to the fourth mapping relationship and the first parameter set to which the target parameter belongs, the second type of determination model corresponding to the first parameter set, where the fourth mapping relationship includes the correspondence between the plurality of parameter sets and a plurality of second-type determination models.
  • the processor is further configured to: determine the jitter density of the target time sequence according to the second sub-time sequence.
  • the processor is further configured to determine the jitter density of the target time series according to the following formula:
  • the R is the jitter density
  • r_n can be determined according to the following formula:
  • C_n is the n-th element in the second sub-time sequence
  • x_n is the n-th element in the target time sequence
  • the N is determined according to the following formula:
  • the T is the length of the target time series
  • the W is the window length of the joining window
  • the ⁇ is the first preset value.
  • the processor is further specifically configured to: determine a third sub-time series including the N trend components; divide a second time series into M subsequences of target length, where M is a positive integer, and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm; calculate the matrix profile (MP) values of the M subsequences of target length, where these MP values constitute an MP time series; and detect the abnormal situation of the target time series according to the MP time series and the N-sigma algorithm.
  • the processor is further specifically configured to: determine, according to the fifth mapping relationship and the first parameter set to which the target parameter belongs, the first type of determination model corresponding to the first parameter set, where the fifth mapping relationship includes the correspondence between the plurality of parameter sets and a plurality of first-type determination models.
  • a flow abnormality detection device, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a target time sequence, where the target time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; decompose each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determine a third sub-time series that includes the N trend components; divide a second time series into M subsequences of target length, where M is a positive integer, and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm; calculate the matrix profile (MP) values of the M subsequences of target length, where these MP values form the MP time series
  • the processor is further configured to: determine a second sub-time sequence including the N residual components; and detect the abnormal situation of the target time series according to the second sub-time sequence and the third type of judgment model.
  • the target length is specified by the communication protocol.
  • the third type of judgment model is an N-sigma model.
  • a device for classifying traffic patterns, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a target time sequence, where the target time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; obtain, according to the target time series, the target parameters of the target time series, where the target parameters include a period factor and/or a jitter density, the period factor represents a wave-shaped change around a long-term trend in the target time series, and the jitter density indicates the deviation of the actual value of the target time series from the target value within the target time; and classify the target time series according to the target parameter.
  • the processor is further specifically configured to: decompose each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determine a first sub-time sequence including the N periodic components and a second sub-time sequence including the N residual components; and obtain the target parameters of the target time series according to the first sub-time sequence or the second sub-time sequence.
  • the processor is further specifically configured to: determine whether the period factor exists in the target time sequence according to the first sub-time sequence.
  • the processor is further specifically configured to: in the case where the N periodic components in the first sub-time sequence exist, determine that the period factor exists in the target time sequence; in the case where the N periodic components in the first sub-time sequence do not exist, determine that the period factor does not exist in the target time sequence.
  • the processor is further specifically configured to: when the period factor exists, determine that the target time sequence is of the periodic type; when the period factor does not exist, determine that the target time sequence is of the aperiodic type.
  • the processor is further specifically configured to determine the jitter density of the target time series according to the second sub-time series.
  • the processor is further specifically configured to determine the R of the target time series according to the following formula:
  • the R is the jitter density
  • r_n can be determined according to the following formula:
  • C_n is the n-th element in the second sub-time sequence
  • x_n is the n-th element in the target time sequence
  • the N is determined according to the following formula:
  • the T is the length of the target time series
  • the W is the window length of the joining window
  • the ⁇ is the first preset value.
  • the processor is further specifically configured to: when the jitter density is greater than a second preset value, determine that the target time sequence is of the glitch type; when the jitter density is less than or equal to the second preset value, determine that the target time sequence is of the stable type.
  • a traffic anomaly detection model training device, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a first time sequence, where the first time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; obtain the first type of the first time series according to the original classification model of the first time series; perform traffic abnormality detection processing on the first time series according to the first type of judgment model corresponding to the first type to obtain first data, where the first data is an abnormal point in the first time series; obtain second data, where the second data is an original abnormal point of the first time series; and adjust the parameters of the first type of judgment model according to the first data and the second data to obtain the first target judgment model.
  • the processor may obtain multiple first time sequences, and train the first target determination model according to the multiple first time sequences.
  • the first type of the first time series is the periodic, aperiodic, glitch, stable, periodic stable, aperiodic stable, periodic glitch, or aperiodic glitch type.
  • a traffic anomaly detection model training device, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a first time sequence, where the first time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; obtain the first parameter set of the first time series according to the original parameter model of the first time series; perform traffic abnormality detection processing on the first time series according to the first type of judgment model corresponding to the first parameter set to obtain fourth data, where the fourth data is an abnormal point in the first time series; obtain second data, where the second data is an original abnormal point of the first time series; and adjust the parameters of the first type of determination model according to the second data and the fourth data to obtain a first target determination model.
  • the processor may obtain multiple first time sequences, and train the first target determination model according to the multiple first time sequences.
  • the first type of the first time series is the periodic, aperiodic, glitch, stable, periodic stable, aperiodic stable, periodic glitch, or aperiodic glitch type.
  • a traffic anomaly detection model training device, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a first time sequence, where the first time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; process the first time series to obtain a third sub-time series, where the third sub-time series is the time series composed of the trend components obtained by decomposing each of the N elements in the first time series; perform traffic anomaly detection processing on the first time series according to the fourth type of judgment model to obtain third data, where the third data is an abnormal point of the first time series; obtain second data, where the second data is an original abnormal point of the first time series; and adjust the parameters of the fourth type of judgment model according to the second data and the third data to obtain the second target judgment model.
  • the processor may obtain multiple first time sequences, and train the second target determination model according to the multiple first time sequences.
  • a traffic pattern classification model training device, including: a memory for storing a program; and a processor for executing the program stored in the memory, where, when the processor executes the program stored in the memory, the processor is configured to: obtain a first time sequence, where the first time sequence includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; obtain the first type of the first time series according to the original classification model of the first time series; obtain the original type of the first time series; and adjust the parameters of the original classification model of the first time series according to the original type and the first type of the first time series to obtain the target classification model of the first time series.
  • the first type of the first time series is the periodic, aperiodic, glitch, stable, periodic stable, aperiodic stable, periodic glitch, or aperiodic glitch type.
  • a computer storage medium is provided, where the computer storage medium stores program code, and the program code includes instructions for executing the method in any one of the foregoing first to eighth aspects.
  • An eighteenth aspect provides a computer program product containing instructions, which when the computer program product runs on a computer, causes the computer to execute the method in any one of the above-mentioned first to eighth aspects.
  • In a nineteenth aspect, a chip is provided. The chip includes a processor and a data interface; the processor reads, through the data interface, instructions stored in a memory, and executes the method in any one of the possible implementations of the first to eighth aspects above.
  • the chip may further include a memory in which instructions are stored, and the processor is configured to execute the instructions stored in the memory.
  • the processor is configured to execute the method in any one of the possible implementation manners of the first aspect to the eighth aspect.
  • the aforementioned chip may specifically be a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a method 200 for classifying traffic patterns according to an embodiment of the present application.
  • FIG. 3 shows the network traffic sequence of four types of network devices.
  • FIG. 4 is a schematic flowchart of a method 400 for detecting abnormal traffic according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram after processing the time sequence according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of another method 600 for detecting abnormal traffic according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of abnormal baseline traffic in an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another method 800 for detecting abnormal traffic according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method 900 for training a traffic pattern classification model provided by an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of another method 1000 for training a traffic anomaly detection model provided by an embodiment of the present application.
  • FIG. 11 is a schematic flowchart of another method 1100 for training a traffic anomaly detection model provided by an embodiment of the present application.
  • FIG. 12 is a schematic flowchart of another method 1200 for training a traffic anomaly detection model provided by an embodiment of the present application.
  • FIG. 13 is a schematic block diagram of a traffic pattern classification device 1300 provided by an embodiment of the present application.
  • FIG. 14 is a schematic block diagram of a flow abnormality detection device 1400 provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of the hardware structure of a traffic pattern classification model training apparatus 1500 provided by an embodiment of the present application.
  • FIG. 16 is a schematic diagram of the hardware structure of a traffic anomaly detection model training device 1600 provided by an embodiment of the present application.
  • a time series is a sequence of data points arranged in time order. The time interval within a time series is usually constant, so the time series can be analyzed and processed as discrete-time data. Anomaly detection on a time series usually means finding data points that lie far from a relatively established pattern or distribution. Time series anomalies include sudden rises, sudden drops, and changes in the mean. Time series anomaly detection algorithms include algorithms based on statistics and data distribution (N-sigma), algorithms based on distance or density (the local outlier factor algorithm), isolation forest, and prediction-based algorithms (ARIMA).
  • Anomaly detection is performed on the flow data collected from devices or ports in the network. Anomaly detection results provide a basis for discovering network attacks, configuration errors, and network device failures.
  • x_t is the flow data at time t
  • μ is the mean value of the normal distribution
  • σ² is the variance of the normal distribution
  • the mean and variance can be estimated using n historical flow data (x_1, x_2, ..., x_t, ..., x_n) in the time window, and the estimation is as follows:
  • the data flow x_t to be detected is an abnormal point, namely:
  • Y is the preset value.
  • the data flow x_t to be detected is a normal point, namely:
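As a minimal sketch of the N-sigma rule described here, the code below assumes the usual form of the rule: the mean and standard deviation are estimated from the n historical points in the window, and a point is abnormal when it lies more than Y standard deviations from the mean. The function name `n_sigma_is_anomaly`, the default `y=3.0`, and the use of the population standard deviation (dividing by n) are assumptions rather than details confirmed by the text.

```python
import statistics


def n_sigma_is_anomaly(history, x_t, y=3.0):
    """Return True if x_t deviates from the window mean by more than y standard deviations."""
    mu = statistics.fmean(history)      # estimated mean of the window
    sigma = statistics.pstdev(history)  # population standard deviation (divides by n)
    return abs(x_t - mu) > y * sigma


# Hypothetical per-second traffic volumes in MB.
history = [10, 12, 11, 9, 10, 11, 10, 9]
print(n_sigma_is_anomaly(history, 30))
print(n_sigma_is_anomaly(history, 11))
```

A larger Y makes the detector more tolerant of fluctuation, so Y is the natural parameter to tune per traffic type.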
  • FIG. 1 is a schematic diagram of the system architecture of an embodiment of the present application.
  • the system architecture 100 includes an execution device 110, a training device 120, a database 130, a network device 140, a data storage system 150, and a data collection device 160.
  • the execution device 110 includes a calculation module 111, an I/O interface 112, a preprocessing module 113, and a preprocessing module 114.
  • the calculation module 111 may include the target model/rule 101, and the preprocessing module 113 and the preprocessing module 114 are optional.
  • the data collection device 160 is used to collect training data.
  • the training data may include a first time series, where the first time series includes N elements, the N elements correspond to N time instants, and each of the N elements is the traffic data received at the corresponding moment.
  • the data collection device 160 stores the training data in the database 130, and the training device 120 trains to obtain the target model/rule 101 based on the training data maintained in the database 130.
  • the training device 120 performs traffic anomaly detection on the first time series and compares the traffic anomaly detection result it outputs with the original traffic anomaly result of the first time series, repeating until the difference between the two is less than a certain threshold, thereby completing the training of the target model/rule 101. The original traffic anomaly result of the first time series is obtained by an operator through analysis of the first time series.
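One hypothetical way to picture this training loop is a search over a single detector parameter until the detected abnormal points agree with the operator's labels. The sketch assumes an N-sigma detector whose only adjustable parameter is the multiplier; the functions `detect` and `tune_multiplier`, the candidate values, and the mismatch measure (size of the symmetric difference) are all illustrative choices, since the text does not specify how parameters are adjusted.

```python
import statistics


def detect(series, y):
    """Indices whose value lies more than y standard deviations from the series mean."""
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series)
    return {i for i, x in enumerate(series) if abs(x - mu) > y * sigma}


def tune_multiplier(series, labeled_anomalies,
                    candidates=(1.0, 1.5, 2.0, 2.5, 3.0), max_mismatch=0):
    """Try candidate multipliers; stop once the mismatch with the labels is small enough."""
    labels = set(labeled_anomalies)
    best_y, best_err = None, None
    for y in candidates:
        err = len(detect(series, y) ^ labels)  # points flagged wrongly or missed
        if best_err is None or err < best_err:
            best_y, best_err = y, err
        if err <= max_mismatch:
            break
    return best_y, best_err


series = [10, 11, 10, 9, 10, 50, 10, 11, 9, 10]
print(tune_multiplier(series, {5}))
```

The stopping condition plays the role of the "certain threshold" in the text: training ends once the detector output is close enough to the labeled result.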
  • the above-mentioned target model/rule 101 can be used to implement the method of traffic anomaly detection in the embodiment of the present application, that is, input the target time series (after relevant preprocessing) into the target model/rule 101 to obtain the detection of the target time series result.
  • the first time series maintained in the database 130 may not all come from the collection of the data collection device 160, and may also be received from other devices.
  • the training device 120 does not necessarily train the target model/rule 101 entirely on the first time sequence maintained by the database 130. It may also obtain the first time sequence from the cloud or elsewhere for model training. The above description should not be taken as a limitation on the embodiments of this application.
  • the target model/rule 101 obtained by training according to the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 1, which can be a server or a cloud.
  • the execution device 110 is configured with an input/output (input/output, I/O) interface 112 for data interaction with external devices.
  • the network device 140 inputs data to the I/O interface 112; in the embodiments of the present application, the input data may include the target time sequence input by the network device.
  • the network device 140 here may specifically be a terminal device.
  • the preprocessing module 113 and the preprocessing module 114 are used to perform preprocessing according to the input data (such as the target time series) received by the I/O interface 112.
  • the preprocessing module 113 and the preprocessing module 114 may be omitted, or only one preprocessing module may be present, in which case the calculation module 111 processes the input data directly.
  • the execution device 110 may call data, code, and the like in the data storage system 150 for corresponding processing.
  • the data, instructions, etc. obtained by corresponding processing may also be stored in the data storage system 150.
  • the I/O interface 112 presents the processing result, such as the detection result of the target time series obtained above, to the network device 140, so as to provide it to the user.
  • the training device 120 can generate corresponding target models/rules 101 based on different training data for different goals or tasks, and the corresponding target models/rules 101 can be used to achieve the above goals or complete the above tasks, thereby providing the user with the desired result.
  • the network device 140 can automatically send input data to the I/O interface 112.
  • the user can view the result output by the execution device 110 on the network device 140, and the specific presentation form may be a specific manner such as display, sound, and action.
  • the network device 140 can also be used as a data collection terminal to collect the input data input to the I/O interface 112 and the output result output by the I/O interface 112, as shown in FIG. 1, as new sample data and store it in the database 130.
  • alternatively, the I/O interface 112 may directly store the input data input to the I/O interface 112 and the output result output by the I/O interface 112, as shown in the figure, in the database 130 as new sample data.
  • FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, components, modules, and the like shown in the figure does not constitute any limitation.
  • the data storage system 150 is an external memory relative to the execution device 110. In other cases, the data storage system 150 may also be placed in the execution device 110.
  • the method shown in FIG. 2 may be executed by a traffic pattern classification device, and the traffic pattern classification device may be a server or a cloud with a traffic pattern classification function.
  • the method 200 shown in FIG. 2 includes step 210 to step 230. These steps are described in detail below.
  • the target time sequence includes N elements, and the N elements correspond to N times, where each element of the N elements is traffic data received at a corresponding time.
  • the foregoing target time sequence may be sent by the first network device.
  • the server may also obtain time series to be checked for traffic anomalies other than the target time series. These other time series may come from the same network device, which may be the first network device or any network device other than the first network device. Alternatively, the other time series may come from different network devices. This application does not limit this.
  • each of the N elements is the traffic data received at the corresponding moment, which can be understood as the traffic data of the first network device per unit time (one second) collected by the server; that is, for each of the N elements, the server collects the traffic data of the first network device every 1s.
  • the target time series may include 4 elements, that is, the server needs to collect traffic data of the first network device every 1s to obtain the target time series collected within 4s.
  • the target time series may be {1MB, 2MB, 5MB, 9MB}, where 1MB is the traffic data of the first network device in the 1st second, 2MB is the traffic data in the 2nd second, 5MB is the traffic data in the 3rd second, and 9MB is the traffic data in the 4th second.
  • the target time series can also be {1.2MB, 2MB, 3MB, 5MB}, where 1.2MB is the traffic data of the first network device in the 1st second, 2MB is the traffic data in the 2nd second, 3MB is the traffic data in the 3rd second, and 5MB is the traffic data in the 4th second.
  • each of the N elements is the traffic data received at the corresponding moment, which can also be understood as the total traffic data of the first network device within a preset time collected by the server, where the preset time is greater than Unit of time.
  • the target time series may include 6 elements, that is, the server needs to collect the traffic data of the first network device every preset time to obtain the target time series within 6 preset times.
  • the preset time is 10s, that is, the server collects the traffic data of the first network device every 10s to obtain the target time series collected within 1 minute.
  • the target time series can be {1MB, 2.5MB, 5MB, 8MB, 12MB, 18MB}, where 1MB is the traffic data of the first network device in 1s-10s, 2.5MB is the traffic data in 11s-20s, 5MB is the traffic data in 21s-30s, 8MB is the traffic data in 31s-40s, 12MB is the traffic data in 41s-50s, and 18MB is the traffic data in 51s-60s.
  • the target time sequence can also be {1MB, 2MB, 3.5MB, 6MB, 8MB}, where 1MB is the traffic data of the first network device in 1s-5s, 2MB is the traffic data in 6s-10s, 3.5MB is the traffic data in 11s-15s, 6MB is the traffic data in 16s-20s, and 8MB is the traffic data in 21s-25s.
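The preset-time examples above amount to summing per-second traffic over fixed-length intervals. The sketch below illustrates that aggregation; the function name `build_target_time_series`, the use of kilobytes as integers, and the choice to drop an incomplete trailing interval are assumptions made for the example.

```python
def build_target_time_series(per_second_kb, preset_seconds):
    """Sum per-second traffic into consecutive intervals of preset_seconds each."""
    if preset_seconds <= 0:
        raise ValueError("preset_seconds must be positive")
    last_start = len(per_second_kb) - preset_seconds
    # An incomplete trailing interval is dropped (an assumption of this sketch).
    return [
        sum(per_second_kb[start:start + preset_seconds])
        for start in range(0, last_start + 1, preset_seconds)
    ]


# Hypothetical samples: 100 KB/s for 10 s, then 250 KB/s for 10 s, then 3 stray seconds.
samples = [100] * 10 + [250] * 10 + [7] * 3
print(build_target_time_series(samples, 10))
```

With a preset time of 1, the function returns the per-second series unchanged, which corresponds to the unit-time case described earlier.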
  • the target parameters include a period factor and/or a jitter density, where the period factor represents a wave-shaped change around a long-term trend in the target time series, and the jitter density indicates the deviation of the actual value of the target time series from the target value within the target time.
  • each of the N elements in the target time sequence into a trend component, a periodic component, and a residual component; determine the first sub-time sequence including the N periodic components and the second sub-time sequence of the N residual components Time sequence, and obtain the target parameter of the target time sequence according to the first sub-time sequence or the second sub-time sequence.
  • the N periodic components decomposed by each of the N elements in the target time series are the same, that is, every periodic component decomposed by each element in a time series is the same.
  • $x_n$ is an element in the target time series, and $x_n$ is decomposed into a trend component $T_n$, a periodic component $S_n$, and a residual component $C_n$; $S_n$ is an element in the first sub-time series, and $C_n$ is an element in the second sub-time series.
  • $x_n$ can be expressed as the following formula:
  • the server may decompose each of the N elements in the target time series into a trend component, a periodic component, and a residual component through a Time Series Decompose (TSD) algorithm.
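  • according to the N periodic components in the first sub-time sequence, it is determined whether the target time sequence has a periodic factor.
  • in the case that the N periodic components in the first sub-time sequence exist, it is determined that the target time sequence has a periodic factor; in the case that the N periodic components in the first sub-time sequence do not exist, it is determined that the target time sequence has no periodic factor.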
  • the existence of the foregoing N periodic components can be understood as the N periodic components are all valid values; for example, the periodic component may be 1, and the periodic component may also be 3.
  • the absence of the foregoing N periodic components can be understood as the N periodic components are all invalid values; for example, the periodic component may be zero.
  • that the N periodic components are all valid values can be understood as meaning that the N periodic components obtained by decomposing the time series are all non-zero values; that the N periodic components are all invalid values can be understood as meaning that the N periodic components obtained by decomposing the time series are all 0 values.
  • the time series shown in Figure 3(c) is a periodic time series. By decomposing the time series shown in Figure 3(c), the N periodic components of the time series are all obtained as non-zero values, that is, the N periodic components of the time series are valid values, that is, the periodic components exist.
  • the time series shown in Figure 3(b) is a non-periodic time series. By decomposing the time series shown in Figure 3(b), the N periodic components of the time series are all obtained as 0 values, that is, the N periodic components of the time series are invalid values, that is, the periodic components do not exist.
  • the periodic factor can be set to 1; if the target time series does not have a periodic factor, the periodic factor can be set to 0.
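The rule above can be sketched as follows, taking the periodic components as already produced by the decomposition. The function name `period_factor` and the tolerance `tol` are assumptions for illustration; the text only defines the all-valid and all-invalid cases, so treating a mixed sequence as "no periodic factor" is also an assumption.

```python
from typing import Sequence


def period_factor(periodic_components: Sequence[float], tol: float = 1e-9) -> int:
    """Return 1 if all N periodic components are valid (non-zero), else 0."""
    if not periodic_components:
        return 0
    # Assumption: a component counts as zero if its magnitude is within tol.
    all_valid = all(abs(s) > tol for s in periodic_components)
    return 1 if all_valid else 0


# Hypothetical periodic components of two decomposed series.
factor_periodic = period_factor([0.5, 2.0, 0.5, 2.0])
factor_aperiodic = period_factor([0.0, 0.0, 0.0, 0.0])
```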
  • the server determines the R of the target time series according to the following formula:
  • R is the jitter density
  • $r_n$ can be determined according to the following formula:
  • $C_n$ is the nth element in the second sub-time series
  • $x_n$ is the nth element in the target time series
  • N is determined according to the following formula:
  • T is the length of the target time series
  • W is the length of the window used for windowing
  • is the first preset value
  • when the periodic factor exists, the target time series is determined to be of the periodic type; when the periodic factor does not exist, the target time series is determined to be of the aperiodic type.
  • when the jitter density is greater than the second preset value, the target time series is determined to be of the glitch type; when the jitter density is less than or equal to the second preset value, the target time series is determined to be of the stationary type.
  • the following 8 situations can be obtained by classifying the target sequence, where S is the period factor and R is the jitter density:
  • the first type to which the target time series belongs is the glitch type
  • the first type to which the target time series belongs is the periodic glitch type
  • the first type to which the target time series belongs is aperiodic glitch type
  • the first type to which the target time series belongs is the periodic stationary type
  • the first type to which the target time series belongs is the periodic glitch type.
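The combined cases can be sketched as a small function of the period factor S and the jitter density R. The name `classify` and the string labels are illustrative assumptions; the comparison against the second preset value follows the rule stated above.

```python
def classify(period_factor: int, jitter_density: float, second_preset: float) -> str:
    """Combine the period factor S and jitter density R into one type label."""
    # S == 1 means the periodic factor exists; S == 0 means it does not.
    periodicity = "periodic" if period_factor == 1 else "aperiodic"
    # R greater than the second preset value means glitch; otherwise stationary.
    shape = "glitch" if jitter_density > second_preset else "stationary"
    return f"{periodicity} {shape}"


# Hypothetical inputs: S = 1, R = 4, second preset value = 2.
label = classify(1, 4.0, 2.0)
```

Classifying on S alone or on R alone corresponds to using only the first or only the second word of the label.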
  • the method 200 may further include step 230.
  • output that the target time series is of the periodic or aperiodic type; or output that the target time series is of the stationary or glitch type; or output that the target time series is of the periodic glitch, periodic stationary, aperiodic glitch, or aperiodic stationary type.
  • For example, Figure 3 shows the network traffic sequences of 4 types of network devices. It can be seen from Figure 3(a), Figure 3(b), Figure 3(c) and Figure 3(d) that the characteristics of network traffic vary greatly.
  • the traffic of the network device shown in Figure 3(a) is relatively stable at all times, that is, the traffic sequence can be of the stationary type;
  • the traffic of the network device shown in Figure 3(b) has obvious local jitter, that is, the traffic sequence can be of the glitch type;
  • the traffic of a network device exhibits extremely strong periodic characteristics (daily, weekly), that is, the traffic sequence can be of the periodic type;
  • the periodic characteristics (daily, weekly) of the traffic of a network device are not obvious, that is, the traffic sequence can be of the aperiodic type.
  • the method shown in FIG. 4 may be executed by a flow anomaly detection device, and the flow anomaly detection device may be a server with a flow anomaly detection function.
  • The method shown in FIG. 4 includes step 410 to step 440. These steps are described in detail below.
  • the abnormal situation of the time series described in the embodiments of the present application can be understood as the abnormal situation of the traffic of the time series.
  • the target time sequence includes N elements, and the N elements correspond to N times, where each element of the N elements is traffic data received at a corresponding time.
  • the foregoing target sequence may be sent by the first network device.
  • the server may also obtain time series to be detected for traffic anomalies other than the target time series, where these other time series may come from the same network device, and that network device may be the first network device or any network device other than the first network device. Alternatively, the other time series to be detected for traffic anomalies may come from different network devices.
  • the target parameters include a period factor and/or a jitter density, where the period factor is used to represent a wave-shaped change around a long-term trend presented in the target time series, and the jitter density is used to indicate the deviation of the actual value of the target time series from the target value within the target time.
  • decompose each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determine a first sub-time sequence including the N periodic components and a second sub-time sequence including the N residual components; and obtain the target parameters of the target time sequence according to the first sub-time sequence or the second sub-time sequence.
  • $x_n$ is an element in the target time series, and $x_n$ is decomposed into a trend component $T_n$, a periodic component $S_n$, and a residual component $C_n$; $S_n$ is an element in the first sub-time series, and $C_n$ is an element in the second sub-time series.
  • $x_n$ can be expressed as the following formula:
  • the server may decompose each element in the target time series into a trend component, a periodic component, and a residual component through a Time Series Decompose (TSD) algorithm.
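  • according to the N periodic components in the first sub-time sequence, it is determined whether the target time sequence has a periodic factor.
  • in the case that the N periodic components in the first sub-time sequence exist, it is determined that the target time sequence has a periodic factor; in the case that the N periodic components in the first sub-time sequence do not exist, it is determined that the target time sequence has no periodic factor.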
  • the decomposed periodic components of each element in a time series are the same, that is, a time series corresponds to a periodic component.
  • the decomposed periodic component of each element in the time series may be 1 or the decomposed periodic component of each element in the time series may also be 2.
  • the existence of the foregoing N periodic components can be understood as the N periodic components are all valid values; for example, the periodic component may be 0.5, and the periodic component may also be 2.
  • the absence of the foregoing N periodic components can be understood as the N periodic components are all invalid values; for example, the periodic component may be zero.
  • the periodic factor can be set to 1; if the target time series does not have a periodic factor, the periodic factor can be set to 0.
  • the server determines the R of the target time series according to the following formula:
  • R is the jitter density
  • $r_n$ can be determined according to the following formula:
  • $C_n$ is the nth element in the second sub-time series
  • $x_n$ is the nth element in the target time series
  • N is determined according to the following formula:
  • T is the length of the target time series
  • W is the length of the window used for windowing
  • is the first preset value
  • For content not described in step 410 and step 420, reference may be made to the description of step 210 and step 220 in the above method 200, which will not be repeated here.
  • according to the target parameter, determine the first type to which the target time series belongs from multiple types, where each of the multiple types corresponds to a parameter set, and the target parameter belongs to the parameter set corresponding to the first type.
  • the above-mentioned types may include: periodic, aperiodic, stationary, glitch, periodic stationary, periodic glitch, aperiodic stationary, or aperiodic glitch.
  • the periodic and aperiodic types can be determined according to the periodic factor. Specifically, when the periodic factor exists, the type is periodic; when the periodic factor does not exist, the type is aperiodic.
  • the stationary type and the glitch type can be determined according to the jitter density. Specifically, when the jitter density is greater than the second preset value, the type is the glitch type; when the jitter density is less than or equal to the second preset value, the type is the stationary type.
  • first, according to the target parameter, determine the first parameter set to which the target parameter belongs from multiple parameter sets; second, determine the first type of the target time series from the multiple types according to the third mapping relationship and the first parameter set.
  • the third mapping relationship includes correspondences between the multiple parameter sets and the multiple types. That is, the target parameter is obtained by calculation, the first parameter set is determined according to which parameter set the target parameter belongs to, and the first type of the target time series is determined as the type corresponding to the first parameter set.
  • the judgment model is used for traffic anomaly detection.
  • the first type of judgment model corresponding to the first type may be determined according to the second mapping relationship and the first type to which the target time series belongs, where the second mapping relationship includes correspondences between multiple types and multiple first type judgment models.
  • the first type of judgment model may specifically be as follows: first, determine a third sub-time series including the N trend components, and divide a second time series into M subsequences of the target length, where M is a positive integer, and the second time series is either the third sub-time series or a series formed from the third sub-time series by the linear segmentation algorithm (Linear Segmentation Algorithms, PLR); second, calculate the matrix profile (MP) values of the M subsequences of the target length, which form the MP time series; finally, detect the abnormal situation of the target time series according to the MP time series and the N-sigma algorithm.
  • the number of line segments determines the granularity with which the original series is approximated.
  • the PLR representation approximately represents a time series of length m (m >> M) with M straight line segments joined end to end.
  • the trend component reflects the overall change of the time series. Therefore, the abnormality detection of the third sub-time series including N trend components can improve the accuracy of the time series abnormality detection result.
  • the third sub-time series (the time series composed of the trend components obtained by decomposing each of the N elements in the target time series) is represented by PLR, which is realized by a top-down algorithm.
  • the start point and end point of the traffic data are the first selected segment points. Then, traverse all the points between these two points and find the point with the largest distance from the line connecting them. If the distance from this point to the line is greater than the preset threshold, the point is used as the third segment point.
  • this new point forms two line segments with the adjacent segment point on its left and the adjacent segment point on its right. For each of the two line segments, continue to look for the point with the largest distance; of the two points found, take the one with the larger distance from its corresponding line segment, and if this distance is greater than the threshold, use that point as the fourth segment point. The loop continues until no point with a distance greater than the threshold is found, at which point the segmentation is complete.
  • The distance from a point to a line segment, which is compared against this threshold, is the Euclidean distance.
  • the second time series is divided into M subsequences of the target length through the matrix profile (MP), and the MP values of the M subsequences of the target length are calculated through MP.
  • MP is a method of describing sequence outlines from the structure of time series, and is often used in time series clustering, density estimation, and graph discovery.
  • the principle of MP is to cut the entire time series into fixed-length subsequences, then calculate the Euclidean distances between each subsequence and the other subsequences, and take the minimum value as the MP value of that subsequence.
  • the time series $X = \{x_0, x_1, \ldots, x_{n-2}, x_{n-1}\}$ is divided into several subsequences by MP
  • the MP value of the original time series X is the minimum value of the distance between a subsequence and the other subsequences in the original sequence, namely over $j \in [0, n-m]$.
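A naive sketch of this principle is given below, assuming that m is the subsequence length and that subsequences start at every index j in [0, n-m]. The function name `matrix_profile` is hypothetical. Commonly used matrix profile implementations work on z-normalized distances and skip overlapping neighbours; this sketch uses the plain Euclidean distance described above and excludes only the subsequence itself.

```python
import math
from typing import List


def matrix_profile(x: List[float], m: int) -> List[float]:
    """For each length-m subsequence, the minimum distance to any other one."""
    count = len(x) - m + 1  # start index j ranges over [0, n - m]
    if m <= 0 or count < 2:
        raise ValueError("need at least two subsequences of positive length")
    subsequences = [x[j:j + m] for j in range(count)]
    profile = []
    for i, current in enumerate(subsequences):
        # Minimum Euclidean distance to every other subsequence.
        nearest = min(
            math.dist(current, other)
            for j, other in enumerate(subsequences)
            if j != i
        )
        profile.append(nearest)
    return profile


# Hypothetical series with a single spike at index 4.
mp_series = matrix_profile([0, 0, 0, 0, 9, 0, 0, 0, 0], m=2)
```

Subsequences that contain the spike have no close neighbour, so their MP values stand out from the rest.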
  • the abnormal conditions of the detected target time series are as follows:
  • $\mu_{mp}$ is the mean value of the MP time series
  • $\sigma_{mp}$ is the variance of the MP time series
  • is the preset value.
  • the abnormal flow point in the MP time series can be detected.
  • the time corresponding to the abnormal flow point in the MP time series is the time of the abnormal flow point in the target time series.
  • the abnormal flow points of the target time series are thus obtained according to the abnormal flow points in the MP time series.
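Because the exact inequality is not reproduced in the text above, the following is only a sketch of a common N-sigma rule: a point is flagged when it deviates from the mean by more than a preset multiple of the standard deviation. The function name `n_sigma_anomalies`, the parameter `n_sigma`, and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def n_sigma_anomalies(values: Sequence[float], n_sigma: float = 3.0) -> List[int]:
    """Return indices whose deviation from the mean exceeds n_sigma deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [i for i, v in enumerate(values) if abs(v - mu) > n_sigma * sigma]


# Hypothetical MP time series with one outlying value at the last index.
mp_values = [1.0] * 19 + [100.0]
abnormal_indices = n_sigma_anomalies(mp_values)
```

The returned indices identify the abnormal points in the MP time series, whose times are then taken as the times of the abnormal points in the target time series.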
  • the upper figure shows the graph formed by the original time series
  • the middle figure shows the graph formed by the time series obtained after the original sequence is represented by PLR
  • the lower figure shows the time series graph formed by the MP values of the M subsequences, obtained by dividing the PLR-represented time series into M subsequences and calculating their MP values.
  • the original sequence has a sudden change at the abscissa of 60.
  • in the sequence obtained after the original time series is represented by PLR, the sudden change at the abscissa of 60 can be smoothed out.
  • the flow time series may fluctuate in a short period of time, but the flow will return to normal in a short period of time.
  • the abnormal points of the target time series may be output.
  • the server may also perform another traffic anomaly detection on the target time series, that is, the server may also perform step 450.
  • Step 450 Detect an abnormal situation of the target time series according to the second sub-time series and the second type of judgment model corresponding to the first type, and the second type of judgment model is the N-sigma model.
  • according to the first mapping relationship and the first type to which the target time series belongs, determine the second type of judgment model corresponding to the first type, where the first mapping relationship includes correspondences between multiple types and multiple second type judgment models.
  • the second type of judgment model can be specifically:
  • $\mu_2$ is the mean value of the second sub-time series
  • $\sigma_2$ is the variance of the second sub-time series
  • the traffic data received at the time corresponding to the nth element in the target time series is abnormal data; if Then the traffic data received at the time corresponding to the nth element in the target time sequence is normal data. among them, Is the default value.
  • the abnormal flow point in the second sub-time series can be detected, and the time corresponding to the abnormal flow point in the second sub-time series is the time of the abnormal flow point of the target time series, so the abnormal flow point of the target time series can be obtained according to the abnormal flow point in the second sub-time series.
  • the abnormal points of the target time series may be output.
  • anomaly detection is performed based on the type of the time series and the judgment model corresponding to that type, where each type of time series corresponds to its own judgment models, that is, each type of time series has a first type of judgment model and a second type of judgment model corresponding to that type. For example, a periodic time series has a corresponding first type of judgment model and a corresponding second type of judgment model; an aperiodic time series likewise has a corresponding first type of judgment model and a corresponding second type of judgment model. Therefore, anomaly detection can be performed on each type of time series with the judgment model corresponding to that type, thereby improving the accuracy of time series anomaly detection.
  • the method shown in FIG. 6 may be executed by a flow anomaly detection device, which may be a server with a flow anomaly detection function.
  • The method shown in FIG. 6 includes step 610 to step 640. These steps are described in detail below.
  • the target time sequence includes N elements, and the N elements correspond to N times, where each element of the N elements is traffic data received at a corresponding point in time.
  • determine target parameters of the target time series according to the target time series, where the target parameters include a period factor and/or a jitter density, and the period factor is used to represent a wave-shaped change around a long-term trend presented in the target time series.
  • the jitter density is used to indicate the deviation of the actual value of the target time series from the target value within the target time.
  • decompose each of the N elements in the target time sequence into a trend component, a periodic component, and a residual component; determine a first sub-time sequence that includes the N periodic components and a second sub-time sequence that includes the N residual components; and obtain the target parameter of the target time sequence according to the first sub-time sequence or the second sub-time sequence.
  • according to the N periodic components in the first sub-time sequence, it is determined whether the target time sequence has a periodic factor.
  • in the case that the N periodic components in the first sub-time sequence exist, it is determined that the target time sequence has a periodic factor; in the case that the N periodic components in the first sub-time sequence do not exist, it is determined that the target time sequence has no periodic factor.
  • the server determines the R of the target time series according to the following formula:
  • R is the jitter density
  • $r_n$ can be determined according to the following formula:
  • $C_n$ is the nth element in the second sub-time series
  • $x_n$ is the nth element in the target time series
  • N is determined according to the following formula:
  • T is the length of the target time series
  • W is the length of the window used for windowing
  • is the first preset value
  • For the content not described in the foregoing step 610 and step 620, reference may be made to the description of step 210 and step 220 in the foregoing method 200, which will not be repeated here.
  • is equal to 2
  • the determined jitter density of the target time series is 4, it can be determined that the first parameter set is ⁇ R> ⁇ , and if the determined jitter density of the target time series is 1, then Determine that the first parameter set is ⁇ R ⁇ .
  • the first type of judgment model corresponding to the first parameter set is determined according to the fifth mapping relationship and the first parameter set to which the target parameter belongs.
  • the fifth mapping relationship includes correspondences between multiple parameter sets and multiple first type judgment models.
  • determine a third sub-time sequence including the N trend components; divide the second time sequence into M subsequences of the target length, where M is a positive integer, and the second time sequence is either the third sub-time sequence or a sequence formed from the third sub-time sequence by the linear segmentation algorithm PLR; calculate the matrix profile (MP) values of the M subsequences of the target length, which form the MP time series; and detect abnormal conditions of the target time series according to the MP time series and the N-sigma algorithm.
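The lookup described above can be sketched with a dictionary standing in for the fifth mapping relationship. Everything named here is hypothetical: `parameter_set`, the set labels, and the model identifiers are placeholders, and the preset value of 2 simply mirrors the numeric example above.

```python
from typing import Dict


def parameter_set(jitter_density: float, preset: float) -> str:
    """Name the parameter set that the jitter density R falls into."""
    return "R > preset" if jitter_density > preset else "R <= preset"


# Hypothetical mapping from parameter sets to first type judgment models.
FIFTH_MAPPING: Dict[str, str] = {
    "R > preset": "first_type_model_for_high_jitter",
    "R <= preset": "first_type_model_for_low_jitter",
}

# With a preset value of 2, R = 4 and R = 1 fall into different sets.
set_for_4 = parameter_set(4, 2)
set_for_1 = parameter_set(1, 2)
model_for_4 = FIFTH_MAPPING[set_for_4]
```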
  • the abnormal flow point in the MP time series can be detected.
  • the time corresponding to the abnormal flow point in the MP time series is the time of the abnormal flow point in the target time series.
  • the abnormal flow points of the target time series are thus obtained according to the abnormal flow points in the MP time series.
  • the abnormal points of the target time series may be output.
  • the foregoing method 600 may further include step 650.
  • the second type of judgment model is the N-sigma model.
  • according to the fourth mapping relationship and the first parameter set to which the target parameter belongs, determine the second type of judgment model corresponding to the first parameter set, where the fourth mapping relationship includes correspondences between multiple parameter sets and multiple second type judgment models.
  • the abnormal flow point in the second sub-time series can be detected, and the time corresponding to the abnormal flow point in the second sub-time series is the time of the abnormal flow point of the target time series, so the abnormal flow point of the target time series can be obtained according to the abnormal flow point in the second sub-time series.
  • the abnormal points of the target time series may be output.
  • the method shown in FIG. 8 may be executed by a flow anomaly detection device, and the flow anomaly detection device may be a server with a flow anomaly detection function.
  • The method shown in FIG. 8 includes step 810 to step 840. These steps are described in detail below.
  • the target time sequence includes N elements, and the N elements correspond to N times, where each element of the N elements is traffic data received at a corresponding time.
  • decompose each of the N elements in the target time sequence into a trend component, a periodic component, and a residual component; determine a third sub-time sequence including the N trend components.
  • each of the N elements in the target time series is decomposed into a trend component, a periodic component, and a residual component.
  • the target length is specified by the communication protocol.
  • the third type of judgment model is the N-sigma model.
  • the abnormal flow point in the MP time series can be detected.
  • the time corresponding to the abnormal flow point in the MP time series is the time of the abnormal flow point in the target time series.
  • the abnormal flow point of the target time series can be obtained according to the abnormal flow point in the MP time series.
  • the abnormal traffic conditions of the target time series can be obtained, and then the abnormal points in the target time series can be output.
  • the above method 800 may further include step 860.
  • the abnormal flow point in the second sub-time series can be detected, and the time corresponding to the abnormal flow point in the second sub-time series is the time of the abnormal flow point of the target time series, so that the abnormal flow point of the target time series can be obtained according to the abnormal flow point in the second sub-time series.
  • the abnormal points in the second sub-time series can be output.
  • the method for classifying traffic patterns in the embodiments of the present application is described in detail above with reference to FIG. 2, and the method for detecting abnormal traffic in the embodiments of the present application is described in detail above with reference to FIG. 4 to FIG. 8.
  • the following describes in detail, with reference to FIG. 9, the training method for the traffic pattern classification model provided in the embodiments of the present application, and, with reference to FIGS. 10 to 12, the training method for the traffic anomaly detection model provided in the embodiments of the present application.
  • FIG. 9 is a schematic flowchart of a method 900 for training a traffic pattern classification model provided by an embodiment of the present application.
  • the method shown in FIG. 9 can be executed by a device with strong computing capabilities such as a computer device, a server device, or a computing device.
  • the method shown in FIG. 9 includes steps 910 to 940, and these steps are respectively described in detail below.
  • each element in the first time series is traffic data received at a corresponding time, which can be understood as each element in the first time series is historical traffic data received at a corresponding time.
  • multiple first time series can also be acquired.
  • the steps of the original classification model include step 1 to step 4.
  • Step 1 Decompose each of the N elements in the first time series into a trend component, a periodic component, and a residual component; determine a first sub-time series including the N periodic components and a second sub-time series including the N residual components.
  • Step 2 Determine the period factor of the first time sequence according to the N period components in the first sub-time sequence. In the case where N periodic components in the first sub-time sequence exist, it is determined that the periodic factor of the first time sequence exists, and the periodic factor can be determined to be 1; the N periodic components in the first sub-time sequence do not exist In this case, it is determined that the periodic factor of the first time series does not exist, and the periodic factor can be determined to be zero.
  • Step 3 Determine the jitter density of the first time series according to the N residual components in the second sub-time series.
  • the R of the first time series is determined according to the following formula:
  • R is the jitter density
  • $r_n$ can be determined according to the following formula:
  • $C_n$ is the nth element in the second sub-time series
  • $x_n$ is the nth element in the first time series
  • N is determined according to the following formula:
  • T is the length of the first time series
  • W is the length of the window used for windowing
  • is the first preset value
  • Step 4 Determine the first type of the first time series according to the period factor S and the jitter density R.
  • if S exists, the first type of the first time series is the periodic type; if S does not exist, the first type of the first time series is the aperiodic type; when R is greater than the second preset value, the first type of the first time series is the glitch type; when R is less than or equal to the second preset value, the first type of the first time series is the stationary type; when S exists and R is greater than the second preset value, the first type of the first time series is the periodic glitch type; when S exists and R is less than or equal to the second preset value, the first type of the first time series is the periodic stationary type; when S does not exist and R is greater than the second preset value, the first type of the first time series is the aperiodic glitch type; when S does not exist and R is less than or equal to the second preset value, the first type of the first time series is the aperiodic stationary type.
  • if the original type of the first time series is inconsistent with the first type of the first time series, adjust the parameters of the original classification model, where the parameters of the original classification model include the first preset value and the second preset value. If the original type of the first time series is the glitch type and the first type of the first time series is the stationary type, the first preset value can be adjusted accordingly, or the second preset value can be made smaller.
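One possible form of this adjustment is sketched below for the second preset value only. The function name `adjust_second_preset`, the fixed `step`, and the simple stepwise update are assumptions; the text states only that the preset value is adjusted when the labelled (original) type and the computed type disagree.

```python
def adjust_second_preset(jitter_density: float, labelled_type: str,
                         second_preset: float, step: float = 0.1) -> float:
    """Move the second preset value until R is classified as labelled."""
    if labelled_type not in ("glitch", "stationary"):
        raise ValueError("labelled_type must be 'glitch' or 'stationary'")
    if step <= 0:
        raise ValueError("step must be positive")
    if labelled_type == "glitch":
        # Labelled glitch but computed stationary: make the preset smaller.
        while jitter_density <= second_preset:
            second_preset -= step
    else:
        # Labelled stationary but computed glitch: make the preset larger.
        while jitter_density > second_preset:
            second_preset += step
    return second_preset


# Hypothetical sample: R = 1.5 is labelled glitch while the preset value is 2.0.
new_preset = adjust_second_preset(1.5, "glitch", 2.0)
```

When the labelled and computed types already agree, the preset value is returned unchanged.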
  • the server may obtain multiple first time series, and train the target classification model according to the multiple first time series, that is, continuously optimize the target classification model according to the multiple first time series.
  • FIG. 10 is a schematic flowchart of another method 1000 for training a traffic anomaly detection model provided by an embodiment of the present application.
  • the method shown in FIG. 10 can be executed by a device with strong computing capabilities such as a computer device, a server device, or a computing device.
  • the method shown in FIG. 10 includes steps 1010 to 1050, and these steps are respectively described in detail below.
  • each element in the first time series is traffic data received at a corresponding time, which can be understood as each element in the first time series is historical traffic data received at a corresponding time.
  • multiple first time series can also be acquired.
  • the steps of the original classification model include step 1 to step 4.
  • Step 1 Decompose each of the N elements in the first time series into a trend component, a periodic component, and a residual component; determine a first sub-time series including the N periodic components and a second sub-time series including the N residual components.
  • Step 2 Determine the period factor of the first time sequence according to the N period components in the first sub-time sequence. In the case where N periodic components in the first sub-time sequence exist, it is determined that the periodic factor of the first time sequence exists, and the periodic factor can be determined to be 1; the N periodic components in the first sub-time sequence do not exist In this case, it is determined that the periodic factor of the first time series does not exist, and the periodic factor can be determined to be zero.
  • Step 3 Determine the jitter density of the first time series according to the N residual components in the second sub-time series.
  • the R of the first time series is determined according to the following formula:
  • R is the jitter density, and r_n can be determined according to the following formula:
  • C_n is the nth element in the second sub-time series
  • x_n is the nth element in the target time series
  • N is determined according to the following formula:
  • T is the length of the first time series
  • W is the window length of the added window
  • α is the first preset value
  • Step 4 Determine the first type of the first time series according to the period factor S and the jitter density R.
  • If S exists, the first type of the first time series is periodic; if S does not exist, the first type is aperiodic. When R is greater than the second preset value, the first type is the glitch type; when R is less than or equal to the second preset value, the first type is the stationary type. Combining the two: when S exists and R is greater than the second preset value, the first type is the periodic glitch type; when S exists and R is less than or equal to the second preset value, the first type is the periodic stationary type; when S does not exist and R is greater than the second preset value, the first type is the aperiodic glitch type; when S does not exist and R is less than or equal to the second preset value, the first type is the aperiodic stationary type.
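  • the rule in Step 4 can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the embodiment: the function name `classify_series`, the boolean `period_exists` (standing for whether the period factor S exists), and the label strings are assumptions introduced here.

```python
def classify_series(period_exists: bool, jitter_density: float, second_preset: float) -> str:
    """Combine the period factor S and the jitter density R into one type label.

    period_exists: True if the period factor S exists (hypothetical encoding).
    jitter_density: the jitter density R of the series.
    second_preset: the second preset value that R is compared against.
    """
    # S exists -> periodic; S does not exist -> aperiodic
    periodicity = "periodic" if period_exists else "aperiodic"
    # R > second preset value -> glitch; R <= second preset value -> stationary
    shape = "glitch" if jitter_density > second_preset else "stationary"
    return f"{periodicity} {shape}"
```

  • for instance, a series with an existing period factor and R equal to the second preset value falls into the periodic stationary type, because the comparison for the glitch type is strict.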
  • the steps of the first type of judgment model include step A to step D:
  • Step A Determine the third sub-time series including N trend components
  • Step B Divide the second time series into M subsequences of the target length, where M is a positive integer, and the second time series is the third sub-time series or is formed according to the third sub-time series and the piecewise linear representation (PLR) algorithm;
  • Step C Calculate the matrix profile (MP) values of the M subsequences of the target length; the MP values of the M subsequences form the MP time series;
  • Step D According to the MP time series and the N-sigma algorithm, the abnormal situation of the first time series is detected.
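  • steps B to D can be sketched in Python as follows. The sketch rests on several assumptions that the text does not state: the subsequences are non-overlapping, the MP value of a subsequence is the z-normalized Euclidean distance to its nearest other subsequence, and the N-sigma rule uses the population standard deviation. All function names and the multiplier `n` are hypothetical.

```python
import math
from statistics import mean, pstdev

def split_subsequences(series, length):
    """Step B: cut the series into M non-overlapping subsequences of the target length."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]

def znorm(seq):
    """Z-normalize one subsequence; a constant subsequence maps to all zeros."""
    mu, sigma = mean(seq), pstdev(seq)
    if sigma == 0:
        return [0.0] * len(seq)
    return [(v - mu) / sigma for v in seq]

def matrix_profile(subsequences):
    """Step C: for each subsequence, the distance to its nearest other subsequence."""
    if len(subsequences) < 2:
        raise ValueError("at least two subsequences are required")
    normed = [znorm(s) for s in subsequences]
    return [
        min(math.dist(a, b) for j, b in enumerate(normed) if j != i)
        for i, a in enumerate(normed)
    ]

def n_sigma_anomalies(values, n=3.0):
    """Step D: indices whose value deviates from the mean by more than n standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > n * sigma]
```

  • a subsequence that has no close match elsewhere in the series receives a large MP value, so the N-sigma rule applied to the MP time series singles it out.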
  • the parameters of the original model of the first time series are adjusted, where the parameters include sensitivity. If the number of data items in the first data is greater than the number in the second data, the sensitivity of the original model can be adjusted down accordingly; if the number in the first data is less than the number in the second data, the sensitivity can be adjusted up accordingly; if the two numbers are equal, the sensitivity does not need to be adjusted.
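  • the adjustment rule above can be sketched as a single update step. The sketch assumes that sensitivity is a single number, that it changes by a fixed hypothetical `step`, and that "adjusted accordingly" in the second case means raising it; none of these details is fixed by the text.

```python
def adjust_sensitivity(sensitivity: float, detected: int, labelled: int, step: float = 0.1) -> float:
    """Return the updated sensitivity after comparing two anomaly counts.

    detected: number of data items in the first data (anomalies found by the model).
    labelled: number of data items in the second data (original anomalies).
    step: hypothetical size of one adjustment.
    """
    if detected > labelled:
        # The model reports more anomalies than were labelled: lower the sensitivity.
        return sensitivity - step
    if detected < labelled:
        # The model misses labelled anomalies: raise the sensitivity (assumed direction).
        return sensitivity + step
    # Equal counts: no adjustment is needed.
    return sensitivity
```

  • repeating this step over multiple first time series gives one possible reading of "continuously optimizing" the model.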
  • the server may obtain multiple first time sequences, and train the first target determination model according to the multiple first time sequences, that is, continuously optimize the first target determination model according to the multiple first time sequences.
  • FIG. 11 is a schematic flowchart of another method 1100 for training a traffic anomaly detection model provided by an embodiment of the present application.
  • the method shown in FIG. 11 can be executed by a device with strong computing capabilities such as a computer device, a server device, or a computing device.
  • the method shown in FIG. 11 includes steps 1110 to 1150, which are described in detail below.
  • first time sequence where the first time sequence includes N elements, and the N elements correspond to N times, where each element of the N elements is traffic data received at a corresponding time.
  • each element in the first time series is traffic data received at a corresponding time, which can be understood as each element in the first time series is historical traffic data received at a corresponding time.
  • multiple first time series can also be acquired.
  • the original parameter model of the first time series may include steps a to d:
  • Step a According to the TSD algorithm, decompose each of the N elements in the first time series into a trend component, a periodic component, and a residual component; determine a first sub-time series including the N periodic components and a second sub-time series including the N residual components.
  • Step b Determine the period factor of the first time series according to the N periodic components in the first sub-time series.
  • If the N periodic components in the first sub-time series exist, the period factor of the first time series is determined to exist and can be set to 1; if they do not exist, the period factor is determined not to exist and can be set to 0.
  • Step c Determine the jitter density of the first time sequence according to the N residual components in the second sub-time sequence.
  • the R of the first time series is determined according to the following formula:
  • R is the jitter density
  • r_n can be determined according to the following formula:
  • C_n is the nth element in the second sub-time series
  • x_n is the nth element in the target time series
  • N is determined according to the following formula:
  • T is the length of the first time series
  • W is the window length of the added window
  • α is the first preset value
  • Step d Determine the first parameter set of the first time sequence according to the period factor S and the jitter density R.
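  • Step a names only "the TSD algorithm" and does not say which decomposition is used. As a stand-in, the following Python sketch uses a classical additive decomposition (moving-average trend, per-phase seasonal means, remainder as residual). The function name `decompose` and the parameter `period` are assumptions; the embodiment may use a different decomposition.

```python
from statistics import mean

def decompose(series, period):
    """Additive decomposition: series[n] = trend[n] + seasonal[n] + residual[n].

    period: assumed length of one cycle, in samples (hypothetical parameter).
    """
    n = len(series)
    half = period // 2
    # Trend: moving average over a window centered on each sample (truncated at the edges).
    trend = [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Periodic component: average of the detrended values that share the same phase.
    phase_means = [mean(detrended[p::period]) for p in range(period)]
    seasonal = [phase_means[i % period] for i in range(n)]
    # Residual component: whatever trend and periodic component do not explain.
    residual = [x - t - s for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual
```

  • in the terms used above, `seasonal` plays the role of the first sub-time series, `residual` the second sub-time series, and `trend` the third sub-time series.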
  • the steps of the first type of judgment model include step A to step D:
  • Step A Determine the third sub-time series including N trend components
  • Step B Divide the second time series into M subsequences of the target length, where M is a positive integer, and the second time series is the third sub-time series or is formed according to the third sub-time series and the piecewise linear representation (PLR) algorithm;
  • Step C Calculate the matrix profile (MP) values of the M subsequences of the target length; the MP values of the M subsequences form the MP time series;
  • Step D According to the MP time series and the N-sigma algorithm, the abnormal situation of the first time series is detected.
  • the parameters of the first type of judgment model are adjusted, where the parameters include sensitivity. If the number of data items in the first data is greater than the number in the second data, the sensitivity of the first type of judgment model can be adjusted down accordingly; if the number in the first data is less than the number in the second data, the sensitivity can be adjusted up accordingly; if the two numbers are equal, the sensitivity does not need to be adjusted.
  • the server may obtain multiple first time sequences, and train the first target determination model according to the multiple first time sequences, that is, continuously optimize the first target determination model according to the multiple first time sequences.
  • FIG. 12 is a schematic flowchart of another method 1200 for training a traffic anomaly detection model provided by an embodiment of the present application.
  • the method shown in FIG. 12 can be executed by a device with strong computing capabilities such as a computer device, a server device, or a computing device.
  • the method shown in FIG. 12 includes steps 1210 to 1250, which are described in detail below.
  • each element in the first time series is traffic data received at a corresponding time, which can be understood as each element in the first time series is historical traffic data received at a corresponding time.
  • multiple first time series can also be acquired.
  • each of the N elements in the first time series is decomposed into a trend component, a periodic component, and a residual component, and a third sub-time series including N trend components is determined.
  • the fourth type of judgment model includes step A' to step D':
  • Step A' Determine the third sub-time series including N trend components
  • Step B' Divide the second time series into M subsequences of the target length, where M is a positive integer, and the second time series is the third sub-time series or is formed according to the third sub-time series and the piecewise linear representation (PLR) algorithm;
  • Step C' Calculate the matrix profile (MP) values of the M subsequences of the target length; the MP values of the M subsequences form the MP time series;
  • Step D' According to the MP time series and the N-sigma algorithm, the abnormal situation of the first time series is detected.
  • the parameters of the fourth type of judgment model are adjusted, where the parameters include sensitivity. If the number of data items in the first data is greater than the number in the second data, the sensitivity of the fourth type of judgment model can be adjusted down accordingly; if the number in the first data is less than the number in the second data, the sensitivity can be adjusted up accordingly; if the two numbers are equal, the sensitivity does not need to be adjusted.
  • the server may obtain multiple first time sequences, and train the second target determination model according to the multiple first time sequences, that is, continuously optimize the second target determination model according to the multiple first time sequences.
  • FIG. 13 is a schematic block diagram of a traffic pattern classification device 1300 according to an embodiment of the present application.
  • the traffic pattern classification device 1300 shown in FIG. 13 includes a memory 1301, a processor 1302, a communication interface 1303, and a bus 1304.
  • the memory 1301, the processor 1302, and the communication interface 1303 implement communication connections between each other through the bus 1304.
  • the memory 1301 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 1301 may store a program.
  • the processor 1302 and the communication interface 1303 are used to execute each step of the traffic pattern classification method in the embodiment of the present application.
  • the communication interface 1303 may obtain the target time series from a memory or another device, and then the processor 1302 classifies the target time series.
  • the processor 1302 may be a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to realize the functions required by the units in the traffic pattern classification device of the embodiment of the present application, or to execute the traffic pattern classification method of the embodiment of the present application.
  • the processor 1302 may also be an integrated circuit chip with signal processing capability.
  • each step of the method for classifying traffic patterns in the embodiment of the present application can be completed by hardware integrated logic circuits in the processor 1302 or instructions in the form of software.
  • the aforementioned processor 1302 may also be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the aforementioned general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 1301, and the processor 1302 reads the information in the memory 1301, and combines its hardware to complete the functions required by the units included in the object detection device of the embodiment of the present application, or perform the object detection of the method embodiment of the present application method.
  • the communication interface 1303 uses a transceiving device such as but not limited to a transceiver to implement communication between the device 1300 and other devices or communication networks.
  • the target time series can be acquired through the communication interface 1303.
  • the bus 1304 may include a path for transferring information between various components of the device 1300 (for example, the memory 1301, the processor 1302, and the communication interface 1303).
  • FIG. 14 is a schematic block diagram of a traffic anomaly detection device 1400 according to an embodiment of the present application.
  • the traffic anomaly detection device 1400 shown in FIG. 14 includes a memory 1401, a processor 1402, a communication interface 1403, and a bus 1404.
  • the memory 1401, the processor 1402, and the communication interface 1403 implement communication connections between each other through the bus 1404.
  • the memory 1401 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 1401 may store a program.
  • the processor 1402 and the communication interface 1403 are used to execute each step of the traffic abnormality detection method in the embodiment of the present application.
  • the communication interface 1403 may obtain the target time series from a memory or another device, and then the processor 1402 classifies the target time series.
  • the processor 1402 may be a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to realize the functions required by the units in the traffic pattern classification device of the embodiment of the present application, or to execute the traffic pattern classification method of the embodiment of the present application.
  • the processor 1402 may also be an integrated circuit chip with signal processing capability.
  • each step of the method for classifying traffic patterns in the embodiment of the present application may be completed by an integrated logic circuit of hardware in the processor 1402 or instructions in the form of software.
  • the aforementioned processor 1402 may also be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the aforementioned general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 1401, and the processor 1402 reads the information in the memory 1401, and combines its hardware to complete the functions required by the units included in the object detection apparatus of the embodiment of the present application, or perform the object detection of the method embodiment of the present application method.
  • the communication interface 1403 uses a transceiver device such as but not limited to a transceiver to implement communication between the device 1400 and other devices or communication networks.
  • the target time series can be acquired through the communication interface 1403.
  • the bus 1404 may include a path for transferring information between various components of the device 1400 (for example, the memory 1401, the processor 1402, and the communication interface 1403).
  • FIG. 15 is a schematic diagram of the hardware structure of a traffic pattern classification model training device 1500 according to an embodiment of the present application. Similar to the foregoing device 1300, the device 1500 for training a traffic pattern classification model shown in FIG. 15 includes a memory 1501, a processor 1502, a communication interface 1503, and a bus 1504. Among them, the memory 1501, the processor 1502, and the communication interface 1503 communicate with each other through the bus 1504.
  • the memory 1501 may store a program, and when the program stored in the memory 1501 is executed by the processor 1502, the processor 1502 is configured to execute each step of the traffic pattern classification model training method in the embodiment of the present application.
  • the processor 1502 may adopt a general CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits to execute related programs to implement the traffic pattern classification model training method of the embodiment of the present application.
  • the processor 1502 may also be an integrated circuit chip with signal processing capability.
  • each step of the method for training a traffic pattern classification model in the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 1502 or instructions in the form of software.
  • the traffic pattern classification model is trained by the traffic pattern classification model training device 1500 shown in FIG. 15, and the traffic pattern classification model obtained by training can be used to execute the traffic pattern classification method of the embodiment of the present application.
  • the device shown in FIG. 15 may obtain the first time sequence from the outside through the communication interface 1503, and then the processor trains the classification model of the traffic pattern to be trained according to the first time sequence.
  • FIG. 16 is a schematic diagram of the hardware structure of a traffic anomaly detection model training device 1600 in an embodiment of the present application. Similar to the foregoing device 1400, the traffic anomaly detection model training device 1600 shown in FIG. 16 includes a memory 1601, a processor 1602, a communication interface 1603, and a bus 1604. Among them, the memory 1601, the processor 1602, and the communication interface 1603 implement communication connections between each other through the bus 1604.
  • the memory 1601 may store a program, and when the program stored in the memory 1601 is executed by the processor 1602, the processor 1602 is configured to execute each step of the traffic anomaly detection model training method in the embodiment of the present application.
  • the processor 1602 may adopt a general CPU, a microprocessor, an ASIC, a GPU or one or more integrated circuits to execute related programs to implement the traffic anomaly detection model training method in the embodiment of the present application.
  • the processor 1602 may also be an integrated circuit chip with signal processing capabilities.
  • each step of the method for training a traffic pattern classification model in the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 1602 or instructions in the form of software.
  • the traffic anomaly detection model is trained by the traffic anomaly detection model training device 1600 shown in FIG. 16, and the traffic anomaly detection model obtained by training can be used to execute the traffic anomaly detection method of the embodiment of the present application.
  • the device shown in FIG. 16 may obtain the first time sequence from the outside through the communication interface 1603, and then the processor trains the traffic anomaly detection model to be trained according to the first time sequence.
  • although the foregoing device 1300, device 1400, device 1500, and device 1600 only show a memory, a processor, and a communication interface, those skilled in the art should understand that, in a specific implementation, the device 1300, the device 1400, the device 1500, and the device 1600 may also include other components necessary for normal operation. Likewise, according to specific needs, the device 1300, the device 1400, the device 1500, and the device 1600 may also include hardware components that implement other additional functions. In addition, the device 1300, the device 1400, the device 1500, and the device 1600 may include only the components necessary to implement the embodiments of the present application, rather than all the components shown in FIG. 13, FIG. 14, FIG. 15, and FIG. 16.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Testing And Monitoring For Control Systems (AREA)
  • Traffic Control Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

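This application provides a method for traffic anomaly detection. The method includes: acquiring a target time series including N elements; acquiring, according to the target time series, a target parameter of the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor represents a wave-like variation around the long-term trend exhibited in the target time series, and the jitter density represents the deviation between the actual value and the target value of the target time series within a target time; determining, according to the target parameter, a first type to which the target time series belongs from among multiple types, where each of the multiple types corresponds to one parameter set and the target parameter belongs to the parameter set corresponding to the first type; and detecting an anomaly of the target time series according to the first type of judgment model corresponding to the first type, where each of the multiple types corresponds to one type of judgment model. The above technical solution can improve the accuracy of traffic anomaly detection.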

Description

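Method for traffic anomaly detection, model training method, and apparatus
This application claims priority to Chinese patent application No. CN 201910752193.9, filed with the China Patent Office on August 15, 2019 and entitled "Method for traffic anomaly detection, model training method, and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of machine learning, and more specifically, to a method for traffic anomaly detection, a model training method, and an apparatus.
Background
In the field of machine learning, anomaly detection refers to detecting models, data, or times that do not match predictions. Anomaly detection is usually performed by professionals who study historical data and then find the anomalous points. Data sources include applications, processes, operating systems, devices, or networks. As the complexity of computing systems increases, manual work can no longer cope with the difficulty of present-day anomaly detection.
In the prior art, algorithms based on statistics and data distribution perform anomaly detection on network traffic data on the premise that the traffic data follows a normal distribution over a short period of time. However, network traffic data does not follow a normal distribution over a short period of time, so the accuracy of anomaly detection on network traffic data by such algorithms is low.
Summary
This application provides a method for traffic anomaly detection, a model training method, and an apparatus, which can improve the accuracy with which a model detects anomalies in network traffic data.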
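In a first aspect, a method for traffic anomaly detection is provided, including: acquiring a target time series, where the target time series includes N elements, the N elements correspond to N times, and each of the N elements is traffic data received at the corresponding time; acquiring, according to the target time series, a target parameter of the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor is used to represent a wave-like variation around the long-term trend exhibited in the target time series, and the jitter density is used to represent the deviation between the actual value and the target value of the target time series within a target time; determining, according to the target parameter, a first type to which the target time series belongs from among multiple types, where each of the multiple types corresponds to one parameter set, and the target parameter belongs to the parameter set corresponding to the first type; and detecting an anomaly of the target time series according to the first type of judgment model corresponding to the first type, where each of the multiple types corresponds to one type of judgment model, and the judgment model is used for traffic anomaly detection.
First, the target parameter of the target time series is determined according to the acquired target time series; second, the first type to which the target time series belongs is determined according to the target parameter; finally, traffic anomaly detection is performed on the target time series according to the first type of judgment model corresponding to the first type. Therefore, the accuracy of traffic anomaly detection can be improved.
With reference to the first aspect, in a possible implementation, the acquiring, according to the target time series, a target parameter of the target time series includes: decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a first sub-time series including the N periodic components and a second sub-time series including the N residual components; and acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series.
With reference to the first aspect, in a possible implementation, the acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining, according to the first sub-time series, whether the period factor exists for the target time series.
With reference to the first aspect, in a possible implementation, the determining, according to the first sub-time series, whether the period factor exists for the target time series includes: when the N periodic components in the first sub-time series exist, determining that the period factor exists for the target time series; and when the N periodic components in the first sub-time series do not exist, determining that the period factor does not exist for the target time series.
With reference to the first aspect, in a possible implementation, the method further includes: determining, according to a first mapping relationship and the first type to which the target time series belongs, the second type of judgment model corresponding to the first type, where the first mapping relationship includes the correspondence between the multiple types and multiple judgment models of the second type; and detecting an anomaly of the target time series according to the second sub-time series and the second type of judgment model corresponding to the first type, where the second type of judgment model is an N-sigma model.
With reference to the first aspect, in a possible implementation, the acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining the jitter density of the target time series according to the second sub-time series.
With reference to the first aspect, in a possible implementation, the determining the jitter density of the target time series according to the second sub-time series includes: determining R of the target time series according to the following formula: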
Figure PCTCN2020107627-appb-000001
where R is the jitter density, and r_n can be determined according to the following formula:
Figure PCTCN2020107627-appb-000002
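where C_n is the nth element in the second sub-time series, and x_n is the nth element in the target time series;
N is determined according to the following formula: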
Figure PCTCN2020107627-appb-000003
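T is the length of the target time series, W is the window length of the added window, and α is the first preset value.
With reference to the first aspect, in a possible implementation, the determining, according to the target parameter, a first type to which the target time series belongs from among multiple types includes: determining, according to the target parameter, a first parameter set to which the target parameter belongs from among the multiple parameter sets; and determining, according to a third mapping relationship and the first parameter set, the first type to which the target time series belongs from among multiple types, where the third mapping relationship includes the correspondence between multiple parameter sets and the multiple types.
The types may include: periodic, aperiodic, stationary, glitch, periodic stationary, periodic glitch, aperiodic stationary, or aperiodic glitch.
The periodic and aperiodic types can be determined according to the period factor. Specifically, when the period factor exists, the type is periodic; when the period factor does not exist, the type is aperiodic.
The stationary and glitch types can be determined according to the jitter density. Specifically, when the jitter density is greater than a second preset value, the type is the glitch type; when the jitter density is less than or equal to the second preset value, the type is the stationary type.
The first parameter set to which the target parameter belongs is determined from among multiple parameter sets according to the target parameter, and then the first type to which the target time series belongs is determined from among multiple types according to the third mapping relationship and the first parameter set, so that the type to which the target time series belongs is obtained and the classification of the target time series is completed.
With reference to the first aspect, in a possible implementation, the detecting an anomaly of the target time series according to the first type of judgment model corresponding to the first type includes: determining a third sub-time series including the N trend components; dividing a second time series into M subsequences of a target length, where M is a positive integer, and the second time series is the third sub-time series or is formed according to the third sub-time series and the piecewise linear representation (PLR) algorithm; calculating matrix profile (MP) values of the M subsequences of the target length, where the MP values of the M subsequences form an MP time series; and detecting an anomaly of the target time series according to the MP time series and the N-sigma algorithm.
First, the second time series is divided into M subsequences of the target length, where the second time series is the third sub-time series or is formed according to the third sub-time series and the PLR algorithm; second, the MP values of the M subsequences of the target length are calculated; finally, an anomaly of the target time series is detected according to the MP time series and the N-sigma algorithm, thereby improving the accuracy of traffic anomaly detection.
With reference to the first aspect, in a possible implementation, the method further includes: determining, according to a second mapping relationship and the first type to which the target time series belongs, the first type of judgment model corresponding to the first type, where the second mapping relationship includes the correspondence between the multiple types and multiple judgment models of the first type.
The first type of judgment model corresponding to the first type is determined according to the second mapping relationship and the first type to which the target time series belongs, so that a judgment model can be chosen for the type to which the target time series belongs, thereby improving the accuracy of traffic anomaly detection.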
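The second mapping relationship can be pictured as a lookup table from series type to judgment model. The following Python sketch is only an illustration: the class `JudgmentModel`, its fields, the table `SECOND_MAPPING`, and every value in it are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JudgmentModel:
    name: str       # hypothetical label for the model variant
    n_sigma: float  # hypothetical N-sigma multiplier used by the model

# Hypothetical second mapping relationship: series type -> first type of judgment model.
SECOND_MAPPING = {
    "periodic stationary": JudgmentModel("strict", 2.5),
    "periodic glitch": JudgmentModel("tolerant", 4.0),
    "aperiodic stationary": JudgmentModel("default", 3.0),
    "aperiodic glitch": JudgmentModel("tolerant", 4.0),
}

def select_judgment_model(series_type: str) -> JudgmentModel:
    """Look up the judgment model registered for the given series type."""
    try:
        return SECOND_MAPPING[series_type]
    except KeyError:
        raise ValueError(f"no judgment model registered for type {series_type!r}") from None
```

Keeping the mapping as data rather than branching logic means a type can be given a different judgment model without touching the detection code.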
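In a second aspect, a method for traffic anomaly detection is provided, including: acquiring a target time series, where the target time series includes N elements, the N elements correspond to N times, and each of the N elements is traffic data received at the corresponding time; acquiring, according to the target time series, a target parameter of the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor is used to represent a wave-like variation around the long-term trend exhibited in the target time series, and the jitter density is used to represent the deviation between the actual value and the target value of the target time series within a target time; determining, from among multiple parameter sets, a first parameter set to which the target parameter belongs; and detecting an anomaly of the target time series according to the first type of judgment model corresponding to the first parameter set, where each of the multiple parameter sets corresponds to one type of judgment model, and the judgment model is used for traffic anomaly detection.
First, the target parameter of the target time series is determined according to the acquired target time series; second, the first parameter set to which the target time series belongs is determined according to the target parameter; finally, traffic anomaly detection is performed on the target time series according to the first type of judgment model corresponding to the first parameter set. Therefore, the accuracy of traffic anomaly detection can be improved.
With reference to the second aspect, in a possible implementation, the acquiring, according to the target time series, a target parameter of the target time series includes: decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a first sub-time series including the N periodic components and a second sub-time series including the N residual components; and acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series.
With reference to the second aspect, in a possible implementation, the acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining, according to the first sub-time series, whether the period factor exists for the target time series.
With reference to the second aspect, in a possible implementation, the determining, according to the first sub-time series, whether the period factor exists for the target time series includes: when the N periodic components in the first sub-time series exist, determining that the period factor exists for the target time series; and when the N periodic components in the first sub-time series do not exist, determining that the period factor does not exist for the target time series.
With reference to the second aspect, in a possible implementation, the method further includes: detecting an anomaly of the target time series according to the second sub-time series and the second type of judgment model corresponding to the first parameter set, where the second type of judgment model is an N-sigma model.
With reference to the second aspect, in a possible implementation, the method further includes: determining, according to a fourth mapping relationship and the first parameter set to which the target parameter belongs, the second type of judgment model corresponding to the first parameter set, where the fourth mapping relationship includes the correspondence between the multiple parameter sets and multiple judgment models of the second type.
With reference to the second aspect, in a possible implementation, the acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining the jitter density of the target time series according to the second sub-time series.
With reference to the second aspect, in a possible implementation, the determining the jitter density of the target time series according to the second sub-time series includes: determining the jitter density of the target time series according to the following formula: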
Figure PCTCN2020107627-appb-000004
where R is the jitter density, and r_n can be determined according to the following formula:
Figure PCTCN2020107627-appb-000005
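where C_n is the nth element in the second sub-time series, and x_n is the nth element in the target time series;
N is determined according to the following formula: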
Figure PCTCN2020107627-appb-000006
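T is the length of the target time series, W is the window length of the added window, and α is the first preset value.
With reference to the second aspect, in a possible implementation, the detecting an anomaly of the target time series according to the first type of judgment model corresponding to the first parameter set includes: determining a third sub-time series including the N trend components; dividing a second time series into M subsequences of a target length, where M is a positive integer, and the second time series is the third sub-time series or is formed according to the third sub-time series and the piecewise linear representation (PLR) algorithm; calculating matrix profile (MP) values of the M subsequences of the target length, where the MP values of the M subsequences form an MP time series; and detecting an anomaly of the target time series according to the MP time series and the N-sigma algorithm.
First, the second time series is divided into M subsequences of the target length, where the second time series is the third sub-time series or is formed according to the third sub-time series and the PLR algorithm; second, the MP values of the M subsequences of the target length are calculated; finally, an anomaly of the target time series is detected according to the MP time series and the N-sigma algorithm, thereby improving the accuracy of traffic anomaly detection.
With reference to the second aspect, in a possible implementation, the method further includes: determining, according to a fifth mapping relationship and the first parameter set to which the target parameter belongs, the first type of judgment model corresponding to the first parameter set, where the fifth mapping relationship includes the correspondence between the multiple parameter sets and the multiple judgment models of the first type.
The first type of judgment model corresponding to the first parameter set is determined according to the fifth mapping relationship and the first parameter set, so that the judgment model corresponding to the target time series can be obtained, which improves the accuracy of traffic anomaly detection.
In a third aspect, a method for traffic anomaly detection is provided, the method including: acquiring a target time series, where the target time series includes N elements, the N elements correspond to N times, and each of the N elements is traffic data received at the corresponding time; decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a third sub-time series including the N trend components; dividing a second time series into M subsequences of a target length, where M is a positive integer, and the second time series is the third sub-time series or is formed according to the third sub-time series and the piecewise linear representation (PLR) algorithm; calculating matrix profile (MP) values of the M subsequences of the target length, where the MP values of the M subsequences form an MP time series; and detecting an anomaly of the MP time series according to a third type of judgment model.
First, the target time series is acquired, and the second time series is divided into M subsequences of the target length, where the second time series is the third sub-time series or is formed according to the third sub-time series and the PLR algorithm, and the third sub-time series is the time series formed by the trend components obtained by decomposing each of the N elements in the target time series; second, the MP values of the M subsequences of the target length are calculated; finally, an anomaly of the target time series is detected according to the third type of judgment model, thereby improving the accuracy of traffic anomaly detection.
With reference to the third aspect, in a possible implementation, the method further includes: determining a second sub-time series including the N residual components; and detecting an anomaly of the target time series according to the second sub-time series and the third type of judgment model.
With reference to the third aspect, in a possible implementation, the target length is specified by a communication protocol.
With reference to the third aspect, in a possible implementation, the third type of judgment model is an N-sigma model.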
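Applying an N-sigma model to the second sub-time series (the residual components) can be sketched as follows. This is a minimal illustration under assumptions: the function name `n_sigma_outliers`, the default multiplier `n = 3.0`, and the use of the population standard deviation are choices made here, not details given in the application.

```python
from statistics import mean, pstdev

def n_sigma_outliers(residuals, n=3.0):
    """Return indices of residuals lying more than n standard deviations from the mean."""
    mu = mean(residuals)
    sigma = pstdev(residuals)
    if sigma == 0:
        # A constant residual series has no spread, so nothing is flagged.
        return []
    return [i for i, r in enumerate(residuals) if abs(r - mu) > n * sigma]
```

Because the trend and periodic components have already been removed, a large residual corresponds to traffic that neither the long-term trend nor the periodic pattern explains.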
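In a fourth aspect, a method for classifying traffic patterns is provided, including: acquiring a target time series, where the target time series includes N elements, the N elements correspond to N times, and each of the N elements is traffic data received at the corresponding time; acquiring, according to the target time series, a target parameter of the target time series, where the target parameter includes a period factor and/or a jitter density, the period factor is used to represent a wave-like variation around the long-term trend exhibited in the target time series, and the jitter density is used to represent the deviation between the actual value and the target value of the target time series within a target time; and classifying the target time series according to the target parameter.
The target time series is acquired and classified according to its target parameter, so that the classified target time series can subsequently be processed, which improves the accuracy of processing the target time series.
With reference to the fourth aspect, in a possible implementation, the acquiring, according to the target time series, a target parameter of the target time series includes: decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component; determining a first sub-time series including the N periodic components and a second sub-time series including the N residual components; and acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series.
With reference to the fourth aspect, in a possible implementation, the acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining, according to the first sub-time series, whether the period factor exists for the target time series.
With reference to the fourth aspect, in a possible implementation, the determining, according to the first sub-time series, whether the period factor exists for the target time series includes: when the N periodic components in the first sub-time series exist, determining that the period factor exists for the target time series; and when the N periodic components in the first sub-time series do not exist, determining that the period factor does not exist for the target time series.
With reference to the fourth aspect, in a possible implementation, the classifying the target time series according to the target parameter includes: when the period factor exists, determining the target time series to be of the periodic type; and when the period factor does not exist, determining the target time series to be of the aperiodic type.
With reference to the fourth aspect, in a possible implementation, the acquiring the target parameter of the target time series according to the first sub-time series or the second sub-time series includes: determining the jitter density of the target time series according to the second sub-time series.
With reference to the fourth aspect, in a possible implementation, the determining the jitter density of the target time series according to the residual component of the target time series includes: determining R of the target time series according to the following formula: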
Figure PCTCN2020107627-appb-000007
其中,所述R是抖动密度,所述r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000008
其中,所述C n为所述第二子时间序列中的第n个元素,所述x n为所述目标时间序列中的第n个元素;
所述N根据以下公式确定:
Figure PCTCN2020107627-appb-000009
所述T为所述目标时间序列的长度,所述W为加入窗口的窗长,所述α是第一预设值。
结合第四方面,在一种可能的实现方式中,所述根据所述目标参数,对所述目标时间序列进行分类包括:在所述抖动密度大于第二预设值的情况下,将所述目标时间序列确定为毛刺型;在所述抖动密度小于或等于所述第二预设值的情况下,所述目标时间序列确定为平稳型。
第五方面,提供了一种流量异常检测模型训练方法,包括:获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述第一时间序列的原始分类模型,获取所述第一时间序列的第一类型;根据第一类型对应的第一类的判定模型,对所述第一类型的第一时间序列进行流量异常检测处理,以获取第一数据,所述第一数据是所述第一时间序列的异常点;获取第二数据,所述第二数据为所述第一时间序列的原始异常点;根据所述第一数据和所述第二数据,调整所述第一类的判定模型的参数,以获取第一目标判定模型。
可选地,可以获取多个第一时间序列,根据多个第一时间序列来训练第一目标判定模型。
结合第五方面,在一种可能的实现方式中,所述第一时间序列的第一类型为周期型或非周期型、毛刺型、平稳型、周期平稳型、非周期平稳型、周期毛刺型或非周期毛刺型。
第六方面,提供了一种流量异常检测模型训练方法,包括:获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述第一时间序列的原始参数模型,获取所述第一时间序列的第一参数集合;根据第一参数集合对应的第一类的判定模型,对所述第一时间序列进行流量异常检测处理,以获取第四数据,所述第四数据是所述第一时间序列的异常点;获取第二数据,所述第二数据为所述第一时间序列的原始异常点;根据所述第二数据和所述第四数据,调整所述第一类的判定模型的参数,以获取第一目标判定模型。
可选地,可以获取多个第一时间序列,根据多个第一时间序列来训练第一目标判定模型。
结合第六方面,在一种可能的实现方式中,所述第一时间序列的第一类型为周期型或非周期型、毛刺型、平稳型、周期平稳型、非周期平稳型、周期毛刺型或非周期毛刺型。
第七方面,提供了一种流量异常检测模型的训练方法,所述方法包括:获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;对所述第一时间序列进行处理,以获取第三子时间序列,所述第三子时间序列是所述第一时间序列中的N个元素中的每个元素分解的趋势分量组成的时间序列;根据第四类的判定模型,对所述第一时间序列进行流量异常检测处理,以获取第三数据,所述第三数据是所述第一时间序列的异常点;获取第二数据,所述第二数据为所述第一时间序列的原始异常点;根据所述第二数据和所述第三数据,调整所述第四类的判定模型的参数,以获取第二目标判定模型。
可选地,可以获取多个第一时间序列,根据多个第一时间序列来训练第二目标判定模型。
第八方面,提供了一种流量模式的分类模型训练方法,包括:获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与所述N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述第一时间序列的原始分类模型,获取所述第一时间序列的第一类型;获取所述第一时间序列的原始类型;根据所述第一时间序列的原始类型和所述第一时间序列的第一类型,调整所述第一时间序列的原始模型的参数,以获取所述第一时间序列的目标分类模型。
结合第八方面,在一种可能的实现方式中,所述第一时间序列的第一类型为周期型或非周期型、毛刺型、平稳型、周期平稳型、非周期平稳型、周期毛刺型或非周期毛刺型。
第九方面,提供了一种流量异常检测装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取目标时间序列,所述目标时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述目标时间序列,获取所述目标时间序列的目标参数,所述目标参数包括周期因子和/或抖动密度,其中,所述周期因子用于表示所述目标时间序列中呈现出来的围绕长期趋势的一种波浪形变动,所述抖动密度用于表示所述目标时间序列在目标时间内实际值与目标值的偏差;根据所述目标参数,从多个类型中确定所述目标时间序列所属于的第一类型,其中,所述多个类型中的每个类型对应一个参数集合,所述目标参数属于所述第一类型对应的参数集合;根据所述第一类型对应的第一类的判定模型,检测所述目标时间序列的异常情况,其中,所述多个类型中的每个类型对应一个类型的判定模型,所述判定模型用于流量异常检测。
结合第九方面,在一种可能的实现方式中,所述处理器还用于:将所述目标时间序列中的所述N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个所述周期分量的第一子时间序列和包括N个所述残余分量的第二子时间序列;根据所述第一子时间序列或所述第二子时间序列,获取所述目标时间序列的目标参数。
结合第九方面,在一种可能的实现方式中,所述处理器还用于:根据所述第一子时间序列,确定所述目标时间序列是否存在所述周期因子。
结合第九方面,在一种可能的实现方式中,所述处理器还具体用于:在所述第一子时间序列中的N个周期分量存在的情况下,确定所述目标时间序列存在所述周期因子;在所述第一子时间序列中的N个周期分量不存在的情况下,确定所述目标时间序列不存在所述周期因子。
结合第九方面,在一种可能的实现方式中,所述处理器还用于:根据第一映射关系和所述目标时间序列所属于的第一类型,确定所述第一类型对应的第二类的判定模型,所述第一映射关系包括所述多个类型和多个所述第二类的判定模型的对应关系;根据所述第二子时间序列和所述第一类型对应的第二类的判定模型,检测所述目标时间序列的异常情况,所述第二类的判定模型是N-sigma模型。
结合第九方面,在一种可能的实现方式中,所述处理器还用于:根据所述第二子时间序列,确定所述目标时间序列的抖动密度。
结合第九方面,在一种可能的实现方式中,所述处理器还具体用于:根据以下公式确定所述目标时间序列的R:
Figure PCTCN2020107627-appb-000010
其中,所述R是抖动密度,所述r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000011
其中,所述C n为所述第二子时间序列中的第n个元素,所述x n为所述目标时间序列中的第n个元素;
所述N根据以下公式确定:
Figure PCTCN2020107627-appb-000012
所述T为所述目标时间序列的长度,所述W为加入窗口的窗长,所述α是第一预设值。
结合第九方面,在一种可能的实现方式中,所述处理器还用于:根据所述目标参数,从所述多个参数集合中确定所述目标参数属于的第一参数集合;根据第三映射关系和所述第一参数集合,从多个类型中确定所述目标时间序列所属于的第一类型,所述第三映射关系包括多个参数集合和多个所述类型的对应关系。
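The foregoing types may include: periodic type, aperiodic type, stable type, spiky type, periodic stable type, periodic spiky type, aperiodic stable type, or aperiodic spiky type.
The periodic type and the aperiodic type may be determined according to the periodic factor. Specifically, when the periodic factor exists, the type is the periodic type; when the periodic factor does not exist, the type is the aperiodic type.
The stable type and the spiky type may be determined according to the jitter density. Specifically, when the jitter density is greater than the second preset value, the type is the spiky type; when the jitter density is less than or equal to the second preset value, the type is the stable type.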
结合第九方面,在一种可能的实现方式中,所述处理器还具体用于:确定包括N个所述趋势分量的第三子时间序列;将第二时间序列分成M个目标长度的子序列,所述M为正整数,所述第二时间序列是所述第三子时间序列或所述第二时间序列是根据所述第三子时间序列和线性分段算法PLR形成的;计算M个目标长度的子序列的矩阵轮廓MP值,所述M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;根据所述MP时间序列和N-sigma算法,检测所述目标时间序列的异常情况。
结合第九方面,在一种可能的实现方式中,所述处理器还用于:根据所述第二映射关系和所述目标时间序列所属于的第一类型,确定所述第一类型对应的第一类的判定模型,所述第二映射关系包括所述多个类型和多个所述第一类的判定模型的对应关系。
第十方面,提供了一种流量异常检测装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取目标时间序列;所述目标时间序列包括N个元素,所述N个元素与N个时刻对应, 其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述目标时间序列,获取所述目标时间序列的目标参数,所述目标参数包括周期因子和/或抖动密度,其中,所述周期因子用于表示所述目标时间序列中呈现出来的围绕长期趋势的一种波浪形变动,所述抖动密度用于表示所述目标时间序列在目标时间内实际值与目标值的偏差;从多个参数集合中确定所述目标参数所属于的第一参数集合;根据所述第一参数集合对应的第一类的判定模型,检测所述目标时间序列的异常情况,其中,所述多个参数集合中的每个参数集合对应一个类型的判定模型,所述判定模型用于流量异常检测。
结合第十方面,在一种可能的实现方式中,所述处理器还用于:将所述目标时间序列中的所述N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个所述周期分量的第一子时间序列和包括N个所述残余分量的第二子时间序列;根据所述第一子时间序列或所述第二子时间序列,获取所述目标时间序列的目标参数。
结合第十方面,在一种可能的实现方式中,所述处理器还用于:根据所述第一子时间序列,确定所述目标时间序列是否存在所述周期因子。
结合第十方面,在一种可能的实现方式中,所述处理器还具体用于:在所述第一子时间序列中的N个周期分量存在的情况下,确定所述目标时间序列存在所述周期因子;在所述第一子时间序列中的N个周期分量不存在的情况下,确定所述目标时间序列不存在所述周期因子。
结合第十方面,在一种可能的实现方式中,所述处理器还用于:根据所述第二子时间序列和所述第一参数集合对应的第二类的判定模型,检测所述目标时间序列的异常情况,所述第二类的判定模型是N-sigma模型。
结合第十方面,在一种可能的实现方式中,所述处理器还用于:根据第四映射关系和所述目标参数所属于的第一参数集合,确定所述第一参数集合对应的第二类的判定模型,所述第四映射关系包括所述多个参数集合和多个所述第二类的判定模型的对应关系。
结合第十方面,在一种可能的实现方式中,所述处理器还用于:根据所述第二子时间序列,确定所述目标时间序列的抖动密度。
结合第十方面,在一种可能的实现方式中,所述处理器还用于:根据以下公式确定所述目标时间序列的抖动密度:
Figure PCTCN2020107627-appb-000013
其中,所述R是抖动密度,所述r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000014
其中,所述C n为所述第二子时间序列中的第n个元素,所述x n为所述目标时间序列中的第n个元素;
所述N根据以下公式确定:
Figure PCTCN2020107627-appb-000015
所述T为所述目标时间序列的长度,所述W为加入窗口的窗长,所述α是第一预设值。
结合第十方面,在一种可能的实现方式中,所述处理器还具体用于:确定包括N个所述趋势分量的第三子时间序列;将第二时间序列分成M个目标长度的子序列,所述M为正整数,所述第二时间序列是所述第三子时间序列或所述第二时间序列是根据所述第三子时间序列和线性分段算法PLR形成的;计算M个目标长度的子序列的矩阵轮廓MP值,所述M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;根据所述MP时间序列和N-sigma算法,检测所述目标时间序列的异常情况。
结合第十方面,在一种可能的实现方式中,所述处理器还具体用于:根据第五映射关系和所述目标参数所属于的第一参数集合,确定所述第一参数集合对应的第一类的判定模型,所述第五映射关系包括所述多个参数集合和所述多个第一类的判定模型的对应关系。
第十一方面,提供了一种流量异常检测装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取目标时间序列,所述目标时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;将所述目标时间序列中的所述N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个所述趋势分量的第三子时间序列;将第二时间序列分成M个目标长度的子序列,所述M为正整数,所述第二时间序列是所述第三子时间序列或所述第二时间序列是根据第三子时间序列和线性分段算法PLR形成的;计算M个目标长度的子序列的矩阵轮廓MP值,所述M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;根据第三类的判定模型,检测所述MP时间序列的异常情况。
结合第十一方面,在一种可能的实现方式中,所述处理器还用于:确定包括N个所述残余分量的第二子时间序列;根据第二子时间序列和第三类的判定模型,检测所述目标时间序列的异常情况。
结合第十一方面,在一种可能的实现方式中,所述目标长度是通信协议规定的。
结合第十一方面,在一种可能的实现方式中,所述第三类的判定模型是N-sigma模型。
第十二方面,提供了一种流量模式的分类装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取目标时间序列,所述目标时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述目标时间序列,获取目标时间序列的目标参数,所述目标参数包括周期因子和/或抖动密度,其中,所述周期因子用于表示所述目标时间序列中呈现出来的围绕长期趋势的一种波浪形变动,所述抖动密度用于表示所述目标时间序列在目标时间内实际值与目标值的偏差;根据所述目标参数,对所述目标时间序列进行分类。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:将所述目标时间序列中的所述N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个所述周期分量的第一子时间序列和N个所述残余分量的第二子时间序列;根据所述第一子时间序列或所述第二子时间序列,获取所述目标时间序列的目标参数。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:根据所述第一子时间序列,确定所述目标时间序列是否存在所述周期因子。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:在所述第一子时间序列中的N个周期分量存在的情况下,确定所述目标时间序列存在所述周期因子;在所述第一子时间序列中的N个周期分量不存在的情况下,确定所述目标时间序列不存在所述周期因子。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:在所述周期因子存在的情况下,将所述目标时间序列确定为周期型;在所述周期因子不存在的情况下,将所述目标时间序列确定为非周期型。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:根据所述第二子时间序列,确定所述目标时间序列的抖动密度。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:根据以下公式确定所述目标时间序列的R:
Figure PCTCN2020107627-appb-000016
其中,所述R是抖动密度,所述r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000017
其中,所述C n为所述第二子时间序列中的第n个元素,所述x n为所述目标时间序列中的第n个元素;
所述N根据以下公式确定:
Figure PCTCN2020107627-appb-000018
所述T为所述目标时间序列的长度,所述W为加入窗口的窗长,所述α是第一预设值。
结合第十二方面,在一种可能的实现方式中,所述处理器还具体用于:在所述抖动密度大于第二预设值的情况下,将所述目标时间序列确定为毛刺型;在所述抖动密度小于或等于所述第二预设值的情况下,所述目标时间序列确定为平稳型。
第十三方面,提供了一种流量异常检测模型训练装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述第一时间序列的原始分类模型,获取所述第一时间序列的第一类型;根据第一类型对应的第一类的判定模型,对所述第一类型的第一时间序列进行流量异常检测处理,以获取第一数据,所述第一数据是所述第一时间序列的异常点;获取第二数据,所述第二数据为所 述第一时间序列的原始异常点;根据所述第一数据和所述第二数据,调整所述第一类的判定模型的参数,以获取第一目标判定模型。
可选地,所述处理器可以获取多个第一时间序列,根据多个第一时间序列来训练第一目标判定模型。
结合第十三方面,在一种可能的实现方式中,所述第一时间序列的第一类型为周期型或非周期型、毛刺型、平稳型、周期平稳型、非周期平稳型、周期毛刺型或非周期毛刺型。
第十四方面,提供了一种流量异常检测模型训练装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述第一时间序列的原始参数模型,获取所述第一时间序列的第一参数集合;根据第一参数集合对应的第一类的判定模型,对所述第一时间序列进行流量异常检测处理,以获取第四数据,所述第四数据是所述第一时间序列的异常点;获取第二数据,所述第二数据为所述第一时间序列的原始异常点;根据所述第二数据和所述第四数据,调整所述第一类的判定模型的参数,以获取第一目标判定模型。
可选地,所述处理器可以获取多个第一时间序列,根据多个第一时间序列来训练第一目标判定模型。
结合第十四方面,在一种可能的实现方式中,所述第一时间序列的第一类型为周期型或非周期型、毛刺型、平稳型、周期平稳型、非周期平稳型、周期毛刺型或非周期毛刺型。
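In a fifteenth aspect, a training apparatus for a traffic anomaly detection model is provided, including: a memory configured to store a program; and a processor configured to execute the program stored in the memory. When the processor executes the program stored in the memory, the processor is configured to: obtain a first time series, where the first time series includes N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment; process the first time series to obtain a third sub time series, where the third sub time series is the time series formed by the trend components obtained by decomposing each of the N elements in the first time series; perform traffic anomaly detection on the first time series according to a fourth-category decision model to obtain third data, where the third data is the anomaly points of the first time series; obtain second data, where the second data is the original anomaly points of the first time series; and adjust the parameters of the fourth-category decision model according to the second data and the third data to obtain a second target decision model.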
可选地,所述处理器可以获取多个第一时间序列,根据多个第一时间序列来训练第二目标判定模型。
第十六方面,提供了一种流量模式的分类模型训练装置,包括存储器,用于存储程序;处理器,用于执行存储器存储的程序,当所述处理器执行所述存储器存储的程序时,所述处理器用于获取第一时间序列,所述第一时间序列包括N个元素,所述N个元素与所述N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据;根据所述第一时间序列的原始分类模型,获取所述第一时间序列的第一类型;获取所述第一时间序列的原始类型;根据所述第一时间序列的原始类型和所述第一时间序列的第一类型,调整所述第一时间序列的原始模型的参数,以获取所述第一时间序列的目标分类模型。
结合第十六方面,在一种可能的实现方式中,所述第一时间序列的第一类型为周期型或非周期型、毛刺型、平稳型、周期平稳型、非周期平稳型、周期毛刺型或非周期毛刺型。
第十七方面,提供一种计算机存储介质,该计算机存储介质存储有程序代码,该程序代码包括用于执行上述第一方面至第八方面中任一种可能实现方式中的方法。
第十八方面,提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述第一方面至第八方面中任一种可能实现方式中的方法。
第十九方面,提供一种芯片,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,执行上述第一方面至第八方面中任一种可能实现方式中的方法。
可选地,作为一种实现方式,所述芯片还可以包括存储器,所述存储器中存储有指令,所述处理器用于执行所述存储器上存储的指令,当所述指令被执行时,所述处理器用于执行第一方面至第八方面中任一种可能实现方式中的方法。
上述芯片具体可以是现场可编程门阵列FPGA或者专用集成电路ASIC。
附图说明
图1是本申请实施例提供的系统架构的结构示意图。
图2是本申请实施例提供的一种流量模式的分类方法200的示意性流程图。
图3是4种网络设备的网络流量序列。
图4是本申请实施例提供的一种流量异常检测的方法400的示意性流程图。
图5是本申请实施例的对时间序列处理后的示意图。
图6是本申请实施例提供的另一种流量异常检测的方法600的示意性流程图。
图7是本申请实施例的基线流量异常示意图。
图8是本申请实施例提供的另一种流量异常检测的方法800示意性流程图。
图9是本申请实施例提供的一种流量模式的分类模型训练方法900的示意性流程图。
图10是本申请实施例提供的另一种流量异常检测模型训练方法1000的示意性流程图。
图11是本申请实施例提供的又一种流量异常检测模型训练方法1100的示意性流程图。
图12是本申请实施例提供的又一种流量异常检测模型训练方法1200的示意性流程图。
图13是本申请实施例提供的流量模式的分类装置1300的示意性框图。
图14是本申请实施例提供的流量异常检测装置1400的示意性框图。
图15是本申请实施例提供的流量模式的分类模型训练装置1500的硬件结构示意图。
图16是本申请实施例提供的流量异常检测模型训练装置1600的硬件结构示意图。
具体实施方式
为了便于理解,下面先对本申请实施例涉及的几个概念进行介绍。
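1. A time series is a sequence of data points arranged in chronological order. The time interval within a time series is usually constant, so a time series can be analyzed and processed as discrete-time data. Anomaly detection on a time series usually means finding data points that deviate from an established pattern or distribution. Time series anomalies include sudden rises, sudden drops, mean shifts, and the like. Anomaly detection algorithms for time series include algorithms based on statistics and data distribution (N-Sigma), algorithms based on distance/density (the local outlier factor algorithm), isolation forest, prediction-based algorithms (ARIMA), and so on.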
2、流量异常检测,针对网络中设备或端口中采集的流量数据进行异常检测,异常检测结果为发现网络攻击、配置错误和网络设备故障提供依据。
3、N-Sigma算法
假设数据流量在短时间内,服从正态分布,即:
Figure PCTCN2020107627-appb-000019
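where x_t is the traffic data at time t, μ is the mean of the normal distribution, and σ is the standard deviation of the normal distribution (σ² is its variance).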
其中,均值和方差可以利用时间窗内的n个历史流量数据(x 1,x 2,...x t,...x n)进行估计,其估计如下公式:
Figure PCTCN2020107627-appb-000020
Figure PCTCN2020107627-appb-000021
若待检测流量数据x t与均值的距离大于预设值,则该待检测数据流量x t为异常点,即:
Figure PCTCN2020107627-appb-000022
其中,Y为预设值。
若待检测流量数据x t与均值的距离小于或等于预设值,则该待检测数据流量x t为正常点,即:
Figure PCTCN2020107627-appb-000023
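The N-sigma check described above can be sketched as follows. This is a minimal illustration and not the applicant's implementation: it assumes that the preset value Y is expressed as a multiple `k` of the standard deviation estimated from the history window, and the names `n_sigma_flags` and `k` are hypothetical.

```python
import statistics

def n_sigma_flags(history, candidates, k=3.0):
    """Flag each candidate whose distance from the window mean exceeds k * sigma.

    Assumption: the preset value Y equals k * sigma, where sigma is the
    population standard deviation of the history window.
    """
    mu = statistics.fmean(history)
    sigma = statistics.pstdev(history)
    threshold = k * sigma
    return [abs(x - mu) > threshold for x in candidates]

# Eight historical samples, then two new samples to be checked.
history = [10.0, 11.0, 9.0, 10.0, 10.0, 11.0, 9.0, 10.0]
flags = n_sigma_flags(history, [10.5, 25.0])
```

With this window the first sample stays within the band and is treated as normal, while the second lies far outside it and is flagged as an anomaly point.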
在N-sigma算法中,假设数据流量在短时间内是服从正态分布的,对网络流量数据进行异常检测。但是网络流量数据分布在短时间内并不服从正态分布,因此,基于统计与数据分布的算法对网络流量数据进行异常检测的精度不高。
因此,亟需一种能够提高流量数据异常检测精度的方法。
下面结合图1对本申请实施例的系统架构进行详细的介绍。
图1是本申请实施例的系统架构的示意图。如图1所示,系统架构100包括执行设备110、训练设备120、数据库130、网络设备140、数据存储系统150、以及数据采集设备160。
另外,执行设备110包括计算模块111、I/O接口112、预处理模块113和预处理模块114。其中,计算模块111中可以包括目标模型/规则101,预处理模块113和预处理模块114是可选的。
数据采集设备160用于采集训练数据。针对本申请实施例的流量异常检测的方法来说,训练数据可以包括第一时间序列,其中,第一时间序列包括N个元素,N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。在采集到训练数据之后,数据采集设备160将这些训练数据存入数据库130,训练设备120基于数据库130中维护的训练数据训练得到目标模型/规则101。
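The following describes how the training device 120 obtains the target model/rule 101 based on the training data. The training device 120 performs traffic anomaly detection on the first time series and compares the output traffic anomaly detection result of the first time series with the original traffic anomaly result of the first time series, until the difference between the detection result output by the training device 120 and the original traffic anomaly result of the first time series is less than a certain threshold, thereby completing the training of the target model/rule 101. The original traffic anomaly result of the first time series is obtained by an operator through analysis of the first time series.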
上述目标模型/规则101能够用于实现本申请实施例的流量异常检测的方法,即,将目标时间序列(通过相关预处理后)输入该目标模型/规则101,即可得到目标时间序列的检测结果。需要说明的是,在实际的应用中,所述数据库130中维护的第一时间序列不一定都来自于数据采集设备160的采集,也有可能是从其他设备接收得到的。另外需要说明的是,训练设备120也不一定完全基于数据库130维护的第一时间序列进行目标模型/规则101的训练,也有可能从云端或其他地方获取第一时间序列进行模型训练,上述描述不应该作为对本申请实施例的限定。
根据训练设备120训练得到的目标模型/规则101可以应用于不同的系统或设备中,如应用于图1所示的执行设备110,所述执行设备110可以是服务器或者云端等。在图1中,执行设备110配置输入/输出(input/output,I/O)接口112,用于与外部设备进行数据交互,网络设备140向I/O接口112输入数据,所述输入数据在本申请实施例中可以包括:网络设备输入的目标时间序列。这里的网络设备140具体可以是终端设备。
预处理模块113和预处理模块114用于根据I/O接口112接收到的输入数据(如目标时间序列)进行预处理,在本申请实施例中,也可以没有预处理模块113和预处理模块114(也可以只有其中的一个预处理模块),而直接采用计算模块111对输入数据进行处理。
在执行设备110对输入数据进行预处理,或者在执行设备110的计算模块111执行计算等相关的处理过程中,执行设备110可以调用数据存储系统150中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统150中。
最后,I/O接口112将处理结果,如上述得到的目标时间序列的检测结果呈现给网络设备140,从而提供给用户。
值得说明的是,训练设备120可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型/规则101,该相应的目标模型/规则101即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。
在图1中所示情况下,网络设备140可以自动地向I/O接口112发送输入数据。用户可以在网络设备140查看执行设备110输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。网络设备140也可以作为数据采集端,采集如图1所示输入I/O接口112的输入数据及输出I/O接口112的输出结果作为新的样本数据,并存入数据库130。当然,也可以不经过网络设备140进行采集,而是由I/O接口112直接将如图所示输入I/O接口112的输入数据及输出I/O接口112的输出结果,作为新的样本数据存入数据库130。
值得注意的是,图1仅是本申请实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在图1中,数据存储系统150相对执行设备110是外部存储器,在其它情况下,也可以将数据存储系统150置于执行设备110中。
下面将结合附图,对本申请中的技术方案进行描述。
下面结合图2对本申请实施例提供的一种流量模式的分类方法200进行详细的介绍。
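The method shown in FIG. 2 may be performed by a traffic pattern classification apparatus, which may be a server or a cloud with a traffic pattern classification function.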
图2所示的方法200包括步骤210至步骤230。下面分别对这些步骤进行详细的描述。
210、获取目标时间序列,该目标时间序列包括N个元素,该N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。
可选地,上述目标时间序列可以是第一网络设备发送的。
可选地,服务器还可以获取除目标时间序列之外的其他待流量异常检测的时间序列,其中,其他待流量异常检测的时间序列可以是来自同一网络设备,该同一个网络设备可以是第一网络设备,该同一个网络设备也可以是除第一网络设备之外的其他任意一个网络设备。或者,其他待流量异常检测的时间序列可以是来自不同的网络设备。本申请对此并不做限定。
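Optionally, "each of the N elements is the traffic data received at the corresponding moment" can be understood as the traffic data of the first network device collected by the server per unit time (second); that is, the server may collect the traffic data of the first network device once every 1 s to obtain each of the N elements. For example, when N is 4, the target time series includes 4 elements: the server collects the traffic data of the first network device once every 1 s and obtains the target time series collected within 4 s. For example, the target time series may be {1MB, 2MB, 5MB, 9MB}, where 1MB is the traffic data of the first network device at 1 s, 2MB is the traffic data at 2 s, 5MB is the traffic data at 3 s, and 9MB is the traffic data at 4 s. For another example, the target time series may be {1.2MB, 2MB, 3MB, 5MB}, where 1.2MB is the traffic data of the first network device at 1 s, 2MB is the traffic data at 2 s, 3MB is the traffic data at 3 s, and 5MB is the traffic data at 4 s.
Alternatively, "each of the N elements is the traffic data received at the corresponding moment" can be understood as the total traffic data of the first network device collected by the server within a preset time, where the preset time is greater than the unit time. For example, when N is 6, the target time series includes 6 elements: the server collects the traffic data of the first network device once every preset time and obtains the target time series over 6 preset times. For example, when N is 6 and the preset time is 10 s, the server collects the traffic data of the first network device once every 10 s and obtains the target time series collected within 1 min. For example, the target time series may be {1MB, 2.5MB, 5MB, 8MB, 12MB, 18MB}, where 1MB is the traffic data of the first network device within 1 s-10 s, 2.5MB is the traffic data within 11 s-20 s, 5MB is the traffic data within 21 s-30 s, 8MB is the traffic data within 31 s-40 s, 12MB is the traffic data within 41 s-50 s, and 18MB is the traffic data within 51 s-60 s. For another example, when N is 5 and the preset time is 5 s, the target time series may be {1MB, 2MB, 3.5MB, 6MB, 8MB}, where 1MB is the traffic data of the first network device within 1 s-5 s, 2MB is the traffic data within 6 s-10 s, 3.5MB is the traffic data within 11 s-15 s, 6MB is the traffic data within 16 s-20 s, and 8MB is the traffic data within 21 s-25 s.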
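The second interpretation, in which each element is the total traffic within a preset time, can be illustrated with a short sketch. The function name `aggregate_traffic` and the choice to drop a trailing incomplete window are assumptions made for illustration only; the application does not specify how partial windows are handled.

```python
def aggregate_traffic(per_second_mb, window_s):
    """Sum per-second traffic samples into consecutive windows of window_s seconds.

    Assumption: a trailing window with fewer than window_s samples is dropped.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    full = len(per_second_mb) - len(per_second_mb) % window_s
    return [sum(per_second_mb[i:i + window_s]) for i in range(0, full, window_s)]

# Eleven per-second samples aggregated with a preset time of 5 s.
samples = [1, 1, 2, 1, 3, 2, 2, 1, 4, 3, 1]
series = aggregate_traffic(samples, 5)
```

With a preset time of 1 s the function returns the per-second series unchanged, which corresponds to the first interpretation.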
220、根据目标时间序列,获取目标时间序列的目标参数,该目标参数包括周期因子和/或抖动密度,其中,周期因子用于表示目标时间序列中呈现出来的围绕长期趋势的一种波浪形变动,抖动密度用于表示目标时间序列在目标时间内实际值与目标值的偏差。
可选地,将目标时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残 余分量;确定包括N个周期分量的第一子时间序列和N个残余分量的第二子时间序列,并根据第一子时间序列或第二子时间序列,获取目标时间序列的目标参数。
其中,目标时间序列中的N个元素中的每个元素分解的N个周期分量是相同的,即一个时间序列中的每个元素分解的每个周期分量是相同的。
例如,x n为目标时间序列中的一个元素,将x n分解为趋势分量T n、周期分量S n、残余分量C n;S n为第一子时间序列中的元素,C n为第二子时间序列中的元素。其中,x n可以表示为以下公式:
x n=T n+S n+C n
可选地,服务器可以通过时间序列分解(Time Series Decompose,TSD)算法将目标时间序列中的N个元素中的每个元素分解为趋势分量、周期分量、残余分量。
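The application does not state which TSD variant is used. The following sketch shows one common additive scheme, under the assumptions that the period is known in advance, the trend is a moving average over roughly one period, and the periodic component is the per-phase mean of the detrended series. The function name `decompose_additive` is hypothetical.

```python
def decompose_additive(x, period):
    """Split x into trend T, periodic S and residual C so that x[n] = T[n] + S[n] + C[n]."""
    n = len(x)
    if period < 1 or period > n:
        raise ValueError("period must be between 1 and len(x)")
    half = period // 2
    # Trend: moving average over a window clipped at the series boundaries.
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend.append(sum(x[lo:hi]) / (hi - lo))
    detrended = [x[i] - trend[i] for i in range(n)]
    # Periodic component: mean of the detrended values at each phase of the period.
    phase_mean = []
    for p in range(period):
        vals = detrended[p::period]
        phase_mean.append(sum(vals) / len(vals))
    seasonal = [phase_mean[i % period] for i in range(n)]
    # Residual: whatever is left after removing trend and periodic component.
    residual = [x[i] - trend[i] - seasonal[i] for i in range(n)]
    return trend, seasonal, residual

x = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 5.0]
T, S, C = decompose_additive(x, 2)
```

In the terms used above, `S` plays the role of the first sub time series and `C` the role of the second sub time series.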
可选地,根据第一子时间序列,确定目标时间序列是否存在周期因子。
具体地,在第一子时间序列中的N个周期分量存在的情况下,确定目标时间序列存在周期因子;在第一子时间序列中的N个周期分量不存在的情况下,确定目标时间序列不存在周期因子。
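Optionally, "the N periodic components exist" can be understood as the N periodic components all being valid values; for example, a periodic component may be 1, or may be 3. "The N periodic components do not exist" can be understood as the N periodic components all being invalid values; for example, a periodic component may be 0. Here, "all valid" means that the N periodic components obtained by decomposing the time series are all non-zero, and "all invalid" means that they are all zero. For example, the time series shown in FIG. 3(c) is a periodic time series; decomposing it yields N periodic components that are all non-zero, that is, the N periodic components are valid values and the periodic components exist. For another example, the time series shown in FIG. 3(d) is an aperiodic time series; decomposing it yields N periodic components that are all zero, that is, the N periodic components are invalid values and the periodic components do not exist.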
可选地,若目标时间序列存在周期因子,可以将周期因子设为1;若目标时间序列不存在周期因子,可以将周期因子设为0。
可选地,根据第二子时间序列,确定目标时间序列的抖动密度。
具体地,服务器根据以下公式确定目标时间序列的R:
Figure PCTCN2020107627-appb-000024
其中,R是抖动密度,r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000025
其中,C n为第二子时间序列中的第n个元素,x n为目标时间序列中的第n个元素;
N根据以下公式确定:
Figure PCTCN2020107627-appb-000026
T为目标时间序列的长度,W为加入窗口的窗长,α是第一预设值。
230、根据目标参数,对目标时间序列进行分类。
具体地,在周期因子存在的情况下,将目标时间序列确定为周期型;在周期因子不存在的情况下,将目标时间序列确定为非周期型。在抖动密度大于第二预设值的情况下,将目标时间序列确定为毛刺型;在抖动密度小于或等于第二预设值的情况下,将目标时间序列确定为平稳型。
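Classifying the target time series according to the target parameters yields the following 8 cases, where S is the periodic factor and R is the jitter density:
① when S exists, that is, S=1, the first type to which the target time series belongs is the periodic type;
② when S does not exist, that is, S=0, the first type to which the target time series belongs is the aperiodic type;
③ when R is greater than the second preset value, the first type to which the target time series belongs is the spiky type;
④ when R is less than or equal to the second preset value, the first type to which the target time series belongs is the stable type;
⑤ when S exists and R is greater than the second preset value, the first type to which the target time series belongs is the periodic spiky type;
⑥ when S does not exist and R is greater than the second preset value, the first type to which the target time series belongs is the aperiodic spiky type;
⑦ when S exists and R is less than or equal to the second preset value, the first type to which the target time series belongs is the periodic stable type;
⑧ when S does not exist and R is less than or equal to the second preset value, the first type to which the target time series belongs is the aperiodic stable type.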
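The combined cases, in which both the periodic factor and the jitter density are available, can be sketched as a small function. The name `classify`, the parameter `beta` (standing for the second preset value) and the English type labels are illustrative assumptions.

```python
def classify(periodic_factor, jitter_density, beta):
    """Return the combined type from S (0 or 1) and R compared with beta.

    beta stands for the second preset value; its numeric value is not
    given in the application.
    """
    periodicity = "periodic" if periodic_factor == 1 else "aperiodic"
    shape = "spiky" if jitter_density > beta else "stable"
    return f"{periodicity} {shape}"

label = classify(periodic_factor=1, jitter_density=4.0, beta=2.0)
```

A jitter density exactly equal to `beta` falls on the stable side, matching the "less than or equal to" condition above.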
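Optionally, the method 200 may further include step 240.
240. Output the classification result of the target time series. That is, output that the target time series is the periodic type or the aperiodic type; or output that the target time series is the stable type or the spiky type; or output that the target time series is the periodic spiky type, the periodic stable type, the aperiodic spiky type, or the aperiodic stable type.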
例如,如图3所示,为4种网络设备的网络流量序列。从图3(a)、图3(b)、图3(c)和图3(d)中可以看出,网络流量的特征变化很大。其中,如图3(a)所示的网络设备的流量则在各个时刻比较平稳,即流量序列可以为平稳型;如图3(b)所示的网络设备的流量局部抖动明显,即流量序列可以为毛刺型;如图3(c)所示的网络设备的流量呈现出极强的周期特性(天、周),即流量序列可以为周期型;如图3(d)所示的网络设备的流量周期特性(天、周)并不明显,即流量序列可以为非周期型。
通过将目标时间序列进行分类,从而可以为后续对分类后的目标时间序列的流量异常检测打下基础,从而提高了目标时间序列的流量异常检测的精度。
下面结合图4对本申请实施例提供的一种流量异常检测的方法400进行详细的介绍。
图4所示的方法可以是由流量异常检测装置执行,该流量异常检测装置可以是具有流量异常检测功能的服务器。
图4所示的方法包括步骤410至步骤440。下面分别对这些步骤进行详细的描述。
本申请实施例中所述的时间序列的异常情况可以理解为时间序列的流量异常情况。
410、获取目标时间序列,该目标时间序列包括N个元素,该N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。
可选地,上述目标序列可以是第一网络设备发送的。
可选地,服务器还可以获取除目标时间序列之外的其他待流量异常检测的时间序列,其中,其他待流量异常检测的时间序列可以是来自同一网络设备,该同一个网络设备可以是第一网络设备,该同一个网络设备也可以是除第一网络设备之外的其他任意一个网络设备。或者,其他待流量异常检测的时间序列可以是来自不同的网络设备。
420、根据目标时间序列,获取目标时间序列的目标参数,该目标参数包括周期因子和/或抖动密度,其中,周期因子用于表示目标时间序列中呈现出来的围绕长期趋势的一种波浪形变动,抖动密度用于表示目标时间序列在目标时间内实际值与目标值的偏差。
可选地,将目标时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个周期分量的第一子时间序列和包括N个残余分量的第二子时间序列;根据第一子时间序列或第二子时间序列,获取目标时间序列的目标参数。
例如,x n为目标时间序列中的一个元素,将x n分解为趋势分量T n、周期分量S n、残余分量C n;所述S n为第一子时间序列中的元素,所述C n为第二子时间序列中的元素。其中,x n可以表示为以下公式:
x n=T n+S n+C n
可选地,服务器可以通过时间序列分解(Time Series Decompose,TSD)算法将目标时间序列中的每个元素分解为趋势分量、周期分量、残余分量。
可选地,根据第一子时间序列,确定目标时间序列是否存在周期因子。
具体地,在第一子时间序列中的N个周期分量存在的情况下,确定目标时间序列存在周期因子;在第一子时间序列中的N个周期分量不存在的情况下,确定目标时间序列不存在周期因子。
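Optionally, the periodic components obtained by decomposing each element of a time series are the same, that is, one time series corresponds to one periodic component. For example, the periodic components of all elements in the time series may all be 1, or may all be 2.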
可选地,上述N个周期分量存在可以理解为N个周期分量都是有效值;例如,该周期分量可以为0.5,该周期分量也可以为2。上述N个周期分量不存在可以理解为N个周期分量都是无效值;例如,该周期分量可以是0。
可选地,若目标时间序列存在周期因子,可以将周期因子设为1;若目标时间序列不存在周期因子,可以将周期因子设为0。
可选地,根据第二子时间序列,确定目标时间序列的抖动密度。
具体地,服务器根据以下公式确定目标时间序列的R:
Figure PCTCN2020107627-appb-000027
其中,R是抖动密度,r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000028
其中,C n为第二子时间序列中的第n个元素,x n为目标时间序列中的第n个元素;
N根据以下公式确定:
Figure PCTCN2020107627-appb-000029
T为目标时间序列的长度,W为加入窗口的窗长,α是第一预设值。
步骤410和步骤420中未描述的内容可以参考上述方法200中步骤210和步骤220的描述,这里不再赘述。
430、根据目标参数,从多个类型中确定目标时间序列所属于的第一类型,其中,所述多个类型中的每个类型对应一个参数集合,目标参数属于第一类型对应的参数集合。
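The foregoing types may include: periodic type, aperiodic type, stable type, spiky type, periodic stable type, periodic spiky type, aperiodic stable type, or aperiodic spiky type.
The periodic type and the aperiodic type may be determined according to the periodic factor. Specifically, when the periodic factor exists, the type is the periodic type; when the periodic factor does not exist, the type is the aperiodic type.
The stable type and the spiky type may be determined according to the jitter density. Specifically, when the jitter density is greater than the second preset value, the type is the spiky type; when the jitter density is less than or equal to the second preset value, the type is the stable type.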
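It should be understood that the type to which the target time series belongs may fall into the following 8 cases, where S is the periodic factor and R is the jitter density: ① when S exists, that is, S=1, the first type to which the target time series belongs is the periodic type; ② when S does not exist, that is, S=0, the first type is the aperiodic type; ③ when R is greater than the second preset value, the first type is the spiky type; ④ when R is less than or equal to the second preset value, the first type is the stable type; ⑤ when S exists and R is greater than the second preset value, the first type is the periodic spiky type; ⑥ when S does not exist and R is greater than the second preset value, the first type is the aperiodic spiky type; ⑦ when S exists and R is less than or equal to the second preset value, the first type is the periodic stable type; ⑧ when S does not exist and R is less than or equal to the second preset value, the first type is the aperiodic stable type.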
可选地,参数集合可以理解为{S=1},{S=0},{R>β},{R≤β},其中β可以是第二预设值。
首先,根据目标参数,从多个参数集合中确定目标参数属于的第一参数集合;其次,根据第三映射关系和第一参数集合,从多个类型中确定目标时间序列所属于的第一类型,第三映射关系包括多个参数集合和多个所述类型的对应关系。即通过计算得到目标参数,并根据目标参数是属于哪个参数集合,从而确定出第一参数集合,并根据第一参数集合所 对应的第一类型,确定目标时间序列所属于的第一类型。
440、根据第一类型对应的第一类的判定模型,检测目标时间序列的异常情况,其中,多个类型中的每个类型对应一个类型的判定模型,判定模型用于流量异常检测。可选地,可以根据第二映射关系和目标时间序列所属于的第一类型,确定第一类型对应的第一类的判定模型,第二映射关系包括多个类型和多个第一类的判定模型的对应关系。
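The first-category decision model may specifically be as follows: first, determine a third sub time series including the N trend components, and divide a second time series into M sub-sequences of a target length, where M is a positive integer and the second time series is the third sub time series, or the second time series is formed according to the third sub time series and the linear segmentation algorithm PLR (piecewise linear representation); second, calculate the matrix profile (MP) values of the M sub-sequences of the target length, where these MP values form an MP time series; finally, detect the anomaly of the target time series according to the MP time series and the N-sigma algorithm.
In the PLR representation of a time series, the number of line segments determines the granularity with which the original series is approximated. The more segments there are, the shorter their average length, which reflects short-term fluctuations of the time series; the fewer segments there are, the longer their average length, which reflects the medium- and long-term trend of the time series. The PLR representation uses M end-to-end adjacent straight line segments to approximately represent a time series of length m (m>>M).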
其中,趋势分量反应的是时间序列整体变化,因此,将包括N个趋势分量的第三子时间序列进行异常情况检测,可以提高时间序列异常检测结果的精准。
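In the embodiments of this application, the third sub time series (the time series formed by the trend components obtained by decomposing each of the N elements in the target time series) is represented by PLR using a top-down algorithm. The start point and the end point of the traffic data are selected first as segmentation points. Then, all points between these two points are traversed to find the point with the greatest distance to the straight line connecting the two points; if the distance from this point to the line is greater than a preset threshold, it is taken as the third segmentation point. This yields two line segments, formed by the newly added point with its left neighboring segmentation point and with its right neighboring segmentation point. The search for the farthest point continues in each of the two segments; of the two points found, the one with the greater distance to its segment, provided that this distance is greater than the threshold, is taken as the fourth segmentation point. This is repeated until no point with a distance greater than the threshold can be found, at which point the segmentation is complete. The threshold, that is, the distance from a point to a line segment, uses the Euclidean distance.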
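The top-down segmentation can be sketched as follows. This is an illustrative reading of the description, not the applicant's code: it assumes that the sample index serves as the horizontal coordinate and that the distance is the perpendicular distance from a point to the straight line through the two endpoints of its segment. The name `plr_top_down` is hypothetical.

```python
def plr_top_down(y, threshold):
    """Return the sorted segmentation-point indices of a top-down PLR of y."""
    if len(y) < 2:
        return list(range(len(y)))

    def farthest(lo, hi):
        # Point strictly between lo and hi with the largest perpendicular
        # distance to the line through (lo, y[lo]) and (hi, y[hi]).
        dx, dy = hi - lo, y[hi] - y[lo]
        norm = (dx * dx + dy * dy) ** 0.5
        best_i, best_d = None, 0.0
        for i in range(lo + 1, hi):
            d = abs(dy * (i - lo) - dx * (y[i] - y[lo])) / norm
            if d > best_d:
                best_i, best_d = i, d
        return best_i, best_d

    # The start point and end point are the first two segmentation points.
    breakpoints = [0, len(y) - 1]
    while True:
        # Among all current segments, take the point farthest from its segment.
        candidates = [farthest(a, b) for a, b in zip(breakpoints, breakpoints[1:])]
        idx, dist = max(candidates, key=lambda c: c[1])
        if idx is None or dist <= threshold:
            return breakpoints
        breakpoints.append(idx)
        breakpoints.sort()

points = plr_top_down([0, 1, 2, 3, 2, 1, 0], threshold=0.5)
```

A larger threshold yields fewer, longer segments, which captures the medium- and long-term trend; a smaller threshold keeps more short-term detail.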
可选地,将第二时间序列通过矩阵轮廓(Matrix Profile,MP)分成M个目标长度的子序列,并通过MP计算M个目标长度的子序列的矩阵轮廓MP值。其中,MP是从时间序列的结构上描述序列轮廓的方法,常用于时间序列的聚类、密度估计、图形发现等。MP的原理是将整个时间序列切割成定长的子序列,再分别计算子序列与其他子序列的欧式距离,取最小值作为该序列的MP值。例如,时间序列X={x 0,x 1,...x n-2,x n-1}通过MP分成若干个子序列
Figure PCTCN2020107627-appb-000030
子序列
Figure PCTCN2020107627-appb-000031
与原始时间序列X的MP值为子序列与原始序列中的其他子序列的距离的最小值,即
Figure PCTCN2020107627-appb-000032
j∈[0,n-m]。
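The MP computation described above can be sketched naively as follows. The sketch follows the plain description (fixed-length sub-sequences, Euclidean distance to every other sub-sequence, minimum taken as the MP value); practical matrix profile implementations typically also z-normalize sub-sequences and exclude overlapping neighbors, which is not done here. The function name `matrix_profile` is an assumption.

```python
import math

def matrix_profile(x, m):
    """Naive O(n^2 * m) matrix profile with plain Euclidean distance.

    The i-th value is the minimum distance between the sub-sequence starting
    at index i and every other sub-sequence of the same length m.
    """
    count = len(x) - m + 1
    if m < 1 or count < 2:
        raise ValueError("need at least two sub-sequences of length m")
    subs = [x[i:i + m] for i in range(count)]
    profile = []
    for i in range(count):
        profile.append(min(math.dist(subs[i], subs[j]) for j in range(count) if j != i))
    return profile

# A single spike makes the sub-sequences that contain it unlike all others.
mp = matrix_profile([0, 0, 0, 10, 0, 0, 0], m=2)
```

Sub-sequences that have a close match elsewhere in the series get a small MP value, while sub-sequences that are unlike any other get a large MP value, which is what the subsequent N-sigma step looks for.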
根据MP时间序列和N-sigma算法,检测目标时间序列的异常情况具体如下:
假设MP序列可以是MP={mp 0,mp 1,...mp n-m},则
Figure PCTCN2020107627-appb-000033
Figure PCTCN2020107627-appb-000034
其中,μ mp为MP时间序列的均值,σ mp为MP时间序列的方差。
Figure PCTCN2020107627-appb-000035
则目标时间序列中的第i个元素所对应的时刻接收到的流量数据为异常数据;若
Figure PCTCN2020107627-appb-000036
则目标时间序列中的第i个元素所对应的时刻接收到的流量数据为正常数据。其中,δ为预设值。
其中,根据MP时间序列和N-sigma算法,可以检测出MP时间序列中的流量异常点,MP时间序列中的流量异常点所对应的时刻即为目标时间序列的流量异常点的时刻,从而可以根据MP时间序列中的流量异常点得到目标时间序列的流量异常点。
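For example, as shown in FIG. 5, the upper plot shows the original time series; the middle plot shows the time series after the original series is represented by PLR; the lower plot shows the time series formed by splitting the PLR-represented series into M sub-sequences and calculating the MP values of the M sub-sequences. As can be seen from FIG. 5, the original series has an abrupt change at an abscissa of about 60, and representing the original time series by PLR smooths out this abrupt change. In practice, a traffic time series may fluctuate over a short period and then quickly return to normal. Therefore, the time series needs to undergo PLR representation and MP value calculation to ensure accurate traffic anomaly detection on the time series, thereby improving the precision of traffic anomaly detection. Optionally, after the anomaly of the target time series is detected according to the first-category decision model corresponding to the first type, the anomaly points of the target time series may be output.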
可选地,服务器还可以对目标时间序列进行另一种流量异常检测,即服务器还可执行步骤450。
步骤450,根据第二子时间序列和第一类型对应的第二类的判定模型,检测目标时间序列的异常情况,第二类的判定模型是N-sigma模型。
其中,根据第一映射关系和目标时间序列所属于的第一类型,确定第一类型对应的第二类的判定模型,第一映射关系包括多个类型和多个第二类的判定模型的对应关系。
其中,第二类判定模型可以具体是:
Figure PCTCN2020107627-appb-000037
Figure PCTCN2020107627-appb-000038
其中,μ 2为第二子时间序列的均值,σ 2为第二子时间序列的方差。
Figure PCTCN2020107627-appb-000039
则目标时间序列中的第n个元素所对应的时刻接收到的流量数据为异常数据;若
Figure PCTCN2020107627-appb-000040
则目标时间序列中的第n个元素所对应的时刻接收到的流量数据为正常数据。其中,
Figure PCTCN2020107627-appb-000041
为预设值。
其中,根据第二子时间序列和第一类型对应的第二类的判定模型,可以检测出第二子时间序列中的流量异常点,第二子时间序列中的流量异常点所对应的时刻即为目标时间序列的流量异常点的时刻,从而可以根据第二子时间序列中的流量异常点得到目标时间序列 的流量异常点。可选地,根据第二子时间序列和第一类型对应的第二类的判定模型,检测目标时间序列的异常情况之后,可以输出目标时间序列的异常点。
上述实施例中,根据时间序列的类型和时间序列的类型对应的判定模型进行异常检测,其中,每一类的时间序列对应一类判定模型,即每类型的时间序列,都存在与该类型时间序列相对应的第一类的判定模型和第二类的判定模型。例如,周期型的时间序列,存在与之对应的第一类的判定模型,也存在与之对应的第二类的判定模型;非周期型的时间序列,存在与之对应的第一类的判定模型,也存在与之对应的第二类的判定模型。从而可以针对每一类的时间序列和每一类的时间序列对应的判定模型进行时间序列的异常检测,从而提高了时间序列异常检测的精度。
下面结合图6对本申请实施例提供的另一种流量异常检测的方法600进行详细的介绍。
图6所示的方法可以是由流量异常检测装置执行,该流量异常检测装置可以是具有流量异常检测功能的服务器。
图6所示的方法包括步骤610至步骤640。下面分别对这些步骤进行详细的描述。
610、获取目标时间序列;目标时间序列包括N个元素,N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时间点接收到的流量数据。
620、根据目标时间序列,获取目标时间序列的目标参数,目标参数包括周期因子和/或抖动密度,其中,周期因子用于表示目标时间序列中呈现出来的围绕长期趋势的一种波浪形变动,抖动密度用于表示目标时间序列在目标时间内实际值与目标值的偏差。
可选地,将所述目标时间序列中的所述N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个所述周期分量的第一子时间序列和包括N个所述残余分量的第二子时间序列,并根据第一子时间序列或第二子时间序列,获取目标时间序列的目标参数。
可选地,根据第一子时间序列,确定目标时间序列是否存在周期因子。
具体地,在第一子时间序列中的N个周期分量存在的情况下,确定目标时间序列存在周期因子;在第一子时间序列中的N个周期分量不存在的情况下,确定目标时间序列不存在周期因子。
可选地,在目标时间序列存在周期因子的情况下,可以默认为周期因子S=1;在目标时间序列不存在周期因子的情况下,可以默认为周期因子S=0。
可选地,根据第二子时间序列,确定目标时间序列的抖动密度。
具体地,服务器根据以下公式确定目标时间序列的R:
Figure PCTCN2020107627-appb-000042
其中,R是抖动密度,r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000043
其中,C n为第二子时间序列中的第n个元素,x n为目标时间序列中的第n个元素;
N根据以下公式确定:
Figure PCTCN2020107627-appb-000044
T为目标时间序列的长度,W为加入窗口的窗长,α是第一预设值。
上述步骤610和步骤620中未描述的内容可以参考上述方法200中步骤210和步骤220的描述,这里不再赘述。
630、从多个参数集合中确定所述目标参数所属于的第一参数集合。
可选地,其中,多个参数集合可以是{S=1},{S=0},{R>β},{R≤β}。
例如,在β等于2的情况下,若确定的目标时间序列的抖动密度为4,则可以确定第一参数集合是{R>β},若确定的目标时间序列的抖动密度为1,则可以确定第一参数集合是{R≤β}。
例如,在目标序列存在周期因子的情况下,则S=1,即第一参数集合是{S=1};在目标序列不存在周期因子的情况下,则S=0,即第一参数集合是{S=0}。
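The two examples above (β equal to 2, and a periodic factor that exists or does not exist) can be condensed into a short sketch. The function name `parameter_sets` and the string labels for the sets are hypothetical.

```python
def parameter_sets(periodic_factor, jitter_density, beta):
    """Return the labels of the parameter sets that the target parameters fall into."""
    sets = ["S=1" if periodic_factor == 1 else "S=0"]
    sets.append("R>beta" if jitter_density > beta else "R<=beta")
    return sets

# beta = 2 as in the example: a jitter density of 4 falls into {R>beta}.
first_sets = parameter_sets(periodic_factor=1, jitter_density=4, beta=2)
```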
640、根据第一参数集合对应的第一类的判定模型,检测目标时间序列的异常情况,其中,多个参数集合中的每个参数集合对应一个类型的判定模型,判定模型用于流量异常检测。
可选地,根据第五映射关系和目标参数所属于的第一参数集合,确定第一参数集合对应的第一类的判定模型,第五映射关系包括多个参数集合和多个第一类的判定模型的对应关系。
具体地,确定包括N个所述趋势分量的第三子时间序列;将第二时间序列分成M个目标长度的子序列,M为正整数,第二时间序列是第三子时间序列或第二时间序列是根据第三子时间序列和线性分段算法PLR形成的;计算M个目标长度的子序列的矩阵轮廓MP值,M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;根据所述MP时间序列和N-sigma算法,检测目标时间序列的异常情况。
其中,根据MP时间序列和N-sigma算法,可以检测出MP时间序列中的流量异常点,MP时间序列中的流量异常点所对应的时刻即为目标时间序列的流量异常点的时刻,从而可以根据MP时间序列中的流量异常点得到目标时间序列的流量异常点。
可选地,在根据第一参数集合对应的第一类的判定模型,检测目标时间序列的异常情况之后,可以输出目标时间序列的异常点。
可选地,上述方法600还可以包括步骤650。
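650. Detect the anomaly of the target time series according to the second sub time series and the second-category decision model corresponding to the first parameter set, where the second-category decision model is an N-sigma model.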
根据第四映射关系和目标参数所属于的第一参数集合,确定第一参数集合对应的第二类的判定模型,第四映射关系包括多个参数集合和多个第二类的判定模型的对应关系。
其中,根据第二子时间序列和第一参数集合对应的第二类的判定模型,可以检测出第二子时间序列中的流量异常点,第二子时间序列中的流量异常点所对应的时刻即为目标时间序列的流量异常点的时刻,从而可以根据第二子时间序列中的流量异常点得到目标时间序列的流量异常点。
可选地,在根据第一参数集合对应的第二类的判定模型,检测目标时间序列的异常情况之后,可以输出目标时间序列的异常点。
如图7所示,由于业务切换或设备发生故障等,端口流量会出现长时间的整体变化,该类异常可以归纳为基线流量异常,因此,本申请实施例提供的另一种流量异常检测的方法800,来进行基线流量异常检测。下面结合图8对本申请实施例提供的另一种流量异常检测的方法800进行详细的介绍。
图8所示的方法可以是由流量异常检测装置执行,该流量异常检测装置可以是具有流量异常检测功能的服务器。
The method shown in FIG. 8 includes steps 810 to 850. These steps are described in detail below.
810、获取目标时间序列,目标时间序列包括N个元素,N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。
820、将目标时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个所述趋势分量的第三子时间序列。
可选地,根据TSD算法,将目标时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量。
830、将第二时间序列分成M个目标长度的子序列,所述M为正整数,所述第二时间序列是所述第三子时间序列或所述第二时间序列是根据所述第三子时间序列和线性分段算法PLR形成的。
可选地,目标长度是通信协议规定的。
840、计算M个目标长度的子序列的矩阵轮廓MP值,M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列。
上述步骤810至步骤840中未描述的内容可以参考上述方法400方法相应步骤的描述,这里不再赘述。
850、根据第三类的判定模型,检测MP时间序列的异常情况。
其中,第三类的判定模型是N-sigma模型。
其中,根据MP时间序列和第三类的判定模型,可以检测出MP时间序列中的流量异常点,MP时间序列中的流量异常点所对应的时刻即为目标时间序列的流量异常点的时刻,从而可以根据MP时间序列中的流量异常点得到目标时间序列的流量异常点。
可选地,根据MP时间序列和第三类的判定模型,检测MP时间序列的异常情况后,即可得到目标时间序列的流量异常情况,继而可输出目标时间序列中的异常点。
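The flow of method 800 (trend extraction, MP calculation, N-sigma on the MP time series) can be sketched end to end as follows. This is only an illustration under several assumptions: the trend is approximated by a moving average rather than a full TSD decomposition, the second time series is taken to be the third sub time series directly (no PLR step), and the window, sub-sequence length `m` and multiplier `k` are hypothetical values.

```python
import math
import statistics

def moving_average(x, window):
    # Stand-in for the trend component; the window is clipped at the boundaries.
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def matrix_profile(x, m):
    # Requires at least two sub-sequences of length m.
    subs = [x[i:i + m] for i in range(len(x) - m + 1)]
    return [min(math.dist(s, t) for j, t in enumerate(subs) if j != i)
            for i, s in enumerate(subs)]

def baseline_anomalies(x, window=3, m=4, k=2.0):
    """Return the start indices of sub-sequences whose MP value is an N-sigma outlier."""
    trend = moving_average(x, window)
    mp = matrix_profile(trend, m)
    mu, sigma = statistics.fmean(mp), statistics.pstdev(mp)
    return [i for i, v in enumerate(mp) if abs(v - mu) > k * sigma]

# A level shift halfway through the series, as in a baseline traffic anomaly.
shifted = [1.0] * 20 + [9.0] * 20
hits = baseline_anomalies(shifted)
```

The returned indices point at sub-sequences that straddle the level shift; a constant series produces no hits because every sub-sequence has an identical match.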
上述方法800还可以包括步骤860。
860、确定包括N个残余分量的第二子时间序列,根据第二子时间序列和第三类的判定模型,检测第二子时间序列的异常情况,所述第二子时间序列是对所述目标时间序列中 的N个元素中的每个元素分解的残差分量形成的时间序列。
其中,根据第二子时间序列和第三类的判定模型,可以检测出第二子时间序列中的流量异常点,第二子时间序列中的流量异常点所对应的时刻即为目标时间序列的流量异常点的时刻,从而可以根据第二子时间序列中的流量异常点得到目标时间序列的流量异常点。
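After the anomaly of the second sub time series is detected according to the third-category decision model, the anomaly points in the second sub time series may be output.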
以上,结合图2详细说明了本申请实施例的流量模式的分类方法,以及结合图4至图8详细说明了本申请实施例的流量异常检测的方法,以下,结合图9详细说明本申请实施例提供的流量模式的分类模型训练方法,以及结合图10至图12详细说明本申请实施例提供的流量异常检测模型训练方法。
图9是本申请实施例提供的一种流量模式的分类模型训练方法900的示意性流程图。图9所示的方法可以由计算机设备、服务器设备或者运算设备等运算能力较强的设备来执行。图9所示的方法包括步骤910至940,下面分别对这几个步骤进行详细的介绍。
910、获取第一时间序列,第一时间序列包括N个元素,N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。
可选地,该第一时间序列中的每个元素为所对应的时刻接收的流量数据可以理解为第一时间序列中的每个元素为所对应的时刻接收的历史流量数据。
可选地,还可以获取多个第一时间序列。
920、根据第一时间序列的原始分类模型,获取第一时间序列的第一类型。
可选地,原始分类模型的步骤包括步骤1至步骤4。
步骤1,根据TSD算法,将第一时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个周期分量的第一子时间序列和包括N个残余分量的第二子时间序列。
步骤2:根据第一子时间序列中的N个周期分量,确定第一时间序列的周期因子。在第一子时间序列中的N个周期分量存在的情况下,确定第一时间序列的周期因子存在,可以将周期因子确定为1;在第一子时间序列中的N个周期分量不存在的情况下,确定第一时间序列的周期因子不存在,可以将周期因子确定为0。
步骤3:根据第二子时间序列中的N个残差分量,确定第一时间序列的抖动密度。
具体地,根据以下公式确定第一时间序列的R:
Figure PCTCN2020107627-appb-000045
其中,R为抖动密度,r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000046
其中,C n为第二子时间序列中的第n个元素,x n为目标时间序列中的第n个元素;
N根据以下公式确定:
Figure PCTCN2020107627-appb-000047
T为第一时间序列的长度,W为加入窗口的窗长,α是第一预设值。
步骤4、根据周期因子S和抖动密度R,确定第一时间序列的第一类型。
在S存在的情况下,第一时间序列的第一类型为周期型;在S不存在的情况下,第一时间序列的第一类型为非周期型;在R大于第二预设值的情况下,第一时间序列的第一类型为毛刺型;在R小于或等于第二预设值的情况下,第一时间序列的第一类型为平稳型;在S存在,且R大于第二预设值的情况下,第一时间序列的第一类型为周期毛刺型;在S存在,且R小于或等于第二预设值的情况下,第一时间序列的第一类型为周期平稳型;在S不存在,且R大于第二预设值的情况下,第一时间序列的第一类型为非周期毛刺型;在S不存在,且R小于或等于第二预设值的情况下,第一时间序列的第一类型为非周期平稳型。
930、获取第一时间序列的原始类型。
940、根据第一时间序列的原始类型和第一时间序列的第一类型,调整第一时间序列的原始模型的参数,以获取第一时间序列的目标分类模型。
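When the original type of the first time series is inconsistent with the first type of the first time series, the parameters of the original model of the first time series are adjusted, where these parameters include the first preset value and the second preset value. If the original type of the first time series is the spiky type while the first type of the first time series is the stable type, the first preset value may be reduced accordingly, or the second preset value may be reduced accordingly.
Optionally, the server may obtain multiple first time series and train the target classification model of the first time series according to the multiple first time series, that is, continuously optimize the target classification model of the first time series according to the multiple first time series.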
图10是本申请实施例提供的另一种流量异常检测模型训练方法1000的示意性流程图。图10所示的方法可以由计算机设备、服务器设备或者运算设备等运算能力较强的设备来执行。图10所示的方法包括步骤1010至1050,下面分别对这几个步骤进行详细的介绍。
1010、获取第一时间序列,第一时间序列包括N个元素,N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。
可选地,该第一时间序列中的每个元素为所对应的时刻接收的流量数据可以理解为第一时间序列中的每个元素为所对应的时刻接收的历史流量数据。
可选地,还可以获取多个第一时间序列。
1020、根据所述第一时间序列的原始分类模型,获取所述第一时间序列的第一类型。
可选地,原始分类模型的步骤包括步骤1至步骤4。
步骤1,根据TSD算法,将第一时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个周期分量的第一子时间序列和包括N个残余分量的第二子时间序列。
步骤2:根据第一子时间序列中的N个周期分量,确定第一时间序列的周期因子。在第一子时间序列中的N个周期分量存在的情况下,确定第一时间序列的周期因子存在,可以将周期因子确定为1;在第一子时间序列中的N个周期分量不存在的情况下,确定第一时间序列的周期因子不存在,可以将周期因子确定为0。
步骤3:根据第二子时间序列中的N个残差分量,确定第一时间序列的抖动密度。
具体地,根据以下公式确定第一时间序列的R:
Figure PCTCN2020107627-appb-000048
其中,R为抖动密度,r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000049
其中,C n为第二子时间序列中的第n个元素,x n为目标时间序列中的第n个元素;
N根据以下公式确定:
Figure PCTCN2020107627-appb-000050
T为所述第一时间序列的长度,W为加入窗口的窗长,α是第一预设值。
步骤4、根据周期因子S和抖动密度R,确定第一时间序列的第一类型。
在S存在的情况下,第一时间序列的第一类型为周期型;在S不存在的情况下,第一时间序列的第一类型为非周期型;在R大于第二预设值的情况下,第一时间序列的第一类型为毛刺型;在R小于或等于第二预设值的情况下,第一时间序列的第一类型为平稳型;在S存在,且R大于第二预设值的情况下,第一时间序列的第一类型为周期毛刺型;在S存在,且R小于或等于第二预设值的情况下,第一时间序列的第一类型为周期平稳型;在S不存在,且R大于第二预设值的情况下,第一时间序列的第一类型为非周期毛刺型;在S不存在,且R小于或等于第二预设值的情况下,第一时间序列的第一类型为非周期平稳型。
1030、根据第一类型对应的第一类的判定模型,对所述第一类型的第一时间序列进行流量异常检测处理,以获取第一数据,所述第一数据是所述第一时间序列的异常点。
可选地,第一类的判定模型的步骤包括步骤A至步骤D:
步骤A:确定包括N个趋势分量的第三子时间序列;
步骤B:将第二时间序列分成M个目标长度的子序列,M为正整数,第二时间序列是第三子时间序列或所述第二时间序列是根据第三子时间序列和线性分段算法PLR形成的;
步骤C:计算M个目标长度的子序列的矩阵轮廓MP值,M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;
步骤D:根据MP时间序列和N-sigma算法,检测第一时间序列的异常情况。
1040、获取第二数据,第二数据为第一时间序列的原始异常点;
1050、根据第一数据和第二数据,调整第一类的判定模型的参数,以获取第一目标判定模型。
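When the first data is inconsistent with the second data, the parameters of the first-category decision model are adjusted, where the parameters of the first-category decision model include sensitivity. If the number of data items in the first data is greater than the number of data items in the second data, the sensitivity of the first-category decision model may be lowered accordingly; if the number of data items in the first data is less than the number of data items in the second data, the sensitivity may be raised accordingly; if the two numbers are equal, the sensitivity of the first-category decision model does not need to be adjusted.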
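The adjustment rule can be sketched as a single update step. The application does not define how sensitivity is parameterized, so the numeric sensitivity, the fixed step size `step` and the function name `adjust_sensitivity` are assumptions for illustration.

```python
def adjust_sensitivity(sensitivity, detected, labeled, step=0.1):
    """One update step: compare the number of detected and labeled anomaly points.

    detected: anomaly points output by the decision model (first data).
    labeled:  original anomaly points of the same series (second data).
    """
    if len(detected) > len(labeled):
        return sensitivity - step  # too many detections: lower the sensitivity
    if len(detected) < len(labeled):
        return sensitivity + step  # too few detections: raise the sensitivity
    return sensitivity             # counts agree: leave it unchanged

new_sensitivity = adjust_sensitivity(1.0, detected=[3, 7, 9], labeled=[7])
```

Repeating this step over multiple first time series corresponds to the continuous optimization of the target decision model described in this method.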
可选地,服务器可以获取多个第一时间序列,根据多个第一时间序列来训练第一目标判定模型,即根据多个第一时间序列,不断地优化第一目标判定模型。
图11是本申请实施例提供的又一种流量异常检测模型训练方法1100的示意性流程图。图11所示的方法可以由计算机设备、服务器设备或者运算设备等运算能力较强的设备来执行。图11所示的方法包括步骤1110至1150,下面分别对这几个步骤进行详细的介绍。
1110、获取第一时间序列,第一时间序列包括N个元素,N个元素与N个时刻对应,其中,N个元素中的每个元素为所对应的时刻接收到的流量数据。
可选地,该第一时间序列中的每个元素为所对应的时刻接收的流量数据可以理解为第一时间序列中的每个元素为所对应的时刻接收的历史流量数据。
可选地,还可以获取多个第一时间序列。
1120、根据第一时间序列的原始参数模型,获取第一时间序列的第一参数集合。
可选地,第一时间序列的原始参数模型可以包括步骤a至步骤d:
步骤a:根据TSD算法,将第一时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量;确定包括N个周期分量的第一子时间序列和包括N个残余分量的第二子时间序列。
步骤b:根据第一子时间序列中的N个周期分量,确定第一时间序列的周期因子。在第一子时间序列中的N个周期分量存在的情况下,确定第一时间序列的周期因子存在,可以将周期因子确定为1;在第一子时间序列中的N个周期分量不存在的情况下,确定第一时间序列的周期因子不存在,可以将周期因子确定为0。
步骤c:根据第二子时间序列中的N个残差分量,确定第一时间序列的抖动密度。
具体地,根据以下公式确定第一时间序列的R:
Figure PCTCN2020107627-appb-000051
其中,R为抖动密度,r n可以根据以下公式确定:
Figure PCTCN2020107627-appb-000052
其中,C n为第二子时间序列中的第n个元素,x n为目标时间序列中的第n个元素;
N根据以下公式确定:
Figure PCTCN2020107627-appb-000053
所述T为所述第一时间序列的长度,所述W为加入窗口的窗长,所述α是第一预设值。
步骤d、根据周期因子S和抖动密度R,确定第一时间序列的第一参数集合。
其中,参数集合为{S=1},{S=0},{R>β},{R≤β}。
1130、根据第一参数集合对应的第一类的判定模型,对第一时间序列进行流量异常检测处理,以获取第四数据,第四数据是第一时间序列的异常点。
可选地,第一类的判定模型的步骤包括步骤A至步骤D:
步骤A:确定包括N个趋势分量的第三子时间序列;
步骤B:将第二时间序列分成M个目标长度的子序列,M为正整数,第二时间序列是第三子时间序列或所述第二时间序列是根据第三子时间序列和线性分段算法PLR形成的;
步骤C:计算M个目标长度的子序列的矩阵轮廓MP值,M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;
步骤D:根据MP时间序列和N-sigma算法,检测第一时间序列的异常情况。
1140、获取第二数据,所述第二数据为所述第一时间序列的原始异常点。
1150、根据第二数据和第四数据,调整第一类的判定模型的参数,以获取第一目标判定模型。
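When the fourth data is inconsistent with the second data, the parameters of the first-category decision model are adjusted, where the parameters of the first-category decision model include sensitivity. If the number of data items in the fourth data is greater than the number of data items in the second data, the sensitivity of the first-category decision model may be lowered accordingly; if the number of data items in the fourth data is less than the number of data items in the second data, the sensitivity may be raised accordingly; if the two numbers are equal, the sensitivity of the first-category decision model does not need to be adjusted.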
可选地,服务器可以获取多个第一时间序列,根据多个第一时间序列来训练第一目标判定模型,即根据多个第一时间序列,不断地优化第一目标判定模型。
图12是本申请实施例提供的又一种流量异常检测模型训练方法1200的示意性流程图。图12所示的方法可以由计算机设备、服务器设备或者运算设备等运算能力较强的设备来执行。图12所示的方法包括步骤1210至1250,下面分别对这几个步骤进行详细的介绍。
1210、获取第一时间序列,第一时间序列包括N个元素,N个元素与所述N个时刻对应,其中,所述N个元素中的每个元素为所对应的时刻接收到的流量数据。
可选地,该第一时间序列中的每个元素为所对应的时刻接收的流量数据可以理解为第一时间序列中的每个元素为所对应的时刻接收的历史流量数据。
可选地,还可以获取多个第一时间序列。
1220、对第一时间序列进行处理,以获取第三子时间序列,第三子时间序列是第一时间序列中的N个元素中的每个元素分解的趋势分量形成的时间序列。
可选地,根据TSD算法,将第一时间序列中的N个元素中的每个元素分解为趋势分量、周期分量和残余分量,确定包括N个趋势分量的第三子时间序列。
1230、根据第四类的判定模型,对第一时间序列进行流量异常检测处理,以获取第三数据,第三数据是所述第一时间序列的异常点。
其中,第四类的判定模型包括步骤A’至步骤D’:
步骤A’:确定包括N个趋势分量的第三子时间序列;
步骤B’:将第二时间序列分成M个目标长度的子序列,M为正整数,第二时间序列是第三子时间序列或所述第二时间序列是根据第三子时间序列和线性分段算法PLR形成的;
步骤C’:计算M个目标长度的子序列的矩阵轮廓MP值,M个目标长度的子序列的矩阵轮廓MP值组成MP时间序列;
步骤D’:根据MP时间序列和N-sigma算法,检测第一时间序列的异常情况。
1240、获取第二数据,第二数据为第一时间序列的原始异常点;
1250、根据第二数据和第三数据,调整第四类的判定模型的参数,以获取第二目标判定模型。
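When the third data is inconsistent with the second data, the parameters of the fourth-category decision model are adjusted, where the parameters of the fourth-category decision model include sensitivity. If the number of data items in the third data is greater than the number of data items in the second data, the sensitivity of the fourth-category decision model may be lowered accordingly; if the number of data items in the third data is less than the number of data items in the second data, the sensitivity may be raised accordingly; if the two numbers are equal, the sensitivity of the fourth-category decision model does not need to be adjusted.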
可选地,服务器可以获取多个第一时间序列,根据多个第一时间序列来训练第二目标判定模型,即根据多个第一时间序列,不断地优化第二目标判定模型。
图13是本申请实施例的流量模式的分类装置1300的示意性框图。图13所示的流量模式的分类装置1300包括存储器1301、处理器1302、通信接口1303以及总线1304。其中,存储器1301、处理器1302、通信接口1303通过总线1304实现彼此之间的通信连接。
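The memory 1301 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1301 may store a program. When the program stored in the memory 1301 is executed by the processor 1302, the processor 1302 and the communication interface 1303 are configured to perform the steps of the traffic pattern classification method in the embodiments of this application. Specifically, the communication interface 1303 may obtain the target time series from a memory or another device, and the processor 1302 then classifies the target time series.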
处理器1302可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的流量模式的分类装置中的单元所需执行的功能,或者执行本申请实施例的流量模式的分类方法。
处理器1302还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请实施例的流量模式的分类方法的各个步骤可以通过处理器1302中的硬件的集成逻辑电路或者软件形式的指令完成。
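The processor 1302 may also be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1301. The processor 1302 reads the information in the memory 1301 and, in combination with its hardware, completes the functions to be performed by the units included in the traffic pattern classification apparatus in the embodiments of this application, or performs the traffic pattern classification method in the method embodiments of this application.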
通信接口1303使用例如但不限于收发器一类的收发装置,来实现装置1300与其他设备或通信网络之间的通信。例如,可以通过通信接口1303获取目标时间序列。
总线1304可包括在装置1300各个部件(例如,存储器1301、处理器1302、通信接口1303)之间传送信息的通路。
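FIG. 14 is a schematic block diagram of a traffic anomaly detection apparatus 1400 according to an embodiment of this application. The traffic anomaly detection apparatus 1400 shown in FIG. 14 includes a memory 1401, a processor 1402, a communication interface 1403, and a bus 1404. The memory 1401, the processor 1402, and the communication interface 1403 are communicatively connected to one another through the bus 1404.
The memory 1401 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1401 may store a program. When the program stored in the memory 1401 is executed by the processor 1402, the processor 1402 and the communication interface 1403 are configured to perform the steps of the traffic anomaly detection method in the embodiments of this application. Specifically, the communication interface 1403 may obtain a target time series from a memory or another device, and the processor 1402 then performs anomaly detection on the target time series.
The processor 1402 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs to implement the functions to be performed by the units of the traffic anomaly detection apparatus in the embodiments of this application, or to perform the traffic anomaly detection method in the embodiments of this application.
The processor 1402 may also be an integrated circuit chip with signal processing capability. In implementation, the steps of the traffic anomaly detection method in the embodiments of this application may be completed by integrated logic circuits of hardware in the processor 1402 or by instructions in the form of software.
The processor 1402 may alternatively be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1401; the processor 1402 reads the information in the memory 1401 and, in combination with its hardware, completes the functions to be performed by the units included in the traffic anomaly detection apparatus in the embodiments of this application, or performs the traffic anomaly detection method in the method embodiments of this application.
The communication interface 1403 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the apparatus 1400 and other devices or a communication network. For example, the target time series may be obtained through the communication interface 1403.
The bus 1404 may include a path for transferring information between the components of the apparatus 1400 (for example, the memory 1401, the processor 1402, and the communication interface 1403).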
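FIG. 15 is a schematic diagram of the hardware structure of a traffic pattern classification model training apparatus 1500 according to an embodiment of this application. Similar to the apparatus 1300 described above, the traffic pattern classification model training apparatus 1500 shown in FIG. 15 includes a memory 1501, a processor 1502, a communication interface 1503, and a bus 1504. The memory 1501, the processor 1502, and the communication interface 1503 are communicatively connected to one another through the bus 1504.
The memory 1501 may store a program. When the program stored in the memory 1501 is executed by the processor 1502, the processor 1502 is configured to perform the steps of the traffic pattern classification model training method in the embodiments of this application.
The processor 1502 may be a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is configured to execute related programs to implement the traffic pattern classification model training method in the embodiments of this application.
The processor 1502 may also be an integrated circuit chip with signal processing capability. In implementation, the steps of the traffic pattern classification model training method in the embodiments of this application may be completed by integrated logic circuits of hardware in the processor 1502 or by instructions in the form of software.
It should be understood that, after the traffic pattern classification model is trained by the traffic pattern classification model training apparatus 1500 shown in FIG. 15, the trained model can be used to perform the traffic pattern classification method in the embodiments of this application.
Specifically, the apparatus shown in FIG. 15 may obtain a first time series from outside through the communication interface 1503, and the processor then trains the to-be-trained traffic pattern classification model according to the first time series.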
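FIG. 16 is a schematic diagram of the hardware structure of a traffic anomaly detection model training apparatus 1600 according to an embodiment of this application. Similar to the apparatus 1400 described above, the traffic anomaly detection model training apparatus 1600 shown in FIG. 16 includes a memory 1601, a processor 1602, a communication interface 1603, and a bus 1604. The memory 1601, the processor 1602, and the communication interface 1603 are communicatively connected to one another through the bus 1604.
The memory 1601 may store a program. When the program stored in the memory 1601 is executed by the processor 1602, the processor 1602 is configured to perform the steps of the traffic anomaly detection model training method in the embodiments of this application.
The processor 1602 may be a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is configured to execute related programs to implement the traffic anomaly detection model training method in the embodiments of this application.
The processor 1602 may also be an integrated circuit chip with signal processing capability. In implementation, the steps of the traffic anomaly detection model training method in the embodiments of this application may be completed by integrated logic circuits of hardware in the processor 1602 or by instructions in the form of software.
It should be understood that, after the traffic anomaly detection model is trained by the traffic anomaly detection model training apparatus 1600 shown in FIG. 16, the trained model can be used to perform the traffic anomaly detection method in the embodiments of this application.
Specifically, the apparatus shown in FIG. 16 may obtain a first time series from outside through the communication interface 1603, and the processor then trains the to-be-trained traffic anomaly detection model according to the first time series.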
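It should be noted that although only a memory, a processor, and a communication interface are shown for the apparatuses 1300, 1400, 1500, and 1600, a person skilled in the art should understand that, in a specific implementation, these apparatuses may further include other components necessary for normal operation. In addition, depending on specific needs, these apparatuses may further include hardware components that implement other additional functions. Furthermore, a person skilled in the art should understand that the apparatuses 1300, 1400, 1500, and 1600 may include only the components necessary to implement the embodiments of this application, and need not include all the components shown in FIG. 13, FIG. 14, FIG. 15, and FIG. 16.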
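A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.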

Claims (19)

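  1. A traffic anomaly detection method, comprising:
    obtaining a target time series, wherein the target time series comprises N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment;
    obtaining a target parameter of the target time series according to the target time series, wherein the target parameter comprises a periodic factor and/or a jitter density, the periodic factor represents a wave-like variation around the long-term trend exhibited in the target time series, and the jitter density represents the deviation between the actual value and the target value of the target time series within a target time;
    determining, from multiple types and according to the target parameter, a first type to which the target time series belongs, wherein each of the multiple types corresponds to one parameter set, and the target parameter belongs to the parameter set corresponding to the first type; and
    detecting anomalies in the target time series according to a first-class decision model corresponding to the first type, wherein each of the multiple types corresponds to one type of decision model, and the decision model is used for traffic anomaly detection.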
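  2. The method according to claim 1, wherein the obtaining a target parameter of the target time series according to the target time series comprises:
    decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component;
    determining a first sub-time series comprising the N periodic components and a second sub-time series comprising the N residual components; and
    obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series.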
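  3. The method according to claim 2, wherein the obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series comprises:
    determining, according to the first sub-time series, whether the periodic factor exists in the target time series.
  4. The method according to claim 3, wherein the determining, according to the first sub-time series, whether the periodic factor exists in the target time series comprises:
    when the N periodic components in the first sub-time series exist, determining that the periodic factor exists in the target time series; and
    when the N periodic components in the first sub-time series do not exist, determining that the periodic factor does not exist in the target time series.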
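  5. The method according to claim 2, further comprising:
    determining, according to a first mapping relationship and the first type to which the target time series belongs, a second-class decision model corresponding to the first type, wherein the first mapping relationship comprises a correspondence between the multiple types and multiple second-class decision models; and
    detecting anomalies in the target time series according to the second sub-time series and the second-class decision model corresponding to the first type, wherein the second-class decision model is an N-sigma model.
  6. The method according to claim 2, wherein the obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series comprises:
    determining the jitter density of the target time series according to the second sub-time series.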
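  7. The method according to claim 6, wherein the determining the jitter density of the target time series according to the second sub-time series comprises:
    determining the jitter density of the target time series according to the following formula:
    Figure PCTCN2020107627-appb-100001
    wherein R is the jitter density, and r_n may be determined according to the following formula:
    Figure PCTCN2020107627-appb-100002
    wherein C_n is the n-th element in the second sub-time series, and x_n is the n-th element in the target time series;
    N is determined according to the following formula:
    Figure PCTCN2020107627-appb-100003
    T is the length of the target time series, W is the window length of the added window, and α is a first preset value.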
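  8. The method according to claim 1, wherein the determining, from multiple types and according to the target parameter, a first type to which the target time series belongs comprises:
    determining, from the multiple parameter sets and according to the target parameter, a first parameter set to which the target parameter belongs; and
    determining, from the multiple types and according to a third mapping relationship and the first parameter set, the first type to which the target time series belongs, wherein the third mapping relationship comprises a correspondence between the multiple parameter sets and the multiple types.
  9. The method according to claim 2, wherein the detecting anomalies in the target time series according to the first-class decision model corresponding to the first type comprises:
    determining a third sub-time series comprising the N trend components;
    dividing a second time series into M subsequences of a target length, wherein M is a positive integer, and the second time series is either the third sub-time series or a series formed from the third sub-time series by the piecewise linear representation (PLR) algorithm;
    computing the matrix profile (MP) value of each of the M subsequences of the target length, wherein the M MP values form an MP time series; and
    detecting anomalies in the target time series according to the MP time series and the N-sigma algorithm.
  10. The method according to any one of claims 1 to 9, wherein before the detecting anomalies in the target time series according to the first-class decision model corresponding to the first type, the method further comprises:
    determining, according to a second mapping relationship and the first type to which the target time series belongs, the first-class decision model corresponding to the first type, wherein the second mapping relationship comprises a correspondence between the multiple types and multiple first-class decision models.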
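  11. A traffic pattern classification method, comprising:
    obtaining a target time series, wherein the target time series comprises N elements, the N elements correspond to N moments, and each of the N elements is the traffic data received at the corresponding moment;
    obtaining a target parameter of the target time series according to the target time series, wherein the target parameter comprises a periodic factor and/or a jitter density, the periodic factor represents a wave-like variation around the long-term trend exhibited in the target time series, and the jitter density represents the deviation between the actual value and the target value of the target time series within a target time; and
    classifying the target time series according to the target parameter.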
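  12. The method according to claim 11, wherein the obtaining a target parameter of the target time series according to the target time series comprises:
    decomposing each of the N elements in the target time series into a trend component, a periodic component, and a residual component;
    determining a first sub-time series comprising the N periodic components and a second sub-time series comprising the N residual components; and
    obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series.
  13. The method according to claim 12, wherein the obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series comprises:
    determining, according to the first sub-time series, whether the periodic factor exists in the target time series.
  14. The method according to claim 13, wherein the determining, according to the first sub-time series, whether the periodic factor exists in the target time series comprises:
    when the N periodic components in the first sub-time series exist, determining that the periodic factor exists in the target time series; and
    when the N periodic components in the first sub-time series do not exist, determining that the periodic factor does not exist in the target time series.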
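  15. The method according to claim 14, wherein the classifying the target time series according to the target parameter comprises:
    when the periodic factor exists, determining the target time series to be of a periodic type; and
    when the periodic factor does not exist, determining the target time series to be of a non-periodic type.
  16. The method according to claim 12, wherein the obtaining the target parameter of the target time series according to the first sub-time series or the second sub-time series comprises:
    determining the jitter density of the target time series according to the second sub-time series.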
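  17. The method according to claim 16, wherein the determining the jitter density of the target time series according to the residual component of the target time series comprises:
    determining the jitter density of the target time series according to the following formula:
    Figure PCTCN2020107627-appb-100004
    wherein R is the jitter density, and r_n may be determined according to the following formula:
    Figure PCTCN2020107627-appb-100005
    wherein C_n is the n-th element in the second sub-time series, and x_n is the n-th element in the target time series;
    N is determined according to the following formula:
    Figure PCTCN2020107627-appb-100006
    T is the length of the target time series, W is the window length of the added window, and α is a first preset value.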
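  18. The method according to claim 16 or 17, wherein the classifying the target time series according to the target parameter comprises:
    when the jitter density is greater than a second preset value, determining the target time series to be of a burr type; and
    when the jitter density is less than or equal to the second preset value, determining the target time series to be of a stable type.
  19. An apparatus, comprising:
    a memory, configured to store a program; and
    a processor, configured to execute the program stored in the memory, wherein when the processor executes the program stored in the memory, the processor is configured to perform the method according to any one of claims 1 to 18.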
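As an informal illustration of the classification rules in claims 15 and 18 and of the type-to-model mappings in claims 5 and 10, a minimal sketch follows. The function names, the pairing of the two labels into one type, and every model name in `MODEL_BY_TYPE` are hypothetical; the claims only state that such a correspondence exists, not what it contains.

```python
def classify(has_periodic_factor, jitter_density, second_preset_value):
    """Return (periodicity, smoothness) labels for a traffic time series."""
    periodicity = "periodic" if has_periodic_factor else "non-periodic"
    # Strictly greater than the preset value means burr type, otherwise stable.
    smoothness = "burr" if jitter_density > second_preset_value else "stable"
    return periodicity, smoothness


# Hypothetical mapping from traffic type to a decision model name.
MODEL_BY_TYPE = {
    ("periodic", "stable"): "model_a",
    ("periodic", "burr"): "model_b",
    ("non-periodic", "stable"): "model_c",
    ("non-periodic", "burr"): "model_d",
}


def select_model(traffic_type):
    """Look up the decision model that corresponds to a traffic type."""
    return MODEL_BY_TYPE[traffic_type]


traffic_type = classify(True, jitter_density=0.3, second_preset_value=0.2)
model_name = select_model(traffic_type)
```

A lookup table keeps the classification step independent of the detection step, so a new type or model can be added without touching the classifier.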
PCT/CN2020/107627 2019-08-15 2020-08-07 Traffic anomaly detection method, and model training method and apparatus WO2021027697A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20852228.4A EP4009590A4 (en) 2019-08-15 2020-08-07 METHODS FOR DETECTING TRAFFIC ANOMALIES AND MODEL TRAINING METHOD AND DEVICE
US17/669,638 US20220166681A1 (en) 2019-08-15 2022-02-11 Traffic Anomaly Detection Method, and Model Training Method and Apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910752193.9A CN110266552B (zh) 2019-08-15 2019-08-15 Traffic anomaly detection method, and model training method and apparatus
CN201910752193.9 2019-08-15

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/669,638 Continuation US20220166681A1 (en) 2019-08-15 2022-02-11 Traffic Anomaly Detection Method, and Model Training Method and Apparatus

Publications (1)

Publication Number Publication Date
WO2021027697A1 true WO2021027697A1 (zh) 2021-02-18

Family

ID=67912122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/107627 WO2021027697A1 (zh) 2019-08-15 2020-08-07 Traffic anomaly detection method, and model training method and apparatus

Country Status (4)

Country Link
US (1) US20220166681A1 (zh)
EP (1) EP4009590A4 (zh)
CN (2) CN112398677A (zh)
WO (1) WO2021027697A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420070A (zh) * 2021-06-24 2021-09-21 平安国际智慧城市科技股份有限公司 Pollution discharge monitoring data processing method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN110266552B (zh) 2020-04-21
CN110266552A (zh) 2019-09-20
EP4009590A4 (en) 2022-09-28
EP4009590A1 (en) 2022-06-08
US20220166681A1 (en) 2022-05-26
CN112398677A (zh) 2021-02-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20852228; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2020852228; Country of ref document: EP; Effective date: 20220304)