CN111046323A - Network traffic data preprocessing method based on EMD

Info

Publication number: CN111046323A
Application number: CN201911343753.1A
Authority: CN
Inventors: 尚立; 赵炜; 杨会峰; 李井泉; 徐珊; 刘芳; 董正坤; 李英敏; 郭少勇; 徐思雅; 杨杨
Original assignee: State Grid Corp of China SGCC; Beijing University of Posts and Telecommunications; Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; Beijing University of Posts and Telecommunications; Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2020-04-21

Abstract

The invention discloses a network traffic data preprocessing method based on EMD (empirical mode decomposition) and relates to the technical field of information communication. The method decomposes a network traffic sequence by EMD to obtain EMD decomposition subsequences, which reduces the complexity of the time sequence data. In this way the applicability of network traffic data preprocessing is improved, data integrity is maintained, and the data feature information is enriched.

Description

Network traffic data preprocessing method based on EMD

Technical Field

The invention relates to the technical field of information communication, in particular to a network traffic data preprocessing method based on EMD.

Background

The network flow data in the power data communication network reflects the data flow condition in the current power data communication network and can also be used as an information basis for judging the operation condition of the power data communication network. The single network flow data is preprocessed, so that richer and more reliable information can be provided for later analysis, prediction, fault diagnosis and the like of the network flow data, and the preprocessing of the network flow data has important research value and application prospect.

Network traffic data is essentially time series data. In recent years many scholars have studied the preprocessing of time series data and proposed many methods for it; current methods include multivariate time series data complementation, clustering algorithms, and the like. However, a network traffic data sequence is affected by various uncertain factors that are difficult to express, and the sequence is highly nonlinear and non-stationary. As a result, traditional preprocessing methods easily produce processed data that lacks general applicability or that has lost part of the data information.

Therefore, how to improve the applicability of network traffic data preprocessing, maintain data integrity, enrich data characteristic information, and the like is a problem to be solved by those skilled in the art.

In order to understand the state of development of the prior art, existing patents and documents were searched, compared and analyzed, and the following technical information with high relevance to the invention was identified:

Patent scheme 1: 201610818702.X preprocessing method and device for multi-source time sequence data

The invention provides a preprocessing method and device for multi-source time sequence data. The method comprises the following steps: acquiring and analyzing multi-source time sequence data, namely acquiring original data with different structures from different data sources respectively, and converting the original data with different structures into a plurality of time sequence data with a unified structure; a data cleaning step of cleaning the plurality of time series data with the unified structure; and a preprocessing step aiming at the characteristics of the time sequence data, and performing mutual verification and supplementation by using a plurality of time sequence data describing the same object according to the specific attributes of the time sequence data. The method solves the problem that multi-source time sequence data can not be thoroughly preprocessed in the prior art, so that more complete and more reliable structured time sequence data can be obtained, and subsequent data analysis and prediction are facilitated.

Patent scheme 2: 201710158447.5 network flow time sequence prediction method based on distributed clustering

The invention provides a network flow time sequence prediction method based on distributed clustering, which comprises network flow data preprocessing of a distributed clustering algorithm. The method comprises the following steps: the time slice tuple is obtained by carrying out fragmentation processing on the time slice data, distributed clustering preprocessing is carried out on the time slice tuple by using a distributed K-means clustering algorithm, normal distribution is obtained by carrying out normal fitting on a clustering result, data preparation is carried out for a distributed time sequence prediction, and the accuracy of network flow time sequence prediction is improved.

Patent scheme 3: 201810174986.2 time sequence data prediction method, device and equipment

The invention provides a time sequence data prediction method, a time sequence data prediction device and time sequence data prediction equipment, wherein the time sequence data prediction method comprises a time sequence data preprocessing method, and the time sequence data preprocessing method comprises the following steps: acquiring historical time sequence data, and performing data cleaning and data slicing on the historical time sequence data to obtain a corresponding time sequence data sequence; and carrying out stabilization operation on the time sequence data sequence, and carrying out feature reconstruction on the time sequence data sequence subjected to the stabilization operation by adopting an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence. The method is different from the prior art that the data set characteristics are acquired by a sampling method, and the effectiveness of the acquired time sequence data characteristics is ensured through the steps of data preprocessing, stabilizing operation, characteristic reconstruction and the like, so that the time sequence characteristics of the time sequence data can be learned by a deep learning model, and the prediction accuracy of the deep learning model is ensured.

The defects of the prior art are as follows:

The defects of the above patent scheme 1: the scheme acquires a plurality of time sequence data from different data sources, converts the original data into a plurality of time sequence data with the same structure, cleans the time sequence data, and finally preprocesses the time sequence data according to their specific attributes so as to complement their features. The scheme mainly relies on collecting data from different sources and using multi-source time sequence data to verify and supplement one another. In practice, however, data is often difficult to collect, and multi-source data even more so, so the universality of the scheme is not high.

The defects of the above patent scheme 2: the scheme provides a network traffic time sequence preprocessing method based on distributed clustering. Time sequence data is divided into time slices of fixed length and stored as tuples; the value at the next time point after each time slice tuple is combined with that tuple to form a pair; the pairs are then clustered in a distributed manner, with the time slice tuples clustered by a k-means clustering algorithm, which completes the data preprocessing and prepares data for subsequent prediction. Because the scheme mainly uses a distributed clustering algorithm for preprocessing and prepares data for its own subsequent normal fitting and prediction correction, the preprocessed data essentially serves only this scheme and cannot be generalized to more general network traffic data preprocessing.

The defects of the above patent scheme 3: the scheme provides a time sequence data preprocessing method for subsequent prediction. It first obtains historical time sequence data and performs data cleaning and data slicing, then performs a stabilization operation on the time sequence data sequence, and finally performs feature reconstruction on the stabilized sequence with an immune genetic feature reconstruction algorithm to obtain a corresponding feature sequence. The scheme mainly uses stabilization and the immune genetic feature reconstruction algorithm for preprocessing, so the preprocessing method is complex; and because stabilization and reconstruction operations are used, part of the data information is actually discarded during preprocessing, so the data features do not retain the complete information.

Problems with the prior art and considerations:

The technical problems to be solved by the application are how to improve the applicability of network traffic data preprocessing, maintain data integrity, enrich data feature information, and the like.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a network traffic data preprocessing method based on EMD which, by decomposing a network traffic sequence with EMD and obtaining EMD decomposition subsequences, improves the applicability of network traffic data preprocessing, maintains data integrity and enriches data feature information.

In order to solve the technical problem, the technical scheme adopted by the invention is as follows: a network traffic data preprocessing method based on EMD, in which a network traffic sequence is decomposed by EMD to obtain EMD decomposition subsequences, thereby reducing the complexity of the time sequence data.

The further technical scheme is as follows: the method specifically comprises the following steps:

S1, acquiring historical network traffic data;

S2, performing mirror continuation on the network traffic data sequence, the extended time sequence being used as the original time sequence of the EMD;

S3, initializing the original time sequence and setting i = 1;

S4, obtaining the i-th IMF;

S5, subtracting the newly obtained IMF component from the original sequence;

S6, if the number of extreme points in the remaining sequence is still more than 2, setting i = i + 1 and going to step S4; otherwise, going to step S7;

S7, the decomposition is finished, and the remaining sequence is the residual component.
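The control flow of steps S3 to S7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the names `count_extrema`, `emd_outer_loop`, `extract_imf` and `max_imfs` are hypothetical, `extract_imf` stands in for step S4, the mirror continuation of step S2 is omitted, and `max_imfs` is an added safety cap that the steps above do not mention.

```python
from typing import Callable, List, Sequence, Tuple


def count_extrema(seq: Sequence[float]) -> int:
    """Count strict interior local maxima and minima of a sequence."""
    return sum(
        1
        for k in range(1, len(seq) - 1)
        if (seq[k] - seq[k - 1]) * (seq[k + 1] - seq[k]) < 0
    )


def emd_outer_loop(
    x: Sequence[float],
    extract_imf: Callable[[List[float]], List[float]],
    max_imfs: int = 50,  # hypothetical safety cap, not part of the described steps
) -> Tuple[List[List[float]], List[float]]:
    """Run the S3-S7 loop and return (list of IMFs, residual component)."""
    residual = list(x)  # S3: r_0 = x(t), i = 1
    imfs: List[List[float]] = []
    while True:
        imf = list(extract_imf(residual))  # S4: obtain the i-th IMF
        residual = [r - c for r, c in zip(residual, imf)]  # S5: r_i = r_{i-1} - imf_i
        imfs.append(imf)
        # S6: continue only while the remaining sequence has more than 2 extrema
        if count_extrema(residual) <= 2 or len(imfs) >= max_imfs:
            break
    return imfs, residual  # S7: the remaining sequence is the residual component
```

Because each pass subtracts exactly what it stores, the stored IMFs and the returned residual add back up to the input sequence regardless of how `extract_imf` is implemented.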

The further technical scheme is as follows: the step S2 specifically includes:

S21, finding all the maximum points and minimum points of the network traffic data sequence x(t) = {x(t_1), x(t_2), …, x(t_n)}; the maximum points are denoted x_M(i), i ∈ {1,2,…,M}, with corresponding times T_M(i), i ∈ {1,2,…,M}, and the minimum points are denoted x_N(i), i ∈ {1,2,…,N}, with corresponding times T_N(i), i ∈ {1,2,…,N};

S22, extending the left end of the sequence x(t), which includes the following two cases:

(1) T_M(1) < T_N(1): the symmetry axis of the continuation is the vertical axis through T_M(1):

T_M(-i+2) = T_M(i) - 2T_M(1), x_M(-i+2) = x_M(i), wherein i > 1;

T_N(-i+1) = T_N(i) - 2T_M(1), x_N(-i+1) = x_N(i);

(2) T_N(1) < T_M(1): the symmetry axis of the continuation is the vertical axis through T_N(1):

T_M(-i+1) = T_M(i) - 2T_N(1), x_M(-i+1) = x_M(i);

T_N(-i+2) = T_N(i) - 2T_N(1), x_N(-i+2) = x_N(i), wherein i > 1;

S23, extending the right end of the sequence x(t), which includes the following two cases:

(1) T_M(M) < T_N(N): the symmetry axis of the continuation is the vertical axis through T_M(M):

T_M(M+i) = 2T_M(M) - T_M(M-i), x_M(M+i) = x_M(M-i);

T_N(N+i) = 2T_M(M) - T_N(N-i+1), x_N(N+i) = x_N(N-i+1);

(2) T_N(N) < T_M(M): the symmetry axis of the continuation is the vertical axis through T_N(N):

T_M(M+i) = 2T_N(N) - T_M(M-i+1), x_M(M+i) = x_M(M-i+1);

T_N(N+i) = 2T_N(N) - T_N(N-i), x_N(N+i) = x_N(N-i).
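The extremum search of S21 and the mirror continuation of S22 and S23 can be sketched as follows. This is an illustrative sketch only. The names `local_extrema`, `mirror_extend` and `n_ext` are hypothetical. It assumes that the symmetry axis at each end passes through the outermost extremum and that a point at time t is reflected to 2a - t about an axis at time a, which is the form used by the right-end formulas; the left-end formulas as printed carry the opposite sign, so treating them as the same reflection is an assumption.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (time, value)


def local_extrema(x: Sequence[float]) -> Tuple[List[int], List[int]]:
    """S21: indices of strict interior local maxima and minima."""
    maxima, minima = [], []
    for k in range(1, len(x) - 1):
        if x[k] > x[k - 1] and x[k] > x[k + 1]:
            maxima.append(k)
        elif x[k] < x[k - 1] and x[k] < x[k + 1]:
            minima.append(k)
    return maxima, minima


def mirror_extend(
    t: Sequence[float], x: Sequence[float], n_ext: int = 2
) -> Tuple[List[Point], List[Point]]:
    """Mirror up to n_ext extrema of each kind beyond both ends (S22, S23).

    Returns the extended maxima and minima as time-sorted (time, value) lists.
    """
    if n_ext < 1:
        raise ValueError("n_ext must be at least 1")
    max_idx, min_idx = local_extrema(x)
    if not max_idx or not min_idx:
        raise ValueError("need at least one maximum and one minimum")
    max_pts = [(t[k], x[k]) for k in max_idx]
    min_pts = [(t[k], x[k]) for k in min_idx]
    # Assumption: the axis at each end passes through the outermost extremum.
    left_axis = min(max_pts[0][0], min_pts[0][0])
    right_axis = max(max_pts[-1][0], min_pts[-1][0])

    def reflect(points: List[Point], axis: float, on_left: bool) -> List[Point]:
        # The extremum lying on the axis maps onto itself, so it is skipped.
        inner = [p for p in points if p[0] != axis]
        chosen = inner[:n_ext] if on_left else inner[-n_ext:]
        return [(2 * axis - tt, v) for tt, v in chosen]

    ext_max = reflect(max_pts, left_axis, True) + max_pts + reflect(max_pts, right_axis, False)
    ext_min = reflect(min_pts, left_axis, True) + min_pts + reflect(min_pts, right_axis, False)
    return sorted(ext_max), sorted(ext_min)
```

The extended extrema lists can then serve as the interpolation knots for the envelopes, so that the splines are constrained beyond the ends of the measured sequence.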

The further technical scheme is as follows: the step S3 specifically includes: initializing the time sequence, r_0 = x(t), i = 1.

The further technical scheme is as follows: the step S4 specifically includes:

S41, initialization: h_0(t) = r_{i-1}(t), j = 1;

S42, finding all local maximum points and local minimum points of h_{j-1}(t);

S43, performing cubic spline interpolation on all the maximum points and on all the minimum points of h_{j-1}(t) respectively to form the upper envelope and the lower envelope;

S44, calculating the average of the upper envelope and the lower envelope to form the mean envelope m_{i-1}(t);

S45, subtracting the mean envelope from the original sequence to obtain a new sequence:

h_j(t) = h_{j-1}(t) - m_{i-1}(t)

S46, judging whether h_j(t) satisfies the IMF conditions; if so, h_j(t) is the IMF, imf_i(t) = h_j(t); otherwise, setting j = j + 1 and going to step S42.
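One pass of S42 to S45 can be sketched in plain Python as follows. The sketch is illustrative only: the function names are hypothetical, a natural cubic spline (zero second derivative at the end knots) is assumed because the text does not state a boundary condition, and the mirror continuation of S2 is left out, so the envelopes are simply extrapolated from their end segments.

```python
from bisect import bisect_right
from typing import Callable, List, Sequence


def natural_cubic_spline(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Return a natural cubic spline through the knots (xs strictly increasing)."""
    n = len(xs) - 1
    if n < 1:
        raise ValueError("need at least two knots")
    h = [xs[k + 1] - xs[k] for k in range(n)]
    alpha = [0.0] * (n + 1)
    for k in range(1, n):
        alpha[k] = 3 * (ys[k + 1] - ys[k]) / h[k] - 3 * (ys[k] - ys[k - 1]) / h[k - 1]
    # Forward sweep of the tridiagonal system for the second-order coefficients.
    l, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for k in range(1, n):
        l[k] = 2 * (xs[k + 1] - xs[k - 1]) - h[k - 1] * mu[k - 1]
        mu[k] = h[k] / l[k]
        z[k] = (alpha[k] - h[k - 1] * z[k - 1]) / l[k]
    # Back substitution for the polynomial coefficients of each segment.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for k in range(n - 1, -1, -1):
        c[k] = z[k] - mu[k] * c[k + 1]
        b[k] = (ys[k + 1] - ys[k]) / h[k] - h[k] * (c[k + 1] + 2 * c[k]) / 3
        d[k] = (c[k + 1] - c[k]) / (3 * h[k])

    def evaluate(x: float) -> float:
        # Outside the knots, the first or last segment is extrapolated.
        k = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[k]
        return ys[k] + b[k] * dx + c[k] * dx ** 2 + d[k] * dx ** 3

    return evaluate


def sift_once(t: Sequence[float], h_prev: Sequence[float]) -> List[float]:
    """S42-S45: subtract the mean of the upper and lower envelopes from h_prev."""
    max_idx, min_idx = [], []
    for k in range(1, len(h_prev) - 1):  # S42: strict interior extrema
        if h_prev[k] > h_prev[k - 1] and h_prev[k] > h_prev[k + 1]:
            max_idx.append(k)
        elif h_prev[k] < h_prev[k - 1] and h_prev[k] < h_prev[k + 1]:
            min_idx.append(k)
    if len(max_idx) < 2 or len(min_idx) < 2:
        raise ValueError("too few extrema to build both envelopes")
    upper = natural_cubic_spline([t[k] for k in max_idx], [h_prev[k] for k in max_idx])  # S43
    lower = natural_cubic_spline([t[k] for k in min_idx], [h_prev[k] for k in min_idx])
    mean_env = [(upper(tt) + lower(tt)) / 2 for tt in t]  # S44
    return [v - m for v, m in zip(h_prev, mean_env)]  # S45
```

Step S46 would call `sift_once` repeatedly until the result meets the IMF conditions. For a sequence made of a straight-line trend plus an alternating oscillation, a single pass already removes the trend, because both envelopes are straight lines parallel to it.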

The further technical scheme is as follows: the step S5 specifically includes: r_i(t) = r_{i-1}(t) - imf_i(t).

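The further technical scheme is as follows: at the end of the algorithm in step S7, it can finally be verified that:

x(t) = imf_1(t) + imf_2(t) + … + imf_k(t) + r_k(t), where k is the number of IMFs obtained,

i.e. the sum of all IMF sequences and the residual component is the original sequence.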
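The identity follows directly from the initialization in S3 and the update in S5 by a telescoping sum. The derivation below is a short worked form of that argument; the symbol k for the number of IMFs obtained when the loop stops is introduced here for illustration.

```latex
\begin{align*}
r_0(t) &= x(t), \qquad r_i(t) = r_{i-1}(t) - \mathrm{imf}_i(t), \quad i = 1, \dots, k \\
\sum_{i=1}^{k} \mathrm{imf}_i(t) &= \sum_{i=1}^{k} \bigl( r_{i-1}(t) - r_i(t) \bigr) = r_0(t) - r_k(t) \\
x(t) &= \sum_{i=1}^{k} \mathrm{imf}_i(t) + r_k(t)
\end{align*}
```

The identity holds for any number of iterations, which is why the decomposition itself discards no information from the original sequence.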

The further technical scheme is as follows: the method runs on a server.

The further technical scheme is as follows: the server displays the EMD decomposition sub-sequence through a display connected thereto.

The further technical scheme is as follows: the server prints the EMD decomposition sub-sequence through the printer connected thereto.

The beneficial effects produced by adopting the above technical scheme are as follows:

In the network traffic data preprocessing method based on EMD, a network traffic sequence is decomposed by EMD to obtain EMD decomposition subsequences, which reduces the complexity of the time sequence data. As a result, the applicability of network traffic data preprocessing is improved, data integrity is maintained, and the data feature information is enriched.

See detailed description of the preferred embodiments.

Drawings

FIG. 1 is a flow chart of the present invention;

FIG. 2 is an EMD decomposition subsequence diagram in the invention.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the application, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein, and it will be apparent to those of ordinary skill in the art that the present application is not limited to the specific embodiments disclosed below.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

The relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present application unless specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

In the description of the present application, it is to be understood that the orientation or positional relationship indicated by the directional terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal" and "top, bottom", etc., are generally based on the orientation or positional relationship shown in the drawings, and are used for convenience of description and simplicity of description only, and in the case of not making a reverse description, these directional terms do not indicate and imply that the device or element being referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore, should not be considered as limiting the scope of the present application; the terms "inner and outer" refer to the inner and outer relative to the profile of the respective component itself.

Spatially relative terms, such as "above" and the like, may be used herein for ease of description to describe one device or feature's spatial relationship to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above" can include both an orientation of "above" and "below". The device may be otherwise variously oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.

It should be noted that the terms "first", "second", and the like are used to define the components, and are only used for convenience of distinguishing the corresponding components, and the terms have no special meanings unless otherwise stated, and therefore, the scope of protection of the present application is not to be construed as being limited.

As shown in fig. 1, the present invention discloses a network traffic data preprocessing method based on EMD, which decomposes a network traffic sequence by using EMD and obtains an EMD decomposition subsequence, thereby reducing the complexity of time sequence data.

The method specifically comprises the following steps:

S1, acquiring historical network traffic data.

S2, performing mirror continuation on the network traffic data sequence, the extended time sequence being used as the original time sequence of the EMD. The method specifically comprises the following steps:

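S21, finding all the maximum points and minimum points of the network traffic data sequence x(t) = {x(t_1), x(t_2), …, x(t_n)}; the maximum points are denoted x_M(i), i ∈ {1,2,…,M}, with corresponding times T_M(i), i ∈ {1,2,…,M}, and the minimum points are denoted x_N(i), i ∈ {1,2,…,N}, with corresponding times T_N(i), i ∈ {1,2,…,N}.

S22, extending the left end of the sequence x(t), which includes the following two cases:

(1) T_M(1) < T_N(1): the symmetry axis of the continuation is the vertical axis through T_M(1):

T_M(-i+2) = T_M(i) - 2T_M(1), x_M(-i+2) = x_M(i), wherein i > 1.

T_N(-i+1) = T_N(i) - 2T_M(1), x_N(-i+1) = x_N(i).

(2) T_N(1) < T_M(1): the symmetry axis of the continuation is the vertical axis through T_N(1):

T_M(-i+1) = T_M(i) - 2T_N(1), x_M(-i+1) = x_M(i).

T_N(-i+2) = T_N(i) - 2T_N(1), x_N(-i+2) = x_N(i), wherein i > 1.

S23, extending the right end of the sequence x(t), which includes the following two cases:

(1) T_M(M) < T_N(N): the symmetry axis of the continuation is the vertical axis through T_M(M):

T_M(M+i) = 2T_M(M) - T_M(M-i), x_M(M+i) = x_M(M-i).

T_N(N+i) = 2T_M(M) - T_N(N-i+1), x_N(N+i) = x_N(N-i+1).

(2) T_N(N) < T_M(M): the symmetry axis of the continuation is the vertical axis through T_N(N):

T_M(M+i) = 2T_N(N) - T_M(M-i+1), x_M(M+i) = x_M(M-i+1).

T_N(N+i) = 2T_N(N) - T_N(N-i), x_N(N+i) = x_N(N-i).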

S3, initializing the original time sequence and setting i = 1. The method specifically comprises the following steps: initializing the time sequence, r_0 = x(t), i = 1.

S4, obtaining the i-th IMF. The method specifically comprises the following steps:

S41, initialization: h_0(t) = r_{i-1}(t), j = 1.

S42, finding all local maximum points and local minimum points of h_{j-1}(t).

S43, performing cubic spline interpolation on all the maximum points and on all the minimum points of h_{j-1}(t) respectively to form the upper envelope and the lower envelope.

S44, calculating the average of the upper envelope and the lower envelope to form the mean envelope m_{i-1}(t).

S45, subtracting the mean envelope from the original sequence to obtain a new sequence:

h_j(t) = h_{j-1}(t) - m_{i-1}(t)

S46, judging whether h_j(t) satisfies the IMF conditions; if so, h_j(t) is the IMF, imf_i(t) = h_j(t); otherwise, setting j = j + 1 and going to step S42.

S5, subtracting the newly obtained IMF component from the original sequence. The method specifically comprises the following steps: r_i(t) = r_{i-1}(t) - imf_i(t).

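S6, if the number of extreme points in the remaining sequence is still more than 2, setting i = i + 1 and going to step S4; otherwise, going to step S7.

S7, the decomposition is finished, and the remaining sequence is the residual component.

Finally, it can be verified that:

x(t) = imf_1(t) + imf_2(t) + … + imf_k(t) + r_k(t), where k is the number of IMFs obtained,

i.e. the sum of all IMF sequences and the residual component is the original sequence.

The EMD algorithm and the formulas and parameters in the above steps belong to the prior art and are not described again here.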

The purpose of the invention is as follows:

Network traffic data preprocessing aims to provide reliable data for network planning and maintenance and to provide more feature information for data analysis and data prediction. Because network traffic data is essentially time sequence data, most existing preprocessing methods rely on data complementation, clustering algorithms, data feature reconstruction and the like, but they suffer from low applicability, complex steps and loss of data information. To solve these problems, this patent proposes a network traffic data preprocessing method based on EMD. By performing EMD on a single network traffic data sequence, the method reduces data complexity, enriches data information and has high applicability. In the general case, changes in network traffic data are influenced by many factors that are difficult to express, and the sequence is highly nonlinear and non-stationary. To improve the usefulness of network traffic data in applications such as data analysis and data prediction, the invention brings the EMD method from the field of signal analysis into network traffic data preprocessing. The EMD decomposes the complex, variable, nonlinear network traffic data into smoother sequences, which effectively reduces the complexity of the data sequence without losing data information, enriches the feature information of the data sequence, and reduces the difficulty of any later analysis or prediction.

The invention aims to overcome the defects of the prior art, improve the applicability of data preprocessing, reduce data complexity and enrich the feature information of the data.

The technical contribution of the invention is as follows:

First, the variables used in the network traffic data preprocessing method based on EMD are explained. The variables used are as follows:

r_0: the original time sequence;

h_j(t): the j-th subsequence;

imf_i(t): the i-th IMF sequence;

r_i(t): the residual component after the first i IMF sequences are removed from the original sequence.

The network traffic data preprocessing method based on the EMD decomposes a network traffic sequence by using the EMD, reduces the complexity of time sequence data and enriches characteristic information. The solution according to the invention is explained in detail below with reference to fig. 1, with the above-defined variables.

As shown in fig. 1, the steps are described as follows:

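S1, acquiring historical network traffic data;

S2, performing mirror continuation on the network traffic data sequence, the extended time sequence being used as the original time sequence of the EMD;

S3, initializing the original time sequence and setting i = 1;

S4, obtaining the i-th IMF;

S5, subtracting the newly obtained IMF component from the original sequence;

S6, if the number of extreme points in the remaining sequence is still more than 2, setting i = i + 1 and going to step S4; otherwise, going to step S7;

S7, the decomposition is finished, and the remaining sequence is the residual component.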

Wherein, definition 1: Intrinsic Mode Function, IMF for short. An IMF is a function that satisfies the following requirements:

(1) Over the whole sequence, the number of extreme points of an intrinsic mode function must be equal to the number of zero crossings, or the two numbers differ by at most one.

(2) At every point in time, the average of the upper envelope defined by the local maxima and the lower envelope defined by the local minima is zero.
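The two requirements of definition 1 can be checked numerically along the following lines. This is a hedged sketch: the function names and the tolerance `tol` are hypothetical, and because a sampled mean envelope is rarely exactly zero, requirement (2) is relaxed here to "small relative to the signal amplitude", which is an assumption rather than a criterion stated in the text.

```python
from typing import Sequence


def count_extrema(seq: Sequence[float]) -> int:
    """Number of strict interior local maxima and minima."""
    return sum(
        1
        for k in range(1, len(seq) - 1)
        if (seq[k] - seq[k - 1]) * (seq[k + 1] - seq[k]) < 0
    )


def count_zero_crossings(seq: Sequence[float]) -> int:
    """Number of sign changes, ignoring samples that are exactly zero."""
    signs = [v > 0 for v in seq if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def satisfies_count_condition(seq: Sequence[float]) -> bool:
    """Requirement (1): extrema and zero crossings differ by at most one."""
    return abs(count_extrema(seq) - count_zero_crossings(seq)) <= 1


def mean_envelope_is_small(
    mean_env: Sequence[float], seq: Sequence[float], tol: float = 0.05
) -> bool:
    """Relaxed requirement (2): the mean envelope stays within tol of the amplitude.

    tol is a hypothetical threshold chosen for illustration only.
    """
    scale = max(abs(v) for v in seq) or 1.0
    return max(abs(m) for m in mean_env) <= tol * scale
```

A candidate sequence that oscillates around zero passes the count check, while an oscillation riding on a positive offset fails it because it has extrema but no zero crossings.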

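Wherein, step S2 specifically includes:

S21, finding all the maximum points and minimum points of the network traffic data sequence x(t) = {x(t_1), x(t_2), …, x(t_n)}; the maximum points are denoted x_M(i), i ∈ {1,2,…,M}, with corresponding times T_M(i), i ∈ {1,2,…,M}, and the minimum points are denoted x_N(i), i ∈ {1,2,…,N}, with corresponding times T_N(i), i ∈ {1,2,…,N};

S22, extending the left end of the sequence x(t), which includes the following two cases:

(1) T_M(1) < T_N(1): the symmetry axis of the continuation is the vertical axis through T_M(1):

T_M(-i+2) = T_M(i) - 2T_M(1), x_M(-i+2) = x_M(i), wherein i > 1;

T_N(-i+1) = T_N(i) - 2T_M(1), x_N(-i+1) = x_N(i).

(2) T_N(1) < T_M(1): the symmetry axis of the continuation is the vertical axis through T_N(1):

T_M(-i+1) = T_M(i) - 2T_N(1), x_M(-i+1) = x_M(i);

T_N(-i+2) = T_N(i) - 2T_N(1), x_N(-i+2) = x_N(i), wherein i > 1.

S23, extending the right end of the sequence x(t), which includes the following two cases:

(1) T_M(M) < T_N(N): the symmetry axis of the continuation is the vertical axis through T_M(M):

T_M(M+i) = 2T_M(M) - T_M(M-i), x_M(M+i) = x_M(M-i);

T_N(N+i) = 2T_M(M) - T_N(N-i+1), x_N(N+i) = x_N(N-i+1).

(2) T_N(N) < T_M(M): the symmetry axis of the continuation is the vertical axis through T_N(N):

T_M(M+i) = 2T_N(N) - T_M(M-i+1), x_M(M+i) = x_M(M-i+1);

T_N(N+i) = 2T_N(N) - T_N(N-i), x_N(N+i) = x_N(N-i).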

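Wherein, step S3 specifically includes: initializing the time sequence, r_0 = x(t), i = 1;

Wherein, step S4 specifically includes:

S41, initialization: h_0(t) = r_{i-1}(t), j = 1;

S42, finding all local maximum points and local minimum points of h_{j-1}(t);

S43, performing cubic spline interpolation on all the maximum points and on all the minimum points of h_{j-1}(t) respectively to form the upper envelope and the lower envelope;

S44, calculating the average of the upper envelope and the lower envelope to form the mean envelope m_{i-1}(t);

S45, subtracting the mean envelope from the original sequence to obtain a new sequence:

h_j(t) = h_{j-1}(t) - m_{i-1}(t)

S46, judging whether h_j(t) satisfies the IMF conditions; if so, h_j(t) is the IMF, imf_i(t) = h_j(t); otherwise, setting j = j + 1 and going to step S42;

Wherein, step S5 specifically includes:

r_i(t) = r_{i-1}(t) - imf_i(t)

Wherein, at the end of the algorithm in step S7, it can finally be verified that:

x(t) = imf_1(t) + imf_2(t) + … + imf_k(t) + r_k(t), where k is the number of IMFs obtained.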

The key points of the invention are as follows:

Network traffic data preprocessing is widely applied in many areas of networking. A network traffic data sequence is essentially a nonlinear time sequence; it is influenced by various uncertain factors and is highly non-stationary, which makes the data difficult to express and apply and makes the planning and maintenance of future networks difficult. Network traffic data preprocessing is therefore very important. The invention provides a network traffic data preprocessing method based on EMD. Compared with prior work, the main contributions of the invention lie in the following aspects:

(1) Unlike previous methods, the network traffic data preprocessing method provided by the invention incorporates EMD decomposition from the field of signal analysis. It aims to decompose highly nonlinear and non-stationary network traffic sequences into several relatively stationary sequences, which reduces the difficulty of network traffic prediction and simplifies the expression of subsequent models.

(2) The invention retains the complete network traffic data information during data preprocessing and enriches the data features.

After the method had been run confidentially for a period of time, field technicians reported the following advantages:

By means of EMD decomposition, the invention decomposes a highly non-stationary network traffic sequence into more stationary subsequences while ensuring that no data information is lost. Complex and variable network traffic data can thus be decomposed into more stationary, more easily expressed subsequences, so that the processed data provides complete, rich and reliable information and can be widely applied to data analysis, data prediction and the like.

Example of the invention:

As shown in fig. 2, in the embodiment of the present invention, 14776 pieces of network traffic sequence data are collected as a data set, and EMD decomposition is performed on the network traffic data signal(t) to obtain 13 subsequences: imf1(t), imf2(t), …, imf12(t), and res(t).

Claims

1. A network traffic data preprocessing method based on EMD, characterized in that: a network traffic sequence is decomposed by EMD to obtain EMD decomposition subsequences, thereby reducing the complexity of the time sequence data.

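2. The method for preprocessing network traffic data based on EMD of claim 1, wherein: the method specifically comprises the following steps:

S1, acquiring historical network traffic data;

S2, performing mirror continuation on the network traffic data sequence, the extended time sequence being used as the original time sequence of the EMD;

S3, initializing the original time sequence and setting i = 1;

S4, obtaining the i-th IMF;

S5, subtracting the newly obtained IMF component from the original sequence;

S6, if the number of extreme points in the remaining sequence is still more than 2, setting i = i + 1 and going to step S4; otherwise, going to step S7;

S7, the decomposition is finished, and the remaining sequence is the residual component.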

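3. The method according to claim 2, characterized in that: the step S2 specifically includes:

S21, finding all the maximum points and minimum points of the network traffic data sequence x(t) = {x(t_1), x(t_2), …, x(t_n)}; the maximum points are denoted x_M(i), i ∈ {1,2,…,M}, with corresponding times T_M(i), i ∈ {1,2,…,M}, and the minimum points are denoted x_N(i), i ∈ {1,2,…,N}, with corresponding times T_N(i), i ∈ {1,2,…,N};

S22, extending the left end of the sequence x(t), which includes the following two cases:

(1) T_M(1) < T_N(1): the symmetry axis of the continuation is the vertical axis through T_M(1):

T_M(-i+2) = T_M(i) - 2T_M(1), x_M(-i+2) = x_M(i), wherein i > 1;

T_N(-i+1) = T_N(i) - 2T_M(1), x_N(-i+1) = x_N(i);

(2) T_N(1) < T_M(1): the symmetry axis of the continuation is the vertical axis through T_N(1):

T_M(-i+1) = T_M(i) - 2T_N(1), x_M(-i+1) = x_M(i);

T_N(-i+2) = T_N(i) - 2T_N(1), x_N(-i+2) = x_N(i), wherein i > 1;

S23, extending the right end of the sequence x(t), which includes the following two cases:

(1) T_M(M) < T_N(N): the symmetry axis of the continuation is the vertical axis through T_M(M):

T_M(M+i) = 2T_M(M) - T_M(M-i), x_M(M+i) = x_M(M-i);

T_N(N+i) = 2T_M(M) - T_N(N-i+1), x_N(N+i) = x_N(N-i+1);

(2) T_N(N) < T_M(M): the symmetry axis of the continuation is the vertical axis through T_N(N):

T_M(M+i) = 2T_N(N) - T_M(M-i+1), x_M(M+i) = x_M(M-i+1);

T_N(N+i) = 2T_N(N) - T_N(N-i), x_N(N+i) = x_N(N-i).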

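4. The method according to claim 2, characterized in that: the step S3 specifically includes: initializing the time sequence, r_0 = x(t), i = 1.

5. The method according to claim 2, characterized in that: the step S4 specifically includes:

S41, initialization: h_0(t) = r_{i-1}(t), j = 1;

S42, finding all local maximum points and local minimum points of h_{j-1}(t);

S43, performing cubic spline interpolation on all the maximum points and on all the minimum points of h_{j-1}(t) respectively to form the upper envelope and the lower envelope;

S44, calculating the average of the upper envelope and the lower envelope to form the mean envelope m_{i-1}(t);

S45, subtracting the mean envelope from the original sequence to obtain a new sequence:

h_j(t) = h_{j-1}(t) - m_{i-1}(t)

S46, judging whether h_j(t) satisfies the IMF conditions; if so, h_j(t) is the IMF, imf_i(t) = h_j(t); otherwise, setting j = j + 1 and going to step S42.

6. The method according to claim 2, characterized in that: the step S5 specifically includes: r_i(t) = r_{i-1}(t) - imf_i(t).

7. The method according to claim 2, characterized in that: at the end of the algorithm in step S7, it can finally be verified that:

x(t) = imf_1(t) + imf_2(t) + … + imf_k(t) + r_k(t), where k is the number of IMFs obtained.

8. The method for preprocessing network traffic data based on EMD according to any one of claims 1-7, characterized in that: the method runs on a server.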

9. The method according to claim 8, wherein the method comprises: the server displays the EMD decomposition sub-sequence through a display connected thereto.

10. The method according to claim 8, wherein the method comprises: the server prints the EMD decomposition sub-sequence through the printer connected thereto.