Disclosure of Invention
In view of this, the present invention provides a method for detecting anomalies of a service function chain in a network slice scenario, which solves the problem of class imbalance between normal and abnormal data in training, enhances the security of the virtual network, and effectively improves the accuracy and stability of anomaly detection.
In order to achieve the above purpose, the invention provides the following technical scheme:
a method for detecting service function chain anomalies in a network slice scenario, comprising the following steps:
S1: in a network slice scenario, a distributed anomaly detection architecture is constructed so that each of the virtual network functions (VNFs) contained in a service function chain (SFC) can independently perform anomaly detection;
S2: to mine deep-level features of the data that are easy for a network to learn, feature extraction is performed on the data in each VNF, and a sliding window is adopted to capture the relations between time series data;
S3: because the VNF data suffer from class imbalance, a generative adversarial network (GAN) is adopted to learn the features of normal data, and the GAN's ability to learn data features is improved by combining a temporal convolutional network (TCN) and an autoencoder (AE);
S4: the state of each VNF is judged by an anomaly score function, thereby completing the anomaly detection of the SFC.
Optionally, in S1, the network slice scenario includes: an infrastructure layer, a virtual network function layer, a service operation support system, the network function virtualization management and orchestration (NFV MANO), and a software-defined network (SDN) controller; the SFC is formed by connecting a specific set of VNFs with virtual links and is deployed at the virtual network function layer;
the distributed anomaly detection architecture provides an independent detection module for each VNF, so that the anomaly of each VNF is detected independently.
Optionally, in S2, a sliding-window-based feature extractor transforms the original time series data in the VNF into a feature sequence, which specifically includes:
using a first-layer sliding window to extract two derived features of the data, the norm and the Manhattan distance, capturing features within a data time window and between time windows;
using a second-layer sliding window to extract eight statistical features, the mean MEA, minimum MIN, maximum MAX, first quartile Q1, second quartile Q2, third quartile Q3, standard deviation STD, and peak-to-peak amplitude P2P, thereby mining the deep features of the data and obtaining the feature sequence of the original data.
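As a minimal NumPy sketch of this two-layer extraction: the code below assumes a Euclidean norm for the first derived feature and an absolute norm difference for the second, and the window sizes and steps are illustrative choices, not values given in the description.

```python
import numpy as np

def derived_features(x, win, step):
    """First-layer sliding window: per-window 2-norm (NR) and the
    difference between adjacent window norms (MD). x: (t, n) array of
    raw time-series data. Euclidean norm is an assumption."""
    windows = [x[s:s + win] for s in range(0, len(x) - win + 1, step)]
    nr = np.array([np.linalg.norm(w) for w in windows])
    md = np.abs(np.diff(nr, prepend=nr[0]))      # difference vs. previous window
    return np.stack([nr, md], axis=1)            # shape (c, 2)

def statistical_features(f, win, step):
    """Second-layer sliding window: 8 statistics per derived feature."""
    rows = []
    for s in range(0, len(f) - win + 1, step):
        w = f[s:s + win]                          # (win, 2)
        stats = [w.mean(0), w.min(0), w.max(0),
                 np.percentile(w, 25, axis=0), np.percentile(w, 50, axis=0),
                 np.percentile(w, 75, axis=0), w.std(0), np.ptp(w, axis=0)]
        rows.append(np.concatenate(stats))        # MEA, MIN, MAX, Q1, Q2, Q3, STD, P2P
    return np.array(rows)                         # feature sequence

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))                     # t=100 steps, n=4 attributes
feats = statistical_features(derived_features(x, win=10, step=5), win=4, step=2)
```

With these illustrative parameters, the first layer yields 19 windows of 2 derived features each, and the second layer yields 8 rows of 16 statistics (8 statistics × 2 derived features).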
Optionally, in S3, a GAN is used to learn the data features, and training uses only normal data to solve the problem of data class imbalance. A distributed GAN model is proposed: the model deploys a generator in the element manager (EM) of each VNF, the generator being composed of a three-layer codec built from a TCN and an AE, and deploys a discriminator in the virtualized network function manager (VNFM) of the NFV MANO. A distributed GAN model with multiple generators and a single discriminator is thus built to learn the normal data features of the VNFs in a single SFC, which specifically includes:
in each VNF, normal data in the VNF are input into a feature extractor to obtain a feature sequence of the VNF;
inputting the feature sequence into the generator in the EM, obtaining the latent representation, the reconstructed features, and the reconstructed latent features of the data through the three-layer codec, and sending the generated data to the discriminator for discrimination;
after the discriminator receives the data sent by the generators, the discriminator loss function is used to calculate the discriminator update gradient and the feedback errors, wherein the gradient is used to update the network parameters of the discriminator, and the feedback errors are sent to the corresponding EMs to update the network parameters of the generators;
after a generator receives the feedback error sent by the discriminator, the generator loss function and the feedback error are used to calculate the generator update gradient, completing the parameter update of the encoders and decoder in the generator;
the discriminator and the generators continue to execute interactively over multiple global iterations, so that the generator of each VNF learns to reconstruct the normal data features well.
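The multi-generator, single-discriminator interplay described above can be sketched as follows. The TCN/AE codec and the discriminator network are replaced by hypothetical linear and logistic stand-ins, so this illustrates only the message flow of one global iteration, not the actual networks of the invention.

```python
import numpy as np

rng = np.random.default_rng(1)

class Generator:
    """Stand-in for the TCN/AE three-layer codec in one VNF's EM:
    a linear encode/decode/encode chain (assumption for illustration)."""
    def __init__(self, n):
        self.We = rng.normal(scale=0.1, size=(n, 2))   # encoder weights
        self.Wd = rng.normal(scale=0.1, size=(2, n))   # decoder weights
    def forward(self, f):
        z = f @ self.We                  # latent representation
        f_hat = z @ self.Wd              # reconstructed features
        z_hat = f_hat @ self.We          # reconstructed latent representation
        return z, f_hat, z_hat

def discriminator(f, z, w):
    """Single discriminator in the VNFM: logistic score in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(np.concatenate([f, z]) @ w)))

n, k = 4, 3                               # n attributes, k VNFs in the SFC
gens = [Generator(n) for _ in range(k)]
w_d = rng.normal(scale=0.1, size=n + 2)   # discriminator parameters

# One global iteration: every generator reconstructs its normal data,
# the discriminator scores the real vs. reconstructed streams.
real = [rng.normal(size=n) for _ in range(k)]
losses = []
for g, f in zip(gens, real):
    z, f_hat, z_hat = g.forward(f)
    d_real = discriminator(f, z, w_d)
    d_fake = discriminator(f_hat, z_hat, w_d)
    losses.append(-np.log(d_real) - np.log(1.0 - d_fake))  # per-stream loss
j_dis = float(np.mean(losses))            # averaged over the k streams
```

In the full method, the gradient of this averaged loss updates the discriminator, and per-stream feedback errors return to each EM to update its generator.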
Optionally, in S4, the state of each VNF is evaluated through an anomaly score function A(X_i), jointly represented by the apparent loss L_a and the latent loss L_l:
A(X_i) = λ × L_a + (1 − λ) × L_l
wherein X_i is the time series data of the ith VNF, λ is the weight of L_a, L_a is used to measure the difference between the reconstructed features and the feature sequence, and L_l is used to measure the difference between the reconstructed latent representation and the latent representation;
when X_i is input into each distributed VNF, the generator calculates its anomaly score A(X_i); when A(X_i) is greater than the judgment threshold, the input data are judged to be abnormal, that is, an anomaly exists in the ith VNF, and the anomaly detection of the SFC is completed.
The invention has the following beneficial effects: by constructing a distributed GAN anomaly detection model, each VNF in the SFC can independently perform anomaly detection, which improves detection accuracy and stability, improves the robustness of the whole virtual network, and further enhances network security.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
Referring to fig. 1, fig. 1 is the overall flow of the distributed GAN anomaly detection method according to the present invention. The SFC comprises k VNFs, and the time series data in the SFC is X. The EM on the ith VNF, EM_i, is given time series data X_i, where X_i denotes the data of EM_i during a period of time t and contains the data of n attributes. G_i is the generator of EM_i; G_i comprises a three-layer codec composed of TCNs: a first encoder, a decoder, and a second encoder. F_i and F̂_i respectively denote the feature sequence of X_i and the reconstructed feature sequence; z_i and ẑ_i respectively denote the latent representation of F_i and the reconstructed latent representation. The error terms respectively represent the errors used to update the parameters of the three sub-networks of G_i.
In the training stage, the distributed GAN anomaly detection method uses only normal data for training. First, feature extraction is performed on the original time series in each EM using the feature extractor, calculating the derived features and statistical features of the data. The feature sequence is then input into the generator module to reconstruct the data. Finally, the feature sequence and the reconstructed sequence generated in each EM are input into the discriminator module; the discriminator discriminates between the feature sequence and the reconstructed sequence, and feeds parameters back for the optimization and adjustment of the generator modules, so that the generator module in each EM can reconstruct the normal data well.
In the test stage, only the generator module of each EM is used for anomaly detection. The test data are input into the generator module of a given EM: if the test data are normal, the generator module reconstructs them well and a low anomaly score is obtained; conversely, abnormal data produce a large deviation during reconstruction and a high anomaly score, indicating that the data in that EM are abnormal.
The sliding-window-based feature extractor in the method comprises two steps: calculating derived features and calculating statistical features.
The derived features are first calculated through the first-layer sliding window. The norm, denoted NR, is used to capture features among the data within the same time window; the norm difference, denoted MD, is used to measure the difference between two time windows. For EM_i, given time series data X_i of total length t, with window size S_w (S_w > 1) and moving step S_s, the sliding window divides the time series data into c time windows. The mth time window of EM_i contains the raw time series data of the time interval [m·S_s − 1, m·S_s + S_w − 2] (m ≥ 1), and the norm within the time window can be expressed as the norm over the data of the window, where the jth component is the time series data of the jth attribute of the window. The norm difference can be expressed as the difference between the norms of adjacent time windows. After the norms and norm differences are obtained, the sequences are combined to obtain I_i, whose mth element combines the mth norm and norm difference.
Statistical features are then calculated through the second-layer sliding window. For each derived feature, 8 statistical features are chosen to describe it, the mean (MEA), minimum (MIN), maximum (MAX), first quartile (Q1), second quartile (Q2), third quartile (Q3), standard deviation (STD), and peak-to-peak amplitude (P2P), measuring the central tendency and dispersion of the data. The feature sequence F_i can thus be obtained, where d is the number of rows of the time series after feature extraction and n is the number of attributes of the time series.
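The window bookkeeping above can be checked with a small helper. The interval [m·S_s − 1, m·S_s + S_w − 2] follows the text; the count formula c = ⌊(t − S_w)/S_s⌋ + 1 is an assumption, since the description does not state it explicitly.

```python
def window_count(t, s_w, s_s):
    """Number of windows c when a window of size s_w slides with step s_s
    over a series of length t (assumed formula; boundary handling is not
    fully specified in the description)."""
    return (t - s_w) // s_s + 1

def window_interval(m, s_w, s_s):
    """First/last raw-data indices covered by the mth window (m >= 1),
    following the interval [m*S_s - 1, m*S_s + S_w - 2] given in the text."""
    return (m * s_s - 1, m * s_s + s_w - 2)
```

For example, with t = 100, S_w = 10, S_s = 5, the first window covers indices 4 through 13, which is exactly S_w = 10 samples.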
Referring to fig. 2, fig. 2 shows a single training iteration of the distributed GAN anomaly detection method according to the present invention.
At the discriminator side, in each global iteration, the generated data streams {(F̂_1, ẑ_1), …, (F̂_k, ẑ_k)} and the true normal data streams {(F_1, z_1), …, (F_k, z_k)} are collected from the k distributed EM_i. The parameters w_d of discriminator D are then updated through the received data streams and the loss function J_dis. Finally, for each EM_i, the error terms of its encoders and decoder are calculated. The specific process is as follows:
The discriminator gradient is first calculated. For each received set of data streams (F_i, z_i) and (F̂_i, ẑ_i), discriminator D calculates the loss function J_dis:
J_dis = −log D(F_i, z_i) − log(1 − D(F̂_i, ẑ_i))
where D(·) is the probability that the data come from the true normal data set. Taking the derivative of the loss function J_dis with respect to w_d over the data streams (F_i, z_i) and (F̂_i, ẑ_i) yields the gradient Δw_di; for the k EM_i, k gradients {Δw_d1, …, Δw_dk} are obtained, and the mean Δw_d of the k gradients is used to update the parameters w_d of discriminator D.
An error term is then calculated. To update the generator modules deployed in the distributed EM_i, the parameters of the first encoder, the decoder, and the second encoder must each be updated. According to the received data streams, the MANO calculates the error terms of the three sub-networks respectively and sends them to the corresponding EM_i, thereby completing the parameter update.
The error term of the first encoder is defined element-wise over the latent representation z_i, where z_{i,j} is the jth element of z_i; the error term of the decoder is defined element-wise over the reconstructed feature sequence F̂_i, where F̂_{i,j} is its jth element; and the error term of the second encoder is defined element-wise over the reconstructed latent representation ẑ_i, where ẑ_{i,j} is its jth element. Each error term gives, for each element, the error fed back from the discriminator loss for that element.
At the generator side, in each global iteration of the distributed generator G_i, a set of data streams (F_i, z_i) and (F̂_i, ẑ_i) is generated and sent to discriminator D in the MANO. Then, using the error terms from discriminator D, the parameters of the three sub-networks are respectively updated. The specific process is as follows:
A data stream is first generated. In each distributed EM_i, the method uses a sample X_i containing only normal time series data; F_i is obtained through the feature extractor, the first encoder extracts z_i from F_i, the decoder then obtains F̂_i from z_i, and the second encoder extracts ẑ_i from F̂_i. Through the above steps, the original feature stream (F_i, z_i) and the reconstructed feature stream (F̂_i, ẑ_i) can be constructed and sent to the MANO for discrimination.
The parameters of generator G_i are then updated. In the distributed EM_i, the discrimination loss L_f, the apparent loss L_a, and the latent loss L_l are the three parts constituting the loss function J_gen of G_i.
The discrimination loss L_f measures the loss generated when discriminator D misjudges the reconstructed data as the real data. The expression of L_f is:
L_f = s(D(F̂_i, ẑ_i), a)
where s(·) is the binary cross-entropy loss function and D(·) is the probability that the data are predicted to be real data. To trick discriminator D, so that the reconstructed data generated by G_i are closer to the real data, a is set to 1.
The apparent loss L_a measures the difference between the reconstructed feature sequence and the feature sequence; by continuously reducing L_a, the reconstructed feature sequence is made closer to the feature sequence. The apparent loss L_a is expressed as the reconstruction error between F_i and F̂_i.
potential loss LlDifferences between the potential representation of the reconstructed feature data and the potential representation of the feature sequence can be measured to help learn the reconstructed feature sequence and the potential representation of the feature sequence. Potential loss LlThe expression of (a) is:
thus, generator GiLoss function J ofgenCan be expressed as:
Jgen=ωf×Lf+ωa×La+ωl×Ll
wherein, ω isf、ωa、ωlFor adjustment in the loss function JgenMiddle Lf、La、LlThe weight of (c).
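A minimal numeric sketch of J_gen follows. The concrete choices of Euclidean reconstruction errors for L_a and L_l, and binary cross-entropy against the label a = 1 for L_f, are assumptions consistent with the description, not the patent's exact formulas.

```python
import numpy as np

def generator_loss(f, f_hat, z, z_hat, d_fake, w_f=1.0, w_a=1.0, w_l=1.0):
    """J_gen = w_f*L_f + w_a*L_a + w_l*L_l. Assumes Euclidean
    reconstruction errors for L_a / L_l and binary cross-entropy against
    the 'real' label a = 1 for L_f (hypothetical concrete choices)."""
    l_f = -np.log(d_fake)                    # BCE s(d_fake, 1): fool D
    l_a = np.linalg.norm(f - f_hat)          # apparent loss
    l_l = np.linalg.norm(z - z_hat)          # latent loss
    return w_f * l_f + w_a * l_a + w_l * l_l

# Example: perfect latent reconstruction, imperfect feature reconstruction,
# discriminator score 0.5 on the reconstructed stream.
j = generator_loss(np.ones(4), np.zeros(4), np.ones(2), np.ones(2), d_fake=0.5)
```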
After the distributed EM_i receives the error terms from the MANO, it updates the parameters of the three sub-networks of its generator G_i by calculating the loss function J_gen. For each of the first encoder, the decoder, and the second encoder, the update of each of its parameters is obtained from the gradient of the loss function J_gen, combined with the corresponding error term, with respect to that parameter. After the gradients are calculated, the parameters are updated using an Adam optimizer.
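The Adam update mentioned above can be written out explicitly. This is the standard Adam rule; the hyperparameter values are the usual defaults, assumed here since the description does not give them.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a generator sub-network parameter vector.
    Standard Adam with assumed default hyperparameters."""
    m = b1 * m + (1 - b1) * grad             # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2        # second-moment estimate
    m_hat = m / (1 - b1 ** t)                # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.zeros(3)
m, v = np.zeros(3), np.zeros(3)
grad = np.array([1.0, -1.0, 0.5])            # gradient of J_gen (illustrative)
theta, m, v = adam_step(theta, grad, m, v, t=1)
```

On the first step the bias-corrected moments reduce the update to roughly lr × sign(grad), which is Adam's characteristic scale-invariant behaviour.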
Finally, the state of each VNF is judged through the anomaly score function A(X_i), jointly represented by the apparent loss L_a and the latent loss L_l:
A(X_i) = λ × L_a + (1 − λ) × L_l
wherein X_i is the time series data of the ith VNF, λ is the weight of L_a, L_a is used to measure the difference between the reconstructed features and the feature sequence, and L_l is used to measure the difference between the reconstructed latent representation and the latent representation.
When X_i is input into each distributed VNF, the generator calculates its anomaly score A(X_i); when A(X_i) is greater than the judgment threshold, the input data are judged to be abnormal, that is, the ith VNF is abnormal, and the anomaly detection of the SFC is completed.
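A minimal sketch of this scoring and thresholding step; the Euclidean distances used for L_a and L_l and the threshold value are assumptions for illustration.

```python
import numpy as np

def anomaly_score(f, f_hat, z, z_hat, lam=0.5):
    """A(X_i) = lam*L_a + (1-lam)*L_l, with Euclidean distances assumed
    for the apparent loss L_a and the latent loss L_l."""
    l_a = np.linalg.norm(f - f_hat)          # apparent loss
    l_l = np.linalg.norm(z - z_hat)          # latent loss
    return lam * l_a + (1 - lam) * l_l

def vnf_is_anomalous(score, threshold):
    """The ith VNF is judged abnormal when A(X_i) exceeds the threshold."""
    return score > threshold

# A well-reconstructed (normal) input yields a low score; a poorly
# reconstructed one yields a high score and is flagged.
normal = anomaly_score(np.ones(4), np.ones(4) * 0.99, np.ones(2), np.ones(2))
abnormal = anomaly_score(np.ones(4), np.zeros(4), np.ones(2), -np.ones(2))
```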
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, all of which should be covered by the claims of the present invention.