CN110381052A - DDoS attack multivariate information fusion method and device based on CNN

Info

Publication number: CN110381052A
Application number: CN201910639677.2A
Authority: CN
Inventors: 唐湘滟; 程杰仁; 黄梦醒; 蔡灿婷; 郭威; 李梦洋
Original assignee: Hainan University
Current assignee: Hainan University
Priority date: 2019-07-16
Filing date: 2019-07-16
Publication date: 2019-10-25
Anticipated expiration: 2039-07-16
Also published as: CN110381052B

Abstract

The invention discloses a CNN-based DDoS attack multivariate information fusion method and device, belonging to the field of communication technology. The method comprises: performing feature extraction on network traffic within a unit time to obtain multi-element features; performing weighted fusion of the multi-element features based on a principal component analysis model to obtain a weighted fusion feature; and constructing a classification model based on a convolutional neural network, which analyzes and extracts the weighted fusion feature to obtain a final feature. The device comprises a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the CNN-based DDoS attack multivariate information fusion method. Compared with prior-art methods for detecting DDoS attacks, the invention improves the detection rate, reduces the miss rate and the total error rate, and also reduces the running time and memory consumption of attack detection.

Description

DDoS attack multivariate information fusion method and device based on CNN

Technical field

The present invention relates to the field of communication technology, and in particular to a CNN-based DDoS attack multivariate information fusion method and device.

Background technique

A convolutional neural network (CNN) is a multilayer supervised-learning neural network model that performs convolution computations and has a deep structure. It is a multilayer neural network model designed specifically for processing two-dimensional data.

A distributed denial-of-service (DDoS) attack is one in which an attacker uses multiple compromised (puppet) computers to launch denial-of-service attacks against one or more target servers, so that the servers cannot process the requests of legitimate users. DDoS attacks can cause considerable damage to a network.

Today, in the era of big data, data is everywhere: massive, diverse, fast-moving and variable. Information fusion is a multi-level, multi-aspect and multi-dimensional processing of multi-source heterogeneous data, aimed at obtaining more complete, accurate and timely results. In recent years the scope of DDoS attacks has widened and the fields affected have multiplied. Attack methods have evolved rapidly, so detection methods based on a single element cannot identify DDoS attacks well, and many such methods have high miss rates and false alarm rates.

While studying existing DDoS attack multivariate information fusion methods, the inventors found that the prior art has at least the following problems: a high miss rate and total error rate, a low detection rate, heavy memory consumption, and a long running time.

Therefore, the present invention provides a DDoS attack information fusion method that can improve the DDoS detection rate, reduce the miss rate and total error rate, and reduce running time and memory consumption.

Summary of the invention

The purpose of the present application is to provide a CNN-based DDoS attack multivariate information fusion method and device that solve some or all of the problems of the prior art.

To achieve the above purpose, in one aspect the present application provides a CNN-based DDoS attack multivariate information fusion method. In one embodiment, the method comprises: performing feature extraction on network traffic within a unit time to obtain multi-element features; performing weighted fusion of the multi-element features based on a principal component analysis model to obtain a weighted fusion feature; and constructing a classification model based on a convolutional neural network, which analyzes and extracts the weighted fusion feature to obtain a final feature.

Further, performing feature extraction on the network traffic within a unit time to obtain the multi-element features comprises: quantifying the network traffic to obtain the number of distinct values of each feature of the network traffic within the unit time; and converting the number of distinct values of each feature into a feature vector to obtain the multi-element features.

Further, performing weighted fusion of the multi-element features based on the principal component analysis model comprises: calculating the weights of the multi-element features based on the principal component analysis model; and performing weighted feature fusion according to the weights.

Further, calculating the weights of the multi-element features based on the principal component analysis model comprises: normalizing the multi-element features through the principal component analysis model to obtain the variances of the multi-element features; and calculating the variance contribution rates from those variances to obtain the final weight of each feature.

Further, the multi-element features are processed with the principal component analysis model, and the weights and biases of the principal components in the multi-element features are adjusted continuously.

Further, constructing the classification model based on a convolutional neural network, which analyzes and extracts the weighted fusion feature to obtain the final feature, comprises: the convolutional neural network includes one input layer, three convolutional layers, three pooling layers, two fully connected layers and one output layer; the multi-element features are fed into the convolutional neural network model through the input layer and passed to the convolutional layers; the convolutional layers extract features of the input at different levels by convolution and pass them to the pooling layers; the feature maps output by the pooling layers carry weights and biases and are connected to the fully connected layers, whose output values are passed to the output layer for classification, thereby obtaining the final feature.

Further, each convolutional layer consists of multiple feature maps of the multi-element features, and each feature map consists of multiple neurons; the convolutional layers and the pooling layers alternate; and each convolutional layer is connected to its preceding layer through local connections and weight sharing.

Further, the output values of the last fully connected layer are passed to the output layer and classified by softmax.

Further, the convolutional neural network is a one-dimensional convolutional neural network consisting of one input layer, three convolutional layers, three pooling layers, two fully connected layers and one output layer.

To achieve the above purpose, in another aspect the present application provides a CNN-based DDoS attack multivariate information fusion device. The device includes a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the CNN-based DDoS attack multivariate information fusion method according to any one of claims 1 to 5.

Thus, by providing a CNN-based DDoS attack multivariate information fusion method and device, the present invention solves some or all of the problems of the prior art. Compared with prior-art methods for detecting DDoS attacks, it improves the DDoS detection rate, reduces the miss rate and total error rate, and also reduces the running time and memory consumption of attack detection.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow chart of the CNN-based DDoS attack multivariate information fusion method provided by an embodiment of the present invention;

Fig. 2 is a comparison of the CNN-based detection rates on different test-set samples provided by an embodiment of the present invention;

Fig. 3 is a comparison of the SVM-based detection rates on different test-set samples provided by an embodiment of the present invention;

Fig. 4 is a schematic diagram of the feature values of MEFF and NWMEFF in normal traffic provided by an embodiment of the present invention;

Fig. 5 is a schematic diagram of the feature values of MEFF and NWMEFF in attack traffic provided by an embodiment of the present invention;

Fig. 6 is a comparison of the accuracy of MEFF and NWMEFF during training provided by an embodiment of the present invention;

Fig. 7 is a structural schematic diagram of the CNN-based DDoS attack multivariate information fusion device provided by an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

To solve some or all of the problems of the prior art, the present invention provides a CNN-based DDoS attack multivariate information fusion method and device.

Fig. 1 is a flow chart of the CNN-based DDoS attack multivariate information fusion method provided by an embodiment of the present invention.

S101: perform feature extraction on the network traffic within a unit time to obtain the multi-element features.

In one embodiment, the network traffic is quantified to obtain the number of distinct values of each feature within the unit time. To retain as much of the original information of the network data flow as possible, the data flow only needs to be quantified. Specifically, since packet attributes serve only to tell packets apart, the attributes of the packets in the network data flow can be processed as follows:

Assume that the network flow F in the unit time T is

$F = \langle (t_1, sip_1, dip_1, sp_1, dp_1, p_1), \ldots, (t_n, sip_n, dip_n, sp_n, dp_n, p_n) \rangle$, where $i = 1, 2, \ldots, n$ and

$t_i, sip_i, dip_i, sp_i, dp_i, p_i$ denote the time, source IP address, destination IP address, source port, destination port and packet size of the i-th packet, respectively.

Definition 1. Within the sampling time, the Source IP Address Feature (SIPAF) of network flow F is defined as follows:

SIPAF counts the number of distinct source IP addresses in network flow F per unit time, a feature that reflects the network condition well. In a DDoS attack, the attacker uses a large number of forged IP addresses to send huge volumes of useless packets to the victim host, drowning out the requests of ordinary legitimate users in order to attack the victim host and consume network resources. Under normal conditions, the number of source IP addresses in the traffic over a period of time should be small and relatively stable. When an attack is launched, the number of source IP addresses increases sharply, because the traffic contains many spoofed source IP addresses. SIPAF is therefore larger under attack than under normal conditions and can effectively distinguish normal network traffic from abnormal network traffic.

Definition 2. Within the sampling time, the Destination IP Address Feature (DIPAF) of network flow F is defined as follows:

DIPAF counts the number of distinct destination IP addresses in network flow F per unit time. Under normal conditions, the number of distinct destination IP addresses in the traffic is relatively stable. When an attack is launched, the attacker targets particular hosts, so the destination IP addresses are relatively concentrated. DIPAF is therefore smaller under attack than under normal conditions and can effectively distinguish normal network traffic from abnormal network traffic.

Definition 3. Within the sampling time, the Source Port Feature (SPF) of network flow F is defined as follows:

SPF counts the number of distinct source ports in network flow F per unit time. In a DDoS attack, the attacker controls hosts that use randomly selected source ports to send large numbers of useless packets to the victim host. Under normal conditions, the number of distinct source ports in the traffic is small and relatively stable. During an attack, the number of source ports increases.

Definition 4. Within the sampling time, the Destination Port Feature (DPF) of the network flow is defined as follows:

DPF counts the number of distinct destination ports in network flow F per unit time. To exhaust the network resources of the victim host, the attacker occupies as many network resources as possible, which prevents ordinary users from accessing them. The attacking machines occupy as many different ports of the victim host as possible. Therefore, under normal conditions the number of distinct destination ports in the traffic stays at a low level, whereas during an attack it increases sharply.

Definition 5. Within the sampling time, the Packet Number Feature (PNF) of the network flow is defined as follows:

PNF = n (5)

PNF counts the number of packets in network flow F per unit time. Analysis shows that the number of packets under normal conditions is smaller than the number during an attack.

Definition 6. Within the sampling time, the Packet Size Feature (PSF) of the network flow is defined as follows:

PSF counts the number of distinct packet sizes in network flow F per unit time. In normal traffic, video and text differ markedly in size, and even texts with identical content may differ in size across network environments. DDoS attack packets, by contrast, all have the same size. Analysis therefore shows that under normal conditions the packet sizes in the traffic are almost all different, whereas during an attack the packet sizes are identical. The value of PSF during an attack is therefore lower than its value under normal conditions.

Each of the six features defined above can reflect the current network condition on its own, but none of them suits every situation. For example, network congestion may be misjudged as a DDoS attack. The present application therefore proposes a DDoS attack multivariate information fusion method based on a convolutional neural network, which fuses the features from multiple angles and reflects the true state of the network more accurately.
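The six definitions above can be illustrated with a minimal Python sketch. The defining formulas (1) to (4) and (6) are not reproduced in this text, so the sketch assumes that each feature other than PNF is simply the count of distinct values in the unit-time window, as the prose describes. The `Packet` record, the function name `extract_features` and the sample packets are hypothetical.

```python
from collections import namedtuple

# Hypothetical packet record mirroring the tuple (t, sip, dip, sp, dp, p).
Packet = namedtuple("Packet", ["t", "sip", "dip", "sp", "dp", "size"])


def extract_features(packets):
    """Return the six per-window features (assumed to be distinct-value counts)."""
    return {
        "SIPAF": len({pk.sip for pk in packets}),   # distinct source IPs
        "DIPAF": len({pk.dip for pk in packets}),   # distinct destination IPs
        "SPF": len({pk.sp for pk in packets}),      # distinct source ports
        "DPF": len({pk.dp for pk in packets}),      # distinct destination ports
        "PNF": len(packets),                        # packet count, formula (5)
        "PSF": len({pk.size for pk in packets}),    # distinct packet sizes
    }


# Illustrative packets for one unit-time window (made-up values).
window = [
    Packet(0.01, "10.0.0.1", "192.168.1.5", 40001, 80, 60),
    Packet(0.02, "10.0.0.2", "192.168.1.5", 40002, 80, 60),
    Packet(0.03, "10.0.0.1", "192.168.1.5", 40001, 443, 1500),
]
features = extract_features(window)
```

Applying `extract_features` to each consecutive window of traffic would yield one six-element row per sample, which is the shape of the feature matrix used in the next step.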

In another embodiment, the number of distinct values of each feature is converted into a feature vector to obtain the multi-element features, which are as follows:

where X1, X2, X3, X4, X5 and X6 denote SIPAF, DIPAF, SPF, DPF, PNF and PSF respectively, and n denotes the number of samples.

S102: perform weighted fusion of the multi-element features based on the principal component analysis model.

In one embodiment, different features reflect the network condition differently in different network environments, and the features extracted at the victim end differ from those extracted at the attacker end. A feature weight calculation model is therefore proposed to measure how well each feature reflects the current network condition in a given network environment. Principal component analysis (PCA) is a multivariate statistical method for studying the correlation among multiple variables; it studies how a few principal components can reveal the internal structure among multiple variables. PCA can eliminate the interference between evaluation indicators, because it transforms the original indicator variables into mutually independent principal components. Since PCA is a multivariate analysis technique, it is suitable for processing the multi-element features here. The PCA-based feature weight calculation model proposed here determines the weight values mainly from the contribution of each feature within the multi-element features.

The matrix Z is obtained by normalizing the matrix X in formula (8):

where the mean and the standard deviation used in the normalization are those of the j-th column;

The covariance matrix R is obtained from the matrix Z:

The eigenvalues and eigenvectors of the matrix R are computed, giving the linear combinations of the six principal components:

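$$
\begin{aligned}
F_1 &= \gamma_{11}X_1 + \gamma_{21}X_2 + \cdots + \gamma_{61}X_6 \\
F_2 &= \gamma_{12}X_1 + \gamma_{22}X_2 + \cdots + \gamma_{62}X_6 \\
&\;\;\vdots \\
F_6 &= \gamma_{16}X_1 + \gamma_{26}X_2 + \cdots + \gamma_{66}X_6
\end{aligned}
\tag{12}
$$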
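The normalization and covariance steps that precede the eigen-decomposition can be sketched in plain Python as follows. The corresponding formulas are not reproduced in this text, so the sketch assumes column-wise z-score standardization with the sample standard deviation and a sample covariance matrix with divisor n - 1. The function names and the small matrix `X` are hypothetical (three columns are used for brevity; the method uses six).

```python
from statistics import mean, stdev


def standardize(X):
    """Column-wise z-score: z_ij = (x_ij - mean_j) / std_j (assumed form)."""
    cols = list(zip(*X))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]  # sample standard deviation (assumption)
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in X]


def covariance(Z):
    """Sample covariance matrix R = Z^T Z / (n - 1) of standardized data."""
    n, p = len(Z), len(Z[0])
    return [
        [sum(Z[k][i] * Z[k][j] for k in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


# Illustrative feature matrix: 4 samples x 3 features (made-up values).
X = [[120, 3, 40], [130, 4, 42], [900, 1, 300], [950, 1, 310]]
Z = standardize(X)
R = covariance(Z)
```

Because the columns of Z have zero mean and unit sample variance, R has ones on its diagonal and is symmetric. Its eigenvalues and eigenvectors would then be obtained with any standard symmetric eigen-solver.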

The variance contribution rate of each principal component is calculated according to formula (13). When the cumulative variance contribution rate of the principal components exceeds 85%, m principal components are selected; formula (14) then gives the weight of each element, and normalization yields the final weight of each element.

where ω₁, ω₂, ω₃, ω₄, ω₅ and ω₆ denote the weights of SIPAF, DIPAF, SPF, DPF, PNF and PSF respectively.
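The component selection rule can be sketched as below. Formula (13) is not reproduced in this text, so the sketch assumes the usual definition in which the variance contribution rate of a component is its eigenvalue divided by the sum of all eigenvalues. The function name and the eigenvalues are hypothetical. The weight formula (14) is also not reproduced here and is deliberately not sketched.

```python
def select_components(eigenvalues, threshold=0.85):
    """Return (ratios, m).

    ratios: contribution rates lambda_i / sum(lambda), sorted in descending order.
    m: smallest number of leading components whose cumulative rate exceeds threshold.
    """
    lam = sorted(eigenvalues, reverse=True)
    total = sum(lam)
    ratios = [v / total for v in lam]
    cumulative = 0.0
    for m, r in enumerate(ratios, start=1):
        cumulative += r
        if cumulative > threshold:
            return ratios, m
    return ratios, len(ratios)


# Illustrative eigenvalues of a 6 x 6 covariance matrix (made-up; they sum to 6).
ratios, m = select_components([3.6, 1.2, 0.6, 0.3, 0.2, 0.1])
```

With these made-up eigenvalues the cumulative rates are 0.6, 0.8 and 0.9, so three components are kept under the 85% rule.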

In another embodiment, current network environments are increasingly complex, and a single feature can only represent one aspect of the network condition. Given the high traffic volume and variability of DDoS attacks, a single-element feature cannot identify DDoS attacks accurately. The present application therefore proposes a multi-element feature information fusion method that considers information from multiple angles. The six features extracted above form the multi-element features; taking the information of all of them into account reflects the current network condition more accurately.

The present application defines a Multi-element Fusion Feature (MEFF), which is calculated from the six features SIPAF, DIPAF, SPF, DPF, PNF and PSF:

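$$
\mathrm{MEFF} = \omega_1 \lg(\mathrm{SIPAF}) + \omega_2 \lg(\mathrm{DIPAF}) + \omega_3 \lg(\mathrm{SPF}) + \omega_4 \lg(\mathrm{DPF}) + \omega_5 \lg(\mathrm{PNF}) + \omega_6 \lg(\mathrm{PSF}) \tag{15}
$$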

where ω₁, ω₂, ω₃, ω₄, ω₅ and ω₆ denote the weights of the six features calculated by principal component analysis. The present application takes the logarithm of SIPAF, DIPAF, SPF, DPF, PNF and PSF because, without it, the gradient direction drifts during training, training takes too long and the results are poor. After the logarithm is taken, the feature values are more concentrated, which improves accuracy and convergence speed.
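Formula (15) can be evaluated with a few lines of Python. The sketch assumes that lg denotes the base-10 logarithm and that every feature value is at least 1. The weights are the ones reported later in the experiment section of this document; the function name `meff` and the feature values are hypothetical.

```python
import math

# Weights reported in the experiment section, in the order
# SIPAF, DIPAF, SPF, DPF, PNF, PSF.
WEIGHTS = [0.186, 0.122, 0.185, 0.19, 0.186, 0.131]


def meff(features, weights=WEIGHTS):
    """Weighted sum of base-10 logarithms of the six features, as in formula (15)."""
    return sum(w * math.log10(x) for w, x in zip(weights, features))


# Illustrative window (made-up feature values, all powers of ten for easy checking).
value = meff([100, 10, 100, 10, 1000, 10])
```

For this made-up window the logarithms are 2, 1, 2, 1, 3 and 1, so the fused value is 0.372 + 0.122 + 0.370 + 0.190 + 0.558 + 0.131 = 1.743. The logarithm compresses counts that may span several orders of magnitude into a narrow range, which matches the motivation given above.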

S103: construct a classification model based on a convolutional neural network, which analyzes and extracts the weighted fusion feature to obtain the final feature.

In this embodiment, to verify the correctness of the proposed information fusion method, a classification model based on a convolutional neural network is constructed. A convolutional neural network is a typical feed-forward artificial neural network that builds multiple filters to extract features from the input data. As the number of layers increases, features are analyzed and extracted repeatedly to obtain the final feature. A CNN has two characteristics: local connections and weight sharing. Each convolutional layer is connected to its preceding layer through local connections and weight sharing, which greatly reduces the number of parameters, lowers network complexity, makes the network more robust and effectively prevents overfitting.

The basic structure of a convolutional neural network comprises an input layer, convolutional layers, pooling layers, fully connected layers and an output layer. In general, convolutional layers and pooling layers alternate. Finally, the features of the pooling layers are concatenated into a feature vector, which passes through the fully connected layers to yield the class vector.

Convolutional layer. A convolutional layer consists of multiple feature maps, and each feature map consists of multiple neurons. Each neuron is connected to the feature maps of the previous layer through a convolution kernel. The convolutional layer extracts features of the input at different levels by convolution; its form is as follows:

where l denotes the current layer, b the bias of the current layer, k the convolution kernel, and M_j the convolution window of the kernel. The activation function is usually sigmoid, tanh or ReLU; the present application uses the ReLU activation function, defined as follows:

$$
f(x) = \max(0, x) \tag{17}
$$

When x > 0 the gradient is always 1, so there is no vanishing-gradient problem and convergence is fast. When x < 0 the output is 0. The more neurons whose gradient is 0 after training, the sparser the network becomes; the extracted features are then more representative and the generalization ability is stronger.
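The behaviour described for formula (17) can be checked with a short sketch. The function names are hypothetical, and the gradient at exactly x = 0 is set to 0 here by convention, which the text does not specify.

```python
def relu(x):
    """ReLU activation, f(x) = max(0, x), as in formula (17)."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for x > 0, 0 for x < 0 (0 at x == 0 by convention)."""
    return 1.0 if x > 0 else 0.0


inputs = (-2.0, -0.5, 0.0, 1.5, 3.0)
activations = [relu(v) for v in inputs]
gradients = [relu_grad(v) for v in inputs]
```

Negative inputs produce zero output and zero gradient, which is the source of the sparsity mentioned above, while positive inputs pass through with gradient 1.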

Pooling layer. A pooling layer consists of multiple feature maps that follow a convolutional layer. Each feature map of the pooling layer corresponds to exactly one feature map of the preceding layer, so the number of feature maps does not change. The convolutional layer is the input of the pooling layer, whose form is as follows:

where down(x_j) denotes down-sampling of the j-th neuron. Each output feature map has a weight β and a bias b.
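A one-dimensional convolution followed by pooling can be sketched in plain Python as follows. Formulas (16) and (18) are not reproduced in this text, so the details are assumptions: the convolution is a "valid" sliding dot product followed by ReLU, and down(·) is taken to be max pooling over non-overlapping windows, scaled by β and shifted by b. The function names, kernel and input sequence are hypothetical.

```python
def relu(x):
    return max(0.0, x)


def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D sliding dot product plus bias, followed by ReLU."""
    k = len(kernel)
    return [
        relu(sum(signal[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(len(signal) - k + 1)
    ]


def pool1d(feature_map, size=2, beta=1.0, bias=0.0):
    """Down-sample by max over non-overlapping windows, then scale and shift."""
    return [
        beta * max(feature_map[i:i + size]) + bias
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Illustrative input sequence (made-up values).
sequence = [1.0, 1.2, 0.9, 2.5, 2.7, 2.6, 1.1, 1.0]
fmap = conv1d(sequence, kernel=[-1.0, 1.0])  # this kernel responds to increases
pooled = pool1d(fmap, size=2)
```

The same kernel is applied at every position, which is the weight sharing described earlier, and each output depends only on a short window of the input, which is the local connection. Stacking three such convolution and pooling pairs gives the feature extractor part of the network described in this embodiment.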

Fully connected layer. After several convolutional and pooling layers, one or more fully connected layers follow. Each neuron in a fully connected layer is connected to all neurons of the preceding layer, and its activation function is generally ReLU. The output values of the last fully connected layer are passed to the output layer, where they can be classified by softmax.
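The softmax classification at the output layer can be sketched as follows. The two-class ordering (normal, attack), the function name and the logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative outputs of the last fully connected layer: [normal, attack].
probs = softmax([0.3, 2.1])
label = "attack" if probs[1] > probs[0] else "normal"
```

Softmax turns the raw output values into probabilities that sum to 1, and the class with the larger probability is reported.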

The convolutional neural network described in this application includes one input layer, three convolutional layers, three pooling layers, two fully connected layers and one output layer. This embodiment can effectively improve the DDoS detection rate and reduce the miss rate and total error rate.

To verify the method provided by the present invention, this embodiment also runs experiments on the "CAIDA DDoS Attack 2007" data set, as follows:

Normal data samples and attack data samples are obtained from the CAIDA DDoS Attack 2007 data set. First, the six features SIPAF, DIPAF, SPF, DPF, PNF and PSF are extracted according to the multi-element feature extraction rules proposed in this application. Then the feature weights of formula (14) are obtained with the weight calculation method: ω₁, ω₂, ω₃, ω₄, ω₅ and ω₆ are 0.186, 0.122, 0.185, 0.19, 0.186 and 0.131 respectively. Finally, MEFF is obtained with the multi-element feature fusion formula.

To verify the effectiveness and generalization of the proposed multi-element feature information fusion method, comparative experiments were conducted. The specific steps and results are as follows:

1. Performance comparison of MEFF and the other features based on CNN

In this experiment the number of training samples is fixed, and five different test sample sets, each containing normal flows and attack flows, are selected at random from the test set. The five test sets contain 500, 1000, 2000, 5000 and 10000 samples respectively. Based on the CNN model, the experiment compares the detection rate, miss rate and error rate of the MEFF feature with those of the other six features at the different sample sizes.
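The three measures compared in this experiment can be computed as in the sketch below. Their formulas are not given in this text, so the definitions are assumptions: the detection rate is the fraction of attack samples that are detected, the miss rate is the fraction of attack samples that are not detected, and the error rate is the fraction of all samples that are misclassified. The function name and the labels are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Labels: 1 = attack, 0 = normal.

    Returns (detection rate, miss rate, error rate) under the assumed definitions.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), fn / (tp + fn), (fn + fp) / len(pairs)


# Illustrative balanced test set: 10 attack windows and 10 normal windows.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] + [0] * 10  # one attack window is missed
dr, mr, er = detection_metrics(y_true, y_pred)
```

Under these assumed definitions the detection rate and the miss rate always sum to 1, and on a balanced test set with no false alarms the error rate is half the miss rate.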

As shown in Fig. 2, MEFF, SIPAF, DIPAF, SPF, DPF, PNF and PSF can all detect DDoS attacks reasonably well. With 500 test samples, every feature has a detection rate of 70% except PSF, at 69.6%. With 1000 samples, the detection rates of the features clearly diverge. The two features with the highest detection rates are MEFF and SIPAF, at 89.8% and 90% respectively, a gap of only 0.2%, so their detection performance differs little. The detection rates of DIPAF, SPF, DPF, PNF and PSF are 84.2%, 82.2%, 88.8%, 85% and 82% respectively, all lower than that of MEFF, which gives better detection results. With 2000 test samples, PSF has the lowest detection rate, only 87.5%, while MEFF and SIPAF reach 92% and 92.1%, a difference of only 0.1%. With 5000 test samples, the detection rates of MEFF and SIPAF remain high and the gap between them is only 0.04%, whereas the detection rates of DPF and PNF grow slowly compared with the 2000-sample case. With 10000 samples, the detection rates of the features level off.

The experimental results show that the gap between the detection rates of MEFF and SIPAF narrows as the sample size grows, which indicates that the proposed fusion feature MEFF can effectively identify DDoS attacks. Across the sample sizes, the detection rate of MEFF is generally higher than that of the other features such as DIPAF, SPF, DPF, PNF and PSF. Because MEFF combines the information of multiple elements (source IP address, destination IP address, source port, destination port, packet size and packet count), it achieves a higher detection rate than features that consider only a single aspect. Fig. 2 also shows that the detection rate of every feature increases with the number of samples in the test set: under the CNN model, it rises rapidly at first and slowly later.

Table 1

Table 1 shows the miss rates and error rates of MEFF, SIPAF, DIPAF, SPF, DPF, PNF and PSF on the different test sets. With 500 samples, the miss rates and error rates of all features are essentially the same, which shows that with small samples the features perform similarly on these measures. As the sample size grows, however, the miss rates and error rates of the features diverge markedly. With 1000 samples, MEFF and SIPAF maintain low miss rates and error rates, even 20% lower than with 500 samples. By contrast, the other features have higher miss rates and error rates, especially PSF, whose miss rate is 18%. With 2000 samples, SIPAF has the lowest miss rate among the features, only 7.9%, while that of MEFF is 8%, so there is little difference between them. At the same sample size, DIPAF and SPF have identical miss rates and error rates, 11.1% and 5.55% respectively. With 5000 samples, MEFF has a miss rate of 4.04% and an error rate of 2.02%, while SIPAF has a miss rate of 4% and an error rate of 2%; the other features have higher miss rates and error rates. With 10000 samples, MEFF and SIPAF still differ little in miss rate and error rate, but MEFF performs far better than the other five features: its miss rate is 3.19% lower than that of DIPAF, 4.41% lower than SPF, 1.14% lower than DPF, 2.02% lower than PNF and 4.68% lower than PSF. As the sample size grows, the miss rate and error rate of MEFF keep improving, while those of PSF get worse. The experimental results show that, across sample sizes, the miss rate and error rate of the proposed MEFF feature are better than those of most features, because MEFF considers many kinds of information rather than the information of a single element.

2. Comparison of the running time and memory usage of MEFF and the other features

In this experiment, a training set and a test set of fixed size are used, and the information considered includes source IP address, destination IP address, source port, destination port, packet size and packet count. When detecting whether a DDoS attack has occurred, the experiment compares MEFF with the other six features in terms of running time and memory usage.

Table 2

Table 2 shows that, with the sizes of the training set and test set unchanged, MEFF and the other six features differ considerably in running time and memory usage. In terms of running time, MEFF needs 23.54 seconds, while the other six features together need 146.27 seconds. In terms of memory, MEFF uses 33.84 MB, while the other six features together use 225.74 MB. The experimental results show that the total running time and total memory usage of MEFF are well below the sums for the other six features.

3. Performance comparison of MEFF and the other features based on SVM

To verify that the proposed information fusion method suits not only the CNN-based detection model but also other models, a comparative experiment is carried out with a support vector machine (SVM) model.

SVM is a supervised learning model in machine learning, commonly used for pattern recognition, classification and regression analysis. When handling small samples, SVM can achieve better performance than other models and generalizes well. The C-SVM model with a radial basis function kernel is used here. In this experiment, the parameters are set on the MATLAB platform to c = 1 and g = 0.1.
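The radial basis function kernel can be illustrated with a short sketch. It assumes that g is the coefficient in K(x, y) = exp(-g‖x - y‖²), the convention used by common SVM toolkits; the text does not state which toolkit was used. The function name and the one-dimensional sample points are hypothetical.

```python
import math


def rbf_kernel(x, y, g=0.1):
    """Radial basis function kernel K(x, y) = exp(-g * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-g * sq_dist)


# Illustrative one-dimensional points (made-up values).
same = rbf_kernel([1.7], [1.7])
near = rbf_kernel([1.7], [1.9])
far = rbf_kernel([1.7], [4.0])
```

The kernel value equals 1 for identical points and decays towards 0 as the points move apart; a larger g makes the decay faster, and c controls the penalty for misclassified training samples in C-SVM.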

As shown in Figure 4, under the SVM model, MEFF, SIPAF, DIPAF, SPF, DPF, PNF, and PSF still have high detection rates. When the test set contains 500 samples, the detection rate of each feature is 80%. When the sample size is 1000, the detection rates of the MEFF and SIPAF features are about 90%, while those of the PNF, SPF, and DIPAF features are below 85%, showing that the MEFF and SIPAF features detect attacks markedly better. When the test set contains 2000 samples, the detection rates of PNF, SPF, and DIPAF are clearly higher than those of the other features, with the detection rate improving by 6%. When the sample size exceeds 2000, the detection rate of each feature increases slowly and becomes stable. Nevertheless, it can be clearly seen that MEFF and SIPAF have the same detection rate and maintain a high detection rate. In particular, when the sample size is large, for example 10000, the detection rates of MEFF, SIPAF, and PSF exceed 95%. Comparing the PSF feature under the CNN-based detection model and under the SVM detection model, we find that the PSF feature is better suited to the SVM model: under the CNN model its detection rate is the lowest of all features, whereas under the SVM model its detection rate is at a relatively high level. The MEFF feature proposed here maintains a high detection rate under both the CNN model and the SVM model. It can be seen that the MEFF feature proposed in this application effectively fuses the information of multiple elements and can detect DDoS attacks more accurately.

Table 3

As can be seen from Table 3, the missing report rate and error rate of each feature trend downward as the number of test set samples increases. When the sample size is 500, the missing report rate of each feature is 20% and the error rate is 10%. This shows that, with small samples, the missing report rate and error rate of each feature are relatively high. When the test set contains 1000 samples, the SPF feature has the highest missing report rate and error rate of all features, at 18.2% and 9.1% respectively. When the sample size is 2000, MEFF, SIPAF, and DPF keep relatively low missing report rates and error rates, at about 8%. When the sample size is 10000, the error rate of the MEFF feature is 1.25% and that of the SIPAF feature is 1.11%, while the highest error rate among these features is that of the DIPAF feature, at 4.15%. As the sample size increases, the missing report rate and error rate of each feature become lower and lower, which shows that these features can detect DDoS attacks better. However, the missing report rate and error rate of the MEFF feature are generally lower than those of the other features, which means that the MEFF feature can effectively fuse the information of multiple elements.
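The three evaluation metrics are related by simple arithmetic. The sketch below uses commonly used definitions, which are an assumption here because this section does not restate them: the detection rate is the share of attack samples that are detected, the missing report rate (rate of failing to report) is the share of attack samples that are missed, and the error rate is the share of all samples that are misclassified. The 250/250 split and the absence of false alarms are hypothetical values chosen only because they reproduce the 80%, 20%, and 10% reported for 500 samples.

```python
def detection_metrics(attacks, normals, missed_attacks, false_alarms):
    """Return (detection rate, missing report rate, error rate) as fractions."""
    detection_rate = (attacks - missed_attacks) / attacks
    missing_report_rate = missed_attacks / attacks
    error_rate = (missed_attacks + false_alarms) / (attacks + normals)
    return detection_rate, missing_report_rate, error_rate

# Assumed split of a 500-sample test set: 250 attack samples, 250 normal samples,
# 50 attacks missed, no normal sample flagged as an attack.
dr, mr, er = detection_metrics(250, 250, 50, 0)
print(dr, mr, er)
```

Under these definitions the detection rate and the missing report rate always sum to 1, and with a balanced test set and no false alarms the error rate is half the missing report rate, which matches the pattern in several of the reported pairs (20% and 10%, 18.2% and 9.1%, 4.04% and 2.02%).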

Table 4

As can be seen from Table 4, with the samples unchanged, the running time and memory usage of MEFF are significantly lower than those of the other six features. Under the SVM model, the running time of MEFF is only 9.6 seconds, while the other six features need 101.44 seconds in total. Meanwhile, the memory usage of the MEFF feature is small and far below the sum for the six individual features: MEFF uses 19.64 MB, while the other six features total 125.57 MB. The experimental results show that MEFF performs better in both running time and memory usage. MEFF takes the information of multiple elements into account while using the shortest running time and the least memory.

4. Performance comparison of MEFF and NWMEFF based on CNN

To verify the correctness of the feature weights, this application conducts a comparative experiment. Based on the CNN model, it compares the MEFF feature with the unweighted MEFF feature (NWMEFF) in terms of the accuracy of each batch during training, the detection rate, the missing report rate, and the error rate.

As shown in Figure 5, the feature values of MEFF are more concentrated than those of NWMEFF. Because the MEFF feature considers the importance of each element feature and measures each feature by its weight, the feature values of MEFF are relatively stable and do not fluctuate much. In contrast, the feature values of NWMEFF fluctuate widely, with a maximum greater than 10 and a minimum less than 2. As can be seen from Figure 3, at the 500th, 1500th, and 6000th sampling times, the NWMEFF value is at a peak of network access and is likely to be misjudged as an attack. The MEFF value, however, is relatively stable and therefore is not misjudged.

As shown in Figure 6, when the attack begins, the feature value of NWMEFF fluctuates between 2 and 16. There are also several fluctuations in the middle and late stages of the attack. Attack flows may therefore be misjudged as normal flows. MEFF, however, fluctuates little in the early and late stages of the attack, so the likelihood of misjudgment is much lower.

Figure 6 shows a comparison of the accuracy of MEFF and NWMEFF during training. When training starts, the accuracy of NWMEFF is higher than that of the MEFF feature. From batch 20 onward, the accuracy of MEFF is higher than that of NWMEFF, at about 80%. At batch 40, the accuracy of MEFF nearly reaches 90%. As can be seen from Figure 6, the training accuracy of MEFF is basically higher than that of NWMEFF. Under both normal traffic and attack traffic, the feature values of NWMEFF fluctuate widely, which makes it impossible to express the network state at that moment accurately and makes misjudgment easy. The feature values of MEFF, by contrast, are relatively stable, so the possibility of misjudgment is small. Therefore, throughout the training process, the accuracy of MEFF is relatively higher than that of NWMEFF.

Table 5

As can be seen from Table 5, the detection rate of the MEFF feature is greater than that of the NWMEFF feature. The missing report rate of MEFF is 6.24%, while that of NWMEFF is 10.07%. The error rate of MEFF, at 3.70%, is much smaller than that of NWMEFF. The MEFF feature performs better than the NWMEFF feature because each feature is weighted differently: by increasing or decreasing the weight of each feature, MEFF can express the current state of the network more accurately.
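The difference between the weighted and unweighted fusion can be illustrated with a minimal sketch. It assumes min-max normalization and takes each feature's share of the total variance as its weight. This is a simplified reading of the variance contribution ratio named in the claims, not the patent's exact procedure (a full principal component analysis would decompose the covariance matrix instead). Treating the unweighted variant as equal weights is also an assumption, and all function names and sample values are hypothetical.

```python
from statistics import pvariance

def min_max_normalize(column):
    """Scale one feature column to [0, 1]; a constant column becomes all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

def variance_weights(columns):
    """Weight of each feature = its variance divided by the total variance."""
    variances = [pvariance(col) for col in columns]
    total = sum(variances)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)  # all constant: equal weights
    return [v / total for v in variances]

def fuse(samples, weighted=True):
    """Fuse each row of feature values into one value per sampling time."""
    columns = [min_max_normalize(list(col)) for col in zip(*samples)]
    if weighted:
        weights = variance_weights(columns)
    else:
        weights = [1.0 / len(columns)] * len(columns)  # assumed unweighted variant
    fused = [sum(w * x for w, x in zip(weights, row)) for row in zip(*columns)]
    return weights, fused

# Three sampling times, three hypothetical features; the second never changes.
samples = [[1, 10, 5], [2, 10, 7], [3, 10, 9]]
weights, fused = fuse(samples)
unweighted_weights, unweighted_fused = fuse(samples, weighted=False)
print(weights, fused)
print(unweighted_weights, unweighted_fused)
```

In this toy input the constant feature carries no information, so the variance-based weighting gives it a weight of zero, while the equal-weight variant still lets it dilute the fused value.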

The above experiments show that the MEFF feature has a high detection rate and a low missing report rate and error rate. Detecting DDoS attacks with the MEFF feature runs faster and occupies less memory. The DPF feature has a high detection rate under the SVM model but a low detection rate under the CNN model, indicating that the DPF feature is suitable only for the SVM model. The experimental results show that the detection rate of the MEFF feature is generally higher than that of the other features and that, while taking the information of the other features into account, its running time and memory usage are lower. In addition, the weights of MEFF effectively measure the importance of each feature and fuse the features effectively with high precision. In conclusion, the information fusion method proposed in this application effectively fuses multivariate feature information, with a high detection rate, a low missing report rate, and a low error rate. Moreover, the method is applicable not only to the CNN detection model but also to other models.

In this embodiment, the DDoS attack multivariate information fusion device based on CNN includes a memory and a processor. The memory is used to store a computer program which, when executed by the processor, implements the DDoS attack multivariate information fusion method based on CNN. Its implementation principle and the technical effects achieved have been discussed above and are not repeated here.

The foregoing describes merely preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims

1. A DDoS attack multivariate information fusion method based on CNN, characterized in that the method comprises:

performing feature extraction on network flows within a unit time to obtain multivariate features;

performing weighted feature fusion on the multivariate features based on a principal component model;

constructing a classification model based on a convolutional neural network, and analyzing and extracting the weighted fusion feature to obtain a final feature.

2. The DDoS attack multivariate information fusion method based on CNN according to claim 1, characterized in that performing feature extraction on network flows within a unit time to obtain multivariate features comprises:

quantifying the network flows to obtain the type of each feature of the network flows within the unit time;

converting the type of each feature into a feature vector to obtain the multivariate features.

3. The DDoS attack multivariate information fusion method based on CNN according to claim 1, characterized in that performing weighted feature fusion on the multivariate features based on a principal component model comprises:

calculating the weights of the multivariate features based on the principal component model;

performing weighted feature fusion according to the weights.

4. The DDoS attack multivariate information fusion method based on CNN according to claim 3, characterized in that calculating the weights of the multivariate features based on the principal component model comprises:

normalizing the multivariate features by means of the principal component model to obtain the variances of the multivariate features;

calculating variance contribution ratios from the variances of the multivariate features to obtain the final weight of each feature.

5. The DDoS attack multivariate information fusion method based on CNN according to claim 1, characterized in that, before constructing the classification model based on a convolutional neural network and analyzing and extracting the weighted fusion feature to obtain the final feature, the method further comprises:

processing the multivariate features with the principal component model, and continuously adjusting the weights and biases of the principal components in the multivariate features.

6. The DDoS attack multivariate information fusion method based on CNN according to claim 1, characterized in that constructing the classification model based on a convolutional neural network and analyzing and extracting the weighted fusion feature to obtain the final feature comprises:

the convolutional neural network comprising one input layer, three convolutional layers, three pooling layers, two fully connected layers, and one output layer;

inputting the multivariate features into the convolutional neural network model through the input layer, from which they enter the convolutional layer;

the convolutional layer extracting features of different levels from the input layer by convolution and inputting them to the pooling layer;

the feature maps output by the pooling layer, which have weights and biases, being connected to the fully connected layers, and the output value being transmitted to the output layer for classification, so as to obtain the final feature.

7. The DDoS attack multivariate information fusion method based on CNN according to claim 6, characterized in that the convolutional layer is composed of feature maps of a plurality of the multivariate features, and each feature map is composed of a plurality of neurons; the convolutional layers and the pooling layers alternate; and the convolutional layer is connected to its preceding layer through local connections and weight sharing.

8. The DDoS attack multivariate information fusion method based on CNN according to claim 6, characterized in that the output value of the last fully connected layer is transmitted to the output layer and classified by softmax.

9. The DDoS attack multivariate information fusion method based on CNN according to claim 6, characterized in that the convolutional neural network is a one-dimensional convolutional neural network composed of one input layer, three convolutional layers, three pooling layers, two fully connected layers, and one output layer.

10. A DDoS attack multivariate information fusion device based on CNN, characterized in that the device comprises a memory and a processor, the memory being used to store a computer program which, when executed by the processor, implements the DDoS attack multivariate information fusion method based on CNN according to any one of claims 1 to 9.
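The layer sequence recited in claims 6 to 9 (one input layer, three convolutional layers alternating with three pooling layers, two fully connected layers, and a softmax output layer, all one-dimensional) can be traced with a small forward-pass sketch. Everything beyond the layer order is an assumption made for illustration: the input length of 32, the kernel size of 3, the pooling size of 2, the ReLU activations, the hidden sizes of 8 and 4, the use of a single feature map per convolutional layer, and the random untrained weights. The function names are hypothetical.

```python
import math
import random

def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)) + bias)
            for i in range(len(signal) - k + 1)]

def max_pool1d(signal, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def forward(features, rng):
    """Input -> 3 x (conv + pool) -> 2 fully connected -> softmax output."""
    x = features
    for _ in range(3):  # three convolutional layers alternating with pooling layers
        kernel = [rng.uniform(-0.5, 0.5) for _ in range(3)]
        x = max_pool1d(conv1d(x, kernel))
    hidden = [max(0.0, h) for h in dense(x, random_matrix(rng, 8, len(x)), [0.0] * 8)]
    hidden = [max(0.0, h) for h in dense(hidden, random_matrix(rng, 4, 8), [0.0] * 4)]
    logits = dense(hidden, random_matrix(rng, 2, 4), [0.0] * 2)  # output layer
    return softmax(logits)

rng = random.Random(0)
window = [rng.random() for _ in range(32)]  # 32 fused feature values (assumed length)
probabilities = forward(window, rng)
print(probabilities)  # two class scores, e.g. normal vs. attack, summing to 1
```

With these assumed sizes the signal length shrinks from 32 to 15, then 6, then 2 across the three convolution and pooling stages before reaching the fully connected layers. A trained model would learn the kernels and weights rather than draw them at random.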