CN109889452A - Network background traffic generation method and system based on conditional generative adversarial network

Publication number
CN109889452A
CN109889452A CN201910012933.5A CN201910012933A CN109889452A CN 109889452 A CN109889452 A CN 109889452A CN 201910012933 A CN201910012933 A CN 201910012933A CN 109889452 A CN109889452 A CN 109889452A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910012933.5A
Other languages
Chinese (zh)
Other versions
CN109889452B (en)
Inventor
赵鹏
程学旗
张志斌
杨春晖
郭嘉丰
何文婷
王赛
王征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Application filed by Institute of Computing Technology of CAS
Priority to CN201910012933.5A
Publication of CN109889452A
Application granted
Publication of CN109889452B
Legal status: Active

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention relates to a network background traffic generation method based on a conditional generative adversarial network, comprising: a data acquisition step of acquiring network traffic data and condition information and vectorizing them into real traffic; a model generation step of obtaining an initial generative model and a discriminative model from the real traffic, and training the initial generative model and the discriminative model through the conditional generative adversarial network to obtain a generative model; and a traffic generation step of generating simulated background traffic from a random vector by means of the generative model.

Description

Network background traffic generation method and system based on conditional generative adversarial network
Technical field
The present invention relates to the fields of network security and network simulation, and in particular to a network background traffic generation method and system based on a conditional generative adversarial network (CGAN).
Background art
Network traffic generation technology is mostly used for network testing. Since the early days of networking, as network scale and complexity have continued to grow, network traffic generation technology has developed continuously. Network traffic generation methods can be divided into three classes: generation methods based on statistical models, generation methods based on traffic characteristics, and generation methods based on applications/sessions.
1. Generation method based on statistical models
This method is based on statistical analysis theory. It uses statistical models such as Poisson, ON/OFF, FBM/FGN, or multifractal models to characterize the distribution of network traffic and generates simulated traffic from these models. The principle is to establish a statistical model of network traffic through theoretical analysis such as probability and statistics, and then to generate simulated traffic based on that model. A tool of this kind generally consists of two parts: a statistical model and a network traffic generation tool. The statistical model is the core and embodies the central idea of traffic generation; the network traffic generation tool is responsible for generating simulated traffic based on the statistical model. The techniques used are mainly probabilistic models, including the Poisson model, the exponential distribution, the ON/OFF model, the Weibull distribution, the Pareto distribution, and the Gaussian distribution. The traffic generation process consists of setting the initialization parameters of the statistical model, configuring the model, and then generating simulated traffic based on it.
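As an illustration of the statistical-model approach described above, the following minimal sketch generates packet arrival times from a Poisson model and from a simple ON/OFF source. The function names, the parameters, and the choice of exponentially distributed ON and OFF periods are assumptions made for this example; they do not come from any particular prior-art tool.

```python
import random


def poisson_arrivals(rate_pps, duration_s, seed=0):
    """Arrival times of a Poisson process: exponential inter-arrival gaps."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate_pps)  # gap with mean 1 / rate_pps
        if t >= duration_s:
            return times
        times.append(t)


def on_off_arrivals(peak_rate_pps, mean_on_s, mean_off_s, duration_s, seed=0):
    """ON/OFF source: constant-rate packets while ON, silence while OFF.

    ON and OFF period lengths are drawn from exponential distributions
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    gap = 1.0 / peak_rate_pps
    t, times = 0.0, []
    while t < duration_s:
        on_end = min(t + rng.expovariate(1.0 / mean_on_s), duration_s)
        while t < on_end:  # emit packets during the ON period
            times.append(t)
            t += gap
        t = on_end + rng.expovariate(1.0 / mean_off_s)  # skip the OFF period
    return times
```

Setting the initialization parameters corresponds to choosing `rate_pps` or the ON/OFF means; the returned timestamps would then drive a packet sender.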
2. Generation method based on traffic characteristics
According to the granularity of the traffic characteristics, this class of methods can be further divided into two sub-classes: packet level and flow level.
Traffic generation based on packet-level characteristics: this method is usually implemented by packet replay. The packet-level characteristics involved are packet inter-arrival time, packet size, and so on. The method considers only the basic characteristics of individual packets. Its advantages are simplicity and low computational cost; its disadvantage is low fidelity, since it ignores the interactions between packets and the traffic characteristics between protocols and within a single protocol.
Traffic generation at the flow level: the packet is the basic unit of network traffic, and a flow comprises multiple packets. Flows apply to connection-oriented network protocols; one flow covers the whole process of establishing, maintaining, and closing a connection. In the TCP/IP protocol suite, a flow means a TCP flow, and simulated background traffic is generated by replaying TCP flows. The advantage of this method is that the generated traffic data set is smaller than packet-level traffic; the disadvantages are that it lacks characteristics related to user behavior, applications, and the operating system, and that it cannot generate traffic of non-TCP protocols.
Traffic generation at the session level: a session is one meaningful interaction between a server and a client. This method extracts characteristic information about network connections from real network traffic, including session duration and session inter-arrival time, and creates many unique session packets from this information. Session-level traffic simulation is achieved by replaying these newly created session packets. The advantage of this method is that it can generate simulated traffic with semantics; the disadvantage is that it is relatively complex.
The principle of generation methods based on traffic characteristics is to analyze the packet/flow/session characteristics of real network traffic, establish a traffic characteristic model, and generate simulated traffic from that model. A tool of this kind generally consists of a traffic characteristic model and a network traffic generation tool. The traffic characteristic model is the core and contains the various traffic characteristics extracted from real network traffic; the network traffic generation tool generates simulated traffic based on the model. The traffic generation process comprises the following steps: analyze the packet/flow/session-level characteristics of real network traffic and establish the traffic characteristic model; then generate packets, flows, or sessions according to the model. The techniques used include probability analysis theory such as Bayesian probability theory, packet replay tools, and protocol-specific command scripts.
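The packet-level variant of this approach can be sketched as follows: extract inter-arrival times and packet sizes from a captured trace, then synthesize a new trace by resampling them. The function names and the independent resampling of gaps and sizes are assumptions of this example; resampling them independently is exactly what ignores the interactions between packets noted above.

```python
import random


def packet_level_features(trace):
    """Extract packet-level features from a trace.

    trace: list of (timestamp_s, size_bytes) tuples sorted by timestamp.
    Returns (inter_arrival_gaps, sizes).
    """
    gaps = [later[0] - earlier[0] for earlier, later in zip(trace, trace[1:])]
    sizes = [size for _, size in trace]
    return gaps, sizes


def synthesize_trace(gaps, sizes, n_packets, seed=0):
    """Build a synthetic trace by resampling observed gaps and sizes."""
    rng = random.Random(seed)
    t, out = 0.0, []
    for _ in range(n_packets):
        out.append((t, rng.choice(sizes)))
        t += rng.choice(gaps)
    return out
```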
3. Generation method based on applications
This method performs probability analysis for a specific application and generates simulated traffic by replaying the commands of that application. Its advantage is simplicity; its disadvantage is that it can only generate traffic for the specific application and does not match the complexity of real network background traffic. The principle is to analyze the application characteristics in real network traffic and generate simulated traffic from them. Compared with packet-level, flow-level, and session-level traffic generators, this class of methods works at a higher layer and studies the traffic characteristics of specific applications. A tool of this kind generally consists of an application traffic characteristic analysis module and a network traffic generation tool. The analysis module is the core and contains the various application traffic characteristics extracted from real network traffic; the network traffic generation tool generates simulated traffic based on those characteristics. When generating traffic, such a tool first analyzes the application traffic characteristics in the real network and then generates the packets of the specific application accordingly. The techniques used include packet capture tools such as tcpdump, probability analysis theory such as Bayesian probability theory, and command execution scripts.
The above methods have the following problems:
(1) The network traffic model can hardly reflect real network traffic characteristics accurately.
Whether a network traffic statistical model is derived from probability and statistics theory, or a network traffic characteristic model is extracted by analyzing real network traffic, there are considerable problems with the fidelity of the simulated traffic. The reason is that real network traffic is extremely complex and closely tied to factors such as time, place, user, application software, operating system, and emergency events. A single statistical model or characteristic model can hardly describe real network traffic. In particular, when traffic characteristics are extracted manually, a few specific characteristics are specified from experience, for example packet/flow arrival time, packet/flow size, flow duration, and the transition probabilities between sessions. Whether these proposed characteristics fully reflect real network traffic, or whether real traffic still contains undiscovered hidden characteristics, cannot be determined conclusively. Reasoning from the nature of cognition, for complex and diverse network traffic it is very likely that more hidden characteristics remain undiscovered.
(2) Extracting real network characteristics is very cumbersome.
To reflect the influence of factors such as time, place, user, application software, and events in traffic generation, the traffic characteristics related to time, place, user, application, and event must be extracted from real network traffic. However, time, place, user, application, and event vary endlessly and are to some degree unpredictable. As a result, extracting real network traffic characteristics is not only very cumbersome, but the extracted characteristics also can hardly match real network traffic accurately.
(3) The usability of the above network traffic generation methods is very poor.
The generation methods based on statistical models, on traffic characteristics, and on applications all require the user to be deeply involved in traffic generation, and many steps need manual intervention. The network traffic modeling process in particular consumes the user's time and energy, and its effect cannot be guaranteed.
Summary of the invention
In view of the above problems, the present invention proposes a network background traffic generation method based on a conditional generative adversarial network, comprising: a data acquisition step of acquiring network traffic data and condition information and vectorizing them into real traffic; a model generation step of obtaining an initial generative model and an initial discriminative model from the real traffic, and training the initial generative model and the initial discriminative model through the conditional generative adversarial network to obtain a generative model; and a traffic generation step of generating simulated background traffic from a random vector by means of the generative model.
In the network background traffic generation method of the present invention, the data acquisition step specifically comprises: an acquisition step of obtaining the network traffic data and the condition information through a mirror port of a network access device and storing the network traffic data as a plurality of packets, each packet being m bytes long; a vectorization step of padding n zeros at the end of the data field of each packet, vectorizing it into a 1518-dimensional random noise vector, and concatenating the 1-dimensional condition information after the random noise vector to form the 1519-dimensional real traffic; and a classification step of dividing the real traffic into a training set, a validation set, and a test set, wherein the training set is used to determine the coefficients of the initial generative model and the initial discriminative model, the validation set is used to verify the loss function of the initial generative model, and the test set is used to test the effect of the generative model; wherein the packets are in pcap format or binary format, m and n are integers, 64 ≤ m ≤ 1518, n = 1518 − m, and the condition information is the time information at which the network traffic data were obtained.
In the network background traffic generation method of the present invention, the generative model and the discriminative model are long short-term memory network models; the discriminative model further comprises, after the output layer of its long short-term memory network model, a packet parsing unit and a SoftMax function of the output layer.
The present invention further relates to a network background traffic generation system based on a conditional generative adversarial network, comprising: a data acquisition module for obtaining real traffic, wherein network traffic data and condition information are acquired and vectorized into the real traffic; a model generation module for obtaining an initial generative model and an initial discriminative model from the real traffic, and training the initial generative model and the initial discriminative model through the conditional generative adversarial network to obtain a generative model; and a traffic generation module for generating simulated background traffic from a random vector by means of the generative model.
In the network background traffic generation system of the present invention, the data acquisition module specifically comprises: an acquisition module for obtaining the network traffic data and the condition information through a mirror port of a network access device and storing the network traffic data as a plurality of packets, each packet being m bytes long; a vectorization module for padding n zeros at the end of the data field of each packet, vectorizing it into a 1518-dimensional random noise vector, and concatenating the 1-dimensional condition information after the random noise vector to form the 1519-dimensional real traffic; and a classification module for dividing the real traffic into a training set, a validation set, and a test set, wherein the training set is used to determine the coefficients of the initial generative model and the initial discriminative model, the validation set is used to verify the loss function of the initial generative model, and the test set is used to test the effect of the generative model; wherein the packets are in pcap format or binary format, m and n are integers, 64 ≤ m ≤ 1518, n = 1518 − m, and the condition information is the time information at which the network traffic data were obtained.
In the network background traffic generation system of the present invention, the generative model and the discriminative model are long short-term memory network models; the discriminative model further comprises, after the output layer of its long short-term memory network model, a packet parsing unit and a SoftMax function of the output layer.
The network background traffic generation method of the present invention does not require a network traffic model to be established manually. Instead, the network traffic model is obtained by CGAN training on a large amount of training data, so that little manual intervention is needed and real network traffic characteristics are reflected faithfully. Generative models for different scenarios are obtained by training on real network traffic from those scenarios, which makes it possible to cope with the complexity and diversity of real network traffic.
The innovation of the present invention lies mainly in applying CGAN technology to network traffic generation; the CGAN technology itself is not improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of a network background traffic generation system based on a conditional generative adversarial network according to the present invention.
Fig. 2 is a flow chart of a network background traffic generation method based on a conditional generative adversarial network according to the present invention.
Fig. 3 is a schematic diagram of the frame structures of the packets of network background traffic based on a conditional generative adversarial network according to the present invention.
Fig. 4 is a schematic diagram of the LSTM network structure.
Fig. 5 is a schematic diagram of the horizontal structure of the LSTM network.
Fig. 6 is a schematic diagram of the structure of a hidden-layer neuron of the LSTM network.
Fig. 7 is a schematic diagram of the structure of the discriminative model of a network background traffic generation method based on a conditional generative adversarial network according to the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the network background traffic generation method and system based on a conditional generative adversarial network proposed by the present invention are described in further detail below with reference to the accompanying drawings. It should be understood that the specific implementations described here serve only to explain the present invention and are not intended to limit it.
Research on traffic generation shows that the defects of the prior art are caused by the complexity of real network traffic. With the rapid development of computer networks, the complexity and diversity of real network traffic keep increasing. This is mainly reflected in the following aspects:
(1) Time. Traffic characteristics differ from month to month within a year (for example, months containing holidays), from day to day within a month (for example, weekends), and from period to period within a day.
(2) Place. Traffic characteristics differ between types of organization (such as government, company, and school), between workplace and home, and between fixed sites and vehicles (such as subways and cars).
(3) User. Network users are diverse and changeable. Users of different occupational groups generate different network traffic; different users have different habits in using the network, so that traffic has different usage characteristics and sequences; the same user generates different traffic at different times, in different places, and in different contexts; and the way the same user uses the network is to some degree unpredictable.
(4) Application software. Computer software and smart-terminal apps differ in protocol, type, and version, and new application software appears every day, which makes network traffic highly complex.
(5) Operating system. Different operating systems generate different network background traffic; for example, the time to live (TTL, Time To Live) and the TCP initial sequence number (ISN, Initial Sequence Number) depend on the type of operating system.
(6) Emergency events. The spread of an emergency event over the network affects network traffic; for example, breaking news, the spread of malicious viruses, and online shopping promotions all affect network traffic.
For the above reasons, network traffic modeling is one of the major difficulties in traffic generation.
The conditional generative adversarial network (CGAN, Conditional Generative Adversarial Networks) in deep learning is a generator that, by certain means, models the probability distribution of data so that this distribution is consistent with, or as close as possible to, the statistical distribution of the observed data. The CGAN model is an extension of the generative adversarial network (GAN, Generative Adversarial Networks) model: additional condition information y is added to the inputs of both the generator and the discriminator. y can be any information, such as class information, time information, location information, or user information.
The present invention introduces the CGAN model from deep learning, trains it on massive training data to obtain a CGAN model that reflects real traffic characteristics, and generates the simulated traffic the user needs based on that model. The principle of CGAN is as follows: traffic is generated by the generative model, and the real traffic and the generated traffic are fed into the discriminative model for binary discrimination. If the discriminative model discriminates successfully, the generative model is optimized iteratively by back-propagation and gradient descent; if the discriminative model cannot discriminate accurately, the discriminative model is optimized iteratively by back-propagation and gradient descent. A packet parsing module is introduced into the discriminative model and trains the discriminative model jointly with the sample data. Through the two-player minimax game between the generative model and the discriminative model, the two reach a Nash equilibrium, which yields the traffic generation model.
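The alternation rule described above (optimize the generative model when the discriminative model succeeds, otherwise optimize the discriminative model) can be sketched as a small scheduling helper. The text does not quantify "discriminates successfully", so the accuracy criterion, the 0.5 decision threshold, and all names below are assumptions of this example.

```python
def discriminator_accuracy(d_real, d_fake, threshold=0.5):
    """Fraction of samples the discriminator classifies correctly.

    d_real / d_fake: discriminator outputs, interpreted as P(input is real),
    for a batch of real samples and a batch of generated samples.
    """
    correct = sum(p > threshold for p in d_real)
    correct += sum(p <= threshold for p in d_fake)
    return correct / (len(d_real) + len(d_fake))


def model_to_update(d_real, d_fake, success_accuracy=0.5):
    """Pick which model to optimize next.

    Assumption: the discriminator "succeeds" when its batch accuracy exceeds
    success_accuracy (0.5 corresponds to random guessing).
    """
    if discriminator_accuracy(d_real, d_fake) > success_accuracy:
        return "generator"
    return "discriminator"
```

In a full training loop, the returned label would select which model receives the next back-propagation and gradient descent step.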
Fig. 1 is a schematic diagram of a network background traffic generation system based on a conditional generative adversarial network according to the present invention. As shown in Fig. 1, the network background traffic generation system of the present invention mainly comprises a generative model and a discriminative model. The data comprise random vectors, real traffic, simulated background traffic, and so on. The generative model generates simulated background traffic from a random vector; the discriminative model judges whether the simulated background traffic and real traffic fed to it are real or fake.
Fig. 2 is a flow chart of a network background traffic generation method based on a conditional generative adversarial network according to the present invention. As shown in Fig. 2, the CGAN-based network background traffic generation method of the present invention comprises the following steps:
Step S1, data acquisition, cleaning, and vectorization: real traffic is captured through the mirror ports of switches and routers and can be saved in file formats such as pcap. The captured real traffic is used for training, validating, and testing the CGAN model;
Step S2, model training: the CGAN model is trained on massive real network traffic and condition information to obtain the generative model and the discriminative model;
Step S3, simulated background traffic is generated based on the trained generative model.
Specifically, step S1 comprises:
Step S11, data acquisition. A collector is connected to the mirror port of a switch or router to obtain data, and the captured data are stored by means such as libpcap or DPDK. At this point the network traffic data obtained are packets, which can be stored in formats such as pcap or binary. The captured packets mainly have four formats: Ethernet v2, Ethernet 802.3 raw (Novell Ethernet), IEEE 802.3/802.2 LLC (Ethernet 802.3 SAP), and IEEE 802.3/802.2 SNAP. Fig. 3 is a schematic diagram of the frame structures of the packets of network background traffic based on a conditional generative adversarial network according to the present invention. As shown in Fig. 3, the frame structures were designed by different entities for different purposes. They can coexist in one network but are not compatible with each other; to exchange information with a workstation that uses a different encapsulation type, communication must go through a router that supports it.
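For readers who want to inspect a stored capture, the sketch below parses the classic pcap file layout (a 24-byte global header followed by 16-byte record headers) using only the Python standard library. It is an illustration rather than the capture code of the invention: the function name is hypothetical, and only the microsecond-resolution pcap variant is handled.

```python
import struct


def parse_pcap(data: bytes):
    """Parse a classic pcap file held in memory.

    Returns a list of (timestamp_s, frame_bytes) tuples.
    """
    if len(data) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"  # file written in little-endian byte order
    elif magic == 0xD4C3B2A1:
        endian = ">"  # file written in big-endian byte order
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    packets, offset = [], 24  # records start after the 24-byte global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += 16
        frame = data[offset:offset + incl_len]
        if len(frame) < incl_len:
            break  # last record is truncated; stop reading
        packets.append((ts_sec + ts_usec / 1_000_000, frame))
        offset += incl_len
    return packets
```

The timestamp of each record can later supply the hour used as condition information.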
Step S12, data vectorization. Packets captured from a real network are between 64 and 1518 bytes long. For ease of processing, all packets are extended to 1518 bytes: packets shorter than 1518 bytes are zero-padded at the end of the data field. A packet can therefore be expressed as a first-order tensor: package = [byte[0], byte[1], ..., byte[1517]]. The condition information covers four categories: time, place, user group, and emergency event. Weighing the influence of these four kinds of conditions on network traffic against ease of operation, only the influence of time is considered, and it is represented as a first-order tensor y = [time], where time takes hourly values from 0 to 23. The real traffic finally fed into the model is the concatenation of the 1518-dimensional random noise z and the 1-dimensional condition information vector y, that is, the 1519-dimensional vector input = [byte[0], byte[1], ..., byte[1517], time].
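The padding and concatenation in step S12 can be sketched as below. The function and constant names are hypothetical, and the sketch assumes that the data field ends at the end of the captured frame bytes.

```python
MIN_FRAME = 64    # shortest packet length in bytes (m >= 64)
MAX_FRAME = 1518  # longest packet length in bytes (m <= 1518)


def vectorize_packet(frame: bytes, hour: int) -> list:
    """Zero-pad a frame to 1518 bytes and append the hour, giving 1519 values."""
    if not MIN_FRAME <= len(frame) <= MAX_FRAME:
        raise ValueError("frame length must be between 64 and 1518 bytes")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    padded = frame + bytes(MAX_FRAME - len(frame))  # n = 1518 - m zero bytes
    return list(padded) + [hour]
```

For a 64-byte frame captured at 13:00, the result holds the 64 original byte values, 1454 zeros, and the final element 13.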
Step S13, sample data classification. According to its purpose, the captured real traffic is divided into three sets: a training set, a validation set, and a test set. The training set is the sample set used for learning; each undetermined coefficient of the CGAN network is determined from it. The validation set is the sample set used to adjust the parameters of the classifier: during training, the generative model is verified on the validation set immediately, to check whether its loss function value is decreasing and whether its accuracy is improving. The validation set is one means of preventing over-fitting, guarding against the loss function starting to climb again after it has fallen to a certain level. The test set is used after training of the generative model is complete to test the capability of the generative model. In an embodiment of the present invention, the proportions of the training set, validation set, and test set are 60%, 20%, and 20%.
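A minimal sketch of the 60%/20%/20% division follows. The function name is hypothetical, and shuffling the samples before the split is an assumption of this example; the text above does not state whether the samples are shuffled.

```python
import random


def split_samples(samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split samples into (training, validation, test) lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # assumption: random assignment to sets
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    train = items[:n_train]
    validation = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, validation, test
```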
The CGAN model of the present invention comprises a generative model and a discriminative model.
I. Generative model
The generative model uses a conventional long short-term memory network (Long Short-Term Memory, LSTM). Conventional LSTM refers to the standard LSTM structure proposed by Sepp Hochreiter and Jürgen Schmidhuber. Researchers have proposed LSTM networks with other variant structures for different application scenarios; the conventional LSTM meets the needs of the present invention. Fig. 4 is a schematic diagram of the LSTM network structure. As shown in Fig. 4, the LSTM network is arranged as a matrix with two dimensions: depth and time. The time dimension reflects the influence between the elements of the input sequence, and the depth dimension reflects the influence between different levels of data representation. Compared with convolutional neural networks (Convolutional Neural Networks, CNN), LSTM adds the time dimension, which corresponds to the time-series relationships in network traffic.
Fig. 5 is a schematic diagram of the horizontal structure of the LSTM network. As shown in Fig. 5, the input layer and the output layer of the LSTM network have no horizontal connections; only the hidden layers do. In a hidden layer, the horizontal structure reflects the relationship between the elements of the input sequence along the time dimension.
The vertical structure of the LSTM network reflects the iterative optimization relationship between LSTM network levels of different depths. Along the depth dimension, the LSTM network can be divided into an input layer, hidden layers, and an output layer. The input of the input layer is $X_0$, which is output directly as $C_0$ and $h_0$; $X_0$ is 1518-dimensional random noise following a Gaussian distribution. The hidden-layer neuron adjacent to the input layer takes $C_0$ and $h_0$ as inputs, both of which correspond to the output $X_0$ of the input layer. The hidden-layer neurons at each depth then optimize $h_t$ repeatedly, and the neurons share weights whether viewed vertically or horizontally. The output layer directly outputs $h_0, h_1, \ldots, h_n$, which corresponds to the input traffic sequence of the discriminative model.
Fig. 6 is a schematic diagram of the structure of a hidden-layer neuron of the LSTM network. As shown in Fig. 6, each neuron has three inputs: the sample data vector $x_t$, the long-term memory vector $C_{t-1}$, and the short-term memory vector $h_{t-1}$. $x_t$ and $h_{t-1}$ are concatenated before being fed into the neuron and form the input of the neuron's memory gate. Each neuron has two outputs: the long-term memory vector $C_t$ and the short-term memory vector $h_t$. Each grey box in Fig. 5 represents a feedforward neural network; from left to right they are labeled the 1st, 2nd, 3rd, and 4th feedforward neural networks. Based on repeated trials and tuning during training on real data, the 1st, 2nd, and 4th feedforward neural networks use the sigmoid activation function, and the 3rd uses tanh. num_units denotes the number of hidden neurons in each small yellow box.
The hidden-layer neuron of the LSTM network is described below:
1. Long-term information forgetting mechanism. The first gate, the neuron memory gate $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$, is a vector; each of its elements lies in the interval [0, 1] and gives the proportion of the corresponding element of the long-term memory vector $C_{t-1}$ that is forgotten;
2. New information input gate. The second gate, the information input gate $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, is a vector; each of its elements is a value in the interval [0, 1] and gives the proportion of the corresponding element of the newly input vector $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$ that is added to the long-term memory;
3. A new vector $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$ is generated, which represents the long-term memory information;
4. Information output gate. Besides outputting the long-term memory $C_t$, the neuron converts a proportion of the long-term memory into the short-term memory $h_t = o_t * \tanh(C_t)$ and outputs it to the next layer of the network and to the next input-sequence position of the same layer, where $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$.
The neurons of one hidden layer of the LSTM network share weights, that is, $W_f$, $b_f$, $W_i$, $b_i$, $W_C$, $b_C$, $W_o$, and $b_o$ are identical within the same hidden layer. $W_f$ and $b_f$ are the weight and bias of the neuron memory gate; $W_i$ and $b_i$ are the weight and bias of the information input gate; $W_C$ and $b_C$ are the weight and bias for generating new information; $W_o$ and $b_o$ are the weight and bias of the information output gate.
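One time step of a hidden-layer neuron, following the four stages listed above, can be written in plain Python as shown below. This is a sketch of the standard LSTM cell equations; the parameter dictionary layout and the helper names are choices made for this example only.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Compute W . v + b for a weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step. p maps "Wf", "bf", "Wi", "bi", "WC", "bC", "Wo", "bo"."""
    hx = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["Wf"], p["bf"], hx)]        # memory gate
    i_t = [sigmoid(a) for a in affine(p["Wi"], p["bi"], hx)]        # input gate
    c_new = [math.tanh(a) for a in affine(p["WC"], p["bC"], hx)]    # candidate
    o_t = [sigmoid(a) for a in affine(p["Wo"], p["bo"], hx)]        # output gate
    c_t = [f * c + i * n for f, c, i, n in zip(f_t, c_prev, i_t, c_new)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

Each weight matrix has num_units rows and num_units + len(x_t) columns, because the gates act on the concatenated vector.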
II. Discriminative model
Fig. 7 is a schematic diagram of the structure of the discriminative model of a network background traffic generation method based on a conditional generative adversarial network according to the present invention. As shown in Fig. 7, the discriminative model adds a packet parsing unit and the SoftMax function of the output layer after the output layer of the LSTM network model. The generative model and the discriminative model are independent of each other, but both are implemented on the basis of the LSTM network model. The discriminative model has a packet parsing function: if a packet fails to parse, it is identified as forged data.
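How a parsing check might be combined with the SoftMax output can be sketched as follows. The text does not specify which fields the packet parsing unit checks or how its verdict is merged with the LSTM output, so everything here is an assumption for illustration: the check covers only IPv4 carried in Ethernet v2 frames, a failed parse forces the probability of "real" to zero, and all names are hypothetical.

```python
import math
import struct


def parses_as_ipv4_frame(frame: bytes) -> bool:
    """Minimal structural check of an Ethernet v2 frame carrying IPv4."""
    if len(frame) < 14 + 20:  # Ethernet header plus minimal IPv4 header
        return False
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:  # 0x0800 marks IPv4
        return False
    version, ihl = frame[14] >> 4, frame[14] & 0x0F
    if version != 4 or ihl < 5:
        return False
    total_length = struct.unpack("!H", frame[16:18])[0]
    # The IP datagram must hold its own header and fit inside the frame
    # (trailing zero padding is allowed).
    return ihl * 4 <= total_length <= len(frame) - 14


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probability_real(frame: bytes, lstm_logits) -> float:
    """P(real): 0.0 if parsing fails, else SoftMax over [real, fake] logits."""
    if not parses_as_ipv4_frame(frame):
        return 0.0
    return softmax(lstm_logits)[0]
```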
Three, the optimization process of CGAN model
The CGAN model has no loss function in the conventional sense; its optimization process is a two-player minimax game:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))] \quad (1)$$
In formula (1), $V(D, G)$ is the evaluation function of the generation model and the discrimination model. $\min_G \max_D$ expresses the opposing abilities of the two models: the discriminating ability of the discrimination model is made as strong as possible, while the probability that the data produced by the generation model is correctly identified by the discrimination model is made as small as possible. $x$ is the real input data and $z$ is random noise; $x \sim P_{data}(x)$ and $z \sim P_z(z)$ mean that $x$ and $z$ follow their respective distributions. $D(x)$ is the probability that the real data $x$ is judged to be real, i.e. discriminated correctly; $G(z)$ is the data obtained by feeding the noise $z$ into the generation model; $D(G(z))$ is the probability that the discrimination model judges the generated data $G(z)$ to be real.
During training, the discrimination model and the generation model are trained alternately, with opposite aims with respect to the evaluation function. The discrimination model is trained to maximize it: $D(x)$ is increased and $D(G(z))$ is decreased, which strengthens the ability of the discrimination model to discriminate correctly. The generation model is trained to minimize it: $1 - D(G(z))$ is decreased, i.e. the generation model forges data that the discrimination model takes to be real.
The CGAN model implicitly defines a probability distribution Pg, and the aim is for Pg to converge to the true data distribution Pdata. In this minimax game, the optimal solution is reached when Pg = Pdata, i.e. at Nash equilibrium. At that point the generation model has recovered the distribution of the training data and the discrimination model can no longer tell real data from generated data, so its accuracy equals that of random guessing, 50%.
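As a small numerical illustration, the sketch below estimates the evaluation function from discrimination-model outputs, assuming it has the standard GAN form V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], where D(.) is the probability that an input is judged real. The function name `evaluation_value` and the sample probabilities are made up for the example.

```python
import math


def evaluation_value(d_real, d_fake):
    """Estimate V(D, G) from a batch of discrimination-model outputs.

    d_real: D(x) for real samples; d_fake: D(G(z)) for generated samples.
    All values must lie strictly between 0 and 1.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# At Nash equilibrium the discrimination model outputs 0.5 for every input.
equilibrium = evaluation_value([0.5] * 4, [0.5] * 4)

# A discrimination model that separates real from generated data scores higher.
strong_discriminator = evaluation_value([0.9] * 4, [0.1] * 4)
```

Under this assumed form, the equilibrium value is -log 4, and any discrimination model that still separates the two kinds of data yields a larger value, which is why the discrimination model maximizes the function and the generation model minimizes it.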
The CGAN model adds condition information $y$ to the GAN model. The evaluation function becomes: $\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x|y)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z|y)))]$. The condition can be a class label or other multi-modal information. Here, time information is used: a day is divided into 24 hours, and each hour corresponds to one value of the time information.
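How the hourly condition can be attached to a packet is sketched below. It follows the vectorization step of the method (zero-padding each packet of 64 to 1518 bytes up to 1518 dimensions, then appending a 1-dimensional condition). Using the raw hour 0 to 23 as the condition value and the names `hour_condition` and `vectorize` are assumptions made for illustration.

```python
from datetime import datetime

MAX_PACKET_LEN = 1518  # packet vectors are padded to this many dimensions
MIN_PACKET_LEN = 64


def hour_condition(captured_at: datetime) -> int:
    """Map a capture time to one of 24 condition values (assumed: the hour)."""
    return captured_at.hour


def vectorize(packet: bytes, captured_at: datetime) -> list:
    """Zero-pad a packet to 1518 dimensions and append the 1-D condition."""
    if not MIN_PACKET_LEN <= len(packet) <= MAX_PACKET_LEN:
        raise ValueError("packet length must be between 64 and 1518 bytes")
    padded = packet + bytes(MAX_PACKET_LEN - len(packet))  # n = 1518 - m zeros
    return list(padded) + [hour_condition(captured_at)]


# Example: a 64-byte packet captured at 13:30 becomes a 1519-dimensional vector.
example = vectorize(bytes([1]) * 64, datetime(2019, 1, 7, 13, 30))
```

In practice the byte values and the condition would probably be scaled to a common range before training; that step is omitted here because the text does not describe it.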
Specifically, the iteration optimization algorithms of CGAN model training include:
(1) Gradient update algorithm of the GAN model
As described above, the evaluation function is formula (1), and it is optimized with mini-batch gradient descent as follows. Any gradient update algorithm can be used; a momentum-based gradient update algorithm is used here. The algorithm is as follows:
Algorithm input: the hyperparameters k and m. A hyperparameter is a parameter that is set manually rather than obtained through training. The parameter k is the number of training steps of the discrimination model performed for each training step of the generation model; the parameter m is the number of samples in a mini-batch.
Algorithm procedure:
for each training iteration do
    for k steps do
        Sample m noise samples $\{z^{(1)}, z^{(2)}, ..., z^{(m)}\}$ from the given noise prior distribution $P_z(z)$ (a random Gaussian distribution is used)
        Sample m real data samples $\{x^{(1)}, x^{(2)}, ..., x^{(m)}\}$
        Update the discrimination model by stochastic gradient ascent: $\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} [\log D(x^{(i)}) + \log(1 - D(G(z^{(i)})))]$
    end for
    Sample m noise samples $\{z^{(1)}, z^{(2)}, ..., z^{(m)}\}$ from the given noise prior distribution $P_z(z)$
    Update the generation model by stochastic gradient descent: $\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log(1 - D(G(z^{(i)})))$
end for
Here $\theta_d$ and $\theta_g$ denote the weight parameters of the discrimination model and the generation model.
Algorithm output: the weight parameters of the generation model and the discrimination model.
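The alternating update scheme can be sketched on a deliberately tiny problem. This is not the LSTM-based model of the patent: the generation model is assumed to be G(z) = a*z + b, the discrimination model D(x) = sigmoid(w*x + c), the real data a one-dimensional Gaussian, and the gradients of the standard GAN objective are written out by hand. All names and default values are illustrative.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(iterations=200, k=1, m=16, lr=0.05, beta=0.9, seed=0):
    """Alternate k discrimination-model steps with one generation-model step."""
    rng = random.Random(seed)
    gen = {"a": 1.0, "b": 0.0}   # generation model G(z) = a*z + b
    dis = {"w": 0.1, "c": 0.0}   # discrimination model D(x) = sigmoid(w*x + c)
    vel = {name: 0.0 for name in ("a", "b", "w", "c")}  # momentum buffers

    def G(z):
        return gen["a"] * z + gen["b"]

    def D(x):
        return sigmoid(dis["w"] * x + dis["c"])

    for _ in range(iterations):
        for _ in range(k):
            fakes = [G(rng.gauss(0.0, 1.0)) for _ in range(m)]
            reals = [rng.gauss(3.0, 1.0) for _ in range(m)]  # toy "real" data
            # Gradient of (1/m) * sum[log D(x) + log(1 - D(G(z)))] w.r.t. w, c.
            grad_w = (sum((1 - D(x)) * x for x in reals)
                      - sum(D(g) * g for g in fakes)) / m
            grad_c = (sum(1 - D(x) for x in reals)
                      - sum(D(g) for g in fakes)) / m
            for name, grad in (("w", grad_w), ("c", grad_c)):
                vel[name] = beta * vel[name] + grad
                dis[name] += lr * vel[name]          # gradient ascent
        zs = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # Gradient of (1/m) * sum[log(1 - D(G(z)))] w.r.t. a, b.
        grad_a = sum(-D(G(z)) * dis["w"] * z for z in zs) / m
        grad_b = sum(-D(G(z)) * dis["w"] for z in zs) / m
        for name, grad in (("a", grad_a), ("b", grad_b)):
            vel[name] = beta * vel[name] + grad
            gen[name] -= lr * vel[name]              # gradient descent
    return gen, dis
```

The sketch only demonstrates the control flow (inner loop of k ascent steps, then one descent step, both with momentum); it makes no claim about convergence, which for adversarial training generally depends on the learning rate, the momentum coefficient and the model capacity.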
(2) LSTM forward propagation algorithm
Both the generation model and the discrimination model are built from the LSTM model; the discrimination model additionally applies a SoftMax activation function after the LSTM model. The forward propagation algorithm of the LSTM network is described below.
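1) Update the forget gate (memory gate) output: $f^{(t)} = \sigma(W_f h^{(t-1)} + U_f x^{(t)} + b_f)$
2) Update the two parts of the input gate output:
$i^{(t)} = \sigma(W_i h^{(t-1)} + U_i x^{(t)} + b_i)$
$\tilde{C}^{(t)} = \tanh(W_C h^{(t-1)} + U_C x^{(t)} + b_C)$
3) Update the cell state:
$C^{(t)} = C^{(t-1)} \odot f^{(t)} + i^{(t)} \odot \tilde{C}^{(t)}$
4) Update the output gate output:
$o^{(t)} = \sigma(W_o h^{(t-1)} + U_o x^{(t)} + b_o)$
$h^{(t)} = o^{(t)} \odot \tanh(C^{(t)})$
5) Update the prediction output at the current sequence index, which the output layer computes from $h^{(t)}$.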
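Steps 1) to 4) can be sketched as a single function in plain Python. The candidate vector is assumed to be tanh(W_C h + U_C x + b_C), the usual LSTM form, and the hidden state uses the current output gate o(t). The parameter layout (a dictionary of (W, U, b) triples keyed by gate name) and the function names are illustrative only.

```python
import math


def affine(gate, h_prev, x):
    """Compute W h + U x + b for one gate; W and U are lists of rows."""
    W, U, b = gate
    return [
        sum(w * h for w, h in zip(w_row, h_prev))
        + sum(u * xi for u, xi in zip(u_row, x))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def sigmoid_vec(v):
    return [1.0 / (1.0 + math.exp(-s)) for s in v]


def lstm_step(params, h_prev, c_prev, x):
    """One forward step; returns the new hidden state h and cell state C."""
    f = sigmoid_vec(affine(params["f"], h_prev, x))               # forget gate
    i = sigmoid_vec(affine(params["i"], h_prev, x))               # input gate
    cand = [math.tanh(s) for s in affine(params["C"], h_prev, x)]  # candidate
    o = sigmoid_vec(affine(params["o"], h_prev, x))               # output gate
    c = [cp * fi + ii * ai for cp, fi, ii, ai in zip(c_prev, f, i, cand)]
    h = [oi * math.tanh(ci) for oi, ci in zip(o, c)]
    return h, c


# Usage with one hidden unit, one input and all-zero weights:
zero_gate = ([[0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("f", "i", "C", "o")}
h, c = lstm_step(params, h_prev=[0.0], c_prev=[2.0], x=[1.0])
```

With all-zero weights every gate outputs 0.5 and the candidate is 0, so the cell state is simply halved, which makes the data flow easy to check by hand.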
(3) LSTM back-propagation algorithm
The LSTM back-propagation algorithm iteratively updates all parameters by gradient descent: $(W_f, U_f, b_f)$, $(W_i, U_i, b_i)$, $(W_C, U_C, b_C)$, $(W_o, U_o, b_o)$. The key point is to compute the partial derivatives of the loss function with respect to all parameters.
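In a recurrent neural network (Recurrent Neural Network, RNN), the error is back-propagated by passing the gradient $\delta^{(t)}$ of the hidden state $h^{(t)}$ backward step by step. The LSTM is similar, except that it has two hidden states, $h^{(t)}$ and $C^{(t)}$. Two $\delta^{(t)}$ are therefore defined here, namely:
$\delta_h^{(t)} = \frac{\partial L}{\partial h^{(t)}}$, $\delta_C^{(t)} = \frac{\partial L}{\partial C^{(t)}}$
where $L$ is the loss function. Only $\delta_C^{(t)}$ is actually used during back-propagation; the variable $\delta_h^{(t)}$ merely assists the calculation within a given layer and does not take part in back-propagation.
At the final sequence index $\tau$, $\delta_h^{(\tau)}$ is the gradient of the loss with respect to $h^{(\tau)}$, and $\delta_C^{(\tau)}$ is:
$\delta_C^{(\tau)} = \delta_h^{(\tau)} \odot o^{(\tau)} \odot (1 - \tanh^2(C^{(\tau)}))$
Then $\delta_C^{(t)}$ is derived backward from $\delta_C^{(t+1)}$.
The gradient $\delta_h^{(t)}$ is determined by the output gradient error of this layer, i.e. $\delta_h^{(t)} = \frac{\partial L}{\partial h^{(t)}}$. The backward gradient error $\delta_C^{(t)}$ consists of two parts, the gradient error $\delta_C^{(t+1)}$ passed back from the following sequence index and the gradient error passed back from $h^{(t)}$ in this layer, i.e.:
$\delta_C^{(t)} = \delta_C^{(t+1)} \odot f^{(t+1)} + \delta_h^{(t)} \odot o^{(t)} \odot (1 - \tanh^2(C^{(t)}))$
With $\delta_h^{(t)}$ and $\delta_C^{(t)}$, the gradients of the parameters $(W_f, U_f, b_f)$, $(W_i, U_i, b_i)$, $(W_C, U_C, b_C)$, $(W_o, U_o, b_o)$ can be calculated. Taking the gradient of $W_f$ as an example:
$\frac{\partial L}{\partial W_f} = \sum_{t=1}^{\tau} [\delta_C^{(t)} \odot C^{(t-1)} \odot f^{(t)} \odot (1 - f^{(t)})] (h^{(t-1)})^T$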
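The backward pass over the cell state can be sketched for a single hidden unit. The sketch assumes the recursion delta_C(t) = delta_C(t+1) * f(t+1) + delta_h(t) * o(t) * (1 - tanh^2(C(t))), which follows from h(t) = o(t) * tanh(C(t)) and the cell state update C(t+1) = C(t) * f(t+1) + ..., and it assumes, as the text states, that delta_h(t) comes only from the output error of the current step. The function name and the input values are illustrative.

```python
import math


def cell_state_deltas(delta_h, f, o, c):
    """Back-propagate the cell-state gradient for a single hidden unit.

    delta_h[t], f[t], o[t] and c[t] are scalars for t = 0 .. tau (0-based).
    Returns the list delta_C with delta_C[t] = dL/dC(t).
    """
    tau = len(c) - 1
    # Gradient that reaches C(t) through h(t) = o(t) * tanh(C(t)).
    local = [dh * ot * (1.0 - math.tanh(ct) ** 2)
             for dh, ot, ct in zip(delta_h, o, c)]
    delta_c = [0.0] * (tau + 1)
    delta_c[tau] = local[tau]                 # final sequence index
    for t in range(tau - 1, -1, -1):          # walk backward in time
        delta_c[t] = delta_c[t + 1] * f[t + 1] + local[t]
    return delta_c


# Three time steps with C(t) = 0, so tanh(C) = 0 and the local term is dh * o.
deltas = cell_state_deltas(
    delta_h=[1.0, 1.0, 1.0],
    f=[1.0, 0.5, 0.5],
    o=[0.5, 0.5, 0.5],
    c=[0.0, 0.0, 0.0],
)
```

Once the delta_C values are available, each parameter gradient is a sum over time of delta_C(t) multiplied by the local derivative of C(t) with respect to that parameter, as in the W_f example.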
Once the parameters of the CGAN model have been trained, the generation model is obtained. To generate simulated traffic, a random vector is input to the generation model, which outputs simulated background traffic.
The network background traffic generation method of the invention is based on the trained generation model. Taking a random noise vector and the condition information as input, it automatically generates simulated background traffic similar to real network traffic. No manual work is needed for network traffic modeling, network traffic feature extraction or network traffic generation, which greatly reduces labor cost and markedly improves the efficiency of network traffic generation.
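The generation stage can be sketched as follows, assuming the trained generation model is available as a callable that maps a noise vector with the condition appended to 1518 output values. The noise dimension of 100, the clamping of outputs to byte values, and the names `generate_packet` and `stub_generator` are assumptions for illustration; the stub merely stands in for a trained model.

```python
import random

PACKET_DIM = 1518  # dimension of a generated packet vector


def generate_packet(generator, hour, noise_dim=100, seed=None):
    """Draw Gaussian noise, append the hour condition, and emit packet bytes."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    output = generator(z + [float(hour)])
    # Clamp each output value to the valid byte range 0..255.
    return bytes(min(255, max(0, round(v))) for v in output[:PACKET_DIM])


def stub_generator(vec):
    """Hypothetical stand-in for the trained generation model."""
    return [abs(vec[i % len(vec)]) * 40.0 for i in range(PACKET_DIM)]


packet = generate_packet(stub_generator, hour=13, seed=1)
```

Trailing zero padding would then be stripped, and the bytes written out (for example as a pcap record), before the packet is replayed as background traffic.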

Claims (10)

1. A network background traffic generation method based on a conditional generative adversarial network, characterized by comprising:
a data acquisition step of acquiring network traffic data and condition information and vectorizing them into real traffic;
a model generation step of obtaining an initial generation model and an initial discrimination model from the real traffic, and training the initial generation model and the initial discrimination model by means of a conditional generative adversarial network to obtain a generation model;
a traffic generation step of generating simulated background traffic from a random vector by means of the generation model.
2. The network background traffic generation method according to claim 1, characterized in that the data acquisition step specifically comprises:
an obtaining step of obtaining the network traffic data and the condition information through a mirror port of a network access device, and storing the network traffic data as a plurality of data packets, the length of each data packet being m bytes;
a vectorization step of appending n zeros to the end of the data field of each data packet, vectorizing the packet into a 1518-dimensional random noise vector, and concatenating the 1-dimensional condition information after the random noise vector to form the 1519-dimensional real traffic;
a classifying step of dividing the real traffic into a training set, a validation set and a test set, wherein the training set is used to determine the coefficients of the initial generation model and the initial discrimination model, the validation set is used to verify the loss function of the initial generation model, and the test set is used to test the effect of the generation model;
wherein the data packets are in pcap format or binary format, m and n are integers, 64 ≤ m ≤ 1518, and n = 1518 - m.
3. The network background traffic generation method according to claim 2, characterized in that the condition information is the time information at which the network traffic data is obtained.
4. The network background traffic generation method according to claim 1, characterized in that the generation model and the discrimination model are long short-term memory (LSTM) network models.
5. The network background traffic generation method according to claim 4, characterized in that a packet parsing unit and the SoftMax function of the output layer are further provided after the output layer of the long short-term memory network model of the discrimination model.
6. A network background traffic generation system based on a conditional generative adversarial network, characterized by comprising:
a data acquisition module for obtaining real traffic, wherein network traffic data and condition information are acquired and vectorized into the real traffic;
a model generation module for obtaining an initial generation model and an initial discrimination model from the real traffic, and training the initial generation model and the initial discrimination model by means of a conditional generative adversarial network to obtain a generation model;
a traffic generation module for generating simulated background traffic from a random vector by means of the generation model.
7. The network background traffic generation system according to claim 6, characterized in that the data acquisition module specifically comprises:
an obtaining module for obtaining the network traffic data and the condition information through a mirror port of a network access device, and storing the network traffic data as a plurality of data packets, the length of each data packet being m bytes;
a vectorization module for appending n zeros to the end of the data field of each data packet, vectorizing the packet into a 1518-dimensional random noise vector, and concatenating the 1-dimensional condition information after the random noise vector to form the 1519-dimensional real traffic;
a classifying module for dividing the real traffic into a training set, a validation set and a test set, wherein the training set is used to determine the coefficients of the initial generation model and the initial discrimination model, the validation set is used to verify the loss function of the initial generation model, and the test set is used to test the effect of the generation model;
wherein the data packets are in pcap format or binary format, m and n are integers, 64 ≤ m ≤ 1518, and n = 1518 - m.
8. The network background traffic generation system according to claim 7, characterized in that the condition information is the time information at which the network traffic data is obtained.
9. The network background traffic generation system according to claim 6, characterized in that the generation model and the discrimination model are long short-term memory (LSTM) network models.
10. The network background traffic generation system according to claim 9, characterized in that a packet parsing unit and the SoftMax function of the output layer are further provided after the output layer of the long short-term memory network model of the discrimination model.
CN201910012933.5A 2019-01-07 2019-01-07 Network background flow generation method and system based on condition generation type countermeasure network Active CN109889452B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910012933.5A CN109889452B (en) 2019-01-07 2019-01-07 Network background flow generation method and system based on condition generation type countermeasure network

Publications (2)

Publication Number Publication Date
CN109889452A true CN109889452A (en) 2019-06-14
CN109889452B CN109889452B (en) 2021-06-11

Family

ID=66925676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910012933.5A Active CN109889452B (en) 2019-01-07 2019-01-07 Network background flow generation method and system based on condition generation type countermeasure network

Country Status (1)

Country Link
CN (1) CN109889452B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951919A (en) * 2017-03-02 2017-07-14 浙江工业大学 A kind of flow monitoring implementation method based on confrontation generation network
CN108564129A (en) * 2018-04-24 2018-09-21 电子科技大学 A kind of track data sorting technique based on generation confrontation network
CN109086658A (en) * 2018-06-08 2018-12-25 中国科学院计算技术研究所 A kind of sensing data generation method and system based on generation confrontation network

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602078A (en) * 2019-09-04 2019-12-20 南京邮电大学 Application encryption traffic generation method and system based on generation countermeasure network
CN111159250A (en) * 2019-12-19 2020-05-15 电子科技大学 Mobile terminal user behavior detection method based on nested deep twin neural network
CN111159250B (en) * 2019-12-19 2023-02-21 电子科技大学 Mobile terminal user behavior detection method based on nested deep twin neural network
CN111049762A (en) * 2019-12-23 2020-04-21 上海金仕达软件科技有限公司 Data acquisition method and device, storage medium and switch
CN111651765A (en) * 2020-05-27 2020-09-11 上海交通大学 Program execution path generation method based on generative countermeasure network
CN111651765B (en) * 2020-05-27 2023-05-02 上海交通大学 Program execution path generation method based on generation type countermeasure network
CN111881620B (en) * 2020-07-15 2022-12-30 哈尔滨工业大学(威海) User software behavior simulation system based on reinforcement learning algorithm and GAN model and working method thereof
CN111881620A (en) * 2020-07-15 2020-11-03 哈尔滨工业大学(威海) User software behavior simulation system based on reinforcement learning algorithm and GAN model and working method thereof
CN113507429A (en) * 2021-04-16 2021-10-15 华东师范大学 Generation method of intrusion flow based on generation type countermeasure network
CN113507429B (en) * 2021-04-16 2022-04-05 华东师范大学 Generation method of intrusion flow based on generation type countermeasure network
CN113542271A (en) * 2021-07-14 2021-10-22 西安电子科技大学 Network background flow generation method based on generation of confrontation network GAN
CN113542271B (en) * 2021-07-14 2022-07-26 西安电子科技大学 Network background flow generation method based on generation of confrontation network GAN
CN114326655A (en) * 2021-11-30 2022-04-12 深圳先进技术研究院 Industrial robot fault data generation method, system, terminal and storage medium
CN115277086A (en) * 2022-06-16 2022-11-01 西安电子科技大学 Network background flow generation method based on generation countermeasure network
CN115277086B (en) * 2022-06-16 2023-10-20 西安电子科技大学 Network background flow generation method based on generation of countermeasure network

Also Published As

Publication number Publication date
CN109889452B (en) 2021-06-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant