CN109889452A - Network background traffic generation method and system based on a conditional generative adversarial network
- Publication number
- CN109889452A (application CN201910012933.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present invention relates to a network background traffic generation method based on a conditional generative adversarial network, comprising: a data acquisition step, in which network traffic data and condition information are acquired and vectorized into real traffic; a model generation step, in which an initial generative model and an initial discriminative model are obtained from the real traffic and trained by means of the conditional generative adversarial network to obtain a generative model; and a traffic generation step, in which the generative model generates simulated background traffic from a random vector.
Description
Technical field
The present invention relates to the fields of network security and network simulation, and in particular to a method and system for generating network background traffic based on a conditional generative adversarial network (CGAN).
Background art
Network traffic generation technology is mostly used for network testing. Since the early days of networking it has evolved continuously as networks have grown in scale and complexity. Network traffic generation methods fall into three classes: generation methods based on statistical models, generation methods based on traffic characteristics, and generation methods based on applications/sessions.
1. Generation methods based on statistical models
These methods rest on statistical analysis theory. They use statistical models such as Poisson, ON/OFF, FBM/FGN, or multifractal models to characterize the distribution of network traffic, and generate simulated traffic from those models. The principle is to establish a statistical model of network traffic through theoretical analysis such as probability and statistics, and then to generate simulated traffic from that model. Tools of this class generally consist of two parts: a statistical model and a network traffic generation tool. The statistical model is the core and embodies the central idea of traffic generation; the network traffic generation tool generates simulated traffic from the statistical model. The technical means used are mainly probabilistic models, including the Poisson model, the exponential distribution, the ON/OFF model, the Weibull distribution, the Pareto distribution, and the Gaussian distribution. Traffic generation with such a tool consists of setting the initialization parameters of the statistical model, configuring the statistical model, and then generating simulated traffic from it.
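To make the statistical-model approach concrete, here is a minimal sketch of such a generator. It assumes Poisson arrivals (exponential inter-arrival times) and Pareto-distributed packet sizes; the function name `generate_poisson_traffic`, the rate, and the Pareto shape 1.5 are illustrative choices, not values from the patent.

```python
import random

def generate_poisson_traffic(rate_pps, duration_s, seed=0):
    """Return a list of (timestamp_s, size_bytes) tuples for a Poisson arrival process."""
    rng = random.Random(seed)
    t = 0.0
    packets = []
    while True:
        # Exponential inter-arrival times with mean 1/rate yield Poisson arrivals.
        t += rng.expovariate(rate_pps)
        if t >= duration_s:
            break
        # Heavy-tailed size: Pareto sample (>= 1) scaled to the 64-byte minimum
        # frame size and capped at the 1518-byte maximum frame size.
        size = min(1518, int(64 * rng.paretovariate(1.5)))
        packets.append((t, size))
    return packets

# Step 1: set the model parameters; step 2: generate the simulated trace.
trace = generate_poisson_traffic(rate_pps=100.0, duration_s=10.0)
```

The sketch also shows the limitation discussed in this section: every property of the output is fixed by the few parameters the user picks by hand.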
2. Generation methods based on traffic characteristics
According to the granularity of the traffic characteristics, these methods can be further divided into two subclasses: packet level and flow level.
Traffic generation based on packet-level characteristics: this method is usually implemented by packet replay. The packet-level characteristics involved include packet inter-arrival time and packet size. The method considers only the basic characteristics of individual packets. Its advantages are simplicity and low computational cost. Its disadvantage is low fidelity: it does not consider the interactions between packets and ignores traffic characteristics between protocols and within a single protocol.
Traffic generation at the flow level: the packet is the basic unit of network traffic, and a flow contains multiple packets. Flows apply to connection-oriented network protocols; one flow covers the whole process of establishing, maintaining, and closing a connection. In the TCP/IP protocol suite, a flow means a TCP flow. Simulated background traffic is generated by replaying TCP flows. The advantage of this method is that the generated traffic data set is smaller than for packet-level traffic. Its disadvantages are that it lacks characteristics related to user behavior, applications, and operating systems, and that it cannot generate traffic for non-TCP protocols.
Traffic generation at the session level: a session is one interaction with physical meaning between a server and a client. This method extracts characteristic information about network connections from real network traffic, including session duration and session inter-arrival time, and creates many unique session packets from that information. Session-level traffic simulation is achieved by replaying these newly created session packets. The advantage of this method is that it can generate simulated traffic with semantics; the disadvantage is that it is relatively complex.
The principle of generation based on traffic characteristics is to build a traffic characteristic model by analyzing the packet, flow, and session characteristics of real network traffic, and to generate simulated traffic from that model. Tools of this class generally consist of a traffic characteristic model and a network traffic generation tool. The traffic characteristic model is the core and contains the various traffic characteristics extracted from real network traffic; the network traffic generation tool generates simulated traffic from the traffic characteristic model. The traffic generation process of such a tool comprises the following steps: analyze the packet-, flow-, and session-level characteristics of real network traffic and build a traffic characteristic model; then generate packets, flows, or sessions according to that model. The technical means used include probabilistic analysis theories such as Bayesian probability theory, packet replay tools, and command scripts for specific protocols.
3. Generation methods based on applications
These methods perform probabilistic analysis for a specific application and generate simulated traffic by replaying commands of that application. The advantage is simplicity; the disadvantage is that only the traffic of the specific application can be generated, which does not match the complexity of real network background traffic. The principle is to analyze the application characteristics in real network traffic and to generate simulated traffic from those characteristics. Compared with packet-level, flow-level, and session-level traffic generators, these methods work at a higher layer and study the traffic characteristics of specific applications. Tools of this class generally consist of an application traffic characteristic analysis module and a network traffic generation tool. The application traffic characteristic analysis module is the core and contains the various application traffic characteristics extracted from real network traffic; the network traffic generation tool generates simulated traffic from the application traffic characteristics. When generating traffic, such a tool first analyzes the application traffic characteristics in the real network and then generates the packets corresponding to the specific application from those characteristics. The technical means used include packet capture tools such as tcpdump, probabilistic analysis theories such as Bayesian probability theory, and command execution scripts.
The above methods have the following problems:
(1) Network traffic models have difficulty reflecting real network traffic characteristics accurately.
Whether a statistical model of network traffic is derived from probability and statistics theory, or a traffic characteristic model is extracted by analyzing real network traffic, the fidelity of the simulated traffic is a serious problem. The reason is that real network traffic is extremely complex and closely tied to factors such as time, place, user, application software, operating system, and emergency events. A single statistical model or characteristic model can hardly describe real network traffic. In particular, when traffic characteristics are extracted manually, a few specific characteristics are chosen from experience, for example packet/flow arrival time, packet/flow size, flow duration, and the transition probabilities between different sessions. Whether the chosen characteristics fully reflect real network traffic, or whether hidden characteristics remain undiscovered in the real traffic, cannot be stated with certainty. Reasoning from the nature of cognition, for complex and diverse network traffic it is very likely that many hidden characteristics remain undiscovered.
(2) Extracting real network characteristics is very cumbersome.
To reflect the influence of factors such as time, place, user, application software, and events in traffic generation, the traffic characteristics related to time, place, user, application, and events must be extracted from real network traffic. These factors vary endlessly and are to some extent unpredictable. As a result, extracting real network traffic characteristics is very cumbersome, and the result is difficult to match accurately to real network traffic.
(3) The above network traffic generation methods are very hard to use.
The generation methods based on statistical models, on traffic characteristics, and on applications all require the user to be deeply involved in traffic generation, and many steps need manual intervention. The traffic modeling process in particular costs the user time and effort, and its effect cannot be guaranteed.
Summary of the invention
In view of the above problems, the present invention proposes a network background traffic generation method based on a conditional generative adversarial network, comprising: a data acquisition step, in which network traffic data and condition information are acquired and vectorized into real traffic; a model generation step, in which an initial generative model and an initial discriminative model are obtained from the real traffic, and the initial generative model and the initial discriminative model are trained by means of the conditional generative adversarial network to obtain a generative model; and a traffic generation step, in which the generative model generates simulated background traffic from a random vector.
In the network background traffic generation method of the present invention, the data acquisition step specifically comprises: an acquisition step, in which the network traffic data and the condition information are acquired through a mirror port of a network access device, and the network traffic data are stored as multiple packets, each packet being m bytes long; a vectorization step, in which n zeros are appended to the end of the data field of each packet, the packet is vectorized into a 1518-dimensional random noise vector, and the 1-dimensional condition information is concatenated after the random noise vector to form the 1519-dimensional real traffic; and a classification step, in which the real traffic is divided into a training set, a validation set, and a test set, where the training set is used to determine the coefficients of the initial generative model and the initial discriminative model, the validation set is used to verify the loss function of the initial generative model, and the test set is used to test the effect of the generative model. The packets are in pcap format or binary format; m and n are integers with 64 ≤ m ≤ 1518 and n = 1518 − m; and the condition information is the time at which the network traffic data were acquired.
In the network background traffic generation method of the present invention, the generative model and the discriminative model are long short-term memory (LSTM) network models; the discriminative model further comprises, after the output layer of its LSTM network model, a packet parsing unit and the SoftMax function of the output layer.
The invention further relates to a network background traffic generation system based on a conditional generative adversarial network, comprising: a data acquisition module for obtaining real traffic, in which network traffic data and condition information are acquired and vectorized into the real traffic; a model generation module for obtaining an initial generative model and an initial discriminative model from the real traffic, and for training the initial generative model and the initial discriminative model by means of the conditional generative adversarial network to obtain a generative model; and a traffic generation module for generating simulated background traffic from a random vector by means of the generative model.
In the network background traffic generation system of the present invention, the data acquisition module specifically comprises: an acquisition module for acquiring the network traffic data and the condition information through a mirror port of a network access device and storing the network traffic data as multiple packets, each packet being m bytes long; a vectorization module for appending n zeros to the end of the data field of each packet, vectorizing the packet into a 1518-dimensional random noise vector, and concatenating the 1-dimensional condition information after the random noise vector to form the 1519-dimensional real traffic; and a classification module for dividing the real traffic into a training set, a validation set, and a test set, where the training set is used to determine the coefficients of the initial generative model and the initial discriminative model, the validation set is used to verify the loss function of the initial generative model, and the test set is used to test the effect of the generative model. The packets are in pcap format or binary format; m and n are integers with 64 ≤ m ≤ 1518 and n = 1518 − m; and the condition information is the time at which the network traffic data were acquired.
In the network background traffic generation system of the present invention, the generative model and the discriminative model are long short-term memory (LSTM) network models; the discriminative model further comprises, after the output layer of its LSTM network model, a packet parsing unit and the SoftMax function of the output layer.
The network background traffic generation method of the invention does not require a network traffic model to be built by hand. Instead, the network traffic model is obtained by CGAN training on a large amount of training data, so fewer steps need manual intervention and real network traffic characteristics can be reflected faithfully. Generative models for different scenarios are obtained by training on real network traffic from those scenarios, which makes it possible to cope with the complexity and diversity of real network traffic.
The innovation of the invention lies mainly in applying CGAN technology to network traffic generation; the CGAN technology itself is not improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the network background traffic generation system based on a conditional generative adversarial network according to the invention.
Fig. 2 is a flow chart of the network background traffic generation method based on a conditional generative adversarial network according to the invention.
Fig. 3 is a schematic diagram of the frame structures of the packets of network background traffic according to the invention.
Fig. 4 is a schematic diagram of the LSTM network structure.
Fig. 5 is a schematic diagram of the horizontal structure of the LSTM network.
Fig. 6 is a schematic diagram of the structure of a hidden neuron of the LSTM network.
Fig. 7 is a schematic diagram of the structure of the discriminative model of the network background traffic generation method according to the invention.
Detailed description of embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the proposed method and system for generating network background traffic based on a conditional generative adversarial network are described in further detail below with reference to the drawings. It should be understood that the specific implementations described here serve only to explain the invention and are not intended to limit it.
Research on traffic generation shows that the defects of the prior art are caused by the complexity of real network traffic. With the rapid development of computer networks, the complexity and diversity of real network traffic keep increasing, mainly in the following respects:
(1) Time factor. Traffic characteristics differ from month to month within a year (for example, holiday months), from day to day within a month (for example, weekends), and from period to period within a day.
(2) Place factor. Traffic characteristics differ between types of organization (for example, government, company, school), between workplace and home, and between fixed sites and vehicles (for example, subway, car).
(3) User factor. Network users are diverse and changeable. Users in different occupational groups generate different network traffic; different users have different habits in using the network, which gives the traffic different usage characteristics and sequences; the same user generates different traffic at different times, in different places, and against different backgrounds; and the behavior of one and the same user is to some degree unpredictable.
(4) Application software factor. Computer software and mobile apps differ in the protocols they use, in type, and in version, and new application software appears every day, which makes network traffic highly complex.
(5) Operating system factor. Different operating systems generate different network background traffic; for example, the time to live (TTL) and the TCP initial sequence number (ISN) depend on the type of operating system.
(6) Emergency event factor. The propagation of emergency events over the network affects network traffic. For example, breaking news, the spread of malicious viruses, and online shopping promotions all affect network traffic.
For these reasons, network traffic modeling is one of the key difficulties in traffic generation.
The conditional generative adversarial network (CGAN, Conditional Generative Adversarial Network) in deep learning is a generator that simulates the probability distribution of data so that this distribution is consistent with, or as close as possible to, the statistical distribution of the observed data. The CGAN model is an extension of the generative adversarial network (GAN, Generative Adversarial Network) model: additional condition information y is added to the inputs of both the generator and the discriminator. y can be any information, such as category information, time information, place information, or user information.
The invention introduces the CGAN model from deep learning. The CGAN model is trained on a massive amount of training data to obtain a CGAN model that reflects the characteristics of real traffic, and the simulated traffic the user needs is generated from that model. The principle of the CGAN is as follows: the generative model produces generated traffic, and the real traffic and the generated traffic are fed into the discriminative model for binary discrimination. If the discriminative model discriminates successfully, the generative model is optimized iteratively by backpropagation and gradient descent; if the discriminative model cannot discriminate accurately, the discriminative model is optimized iteratively by backpropagation and gradient descent. A packet parsing module is introduced into the discriminative model and trains the discriminative model together with the sample data. Through the two-player minimax game between the generative model and the discriminative model, the two reach a Nash equilibrium, which yields the traffic generation model.
Fig. 1 is a schematic diagram of the network background traffic generation system based on a conditional generative adversarial network according to the invention. As shown in Fig. 1, the system mainly comprises a generative model and a discriminative model. The data comprise random vectors, real traffic, simulated background traffic, and so on. The generative model generates simulated background traffic from a random vector; the discriminative model decides whether the simulated background traffic and the real traffic fed into it are real or fake.
Fig. 2 is a flow chart of the network background traffic generation method based on a conditional generative adversarial network according to the invention. As shown in Fig. 2, the CGAN-based network background traffic generation method of the invention comprises the following steps:
Step S1, data acquisition, cleaning, and vectorization: real traffic is captured through the mirror ports of switches and routers and can be saved in file formats such as pcap. The captured real traffic is used for training, validating, and testing the CGAN model;
Step S2, model training: the CGAN model is trained with a massive amount of real network traffic and condition information to obtain the generative model and the discriminative model;
Step S3, simulated background traffic is generated with the trained generative model.
Specifically, step S1 comprises:
Step S11, data acquisition. A collector is connected to the mirror port of a switch or router to acquire data, and the captured data are stored by means such as libpcap or DPDK. At this point the acquired network traffic data are packets, which can be stored in formats such as pcap or binary. The captured packets mainly come in four formats: Ethernet v2, Ethernet 802.3 raw (Novell Ethernet), IEEE 802.3/802.2 LLC (Ethernet 802.3 SAP), and IEEE 802.3/802.2 SNAP. Fig. 3 is a schematic diagram of the frame structures of the packets of network background traffic according to the invention. As shown in Fig. 3, the frame structures of the various packets were designed by different entities for different purposes. They can coexist in one network but are mutually incompatible; to exchange information with a workstation that uses a different encapsulation type, communication must go through a router that supports it.
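As an illustration of how packets stored in pcap format could be read back for later processing, the following sketch parses the classic pcap file layout (a 24-byte global header followed by 16-byte record headers) with the Python standard library only. The function name `read_pcap_records` and the tiny in-memory capture are assumptions made for the example; the patent only states that libpcap or DPDK is used for capture.

```python
import struct

def read_pcap_records(data):
    """Yield (ts_sec, ts_usec, frame_bytes) from the bytes of a classic pcap file."""
    if len(data) < 24:
        raise ValueError("truncated pcap global header")
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:      # file written in little-endian byte order
        endian = "<"
    elif magic == 0xD4C3B2A1:    # file written in big-endian byte order
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24                  # skip the global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        frame = data[offset:offset + incl_len]
        if len(frame) < incl_len:
            break                # truncated final record
        offset += incl_len
        yield ts_sec, ts_usec, frame

# Demo: a hand-built capture holding one 64-byte all-zero frame.
global_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 1546300800, 0, 64, 64) + bytes(64)
records = list(read_pcap_records(global_header + record))
```

The per-record timestamp is what a later step could reduce to the hour of day used as condition information.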
Step S12, data vectorization. Packets captured from a real network are between 64 and 1518 bytes long. For ease of processing, all packets are extended to 1518 bytes: packets shorter than 1518 bytes are zero-padded at the end of the data field. A packet can therefore be represented as a first-order tensor: package = [byte[0], byte[1], ..., byte[1517]]. Condition information comprises four categories: time, place, user group, and emergency event. Weighing how strongly each of these conditions influences network traffic against how easy it is to handle, only the influence of time is considered, represented as a one-element first-order tensor y = [time]. The time takes hourly values from 0 to 23. The input finally fed into the generative model is the concatenation of the 1518-dimensional random noise z and the 1-dimensional condition vector y, i.e., the 1519-dimensional vector input = [byte[0], byte[1], ..., byte[1517], time].
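The padding and concatenation described in step S12 can be sketched as follows. This is a minimal illustration, assuming frames are available as Python `bytes`; the function name `vectorize_packet` and the constants are hypothetical names introduced for the example.

```python
MIN_FRAME_LEN = 64
MAX_FRAME_LEN = 1518

def vectorize_packet(frame, hour):
    """Zero-pad a frame to 1518 bytes and append the hour, giving 1519 values."""
    m = len(frame)
    if not MIN_FRAME_LEN <= m <= MAX_FRAME_LEN:
        raise ValueError("frame length must be between 64 and 1518 bytes")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    n = MAX_FRAME_LEN - m               # number of padding zeros, n = 1518 - m
    padded = bytes(frame) + bytes(n)    # zeros appended after the data field
    return list(padded) + [hour]        # [byte[0], ..., byte[1517], time]

# Demo: a 64-byte frame captured at 14:00.
vector = vectorize_packet(bytes([0xFF] * 64), hour=14)
```

In practice the byte values and the hour would probably be scaled to a common range before training; the source does not say whether or how this is done.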
Step S13, sample data classification. According to its purpose, the captured real traffic is divided into three sets: a training set, a validation set, and a test set. The training set is the sample set used for learning; the undetermined coefficients of the CGAN network are determined from it. The validation set is the sample set used to adjust the classifier parameters: during training, the generative model is validated on the validation set immediately, to check whether the loss function value of the generative model is decreasing and whether the accuracy is improving. The validation set is one of the means of preventing overfitting; it guards against the case where the loss function, having fallen to a certain level, starts to climb again. The test set is used to test the capability of the generative model after its training is complete. In one embodiment of the invention, the proportions of the training set, validation set, and test set are 60%, 20%, and 20%.
The CGAN model of the invention comprises a generative model and a discriminative model.
I. Generative model
The generative model uses a conventional long short-term memory network (Long Short-Term Memory, LSTM). Conventional LSTM refers to the standard LSTM structure proposed by Sepp Hochreiter and Jürgen Schmidhuber. Researchers have proposed other LSTM variants for different application scenarios; the conventional LSTM is sufficient for the present invention. Fig. 4 is a schematic diagram of the LSTM network structure. As shown in Fig. 4, the LSTM network is arranged as a matrix with two dimensions: depth and time. The time dimension reflects the influence between elements of the input sequence, and the depth dimension reflects the influence between different levels of data representation. Compared with convolutional neural networks (Convolutional Neural Networks, CNN), the LSTM adds the time dimension, which corresponds to the time-series relationships in network traffic.
Fig. 5 is a schematic diagram of the horizontal structure of the LSTM network. As shown in Fig. 5, the input layer and the output layer of the LSTM network have no horizontal connections; only the hidden layers do. In the hidden layers, the horizontal structure reflects the relationships between elements of the input sequence along the time dimension.
The vertical structure of the LSTM network reflects the iterative optimization relationship between LSTM network layers at different depths. Along the depth dimension, the LSTM network can be divided into an input layer, hidden layers, and an output layer. The input of the input layer is $X_0$, which is output directly as $C_0$ and $h_0$; $X_0$ is 1518-dimensional random noise following a random Gaussian distribution. The hidden-layer neurons adjacent to the input layer take $C_0$ and $h_0$ as input, both corresponding to the output $X_0$ of the input layer; the hidden-layer neurons at each depth then optimize $h_t$ repeatedly. The neurons share weights, whether viewed vertically or horizontally. The output layer directly outputs $h_0, h_1, \ldots, h_n$, which correspond to the input traffic sequence in the discriminative model.
Fig. 6 is a schematic diagram of the structure of a hidden neuron of the LSTM network. As shown in Fig. 6, each neuron has three inputs: the sample data vector $x_t$, the long-term memory vector $C_{t-1}$, and the short-term memory vector $h_{t-1}$. $x_t$ and $h_{t-1}$ are concatenated and fed into the neuron to form the neuron's memory gate. Each neuron has two outputs: the long-term memory vector $C_t$ and the short-term memory vector $h_t$. Each gray box in Fig. 5 represents a feedforward neural network; from left to right they are labeled the 1st, 2nd, 3rd, and 4th feedforward neural networks. Based on repeated trials and tuning during training on real data, the 1st, 2nd, and 4th feedforward neural networks use sigmoid as their activation function, and the 3rd uses tanh. num_units denotes the number of hidden neurons in each small yellow box.
The hidden-neuron network of the LSTM network is described below:
1. Long-term information forgetting mechanism. The first gate, the neuron memory gate $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$, is a vector. Each of its elements lies in the interval [0, 1] and controls how much of the corresponding element of the long-term memory vector $C_{t-1}$ is forgotten;
2. New information input gate. The second gate, the neuron information input gate $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, is a vector. Each of its elements is a value in the interval [0, 1] indicating the proportion of the corresponding element of the newly input vector $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$ that is added to the long-term memory;
3. A new vector $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$ is generated, representing the long-term memory information;
4. Information output gate. Besides outputting the long-term memory $C_t$, the neuron's main task is to convert the long-term memory into the short-term memory $h_t = o_t * \tanh(C_t)$ in a certain proportion and to pass it to the next network layer and to the next input sequence element of the same layer, where $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$.
The neurons of the same hidden layer of the LSTM network share weights, i.e., $W_f$, $b_f$, $W_i$, $b_i$, $W_C$, $b_C$, $W_o$, and $b_o$ are identical within one hidden layer. $W_f$ and $b_f$ are the weight and bias of the first gate, the neuron memory gate; $W_i$ and $b_i$ are the weight and bias of the information input gate; $W_C$ and $b_C$ are the weight and bias for generating new information; $W_o$ and $b_o$ are the weight and bias of the information output gate.
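The gate computations can be illustrated with one time step of a standard LSTM cell in plain Python. This is a sketch of the textbook cell, assuming the usual update $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$; the names `lstm_step`, `affine`, and the `params` dictionary are hypothetical, and the all-zero weights are chosen only so the result is easy to check by hand.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, v):
    """Compute W . v + b for a weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a standard LSTM cell; returns (h_t, C_t)."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], concat)]
    i_t = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], concat)]
    c_new = [math.tanh(a) for a in affine(params["W_C"], params["b_C"], concat)]
    o_t = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], concat)]
    # Long-term memory: scaled old memory plus gated new information.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_new)]
    # Short-term memory: gated, squashed long-term memory.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Demo with 2 hidden units and 3 inputs; zero weights make every gate equal 0.5.
zeros_W = [[0.0] * 5 for _ in range(2)]
zeros_b = [0.0, 0.0]
params = {name: zeros_W for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: zeros_b for name in ("b_f", "b_i", "b_C", "b_o")})
h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], params)
```

With zero weights the candidate vector is zero, so the cell simply halves the previous long-term memory, which makes the role of $f_t$ visible in isolation.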
II. Discriminative model
Fig. 7 is a schematic diagram of the structure of the discriminative model of the network background traffic generation method based on a conditional generative adversarial network according to the invention. As shown in Fig. 7, the discriminative model adds a packet parsing unit and the SoftMax function of the output layer after the output layer of the LSTM network model. The generative model and the discriminative model are independent of each other, but both are implemented on the LSTM network model. The discriminative model includes a packet parsing function: if a packet fails to parse, it is identified as forged data.
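The source does not specify which checks the packet parsing unit performs. As one hypothetical example, the sketch below accepts a frame only if it has a consistent Ethernet II header carrying IPv4; the function name and the chosen checks are assumptions for illustration, and a real parser would cover the other frame formats and protocols as well.

```python
import struct

ETHERTYPE_IPV4 = 0x0800

def parses_as_ipv4_over_ethernet(frame):
    """Return True if frame (bytes) has a consistent Ethernet II + IPv4 header."""
    if len(frame) < 14 + 20:                 # Ethernet header + minimal IPv4 header
        return False
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != ETHERTYPE_IPV4:
        return False
    version = frame[14] >> 4
    header_len = (frame[14] & 0x0F) * 4      # IHL field counts 32-bit words
    total_len = struct.unpack("!H", frame[16:18])[0]
    if version != 4 or header_len < 20:
        return False
    # The IP datagram must fit in the frame; trailing bytes may be padding.
    return header_len <= total_len <= len(frame) - 14

# Demo: a well-formed 54-byte frame versus an all-zero frame of the same size.
ip_header = bytes([0x45, 0x00]) + struct.pack("!H", 40) + bytes(16)
good_frame = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + ip_header + bytes(20)
bad_frame = bytes(54)
is_good = parses_as_ipv4_over_ethernet(good_frame)
is_bad = parses_as_ipv4_over_ethernet(bad_frame)
```

A generated frame that fails such a check could be labeled as forged without consulting the LSTM output, which is the role the description assigns to the parsing unit.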
III. Optimization process of the CGAN model
The CGAN model has no loss function as such; its optimization process is a two-player minimax game, expressed by formula (1).
In formula (1), $V(D, G)$ is the evaluation function of the generative model and the discriminative model. The operator $\min_G \max_D$ expresses the opposing abilities of the discriminative model and the generative model: the discriminating power of the discriminative model should be as strong as possible, while the probability that data produced by the generative model are accurately identified by the discriminative model should be as low as possible. $x$ is the real input data and $z$ is random noise; $x \sim P_{data}$ and $z \sim P_z(z)$ mean that $x$ and $z$ each follow their own distribution. $D(x)$ is the probability that the real data $x$ are discriminated accurately; $G(z)$ is the data obtained by feeding the noise $z$ into the generative model; $D(G(z))$ is the probability that the generated data $G(z)$ are discriminated accurately.
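Formula (1) itself is not reproduced in this text. Assuming it is the standard GAN value function, in which $D(\cdot)$ is read as the probability that its input is real, it would take the following form, built only from the symbols defined above:

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim P_{data}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim P_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
\tag{1}
```

Under this reading, the discriminative model maximizes both terms, and the generative model minimizes the second term by making $D(G(z))$ large.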
During training, with the maximum of the evaluation function as the target, the discriminative model and the generative model are trained alternately, and their goals differ. The training goal of the discriminative model is to increase $D(x)$, i.e., to increase its ability to discriminate correctly. The training goal of the generative model is to increase $1 - D(G(z))$, i.e., to forge data that the discriminative model takes to be real. Here $D(G(z))$ is the probability that the generated data $G(z)$ are discriminated accurately.
The CGAN model implicitly defines a probability distribution $P_g$, and the aim is for $P_g$ to converge to the true data distribution $P_{data}$. In this minimax game, the optimal solution exists when $P_g = P_{data}$, i.e. when the Nash equilibrium is reached. At that point the generation model has recovered the distribution of the training data and the discrimination model can no longer tell real from generated data: its accuracy is 50%, the same as random guessing.
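As a small worked example, the evaluation function can be estimated from discriminator outputs on a batch. The sketch below assumes the standard GAN form $V(D,G) = \mathbb{E}[\log D(x)] + \mathbb{E}[\log(1 - D(G(z)))]$, with $D$ returning the probability that its input is real; the function name and the sample probabilities are illustrative only.

```python
import math


def value_function(d_real, d_fake):
    """Batch estimate of V(D, G) = mean(log D(x)) + mean(log(1 - D(G(z))))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# At the Nash equilibrium the discriminator outputs 0.5 for every sample,
# so V = log(0.5) + log(0.5) = -log(4).
at_equilibrium = value_function([0.5] * 4, [0.5] * 4)

# A discriminator that still separates real from generated data scores higher.
confident_discriminator = value_function([0.9] * 4, [0.1] * 4)
```

The discrimination model tries to raise this value and the generation model tries to lower it, which is why the value at equilibrium sits below the value reached by a discriminator that still wins.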
The CGAN model adds conditional information $y$ to the GAN model. The evaluation function becomes:

$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim P_{data}}[\log D(x \mid y)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z \mid y)))]$$

The condition can be a class label or other multi-modal information. Time information is used here: a day is divided into 24 hours, and each hour corresponds to one value of the time information.
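The hourly condition can be sketched as follows. This is an illustration under assumptions: the hour of capture (0 to 23) is used directly as the one-dimensional condition value and is appended to an already vectorized sample; the helper names are hypothetical, and the text does not say whether the value is scaled or encoded differently.

```python
from datetime import datetime


def hour_condition(captured_at: datetime) -> float:
    """Map a capture timestamp to one of 24 condition values (assumed: the raw hour)."""
    return float(captured_at.hour)


def with_condition(vector, captured_at: datetime):
    """Append the one-dimensional time condition to a vectorized sample."""
    return list(vector) + [hour_condition(captured_at)]


# A 1518-dimensional packet vector becomes a 1519-dimensional conditioned sample.
sample = with_condition([0.0] * 1518, datetime(2019, 1, 7, 13, 30))
```

The same condition value would be supplied to both the generation model and the discrimination model, so that generated traffic can be requested for a given hour of the day.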
Specifically, the iterative optimization algorithm for CGAN model training comprises:
(1) Gradient update algorithm of the GAN model
As described above, the evaluation function is formula (1), which is optimized with mini-batch gradient descent as follows. Any gradient-based update rule may be used; the momentum gradient update is used here. The algorithm is as follows:
Algorithm input: hyperparameters k and m. A hyperparameter is a parameter that is set manually rather than obtained by training. Parameter k is the number of discrimination model updates performed for each update of the generation model; parameter m is the number of samples in a mini-batch.
Algorithm procedure:
for each training iteration do
    for k steps do
        Draw m noise samples $\{z^{(1)}, z^{(2)}, ..., z^{(m)}\}$ from the given noise prior distribution $p_g(z)$ (a random Gaussian distribution is used)
        Draw m real data samples $\{x^{(1)}, x^{(2)}, ..., x^{(m)}\}$
        Update the discrimination model by stochastic gradient ascent:
        $$\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[ \log D(x^{(i)}) + \log(1 - D(G(z^{(i)}))) \right]$$
    end for
    Draw m noise samples $\{z^{(1)}, z^{(2)}, ..., z^{(m)}\}$ from the given noise prior distribution $p_g(z)$
    Update the generation model by stochastic gradient descent:
    $$\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log(1 - D(G(z^{(i)})))$$
end for
Here $\theta_d$ and $\theta_g$ denote the parameters of the discrimination model and the generation model, respectively.
Algorithm output: the weight parameters of the generation model and the discrimination model.
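The alternating schedule above can be sketched in plain Python. This is a structural sketch only: `update_discriminator` and `update_generator` are hypothetical callbacks standing in for the gradient ascent and descent steps, noise samples are drawn as scalars rather than vectors for brevity, and it assumes k counts discriminator updates per generator update, as in the loop above.

```python
import random


def train_gan(n_iterations, k, m, real_data,
              update_discriminator, update_generator, seed=0):
    """Run the alternating schedule: k discriminator steps, then one generator step."""
    rng = random.Random(seed)

    def sample_noise():
        # Gaussian noise prior; one scalar per sample keeps the sketch short.
        return [rng.gauss(0.0, 1.0) for _ in range(m)]

    for _ in range(n_iterations):
        for _ in range(k):
            noise = sample_noise()
            real = rng.sample(real_data, m)      # mini-batch of real samples
            update_discriminator(real, noise)    # gradient ascent on V(D, G)
        update_generator(sample_noise())         # gradient descent on V(D, G)


# Usage with counting callbacks in place of real model updates.
calls = {"d": 0, "g": 0, "batch_sizes": set()}


def count_discriminator_step(real, noise):
    calls["d"] += 1
    calls["batch_sizes"].add((len(real), len(noise)))


def count_generator_step(noise):
    calls["g"] += 1


train_gan(n_iterations=5, k=3, m=4, real_data=list(range(100)),
          update_discriminator=count_discriminator_step,
          update_generator=count_generator_step)
```

Passing the update steps in as callbacks keeps the schedule independent of the model implementation, so the same loop would work with LSTM-based models and any gradient update rule, including the momentum update mentioned above.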
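(2) LSTM forward-propagation algorithm
Both the generation model and the discrimination model are built from the LSTM model; the discrimination model adds a SoftMax activation function after the LSTM model. The forward-propagation algorithm of the LSTM network is described below.
1) Update the forget gate output:
$$f^{(t)} = \sigma(W_f h^{(t-1)} + U_f x^{(t)} + b_f)$$
2) Update the two parts of the input gate output:
$$i^{(t)} = \sigma(W_i h^{(t-1)} + U_i x^{(t)} + b_i)$$
$$\tilde{C}^{(t)} = \tanh(W_C h^{(t-1)} + U_C x^{(t)} + b_C)$$
3) Update the cell state:
$$C^{(t)} = C^{(t-1)} \odot f^{(t)} + i^{(t)} \odot \tilde{C}^{(t)}$$
4) Update the output gate output:
$$o^{(t)} = \sigma(W_o h^{(t-1)} + U_o x^{(t)} + b_o)$$
$$h^{(t)} = o^{(t)} \odot \tanh(C^{(t)})$$
5) Update the prediction output at the current sequence index, which the output layer computes from $h^{(t)}$.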
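Steps 1) to 4) can be sketched as a single time step in plain Python. This assumes the standard LSTM formulation for the candidate and cell-state updates, whose formulas are not legible in this text, and uses $h^{(t)} = o^{(t)} \odot \tanh(C^{(t)})$; the parameter dictionary layout and helper names are illustrative.

```python
import math


def _sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def _tanh(values):
    return [math.tanh(v) for v in values]


def _affine(W, h_prev, U, x, b):
    """Compute W @ h_prev + U @ x + b for list-of-lists matrices."""
    return [
        sum(w * h for w, h in zip(W_row, h_prev))
        + sum(u * xi for u, xi in zip(U_row, x))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, p):
    """One forward step: returns the new hidden state h and cell state C."""
    f = _sigmoid(_affine(p["Wf"], h_prev, p["Uf"], x, p["bf"]))    # forget gate
    i = _sigmoid(_affine(p["Wi"], h_prev, p["Ui"], x, p["bi"]))    # input gate
    c_tilde = _tanh(_affine(p["WC"], h_prev, p["UC"], x, p["bC"]))  # candidate information
    c = [cp * ft + it * ct for cp, ft, it, ct in zip(c_prev, f, i, c_tilde)]
    o = _sigmoid(_affine(p["Wo"], h_prev, p["Uo"], x, p["bo"]))    # output gate
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def zero_params(hidden, inputs):
    """All-zero parameters, useful only for checking the step by hand."""
    params = {}
    for gate in ("f", "i", "C", "o"):
        params["W" + gate] = [[0.0] * hidden for _ in range(hidden)]
        params["U" + gate] = [[0.0] * inputs for _ in range(hidden)]
        params["b" + gate] = [0.0] * hidden
    return params


h, c = lstm_step([1.0, -1.0, 0.5], [0.0, 0.0], [2.0, -2.0], zero_params(2, 3))
```

With all-zero parameters every gate outputs 0.5 and the candidate is 0, so the cell state is halved and the hidden state is half of its tanh, which makes the step easy to verify by hand.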
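(3) LSTM back-propagation algorithm
The LSTM back-propagation algorithm iteratively updates all parameters by gradient descent: $(W_f, U_f, b_f)$, $(W_i, U_i, b_i)$, $(W_C, U_C, b_C)$, $(W_o, U_o, b_o)$. The key point is to compute the partial derivative of the loss function with respect to each parameter.
In a recurrent neural network (RNN), the error is back-propagated by passing the gradient $\delta^{(t)}$ of the hidden state $h^{(t)}$ backwards step by step. LSTM is similar, except that there are two hidden states, $h^{(t)}$ and $C^{(t)}$. Two gradients are therefore defined, where $L$ is the loss function:
$$\delta_h^{(t)} = \frac{\partial L}{\partial h^{(t)}}, \qquad \delta_C^{(t)} = \frac{\partial L}{\partial C^{(t)}}$$
Only $\delta_C^{(t)}$ is propagated backwards; the variable $\delta_h^{(t)}$ merely assists the calculation within a layer and does not take part in the back-propagation.
At the final sequence index $\tau$, $\delta_h^{(\tau)}$ is given by the output-layer error and $\delta_C^{(\tau)}$ is:
$$\delta_C^{(\tau)} = \delta_h^{(\tau)} \odot o^{(\tau)} \odot (1 - \tanh^2(C^{(\tau)}))$$
Then $\delta_C^{(t)}$ is derived backwards from $\delta_C^{(t+1)}$. The gradient $\delta_h^{(t)}$ is determined by the output gradient error of the current layer, while the backward gradient error $\delta_C^{(t)}$ consists of two parts, the gradient error $\delta_C^{(t+1)}$ of the previous layer and the gradient error passed back from $h^{(t)}$ in the current layer:
$$\delta_C^{(t)} = \delta_C^{(t+1)} \odot f^{(t+1)} + \delta_h^{(t)} \odot o^{(t)} \odot (1 - \tanh^2(C^{(t)}))$$
With $\delta_h^{(t)}$ and $\delta_C^{(t)}$ available, the gradients of the parameters $(W_f, U_f, b_f)$, $(W_i, U_i, b_i)$, $(W_C, U_C, b_C)$, $(W_o, U_o, b_o)$ can be computed. Taking the gradient of $W_f$ as an example:
$$\frac{\partial L}{\partial W_f} = \sum_{t=1}^{\tau} \left[ \delta_C^{(t)} \odot C^{(t-1)} \odot f^{(t)} \odot (1 - f^{(t)}) \right] (h^{(t-1)})^T$$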
Once the parameters of the CGAN model have been trained, the generation model is obtained. To generate simulated traffic, random vectors are input to the generation model, which outputs simulated background traffic.
The network background traffic generation method of the present invention is based on the trained generation model and takes a random noise vector and conditional information as input. It automatically generates simulated background traffic similar to real network traffic, without manual involvement in processes such as network traffic modeling, network traffic feature extraction and network traffic generation, which greatly saves labor cost and significantly improves the efficiency of network traffic generation.
Claims (10)
1. A network background traffic generation method based on a conditional generative adversarial network, characterized by comprising:
a data acquisition step of acquiring network traffic data and conditional information and vectorizing them into real traffic;
a model generation step of obtaining an initial generation model and an initial discrimination model with the real traffic, and training the initial generation model and the initial discrimination model through the conditional generative adversarial network to obtain a generation model;
a traffic generation step of generating simulated background traffic from a random vector by means of the generation model.
2. The network background traffic generation method according to claim 1, characterized in that the data acquisition step specifically comprises:
an obtaining step of obtaining the network traffic data and the conditional information through a mirror port of a network access device, and storing the network traffic data as a plurality of data packets, the length of each data packet being m bytes;
a vectorization step of padding n zeros at the end of the data field of each data packet, vectorizing it into a 1518-dimensional random noise vector, and concatenating the 1-dimensional conditional information after the random noise vector to form the 1519-dimensional real traffic;
a classification step of dividing the real traffic into a training set, a validation set and a test set, wherein the training set is used to determine the coefficients of the initial generation model and the initial discrimination model, the validation set is used to validate the loss function of the initial generation model, and the test set is used to test the effect of the generation model;
wherein the data packets are in pcap format or binary format, m and n are integers, 64≤m≤1518, and n=1518-m.
3. The network background traffic generation method according to claim 2, characterized in that the conditional information is the time information at which the network traffic data is obtained.
4. The network background traffic generation method according to claim 1, characterized in that the generation model and the discrimination model are long short-term memory network models.
5. The network background traffic generation method according to claim 4, characterized in that a packet parsing unit and the SoftMax function of the output layer are further included after the output layer of the long short-term memory network model of the discrimination model.
6. A network background traffic generation system based on a conditional generative adversarial network, characterized by comprising:
a data acquisition module for obtaining real traffic, wherein network traffic data and conditional information are acquired and vectorized into the real traffic;
a model generation module for obtaining an initial generation model and an initial discrimination model with the real traffic, and training the initial generation model and the initial discrimination model through the conditional generative adversarial network to obtain a generation model;
a traffic generation module for generating simulated background traffic from a random vector by means of the generation model.
7. The network background traffic generation system according to claim 6, characterized in that the data acquisition module specifically comprises:
an obtaining module for obtaining the network traffic data and the conditional information through a mirror port of a network access device, and storing the network traffic data as a plurality of data packets, the length of each data packet being m bytes;
a vectorization module for padding n zeros at the end of the data field of each data packet, vectorizing it into a 1518-dimensional random noise vector, and concatenating the 1-dimensional conditional information after the random noise vector to form the 1519-dimensional real traffic;
a classification module for dividing the real traffic into a training set, a validation set and a test set, wherein the training set is used to determine the coefficients of the initial generation model and the initial discrimination model, the validation set is used to validate the loss function of the initial generation model, and the test set is used to test the effect of the generation model;
wherein the data packets are in pcap format or binary format, m and n are integers, 64≤m≤1518, and n=1518-m.
8. The network background traffic generation system according to claim 7, characterized in that the conditional information is the time information at which the network traffic data is obtained.
9. The network background traffic generation system according to claim 6, characterized in that the generation model and the discrimination model are long short-term memory network models.
10. The network background traffic generation system according to claim 9, characterized in that a packet parsing unit and the SoftMax function of the output layer are further included after the output layer of the long short-term memory network model of the discrimination model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910012933.5A CN109889452B (en) | 2019-01-07 | 2019-01-07 | Network background flow generation method and system based on condition generation type countermeasure network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910012933.5A CN109889452B (en) | 2019-01-07 | 2019-01-07 | Network background flow generation method and system based on condition generation type countermeasure network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109889452A true CN109889452A (en) | 2019-06-14 |
CN109889452B CN109889452B (en) | 2021-06-11 |
Family
ID=66925676
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910012933.5A Active CN109889452B (en) | 2019-01-07 | 2019-01-07 | Network background flow generation method and system based on condition generation type countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109889452B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110602078A (en) * | 2019-09-04 | 2019-12-20 | 南京邮电大学 | Application encryption traffic generation method and system based on generation countermeasure network |
CN111049762A (en) * | 2019-12-23 | 2020-04-21 | 上海金仕达软件科技有限公司 | Data acquisition method and device, storage medium and switch |
CN111159250A (en) * | 2019-12-19 | 2020-05-15 | 电子科技大学 | Mobile terminal user behavior detection method based on nested deep twin neural network |
CN111651765A (en) * | 2020-05-27 | 2020-09-11 | 上海交通大学 | Program execution path generation method based on generative countermeasure network |
CN111881620A (en) * | 2020-07-15 | 2020-11-03 | 哈尔滨工业大学(威海) | User software behavior simulation system based on reinforcement learning algorithm and GAN model and working method thereof |
CN113507429A (en) * | 2021-04-16 | 2021-10-15 | 华东师范大学 | Generation method of intrusion flow based on generation type countermeasure network |
CN113542271A (en) * | 2021-07-14 | 2021-10-22 | 西安电子科技大学 | Network background flow generation method based on generation of confrontation network GAN |
CN114326655A (en) * | 2021-11-30 | 2022-04-12 | 深圳先进技术研究院 | Industrial robot fault data generation method, system, terminal and storage medium |
CN115277086A (en) * | 2022-06-16 | 2022-11-01 | 西安电子科技大学 | Network background flow generation method based on generation countermeasure network |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951919A (en) * | 2017-03-02 | 2017-07-14 | 浙江工业大学 | A kind of flow monitoring implementation method based on confrontation generation network |
CN108564129A (en) * | 2018-04-24 | 2018-09-21 | 电子科技大学 | A kind of track data sorting technique based on generation confrontation network |
CN109086658A (en) * | 2018-06-08 | 2018-12-25 | 中国科学院计算技术研究所 | A kind of sensing data generation method and system based on generation confrontation network |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110602078A (en) * | 2019-09-04 | 2019-12-20 | 南京邮电大学 | Application encryption traffic generation method and system based on generation countermeasure network |
CN111159250A (en) * | 2019-12-19 | 2020-05-15 | 电子科技大学 | Mobile terminal user behavior detection method based on nested deep twin neural network |
CN111159250B (en) * | 2019-12-19 | 2023-02-21 | 电子科技大学 | Mobile terminal user behavior detection method based on nested deep twin neural network |
CN111049762A (en) * | 2019-12-23 | 2020-04-21 | 上海金仕达软件科技有限公司 | Data acquisition method and device, storage medium and switch |
CN111651765A (en) * | 2020-05-27 | 2020-09-11 | 上海交通大学 | Program execution path generation method based on generative countermeasure network |
CN111651765B (en) * | 2020-05-27 | 2023-05-02 | 上海交通大学 | Program execution path generation method based on generation type countermeasure network |
CN111881620B (en) * | 2020-07-15 | 2022-12-30 | 哈尔滨工业大学(威海) | User software behavior simulation system based on reinforcement learning algorithm and GAN model and working method thereof |
CN111881620A (en) * | 2020-07-15 | 2020-11-03 | 哈尔滨工业大学(威海) | User software behavior simulation system based on reinforcement learning algorithm and GAN model and working method thereof |
CN113507429A (en) * | 2021-04-16 | 2021-10-15 | 华东师范大学 | Generation method of intrusion flow based on generation type countermeasure network |
CN113507429B (en) * | 2021-04-16 | 2022-04-05 | 华东师范大学 | Generation method of intrusion flow based on generation type countermeasure network |
CN113542271A (en) * | 2021-07-14 | 2021-10-22 | 西安电子科技大学 | Network background flow generation method based on generation of confrontation network GAN |
CN113542271B (en) * | 2021-07-14 | 2022-07-26 | 西安电子科技大学 | Network background flow generation method based on generation of confrontation network GAN |
CN114326655A (en) * | 2021-11-30 | 2022-04-12 | 深圳先进技术研究院 | Industrial robot fault data generation method, system, terminal and storage medium |
CN115277086A (en) * | 2022-06-16 | 2022-11-01 | 西安电子科技大学 | Network background flow generation method based on generation countermeasure network |
CN115277086B (en) * | 2022-06-16 | 2023-10-20 | 西安电子科技大学 | Network background flow generation method based on generation of countermeasure network |
Also Published As
Publication number | Publication date |
---|---|
CN109889452B (en) | 2021-06-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109889452A (en) | Network context flow generation method and system based on condition production confrontation network | |
WO2022033332A1 (en) | Dialogue generation method and apparatus, network training method and apparatus, storage medium, and device | |
CN110598598A (en) | Double-current convolution neural network human behavior identification method based on finite sample set | |
CN111461204B (en) | Emotion recognition method based on electroencephalogram signals for game evaluation | |
CN110807469B (en) | Knowledge tracking method and system integrating long-time memory and short-time memory with Bayesian network | |
CN108062561A (en) | A kind of short time data stream Forecasting Methodology based on long memory network model in short-term | |
CN110147711A (en) | Video scene recognition methods, device, storage medium and electronic device | |
CN110164476A (en) | A kind of speech-emotion recognition method of the BLSTM based on multi output Fusion Features | |
US20220176248A1 (en) | Information processing method and apparatus, computer readable storage medium, and electronic device | |
CN110378699A (en) | A kind of anti-fraud method, apparatus and system of transaction | |
CN106911669A (en) | A kind of DDOS detection methods based on deep learning | |
CN109800785B (en) | Data classification method and device based on self-expression correlation | |
CN109992780A (en) | One kind being based on deep neural network specific objective sensibility classification method | |
CN109325638A (en) | A kind of SDN method for predicting based on RBF neural | |
CN110363081A (en) | Face identification method, device, equipment and computer readable storage medium | |
US20230367934A1 (en) | Method and apparatus for constructing vehicle dynamics model and method and apparatus for predicting vehicle state information | |
Gharib et al. | Acoustic scene classification: A competition review | |
Liu et al. | An asynchronous federated learning arbitration model for low-rate ddos attack detection | |
CN114120637A (en) | Intelligent high-speed traffic flow prediction method based on continuous monitor | |
CN114218457B (en) | False news detection method based on forwarding social media user characterization | |
CN109670623A (en) | Neural net prediction method and device | |
Zheng et al. | Enabling robust DRL-driven networking systems via teacher-student learning | |
CN113726545B (en) | Network traffic generation method and device for generating countermeasure network based on knowledge enhancement | |
CN116307022A (en) | Public opinion hotspot information prediction method and system | |
CN111078872B (en) | Police event simulation data generation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |