CN110336768A - Situation prediction method based on a joint hidden Markov model and genetic algorithm - Google Patents
- Publication number
- CN110336768A CN201910060212.1A CN201910060212A
- Authority
- CN
- China
- Prior art keywords
- chromosome
- matrix
- algorithm
- probability
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Physiology (AREA)
- Genetics & Genomics (AREA)
- Complex Calculations (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A situation prediction method based on a joint hidden Markov model and genetic algorithm. The method handles redundant alarms and false alarms with a fuzzy clustering method optimized by the artificial fish swarm algorithm. The artificial fish swarm algorithm overcomes the sensitivity of fuzzy c-means clustering to the initial cluster centers and thereby improves the precision of alarm clustering. The clustered alarms serve as input. To address the problem that improperly set initial parameters of the hidden Markov model easily lead to a locally optimal training result, the initial values of the hidden Markov model are optimized with a genetic algorithm, and the optimized parameters are then further trained with the Baum-Welch algorithm, finally yielding the hidden Markov model parameters under maximum likelihood estimation. The security situation is predicted with the Viterbi algorithm in combination with the observations. The method can improve the accuracy of network security situation prediction.
Description
Technical field
The invention belongs to the field of information security technology, and in particular relates to a situation prediction method based on a joint hidden Markov model and genetic algorithm.
Background art
With the development of Internet technology, the Internet carries more and more services. Electric power, water conservancy, communications, banking, transportation, education, the military, and other sectors all depend on it. The services carried on the Internet and the information stored on it embody real physical value, and the appearance of Bitcoin has further blurred the boundary between the virtual network world and the real world. The information in the network world is huge in volume and highly complex. Free, convenient, and fast access lets people around the world use the Internet without limits of time or place, so network security draws ever more attention. In recent years, attack tools and techniques have grown increasingly sophisticated and varied, and traditional security precautions alone can no longer meet the needs of security-sensitive departments. Traditional protective measures for network security are scattered and single-purpose, and cannot evaluate the key network factors comprehensively from a macroscopic perspective. Research on network security situation awareness arose against this background.
Network security situation awareness acquires, understands, and assesses the key element data in a network, and finally predicts the security situation of the whole network from the assessment results. The specific network security situation awareness framework is shown in Fig. 2. Situation prediction continuously monitors the network state and, when an abnormal state is found, uses a known prediction model to predict the next state of the network. Existing situation prediction methods based on the hidden Markov model are trained with the EM algorithm on real network observations and, when the network becomes abnormal, use the trained model to predict the network situation value. They have the following disadvantages:
Existing clustering methods are sensitive to the initial cluster centers when applied to intrusion detection alarm processing, so the analysis of the alarms is not accurate enough. This affects the final model training, and an accurate model cannot be obtained.
Owing to an inherent shortcoming of the hidden Markov model, training with the EM algorithm depends on the initial values. A poor criterion for choosing them gives poor initial values, and training then ends in a local optimum.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide a situation prediction method based on a joint hidden Markov model and genetic algorithm. In the alarm initialization process it uses a fuzzy clustering method optimized by the fish swarm algorithm, which effectively overcomes the tendency of alarm cluster analysis to fall into local extrema and improves the precision of the alarm clustering result. At the same time it uses a swarm intelligence algorithm to optimize the hidden Markov situation prediction model, so that model training avoids local optima and the network security situation prediction result is more accurate.
To achieve the above object, the technical solution adopted by the present invention is:
A situation prediction method based on a joint hidden Markov model and genetic algorithm, characterized by comprising the following steps:
Step 1: preprocess the collected intrusion detection alarms with an intrusion detection alarm clustering method based on fuzzy c-means clustering optimized by the artificial fish swarm algorithm, so as to simplify and accurately classify the alarms, and take the processing result as the external observations of the network;
The preprocessing of the collected intrusion detection alarms with the clustering method based on artificial-fish-swarm-optimized fuzzy c-means clustering comprises:
1) initialize the intrusion detection system alarms: remove unnecessary attributes and preliminarily aggregate the multi-source heterogeneous data;
2) assign weights to the alarm attributes with the consistent matrix method;
3) build the fuzzy similarity matrix of the alarms from custom alarm attribute similarity functions and the weights;
4) build the fuzzy equivalence matrix with the transitive closure method, and create one artificial fish individual for each alarm;
5) construct the food concentration function and map the high-dimensional samples to three-dimensional space;
6) perform FCM clustering based on the artificial fish swarm algorithm, comprising:
1) define the error function of the artificial fish swarm algorithm:
where $r_{ij}'$ is the Euclidean distance between sample i and sample j after mapping from the high-dimensional samples to three-dimensional space. Assuming the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$, then $r_{ij}' = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;
$r_{ij}^*$ is the value at the corresponding position of the fuzzy equivalence matrix built in step 4);
2) define the food concentration function of an individual:
3) distribute the samples to be clustered randomly in three-dimensional space, i.e. randomly assign a three-dimensional coordinate to each sample mapped from high-dimensional to three-dimensional space;
4) compute the food concentration of each artificial fish;
5) based on the food concentration of the current swarm, perform the optimization behaviors of swarming, foraging, and following;
6) if all artificial fish in the swarm have finished moving, continue; otherwise go to step 4);
7) if the difference between the maximum individual food concentration after the update and the maximum food concentration before the update is less than a specified value, or the number of updates reaches the specified maximum, terminate; otherwise go to step 4);
8) cluster the three-dimensional coordinates with the FCM algorithm and map the final result back to the original high-dimensional samples;
Step 2: determine the number N of hidden states of the network according to the network risk levels; according to expert experience, divide the initial probability of each hidden state into intervals, and likewise divide into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;
Step 3: according to the initial probability interval matrix, transition probability interval matrix, and output probability interval matrix divided in step 2, draw random numbers within the intervals and normalize them to generate P hidden Markov initial probability matrices π, transition probability matrices A, and output probability matrices B;
The P randomly generated initial probability matrices π, transition probability matrices A, and output probability matrices B satisfy the following normalization conditions:
Step 4: encode the P generated probability matrices with floating-point encoding; each chromosome produced by the floating-point encoding corresponds to the three parameter matrices of the hidden Markov model and contains three parts: the hidden state initial probability matrix corresponds to the initial chromosome Geπ, the hidden state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB;
Step 5: compute the fitness values of all P chromosomes; to prevent the randomness of the genetic algorithm from destroying the individual with the best fitness in the current population, apply the best-preservation (elitist) strategy, i.e. copy the individual with the largest fitness value directly into the next population;
Step 6: for the remaining P-1 chromosomes, compute the weighted sum of each chromosome's support for population dispersion and its fitness value, and apply roulette-wheel selection so that the population size again reaches P;
The computation of an individual's support for population dispersion involves the following definitions:
Definition 1: the population size is S; a chromosome contains Q = m*n + n*n + N genes, and chromosome k is denoted $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, k = 1, 2, ..., S;
Definition 2: chromosome fitness function f: since the optimal chromosome sought by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as its fitness function, i.e.
$f = P(O \mid \lambda)$;
Definition 3: the individual phenotype $\eta_k$ is the ratio of the fitness value of chromosome k to the sum of the fitness values of the population, $\eta_k = f_k / \sum_{j=1}^{S} f_j$;
Definition 4: the population dispersion degree d:
Definition 5: the support of the k-th chromosome for population dispersion:
Step 7: determine the crossover probability according to the support, and complete the genetic crossover between individuals by arithmetic crossover, as follows:
1): randomly select a chromosome k and evaluate the formula:
where $Spt_{max}$ is the maximum support, $Spt_{min}$ is the minimum support, and $Spt_k$ is the support of the randomly selected chromosome;
2): generate a random number r; if $r < Spt_r$, chromosome k becomes a chromosome to be crossed. Repeat these two steps until two chromosomes to be crossed have been generated;
3): perform genetic crossover on the two chromosomes to be crossed. The crossover principle is: Geπ1 crosses with Geπ2, GeA1 with GeA2, and GeB1 with GeB2;
Step 8: determine the mutation probability according to the support, and complete the genetic mutation of individuals by non-uniform mutation. The mutation is defined as follows:
$G_k$ is the randomly selected k-th chromosome before mutation, $G_k'$ is the chromosome after mutation, and $G_{max}$ and $G_{min}$ are the individuals with the largest and smallest current fitness, respectively. t is a mutation constant in (0, 1], and r is a random number. When the random integer rand() is even, the mutation $G_k' = G_k + t\,(G_{max} - G_k)\,r$ is used; when it is odd, the mutation $G_k' = G_k + t\,(G_k - G_{min})\,r$ is used. The mutation probability is determined from the support in the same way as in step 7;
Step 9: normalize the individuals of the new population produced by genetic replication, genetic crossover, and genetic mutation, so that they satisfy the parameter constraints of the hidden Markov model;
Step 10: check whether the preset iteration termination condition is met. If so, terminate, select the chromosome with the largest fitness value as the global optimum, and map it to the three initial matrices of the hidden Markov model. Otherwise return to step 5 and start a new round of evolution;
Step 11: iteratively train the model parameters λ = (π, A, B) obtained in step 10 with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters, comprising the following steps:
1) obtain D alarm sequence data samples $\{O_1, O_2, \ldots, O_D\}$ with the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy c-means clustering, where any alarm sequence is $O_d = \{o_1^{(d)}, o_2^{(d)}, o_3^{(d)}, \ldots, o_T^{(d)}\}$;
2) obtain an optimal initial value λ = (π, A, B) by the genetic algorithm optimization;
3) for each sample d = 1, 2, ..., D, compute $\gamma_t^{(d)}(i)$ and $\xi_t^{(d)}(i, j)$, t = 1, 2, ..., T, with the forward-backward algorithm;
4) update the model parameter matrices;
5) check whether each matrix meets the convergence condition. If so, the algorithm ends; otherwise return to 3) and iterate;
Step 12: when the network state is abnormal, predict the network security situation with the Viterbi algorithm from the collected external observations and the trained hidden Markov model.
The invention has the following advantages:
1. The collected alarm data are classified by combining the artificial fish swarm algorithm with fuzzy clustering. This effectively overcomes the shortcoming that, when processing redundant alarms, traditional clustering methods are sensitive to the initial cluster centers and therefore give inaccurate clustering results, and thus improves situation prediction precision.
2. Situation prediction is carried out with a genetic algorithm combined with a hidden Markov model. The optimized initial values produced by the genetic algorithm are input to the Baum-Welch algorithm, which is trained iteratively with the detected and processed network alarm data as observations to obtain the parameter values. The method effectively overcomes the shortcoming of traditional hidden Markov situation prediction that improperly chosen initial values lead to a locally optimal training result.
Brief description of the drawings
Fig. 1 is the working principle diagram of the invention.
Fig. 2 is the network security situation awareness framework diagram.
Fig. 3 is the flow chart of the fish swarm algorithm and fuzzy clustering alarm processing steps of the invention.
Fig. 4 is the diagram of the genetic algorithm optimization process of the invention.
Detailed description of the embodiments
The present invention is further described below with reference to the embodiments and drawings, but is not limited to the following embodiments.
The invention proposes a situation prediction method based on a joint hidden Markov model and genetic algorithm. In the hidden Markov situation prediction methods of existing network security situation awareness, the rule for generating initial parameters is theoretically flawed and easily leads to a locally optimal training result. The invention therefore optimizes the initial parameters with swarm intelligence, so that the Baum-Welch algorithm starts training from parameter values of higher fitness. In the initialization of the training data, false alarms and redundant alarms are removed by combining the artificial fish swarm algorithm with c-means clustering. Used together, the two methods can substantially improve the accuracy of the situation prediction result and give network security administrators a more accurate picture of the real network security situation.
Fig. 1 is the working principle diagram of the invention. Specifically, the invention takes the processed intrusion detection system alarm data as input. After data initialization, the data are processed with the improved clustering method and used as situation observations. After the initial parameters of the existing hidden Markov prediction model are optimized, the model is trained with the Baum-Welch algorithm on the situation observations, finally giving the maximum likelihood model parameter values for the observation sequence. The observation sequence and the Viterbi algorithm are then used to predict the network situation value. The method specifically includes the following steps:
1) preprocess the collected intrusion detection alarms with the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy clustering, so as to simplify and accurately classify the alarms, and take the processing result as the external observations of the network;
2) determine the number N of hidden states of the network according to the network risk levels; according to expert experience, divide the initial probability of each hidden state into intervals, and likewise divide into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;
3) according to the initial probability interval matrix, transition probability interval matrix, and output probability interval matrix divided in step 2), draw random numbers within the intervals and normalize them to generate P hidden Markov model initial probability matrices π, transition probability matrices A, and output probability matrices B;
4) encode the P generated probability matrices with floating-point encoding;
5) compute the fitness values of all P chromosomes; to prevent the randomness of the genetic algorithm from destroying the individual with the best fitness in the current population, apply the best-preservation (elitist) strategy, i.e. copy the individual with the largest fitness value directly into the next population;
6) for the remaining P-1 chromosomes, compute the weighted sum of each chromosome's support for population dispersion and its fitness value, and apply roulette-wheel selection so that the population size again reaches P;
7) determine the crossover probability according to the support, and complete the genetic crossover between individuals by arithmetic crossover;
8) determine the mutation probability according to the support, and complete the genetic mutation of individuals by non-uniform mutation;
9) normalize the individuals of the new population produced by genetic replication, genetic crossover, and genetic mutation, so that they satisfy the parameter constraints of the hidden Markov model;
10) check whether the preset iteration termination condition is met. If so, terminate, select the chromosome with the largest fitness value as the global optimum, and map it to the three initial matrices of the hidden Markov model. Otherwise return to step 5) and start a new round of evolution;
11) iteratively train the model parameters λ = (π, A, B) obtained in step 10) with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters;
12) when the network state is abnormal, predict the network security situation with the Viterbi algorithm from the collected external situation observations and the trained hidden Markov model.
Fig. 3 shows the fish swarm algorithm and fuzzy clustering alarm processing steps. Specifically, the collected intrusion detection alarms are preprocessed with the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy clustering, as follows:
(1): initialize the intrusion detection system alarms: remove unnecessary attributes and preliminarily aggregate the multi-source heterogeneous data, comprising the following steps:
1) input an alarm $x_i$; if i = 1, record its alarm type type(1) and set the type counter t = 1;
2) when i ≥ 2, compare the type type($x_i$) of the current alarm with each identified type from type(1) to type(t), i.e.
3) when i = n, classify each of the t classes of alarm data according to a predefined time span;
(2) assign weights to the alarm attributes with the consistent matrix method, comprising the following steps:
1) according to expert experience, score the pairwise importance ratios of the m attributes of the intrusion detection alarms to obtain the judgment matrix
where $x_{ij}$ is the ratio of the importance of the i-th attribute to that of the j-th attribute;
2)
The weights of the factors are $B = (\beta_1, \beta_2, \ldots, \beta_i, \ldots, \beta_n)$;
(3) build the fuzzy similarity matrix of the alarms from custom alarm attribute similarity functions and the weights; the attribute similarity functions are the following:
1) time similarity function:
2) port similarity function:
3) source/destination IP address similarity function
(η is the number of identical leading digits, counted from left to right, of the two source/destination IP addresses);
4) protocol similarity function;
The similarity $x_{ij}$ of the i-th alarm and the j-th alarm is computed as:
(where m is the number of attributes, and the remaining term is the similarity value of the corresponding attribute of the i-th and j-th alarms);
(4) build the fuzzy equivalence matrix with the transitive closure method, and create one artificial fish individual for each alarm;
(5) construct the food concentration function and map the high-dimensional samples to three-dimensional space;
(6) perform FCM clustering based on the artificial fish swarm algorithm, comprising:
1) define the error function of the artificial fish swarm algorithm:
where $r_{ij}'$ is the Euclidean distance between sample i and sample j after mapping from the high-dimensional samples to three-dimensional space. Assuming the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$, then $r_{ij}' = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;
$r_{ij}^*$ is the value at the corresponding position of the fuzzy equivalence matrix built in step (4);
2) define the food concentration function of an individual:
3) distribute the samples to be clustered randomly in three-dimensional space, i.e. randomly assign a three-dimensional coordinate to each sample mapped from high-dimensional to three-dimensional space;
4) compute the food concentration of each artificial fish;
5) based on the food concentration of the current swarm, perform the optimization behaviors of swarming, foraging, and following;
6) if all artificial fish in the swarm have finished moving, continue; otherwise go to 4);
7) if the difference between the maximum individual food concentration after the update and the maximum food concentration before the update is less than a specified value, or the number of updates reaches the specified maximum, terminate; otherwise go to 4);
8) cluster the three-dimensional coordinates with the FCM algorithm and map the final result back to the original high-dimensional samples.
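Sub-step (4) above builds the fuzzy equivalence matrix as the transitive closure of the fuzzy similarity matrix. The following is a minimal sketch of that idea, assuming the usual max-min composition (the text does not state which composition rule is used). The function names and the 3×3 similarity matrix are illustrative only and do not come from the patent.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(similarity, max_rounds=64):
    """Square the matrix repeatedly until R composed with R equals R."""
    current = [row[:] for row in similarity]
    for _ in range(max_rounds):
        squared = max_min_compose(current, current)
        if squared == current:  # fixed point: the relation is now transitive
            break
        current = squared
    return current


# Hypothetical alarm-to-alarm similarity values (reflexive and symmetric).
similarity = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
equivalent = transitive_closure(similarity)
```

In this example the weak direct link between alarms 0 and 2 (0.3) is raised to 0.5, because both are linked to alarm 1 with at least that strength.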
Fig. 4 shows the genetic algorithm optimization process. Specifically, P hidden Markov initial probability matrices π, transition probability matrices A, and output probability matrices B are generated at random. The generated probability matrices satisfy the following normalization conditions:
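The random generation and normalization step can be sketched as follows. This is an illustration only: it assumes the standard hidden Markov constraints (π sums to 1, and every row of A and of B sums to 1), and the interval values, the population size of 5, and all function names are made up for the example.

```python
import random


def sample_row(intervals, rng):
    """Draw one value per (low, high) interval, then scale the row to sum to 1."""
    raw = [rng.uniform(low, high) for low, high in intervals]
    total = sum(raw)
    return [value / total for value in raw]


def sample_hmm(pi_intervals, a_intervals, b_intervals, rng):
    """Generate one candidate parameter set (pi, A, B) from expert intervals."""
    pi = sample_row(pi_intervals, rng)
    a = [sample_row(row, rng) for row in a_intervals]
    b = [sample_row(row, rng) for row in b_intervals]
    return pi, a, b


# Hypothetical expert intervals for N = 3 hidden states and 2 observable states.
pi_intervals = [(0.5, 0.7), (0.2, 0.4), (0.05, 0.15)]
a_intervals = [
    [(0.6, 0.8), (0.1, 0.3), (0.05, 0.1)],
    [(0.2, 0.4), (0.4, 0.6), (0.1, 0.3)],
    [(0.1, 0.2), (0.2, 0.4), (0.5, 0.7)],
]
b_intervals = [
    [(0.7, 0.9), (0.1, 0.3)],
    [(0.4, 0.6), (0.4, 0.6)],
    [(0.1, 0.3), (0.7, 0.9)],
]

rng = random.Random(0)
P = 5  # population size, chosen arbitrarily here
population = [sample_hmm(pi_intervals, a_intervals, b_intervals, rng)
              for _ in range(P)]
```

Note that scaling a row can move a value slightly outside its original interval; whether the patent corrects for this is not stated.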
Each chromosome produced by the floating-point encoding corresponds to the three parameter matrices of the hidden Markov model and contains three parts: the hidden state initial probability matrix corresponds to the initial chromosome Geπ, the hidden state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB, as in Fig. 1:
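A minimal sketch of such a floating-point encoding is given below. The gene order (Geπ, then GeA, then GeB, each flattened row by row) is an assumption made for the example, as are the function names and the sample matrices.

```python
def encode(pi, a, b):
    """Flatten (pi, A, B) into one list of floating-point genes."""
    ge_pi = list(pi)
    ge_a = [value for row in a for value in row]
    ge_b = [value for row in b for value in row]
    return ge_pi + ge_a + ge_b


def decode(chromosome, n_states, n_symbols):
    """Rebuild (pi, A, B) from a chromosome produced by encode()."""
    pi_end = n_states
    a_end = pi_end + n_states * n_states
    pi = chromosome[:pi_end]
    a = [chromosome[pi_end + i * n_states: pi_end + (i + 1) * n_states]
         for i in range(n_states)]
    b = [chromosome[a_end + i * n_symbols: a_end + (i + 1) * n_symbols]
         for i in range(n_states)]
    return pi, a, b


# Hypothetical model with 2 hidden states and 3 observable states.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
chromosome = encode(pi, a, b)
```

Keeping the three parts contiguous makes it easy to cross Geπ only with Geπ, GeA only with GeA, and GeB only with GeB.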
The specific computation of an individual's support for population dispersion involves the following definitions:
Definition 1: the population size is S; a chromosome contains Q = m*n + n*n + N genes, and chromosome k is denoted $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, k = 1, 2, ..., S;
Definition 2: chromosome fitness function f: since the optimal chromosome sought by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as its fitness function, i.e.
$f = P(O \mid \lambda)$
Definition 3: the individual phenotype $\eta_k$ is the ratio of the fitness value of chromosome k to the sum of the fitness values of the population, $\eta_k = f_k / \sum_{j=1}^{S} f_j$
Definition 4: the population dispersion degree d;
Definition 5: the support of the k-th chromosome for population dispersion
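Definitions 2 and 3 can be illustrated with the forward algorithm. The sketch below computes f = P(O | λ) for one observation sequence and then the phenotype of each chromosome. The model values are invented, observations are assumed to be integer symbol indices, and no scaling is applied, so very long sequences would underflow in practice.

```python
def forward_probability(pi, a, b, observations):
    """Forward algorithm: return P(O | lambda) for lambda = (pi, A, B)."""
    n = len(pi)
    alpha = [pi[i] * b[i][observations[0]] for i in range(n)]
    for symbol in observations[1:]:
        alpha = [sum(alpha[i] * a[i][j] for i in range(n)) * b[j][symbol]
                 for j in range(n)]
    return sum(alpha)


def phenotypes(fitness_values):
    """eta_k: each fitness value divided by the population's fitness sum."""
    total = sum(fitness_values)
    return [value / total for value in fitness_values]


# Hypothetical model: 2 hidden states, 2 observable states.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
fitness = forward_probability(pi, a, b, [0, 1])

# Phenotypes for a made-up population of three fitness values.
eta = phenotypes([fitness, 0.1, 0.05])
```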
Applying roulette-wheel selection so that the population size again reaches P comprises the following steps:
(1): compute $t_i = u f_i + v\,Spt_i$, where u and v are the weights of the fitness value and the support value, respectively;
(2): compute $T_n = \sum_i (u f_i + v\,Spt_i)$;
(3): compute $W_i = t_i / T_n$;
(4): compute the cumulative probability $g_i = \sum_{j=1}^{i} W_j$
(5): generate a random number r uniformly distributed between 0 and 1 and compare it with $g_i$; if $g_{i-1} < r < g_i$, select individual i to enter the next-generation population. Repeat (4) and (5) until the size of the new population equals the size of the parent population;
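The selection steps above can be sketched as follows. The support values are taken as given inputs, because the formula for the support is not reproduced in this text. The weights, the sample values, and the function name are assumptions for illustration.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def roulette_select(fitness, support, u, v, count, rng):
    """Select `count` indices with probability proportional to u*f_i + v*Spt_i."""
    scores = [u * f + v * s for f, s in zip(fitness, support)]  # t_i
    total = sum(scores)                                         # T_n
    cumulative = list(accumulate(score / total for score in scores))  # g_i
    cumulative[-1] = 1.0  # guard against floating-point rounding
    chosen = []
    for _ in range(count):
        r = rng.random()
        # First index i with g_i >= r, i.e. g_(i-1) < r <= g_i.
        chosen.append(bisect_left(cumulative, r))
    return chosen


# Hypothetical population of four individuals.
fitness = [0.20, 0.05, 0.10, 0.15]
support = [0.30, 0.90, 0.60, 0.40]
rng = random.Random(1)
selected = roulette_select(fitness, support, u=0.7, v=0.3, count=3, rng=rng)
```

Combined with the best-preservation strategy, the elite individual would be copied first and `count` set to P-1, so that the new population again has P members.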
The crossover probability is determined according to the support, and the genetic crossover between individuals is completed by arithmetic crossover, specifically as follows:
(1): randomly select a chromosome k and evaluate the formula:
where $Spt_{max}$ is the maximum support, $Spt_{min}$ is the minimum support, and $Spt_k$ is the support of the randomly selected chromosome;
(2): generate a random number r; if $r < Spt_r$, chromosome k becomes a chromosome to be crossed. Repeat these two steps until two chromosomes to be crossed have been generated
(3): perform genetic crossover on the two chromosomes to be crossed. The crossover principle is: Geπ1 crosses with Geπ2, GeA1 with GeA2, and GeB1 with GeB2. The specific crossover operation is illustrated by the following example:
1) parents: Geπ1 = {π11, π12, ..., π1n}, Geπ2 = {π21, π22, ..., π2n}
2) randomly select a gene position k
3) offspring: Geπ1 = {π11, π12, ..., a*π1k + (1-a)π2k, a*π1(k+1) + (1-a)π2(k+1), ..., a*π1n + (1-a)π2n}, where a is a random number between 0 and 1; the transition matrix and the output matrix are crossed in the same way and are not described again;
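The arithmetic crossover in the example can be sketched as below. The first child follows the offspring formula given above. The second child, built symmetrically from the other parent, is an assumption, since the text only shows one offspring. Names and sample values are illustrative.

```python
import random


def arithmetic_crossover(parent1, parent2, rng):
    """Blend the genes from a random position k onward with a random weight a."""
    k = rng.randrange(len(parent1))  # crossover position
    a = rng.random()                 # blending weight in [0, 1)
    tail1 = [a * x + (1 - a) * y for x, y in zip(parent1[k:], parent2[k:])]
    tail2 = [a * y + (1 - a) * x for x, y in zip(parent1[k:], parent2[k:])]
    return parent1[:k] + tail1, parent2[:k] + tail2


# Hypothetical Ge-pi parts of two parent chromosomes.
ge_pi_1 = [0.6, 0.3, 0.1]
ge_pi_2 = [0.2, 0.5, 0.3]
rng = random.Random(3)
child1, child2 = arithmetic_crossover(ge_pi_1, ge_pi_2, rng)
```

A child built this way does not necessarily sum to 1, which is consistent with the later normalization step applied to the new population.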
The mutation probability is determined according to the support, and the genetic mutation of individuals is completed by non-uniform mutation, specifically as follows:
$G_k$ is the randomly selected k-th chromosome before mutation, and $G_k'$ is the chromosome after mutation. $G_{max}$ and $G_{min}$ are the individuals with the largest and smallest current fitness, respectively. t is a mutation constant in (0, 1], and r is a random number. When the random integer rand() is even, the mutation $G_k' = G_k + t\,(G_{max} - G_k)\,r$ is used; when it is odd, the mutation $G_k' = G_k + t\,(G_k - G_{min})\,r$ is used.
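The two mutation formulas can be sketched gene by gene as follows. The sketch applies the formulas exactly as written and assumes that r is drawn uniformly from [0, 1), which the text does not specify. The function names and values are illustrative.

```python
import random


def mutate(g_k, g_max, g_min, t, r, parity_draw):
    """Apply the even or odd mutation formula to every gene of g_k."""
    if parity_draw % 2 == 0:
        # Even draw: G_k' = G_k + t * (G_max - G_k) * r
        return [g + t * (best - g) * r for g, best in zip(g_k, g_max)]
    # Odd draw: G_k' = G_k + t * (G_k - G_min) * r
    return [g + t * (g - worst) * r for g, worst in zip(g_k, g_min)]


def random_mutate(g_k, g_max, g_min, t, rng):
    """Draw r and the parity integer at random, then mutate."""
    return mutate(g_k, g_max, g_min, t, rng.random(), rng.randrange(1 << 16))


# Hypothetical chromosomes (two genes each).
g_k = [0.2, 0.8]
g_max = [0.6, 0.4]   # individual with the largest current fitness
g_min = [0.1, 0.9]   # individual with the smallest current fitness
even_result = mutate(g_k, g_max, g_min, t=0.5, r=0.5, parity_draw=0)
odd_result = mutate(g_k, g_max, g_min, t=0.5, r=0.5, parity_draw=1)
mutated = random_mutate(g_k, g_max, g_min, 0.5, random.Random(4))
```

The even branch moves the chromosome toward the fittest individual, while the odd branch moves it away from the least fit one. The odd branch can push a gene outside [0, 1], so a repair or normalization step afterwards is needed.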
The obtained λ = (π, A, B) is trained iteratively with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the HMM parameters. This specifically includes the following steps:
(1): Obtain D alert sequence data samples $\{O_1, O_2, \ldots, O_D\}$ using the intrusion detection alert clustering method based on fuzzy C-means clustering optimized by the artificial fish swarm algorithm, where any alert sequence is $O_d = \{o_1^{(d)}, o_2^{(d)}, \ldots, o_T^{(d)}\}$, as described in claim 2;
(2): Obtain an optimal initial value λ = (π, A, B) through the genetic algorithm optimization;
(3): For each sample d = 1, 2, ..., D, compute $\gamma_t^{(d)}(i)$ and $\xi_t^{(d)}(i, j)$, t = 1, 2, ..., T, with the forward-backward algorithm, where [formula not reproduced in this text]
Here $\alpha_t(i)$ is the forward probability, $\beta_t(i)$ is the backward probability, $a_{ij}$ is the transition probability, and $b_j(o_{t+1})$ is the output probability;
(4): Update the model parameters according to the following formula: [formula not reproduced in this text]
(5): Check whether each matrix satisfies the convergence condition. If so, the algorithm terminates; otherwise return to (3) and iterate.
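The formulas for steps (3) and (4) are not present in this text. For orientation, the textbook Baum-Welch quantities for D observation sequences are given below, using the symbols defined in step (3). That the patent's own formulas match these term for term is an assumption; $v_k$ (the k-th observation symbol) and N (the number of hidden states) are notation introduced here.

```latex
\gamma_t^{(d)}(i) = \frac{\alpha_t^{(d)}(i)\,\beta_t^{(d)}(i)}
                         {\sum_{j=1}^{N} \alpha_t^{(d)}(j)\,\beta_t^{(d)}(j)}
\qquad
\xi_t^{(d)}(i,j) = \frac{\alpha_t^{(d)}(i)\,a_{ij}\,b_j\!\left(o_{t+1}^{(d)}\right)\beta_{t+1}^{(d)}(j)}
                        {\sum_{p=1}^{N}\sum_{q=1}^{N} \alpha_t^{(d)}(p)\,a_{pq}\,b_q\!\left(o_{t+1}^{(d)}\right)\beta_{t+1}^{(d)}(q)}

\pi_i = \frac{1}{D}\sum_{d=1}^{D} \gamma_1^{(d)}(i)
\qquad
a_{ij} = \frac{\sum_{d=1}^{D}\sum_{t=1}^{T-1} \xi_t^{(d)}(i,j)}
              {\sum_{d=1}^{D}\sum_{t=1}^{T-1} \gamma_t^{(d)}(i)}
\qquad
b_j(k) = \frac{\sum_{d=1}^{D}\sum_{t=1,\; o_t^{(d)} = v_k}^{T} \gamma_t^{(d)}(j)}
              {\sum_{d=1}^{D}\sum_{t=1}^{T} \gamma_t^{(d)}(j)}
```

Each update is a ratio of expected counts, so the re-estimated π, every row of A, and every row of B sum to 1 by construction.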
The trained hidden Markov model parameters are obtained through the above steps. When network operation becomes abnormal, the prediction algorithm proceeds as follows:
(1) Obtain the sequence of network situation observations.
(2) Obtain the trained hidden Markov model parameters.
(3) Compute the most probable hidden state sequence using the Viterbi algorithm.
(4) Determine the network security situation value at the next moment according to the state transition matrix.
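The four prediction steps can be sketched as below. This is an illustrative sketch only: the function names, the toy parameters, and in particular the reading of step (4) as "take the most probable successor of the last decoded state" are assumptions, since the text does not spell out how the transition matrix is used.

```python
def viterbi(obs, pi, A, B):
    """Return the most probable hidden state sequence for the observations."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor for each state j at this time step.
        ptr = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[ptr[j]] * A[ptr[j]][j] * B[j][o] for j in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(back):  # backtrack from the final state
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def predict_next_state(last_state, A):
    """Assumed reading of step (4): most probable successor of the last state."""
    row = A[last_state]
    return max(range(len(row)), key=lambda j: row[j])

# Toy two-state model with three observation symbols (hypothetical values).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path = viterbi([0, 1, 2], pi, A, B)
next_state = predict_next_state(path[-1], A)
```

For long observation sequences the products underflow, so a practical version would work with log probabilities instead.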
Claims (1)
1. A situation prediction method based on a combined hidden Markov model and genetic algorithm, characterized by comprising the following steps:
Step 1: The collected intrusion detection alerts are preprocessed with an intrusion detection alert clustering method based on fuzzy C-means clustering optimized by the artificial fish swarm algorithm, so as to simplify and accurately classify the alerts, and the processing result is used as the external observation values of the network;
The preprocessing of the collected intrusion detection alerts with the intrusion detection alert clustering method based on fuzzy C-means clustering optimized by the artificial fish swarm algorithm comprises:
1) Initialize the intrusion detection system alerts: remove unnecessary attributes and perform a preliminary aggregation of the multi-source heterogeneous data;
2) Assign weights to the alert attributes using the consistent matrix method;
3) Build the fuzzy similarity matrix of the alerts from a custom alert-attribute similarity function and the weights;
4) Build the fuzzy equivalence matrix using the transitive closure method, and create one artificial fish individual for each alert;
5) Construct the food concentration function and map the high-dimensional samples into three-dimensional space;
6) Perform FCM clustering based on the artificial fish swarm algorithm, which comprises:
1) Define the error function of the artificial fish swarm algorithm: [formula not reproduced in this text]
where $r_{ij}$ is the Euclidean distance between sample i and sample j after the high-dimensional samples are mapped into three-dimensional space. Assuming the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$ respectively, $r_{ij}$ is:
$$r_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$$
$r_{ij}^*$ is the value at the corresponding position of the fuzzy equivalence matrix established in step 4);
2) Define the food concentration function of an individual: [formula not reproduced in this text]
3) Randomly distribute the samples to be clustered, mapped from high-dimensional space, in three-dimensional space, i.e., assign random three-dimensional coordinates to each sample;
4) Compute the food concentration of each artificial fish;
5) Perform the optimization behaviors (swarming, foraging, following, and so on) based on the food concentration of the current fish swarm;
6) If all artificial fish in the swarm have finished moving, continue to the next step; otherwise go to step 4);
7) If the difference between the maximum food concentration of the updated artificial fish and the maximum food concentration before the update is smaller than a specified value, or the number of updates reaches the specified maximum, terminate; otherwise go to step 4);
8) Cluster the three-dimensional coordinates using the FCM algorithm, and map the final result back to the original high-dimensional samples;
Step 2: Determine the number N of hidden states of the network according to the network risk levels. Based on expert experience, divide the initial probability of each hidden state into intervals, and likewise divide into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;
Step 3: According to the initial probability interval matrix of the hidden states, the transition probability interval matrix, and the output probability interval matrix divided in Step 2, take random numbers within the intervals and normalize them to generate P hidden Markov initial probability matrices π, transition probability matrices A, and output probability matrices B;
where the P initial probability matrices π, transition probability matrices A, and output probability matrices B are each generated at random, and the normalized probability matrices satisfy the following formula: [formula not reproduced in this text]
Step 4: Encode the P generated probability matrices using floating-point encoding. Each chromosome produced by the floating-point encoding corresponds to the three parameter matrices of the hidden Markov model and contains three parts: the hidden-state initial probability matrix corresponds to the initial chromosome Geπ, the hidden-state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB;
Step 5: Compute the fitness values of all P chromosomes. To prevent the randomness of the genetic algorithm from destroying the individual with the best fitness in the current population, the individual with the largest fitness value is copied directly into the next population, i.e., an elitist preservation strategy is used;
Step 6: For the remaining P-1 chromosomes, compute the weighted sum of each chromosome's support for population dispersion and its fitness value, and apply roulette-wheel selection to bring the population size back to P;
The support of an individual for population dispersion is computed using the following definitions:
Definition 1: The population size is S. A chromosome contains Q = m*n + n*n + N genes, and chromosome k is represented by $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, k = 1, 2, ..., S;
Definition 2: Chromosome fitness function f: since the optimal chromosome sought by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as its fitness function, i.e.
f = P(O | λ);
Definition 3: The individual phenotype $\eta_k$ is the ratio of the fitness value of chromosome k to the sum of the fitness values of the population;
Definition 4: The population dispersion d is defined as: [formula not reproduced in this text]
Definition 5: The support of the k-th chromosome for population dispersion is defined as: [formula not reproduced in this text]
Step 7: Determine the crossover probability from the support value and perform genetic crossover between individuals by arithmetic crossover, with the following steps:
1): Randomly select a chromosome k and apply the calculation formula [formula not reproduced in this text], where $Spt_{max}$ is the maximum support, $Spt_{min}$ is the minimum support, and $Spt_k$ is the support of the randomly selected chromosome;
2): Generate a random number r. If $r < Spt_r$, chromosome k becomes an individual to be crossed. Repeat these two steps until two individuals to be crossed have been generated;
3): Perform genetic crossover on the two individuals to be crossed. The crossover rule is: Geπ1 crosses with Geπ2, GeA1 crosses with GeA2, and GeB1 crosses with GeB2;
Step 8: Determine the mutation probability from the support value and perform the genetic mutation of individuals by non-uniform mutation. The mutation rule is:
$$G_k' = \begin{cases} G_k + t\,(G_{max} - G_k)\,r, & \text{rand() is even} \\ G_k + t\,(G_k - G_{min})\,r, & \text{rand() is odd} \end{cases}$$
where $G_k$ is the k-th chromosome, randomly selected before mutation, $G_k'$ is the chromosome obtained from $G_k$ after mutation, and $G_{max}$ and $G_{min}$ are the individuals with the largest and smallest current fitness, respectively. t is a mutation constant in (0, 1], and r is a random number. When the random integer rand() is even, the rule $G_k' = G_k + t(G_{max} - G_k)r$ is used; when it is odd, the rule $G_k' = G_k + t(G_k - G_{min})r$ is used. The mutation probability is determined from the support in the same way as in Step 7;
Step 9: Normalize the individuals of the new population produced by genetic replication, genetic crossover, and genetic mutation, so that they satisfy the parameter constraints of the hidden Markov model;
Step 10: Check whether the preset iteration termination condition is met. If so, terminate, select the chromosome with the largest fitness value as the global optimum, and map that chromosome to the three initial matrices of the hidden Markov model. Otherwise, return to Step 5 and begin a new round of evolution;
Step 11: Train the model parameters λ = (π, A, B) obtained in Step 10 iteratively with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters, which comprises the following steps:
1) Obtain D alert sequence data samples $\{O_1, O_2, \ldots, O_D\}$ using the intrusion detection alert clustering method based on fuzzy C-means clustering optimized by the artificial fish swarm algorithm, where any alert sequence is $O_d = \{o_1^{(d)}, o_2^{(d)}, o_3^{(d)}, \ldots, o_T^{(d)}\}$;
2) Obtain an optimal initial value λ = (π, A, B) through the genetic algorithm optimization;
3) For each sample d = 1, 2, ..., D, compute $\gamma_t^{(d)}(i)$ and $\xi_t^{(d)}(i, j)$, t = 1, 2, ..., T, with the forward-backward algorithm;
4) Update the model parameter matrices;
5) Check whether each matrix satisfies the convergence condition. If so, the algorithm terminates; otherwise return to 3) and iterate;
Step 12: When the network state is abnormal, network security situation prediction is performed with the Viterbi algorithm, using the collected external observation values and the trained hidden Markov model.
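As an illustration of Step 3 and Step 9, one row of a probability matrix can be drawn from expert-defined intervals and then scaled so that it sums to 1. This is a sketch under stated assumptions: the function name `sample_row`, the uniform draw within each interval, and the example intervals are hypothetical and are not taken from the claim.

```python
import random

def sample_row(intervals, rng=random):
    """Draw one value per (low, high) interval, then scale the row to sum to 1."""
    raw = [rng.uniform(low, high) for low, high in intervals]
    total = sum(raw)
    if total <= 0:
        raise ValueError("intervals must yield a positive row sum")
    return [x / total for x in raw]

# Hypothetical intervals for the initial probabilities of three hidden states.
rng = random.Random(0)
pi_row = sample_row([(0.5, 0.7), (0.2, 0.4), (0.05, 0.15)], rng=rng)
```

Scaling can move a value slightly outside its original interval; whether the method re-checks the intervals after normalization is not stated in the text.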
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910060212.1A CN110336768B (en) | 2019-01-22 | 2019-01-22 | Situation prediction method based on combined hidden Markov model and genetic algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110336768A true CN110336768A (en) | 2019-10-15 |
CN110336768B CN110336768B (en) | 2021-07-20 |
Family
ID=68138888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910060212.1A Active CN110336768B (en) | 2019-01-22 | 2019-01-22 | Situation prediction method based on combined hidden Markov model and genetic algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110336768B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140191955A1 (en) * | 2010-07-13 | 2014-07-10 | Giuseppe Raffa | Efficient gesture processing |
CN106453294A (en) * | 2016-09-30 | 2017-02-22 | 重庆邮电大学 | Security situation prediction method based on niche technology with fuzzy elimination mechanism |
Non-Patent Citations (5)
Title |
---|
WEI LIANG等: "Multiscale Entropy-Based Weighted Hidden Markov Network Security Situation Prediction Model", 《2017 IEEE INTERNATIONAL CONGRESS ON INTERNET OF THINGS(ICIOT)》 * |
席荣荣等: "一种改进的网络安全态势量化评估方法", 《计算机学报》 * |
王国华: "基于遗传算法的网络安全态势感知研究", 《计算机测量与控制》 * |
郭凤鸣等: "模糊聚类分析传递闭包法实用程序", 《电脑学习》 * |
高岭等: "基于马尔可夫链的自适应DRX优化机制", 《东南大学学报(自然科学版)》 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110826617A (en) * | 2019-10-31 | 2020-02-21 | 中国人民公安大学 | Situation element classification method and training method and device of model thereof, and server |
CN111598335A (en) * | 2020-05-15 | 2020-08-28 | 长春理工大学 | Traffic area division method based on improved spectral clustering algorithm |
CN112101673A (en) * | 2020-09-22 | 2020-12-18 | 华北电力大学 | Power grid development trend prediction method and system based on hidden Markov model |
CN112101673B (en) * | 2020-09-22 | 2024-01-16 | 华北电力大学 | Power grid development trend prediction method and system based on hidden Markov model |
CN112260870A (en) * | 2020-10-21 | 2021-01-22 | 重庆邮电大学 | Network security prediction method based on dynamic fuzzy clustering and grey neural network |
CN112260870B (en) * | 2020-10-21 | 2022-04-05 | 重庆邮电大学 | Network security prediction method based on dynamic fuzzy clustering and grey neural network |
CN112784896A (en) * | 2021-01-20 | 2021-05-11 | 齐鲁工业大学 | Time series flow data anomaly detection method based on Markov process |
CN112994944A (en) * | 2021-03-03 | 2021-06-18 | 上海海洋大学 | Network state prediction method |
CN114490619A (en) * | 2022-02-15 | 2022-05-13 | 北京大数据先进技术研究院 | Data filling method, device, equipment and storage medium based on genetic algorithm |
CN114490619B (en) * | 2022-02-15 | 2022-09-09 | 北京大数据先进技术研究院 | Data filling method, device, equipment and storage medium based on genetic algorithm |
CN116055182A (en) * | 2023-01-28 | 2023-05-02 | 北京特立信电子技术股份有限公司 | Network node anomaly identification method based on access request path analysis |
CN116055182B (en) * | 2023-01-28 | 2023-06-06 | 北京特立信电子技术股份有限公司 | Network node anomaly identification method based on access request path analysis |
Also Published As
Publication number | Publication date |
---|---|
CN110336768B (en) | 2021-07-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||