CN110336768A - Situation prediction method based on a joint hidden Markov model and genetic algorithm - Google Patents

Info

Publication number: CN110336768A
Authority: CN (China)
Prior art keywords: chromosome, matrix, algorithm, probability, value
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201910060212.1A
Other languages: Chinese (zh)
Other versions: CN110336768B (en)
Inventors: 高岭, 毛勇, 郑杰, 杨旭东, 冯通, 张晓�
Current Assignee: Northwest University (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Northwest University
Application filed by Northwest University
Priority to CN201910060212.1A
Publication of CN110336768A
Application granted
Publication of CN110336768B
Current legal status: Active

Classifications

    • G06F18/24 Classification techniques
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • H04L41/147 Network analysis or design for predicting network behaviour
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A situation prediction method based on a joint hidden Markov model and genetic algorithm. The method handles redundant alarms and false alarms with fuzzy clustering optimized by the artificial fish swarm algorithm. The artificial fish swarm algorithm overcomes the sensitivity of fuzzy c-means clustering to the initial cluster centers, which improves the precision of alarm clustering. The clustered alarms are then used as input. Because improperly set initial parameters easily drive hidden Markov model training to a local optimum, a genetic algorithm optimizes the initial values of the hidden Markov model, and the Baum-Welch algorithm then further trains the optimized parameters to obtain the hidden Markov model parameters under maximum likelihood estimation. Finally, the Viterbi algorithm is combined with the observations to predict the security situation. The method can improve the accuracy of network security situation prediction.

Description

Situation prediction method based on a joint hidden Markov model and genetic algorithm
Technical field
The invention belongs to the field of information security technology, and in particular relates to a situation prediction method based on a joint hidden Markov model and genetic algorithm.
Background
With the development of Internet technology, the Internet carries more and more services. Electric power, water conservancy, communications, banking, transportation, education, the military, and other sectors all depend on it. The services carried on the Internet and the information stored on it embody real-world value. The emergence of Bitcoin has further blurred the boundary between the virtual network world and the real world. The volume of information in the network world is huge and complex. Free, convenient, and fast access lets people around the world use the Internet regardless of time or place, so network security receives more and more attention. In recent years, attack tools and techniques have become increasingly sophisticated and diverse, and traditional security measures alone can no longer meet the needs of highly security-sensitive departments. Traditional network security measures are scattered and single-purpose, and cannot comprehensively evaluate the various key network factors from a macroscopic perspective. Research on network security situation awareness arose against this background.
Network security situation awareness acquires, understands, and assesses the key element data in the network, and finally predicts the security situation of the whole network from the assessment result; the specific framework is shown in Fig. 2. Situation prediction continuously monitors the network state and, when an abnormal state is found, uses a known prediction model to predict the next state of the network. Existing situation prediction methods based on hidden Markov models are trained with the EM algorithm on real network observations and, when the network becomes abnormal, use the trained model to predict the network situation value. They have the following disadvantages:
When applied to intrusion detection alarm processing, existing clustering methods are sensitive to the initial cluster centers, so the analysis of the alarms is not accurate enough. This affects the final model training, and an accurate model cannot be obtained.
Because of an inherent shortcoming of the hidden Markov model, the criterion for choosing initial values when training with the EM algorithm is flawed, so poor initial values are easily selected and training ends in a local optimum.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide a situation prediction method based on a joint hidden Markov model and genetic algorithm. In the alarm initialization process it optimizes fuzzy clustering with the fish swarm algorithm, which effectively overcomes the tendency of alarm cluster analysis to fall into local extrema and improves the precision of the alarm clustering result. It also optimizes the hidden Markov situation prediction model with a swarm intelligence algorithm so that model training avoids local optima, which makes the network security situation prediction more accurate.
To achieve the above object, the technical solution adopted by the present invention is:
A situation prediction method based on a joint hidden Markov model and genetic algorithm, characterized by comprising the following steps:
Step 1: preprocess the collected intrusion detection alarms with the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy c-means clustering, so as to simplify and accurately classify the alarms, and use the processed result as the external observations of the network;
The preprocessing of the collected intrusion detection alarms with the clustering method based on artificial-fish-swarm-optimized fuzzy c-means clustering comprises:
1) initialize the intrusion detection system alarms: remove unnecessary attributes and preliminarily aggregate the multi-source heterogeneous data;
2) assign weights to the alarm attributes using the consistent matrix method;
3) establish the fuzzy similarity matrix of the alarms using user-defined alarm attribute similarity functions and the weights;
4) establish the fuzzy equivalent matrix using the transitive closure method, and create one artificial fish individual for every alarm;
5) construct the food concentration function and map the high-dimensional samples to three-dimensional space;
6) perform FCM clustering based on the artificial fish swarm algorithm, which comprises:
1) define the error function of the artificial fish swarm algorithm:
where $r'_{ij}$ denotes the Euclidean distance between sample i and sample j after mapping from the high-dimensional samples to three-dimensional space. Assuming the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$, then $r'_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;
$r^*_{ij}$ is the value at the corresponding position in the fuzzy equivalent matrix established in step 4);
2) define the food concentration function of an individual:
3) randomly distribute the samples to be clustered, mapped from high-dimensional to three-dimensional space, in the three-dimensional space, i.e. assign random three-dimensional coordinates to each sample;
4) compute the food concentration of each artificial fish;
5) based on the food concentration of the current swarm, perform the optimization behaviors of swarming, foraging, and following;
6) if all artificial fish in the swarm have finished moving, continue; otherwise go to step 4);
7) if the difference between the maximum individual food concentration of the updated artificial fish and the maximum food concentration before the update is less than a specified value, or the number of updates reaches the specified maximum, terminate; otherwise go to step 4);
8) cluster the three-dimensional coordinates with the FCM algorithm and map the final result back to the original high-dimensional samples;
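Sub-step 4) of the preprocessing builds a fuzzy equivalent matrix by transitive closure. The text does not spell out the construction; a common one, assumed here, is repeated max-min self-composition of the fuzzy similarity matrix until it stops changing. The sketch below illustrates that idea in Python; the function names and the 3-alarm similarity matrix are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(similarity):
    """Square the matrix under max-min composition until it no longer changes."""
    current = [row[:] for row in similarity]
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


# Hypothetical fuzzy similarity matrix for three alarms (reflexive and symmetric).
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
equivalent = transitive_closure(similarity)
```

In this made-up example the weak 0.2 link between alarms 1 and 3 is raised to 0.5, because both are linked to alarm 2 at 0.5 or more.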
Step 2: determine the number N of hidden states of the network according to the network risk levels; based on expert experience, divide the initial probability of each hidden state into intervals, and likewise divide into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;
Step 3: from the initial probability interval matrix of the hidden states, the transition probability interval matrix, and the output probability interval matrix divided in Step 2, draw random numbers within the intervals and normalize them to generate P hidden Markov initial probability matrices π, transition probability matrices A, and output probability matrices B;
Here, the P randomly generated initial probability matrices π, transition probability matrices A, and output probability matrices B are normalized so that each satisfies the usual stochastic constraints: $\sum_{i=1}^{N} \pi_i = 1$, $\sum_{j=1}^{N} a_{ij} = 1$ for every i, and $\sum_{k} b_i(k) = 1$ for every i;
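As an illustration of Step 3, the following Python sketch draws one value from each expert-defined interval and rescales every row to sum to 1. The interval values, the function names, and the choice of simple sum-normalization are assumptions for illustration; note that after rescaling a value may fall slightly outside its original interval.

```python
import random


def sample_row(intervals, rng):
    """Draw one value per (low, high) interval, then scale the row to sum to 1."""
    raw = [rng.uniform(low, high) for low, high in intervals]
    total = sum(raw)
    return [value / total for value in raw]


def sample_model(pi_intervals, a_intervals, b_intervals, rng):
    """Build one candidate (pi, A, B) from the interval matrices."""
    pi = sample_row(pi_intervals, rng)
    a = [sample_row(row, rng) for row in a_intervals]
    b = [sample_row(row, rng) for row in b_intervals]
    return pi, a, b


rng = random.Random(0)
# Hypothetical intervals for N = 2 hidden states and 3 observation symbols.
pi_intervals = [(0.6, 0.9), (0.1, 0.4)]
a_intervals = [[(0.7, 0.9), (0.1, 0.3)], [(0.2, 0.4), (0.6, 0.8)]]
b_intervals = [[(0.5, 0.7), (0.2, 0.4), (0.0, 0.2)],
               [(0.0, 0.2), (0.2, 0.4), (0.5, 0.7)]]
# P = 5 candidate models form the initial population.
population = [sample_model(pi_intervals, a_intervals, b_intervals, rng) for _ in range(5)]
```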
Step 4: encode the P generated sets of probability matrices using floating-point encoding. Each chromosome produced by the floating-point encoding consists of three parts corresponding to the three parameter matrices of the hidden Markov model: the hidden-state initial probability matrix corresponds to the initial chromosome Geπ, the hidden-state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB;
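One way to picture the floating-point encoding of Step 4 is as a flat list of floats in which Geπ, GeA, and GeB are stored back to back. The Python sketch below assumes that layout (π first, then the rows of A, then the rows of B); the layout and the function names are illustrative assumptions, not the patent's stated format.

```python
def encode(pi, a, b):
    """Concatenate pi, the rows of A, and the rows of B into one float chromosome."""
    genes = list(pi)
    for row in a:
        genes.extend(row)
    for row in b:
        genes.extend(row)
    return genes


def decode(genes, n_states, n_symbols):
    """Split a chromosome back into (pi, A, B)."""
    pi = genes[:n_states]
    offset = n_states
    a = [genes[offset + i * n_states: offset + (i + 1) * n_states]
         for i in range(n_states)]
    offset += n_states * n_states
    b = [genes[offset + i * n_symbols: offset + (i + 1) * n_symbols]
         for i in range(n_states)]
    return pi, a, b


# Hypothetical model with 2 hidden states and 3 observation symbols.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
chromosome = encode(pi, a, b)
```

With this layout the chromosome length is the number of states plus states squared plus states times symbols, which matches the gene count of the form Q = m*n + n*n + N given in Definition 1.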
Step 5: compute the fitness values of all P chromosomes. To prevent the randomness of the genetic algorithm from destroying the individual with the best fitness in the current population, an elitist (best-preservation) strategy is applied: the individual with the maximum fitness value is copied directly into the next population;
Step 6: for the remaining P-1 chromosomes, compute the weighted sum of each chromosome's support for the population dispersion and its fitness value, and apply roulette-wheel selection so that the population size reaches P again;
The calculation of an individual's support for the population dispersion involves the following definitions:
Definition 1: the population size is S; one chromosome contains Q = m*n + n*n + N genes; chromosome k is denoted $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, k = 1, 2, ..., S;
Definition 2: chromosome fitness function f: because the optimal chromosome sought by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as its fitness function, i.e.
$f = P(O \mid \lambda)$;
Definition 3: individual phenotype $\eta_k$, i.e. the ratio of the fitness value of chromosome k to the sum of the fitness values of the population, $\eta_k = f_k / \sum_{j=1}^{S} f_j$;
Definition 4: population dispersion d;
Definition 5: the support of the k-th chromosome for the population dispersion;
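Definition 2 takes the forward probability P(O | λ) as the fitness of a chromosome. A minimal Python sketch of the standard forward algorithm follows; it omits scaling, so it is only suitable for short sequences, and the example model and the function name are hypothetical.

```python
def forward_probability(observations, pi, a, b):
    """Return P(O | lambda) for an observation index sequence (no scaling)."""
    n = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * b[i][observations[0]] for i in range(n)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for symbol in observations[1:]:
        alpha = [sum(alpha[i] * a[i][j] for i in range(n)) * b[j][symbol]
                 for j in range(n)]
    # Termination: sum over the final forward variables
    return sum(alpha)


# Hypothetical decoded chromosome: 2 hidden states, 2 observation symbols.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
fitness = forward_probability([0, 1], pi, a, b)
```

With several alarm sequences, one plausible choice (an assumption, since the text does not say) is to combine the per-sequence probabilities, for example by summing their logarithms.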
Step 7: determine the crossover probability according to the support, and complete the genetic crossover between individuals using arithmetic crossover, as follows:
1) randomly select a chromosome k and compute its crossover probability $Spt_r$ from $Spt_{max}$, $Spt_{min}$, and $Spt_k$, where $Spt_{max}$ is the maximum support, $Spt_{min}$ is the minimum support, and $Spt_k$ is the support of the randomly selected chromosome;
2) generate a random number r; if $r < Spt_r$, chromosome k becomes a chromosome to be crossed. Repeat these two steps until two chromosomes to be crossed have been produced;
3) perform genetic crossover on the two chromosomes to be crossed. The crossover principle is: Geπ1 crosses with Geπ2, GeA1 crosses with GeA2, and GeB1 crosses with GeB2;
Step 8: determine the mutation probability according to the support, and complete the mutation of individuals using non-uniform mutation. The mutation is as follows:
where $G_k$ is the k-th chromosome, randomly selected before mutation, $G_k'$ is the chromosome obtained from $G_k$ after mutation, and $G_{max}$ and $G_{min}$ are the individuals with the maximum and minimum current fitness, respectively. t is a mutation constant in (0, 1], and r is a random number. When the random integer rand() is even, the mutation $G_k' = G_k + t(G_{max} - G_k) r$ is used; when it is odd, the mutation $G_k' = G_k + t(G_k - G_{min}) r$ is used. The mutation probability is determined from the support in the same way as in Step 7;
Step 9: normalize the individuals of the new population produced by replication, crossover, and mutation so that they satisfy the parameter constraints of the hidden Markov model;
Step 10: check whether the preset iteration termination condition is met. If so, terminate, select the chromosome with the maximum fitness value as the global optimum, and map that chromosome to the three initial matrices of the hidden Markov model. Otherwise return to Step 5 and begin a new round of evolution;
Step 11: iteratively train the model parameters λ = (π, A, B) obtained in Step 10 with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model (HMM) parameters, comprising the following steps:
1) obtain D alarm sequence data samples $\{O_1, O_2, \ldots, O_D\}$ from the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy c-means clustering, where any alarm sequence $O_d = \{o_1^{(d)}, o_2^{(d)}, o_3^{(d)}, \ldots, o_T^{(d)}\}$;
2) obtain the optimal initial value λ = (π, A, B) from the genetic algorithm optimization;
3) for each sample d = 1, 2, ..., D, compute $\gamma_t^{(d)}(i)$ and $\xi_t^{(d)}(i, j)$, t = 1, 2, ..., T, with the forward-backward algorithm;
4) update the model parameter matrices;
5) check whether each matrix meets the convergence condition. If so, the algorithm terminates; otherwise return to 3) and iterate;
Step 12: when the network state is abnormal, predict the network security situation with the Viterbi algorithm, using the collected external observations and the trained hidden Markov model.
The invention has the following advantages:
1. The collected alarm data are classified by combining the artificial fish swarm algorithm with fuzzy clustering. This effectively solves the problem that, when processing redundant alarms, traditional clustering methods are sensitive to the initial cluster centers and therefore produce inaccurate clusters, and it thus improves situation prediction precision.
2. Situation prediction is performed by combining the genetic algorithm with the hidden Markov model. The optimized initial values produced by the genetic algorithm are input to the Baum-Welch algorithm, which is trained iteratively with the detected and processed network alarm data as observations to obtain the parameter values. This effectively overcomes the shortcoming of traditional hidden Markov situation prediction that improperly chosen initial values lead to locally optimal training results.
Brief description of the drawings
Fig. 1 is a schematic diagram of the working principle of the invention.
Fig. 2 is the network security situation awareness framework diagram.
Fig. 3 is a flow chart of the fish swarm algorithm-fuzzy clustering alarm processing steps of the invention.
Fig. 4 is a diagram of the genetic algorithm optimization process of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the embodiments and the drawings, but the invention is not limited to the following embodiments.
The present invention proposes a situation prediction method based on a joint hidden Markov model and genetic algorithm. It addresses the hidden Markov situation prediction methods used in existing network security situation awareness, whose initial parameter generation rule has a theoretical flaw that easily drives training to a local optimum, and proposes optimizing the initial parameters with swarm intelligence so that the Baum-Welch algorithm starts training from parameter values with higher fitness. During initialization of the training data, the artificial fish swarm algorithm is combined with c-means clustering to remove false alarms and redundant alarms. Used together, the two methods can substantially improve the accuracy of the situation prediction result, so that network security managers can grasp the true network security situation more accurately.
Fig. 1 is a schematic diagram of the working principle of the invention. Specifically, the invention takes processed intrusion detection system alarm data as input. After data initialization, the data are processed with the improved clustering method and used as the situation observations. After the initial parameters of the existing hidden Markov prediction model are optimized, the model is trained with the Baum-Welch algorithm on the situation observations, which yields the maximum likelihood model parameters for the observation sequence. The observation sequence and the Viterbi algorithm are then used to predict the situation value of the network. The method comprises the following steps:
1) preprocess the collected intrusion detection alarms with the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy clustering, so as to simplify and accurately classify the alarms, and use the processed result as the external observations of the network;
2) determine the number N of hidden states of the network according to the network risk levels; based on expert experience, divide the initial probability of each hidden state into intervals, and likewise divide into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;
3) from the initial probability interval matrix of the hidden states, the transition probability interval matrix, and the output probability interval matrix divided in step 2), draw random numbers within the intervals and normalize them to generate P hidden Markov model initial probability matrices π, transition probability matrices A, and output probability matrices B;
4) encode the P generated sets of probability matrices using floating-point encoding;
5) compute the fitness values of all P chromosomes. To prevent the randomness of the genetic algorithm from destroying the individual with the best fitness in the current population, apply an elitist (best-preservation) strategy: copy the individual with the maximum fitness value directly into the next population;
6) for the remaining P-1 chromosomes, compute the weighted sum of each chromosome's support for the population dispersion and its fitness value, and apply roulette-wheel selection so that the population size reaches P again;
7) determine the crossover probability according to the support, and complete the genetic crossover between individuals using arithmetic crossover;
8) determine the mutation probability according to the support, and complete the mutation of individuals using non-uniform mutation;
9) normalize the individuals of the new population produced by replication, crossover, and mutation so that they satisfy the parameter constraints of the hidden Markov model;
10) check whether the preset iteration termination condition is met. If so, terminate, select the chromosome with the maximum fitness value as the global optimum, and map that chromosome to the three initial matrices of the hidden Markov model. Otherwise return to step 5) and begin a new round of evolution;
11) iteratively train the model parameters λ = (π, A, B) obtained in step 10) with the Baum-Welch algorithm to obtain the maximum likelihood estimate of the hidden Markov model parameters;
12) when the network state is abnormal, predict the network security situation with the Viterbi algorithm, using the collected external situation observations and the trained hidden Markov model.
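The final step decodes the hidden security state from the observations with the Viterbi algorithm. A minimal Python sketch of the standard algorithm is shown below; the example model is hypothetical, and how the decoded state path is turned into a situation value is not specified in the text, so the sketch stops at the most probable state path.

```python
def viterbi(observations, pi, a, b):
    """Return the most probable hidden-state path and its joint probability."""
    n = len(pi)
    delta = [pi[i] * b[i][observations[0]] for i in range(n)]
    backpointers = []
    for symbol in observations[1:]:
        step, new_delta = [], []
        for j in range(n):
            # Best predecessor of state j at this time step.
            best_i = max(range(n), key=lambda i: delta[i] * a[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * a[best_i][j] * b[j][symbol])
        backpointers.append(step)
        delta = new_delta
    # Backtrack from the most probable final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical trained model: 2 hidden states, 2 observation symbols.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
path, probability = viterbi([0, 1], pi, a, b)
```

For long alarm sequences the products underflow, so a practical version would work with log probabilities.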
Fig. 3 shows the fish swarm algorithm-fuzzy clustering alarm processing steps. Specifically, the collected intrusion detection alarms are preprocessed with the intrusion detection alarm clustering method based on artificial-fish-swarm-optimized fuzzy clustering, as follows:
(1) initialize the intrusion detection system alarms: remove unnecessary attributes and preliminarily aggregate the multi-source heterogeneous data, comprising the following steps:
1) input an alarm $x_i$; if i = 1, record its alarm type type(1) and set the type counter t = 1;
2) when i ≥ 2, compare the type $type(x_i)$ of the current alarm with each identified type from type(1) to type(t), i.e.
3) when i = n, classify each of the t classes of alarm data according to a predefined time span;
(2) assign weights to the alarm attributes using the consistent matrix method, comprising the following steps:
1) based on expert experience, score the pairwise importance ratios of the m attributes of the intrusion detection alarms to obtain the judgment matrix
where $x_{ij}$ is the ratio of the importance of the i-th attribute to that of the j-th attribute;
2)
The weight of each factor is $B = (\beta_1, \beta_2, \ldots, \beta_i, \ldots, \beta_n)$;
(3) establish the fuzzy similarity matrix of the alarms using user-defined alarm attribute similarity functions and the weights. The attribute similarity functions are:
1) time similarity function:
2) port similarity function:
3) source/destination IP address similarity function:
(η is the number of leading positions, counted from left to right, in which the two source/destination IP addresses are identical);
4) protocol similarity function;
The similarity $x_{ij}$ of the i-th alarm and the j-th alarm is calculated as:
(where m is the number of attributes, and the per-attribute term is the similarity value of the corresponding attribute of the i-th and j-th alarms);
(4) establish the fuzzy equivalent matrix using the transitive closure method, and create one artificial fish individual for every alarm;
(5) construct the food concentration function and map the high-dimensional samples to three-dimensional space;
(6) perform FCM clustering based on the artificial fish swarm algorithm, which comprises:
1) define the error function of the artificial fish swarm algorithm:
where $r'_{ij}$ denotes the Euclidean distance between sample i and sample j after mapping from the high-dimensional samples to three-dimensional space. Assuming the coordinates of i and j are $(a_i, b_i, c_i)$ and $(a_j, b_j, c_j)$, then $r'_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2 + (c_i - c_j)^2}$;
$r^*_{ij}$ is the value at the corresponding position in the fuzzy equivalent matrix established in step (4);
2) define the food concentration function of an individual:
3) randomly distribute the samples to be clustered, mapped from high-dimensional to three-dimensional space, in the three-dimensional space, i.e. assign random three-dimensional coordinates to each sample;
4) compute the food concentration of each artificial fish;
5) based on the food concentration of the current swarm, perform the optimization behaviors of swarming, foraging, and following;
6) if all artificial fish in the swarm have finished moving, continue; otherwise go to 4);
7) if the difference between the maximum individual food concentration of the updated artificial fish and the maximum food concentration before the update is less than a specified value, or the number of updates reaches the specified maximum, terminate; otherwise go to 4);
8) cluster the three-dimensional coordinates with the FCM algorithm and map the final result back to the original high-dimensional samples.
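The last sub-step clusters the three-dimensional coordinates with FCM. As a rough illustration, the Python sketch below implements textbook fuzzy c-means with fuzzifier m = 2 on 3-D points. The function names, the sample points, and the choice to pass the initial centers in as an argument are assumptions for illustration; the artificial fish swarm search that produces the coordinates is not shown.

```python
import math


def memberships_for(points, centers, m):
    """Standard FCM membership: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for p in points:
        # Clamp distances so a point lying exactly on a center cannot divide by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        table.append([1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                      for d_k in dists])
    return table


def fcm(points, initial_centers, m=2.0, iterations=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [tuple(c) for c in initial_centers]
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[axis] for w, p in zip(weights, points)) / total
                for axis in range(dims)))
        shift = max(math.dist(c, nc) for c, nc in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships_for(points, centers, m)


# Hypothetical 3-D coordinates of four alarms forming two obvious groups.
points = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (10.0, 10.0, 10.0), (10.0, 11.0, 10.0)]
centers, u = fcm(points, initial_centers=points[:2])
labels = [row.index(max(row)) for row in u]
```

Each row of `u` holds one alarm's degrees of membership in the clusters; taking the largest entry gives a hard label that can be mapped back to the original high-dimensional alarm.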
Fig. 4 is a diagram of the genetic algorithm optimization process. Specifically, P hidden Markov initial probability matrices π, transition probability matrices A, and output probability matrices B are randomly generated. The generated probability matrices are normalized so that each satisfies the usual stochastic constraints: $\sum_{i=1}^{N} \pi_i = 1$, $\sum_{j=1}^{N} a_{ij} = 1$ for every i, and $\sum_{k} b_i(k) = 1$ for every i.
Each chromosome produced by the floating-point encoding consists of three parts corresponding to the three parameter matrices of the hidden Markov model: the hidden-state initial probability matrix corresponds to the initial chromosome Geπ, the hidden-state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB, as shown in Fig. 1.
The calculation of an individual's support for the population dispersion involves the following definitions:
Definition 1: the population size is S; one chromosome contains Q = m*n + n*n + N genes; chromosome k is denoted $G_k = (G_{k1}, G_{k2}, \ldots, G_{kQ})$, k = 1, 2, ..., S;
Definition 2: chromosome fitness function f: because the optimal chromosome sought by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as its fitness function, i.e.
$f = P(O \mid \lambda)$
Definition 3: individual phenotype $\eta_k$, i.e. the ratio of the fitness value of chromosome k to the sum of the fitness values of the population, $\eta_k = f_k / \sum_{j=1}^{S} f_j$;
Definition 4: population dispersion d;
Definition 5: the support of the k-th chromosome for the population dispersion.
Roulette-wheel selection is applied so that the population size reaches P again. The specific steps are:
(1) compute $t_i = u f_i + v\,Spt_i$, where u and v are the weights of the fitness value and the support value, respectively;
(2) compute $T_n = \sum_i (u f_i + v\,Spt_i)$;
(3) compute $W_i = t_i / T_n$;
(4) compute the cumulative probability $g_i = \sum_{j=1}^{i} W_j$;
(5) generate a random number r uniformly distributed on 0~1 and compare it with $g_i$; if $g_{i-1} < r < g_i$, select individual i for the next-generation population. Repeat (4) and (5) until the size of the new population equals the size of the parent population;
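Steps (1) to (5) can be sketched in Python as follows. The support values are taken as given inputs because their formula is not reproduced here; the numbers, the weights u and v, and the function names are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def selection_probabilities(fitness, support, u, v):
    """W_i = t_i / T_n with t_i = u * f_i + v * Spt_i."""
    scores = [u * f + v * s for f, s in zip(fitness, support)]
    total = sum(scores)
    return [score / total for score in scores]


def roulette_select(probabilities, count, rng):
    """Pick `count` indices; index i is chosen when g_(i-1) < r <= g_i."""
    cumulative = list(accumulate(probabilities))
    chosen = []
    for _ in range(count):
        r = rng.random()
        index = bisect_left(cumulative, r)
        # Guard against rounding that leaves the last cumulative value below r.
        chosen.append(min(index, len(cumulative) - 1))
    return chosen


# Hypothetical fitness and support values for three chromosomes.
probabilities = selection_probabilities([0.2, 0.6, 0.2], [0.5, 0.1, 0.4], u=0.7, v=0.3)
next_generation = roulette_select(probabilities, count=3, rng=random.Random(1))
```

Weighting in the support alongside fitness lets chromosomes that add diversity survive selection even when their fitness is modest.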
Crossover probability is determined according to support size, and using arithmetic crossover mode, the genetic cross completed between individual is specific Steps are as follows:
(1): random selection item chromosome k, calculation formula:Wherein SptmaxIt represents most Big support, SptminRepresent minimum support, SptkRepresent randomly selected chromosome support;
(2): generating random number r, if r < Sptr, chromosome k is one to Cross reaction body.This two step is repeated, Until generating two to Cross reaction body
(3): perform genetic crossover on the two individuals to be crossed. The crossover principle is: Geπ1 crosses with Geπ2, GeA1 crosses with GeA2, and GeB1 crosses with GeB2. The specific crossover operation is illustrated by the following example:
1) parents: Geπ1 = {π_11, π_12, ..., π_1n}, Geπ2 = {π_21, π_22, ..., π_2n}
2) randomly select a gene position j
3) offspring: Geπ1 = {π_11, π_12, ..., a*π_1j + (1-a)*π_2j, a*π_1(j+1) + (1-a)*π_2(j+1), ..., a*π_1n + (1-a)*π_2n}, where a is a random number between 0 and 1. Crossover of the transition matrix and of the output matrix is analogous and is not repeated here;
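The arithmetic crossover in the example above can be sketched as follows. The function name `arithmetic_crossover` and the sample gene values are assumptions made for illustration; only the single offspring shown in the example is produced.

```python
import random


def arithmetic_crossover(parent1, parent2, rng=random):
    """Blend two parent gene lists from a random position onward."""
    j = rng.randrange(len(parent1))  # randomly selected gene position
    a = rng.random()                 # mixing coefficient in [0, 1)
    child = list(parent1[:j])        # genes before j are copied from parent1
    # genes from j onward: a * parent1 gene + (1 - a) * parent2 gene
    child += [a * x + (1 - a) * y
              for x, y in zip(parent1[j:], parent2[j:])]
    return child


ge_pi_1 = [0.2, 0.3, 0.5]
ge_pi_2 = [0.6, 0.1, 0.3]
child = arithmetic_crossover(ge_pi_1, ge_pi_2, random.Random(0))
print(child)
```

Because only part of the vector is blended, the offspring need not sum to 1, which is consistent with the normalization applied to the new population later in the method.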
The mutation probability is determined according to the support value, and non-uniform mutation is used to complete the genetic mutation of individuals.
The mutation works as follows:
Here G_k is the k-th chromosome, randomly selected before mutation, and G_k' is chromosome G_k after mutation. G_max and G_min are the individuals with the maximum and minimum current fitness, respectively. t is a mutation constant in (0, 1], and r is a random number. When the random integer rand() is even, the mutation G_k' = G_k + t(G_max - G_k)r is used; when it is odd, the mutation G_k' = G_k + t(G_k - G_min)r is used.
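A minimal sketch of this mutation rule is given below. It assumes the formulas are applied gene by gene and uses a random bit as a stand-in for testing whether rand() is even; the function name `mutate` and the sample values are illustrative assumptions.

```python
import random


def mutate(g_k, g_max, g_min, t, rng=random):
    """Apply the even/odd mutation rule gene by gene (assumed interpretation)."""
    r = rng.random()              # random number r
    if rng.randrange(2) == 0:     # stand-in for "rand() is even"
        # G_k' = G_k + t * (G_max - G_k) * r
        return [x + t * (hi - x) * r for x, hi in zip(g_k, g_max)]
    # G_k' = G_k + t * (G_k - G_min) * r
    return [x + t * (x - lo) * r for x, lo in zip(g_k, g_min)]


g_k = [0.2, 0.8]
g_best = [0.5, 0.5]    # individual with maximum fitness (illustrative)
g_worst = [0.1, 0.9]   # individual with minimum fitness (illustrative)
mutated = mutate(g_k, g_best, g_worst, t=0.5, rng=random.Random(1))
print(mutated)
```

The mutated genes are not guaranteed to remain valid probabilities, so the result still has to be normalized before it is used as an HMM parameter.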
The Baum-Welch algorithm is used to iteratively train the obtained λ = (π, A, B) in order to obtain the maximum likelihood estimate of the HMM parameters, specifically comprising the following steps:
(1): obtain D alert-sequence data samples {O_1, O_2, ..., O_D} using the intrusion detection alert clustering method based on artificial-fish-swarm-optimized fuzzy C-means clustering, where any alert sequence is O_d = {o_1^(d), o_2^(d), ..., o_T^(d)}, as described in claim 2
(2): obtain an optimal initial value λ = (π, A, B) from the genetic algorithm optimization
(3): for each sample d = 1, 2, ..., D, use the forward-backward algorithm to compute γ_t^(d)(i) and ξ_t^(d)(i, j), t = 1, 2, ..., T,
where α_t(i) is the forward probability, β_t(i) is the backward probability, a_ij is the transition probability, and b_j(o_(t+1)) is the output probability
(4): update the model parameters;
(5): check whether each matrix satisfies the convergence condition; if so, the algorithm terminates; otherwise return to (3) and iterate.
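For reference, the textbook Baum-Welch expressions for D observation sequences are sketched below, written with the forward probability α, backward probability β, transition probability a_ij, and output probability b_j used in steps (3) and (4). They are assumed to correspond to the formulas of this method but are not taken from it; N is the number of hidden states and v_k, the k-th observation symbol, is a name introduced here.

```latex
\begin{align}
\gamma_t^{(d)}(i) &= \frac{\alpha_t^{(d)}(i)\,\beta_t^{(d)}(i)}
                          {\sum_{j=1}^{N}\alpha_t^{(d)}(j)\,\beta_t^{(d)}(j)} \\
\xi_t^{(d)}(i,j)  &= \frac{\alpha_t^{(d)}(i)\,a_{ij}\,b_j\!\left(o_{t+1}^{(d)}\right)\beta_{t+1}^{(d)}(j)}
                          {\sum_{r=1}^{N}\sum_{s=1}^{N}\alpha_t^{(d)}(r)\,a_{rs}\,b_s\!\left(o_{t+1}^{(d)}\right)\beta_{t+1}^{(d)}(s)} \\
\pi_i  &= \frac{1}{D}\sum_{d=1}^{D}\gamma_1^{(d)}(i) \\
a_{ij} &= \frac{\sum_{d=1}^{D}\sum_{t=1}^{T-1}\xi_t^{(d)}(i,j)}
               {\sum_{d=1}^{D}\sum_{t=1}^{T-1}\gamma_t^{(d)}(i)} \\
b_j(k) &= \frac{\sum_{d=1}^{D}\sum_{t=1,\;o_t^{(d)}=v_k}^{T}\gamma_t^{(d)}(j)}
               {\sum_{d=1}^{D}\sum_{t=1}^{T}\gamma_t^{(d)}(j)}
\end{align}
```

In this standard form ξ is defined only for t = 1, ..., T-1, since it refers to the observation at time t+1.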
The trained hidden Markov model parameters are obtained through the above steps. When the network operates abnormally, the prediction algorithm proceeds as follows:
(1) obtain the sequence of network situation observations.
(2) obtain the trained hidden Markov model parameters.
(3) compute the most probable hidden state sequence using the Viterbi algorithm.
(4) determine the network security situation value at the next moment according to the state transition matrix.
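Steps (3) and (4) can be sketched as follows. The Viterbi part is the standard algorithm; taking the most probable successor of the last decoded state as the next situation value is an assumption made here, since the text does not say how the transition matrix is applied. The function names and the toy parameters are illustrative.

```python
def viterbi(pi, A, B, obs):
    """Return the most probable hidden-state path for an observation sequence."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        # best predecessor of each state j at this time step
        prev = [max(range(n), key=lambda i: delta[i] * A[i][j])
                for j in range(n)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][o] for j in range(n)]
        backpointers.append(prev)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for prev in reversed(backpointers):  # trace the best path backwards
        state = prev[state]
        path.append(state)
    return path[::-1]


def predict_next_state(A, last_state):
    """Assumed rule: the most probable successor of the last decoded state."""
    row = A[last_state]
    return max(range(len(row)), key=lambda j: row[j])


# Toy trained model (illustrative values only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
path = viterbi(pi, A, B, [0, 1, 1])
next_state = predict_next_state(A, path[-1])
print(path, next_state)
```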

Claims (1)

1. A situation prediction method based on a joint hidden Markov model and genetic algorithm, characterized by comprising the following steps:
Step 1: the collected intrusion detection alerts are preprocessed by an intrusion detection alert clustering method based on artificial-fish-swarm-optimized fuzzy C-means clustering, so as to simplify and accurately classify the alerts, and the result of the processing is used as the external observation values of the network;
Preprocessing the collected intrusion detection alerts by the intrusion detection alert clustering method based on artificial-fish-swarm-optimized fuzzy C-means clustering comprises:
1) initialize the intrusion detection system alerts: remove unnecessary attributes and perform preliminary aggregation of the multi-source heterogeneous data;
2) assign weights to the alert attributes using the consistent matrix method;
3) establish the fuzzy similarity matrix of the alerts using a custom alert-attribute similarity function and the weight relationship;
4) establish the fuzzy equivalence matrix using the transitive closure method, and create an artificial fish individual for each alert;
5) construct the food concentration function and map the high-dimensional samples to three-dimensional space;
6) perform FCM clustering based on the artificial fish swarm algorithm, which comprises:
1) error function of artificial fish-swarm algorithm is defined:
Wherein rij1Rij indicates the Euclidean distance being mapped between the sample i of three-dimensional planar and sample j from high-order sample, it is assumed that i Coordinate value with j is respectively (ai, bi, ci)、(aj, bj, cj), then rij1:
Rij* is the value of corresponding position in the fuzzy equivalent matrix established in step 4;
2) the food concentration function of individual is defined:
3) in three dimensions, being assigned to three at random to each sample from High Dimensional Mapping to three-dimensional sample random distribution to be clustered Dimensional coordinate values;
4) the food concentration size of Artificial Fish is calculated;
5) the optimizations behavior such as bunched on the basis of the food concentration of the current shoal of fish, look for food, knock into the back;
If 6) all Artificial Fishs in group all terminate to move, continues to execute downwards, otherwise go to step 4);
7) if the individual maximum food concentration value of updated Artificial Fish and the difference of the maximum food concentration functional value before update are small Reach specified maximum times in some designated value or update times, then terminate, otherwise goes to step 4);
8) D coordinates value clustered using FCM algorithm, and last result is mapped in original higher-dimension sample;
Step 2: determine the number N of hidden states of the network according to the network risk levels; based on expert experience, divide the initial probability of each hidden state into intervals, and likewise divide into intervals the transition probabilities between hidden states and the output probabilities from hidden states to observable states;
Step 3: according to the initial probability interval matrix, the transition probability interval matrix, and the output probability interval matrix divided in step 2, take random numbers within the intervals and normalize them to generate P hidden Markov initial probability matrices π, transition probability matrices A, and output probability matrices B, respectively;
wherein the P initial probability matrices π, transition probability matrices A, and output probability matrices B are each generated at random, and the generated probability matrices satisfy the normalization constraints of a hidden Markov model, i.e., the entries of π sum to 1 and each row of A and of B sums to 1;
Step 4: encode the P generated probability matrices using floating-point encoding; each chromosome produced by the floating-point encoding corresponds to the three parameter matrices of the hidden Markov model and comprises three parts: the hidden-state initial probability matrix corresponds to the initial chromosome Geπ, the hidden-state transition probability matrix corresponds to the transition chromosome GeA, and the output matrix from hidden states to observable states corresponds to the output chromosome GeB;
Step 5: calculate the fitness values of all P chromosomes; to prevent the randomness of the genetic algorithm from destroying the individual with the best fitness value in the current population, copy the individual with the largest fitness value directly into the next population, i.e., apply the optimum-preservation (elitist) strategy;
Step 6: for the remaining P-1 chromosomes, calculate the weighted sum of their support for the population dispersion and their fitness value, and restore the population size to P in combination with the roulette rule;
The calculation of an individual's support for the population dispersion involves the following definitions:
Definition 1: let the population size be S; each chromosome contains Q = m*n + n*n + N genes, and chromosome k is represented by G_k = (G_k1, G_k2, ..., G_kQ), k = 1, 2, ..., S;
Definition 2: chromosome fitness function f: because the optimal chromosome sought by the genetic algorithm is the initial parameter matrix of the HMM, the forward probability of each chromosome is used as its fitness function, i.e.,
f = P(O | λ);
Definition 3: define the individual phenotype η_k as the ratio of the fitness value of chromosome k to the sum of the fitness values of the population;
Definition 4: define the population dispersion degree d;
Definition 5: define the support of the k-th chromosome for the population dispersion;
Step 7: determine the crossover probability according to the support value and, using arithmetic crossover, complete the genetic crossover between individuals as follows:
1): randomly select a chromosome k and compute Spt_r, where Spt_max denotes the maximum support, Spt_min the minimum support, and Spt_k the support of the randomly selected chromosome;
2): generate a random number r; if r < Spt_r, chromosome k becomes one of the individuals to be crossed; repeat these two steps until two individuals to be crossed have been generated;
3): perform genetic crossover on the two individuals to be crossed, the crossover principle being: Geπ1 crosses with Geπ2, GeA1 crosses with GeA2, and GeB1 crosses with GeB2;
Step 8: determine the mutation probability according to the support value and, using non-uniform mutation, complete the genetic mutation of individuals; the mutation works as follows:
where G_k is the k-th chromosome, randomly selected before mutation, G_k' is chromosome G_k after mutation, and G_max and G_min are the individuals with the maximum and minimum current fitness, respectively; t is a mutation constant in (0, 1], and r is a random number; when the random integer rand() is even, the mutation G_k' = G_k + t(G_max - G_k)r is used, and when it is odd, the mutation G_k' = G_k + t(G_k - G_min)r is used; the mutation probability is determined from the support in the same way as in step 7;
Step 9: normalize the individuals of the new population that has undergone genetic replication, genetic crossover, and genetic mutation, so that they satisfy the parameter constraints of the hidden Markov model;
Step 10: check whether the preset iteration termination condition is met; if so, terminate, select the chromosome with the largest fitness value as the global optimum, and map that chromosome to the three initial matrices of the hidden Markov model; otherwise return to step 5 and begin a new round of evolution;
Step 11: using the Baum-Welch algorithm, iteratively train the model parameters λ = (π, A, B) obtained in step 10 to obtain the maximum likelihood estimate of the hidden Markov model parameters, comprising the following steps:
1) obtain D alert-sequence data samples {O_1, O_2, ..., O_D} using the intrusion detection alert clustering method based on artificial-fish-swarm-optimized fuzzy C-means clustering, where any alert sequence is O_d = {o_1^(d), o_2^(d), o_3^(d), ..., o_T^(d)};
2) obtain an optimal initial value λ = (π, A, B) from the genetic algorithm optimization;
3) for each sample d = 1, 2, ..., D, use the forward-backward algorithm to compute γ_t^(d)(i) and ξ_t^(d)(i, j), t = 1, 2, ..., T;
4) update the model parameter matrices;
5) check whether each matrix satisfies the convergence condition; if so, the algorithm terminates; otherwise return to 3) and iterate;
Step 12: when the network state is abnormal, network security situation prediction can be performed with the Viterbi algorithm, using the collected external observation values and the trained hidden Markov model.
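Steps 3 and 4 of the claim (drawing parameters from expert-defined intervals, normalizing them, and flattening them into a floating-point chromosome) can be sketched as follows. This is a minimal illustration under assumptions: the function names, the interval values, and the gene order Geπ, GeA, GeB within the chromosome are choices made here rather than details given in the claim.

```python
import random


def sample_row(intervals, rng=random):
    """Draw one value per (low, high) interval, then normalize to sum to 1."""
    raw = [rng.uniform(lo, hi) for lo, hi in intervals]
    total = sum(raw)
    return [x / total for x in raw]


def encode(pi, A, B):
    """Flatten (pi, A, B) into one float-coded chromosome: Ge_pi + Ge_A + Ge_B."""
    return (list(pi)
            + [x for row in A for x in row]
            + [x for row in B for x in row])


def decode(chrom, n, m):
    """Rebuild pi (n), A (n x n), and B (n x m) from a flat chromosome."""
    pi = chrom[:n]
    A = [chrom[n + i * n: n + (i + 1) * n] for i in range(n)]
    offset = n + n * n
    B = [chrom[offset + i * m: offset + (i + 1) * m] for i in range(n)]
    return pi, A, B


rng = random.Random(0)
# Illustrative expert intervals for the initial probabilities of 2 hidden states.
pi0 = sample_row([(0.5, 0.7), (0.3, 0.5)], rng)
A0 = [sample_row([(0.6, 0.8), (0.2, 0.4)], rng) for _ in range(2)]
B0 = [sample_row([(0.1, 0.5), (0.1, 0.5), (0.2, 0.6)], rng) for _ in range(2)]
chromosome = encode(pi0, A0, B0)
print(len(chromosome))
```

With n hidden states and m observation symbols the chromosome has n + n*n + n*m genes. Normalization can move a value slightly outside its original interval, which this sketch does not correct.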
CN201910060212.1A 2019-01-22 2019-01-22 Situation prediction method based on combined hidden Markov model and genetic algorithm Active CN110336768B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910060212.1A CN110336768B (en) 2019-01-22 2019-01-22 Situation prediction method based on combined hidden Markov model and genetic algorithm

Publications (2)

Publication Number Publication Date
CN110336768A true CN110336768A (en) 2019-10-15
CN110336768B CN110336768B (en) 2021-07-20

Family

ID=68138888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910060212.1A Active CN110336768B (en) 2019-01-22 2019-01-22 Situation prediction method based on combined hidden Markov model and genetic algorithm

Country Status (1)

Country Link
CN (1) CN110336768B (en)

Also Published As

Publication number Publication date
CN110336768B (en) 2021-07-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant