CN108573303A - A self-improving recovery strategy for local failures of complex networks based on improved reinforcement learning - Google Patents

Info

Publication number
CN108573303A
Authority
CN
China
Prior art keywords
service mode
cluster
node
recovery
complex network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810375758.1A
Other languages
Chinese (zh)
Inventor
冯强
吴其隆
任羿
孙博
杨德真
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201810375758.1A
Publication of CN108573303A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a self-improving recovery strategy method for local failures of complex networks based on improved reinforcement learning, which addresses the problem of generating a recovery strategy for cluster maintenance of a complex network. The steps are as follows: (1) establish the cluster maintenance state matrix of the complex network from the local failure information; (2) generate the adjacency matrix of the complex network from the initial cluster maintenance state; (3) predict the prior maintenance state transition probability and the maintenance strategy value of the cluster with a neural network model; (4) traverse the maintenance strategy solution space of the cluster with a Monte Carlo tree search algorithm and select the globally best maintenance action for the current time; (5) update the adjacency matrix of the complex network according to the change in the cluster maintenance state; (6) calculate and check the recovery degree of the complex network from the cluster maintenance state and the adjacency matrix; (7) train the neural network parameters on the reinforcement learning experience parameters; (8) generate a complete maintenance and recovery scheme from the series of best maintenance actions produced by the self-improvement process of the recovery strategy.

Description

A self-improving recovery strategy for local failures of complex networks based on improved reinforcement learning
Technical field
The present invention provides a self-improving recovery strategy (Self-improvement Recovery Strategy, SIRS) method for complex networks in a local failure state, based on improved reinforcement learning. More specifically, it relates to a recovery strategy method that takes into account the component units making up each network node and uses an improved reinforcement learning algorithm to solve, through self-improvement, the multi-node cluster maintenance problem of a complex network. It belongs to the field of maintainability engineering.
Background technology
The self-improving recovery strategy (SIRS) addresses the situation in which, after a complex network suffers a local failure, multiple nodes concentrated at the damaged location become unavailable and the network must be quickly restored to an overall available state through cluster maintenance. Current research on cluster maintenance, at home and abroad, generally does not consider timing. As maintainability receives increasing attention, higher requirements are placed on research into cluster maintenance strategies for complex networks: an efficient cluster maintenance strategy method is needed that fully accounts for the timing of cluster maintenance, the uncertainty of its payoff, and the NP-hard nature of the overall problem.
Based on a neural network model that predicts maintenance state transition probabilities and on the Monte Carlo Tree Search (MCTS) algorithm, the present invention provides a novel self-improving recovery strategy (SIRS) method based on improved reinforcement learning, which solves the problem of generating a cluster maintenance strategy for a complex network in a local failure state.
Summary of the invention
The purpose of the present invention is to provide a novel self-improving recovery strategy (SIRS) method for complex networks in a local failure state. It is intended to overcome the shortcomings of conventional cluster maintenance strategy methods, which do not fully account for the timing of cluster maintenance, the uncertainty of its payoff, or the NP-hard nature of the overall problem.
The present invention proposes an SIRS method based on a neural network prediction model and the Monte Carlo tree search (MCTS) algorithm. It mainly comprises the following steps:
Step 1: Establish the cluster maintenance state matrix of the complex network based on the local failure.
The recovery strategy for a local failure of a complex network is treated as a multi-node cluster maintenance problem. First, the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ of the complex network is constructed (where $n$ is the number of nodes), and each node is decomposed into its component units to establish the unit set $U = \{u_1, u_2, \ldots, u_m\}$. On this basis, an $m \times n$ "node-unit" matrix is established. According to the local failure information, "0" denotes a failed unit to be repaired inside the local failure region and "1" denotes a normal unit; assigning these values to the matrix elements yields the maintenance state matrix $S$.
Step 2: Generate the adjacency matrix of the complex network based on the initial cluster maintenance state.
A complex network is abstracted as a graph $G = (K, E)$ composed of the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ and the connection (edge) set $E$. An $n \times n$ adjacency matrix $A$ describes the connections (edges) among the $n$ nodes of the complex network; self-loops are not considered. When all units in the complex network are normal, the adjacency matrix is denoted $A^*$.
The unit set $U_i = \{u_1, u_2, \ldots, u_m\}$ of node $k_i$ is divided into three classes of unit sets, referred to as class A, class B, and class C. A class is regarded as destroyed when all units in it are failed units in the failure region. Based on this classification, and taking node $k_i$ as an example, a mapping $f_{S \to A}$ from the elements of the maintenance state matrix $S$ to the elements of the adjacency matrix $A$ is assumed, with the following meaning.
When all class A units of node $k_i$ are destroyed, all edges associated with the node are disconnected; when all class B units of node $k_i$ are destroyed, the edges directed from the node to other nodes are disconnected; when all class C units of node $k_i$ are destroyed, the edges directed from other nodes to the node are disconnected. Based on the initial maintenance state of the complex network, the adjacency matrix $A$ of the initial maintenance state can be generated by the mapping $f_{S \to A}$.
Step 3: Predict the prior maintenance state transition probability of the cluster with a neural network model.
A squeeze-and-excitation residual network (Squeeze-and-Excitation Residual Networks, SE-ResNet) is designed to predict the prior maintenance state transition probability matrix $p$ and the prior cluster maintenance strategy value $v$ of the "node-unit" cluster.
Neural network input feature map $X$: this comprises the current "node-unit" cluster maintenance state $S$, the most recent historical cluster maintenance states from the maintenance strategy iteration (for example, 7 steps of history), and the adjacency matrices $A(S)$ and $A^*$ of the complex network nodes.
Neural network output: a prior cluster maintenance state transition probability $p$ of the "node-unit" cluster and a prior cluster maintenance strategy value $v$.
Neural network structure: the selected structure includes convolution modules, residual modules, squeeze-and-excitation (Squeeze-and-Excitation, SE) modules, ReLU function modules, and so on. The neural network is expressed as $f_\theta(X) = (p, v)$.
Step 4: Traverse the maintenance strategy solution space of the cluster with the Monte Carlo tree search algorithm.
A self-improving iterative system for the maintenance strategy is built with the goals of raising the performance recovery degree of the complex network "node-unit" cluster and reducing the recovery time. A reinforcement learning framework based on an improved weighted MCTS algorithm is designed to solve for the optimal maintenance strategy.
The MCTS algorithm uses the maintenance prediction $p$ of the SE-ResNet from Step 3 as its search weight. This avoids the combinatorial explosion that arises when the cluster maintenance strategy solution space is searched globally and directly; a local search of the solution space guided by the prior probability $p$ can likewise obtain the globally optimal maintenance strategy. The tree search yields an improved maintenance state transition probability matrix $\pi$, one globally best maintenance action $a$ is executed, and the current "node-unit" cluster maintenance state $S$ transitions to the cluster maintenance state of the next time step. The MCTS algorithm is expressed as $\mathrm{MCTS}_\theta(X, p, v) = (\pi, a)$.
Step 5: Update the adjacency matrix of the complex network according to the change in the cluster maintenance state.
After the self-improvement process of the recovery strategy executes the best maintenance action at a given time, the cluster maintenance state transitions to the next time step. Based on this change, the adjacency matrix of the complex network is updated according to the mapping $f_{S \to A}$ from Step 2.
Step 6: Calculate and check the recovery degree of the complex network.
After one self-improvement operation of the recovery strategy has been executed (comprising Steps 3, 4, and 5), the recovery degree of the complex network is calculated from the transitioned "node-unit" cluster maintenance state $S$ and its adjacency matrix $A(S)$.
If the recovery requirement is not met, return to Step 3 and continue executing self-improvement operations of the recovery strategy.
If the cluster maintenance state $S_T$ at time $T$ meets the requirement, then one complete self-improvement process of the recovery strategy has been completed through $T$ self-improvement operations, and Steps 7 and 8 are then executed together.
Step 7: Train the neural network parameters on the reinforcement learning experience parameters.
A reward value $z$ is calculated by a reward function to assess the self-improvement process of the recovery strategy. Using the reward value and the latest $T$ groups of reinforcement learning experience parameters generated by the self-improvement process, the SE-ResNet trains its network parameters $\theta$ by gradient descent, with the objectives of minimizing the error between the predicted assessment value $v$ and the reward value $z$ at the end of self-improvement and maximizing the similarity between the prior state transition probability $p$ and the improved state transition probability $\pi$. This yields a new SE-ResNet for the next self-improvement process of the recovery strategy. Iteratively training the neural network gives MCTS a better search direction.
Step 8: Generate a complete maintenance and recovery scheme from the self-improvement process of the recovery strategy.
The series of best maintenance actions $\{a_1, a_2, \ldots, a_T\}$ stored during the self-improvement process of the recovery strategy generates a complete maintenance and recovery scheme, which can be expressed as
$\mathrm{Recovery} = f_{\mathrm{Rec}}(a_1, a_2, \ldots, a_T) = 1 \times a_1 + 2 \times a_2 + \cdots + T \times a_T$
The recovery degree of the complex network is calculated from the final cluster maintenance state $S_T$ and its adjacency matrix $A(S_T)$ and is output.
Description of the drawings
Fig. 1 is a block diagram of the overall architecture of the method of the present invention.
Fig. 2 shows the SE-ResNet prediction model for the prior maintenance state transition probability in the present invention.
Fig. 3 shows the SE-Residual unit structure used by the prior maintenance state transition probability prediction model in the present invention.
Fig. 4 is a flow chart of the Monte Carlo tree search algorithm that traverses the maintenance strategy solution space in the present invention.
Detailed description
To make the technical solution, features, and advantages of the present invention easier to understand, a detailed description is given below with reference to the drawings.
The present invention provides a novel self-improving recovery strategy (SIRS) method that can be applied to the cluster maintenance strategy problem of a complex network in a local failure state. It overcomes the shortcomings of conventional methods, which do not fully account for the timing of cluster maintenance, the uncertainty of its payoff, or the NP-hard nature of the overall problem.
The overall architecture of the present invention is shown in Fig. 1. The specific implementation steps are:
Step 1: Establish the cluster maintenance state matrix of the complex network based on the local failure.
The recovery strategy for a local failure of a complex network is treated as a multi-node cluster maintenance problem. First, the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ of the complex network is constructed (where $n$ is the number of nodes), and each node is decomposed into its component units to establish the unit set $U = \{u_1, u_2, \ldots, u_m\}$. On this basis, an $m \times n$ "node-unit" matrix is established. According to the local failure information, "0" denotes a failed unit to be repaired inside the local failure region and "1" denotes a normal unit; assigning these values to the matrix elements yields the maintenance state matrix $S$.
When the self-improvement process of the recovery strategy has proceeded to time $t$, the maintenance state matrix of the "node-unit" cluster is expressed as
$$S_t = \begin{bmatrix} s^t_{11} & s^t_{12} & \cdots & s^t_{1n} \\ s^t_{21} & s^t_{22} & \cdots & s^t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ s^t_{m1} & s^t_{m2} & \cdots & s^t_{mn} \end{bmatrix}$$
The element $s^t_{mn}$ of the matrix denotes the maintenance state of unit $u_m$ in node $k_n$ at time $t$: $s^t_{mn} = 1$ indicates that the unit is normal, and $s^t_{mn} = 0$ indicates that the unit is a failed unit to be repaired in the local failure region.
Example: suppose the object of analysis is a complex network with 10 nodes, each containing 6 units. The maintenance state matrix $S_0$ of the "node-unit" cluster at the initial time is written in the same form.
In that matrix, for example, an element equal to 1 in the position of unit $u_1$ of node $k_1$ indicates that the unit is normal at the initial time, and an element equal to 0 in the position of unit $u_1$ of node $k_5$ indicates that the unit is a failed unit to be repaired in the local failure region.
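As an illustration of Step 1, the following Python sketch builds a 0-1 "node-unit" matrix with rows as units and columns as nodes. The function name `build_state_matrix` and the particular failed units are hypothetical and chosen only for illustration; they are not the values of the example above.

```python
def build_state_matrix(num_units, num_nodes, failed):
    """Return an m x n 0-1 matrix: rows are units, columns are nodes.

    `failed` holds 1-based (unit, node) pairs for units inside the
    local failure region; every other unit is normal (1).
    """
    S = [[1] * num_nodes for _ in range(num_units)]
    for unit, node in failed:
        S[unit - 1][node - 1] = 0
    return S

# Hypothetical failure region: units u1 and u2 of node k5, unit u1 of node k6.
S0 = build_state_matrix(num_units=6, num_nodes=10,
                        failed=[(1, 5), (2, 5), (1, 6)])
for row in S0:
    print(row)
```

Storing the state as a plain 0-1 matrix keeps each later repair a single element update.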
Step 2: Generate the adjacency matrix of the complex network based on the initial cluster maintenance state.
A complex network is abstracted as a graph $G = (K, E)$ composed of the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ and the connection (edge) set $E$. An $n \times n$ adjacency matrix $A$ describes the connections (edges) among the $n$ nodes of the complex network; self-loops are not considered. When all units in the complex network are normal, the adjacency matrix is denoted $A^*$.
The unit set $U_i = \{u_1, u_2, \ldots, u_m\}$ of node $k_i$ is divided into three classes of unit sets, referred to as class A, class B, and class C. A class is regarded as destroyed when all units in it are failed units in the failure region. Based on this classification, and taking node $k_i$ as an example, a mapping $f_{S \to A}$ from the elements of the maintenance state matrix $S$ to the elements of the adjacency matrix $A$ is assumed, with the following meaning.
When all class A units of node $k_i$ are destroyed, all edges associated with the node are disconnected; when all class B units of node $k_i$ are destroyed, the edges directed from the node to other nodes are disconnected; when all class C units of node $k_i$ are destroyed, the edges directed from other nodes to the node are disconnected.
Based on the initial maintenance state of the complex network, the adjacency matrix $A$ of the initial maintenance state can be generated by the mapping $f_{S \to A}$. The adjacency matrix is expressed as $A = (x_{ij})_{n \times n}$.
The element $x_{ij}$ ($i, j = 1, 2, \ldots, n$; $i \neq j$) denotes the connection (edge) between node $k_i$ and node $k_j$: $x_{ij} = 0$ indicates that there is no edge between the two nodes (destroyed or nonexistent), and $x_{ij} = 1$ indicates that there is an edge directed from node $k_i$ to node $k_j$. When all units in the complex network are normal, the adjacency matrix $A^*$ is generated in the same way.
Example: if node $k_i$ of the $m \times n$ complex network established in Step 1 is connected only to the nodes in the set $\{k_{i-2}, k_{i-1}, k_{i+1}, k_{i+2}\}$, then the nonzero elements of $A^*$ are those $x_{ij}$ for which $k_j$ belongs to this set.
Suppose the unit set $U_i = \{u_1, u_2, \ldots, u_6\}$ of node $k_i$ is divided into the three classes of unit sets. The adjacency matrix of the complex network at the initial time in Step 1 is then obtained from the mapping $f_{S \to A}$.
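The class A/B/C rule of Step 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the 5-node network without wrap-around, and the assignment of units $u_1$ to $u_6$ to the three classes are all hypothetical.

```python
def class_destroyed(S, node, unit_rows):
    """True when every unit of the given class in this node has state 0."""
    return all(S[u][node] == 0 for u in unit_rows)

def adjacency_from_state(S, A_star, classes):
    """Derive A(S) from the state matrix S and the undamaged adjacency A*."""
    n = len(A_star)
    A = [row[:] for row in A_star]
    for i in range(n):
        a = class_destroyed(S, i, classes["A"])
        b = class_destroyed(S, i, classes["B"])
        c = class_destroyed(S, i, classes["C"])
        for j in range(n):
            if a or b:
                A[i][j] = 0  # edges directed from k_i to other nodes
            if a or c:
                A[j][i] = 0  # edges directed from other nodes to k_i
    return A

n = 5
# Each node is linked to its neighbours at index distance 1 and 2.
A_star = [[1 if i != j and abs(i - j) <= 2 else 0 for j in range(n)]
          for i in range(n)]
# Hypothetical split of u1..u6 into classes (0-based unit rows).
classes = {"A": [0, 1], "B": [2, 3], "C": [4, 5]}
S = [[1] * n for _ in range(6)]
S[2][2] = S[3][2] = 0  # all class B units of node k3 have failed
A = adjacency_from_state(S, A_star, classes)
for row in A:
    print(row)
```

In this example only the outgoing edges of node $k_3$ are removed, because only its class B units are destroyed.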
Step 3: Predict the prior maintenance state transition probability of the cluster with a neural network model.
A squeeze-and-excitation residual network (Squeeze-and-Excitation Residual Networks, SE-ResNet) is designed to predict the prior maintenance state transition probability matrix $p$ and the prior cluster maintenance strategy value $v$ of the "node-unit" cluster.
(1) Neural network input:
The input feature map $X$ comprises the "node-unit" cluster maintenance state $S_t$ at time $t$, the most recent historical cluster maintenance states from the maintenance strategy iteration, and the adjacency matrices $A(S_t)$ and $A^*$ of the complex network nodes. Taking 7 steps of historical cluster maintenance states as an example, the input feature map at time $t$ is
$X_t = [S_t, S_{t-1}, \ldots, S_{t-7}, A(S_t), A^*]$
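The assembly of $X_t$ can be sketched as a list of planes. The function name `make_feature_planes` is hypothetical, and padding a short history by repeating the oldest state is an assumption made here for illustration; the text does not say how the first few time steps are handled.

```python
from collections import deque

HISTORY_STEPS = 7  # the example above uses 7 historical states

def make_feature_planes(history, A_current, A_star):
    """history: state matrices ordered newest first (S_t, S_{t-1}, ...)."""
    planes = list(history)[: HISTORY_STEPS + 1]
    # Assumption: missing history is padded with the oldest known state.
    while len(planes) < HISTORY_STEPS + 1:
        planes.append(planes[-1])
    return planes + [A_current, A_star]

history = deque(maxlen=HISTORY_STEPS + 1)
history.appendleft([[0, 1]])  # S_0
history.appendleft([[1, 1]])  # S_1, now the newest state
X = make_feature_planes(history, [[0, 1], [1, 0]], [[0, 1], [1, 0]])
print(len(X))
```

A bounded `deque` drops the oldest state automatically once more than eight states have been recorded.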
(2) Neural network output:
The output comprises a prior cluster maintenance state transition probability $p$ of the "node-unit" cluster and a prior cluster maintenance strategy value $v$.
A) The prior cluster maintenance state transition probability of the "node-unit" cluster at time $t$ is denoted $p_t = (p^t_{ij})_{m \times n}$.
The element $p^t_{mn}$ of the matrix denotes the probability that a maintenance action is executed on unit $m$ of node $n$ at time $t$.
B) The prior cluster maintenance strategy value $v_t$ is a normalized parameter: the predicted assessment of the degree to which the cluster maintenance state at time $t$ meets the recovery requirement.
(3) Neural network structure:
The selected SE-ResNet structure includes convolution modules, residual modules, squeeze-and-excitation (Squeeze-and-Excitation, SE) modules, ReLU function modules, and so on.
Example: the designed deep neural network is shown in Fig. 2. The input feature map $X_t$ is processed by a deep SE-Residual tower, which consists of a single convolution module followed by an intermediate-layer module of stacked SE-Residual units:
A) Single convolution module:
1. A convolutional layer of 256 filters of size 3 × 3 with stride 1;
2. A ReLU layer;
B) Intermediate-layer module: the intermediate layers of the deep neural network are built by stacking SE-Residual units (for example, 19 stacked SE-Residual layers). The structure of an SE-Residual unit is shown in Fig. 3 and comprises:
1. Residual module: contains a convolutional layer of c filters, which outputs a feature map of size w × h × c, where c is the depth of the feature map (for example, 256 filters);
2. Squeeze module: a global average pooling layer;
3. Excitation module: a bottleneck structure of two fully connected layers joined by a ReLU function; the dimension reduction ratio r of the first fully connected layer is usually set to 16;
4. Normalization module: a Sigmoid function produces normalized weights between 0 and 1;
5. Reweight module: the normalized weights are applied to each channel of the feature map;
Note: when the SE module is embedded into the residual module in Fig. 3, the SE module is connected in parallel with the channels of the feature map output by the convolutional layer, so that the feature map output by the convolutional layer on the branch is recalibrated before the addition operation in the residual module.
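The squeeze, excitation, normalization, and reweight modules listed above can be sketched in plain Python. This is a minimal sketch, not the network of Fig. 3: biases are omitted, the weights are random rather than trained, and the small sizes c = 4 and r = 2 are chosen only to keep the example readable.

```python
import math
import random

def se_reweight(feature_map, W1, W2):
    """feature_map: c channels, each an h x w list of lists.
    W1: (c // r) x c weights, W2: c x (c // r) weights (biases omitted)."""
    # Squeeze: global average pooling gives one number per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: bottleneck of two fully connected layers joined by ReLU.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in W1]
    logits = [sum(w * x for w, x in zip(row, hidden)) for row in W2]
    # Normalization: the sigmoid maps each channel weight into (0, 1).
    scale = [1.0 / (1.0 + math.exp(-s)) for s in logits]
    # Reweight: scale every value of a channel by that channel's weight.
    return [[[v * s for v in row] for row in ch]
            for ch, s in zip(feature_map, scale)]

random.seed(0)
c, r = 4, 2  # illustrative sizes; the text uses c = 256 and r = 16
fmap = [[[random.random() for _ in range(3)] for _ in range(3)]
        for _ in range(c)]
W1 = [[random.uniform(-1, 1) for _ in range(c)] for _ in range(c // r)]
W2 = [[random.uniform(-1, 1) for _ in range(c // r)] for _ in range(c)]
recalibrated = se_reweight(fmap, W1, W2)
```

In an SE-Residual unit the recalibrated map would then be added to the unit's input by the residual connection.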
The output of the deep SE-Residual tower feeds two parts, a policy module and a value module:
C) Policy module:
1. A convolutional layer of 2 filters of size 1 × 1 with stride 1;
2. A ReLU layer;
3. A fully connected output layer: outputs a feature map of size m × n, corresponding to the logits of $p_t$ for the "node-unit" cluster;
D) Value module:
1. A convolutional layer of 1 filter of size 1 × 1 with stride 1;
2. A linear fully connected layer of size 256;
3. A ReLU layer;
4. A linear fully connected layer;
5. A tanh output layer: outputs a scalar value in the interval [-1, 1].
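The last stage of each head can be sketched as follows. Converting the m × n logits to probabilities with a softmax is an assumption made here (the text only states that the policy module outputs logits); the tanh squashing of the value follows the value module described above. Both function names are hypothetical.

```python
import math

def policy_from_logits(logits):
    """Assumed softmax over all m x n logits; returns a same-shaped matrix."""
    peak = max(x for row in logits for x in row)  # for numerical stability
    exps = [[math.exp(x - peak) for x in row] for row in logits]
    total = sum(map(sum, exps))
    return [[e / total for e in row] for row in exps]

def value_from_activation(x):
    """tanh output layer: maps a real number into [-1, 1]."""
    return math.tanh(x)

p_t = policy_from_logits([[0.0, 1.0], [2.0, 0.5]])
v_t = value_from_activation(0.3)
```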
The deep SE-ResNet prediction model above is expressed as $f_{\theta_i}(X_t) = (p_t, v_t)$, where $\theta_i$ denotes the network parameters of the deep SE-ResNet prediction model in the $i$-th self-improvement process; the initial network parameters $\theta_0$ are obtained by random initialization.
Step 4: Traverse the maintenance strategy solution space of the cluster with the Monte Carlo tree search algorithm.
A self-improving iterative system for the maintenance strategy is built with the goals of raising the performance recovery degree of the complex network "node-unit" cluster and reducing the recovery time. A reinforcement learning framework based on an improved weighted MCTS algorithm is designed to solve for the optimal maintenance and recovery strategy.
The MCTS algorithm uses the maintenance prediction $p_t$ of the SE-ResNet from Step 3 as its search weight. This avoids the combinatorial explosion that arises when the cluster maintenance strategy solution space is searched globally and directly; a local search of the solution space guided by the prior probability $p_t$ can likewise obtain the globally optimal maintenance strategy. The tree search yields an improved maintenance state transition probability matrix $\pi_t$, one globally best maintenance action $a_t$ is executed, and the current "node-unit" cluster maintenance state $S$ transitions to the cluster maintenance state of the next time step. The MCTS algorithm is expressed as $\mathrm{MCTS}_{\theta_i}(X_t, p_t, v_t) = (\pi_t, a_t)$.
A cluster maintenance state $S$ is a tree node of the MCTS search tree. The branches $(S, a)$ of a tree node correspond to all maintenance actions $a \in \mathrm{Action}(S)$ available at the next step, and each branch $(S, a)$ stores a set of statistics:
$\mathrm{Data}(S, a) = \{N(S, a), W(S, a), Q(S, a), P(S, a)\}$
where $N(S, a)$ is the visit count, $W(S, a)$ is the total action value, $Q(S, a)$ is the mean action value, and $P(S, a)$ is the prior probability of selecting branch $(S, a)$.
Given the maintenance state input feature map $X_t$, the prior parameters $(p_t, v_t)$ obtained from the SE-ResNet are taken as input and the solution space search based on the MCTS algorithm is executed, as shown in Fig. 4. The search process mainly comprises 4 steps:
(1) Selection
First, the maintenance state $S_t$ at time $t$ is selected as the root node of the search tree, denoted $S_0$. The MCTS search starts from the root node and ends when it reaches a leaf node $S_L$ at the end of the search tree at time $L$. At each time $l$ (1 ≤ l < L), a maintenance action $a_l$ is selected according to the statistics stored on the branches of the current node $S_l$, which can be expressed as
$a_l = \arg\max_a \left( Q(S_l, a) + U(S_l, a) \right)$
where $U(S_l, a)$ is an intermediate variable computed by an improved PUCT algorithm, in which $c_{puct}$ is a constant that determines the degree of MCTS exploration. This search control strategy initially prefers actions $a$ with a high prior probability and a low visit count, and as the search proceeds it increasingly prefers actions with a high action value.
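The selection rule can be sketched in Python. The exploration term below uses the widely used PUCT form $U = c_{puct} \, P \, \sqrt{\sum_b N_b} / (1 + N)$, which is an assumption here: the text refers to an improved PUCT algorithm whose exact form may differ. The action labels and statistics are hypothetical.

```python
import math

def select_action(stats, c_puct=1.0):
    """stats: dict mapping action -> {"N": visits, "W": total value,
    "Q": mean value, "P": prior probability}."""
    total_visits = sum(s["N"] for s in stats.values())

    def score(action):
        s = stats[action]
        # Assumed PUCT exploration bonus: large for a high prior and few visits.
        u = c_puct * s["P"] * math.sqrt(total_visits) / (1 + s["N"])
        return s["Q"] + u

    return max(stats, key=score)

# Hypothetical branch statistics; actions are (unit, node) pairs.
stats = {
    (1, 5): {"N": 10, "W": 2.0, "Q": 0.2, "P": 0.1},
    (2, 5): {"N": 0, "W": 0.0, "Q": 0.0, "P": 0.9},
}
print(select_action(stats))
```

With these numbers the unvisited branch with the high prior is selected; setting `c_puct=0` removes the exploration bonus, so the choice falls back to the mean action value alone.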
(2) Expansion and evaluation
The leaf node $S_L$ is added to a queue, $A(S_L)$ is generated by the mapping $f_{S \to A}$, and the input feature map $X_L$ of the cluster maintenance state corresponding to the leaf node is obtained. $X_L$ is fed to the neural network to obtain the statistics to be stored on the branches $(S_L, a)$ of the expanded leaf node. This operation can be expressed as
$f_\theta(X_L) = (p_a, v)$
The search thread remains locked until this operation completes. When the leaf node $S_L$ is expanded, the statistics stored on each of its branches $(S_L, a)$ are initialized as
$\mathrm{Data}(S_L, a) = \{N(S_L, a) = 0,\ W(S_L, a) = 0,\ Q(S_L, a) = 0,\ P(S_L, a) = p_a\}$
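The initialization of a newly expanded leaf can be sketched as a small helper. The name `expand_leaf` and the dictionary layout are hypothetical; `prior` stands for the probabilities $p_a$ returned by the network for the leaf.

```python
def expand_leaf(prior):
    """prior: dict mapping action -> p_a. Returns fresh branch statistics."""
    return {action: {"N": 0, "W": 0.0, "Q": 0.0, "P": p_a}
            for action, p_a in prior.items()}

# Hypothetical priors for two candidate (unit, node) actions.
branches = expand_leaf({(1, 5): 0.7, (2, 5): 0.3})
print(branches[(1, 5)])
```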
(3) Backup
The statistics are propagated back along all branches visited by the search thread, from the leaf node to the root node, and the values stored on the search tree branches are updated. During backup, the visit count stored on branch $(S_l, a_l)$ is incremented once, which can be expressed as
$N(S_l, a_l) = N(S_l, a_l) + 1$
At the same time, the total action value and the mean action value of branch $(S_l, a_l)$ are also updated once, which can be expressed as
$W(S_l, a_l) = W(S_l, a_l) + v$, $Q(S_l, a_l) = W(S_l, a_l) / N(S_l, a_l)$
(4) Execution
By iterating the three operations above, after 1000 tree searches have been completed, the best maintenance action $a_t$ at time $t$ is selected according to an improved system maintenance state transition probability matrix $\pi_t$, and the cluster maintenance state $S_t$ transitions to $S_{t+1}$. An element $\pi$ of $\pi_t$ can be expressed as
$\pi(a \mid X_t) = N(X_t, a)^{1/\tau} / \sum_b N(X_t, b)^{1/\tau}$
where $\tau$ is a temperature parameter that controls the exploration process.
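The formula for $\pi$ translates directly into code. The function name and the visit counts below are hypothetical.

```python
def visit_distribution(visits, tau=1.0):
    """visits: dict mapping action -> root visit count N.
    Returns pi(a) proportional to N(a) ** (1 / tau)."""
    powered = {a: n ** (1.0 / tau) for a, n in visits.items()}
    total = sum(powered.values())
    return {a: x / total for a, x in powered.items()}

visits = {"a": 3, "b": 1}
print(visit_distribution(visits, tau=1.0))
print(visit_distribution(visits, tau=0.5))
```

A smaller $\tau$ sharpens the distribution toward the most visited action, while $\tau = 1$ keeps it proportional to the visit counts.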
The search tree is reused in the next self-improvement operation: the child node reached after each best maintenance action $a_t$ becomes the new root node of the search tree, all branches below that node are retained, and the remaining branches of the previous root node are discarded.
Executing the search traversal of the maintenance strategy solution space 1000 times with the MCTS algorithm finally yields the globally best maintenance action $a_t$ at time $t$ and the improved maintenance state transition probability matrix $\pi_t$, expressed as $\mathrm{MCTS}_{\theta_i}(X_t, p_t, v_t) = (\pi_t, a_t)$.
Example: after a set $(X_t, p_t, v_t)$ has been obtained from Steps 1, 2, and 3, the MCTS algorithm searches the maintenance strategy solution space and obtains an improved $m \times n$ maintenance state transition probability matrix $\pi_t$.
The unit with the largest maintenance state transition probability in $\pi_t$ is selected for maintenance, which gives the globally best maintenance action $a_t$ at time $t$.
In this example, $a_t$ indicates that a maintenance action is executed on unit $u_2$ of node $k_2$ at time $t$; once it is completed, the cluster maintenance state at time $t$ transitions to time $t+1$.
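Choosing the unit with the largest probability and applying the repair can be sketched as follows. The 2 × 2 matrices are hypothetical and were picked so that the selected action is unit $u_2$ of node $k_2$, as in the example; indices in the code are 0-based.

```python
def best_action(pi):
    """Return the 0-based (unit_row, node_col) of the largest entry of pi."""
    cells = ((u, k) for u in range(len(pi)) for k in range(len(pi[0])))
    return max(cells, key=lambda uk: pi[uk[0]][uk[1]])

def apply_action(S, action):
    """Return the next state with the selected unit marked as normal (1)."""
    unit, node = action
    S_next = [row[:] for row in S]
    S_next[unit][node] = 1
    return S_next

pi_t = [[0.1, 0.2],
        [0.1, 0.6]]   # hypothetical improved probabilities
S_t = [[1, 0],
       [1, 0]]        # node k2 has two failed units
a_t = best_action(pi_t)
S_next = apply_action(S_t, a_t)
print(a_t, S_next)
```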
Step 5: Update the adjacency matrix of the complex network according to the change in the cluster maintenance state.
After the self-improvement process of the recovery strategy executes the best maintenance action at time $t-1$, the cluster maintenance state transitions to time $t$. Based on this change, the adjacency matrix of the complex network is updated according to the mapping $f_{S \to A}$ from Step 2, and is expressed as $A(S_t) = (x^t_{ij})_{n \times n}$.
The element $x^t_{ij}$ of the matrix denotes the connection (edge) between node $k_i$ and node $k_j$ at time $t$: $x^t_{ij} = 0$ indicates that there is no edge between the two nodes (destroyed or nonexistent), and $x^t_{ij} = 1$ indicates that there is an edge directed from node $k_i$ to node $k_j$.
Step 6: Calculate and check the recovery degree of the complex network.
After one self-improvement operation of the recovery strategy has been executed (comprising Steps 3, 4, and 5), the recovery degree of the complex network is calculated from the transitioned "node-unit" cluster maintenance state $S$ and its adjacency matrix $A(S)$.
If the recovery requirement is not met, return to Step 3 and continue executing self-improvement operations of the recovery strategy.
If the cluster maintenance state $S_T$ at time $T$ meets the requirement, then one complete self-improvement process of the recovery strategy has been completed through $T$ self-improvement operations, and Steps 7 and 8 are then executed together.
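The loop formed by Steps 3 to 6 can be sketched as follows. The text does not define how the recovery degree is computed, so the metric below (the fraction of the undamaged network's edges that are present again) is an assumption, and `to_adjacency` and `choose_action` are hypothetical stand-ins for the mapping $f_{S \to A}$ and for the network-guided tree search.

```python
def recovery_degree(A, A_star):
    """Assumed metric: share of the edges of A* that are present in A."""
    total = sum(map(sum, A_star))
    kept = sum(a * b for ra, rb in zip(A, A_star) for a, b in zip(ra, rb))
    return kept / total if total else 1.0

def run_self_improvement(S, A_star, to_adjacency, choose_action,
                         target=1.0, max_steps=1000):
    """Repeat repair steps until the recovery requirement is met."""
    actions = []
    while (recovery_degree(to_adjacency(S), A_star) < target
           and len(actions) < max_steps):
        unit, node = choose_action(S)   # stand-in for Steps 3 and 4
        S = [row[:] for row in S]
        S[unit][node] = 1               # state transition; Step 5 is to_adjacency
        actions.append((unit, node))
    return S, actions

# Toy setup: two nodes with one unit each; a failed node loses all its edges.
A_star = [[0, 1], [1, 0]]

def to_adjacency(S):
    return [[A_star[i][j] * S[0][i] * S[0][j] for j in range(2)]
            for i in range(2)]

def first_failed(S):
    return next((u, k) for u, row in enumerate(S)
                for k, x in enumerate(row) if x == 0)

S_T, actions = run_self_improvement([[0, 1]], A_star, to_adjacency, first_failed)
print(S_T, actions)
```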
Step 7: Train the neural network parameters on the reinforcement learning experience parameters.
A reward value $z$ is calculated by a reward function to assess the self-improvement process of the recovery strategy. Using the reward value and the latest reinforcement learning experience parameters generated by the self-improvement process, the SE-ResNet trains its network parameters $\theta$ by gradient descent, with the objectives of minimizing the error between the predicted assessment value $v$ and the reward value $z$ at the end of self-improvement and maximizing the similarity between the prior state transition probability $p$ and the improved state transition probability $\pi$. The loss function can be expressed as
$\mathrm{Loss} = (z - v)^2 - \pi^{T} \log p + c \lVert \theta \rVert^2$
After the network parameters have been trained, a new SE-ResNet is obtained for the next self-improvement process of the recovery strategy. Iteratively training the neural network gives MCTS a better search direction.
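The three terms of the loss can be evaluated for a single training sample as follows. The matrices $\pi$ and $p$ are flattened to lists, the parameter vector `theta` is a stand-in for the network weights, and the default value of `c` is an arbitrary choice made for illustration.

```python
import math

def sirs_loss(z, v, pi, p, theta, c=1e-4):
    """Squared value error + cross-entropy between pi and p + L2 penalty."""
    value_loss = (z - v) ** 2
    # Terms with pi_a == 0 contribute nothing and are skipped to avoid log(0).
    policy_loss = -sum(pi_a * math.log(p_a)
                       for pi_a, p_a in zip(pi, p) if pi_a > 0)
    l2_penalty = c * sum(w * w for w in theta)
    return value_loss + policy_loss + l2_penalty

loss = sirs_loss(z=1.0, v=0.5, pi=[0.5, 0.5], p=[0.5, 0.5],
                 theta=[1.0], c=0.1)
print(loss)
```

The policy term is smallest when $p$ matches $\pi$, which is how the training objective pulls the prior toward the search-improved probabilities.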
Step 8: Generate a complete maintenance and recovery scheme from the self-improvement process of the recovery strategy.
The series of best maintenance actions $\{a_1, a_2, \ldots, a_T\}$ stored during the self-improvement process of the recovery strategy generates a complete maintenance and recovery scheme, which can be expressed as
$\mathrm{Recovery} = f_{\mathrm{Rec}}(a_1, a_2, \ldots, a_T) = 1 \times a_1 + 2 \times a_2 + \cdots + T \times a_T$
The recovery degree of the complex network is calculated from the final cluster maintenance state $S_T$ and its adjacency matrix $A(S_T)$ and is output.
Example: suppose that the initial 10 × 6 "node-unit" cluster maintenance state $S_0$ from Step 1 undergoes 5 maintenance actions $\{a_1, a_2, a_3, a_4, a_5\}$ in total during the self-improvement process of the recovery strategy. The resulting maintenance and recovery scheme is $\mathrm{Recovery} = 1 \times a_1 + 2 \times a_2 + 3 \times a_3 + 4 \times a_4 + 5 \times a_5$.
This scheme indicates that, following the maintenance sequence, maintenance actions are executed on the following units in order: unit $u_2$ of node $k_7$, unit $u_6$ of node $k_3$, unit $u_6$ of node $k_7$, unit $u_5$ of node $k_2$, and unit $u_6$ of node $k_{10}$.
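The weighted sum $1 \times a_1 + 2 \times a_2 + \cdots + T \times a_T$ can be sketched by treating each action as a 0-1 matrix with a single 1 at the repaired unit, so that every nonzero entry of the result records the step at which that unit is repaired. Laying the matrix out with units as rows and nodes as columns is an assumption; the five actions are those of the example above.

```python
def recovery_scheme(actions, num_units, num_nodes):
    """actions: 1-based (unit, node) pairs in execution order.
    Returns a matrix whose nonzero entries are the repair time steps."""
    R = [[0] * num_nodes for _ in range(num_units)]
    for t, (unit, node) in enumerate(actions, start=1):
        R[unit - 1][node - 1] += t
    return R

# (unit, node) pairs: u2 of k7, u6 of k3, u6 of k7, u5 of k2, u6 of k10.
actions = [(2, 7), (6, 3), (6, 7), (5, 2), (6, 10)]
R = recovery_scheme(actions, num_units=6, num_nodes=10)
for row in R:
    print(row)
```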

Claims (9)

1. A self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning, characterized in that it comprises the following steps:
Step 1: Establish the cluster maintenance state matrix of the complex network based on local damage: according to the damage information, establish the 0-1 maintenance state matrix of the "node-unit" cluster of the complex network.
Step 2: Generate the complex network adjacency matrix based on the initial cluster maintenance state: considering the mapping relation between the maintenance state matrix and the adjacency matrix, generate the complex network adjacency matrix from the initial cluster maintenance state.
Step 3: Predict the prior maintenance state transition probability of the cluster based on a neural network model: design an SE-ResNet neural network to predict the prior maintenance state transition probability and the prior maintenance strategy value of the "node-unit" cluster.
Step 4: Traverse the maintenance strategy solution space of the cluster based on the Monte Carlo tree search algorithm: traverse the maintenance strategy solution space to obtain the improved maintenance state transition probability matrix, and select the globally best maintenance action at the current time accordingly.
Step 5: Update the complex network adjacency matrix based on the change in the cluster maintenance state.
Step 6: Calculate and check the recovery degree of the complex network: calculate and check the recovery degree based on the cluster maintenance state and the adjacency matrix of the complex network.
Step 7: Train the neural network parameters based on reinforcement learning experience parameters: using the latest set of reinforcement learning experience parameters generated by the recovery strategy self-improvement process, train the neural network parameters by gradient descent.
Step 8: Generate a complete maintenance recovery scheme based on the recovery strategy self-improvement process: generate a complete maintenance recovery scheme from the series of best maintenance actions stored during the recovery strategy self-improvement process.
Through the above steps, a self-improving recovery strategy method based on improved reinforcement learning is provided, which can solve the recovery strategy problem of cluster maintenance for a complex network in a locally damaged state.
2. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "establishing the cluster maintenance state matrix of the complex network based on local damage" described in Step 1, the recovery problem of a complex network in a locally damaged state is treated as a multi-node cluster maintenance problem, and the 0-1 maintenance state matrix of the "node-unit" cluster of the complex network is established according to the damage information.
First, the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ of the complex network is constructed (where $n$ is the number of nodes). The composition of each node is decomposed to establish its unit set $U = \{u_1, u_2, \ldots, u_i, \ldots, u_j, \ldots, u_m\}$. On this basis, an $m \times n$ "node-unit" matrix is established, and its elements are assigned "0" or "1" according to the damage information, forming the maintenance state matrix $S$.
3. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "generating the complex network adjacency matrix based on the initial cluster maintenance state" described in Step 2, a complex network is abstracted as a graph $G = (K, E)$ composed of the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ and the connection (edge) set $E$. The connections (edges) between the $n$ nodes of the complex network are described by an $n \times n$ adjacency matrix $A$, and self-loops are not considered. When all units in the complex network are normal, the adjacency matrix is denoted $A^*$.
The unit set $U_i = \{u_1, u_2, \ldots, u_m\}$ of node $k_i$ is divided into three classes of unit sets (classes A, B, and C). A class is regarded as failed when all units in its unit set are faulty units within the damaged region; the same description applies to each of the three classes. Based on this classification, and taking node $k_i$ as an example, a mapping relation $f_{S \to A}$ from the elements of the maintenance state matrix $S$ to the elements of the adjacency matrix $A$ is assumed.
This relation means that when all class A units of node $k_i$ are damaged, all edges associated with the node are disconnected; when all class B units of node $k_i$ are damaged, the edges directed from this node to the other nodes are disconnected; and when all class C units of node $k_i$ are damaged, the edges directed from the other nodes to this node are disconnected. Based on the initial maintenance state of the complex network, the adjacency matrix $A$ of the initial maintenance state can be generated through the mapping relation $f_{S \to A}$.
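The three disconnection rules can be sketched as a function from a maintenance state matrix to an adjacency matrix. This is only an illustration under stated assumptions: rows of `S` index nodes and columns index units, the value 1 means "normal" and 0 means "damaged", `A[i, j] = 1` denotes a directed edge from node i to node j, and the assignment of units to classes A, B, and C (the `classes` dictionary) is hypothetical, since the text does not fix it.

```python
import numpy as np

def adjacency_from_state(S, A_star, classes):
    """Apply the three disconnection rules to the intact adjacency matrix A_star."""
    A = A_star.copy()

    def all_damaged(i, cls):
        cols = classes[cls]
        # A class counts as failed only if it has units and none of them is normal.
        return len(cols) > 0 and not S[i, cols].any()

    for i in range(S.shape[0]):
        if all_damaged(i, "A"):   # every edge touching node i is cut
            A[i, :] = 0
            A[:, i] = 0
        if all_damaged(i, "B"):   # edges from node i to other nodes are cut
            A[i, :] = 0
        if all_damaged(i, "C"):   # edges from other nodes to node i are cut
            A[:, i] = 0
    return A

# Toy network: 3 fully connected nodes, 3 units per node, one unit per class.
A_star = np.ones((3, 3), dtype=int) - np.eye(3, dtype=int)
classes = {"A": [0], "B": [1], "C": [2]}
S = np.ones((3, 3), dtype=int)
S[0, 1] = 0                       # the class B unit of node 0 is damaged
A = adjacency_from_state(S, A_star, classes)
print(A)
```

In this toy case only the outgoing edges of node 0 disappear, while its incoming edges remain. The same function can be called again after each maintenance action to refresh the adjacency matrix.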
4. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "predicting the prior maintenance state transition probability of the cluster based on a neural network model" described in Step 3, a squeeze-and-excitation residual network (Squeeze-and-Excitation Residual Networks, SE-ResNet) is designed to predict the prior maintenance state transition probability matrix $p$ and the prior cluster maintenance strategy value $v$ of the "node-unit" cluster.
Neural network input feature map $X$: comprises the current "node-unit" cluster maintenance state $S$, the most recent historical cluster maintenance states in the maintenance strategy iteration process (for example, a 7-step history of cluster maintenance states), and the complex network adjacency matrices $A(S)$ and $A^*$.
Neural network output: comprises a prior cluster maintenance state transition probability $p$ of the "node-unit" cluster and a prior cluster maintenance strategy value $v$.
Selected neural network structure: comprises convolution modules, residual modules, squeeze-and-excitation (SE) modules, ReLU function modules, and so on. The neural network is expressed as $f_\theta(X) = (p, v)$.
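To make the squeeze-and-excitation (SE) module concrete, the following NumPy sketch shows the generic SE operation on a feature map: global average pooling per channel ("squeeze"), two small fully connected layers with ReLU and sigmoid ("excitation"), and channel-wise rescaling. The channel count, reduction ratio, feature map size, and random weights are all illustrative assumptions; the text does not specify the layer sizes used in the patented network.

```python
import numpy as np

def se_block(x, w1, w2):
    """Generic SE operation. x: (C, H, W); w1: (C // r, C); w2: (C, C // r)."""
    squeeze = x.mean(axis=(1, 2))                  # one scalar per channel
    hidden = np.maximum(w1 @ squeeze, 0.0)         # bottleneck layer with ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1)
    return x * scale[:, None, None]                # reweight each channel

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 10, 6))    # hypothetical: 8 channels over a 10 x 6 map
w1 = rng.normal(size=(2, 8))       # hypothetical reduction ratio r = 4
w2 = rng.normal(size=(8, 2))
y = se_block(x, w1, w2)
print(y.shape)
```

In an SE-ResNet block, the rescaled output of the convolutional branch is typically added back to the block input through the residual connection.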
5. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "traversing the maintenance strategy solution space of the cluster based on the Monte Carlo tree search algorithm" described in Step 4, an iterative system for maintenance strategy self-improvement is built with the goals of increasing the degree of performance recovery of the complex network "node-unit" cluster and reducing the recovery time. A reinforcement learning framework based on an improved weighted MCTS algorithm is designed to solve for the optimal maintenance strategy.
The MCTS algorithm uses the maintenance prediction result $p$ of the SE-ResNet from Step 3 as its search weight, which avoids the combinatorial explosion that arises when the cluster maintenance strategy solution space is searched globally and directly; a local search of the solution space guided by the prior probability $p$ can still obtain the globally optimal maintenance strategy. The improved maintenance state transition probability matrix $\pi$ is obtained from the tree search, one globally best maintenance action $a$ is executed, and the current "node-unit" cluster maintenance state $S$ transitions to the cluster maintenance state at the next time step. The MCTS algorithm is expressed as $MCTS_\theta(X, p, v) = (\pi, a)$.
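The text states only that the prior $p$ acts as the search weight and that $\pi$ comes out of the tree search; it does not give the selection formula. As one plausible reading, the sketch below uses a PUCT-style rule of the kind used in AlphaGo Zero-like searches, together with a visit-count-based improved policy. The functions `puct_select` and `improved_policy`, the constant `c_puct`, and the temperature `tau` are assumptions for illustration, not the patent's stated method.

```python
import numpy as np

def puct_select(Q, N, p, c_puct=1.0):
    """Pick the child action maximizing Q + U, with U weighted by the prior p."""
    U = c_puct * p * np.sqrt(N.sum() + 1.0) / (1.0 + N)
    return int(np.argmax(Q + U))

def improved_policy(N, tau=1.0):
    """Turn root visit counts into an improved distribution pi."""
    weights = np.asarray(N, dtype=float) ** (1.0 / tau)
    return weights / weights.sum()

p = np.array([0.2, 0.5, 0.3])      # hypothetical prior over three actions
Q = np.zeros(3)                    # mean action values, nothing explored yet
N = np.zeros(3)                    # visit counts
first = puct_select(Q, N, p)       # with no visits, the prior decides
pi = improved_policy(np.array([1, 3, 0]))
print(first, pi)
```

Because the exploration term shrinks as an action's visit count grows, the search concentrates on actions the network rates highly while still revisiting less-explored ones, which is how a prior-guided local search avoids enumerating the whole solution space.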
6. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "updating the complex network adjacency matrix based on the change in the cluster maintenance state" described in Step 5, after the best maintenance action at a given time in the recovery strategy self-improvement process is executed, the cluster maintenance state transitions to the next time step; based on this change in the cluster maintenance state, the complex network adjacency matrix is updated according to the mapping relation $f_{S \to A}$ of Step 2.
7. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "calculating and checking the recovery degree of the complex network" described in Step 6, after one recovery strategy self-improvement operation (comprising Step 3, Step 4, and Step 5) is executed, the recovery degree of the complex network is calculated from the "node-unit" cluster maintenance state $S$ after the transition and its adjacency matrix $A(S)$.
If the recovery requirement is not met, the method returns to Step 3 and continues to execute the recovery strategy self-improvement operation.
If the cluster maintenance state meets the recovery requirement after the $T$-th self-improvement operation, one complete recovery strategy self-improvement process has been completed through $T$ self-improvement operations, and Step 7 and Step 8 are then executed simultaneously.
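The control flow of this claim (repeat the self-improvement operation until the recovery requirement is met) can be sketched as a simple loop. The callables `select_action`, `apply_action`, and `recovery_degree`, the threshold `required`, and the `max_steps` guard are hypothetical placeholders: in the method itself they correspond to Steps 3 to 5 and to the recovery degree computed from $S$ and $A(S)$, none of which are implemented here.

```python
def self_improvement_episode(S0, select_action, apply_action, recovery_degree,
                             required=1.0, max_steps=100):
    """Run self-improvement operations until the recovery requirement is met."""
    S, actions = S0, []
    for _ in range(max_steps):
        a = select_action(S)          # stands in for Step 3 + Step 4
        S = apply_action(S, a)        # state transition (and Step 5 update)
        actions.append(a)
        if recovery_degree(S) >= required:   # Step 6 check
            break
    return S, actions

# Toy stand-ins: the state is a tuple of 0/1 unit flags (1 assumed normal),
# the policy repairs the first damaged unit, and the recovery degree is the
# fraction of normal units.
S0 = (0, 1, 0, 1)
S_T, actions = self_improvement_episode(
    S0,
    select_action=lambda S: S.index(0),
    apply_action=lambda S, a: S[:a] + (1,) + S[a + 1:],
    recovery_degree=lambda S: sum(S) / len(S),
)
print(S_T, actions)
```

The returned action list plays the role of the stored sequence of best maintenance actions, and its length is the number $T$ of self-improvement operations.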
8. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "training the neural network parameters based on reinforcement learning experience parameters" described in Step 7, a reward value $z$ is calculated by a reward function to evaluate the recovery strategy self-improvement process. Based on the reward value and the latest $T$ sets of reinforcement learning experience parameters generated by the recovery strategy self-improvement process, the SE-ResNet takes as its objectives minimizing the error between the predicted value $v$ and the reward value $z$ at the end of self-improvement, and maximizing the similarity between the prior state transition probability $p$ and the improved state transition probability $\pi$. The network parameters $\theta$ are trained by gradient descent, yielding a new SE-ResNet for the next recovery strategy self-improvement process. Iteratively training the neural network provides MCTS with a better search direction.
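One common way to combine the two training objectives named in this claim is a squared value error plus a cross-entropy term between $\pi$ and $p$, as in AlphaGo Zero-style training. The sketch below assumes that form; the claim itself names only the two objectives, so the exact loss, the function name `self_improvement_loss`, and the smoothing constant `eps` are assumptions, and any weight regularization is omitted.

```python
import numpy as np

def self_improvement_loss(p, v, pi, z, eps=1e-12):
    """Assumed loss: squared value error plus cross-entropy between pi and p."""
    value_error = (z - v) ** 2                     # small when v predicts z well
    policy_error = -np.sum(pi * np.log(p + eps))   # small when p is close to pi
    return value_error + policy_error

p = np.array([0.5, 0.5])     # hypothetical prior over two actions
pi = np.array([0.5, 0.5])    # hypothetical improved policy from the search
good = self_improvement_loss(p, v=1.0, pi=pi, z=1.0)
bad = self_improvement_loss(p, v=0.0, pi=pi, z=1.0)
print(good, bad)
```

Minimizing the cross-entropy term over $p$ pulls the prior toward the search result $\pi$, which is one way to express "maximizing similarity"; gradient descent on $\theta$ would be applied to this scalar averaged over the $T$ stored experience tuples.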
9. The self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "generating a complete maintenance recovery scheme based on the recovery strategy self-improvement process" described in Step 8, the series of best maintenance actions $\{a_1, a_2, \ldots, a_T\}$ stored during the recovery strategy self-improvement process is used to generate a complete maintenance recovery scheme, which can be expressed as
$$\mathrm{Recovery} = f_{Rec}(a_1, a_2, \ldots, a_T) = 1 \times a_1 + 2 \times a_2 + \cdots + T \times a_T$$
The recovery degree of the complex network is calculated and output from the final cluster maintenance state $S_T$ and its adjacency matrix $A(S_T)$.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810375758.1A CN108573303A (en) 2018-04-25 2018-04-25 A self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810375758.1A CN108573303A (en) 2018-04-25 2018-04-25 A self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning

Publications (1)

Publication Number Publication Date
CN108573303A (en) 2018-09-25

Family

ID=63575205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810375758.1A Pending CN108573303A (en) 2018-04-25 2018-04-25 A self-improving recovery strategy method for local damage in complex networks based on improved reinforcement learning

Country Status (1)

Country Link
CN (1) CN108573303A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180925