CN108573303A - Self-improvement recovery strategy method for local failures of complex networks based on improved reinforcement learning - Google Patents
- Publication number
- CN108573303A CN201810375758.1A
- Authority
- CN
- China
- Prior art keywords
- maintenance state
- cluster
- node
- recovery
- complex network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The invention discloses a self-improvement recovery strategy method for local failures of complex networks based on improved reinforcement learning, which addresses the problem of generating a recovery strategy when a complex network is restored by cluster maintenance. The steps are as follows: (1) establish the cluster maintenance-state matrix of the complex network from the local failure information; (2) generate the adjacency matrix of the complex network from the initial cluster maintenance state; (3) predict the prior maintenance-state transition probability and the maintenance policy value of the cluster with a neural network model; (4) traverse the solution space of cluster maintenance policies with a Monte Carlo tree search algorithm and select the globally best maintenance action for the current time; (5) update the adjacency matrix of the complex network according to the change in the cluster maintenance state; (6) calculate and check the degree of recovery of the complex network from the cluster maintenance state and the adjacency matrix; (7) train the neural network parameters on the reinforcement-learning experience data; (8) generate a complete maintenance and recovery scheme from the series of best maintenance actions produced by the self-improvement process of the recovery strategy.
Description
Technical field
The present invention provides a Self-improvement Recovery Strategy (SIRS) method for complex networks in a state of local failure, based on improved reinforcement learning. More specifically, it relates to a recovery strategy method that takes into account the characteristics of the units composing each network node and uses an improved reinforcement learning algorithm to solve, in a self-improving manner, the multi-node cluster maintenance problem of a complex network. It belongs to the field of maintainability engineering.
Background technology
A self-improvement recovery strategy (SIRS) addresses the situation in which, after a local failure occurs in a complex network, multiple nodes concentrated at the damaged location become unavailable, and the network is rapidly restored to an overall available state by cluster maintenance. However, existing research on cluster maintenance, both in China and abroad, generally does not consider timing. As maintainability receives increasing attention, higher requirements are placed on research into cluster maintenance policies for complex networks: an efficient cluster maintenance policy method is needed that fully accounts for the timing of cluster maintenance, the uncertainty of its benefit, and the NP-hard nature of the overall problem.
Based on a neural network model that predicts maintenance-state transition probabilities and on the Monte Carlo Tree Search (MCTS) algorithm, the present invention provides a novel self-improvement recovery strategy (SIRS) method based on improved reinforcement learning, which solves the problem of generating cluster maintenance policies for a complex network in a state of local failure.
Summary of the invention
The purpose of the present invention is to provide a novel self-improvement recovery strategy (SIRS) method for complex networks in a state of local failure. It aims to overcome the shortcomings of conventional cluster maintenance policy methods, which do not fully consider the timing of cluster maintenance, the uncertainty of its benefit, or the NP-hard nature of the overall problem.
The present invention proposes an SIRS method based on a neural network prediction model and the Monte Carlo tree search (MCTS) algorithm. It mainly comprises the following steps:
Step 1: Establish the cluster maintenance-state matrix of the complex network based on the local failure.
The recovery strategy for a local failure of a complex network is treated as a multi-node cluster maintenance problem. First, the node set $K=\{k_1,k_2,\dots,k_i,\dots,k_j,\dots,k_n\}$ of the complex network is constructed (where $n$ is the number of nodes), and each node is decomposed into its constituent units to establish the unit set $U=\{u_1,u_2,\dots,u_m\}$. On this basis, an $m \times n$ "node-unit" matrix is established. According to the local failure information, its elements are assigned "0" for a failed unit awaiting repair within the local failure region and "1" for a normal unit, which forms the maintenance-state matrix $S$.
Step 2: Generate the adjacency matrix of the complex network from the initial cluster maintenance state.
A complex network is abstracted as a graph $G=(K,E)$ composed of the node set $K=\{k_1,k_2,\dots,k_i,\dots,k_j,\dots,k_n\}$ and the connection (edge) set $E$. An $n \times n$ adjacency matrix $A$ describes the connection relations (edges) among the $n$ nodes of the complex network; self-loops are not considered. When all units in the complex network are normal, the adjacency matrix is denoted $A^*$.
The unit set $U_i=\{u_1,u_2,\dots,u_m\}$ of node $k_i$ is divided into three classes of unit sets, referred to here as Class A, Class B, and Class C. A class is entirely damaged when every unit in it is a failed unit within the damaged region. Based on this classification, and taking node $k_i$ as an example, the mapping $f_{S\to A}$ from the elements of the maintenance-state matrix $S$ to the elements of the adjacency matrix $A$ is assumed to act as follows. When all Class A units of node $k_i$ are damaged, all edges associated with the node are disconnected. When all Class B units of node $k_i$ are damaged, the edges directed from the node to the other nodes are disconnected. When all Class C units of node $k_i$ are damaged, the edges directed from the other nodes to the node are disconnected. From the initial maintenance state of the complex network, the mapping $f_{S\to A}$ generates the adjacency matrix $A$ of the initial maintenance state.
Step 3: Predict the prior maintenance-state transition probability of the cluster with a neural network model.
A Squeeze-and-Excitation Residual Network (SE-ResNet) is designed to predict the prior maintenance-state transition probability matrix $p$ and the prior cluster maintenance policy value $v$ of the "node-unit" cluster.
Input feature map $X$ of the neural network: the current "node-unit" cluster maintenance state $S$, the most recent historical cluster maintenance states from the maintenance policy iteration (for example, 7 steps of history), and the adjacency matrices $A(S)$ and $A^*$ of the complex network.
Output of the neural network: a prior cluster maintenance-state transition probability $p$ of the "node-unit" cluster and a prior cluster maintenance policy value $v$.
Selected neural network structure: convolution modules, residual modules, Squeeze-and-Excitation (SE) modules, ReLU function modules, and so on. The neural network is expressed as $f_\theta(X)=(p,v)$.
Step 4: Traverse the solution space of cluster maintenance policies with the Monte Carlo tree search algorithm.
With the goals of raising the degree of performance recovery of the "node-unit" cluster of the complex network and reducing the recovery time, an iterative system in which the maintenance policy improves itself is constructed. A reinforcement learning framework based on an improved weighted MCTS algorithm is designed to solve for the optimal maintenance policy.
The MCTS algorithm uses the prediction $p$ of the SE-ResNet from Step 3 as search weights. This avoids the combinatorial explosion that a direct global search of the cluster maintenance policy solution space would cause, while a local search of the solution space guided by the prior probability $p$ is likewise able to obtain the globally optimal maintenance policy. The tree search yields an improved maintenance-state transition probability matrix $\pi$; one globally best maintenance action $a$ is executed, and the current "node-unit" cluster maintenance state $S$ transitions to the cluster maintenance state of the next time. The MCTS algorithm is expressed as $MCTS_\theta(X,p,v)=(\pi,a)$.
Step 5: Update the adjacency matrix of the complex network according to the change in the cluster maintenance state.
After the best maintenance action at a given time has been executed in the self-improvement process of the recovery strategy, the cluster maintenance state transitions to the next time. Based on this change, the adjacency matrix of the complex network is updated according to the mapping $f_{S\to A}$ of Step 2.
Step 6: Calculate and check the degree of recovery of the complex network.
After one self-improvement operation of the recovery strategy (comprising Steps 3, 4, and 5) has been executed, the degree of recovery of the complex network is calculated from the transitioned "node-unit" cluster maintenance state $S$ and its adjacency matrix $A(S)$.
If the recovery requirement is not met, the method returns to Step 3 and continues to execute self-improvement operations.
If the cluster maintenance state $S_T$ at time $T$ meets the requirement, one complete self-improvement process of the recovery strategy has been completed through $T$ self-improvement operations, and Steps 7 and 8 are then executed simultaneously.
Step 7: Train the neural network parameters on the reinforcement-learning experience data.
A reward value $z$ is calculated by a reward function to evaluate the self-improvement process of the recovery strategy. Using the reward value and the $T$ latest groups of reinforcement-learning experience data generated by that process, the SE-ResNet trains its network parameters $\theta$ by gradient descent. The objectives are to minimize the error between the predicted value $v$ and the reward value $z$ at the end of self-improvement, and to maximize the similarity between the prior state transition probability $p$ and the improved state transition probability $\pi$. This yields a new SE-ResNet for the next self-improvement process. Iteratively training the neural network provides a better search direction for MCTS.
Step 8: Generate a complete maintenance and recovery scheme from the self-improvement process of the recovery strategy.
A complete maintenance and recovery scheme is generated from the series of best maintenance actions $\{a_1,a_2,\dots,a_T\}$ stored during the self-improvement process. The scheme can be expressed as
$\mathrm{Recovery}=f_{Rec}(a_1,a_2,\dots,a_T)=1\times a_1+2\times a_2+\dots+T\times a_T$
The degree of recovery of the complex network is calculated and output from the final cluster maintenance state $S_T$ and its adjacency matrix $A(S_T)$.
Brief description of the drawings
Fig. 1 is the overall architecture block diagram of the method of the present invention.
Fig. 2 is the SE-ResNet prediction model for the prior maintenance-state transition probability in the present invention.
Fig. 3 is the structure of the SE-Residual unit used in the prior maintenance-state transition probability prediction model of the present invention.
Fig. 4 is the flow chart of the Monte Carlo tree search algorithm that traverses the maintenance policy solution space in the present invention.
Detailed description of the embodiments
To make the technical solution, features, and advantages of the present invention easier to understand, a detailed description is given below with reference to the accompanying drawings.
The present invention provides a novel self-improvement recovery strategy (SIRS) method that can be applied to the cluster maintenance policy problem of a complex network in a state of local failure. It overcomes the shortcomings of conventional methods, which do not fully consider the timing of cluster maintenance, the uncertainty of its benefit, or the NP-hard nature of the overall problem.
The overall architecture of the present invention is shown in Fig. 1. The specific implementation steps are as follows:
Step 1: Establish the cluster maintenance-state matrix of the complex network based on the local failure.
The recovery strategy for a local failure of a complex network is treated as a multi-node cluster maintenance problem. First, the node set $K=\{k_1,k_2,\dots,k_i,\dots,k_j,\dots,k_n\}$ of the complex network is constructed (where $n$ is the number of nodes), and each node is decomposed into its constituent units to establish the unit set $U=\{u_1,u_2,\dots,u_m\}$. On this basis, an $m \times n$ "node-unit" matrix is established. According to the local failure information, its elements are assigned "0" for a failed unit awaiting repair within the local failure region and "1" for a normal unit, which forms the maintenance-state matrix $S$.
When the self-improvement process of the recovery strategy has proceeded to time $t$, the maintenance-state matrix of the "node-unit" cluster is written $S_t=[s^t_{ij}]_{m\times n}$. The element $s^t_{mn}$ denotes the maintenance state of unit $u_m$ in node $k_n$ at time $t$: $s^t_{mn}=1$ indicates that the unit is normal, and $s^t_{mn}=0$ indicates that the unit is a failed unit awaiting repair within the local failure region.
Example: Suppose the object of analysis is a complex network of 10 nodes, each containing 6 units. In the "node-unit" cluster maintenance-state matrix $S_0$ at the initial time, the element $s^0_{11}=1$ indicates that unit $u_1$ of node $k_1$ is normal at the initial time, and the element $s^0_{15}=0$ indicates that unit $u_1$ of node $k_5$ is a failed unit awaiting repair within the local failure region.
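As a minimal sketch of Step 1, the 0-1 "node-unit" matrix S can be built in plain Python. The function name `build_state_matrix` and the list of failed units are illustrative assumptions; only the 10-node, 6-unit size and the failure of unit u1 in node k5 come from the example above.

```python
def build_state_matrix(n_nodes, n_units, failed):
    """Return the m x n 0-1 matrix S: rows are units, columns are nodes."""
    S = [[1] * n_nodes for _ in range(n_units)]  # 1 = normal unit
    for node, unit in failed:                    # 1-based (node, unit) pairs
        S[unit - 1][node - 1] = 0                # 0 = failed unit awaiting repair
    return S


# Hypothetical failure set; only (node 5, unit 1) is taken from the example.
S0 = build_state_matrix(n_nodes=10, n_units=6, failed=[(5, 1), (5, 2), (6, 1)])
```

Rows index units and columns index nodes, which matches the m × n "node-unit" layout described in Step 1.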
Step 2: Generate the adjacency matrix of the complex network from the initial cluster maintenance state.
A complex network is abstracted as a graph $G=(K,E)$ composed of the node set $K=\{k_1,k_2,\dots,k_i,\dots,k_j,\dots,k_n\}$ and the connection (edge) set $E$. An $n \times n$ adjacency matrix $A$ describes the connection relations (edges) among the $n$ nodes of the complex network; self-loops are not considered. When all units in the complex network are normal, the adjacency matrix is denoted $A^*$.
The unit set $U_i=\{u_1,u_2,\dots,u_m\}$ of node $k_i$ is divided into three classes of unit sets, referred to here as Class A, Class B, and Class C. A class is entirely damaged when every unit in it is a failed unit within the damaged region. Based on this classification, and taking node $k_i$ as an example, the mapping $f_{S\to A}$ from the elements of the maintenance-state matrix $S$ to the elements of the adjacency matrix $A$ is assumed to act as follows. When all Class A units of node $k_i$ are damaged, all edges associated with the node are disconnected. When all Class B units of node $k_i$ are damaged, the edges directed from the node to the other nodes are disconnected. When all Class C units of node $k_i$ are damaged, the edges directed from the other nodes to the node are disconnected.
From the initial maintenance state of the complex network, the mapping $f_{S\to A}$ generates the adjacency matrix of the initial maintenance state, written $A=[x_{ij}]_{n\times n}$. The element $x_{ij}$ ($i,j=1,2,\dots,n$; $i\ne j$) denotes the connection relation (edge) between node $k_i$ and node $k_j$: $x_{ij}=0$ indicates that there is no edge between the two nodes (it is damaged or does not exist), and $x_{ij}=1$ indicates that there is an edge directed from node $k_i$ to node $k_j$. When all units in the complex network are normal, the adjacency matrix $A^*$ is generated in the same way.
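The three disconnection rules of the mapping from S to A can be sketched as follows. This is only an illustration under stated assumptions: the function name `apply_mapping`, the use of one class partition shared by every node, and the three-node example network are all hypothetical, since the source does not give the concrete partition.

```python
def apply_mapping(A_star, S, classes):
    """Derive A from A* and S following the Class A/B/C disconnection rules.

    classes maps "A", "B", "C" to lists of 0-based unit (row) indices; using
    the same partition for every node is an assumption of this sketch.
    """
    n = len(A_star)
    A = [row[:] for row in A_star]

    def all_failed(node, units):
        return bool(units) and all(S[u][node] == 0 for u in units)

    for i in range(n):
        a_down = all_failed(i, classes["A"])
        cut_out = a_down or all_failed(i, classes["B"])  # edges k_i -> others
        cut_in = a_down or all_failed(i, classes["C"])   # edges others -> k_i
        for j in range(n):
            if cut_out:
                A[i][j] = 0
            if cut_in:
                A[j][i] = 0
    return A


# Three fully connected nodes with three units each (one unit per class).
A_star = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
S = [[1, 1, 1],   # unit u1 (Class A): normal everywhere
     [0, 1, 1],   # unit u2 (Class B): failed in node k1
     [1, 1, 0]]   # unit u3 (Class C): failed in node k3
A = apply_mapping(A_star, S, {"A": [0], "B": [1], "C": [2]})
```

In this small case node k1 loses its outgoing edges and node k3 loses its incoming edges, so only the edges k2→k1, k3→k1, and k3→k2 remain.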
Example: If node $k_i$ of the complex network established in Step 1 is connected only to the nodes in the set $\{k_{i-2},k_{i-1},k_{i+1},k_{i+2}\}$, then in the adjacency matrix $A^*$ the element $x_{ij}$ equals 1 only for $j\in\{i-2,i-1,i+1,i+2\}$ and equals 0 otherwise.
Suppose the unit set $U_i=\{u_1,u_2,\dots,u_6\}$ of node $k_i$ is divided into the three classes of unit sets. Applying the mapping $f_{S\to A}$ to the initial maintenance state of Step 1 then gives the adjacency matrix of the complex network at the initial time.
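The band-shaped matrix A* of this example can be generated with a few lines of Python. The helper name `band_adjacency` is hypothetical, and the sketch assumes that node indices do not wrap around at the two ends of the node list, which the source does not state.

```python
def band_adjacency(n, reach=2):
    """A* in which k_i is linked only to k_{i-2}, k_{i-1}, k_{i+1}, k_{i+2}."""
    A_star = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and abs(i - j) <= reach:  # no self-loops, no wrap-around
                A_star[i][j] = 1
    return A_star


A_star = band_adjacency(10)
```

Interior nodes have four outgoing edges, while the first and last nodes have two under the no-wrap assumption.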
Step 3: Predict the prior maintenance-state transition probability of the cluster with a neural network model.
A Squeeze-and-Excitation Residual Network (SE-ResNet) is designed to predict the prior maintenance-state transition probability matrix $p$ and the prior cluster maintenance policy value $v$ of the "node-unit" cluster.
(1) Neural network input:
The input feature map $X$ of the neural network comprises the "node-unit" cluster maintenance state $S_t$ at time $t$, the most recent historical cluster maintenance states from the maintenance policy iteration, and the adjacency matrices $A(S_t)$ and $A^*$ of the complex network. Taking 7 steps of historical cluster maintenance states as an example, the input feature map at time $t$ is
$X_t=[S_t,S_{t-1},\dots,S_{t-7},A(S_t),A^*]$
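A minimal sketch of assembling the input X_t as an ordered list of matrices is given below. The name `build_input` and the padding rule (repeating the oldest known state when fewer than eight states exist) are assumptions of this sketch; the source does not say how early time steps are handled or how the m × n and n × n planes are combined into one tensor.

```python
def build_input(history, A_t, A_star, steps=7):
    """Return [S_t, S_{t-1}, ..., S_{t-steps}, A(S_t), A*] as a list of planes.

    history holds at least one state, oldest first, ending with S_t.
    """
    recent = list(reversed(history[-(steps + 1):]))  # S_t comes first
    while len(recent) < steps + 1:
        recent.append(recent[-1])                    # pad with the oldest state
    return recent + [A_t, A_star]


S_old = [[1, 0]]   # one unit, two nodes: the unit of node k2 has failed
S_new = [[1, 1]]   # the unit has been repaired
X = build_input([S_old, S_new], A_t=[[0, 1], [1, 0]], A_star=[[0, 1], [1, 0]])
```

With 7 history steps the list holds 8 state planes followed by the two adjacency planes.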
(2) Neural network output:
The output comprises a prior cluster maintenance-state transition probability $p$ of the "node-unit" cluster and a prior cluster maintenance policy value $v$.
a) The prior cluster maintenance-state transition probability of the "node-unit" cluster at time $t$ is denoted $p_t=[p^t_{ij}]_{m\times n}$, where the element $p^t_{mn}$ is the probability that a maintenance action is executed on unit $m$ of node $n$ at time $t$.
b) The prior cluster maintenance policy value $v_t$ is a normalized parameter: the predicted assessment of how well the cluster maintenance state at time $t$ meets the required degree of recovery.
(3) Neural network structure:
The selected SE-ResNet structure comprises convolution modules, residual modules, Squeeze-and-Excitation (SE) modules, ReLU function modules, and so on.
Example: The designed deep neural network is shown in Fig. 2. The input feature map $X_t$ is processed by a deep SE-Residual tower, which consists of a single convolution module followed by an intermediate-layer module that stacks multiple SE-Residual units:
A) Single convolution module:
1. A convolutional layer of 256 filters of size 3 × 3, with stride 1;
2. A ReLU function layer;
B) Intermediate-layer module: The intermediate layers of the deep neural network are built by stacking SE-Residual units (for example, 19 stacked SE-Residual units). The structure of an SE-Residual unit is shown in Fig. 3 and comprises:
1. Residual module: contains a convolutional layer of $c$ filters that outputs a feature map of size $w\times h\times c$, where $c$ is the depth of the feature map (for example, 256 filters);
2. Squeeze module: consists of a global average pooling layer;
3. Excitation module: a bottleneck structure of two fully connected layers linked by a ReLU function; the dimensionality-reduction ratio $r$ of the first fully connected layer is usually set to 16;
4. Normalization module: a Sigmoid function produces normalized weights between 0 and 1;
5. Reweight module: applies the normalized weights to each channel of the feature map;
Note: In Fig. 3, when the SE module is embedded in the residual module, it is connected in parallel with the channel output of the convolutional layer, so that the feature map output by the convolutional layer on the branch is recalibrated before the addition operation of the residual module.
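The squeeze, excitation, normalization, and reweight stages listed for the SE-Residual unit can be illustrated on a tiny feature map in plain Python. This is a sketch rather than the patented network: the function name `se_reweight`, the channel-first layout, the omission of biases, and the toy weights are all assumptions made for illustration.

```python
import math


def se_reweight(feature, w1, w2):
    """Apply squeeze, excitation, sigmoid and reweight to a c x h x w map.

    w1 has shape (c/r) x c and w2 has shape c x (c/r); biases are omitted.
    """
    # Squeeze: global average pooling gives one number per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    # Excitation: bottleneck of two fully connected layers with ReLU between.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in w1]
    logits = [sum(w * x for w, x in zip(row, hidden)) for row in w2]
    # Normalization: the sigmoid maps each channel weight into (0, 1).
    scale = [1.0 / (1.0 + math.exp(-s)) for s in logits]
    # Reweight: multiply every element of a channel by its weight.
    return [[[scale[k] * x for x in row] for row in ch]
            for k, ch in enumerate(feature)]


feature = [[[1.0, 2.0], [3.0, 4.0]],   # channel 0
           [[2.0, 2.0], [2.0, 2.0]]]   # channel 1
out = se_reweight(feature, w1=[[0.0, 0.0]], w2=[[0.0], [0.0]])
```

With all-zero toy weights every channel weight is sigmoid(0) = 0.5, so the output is the input scaled by one half; trained weights would scale each channel differently.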
The output of the deep SE-Residual tower is fed to two parts, a policy module and a value module:
C) Policy module:
1. A convolutional layer of 2 filters of size 1 × 1, with stride 1;
2. A ReLU function layer;
3. A fully connected output layer: outputs a feature map of size $m\times n$, corresponding to the logits of $p_t$ for the "node-unit" cluster;
D) Value module:
1. A convolutional layer of 1 filter of size 1 × 1, with stride 1;
2. A linear fully connected layer of size 256;
3. A ReLU function layer;
4. A linear fully connected layer;
5. A tanh output layer: outputs a scalar value in the interval [-1, 1].
The deep SE-ResNet prediction model described above is expressed as $f_{\theta_i}(X_t)=(p_t,v_t)$, where $\theta_i$ denotes the network parameters of the deep SE-ResNet prediction model in the $i$-th self-improvement process; the initial network parameters $\theta_0$ are obtained by random initialization.
Step 4: Traverse the solution space of cluster maintenance policies with the Monte Carlo tree search algorithm.
With the goals of raising the degree of performance recovery of the "node-unit" cluster of the complex network and reducing the recovery time, an iterative system in which the maintenance policy improves itself is constructed. A reinforcement learning framework based on an improved weighted MCTS algorithm is designed to solve for the optimal maintenance and recovery policy.
The MCTS algorithm uses the prediction $p_t$ of the SE-ResNet from Step 3 as search weights. This avoids the combinatorial explosion that a direct global search of the cluster maintenance policy solution space would cause, while a local search of the solution space guided by the prior probability $p_t$ is likewise able to obtain the globally optimal maintenance policy. The tree search yields an improved maintenance-state transition probability matrix $\pi_t$; one globally best maintenance action $a_t$ is executed, and the current "node-unit" cluster maintenance state $S$ transitions to the cluster maintenance state of the next time. The MCTS algorithm is expressed as $MCTS_\theta(X_t,p_t,v_t)=(\pi_t,a_t)$.
The cluster maintenance state $S$ serves as a tree node of the MCTS search tree, and each branch $(S,a)$ corresponds to one of the possible next maintenance actions $a\in Action(S)$ at that tree node. Each branch $(S,a)$ stores a set of statistics:
$Data(S,a)=\{N(S,a),W(S,a),Q(S,a),P(S,a)\}$
where $N(S,a)$ is the visit count; $W(S,a)$ is the total action value; $Q(S,a)$ is the mean action value; and $P(S,a)$ is the prior probability of selecting branch $(S,a)$.
Given the maintenance-state input feature map $X_t$, and taking the prior parameters $(p_t,v_t)$ obtained from the SE-ResNet as input, the solution-space search based on the MCTS algorithm is executed as shown in Fig. 4. The search process comprises four main steps:
(1) Selection
First, the maintenance state $S_t$ at time $t$ is chosen as the root node of the search tree, denoted $S_0$. The MCTS search starts at the root node and ends when, at time $L$, it reaches a leaf node $S_L$ at the end of the search tree. At each time $l$ ($1\le l<L$), a maintenance action $a_l$ is selected according to the statistics stored on each branch of the current node $S_l$, namely the action $a$ that maximizes $Q(S_l,a)+U(S_l,a)$.
Here $U(S_l,a)$ is an intermediate variable given by a modified PUCT algorithm, in which $c_{puct}$ is a constant that sets the level of exploration of the MCTS search. This search control strategy initially favors actions $a$ with a high prior probability and a low visit count, but as the search proceeds it increasingly favors actions with a high action value.
(2) Expansion and evaluation
The leaf node $S_L$ is added to a queue, $A(S_L)$ is generated by the mapping $f_{S\to A}$, and the input feature map $X_L$ of the cluster maintenance state corresponding to the leaf node is obtained. $X_L$ is fed to the neural network to obtain the statistics to be stored on the branches $(S_L,a)$ that expand the leaf node. This operation can be expressed as
$f_\theta(X_L)=(p_a,v)$
The search thread remains locked until this operation is complete. When the leaf node $S_L$ is expanded, the statistics stored on each of its branches $(S_L,a)$ are initialized as
$Data(S_L,a)=\{N(S_L,a)=0,\ W(S_L,a)=0,\ Q(S_L,a)=0,\ P(S_L,a)=p_a\}$
(3) Backup
The statistics are propagated back along all branches visited by the search thread, from the leaf node to the root node, and the values stored on the search tree branches are updated. During backup, the visit count stored on branch $(S_l,a_l)$ is incremented once:
$N(S_l,a_l)=N(S_l,a_l)+1$
At the same time, the total action value and the mean action value of branch $(S_l,a_l)$ are also updated once:
$W(S_l,a_l)=W(S_l,a_l)+v$
$Q(S_l,a_l)=W(S_l,a_l)/N(S_l,a_l)$
(4) Execution
By iterating the three operations above, 1000 tree searches are completed. The best maintenance action $a_t$ at time $t$ is then selected according to the improved maintenance-state transition probability matrix $\pi_t$, and the cluster maintenance state $S_t$ transitions to $S_{t+1}$. An element $\pi$ of $\pi_t$ can be expressed as
$\pi(a\mid X_t)=N(X_t,a)^{1/\tau}\big/\sum_b N(X_t,b)^{1/\tau}$
where $\tau$ is a temperature parameter that controls the degree of exploration.
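The conversion of root visit counts into the improved probabilities π can be sketched directly from the formula above. The function name `improved_policy` and the visit counts are illustrative assumptions.

```python
def improved_policy(visits, tau=1.0):
    """Turn visit counts N(X_t, a) into probabilities N^(1/tau) / sum_b N^(1/tau)."""
    powered = {a: n ** (1.0 / tau) for a, n in visits.items()}
    total = sum(powered.values())
    return {a: x / total for a, x in powered.items()}


visits = {"repair_k2_u2": 30, "repair_k7_u6": 10}   # hypothetical counts
pi = improved_policy(visits)                        # tau = 1: proportional
sharp = improved_policy(visits, tau=0.5)            # lower tau: more greedy
```

A temperature of 1 gives probabilities proportional to the visit counts, while a smaller temperature concentrates probability on the most visited action.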
The search tree is reused in the next self-improvement operation: the child node reached after each execution of the best maintenance action $a_t$ becomes the root of the new search tree. All branches below this node are retained, and the remaining branches of the previous root node are discarded.
By executing the MCTS traversal of the maintenance policy solution space 1000 times, the globally best maintenance action $a_t$ at time $t$ and the improved maintenance-state transition probability matrix $\pi_t$ are finally obtained, expressed as $MCTS_\theta(X_t,p_t,v_t)=(\pi_t,a_t)$.
Example: After a set $(X_t,p_t,v_t)$ has been obtained from Steps 1, 2, and 3, the MCTS algorithm searches the maintenance policy solution space and obtains the improved $m\times n$ maintenance-state transition probability matrix $\pi_t$. The unit with the largest maintenance-state transition probability is selected for a maintenance action. In this example, the globally best maintenance action $a_t$ at time $t$ is to execute a maintenance action on unit $u_2$ of node $k_2$, after which the cluster maintenance state at time $t$ transitions to time $t+1$.
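Choosing the unit with the largest probability in π_t and applying the action to the state matrix can be sketched as follows. The helper names, the probability values, and the assumption that a maintenance action always restores the unit to "1" are illustrative and not taken from the source.

```python
def best_action(pi):
    """Return the 1-based (node, unit) with the largest probability in pi."""
    _, unit, node = max((pi[u][k], u, k)
                        for u in range(len(pi)) for k in range(len(pi[0])))
    return node + 1, unit + 1


def apply_action(S, node, unit):
    """Return a copy of S in which the chosen unit has been repaired."""
    S_next = [row[:] for row in S]
    S_next[unit - 1][node - 1] = 1
    return S_next


pi_t = [[0.05, 0.10, 0.05],   # unit u1 of nodes k1..k3 (hypothetical values)
        [0.10, 0.60, 0.10]]   # unit u2 of nodes k1..k3
node, unit = best_action(pi_t)
S_next = apply_action([[1, 1, 1], [1, 0, 1]], node, unit)
```

With these values the selected action repairs unit u2 of node k2, mirroring the example above.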
Step 5: Update the adjacency matrix of the complex network according to the change in the cluster maintenance state.
After the best maintenance action at time $t-1$ has been executed in the self-improvement process of the recovery strategy, the cluster maintenance state transitions to time $t$. Based on this change, the adjacency matrix of the complex network is updated according to the mapping $f_{S\to A}$ of Step 2 and is written $A(S_t)=[x^t_{ij}]_{n\times n}$.
The element $x^t_{ij}$ denotes the connection relation (edge) between node $k_i$ and node $k_j$ at time $t$: $x^t_{ij}=0$ indicates that there is no edge between the two nodes (it is damaged or does not exist), and $x^t_{ij}=1$ indicates that there is an edge directed from node $k_i$ to node $k_j$.
Step 6: Calculate and check the degree of recovery of the complex network.
After one self-improvement operation of the recovery strategy (comprising Steps 3, 4, and 5) has been executed, the degree of recovery of the complex network is calculated from the transitioned "node-unit" cluster maintenance state $S$ and its adjacency matrix $A(S)$.
If the recovery requirement is not met, the method returns to Step 3 and continues to execute self-improvement operations.
If the cluster maintenance state $S_T$ at time $T$ meets the requirement, one complete self-improvement process of the recovery strategy has been completed through $T$ self-improvement operations, and Steps 7 and 8 are then executed simultaneously.
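The source does not define how the degree of recovery is computed, so the following is only a stand-in to make the check in Step 6 concrete. It assumes, hypothetically, that recovery is measured as the fraction of the edges of A* that are present in A(S), and that the requirement is a threshold on that fraction; both the metric and the 0.9 threshold are illustrative.

```python
def recovery_degree(A, A_star):
    """Fraction of the edges of A* that are present in A (assumed metric)."""
    total = sum(map(sum, A_star))
    restored = sum(a * s for row_a, row_s in zip(A, A_star)
                   for a, s in zip(row_a, row_s))
    return restored / total if total else 1.0


def is_recovered(A, A_star, required=0.9):
    """Check the (hypothetical) recovery requirement of Step 6."""
    return recovery_degree(A, A_star) >= required


A_star = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]   # four edges when fully normal
A_now = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]    # two of them currently present
degree = recovery_degree(A_now, A_star)
```

When `is_recovered` returns false the loop would return to Step 3; otherwise Steps 7 and 8 follow.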
Step 7: Train the neural network parameters on the reinforcement-learning experience data.
A reward value $z$ is calculated by a reward function to evaluate the self-improvement process of the recovery strategy. Using the reward value and the latest reinforcement-learning experience data generated by that process, the SE-ResNet trains its network parameters $\theta$ by gradient descent. The objectives are to minimize the error between the predicted value $v$ and the reward value $z$ at the end of self-improvement, and to maximize the similarity between the prior state transition probability $p$ and the improved state transition probability $\pi$. The loss function can be expressed as
$Loss=(z-v)^2-\pi^{T}\log p+c\lVert\theta\rVert^2$
After the network parameters have been trained, a new SE-ResNet is obtained for the next self-improvement process of the recovery strategy. Iteratively training the neural network provides a better search direction for MCTS.
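The loss function can be evaluated term by term in plain Python. The function name `sirs_loss`, the flattened-list representation of π, p, and θ, and the default value of the regularization constant c are assumptions of this sketch; only the three-term form comes from the formula above.

```python
import math


def sirs_loss(z, v, pi, p, theta, c=1e-4):
    """Loss = (z - v)^2 - pi^T log p + c * ||theta||^2 on flattened lists."""
    value_term = (z - v) ** 2
    # Skip entries with pi_i = 0 so that log(0) is never evaluated for them.
    policy_term = -sum(pi_i * math.log(p_i)
                       for pi_i, p_i in zip(pi, p) if pi_i > 0)
    reg_term = c * sum(w * w for w in theta)
    return value_term + policy_term + reg_term


# Perfect value prediction, uniform prior against a one-hot improved policy.
value = sirs_loss(z=1.0, v=1.0, pi=[1.0, 0.0], p=[0.5, 0.5], theta=[])
```

In this toy case only the cross-entropy term is non-zero, so the loss equals log 2; in training, the gradient of this quantity with respect to θ would drive the gradient descent update.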
Step 8: Generate a complete maintenance and recovery scheme from the self-improvement process of the recovery strategy.
A complete maintenance and recovery scheme is generated from the series of best maintenance actions $\{a_1,a_2,\dots,a_T\}$ stored during the self-improvement process. The scheme can be expressed as
$\mathrm{Recovery}=f_{Rec}(a_1,a_2,\dots,a_T)=1\times a_1+2\times a_2+\dots+T\times a_T$
The degree of recovery of the complex network is calculated and output from the final cluster maintenance state $S_T$ and its adjacency matrix $A(S_T)$.
Example: Suppose the 10-node, 6-unit "node-unit" cluster maintenance state $S_0$ of Step 1 undergoes 5 maintenance actions $\{a_1,a_2,a_3,a_4,a_5\}$ in total during the self-improvement process of the recovery strategy. The resulting maintenance and recovery scheme indicates that, in maintenance order, maintenance actions are executed on the following units in turn: unit $u_2$ of node $k_7$, unit $u_6$ of node $k_3$, unit $u_6$ of node $k_7$, unit $u_5$ of node $k_2$, and unit $u_6$ of node $k_{10}$.
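One way to read the weighted sum 1×a_1 + 2×a_2 + … + T×a_T is sketched below, under the assumption that each action a_t is a 0-1 "node-unit" matrix with a single 1 at the repaired unit, so that the sum records at which step each unit is repaired. This interpretation and the function name `recovery_scheme` are assumptions; the action sequence is the one listed in the example.

```python
def recovery_scheme(actions, n_nodes, n_units):
    """Sum t * a_t, where a_t marks the unit repaired at step t with a 1."""
    R = [[0] * n_nodes for _ in range(n_units)]   # rows: units, columns: nodes
    for t, (node, unit) in enumerate(actions, start=1):
        R[unit - 1][node - 1] += t
    return R


# (node, unit) pairs in maintenance order, taken from the example.
actions = [(7, 2), (3, 6), (7, 6), (2, 5), (10, 6)]
R = recovery_scheme(actions, n_nodes=10, n_units=6)
```

Each non-zero entry of the resulting matrix gives the position of that unit in the maintenance sequence, and zero entries mark units that needed no repair.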
Claims (9)
1. A self-improvement recovery strategy method for local failures of complex networks based on improved reinforcement learning, characterized in that it comprises the following steps:
Step 1: Establish the cluster maintenance-state matrix of the complex network based on the local failure: establish the 0-1 maintenance-state matrix of the "node-unit" cluster of the complex network according to the damage information.
Step 2: Generate the adjacency matrix of the complex network from the initial cluster maintenance state: considering the mapping between the maintenance-state matrix and the adjacency matrix, generate the adjacency matrix of the complex network from the initial cluster maintenance state.
Step 3: Predict the prior maintenance-state transition probability of the cluster with a neural network model: design an SE-ResNet neural network to predict the prior maintenance-state transition probability and the prior maintenance policy value of the "node-unit" cluster.
Step 4: Traverse the solution space of cluster maintenance policies with the Monte Carlo tree search algorithm: traverse the maintenance policy solution space, obtain the improved maintenance-state transition probability matrix, and select the globally best maintenance action for the current time accordingly.
Step 5: Update the adjacency matrix of the complex network according to the change in the cluster maintenance state.
Step 6: Calculate and check the degree of recovery of the complex network: calculate and check the degree of recovery from the cluster maintenance state and the adjacency matrix of the complex network.
Step 7: Train the neural network parameters on the reinforcement-learning experience data: train the neural network parameters by gradient descent on the latest group of reinforcement-learning experience data generated by the self-improvement process of the recovery strategy.
Step 8: Generate a complete maintenance and recovery scheme from the self-improvement process of the recovery strategy: generate a complete maintenance and recovery scheme from the series of best maintenance actions stored during the self-improvement process.
Through the above steps, a self-improvement recovery strategy method based on improved reinforcement learning is provided, which can solve the recovery strategy problem of performing cluster maintenance on a complex network in a state of local failure.
2. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "establish the cluster maintenance state matrix of the complex network based on the local damage" of Step 1, the recovery of a complex network from a locally damaged state is treated as a multi-node cluster maintenance problem, and the 0-1 maintenance state matrix of the "node-unit" cluster of the complex network is established from the damage information.
First, the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ of the complex network is constructed, where $n$ is the number of nodes. Each node is then decomposed into its constituent units, giving the unit set $U = \{u_1, u_2, \ldots, u_i, \ldots, u_j, \ldots, u_m\}$. On this basis, an $m \times n$ "node-unit" matrix is established, and its elements are assigned "0" or "1" according to the damage information, forming the maintenance state matrix $S$.
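As an illustration of the construction in claim 2, the following minimal Python sketch builds a 0-1 "node-unit" matrix from a damage report. The function name `build_state_matrix`, the 0-based `(node, unit)` report format, and the convention that 1 means normal and 0 means damaged are all assumptions; the text only states that the entries are assigned "0" or "1" from the damage information.

```python
import numpy as np

def build_state_matrix(n_nodes, n_units, damaged):
    """Return an m x n 0-1 matrix S (rows: units, columns: nodes).

    Assumption: 1 marks a normal unit and 0 a damaged one.
    """
    S = np.ones((n_units, n_nodes), dtype=np.int8)
    for node, unit in damaged:
        S[unit, node] = 0  # mark the reported unit as damaged
    return S

# Hypothetical damage report: (node index, unit index) pairs, 0-based.
damaged_units = [(0, 1), (2, 0), (2, 3)]
S = build_state_matrix(n_nodes=4, n_units=5, damaged=damaged_units)
print(S)
```

Storing units as rows and nodes as columns follows the $m \times n$ shape given in the claim.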
3. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "generate the complex network adjacency matrix from the initial cluster maintenance state" of Step 2, a complex network is abstracted as a graph $G = (K, E)$ composed of the node set $K = \{k_1, k_2, \ldots, k_i, \ldots, k_j, \ldots, k_n\}$ and the connection (edge) set $E$. The connections (edges) among the $n$ nodes of the complex network are described by an $n \times n$ adjacency matrix $A$, and self-loops are not considered. When all units in the complex network are normal, the adjacency matrix is denoted $A^*$.
The unit set $U_i = \{u_1, u_2, \ldots, u_m\}$ of node $k_i$ is divided into three classes of unit sets, referred to below as classes A, B and C. A class is regarded as destroyed when all units in that unit set are faulty units within the damaged region; the other two classes are described in the same way.
Based on this classification, and taking node $k_i$ as an example, the mapping $f_{S \to A}$ from the elements of the maintenance state matrix $S$ to the elements of the adjacency matrix $A$ is assumed to be as follows: when all class-A units of node $k_i$ are destroyed, all edges associated with the node are disconnected; when all class-B units of node $k_i$ are destroyed, the edges pointing from the node to the remaining nodes are disconnected; when all class-C units of node $k_i$ are destroyed, the edges pointing from the remaining nodes to the node are disconnected. From the initial maintenance state of the complex network, the mapping $f_{S \to A}$ generates the adjacency matrix $A$ of the initial maintenance state.
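The three disconnection rules of the mapping $f_{S \to A}$ can be sketched in Python as below. This is only an illustration under stated assumptions: the partition of unit rows into classes A, B and C is hypothetical and shared by every node, `A[i, j] = 1` is taken to mean an edge from node $i$ to node $j$, and 1 in $S$ is taken to mean a normal unit.

```python
import numpy as np

def all_destroyed(S, units, node):
    """True when every listed unit of the node is marked damaged (0)."""
    return not S[units, node].any()

def adjacency_from_state(S, A_star, classes):
    """Apply the three disconnection rules to the intact adjacency A*.

    S: (m, n) 0-1 matrix, rows = units, columns = nodes.
    A_star: (n, n) adjacency with A_star[i, j] = 1 for an edge i -> j.
    classes: dict mapping "A", "B", "C" to lists of unit row indices.
    """
    A = A_star.copy()
    for i in range(S.shape[1]):
        if all_destroyed(S, classes["A"], i):
            A[i, :] = 0  # class A lost: every edge touching node i is cut
            A[:, i] = 0
        if all_destroyed(S, classes["B"], i):
            A[i, :] = 0  # class B lost: edges leaving node i are cut
        if all_destroyed(S, classes["C"], i):
            A[:, i] = 0  # class C lost: edges entering node i are cut
    return A

# Toy network: 3 nodes, fully connected without self-loops, 3 units per node.
A_star = np.ones((3, 3), dtype=int) - np.eye(3, dtype=int)
classes = {"A": [0], "B": [1], "C": [2]}
S = np.ones((3, 3), dtype=int)
S[1, 0] = 0  # node 0 loses its class-B unit
S[2, 2] = 0  # node 2 loses its class-C unit
A = adjacency_from_state(S, A_star, classes)
print(A)
```

The same function can be reused for the update in Step 5, since it recomputes $A$ from the current $S$ and the intact matrix $A^*$.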
4. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "predict the prior maintenance state transition probability of the cluster with a neural network model" of Step 3, a squeeze-and-excitation residual network (Squeeze-and-Excitation Residual Network, SE-ResNet) is designed to predict the prior maintenance state transition probability matrix $p$ and the prior cluster maintenance strategy value $v$ of the "node-unit" cluster.
Neural network input feature map $X$: comprises the current "node-unit" cluster maintenance state $S$, the most recent historical cluster maintenance states from the maintenance strategy iteration (for example, 7 steps of history), and the complex network adjacency matrices $A(S)$ and $A^*$.
Neural network output: comprises a prior cluster maintenance state transition probability $p$ of the "node-unit" cluster and a prior cluster maintenance strategy value $v$.
Selected neural network structure: comprises convolution modules, residual modules, squeeze-and-excitation (SE) modules, ReLU activation modules, and so on. The neural network is expressed as $f_\theta(X) = (p, v)$.
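The claim names the squeeze-and-excitation module but gives no layer sizes. The sketch below shows the forward pass of a generic SE block in NumPy, as an illustration only: the channel count, the reduction ratio `r`, and the random weights are assumptions, and the convolution and residual parts of the SE-ResNet are omitted.

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-excitation over a (C, H, W) feature map.

    w1: (C // r, C) and w2: (C, C // r) are the two fully connected layers.
    """
    squeeze = x.mean(axis=(1, 2))                  # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1)
    return x * scale[:, None, None]                # channel-wise rescaling

# Hypothetical sizes: 8 channels, 5 x 4 spatial grid, reduction ratio 2.
rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 4, 2
x = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
y = se_block(x, w1, w2)
print(y.shape)
```

Each channel is multiplied by a single learned gate, so the block reweights feature channels without changing the shape of the feature map.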
5. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "traverse the maintenance strategy solution space of the cluster with the Monte Carlo tree search algorithm" of Step 4, an iterative system in which the maintenance strategy improves itself is constructed, with the goals of increasing the performance recovery degree of the "node-unit" cluster of the complex network and reducing the recovery time. A reinforcement learning framework based on an improved weighted MCTS algorithm is designed to solve for the optimal maintenance strategy.
The MCTS algorithm uses the prediction $p$ of the SE-ResNet from Step 3 as its search weight, which avoids the combinatorial explosion that a direct global search of the cluster maintenance strategy solution space would encounter; a local search of the solution space guided by the prior probability $p$ can likewise reach the globally optimal maintenance strategy. The tree search yields the improved maintenance state transition probability matrix $\pi$; one globally best maintenance action $a$ is then executed, and the current "node-unit" cluster maintenance state $S$ transitions to the cluster maintenance state of the next time step. The MCTS algorithm is expressed as $MCTS_\theta(X, p, v) = (\pi, a)$.
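The text does not spell out how the prior $p$ weights the search. One common choice, assumed here and not confirmed by the source, is a PUCT-style selection rule in which the exploration bonus of each candidate action is scaled by its prior, with the improved distribution $\pi$ read off from the visit counts. The names `puct_select`, `improved_policy` and `c_puct` are hypothetical.

```python
import numpy as np

def puct_select(Q, N, P, c_puct=1.5):
    """Pick the child maximizing Q + U, where the bonus U is weighted by prior P.

    The +1 under the square root is a variant that keeps the bonus non-zero
    before any child has been visited.
    """
    U = c_puct * P * np.sqrt(N.sum() + 1) / (1.0 + N)
    return int(np.argmax(Q + U))

def improved_policy(N, temperature=1.0):
    """Turn visit counts into the improved distribution pi."""
    counts = N.astype(float) ** (1.0 / temperature)
    return counts / counts.sum()

# Three candidate maintenance actions with a hypothetical prior.
P = np.array([0.1, 0.7, 0.2])
first_choice = puct_select(Q=np.zeros(3), N=np.zeros(3), P=P)
pi = improved_policy(np.array([10, 30, 60]))
print(first_choice, pi)
```

Before any simulation has run, the rule simply follows the prior; as visit counts grow, the averaged values `Q` take over, which is what lets the search restrict itself to a promising part of the solution space.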
6. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "update the complex network adjacency matrix according to the change in the cluster maintenance state" of Step 5, after the recovery strategy self-improvement process executes the best maintenance action at a given time, the cluster maintenance state transitions to the next time step; based on this change in the cluster maintenance state, the complex network adjacency matrix is updated according to the mapping $f_{S \to A}$ of Step 2.
7. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "calculate and check the recovery degree of the complex network" of Step 6, after one recovery strategy self-improvement operation (comprising Step 3, Step 4 and Step 5) has been executed, the recovery degree of the complex network is calculated from the transitioned "node-unit" cluster maintenance state $S$ and its adjacency matrix $A(S)$.
If the recovery requirement is not met, the method returns to Step 3 and continues the recovery strategy self-improvement operation.
If the cluster maintenance state meets the recovery requirement after the $T$-th self-improvement operation, then one complete recovery strategy self-improvement process has been completed in $T$ self-improvement operations, and Step 7 and Step 8 are then executed simultaneously.
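The control flow of claims 6 and 7 (act, update the adjacency matrix, check the recovery degree, repeat) can be sketched as the loop below. The recovery-degree metric is not defined in this part of the text, so the share of restored edges is used as an assumed stand-in; `toy_adjacency` and `first_damaged` are hypothetical placeholders for the mapping $f_{S \to A}$ and for the action selection of Steps 3 and 4.

```python
import numpy as np

def recovery_degree(A, A_star):
    """Assumed metric: share of the intact network's edges present in A."""
    return A.sum() / A_star.sum()

def self_improvement_loop(S, A_star, to_adjacency, choose_action,
                          target=1.0, max_steps=100):
    """Repeat act -> update -> check until the recovery requirement is met."""
    actions = []
    for _ in range(max_steps):
        if recovery_degree(to_adjacency(S, A_star), A_star) >= target:
            break                        # Step 6: requirement met
        unit, node = choose_action(S)    # stands in for Steps 3 and 4
        S = S.copy()
        S[unit, node] = 1                # Step 5: state transition
        actions.append((unit, node))
    return S, actions

def toy_adjacency(S, A_star):
    """Toy mapping: one unit per node, an edge needs both endpoints alive."""
    alive = S[0].astype(bool)
    return A_star * np.outer(alive, alive)

def first_damaged(S):
    """Toy policy: repair the first damaged unit found."""
    unit, node = (int(i) for i in np.argwhere(S == 0)[0])
    return unit, node

A_star = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
S0 = np.array([[1, 0, 0, 1]])
S_final, actions = self_improvement_loop(S0, A_star, toy_adjacency, first_damaged)
print(actions, recovery_degree(toy_adjacency(S_final, A_star), A_star))
```

The length of `actions` corresponds to $T$, the number of self-improvement operations needed to meet the recovery requirement.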
8. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "train the neural network parameters on the reinforcement learning experience" of Step 7, a reward value $z$ is computed by a reward function to evaluate the recovery strategy of the self-improvement process. Using the reward value and the latest $T$ groups of reinforcement learning experience parameters generated by the recovery strategy self-improvement process, the SE-ResNet is trained with two objectives: minimizing the error between the predicted value $v$ and the reward value $z$ obtained when self-improvement terminates, and maximizing the similarity between the prior state transition probability $p$ and the improved state transition probability $\pi$. The network parameters $\theta$ are trained by gradient descent, yielding a new SE-ResNet for the next recovery strategy self-improvement process. Iteratively training the neural network provides MCTS with a better search direction.
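The two training objectives of claim 8 can be combined into a single scalar loss. The form below, a squared value error plus a cross-entropy between $\pi$ and $p$, is an assumption borrowed from common self-play training setups; the source states the objectives but not the exact loss, and any regularization term is omitted here.

```python
import numpy as np

def self_improvement_loss(p, v, pi, z, eps=1e-12):
    """Value error (z - v)^2 plus policy cross-entropy -sum(pi * log p)."""
    value_loss = (z - v) ** 2
    policy_loss = -np.sum(pi * np.log(p + eps))  # small when p matches pi
    return value_loss + policy_loss

# Hypothetical numbers: improved distribution pi from the tree search,
# two candidate priors, predicted value v and terminal reward z.
pi = np.array([0.1, 0.3, 0.6])
p_close = np.array([0.1, 0.3, 0.6])
p_far = np.array([0.6, 0.3, 0.1])
loss_close = self_improvement_loss(p_close, v=0.5, pi=pi, z=1.0)
loss_far = self_improvement_loss(p_far, v=0.5, pi=pi, z=1.0)
print(loss_close, loss_far)
```

A prior that already agrees with the search result gives the lower loss, so gradient descent on $\theta$ pulls $p$ toward $\pi$ and $v$ toward $z$.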
9. The self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning according to claim 1, characterized in that: in "generate a complete maintenance recovery scheme from the recovery strategy self-improvement process" of Step 8, a complete maintenance recovery scheme is generated from the series of best maintenance actions $\{a_1, a_2, \ldots, a_T\}$ stored during the recovery strategy self-improvement process. The maintenance recovery scheme can be expressed as
$$\mathrm{Recovery} = f_{Rec}(a_1, a_2, \ldots, a_T) = 1 \times a_1 + 2 \times a_2 + \cdots + T \times a_T$$
The recovery degree of the complex network is calculated and output from the final cluster maintenance state $S_T$ and its adjacency matrix $A(S_T)$.
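One way to read the weighted sum in claim 9 is as an encoding of the repair order. The sketch below assumes each action $a_t$ is a one-hot "node-unit" matrix marking the unit repaired at step $t$, so the sum holds the step number at each repaired position. That reading, the 6 x 10 matrix size, and the function name `recovery_scheme` are assumptions; the action list reuses the repair sequence stated in the description (unit 2 of node 7, unit 6 of node 3, unit 6 of node 7, unit 5 of node 2, unit 6 of node 10).

```python
import numpy as np

def recovery_scheme(actions, shape):
    """Sum t * a_t over one-hot action matrices a_t, for t = 1..T."""
    scheme = np.zeros(shape, dtype=int)
    for t, (unit, node) in enumerate(actions, start=1):
        a_t = np.zeros(shape, dtype=int)
        a_t[unit, node] = 1      # one-hot matrix for the unit repaired at step t
        scheme += t * a_t
    return scheme

# (unit, node) pairs from the description, converted to 0-based indices.
sequence = [(2, 7), (6, 3), (6, 7), (5, 2), (6, 10)]
actions = [(u - 1, k - 1) for u, k in sequence]
scheme = recovery_scheme(actions, shape=(6, 10))  # assumed 6 units x 10 nodes
print(scheme)
```

Under this reading, a non-zero entry gives the position of that unit in the repair sequence, and zero entries mark units that needed no repair.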
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810375758.1A CN108573303A (en) | 2018-04-25 | 2018-04-25 | Self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108573303A | 2018-09-25 |
Family
ID=63575205
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810375758.1A (Pending) CN108573303A (en) | Self-improving recovery strategy method for local damage of complex networks based on improved reinforcement learning | 2018-04-25 | 2018-04-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108573303A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180925 |