CN115098906B - Bridge maintenance method and system based on deep reinforcement learning and system reliability - Google Patents

Info

Publication number
CN115098906B
Authority
CN
China
Prior art keywords
bridge deck
maintenance
bridge
reliability
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210482833.0A
Other languages
Chinese (zh)
Other versions
CN115098906A (en
Inventor
李惠
徐阳
陈家辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN202210482833.0A
Publication of CN115098906A
Application granted
Publication of CN115098906B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G06F30/13 Architectural design, e.g. computer-aided architectural design [CAAD] related to design of buildings, bridges, landscapes, production plants or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Structural Engineering (AREA)
  • Civil Engineering (AREA)
  • Architecture (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a bridge intelligent maintenance decision-making method and system based on deep reinforcement learning and system reliability, belonging to the technical field of intelligent infrastructure, wherein the method comprises the following steps: constructing a redundancy system model of the whole bridge deck, decomposing the redundancy system model into a series of small-scale local bridge decks, and calculating the reliability probability and reliability index of the whole bridge deck system based on the failure probability of the local bridge decks and the reliability theory of the redundancy system; designing a comprehensive reward function based on maintenance cost and safety cost according to the reliability index so as to establish an integral bridge deck maintenance decision network model based on deep reinforcement learning; and training the integral bridge deck maintenance decision network model based on the deep reinforcement learning until convergence, and inputting the reliability index, the local reliability index and the service time of the bridge deck system into the trained model to obtain a bridge maintenance action result, namely realizing the intelligent bridge maintenance decision based on the deep reinforcement learning.

Description

Bridge maintenance method and system based on deep reinforcement learning and system reliability
Technical Field
The invention relates to the technical field of intelligent infrastructure, in particular to a bridge intelligent maintenance decision method and a bridge intelligent maintenance decision system based on deep reinforcement learning and system reliability.
Background
Failure of a bridge structure can cause severe economic loss, environmental damage and social impact. The orthotropic steel bridge deck system is an important component of the whole bridge. Because it directly bears vehicle loads and various long-term environmental effects, damage such as fatigue and corrosion inevitably occurs during its service life. To ensure the operational safety of the bridge structure during service, reasonable maintenance of the bridge deck system is very important. Maintenance decision-making for the bridge deck system is a comprehensive methodology combining state evaluation, degradation prediction and maintenance scheme optimization; its purpose is to improve system reliability, prevent system failure and reduce maintenance cost.
The initiation and development of fatigue cracks in the deck slab of the bridge as a whole may lead to a deterioration in the load-bearing properties. In order to ensure the safety of the bridge deck system in the service period with the lowest cost, the reliability evaluation of the long-span bridge girder system is very important. Reliability assessment of the decking is also a multi-scale problem, since the fatigue crack size is much smaller than the main girder span.
A bridge maintenance strategy mainly covers two aspects: maintenance timing and maintenance degree. Based on the decision criterion, maintenance strategies can be further divided into two categories: time-based and state-based. Time-based strategies maintain the system at predetermined intervals; a maintenance frequency that is too high wastes component life, while one that is too low can lead to significant failure losses. State-based strategies plan maintenance according to the detected or monitored degradation state of the structure and are therefore generally more efficient.
At present, a great deal of research addresses maintenance decisions for a single bridge component, generally using a threshold-based strategy in which maintenance is carried out when the component state reaches a set threshold. Because the maintenance action space, the state space and the decision variables are all low-dimensional, the optimal solution of a single-component maintenance decision is generally easier to obtain.
The optimization of the maintenance decision of the bridge multi-component system is more complex than that of a single-component system, and the difficulty is mainly shown in the following four aspects:
(1) As the number of components increases, the system state space and maintenance action space dimensions increase exponentially;
(2) Structural correlation exists among the components, namely certain components can form a whole in function and need to be maintained simultaneously;
(3) Economic correlation exists among components: servicing multiple components at the same time is generally more economical, and this effect is more pronounced when the maintenance setup cost is high;
(4) Stochastic correlation exists among components, i.e. their random degradation processes are correlated; for example, the degradation processes of multiple fatigue cracks are correlated.
Due to the complexity of modeling and analyzing multi-component maintenance decisions, research on multi-component systems to date has mainly addressed time-based maintenance decisions. Some studies have extended threshold-based maintenance decisions to multi-component systems, but this requires a separate maintenance threshold for each component. As the number of components increases, the optimization problem becomes more complex, so the approach is typically applied only to systems with a small number of components.
The conventional threshold-based method optimizes the maintenance threshold and other decision variables (such as maintenance time and maintenance frequency) of the degrading system according to principles such as cost minimization and safety maximization, under the assumption of several different maintenance degrees (such as full maintenance and partial maintenance). Because the optimization objectives, such as maintenance cost and remaining structural life, often constrain one another, conventional maintenance decision problems are usually cast as multi-objective optimization problems. Dynamic programming, genetic algorithms, particle swarm algorithms and the like are widely adopted optimization methods for such problems. However, these methods are limited by the high-dimensional decision variables of multi-component systems and by their inherently static nature, and they struggle with long-term, non-stationary sequential decision problems, especially those involving multiple states and multiple actions.
Disclosure of Invention
The present invention is directed to solving, at least in part, one of the technical problems in the related art.
Therefore, the invention aims to provide an intelligent bridge maintenance decision method based on deep reinforcement learning and system reliability.
The second purpose of the invention is to provide an intelligent bridge maintenance decision system based on deep reinforcement learning and system reliability.
A third object of the invention is to propose a computer device.
A fourth object of the invention is to propose a non-transitory computer-readable storage medium.
In order to achieve the above object, an embodiment of a first aspect of the present invention provides a bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability, including: step S1, constructing a redundancy system model of the integral bridge deck, decomposing it into a series of small-scale local bridge decks, and calculating the reliability probability and reliability index of the integral bridge deck system based on the failure probabilities of the local bridge decks and the reliability theory of redundant systems; step S2, designing a comprehensive reward function based on maintenance cost and safety cost according to the reliability index, so as to establish an integral bridge deck maintenance decision network model based on deep reinforcement learning; and step S3, training the integral bridge deck maintenance decision network model until convergence, and inputting the reliability index of the bridge deck system, the local reliability indices and the service time into the trained model to obtain a bridge maintenance action result.
The intelligent bridge maintenance decision method based on deep reinforcement learning and system reliability of the embodiment of the invention models the integral bridge deck of the full bridge as a redundant system, decomposes the full-bridge orthotropic bridge deck into a series of small-scale local bridge decks, and calculates the system reliability of the integral bridge deck based on the failure probabilities of the local bridge decks and redundant system reliability theory. Taking a local bridge deck as the minimum maintenance unit, it designs a comprehensive reward function based on maintenance cost and safety cost and establishes an integral bridge deck maintenance decision network model based on deep reinforcement learning. Finally, it takes the reliability index of the bridge deck system, the local reliability indices and the service time as the input of the maintenance decision network model and, on the premise that maintenance priority is inversely proportional to the local reliability index, simplifies the maintenance action to the number of local bridge decks maintained. This realizes intelligent bridge maintenance decision-making based on deep reinforcement learning and addresses the inability of traditional methods to handle the multi-scale evaluation required for multi-component maintenance decisions on the whole bridge.
In addition, the bridge intelligent maintenance decision method based on the deep reinforcement learning and the system reliability according to the embodiment of the invention can also have the following additional technical features:
Further, in an embodiment of the present invention, the step S1 specifically includes: step S101, establishing the integral bridge deck redundancy system model and decomposing it into a series of small-scale local bridge decks; step S102, designing a system failure criterion for the integral bridge deck based on the decomposition into a series of small-scale local bridge decks; step S103, constructing a system state space according to the system failure criterion; step S104, calculating a system state transition probability matrix according to the system state space; and step S105, calculating the reliability probability and reliability index of the integral bridge deck system according to the system state space.
Further, in one embodiment of the present invention, the system failure criterion is: in a two-dimensional regular system composed of n rows and m columns of cells, the system fails if k or more cells fail within any block of r consecutive rows and s consecutive columns.
Further, in an embodiment of the present invention, the system state space is:
Figure SMS_1
wherein $[\lambda_{ij}]_{n\times(s-1)}$ is a subsystem consisting of n rows and (s-1) columns of local bridge decks; $\lambda_{ij}$ is the state of the local bridge deck in row i and column j, with $\lambda_{ij} \in \{0,1\}$, where 0 denotes that the local bridge deck is safe and 1 denotes that it has failed; S is the set of safe states of the subsystem, whose i-th element is denoted $S_i$; and F is the set of failure states of the subsystem.
Further, in an embodiment of the present invention, the system state transition probability matrix is:
$$T = \begin{bmatrix} N & C \\ 0 & 1 \end{bmatrix}$$
wherein T is the transition probability matrix of the system states; N is the transition probability matrix between safe system states, with dimension $d_s \times d_s$, and $N_{i,j}$ is the probability of transition from the i-th state to the j-th state in S; C is the probability matrix for transition from a safe system state to the failure state, with dimension $d_s \times 1$, and $C_{i,1}$ is the probability that the i-th state in S transitions to the failure state; 0 is a matrix of zero elements with dimension $1 \times d_s$, indicating that the failure state cannot transition to a safe state; and the entry 1 indicates that the failure state can only transition to itself.
Further, in an embodiment of the present invention, the step S2 specifically includes: step S201, defining a bridge deck system state, including a reliability matrix, a reliability index and service time of a local bridge deck; step S202, presetting that the maintenance priority of the local bridge deck is in inverse proportion to the local reliability index of the local bridge deck, and simplifying the maintenance action into the number of the maintained local bridge decks so as to define the maintenance action space of the bridge deck system; step S203, defining a comprehensive reward function considering maintenance cost and safety cost simultaneously based on the maintenance action space of the bridge deck system; and step S204, establishing the integral bridge deck maintenance decision network model based on the deep reinforcement learning according to the bridge deck system state and the comprehensive reward function.
Further, in an embodiment of the present invention, the service action space of the bridge deck system is:
$A = [0 : p : \max(th),\ n \times m], \quad 0 \le th \le n \times m, \quad th \bmod p = 0$
wherein mod is the remainder operation, $\max(th)$ is the largest positive integer that is divisible by p and does not exceed $n \times m$, and $n \times m$ is the number of local bridge decks.
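As an illustration of the action space above, the following minimal Python sketch enumerates the candidate actions (numbers of local bridge decks to maintain). The function name `action_space` and the de-duplication of the final entry when n × m is itself a multiple of p are assumptions made for this example, not details stated in the patent.

```python
from typing import List


def action_space(n: int, m: int, p: int) -> List[int]:
    """Enumerate A = [0 : p : max(th), n*m] as a list of deck counts.

    Each entry is a number of local bridge decks to maintain:
    0, p, 2p, ..., max(th), followed by the full count n*m.
    """
    total = n * m
    actions = list(range(0, total + 1, p))  # multiples of p up to n*m
    if actions[-1] != total:                # assumption: avoid a duplicate last entry
        actions.append(total)
    return actions


print(action_space(3, 4, 5))  # [0, 5, 10, 12]
print(action_space(2, 5, 5))  # [0, 5, 10]
```

Grouping actions in steps of p keeps the number of discrete actions small even when the deck grid is large.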
Further, in one embodiment of the present invention, the composite reward function is:
$Reward = C_m + C_s$
$C_m = -a_{cost} - C_{setup}$
C s =-Φ(-β sys )*C sys -(β Tsys )*F
wherein Reward is the comprehensive reward function; $C_m$ is the maintenance cost; $C_s$ is the safety cost; $a_{cost} > 0$ is the cost of the maintenance action, proportional to the number of units maintained; $C_{setup}$ is the maintenance setup cost; $\beta_{sys}$ is the system reliability index of the integral bridge deck; $\Phi(-\beta_{sys})$ is the system failure probability of the integral bridge deck; $C_{sys}$ is the system failure cost of the integral bridge deck; $\beta_T$ is the target system reliability index of the integral bridge deck; and F is the penalty coefficient.
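The reward above can be sketched in Python as follows. Two points are assumptions of this example rather than statements of the patent: the term coupling the target index and the system index is garbled in the text, so the sketch uses a shortfall penalty max(β_T − β_sys, 0)·F; and the setup cost is charged only when at least one deck is maintained. The function and parameter names are hypothetical.

```python
from statistics import NormalDist


def reward(num_repaired: int, beta_sys: float, *, unit_cost: float,
           c_setup: float, c_sys: float, beta_target: float,
           penalty: float) -> float:
    """Comprehensive reward = maintenance cost C_m + safety cost C_s (both <= 0)."""
    # C_m = -a_cost - C_setup, with a_cost proportional to the number of decks.
    # Assumption: no maintenance cost at all when nothing is repaired.
    if num_repaired > 0:
        c_m = -(unit_cost * num_repaired) - c_setup
    else:
        c_m = 0.0

    # Expected failure cost: Phi(-beta_sys) is the system failure probability.
    p_fail = NormalDist().cdf(-beta_sys)
    # Assumption: penalise only the shortfall below the target reliability index.
    shortfall = max(beta_target - beta_sys, 0.0)
    c_s = -p_fail * c_sys - shortfall * penalty
    return c_m + c_s


print(reward(2, 1.5, unit_cost=1.0, c_setup=5.0, c_sys=1000.0,
             beta_target=2.0, penalty=100.0))
```

Because both cost terms are non-positive, an agent maximizing this reward trades the cost of repairs against the expected cost of failure.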
In order to achieve the above object, an embodiment of a second aspect of the present invention provides an intelligent bridge maintenance decision system based on deep reinforcement learning and system reliability, including: the computing module is used for constructing a redundancy system model of the whole bridge deck and decomposing the redundancy system model into a series of small-scale local bridge decks, and computing the reliability probability and reliability index of the whole bridge deck system based on the failure probability of the local bridge decks and the reliability theory of the redundancy system; the construction module is used for designing a comprehensive reward function based on maintenance cost and safety cost according to the reliability index so as to establish an integral bridge deck maintenance decision network model based on deep reinforcement learning; and the training and output module is used for training the integral bridge deck maintenance decision network model based on the deep reinforcement learning until convergence, and inputting the reliability index, the local reliability index and the service time of the bridge deck system into the trained integral bridge deck maintenance decision network model based on the deep reinforcement learning to obtain a bridge maintenance action result.
The bridge intelligent maintenance decision-making system based on deep reinforcement learning and system reliability of the embodiment of the invention models the integral bridge deck of the full bridge as a redundant system, decomposes the full-bridge orthotropic bridge deck into a series of small-scale local bridge decks, and calculates the system reliability of the integral bridge deck based on the failure probabilities of the local bridge decks and redundant system reliability theory. Taking a local bridge deck as the minimum maintenance unit, it designs a comprehensive reward function based on maintenance cost and safety cost and establishes an integral bridge deck maintenance decision network model based on deep reinforcement learning. Finally, it takes the reliability index of the bridge deck system, the local reliability indices and the service time as the input of the maintenance decision network model and, on the premise that maintenance priority is inversely proportional to the local reliability index, simplifies the maintenance action to the number of local bridge decks maintained. This realizes intelligent bridge maintenance decision-making based on deep reinforcement learning and addresses the inability of traditional methods to handle the multi-scale evaluation required for multi-component maintenance decisions on the whole bridge.
To achieve the above object, a third embodiment of the present invention provides a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of any one of the above methods when executing the computer program.
To achieve the above object, a fourth embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is used to implement the steps of the method described above when executed by a processor.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart of a bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability according to an embodiment of the present invention;
FIG. 2 is an overall flowchart of a bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the partitioning method of the integral bridge deck of the main girder of the bridge and the modeling of the redundancy system according to one embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the evolution process of the state space of the integrated bridge deck redundancy system according to an embodiment of the present invention;
FIG. 5 is a diagram of a deep Q-network model architecture for a repair optimization decision for a bridge deck system according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a bridge intelligent maintenance decision system based on deep reinforcement learning and system reliability according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary and intended to explain the invention; they are not to be construed as limiting the invention.
The following describes a bridge intelligent maintenance decision method and system based on deep reinforcement learning and system reliability according to an embodiment of the invention with reference to the accompanying drawings.
It should be noted that maintenance decision-making based on deep reinforcement learning is a dynamic method that enables end-to-end decisions directly from the system degradation state to the maintenance action. With a reasonably designed value function network, it is suitable for high-dimensional problems. In addition, the reinforcement learning framework facilitates handling correlations between components, such as stochastic and economic correlations. Combining deep reinforcement learning with state-based maintenance decisions therefore has great development potential. Accordingly, the embodiment of the invention uses deep reinforcement learning to construct an intelligent bridge maintenance decision method, detailed as follows.
Fig. 1 is a flowchart of a bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability according to an embodiment of the present invention.
As shown in fig. 1 and 2, the bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability includes the following steps:
In step S1, an integral bridge deck redundancy system model is constructed and decomposed into a series of small-scale local bridge decks, and the reliability probability and reliability index of the integral bridge deck system are calculated based on the failure probabilities of the local bridge decks and redundant system reliability theory.
Further, in an embodiment of the present invention, step S1 specifically includes:
Step S101, establishing an integral bridge deck redundancy system model and decomposing it into a series of small-scale local bridge decks.
Specifically, the integral bridge deck is divided into a series of small-scale local bridge decks, and the integral bridge deck system is abstracted as a redundant system whose members are the local bridge decks. The passage of a vehicle over the bridge deck is regarded as a rectangular window sweeping over a regular grid, where the size of the window is determined by the area covered by the moving vehicle. The multi-scale problem of calculating the reliability of the integral bridge deck is thereby converted into calculating the reliability of the subsystem of local bridge decks contained in the rectangular window. The efficiency of the local reliability analysis must be considered when dividing the integral bridge deck. On the one hand, the larger a local bridge deck, the higher the cost of finite element analysis, the larger the random variable space and the slower the convergence of the reliability analysis. On the other hand, the smaller the local bridge decks, the greater their number and the less efficient the calculation of system reliability.
Step S102, designing the system failure criterion of the integral bridge deck based on its decomposition into a series of small-scale local bridge decks.
Specifically, the system failure criterion of the integral bridge deck is designed as follows: in a two-dimensional regular system consisting of n rows and m columns of units, the system fails if k or more units fail within any block of r consecutive rows and s consecutive columns. As shown in fig. 3, the integral bridge deck is divided into n rows and m columns, for a total of n × m local bridge decks.
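The failure criterion can be checked directly by sweeping a window over the grid. The sketch below is a brute-force illustration, assuming the grid is given as nested lists in which 1 marks a failed local bridge deck and 0 a safe one; the function name `system_failed` is hypothetical.

```python
from typing import Sequence


def system_failed(grid: Sequence[Sequence[int]], r: int, s: int, k: int) -> bool:
    """Return True if any block of r consecutive rows and s consecutive
    columns contains k or more failed cells (grid[i][j] == 1)."""
    n, m = len(grid), len(grid[0])
    for top in range(n - r + 1):          # top row of the sweeping window
        for left in range(m - s + 1):     # left column of the sweeping window
            failed = sum(grid[i][j]
                         for i in range(top, top + r)
                         for j in range(left, left + s))
            if failed >= k:
                return True
    return False


deck = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(system_failed(deck, r=2, s=2, k=3))  # True: rows 0-1, columns 1-2 hold 3 failures
print(system_failed(deck, r=2, s=2, k=4))  # False: no 2x2 window holds 4 failures
```

This direct check evaluates one known grid; the state-space approach described next is needed to obtain the probability of failure over all possible grids.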
Step S103, constructing a system state space according to the system failure criterion, as follows:
Figure SMS_3
wherein $[\lambda_{ij}]_{n\times(s-1)}$ is a subsystem consisting of n rows and (s-1) columns of local bridge decks; $\lambda_{ij}$ is the state of the local bridge deck in row i and column j, with $\lambda_{ij} \in \{0,1\}$, where 0 denotes that the local bridge deck is safe and 1 denotes that it has failed; S is the set of safe states of the subsystem, whose i-th element is denoted $S_i$; and F is the set of failure states of the subsystem.
Specifically, as shown in fig. 4, the state space of the integral bridge deck redundancy system is divided into safe states and failure states, and a large-scale system is regarded as evolving from a small-scale system along one dimension. According to the redundant system failure criterion in step S102, if the original system has failed, the new system fails regardless of the states of the local bridge decks in the newly added column; if the original system is safe, whether the new system fails depends on the combined states of the newly added components and the last s-1 columns of components in the original system. That is, even if the original system is safe, the new system (the original system plus one added column of components) fails if a failure window appears in it.
Step S104, calculating a system state transition probability matrix according to the system state space, wherein the system state transition probability matrix is expressed in a block matrix form:
$$T = \begin{bmatrix} N & C \\ 0 & 1 \end{bmatrix} \tag{2}$$
wherein T is the transition probability matrix of the system states; N is the transition probability matrix between safe system states, with dimension $d_s \times d_s$, and $N_{i,j}$ is the probability of transition from the i-th state to the j-th state in S; C is the probability matrix for transition from a safe system state to the failure state, with dimension $d_s \times 1$, and $C_{i,1}$ is the probability that the i-th state in S transitions to the failure state; 0 is a matrix of zero elements with dimension $1 \times d_s$, indicating that the failure state cannot transition to a safe state; and the entry 1 indicates that the failure state can only transition to itself.
Since the failure state can only transition to itself, whereas a safe state may transition either to the failure state or to another safe state depending on the states of the local bridge decks in the newly added column, only the transitions between safe states, i.e. the N matrix in formula (2), need to be calculated in order to improve computational efficiency.
Denote the failure probabilities of the local bridge decks in the newly added column as $[p_{f,1}, p_{f,2}, \ldots, p_{f,n}]^T$. The probability $N_{i,j}$ is calculated as follows:
(1) If the second and subsequent columns of $S_i$ differ from the corresponding leading columns of $S_j$ (the columns the two states share), state $S_i$ cannot transition to $S_j$, and $N_{i,j} = 0$;
(2) If the system formed by appending the last column of $S_j$ to state $S_i$ satisfies the failure criterion, state $S_i$ cannot transition to $S_j$, and $N_{i,j} = 0$;
(3) If neither of the first two conditions is satisfied, state $S_i$ can transition to $S_j$, and the transition probability is calculated as:
$$N_{i,j} = \prod_{h=1}^{n} p_{f,h}^{\lambda_h} \left(1 - p_{f,h}\right)^{1 - \lambda_h} \tag{3}$$
where $p_{f,h}$ is the failure probability of the h-th local bridge deck of the newly added column, and $\lambda_h$ is the state of the h-th local bridge deck of the newly added column, with $\lambda_h = 1$ and $\lambda_h = 0$ representing the failure state and the safe state, respectively.
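The three rules for $N_{i,j}$ can be sketched as a single Python function. The representation of a state as a tuple of s-1 columns (each a tuple of n zeros and ones), the reading of rule (1) as "the shared columns must agree", the independence of the decks in the new column, and the names `window_fails` and `transition_probability` are all assumptions of this example.

```python
from typing import Sequence, Tuple

Column = Tuple[int, ...]    # states of the n local decks in one column (1 = failed)
State = Tuple[Column, ...]  # s-1 consecutive columns


def window_fails(cols: Sequence[Column], r: int, k: int) -> bool:
    """True if some block of r consecutive rows across the given columns
    contains k or more failed decks."""
    n = len(cols[0])
    return any(
        sum(col[i] for col in cols for i in range(top, top + r)) >= k
        for top in range(n - r + 1)
    )


def transition_probability(s_i: State, s_j: State, p_new: Sequence[float],
                           r: int, k: int) -> float:
    """Probability N_ij of moving from safe state s_i to safe state s_j
    when one column with failure probabilities p_new is appended."""
    # Rule (1): the columns shared by the old and new state must be identical.
    if s_i[1:] != s_j[:-1]:
        return 0.0
    new_col = s_j[-1]
    # Rule (2): s_i plus the new column is s columns wide; it must not fail.
    if window_fails(list(s_i) + [new_col], r, k):
        return 0.0
    # Rule (3): product of per-deck probabilities for the new column.
    prob = 1.0
    for lam, p in zip(new_col, p_new):
        prob *= p if lam == 1 else 1.0 - p
    return prob


# n = 2 rows, s = 2 (states are single columns), window of r = 2 rows, k = 3.
print(transition_probability(((0, 1),), ((1, 0),), [0.1, 0.2], r=2, k=3))  # 0.1 * 0.8
print(transition_probability(((0, 1),), ((1, 1),), [0.1, 0.2], r=2, k=3))  # 0.0 (window fails)
```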
And S105, calculating the reliability probability and reliability index of the whole bridge deck system according to the system state space.
Specifically, the initial subsystem is composed of n rows and s-1 columns of local bridge decks. The probabilities of the states in the safe state set S are recorded as a matrix $\xi$ of dimension $1 \times d_s$. The elements of $\xi$ are calculated from the failure probabilities of the local bridge decks:
$$\xi_{1,h} = \prod_{i=1}^{n} \prod_{j=1}^{s-1} p_{f,i,j}^{\lambda_{h,i,j}} \left(1 - p_{f,i,j}\right)^{1 - \lambda_{h,i,j}} \tag{4}$$
where $\xi_{1,h}$ is the probability that the initial subsystem is in state $S_h$; $\lambda_{h,i,j}$ is the state of the local bridge deck in row i and column j in state $S_h$; and $p_{f,i,j}$ is the failure probability of the local bridge deck in row i and column j of the initial subsystem.
The initial subsystem expands into the complete system after $m-s+1$ columns are added, so the state of the initial subsystem undergoes $m-s+1$ transitions. The reliability index of the whole bridge deck system is then calculated as:
$\beta=\Phi^{-1}(p_s), \quad p_s=\xi N^{m-s+1}\mathbf{1}$ (5)
where $\beta$ is the reliability index of the whole bridge deck system; $\Phi$ is the cumulative distribution function of the standard normal distribution; $p_s$ is the safety probability of the whole bridge deck system; $\xi N^{m-s+1}$ gives the probability that the system is in each safety state in $S$ after $m-s+1$ transitions; and $\mathbf{1}$ is a $d_s\times 1$ vector whose elements are all 1.
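Formula (5) can be evaluated with repeated vector-matrix products. The sketch below uses only the Python standard library and assumes that $\xi$ and $N$ are already available as plain lists; the function name `system_reliability` and the argument name `transitions` (standing for $m-s+1$) are illustrative.

```python
from statistics import NormalDist


def system_reliability(xi, N, transitions):
    """Return (p_s, beta) following formula (5).

    xi: list of length d_s (initial safety-state probabilities).
    N: d_s x d_s list of lists (safe-to-safe transition probabilities).
    transitions: number of added columns, m - s + 1.
    """
    row = list(xi)
    for _ in range(transitions):
        # One step of the chain: row <- row * N.
        row = [sum(row[i] * N[i][j] for i in range(len(row)))
               for j in range(len(N[0]))]
    p_s = sum(row)  # multiplying by the all-ones vector sums the entries
    beta = NormalDist().inv_cdf(p_s)  # requires 0 < p_s < 1
    return p_s, beta
```

Because only the safe-to-safe block N is propagated, the cost of each step is governed by $d_s$ rather than by the full state space including the failure state.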
In step S2, a comprehensive reward function based on maintenance cost and safety cost is designed according to the reliability index so as to establish an integral bridge deck maintenance decision network model based on deep reinforcement learning.
It should be noted that the overall bridge deck maintenance decision network model based on deep reinforcement learning takes the state of the bridge deck system as input and directly outputs maintenance actions. A local bridge deck is the minimum maintenance unit, and the goal of optimizing the overall bridge deck maintenance decision is to ensure the reliability of the bridge deck system during its service period at minimum cost.
Further, in an embodiment of the present invention, step S2 specifically includes:
Step S201, defining the system state of the bridge deck, including the reliability matrix of the local bridge decks, the reliability index and the service time.
Specifically, the defined bridge deck system state comprises a reliability matrix of a local bridge deck, a system reliability index and service time, wherein the dimension of the local bridge deck reliability matrix is n × m, and the system reliability index and the service time are scalars.
Step S202, presetting the maintenance priority of the local bridge deck to be inversely proportional to the local reliability index of the local bridge deck, and simplifying the maintenance action into the number of the local bridge deck to define the maintenance action space of the bridge deck system.
Specifically, the whole bridge deck is divided into $n\times m$ local bridge decks, each with two possible maintenance actions (maintain or do not maintain), so the full maintenance action space has size $2^{nm}$. Such a high-dimensional action space cannot be modeled directly. When unit maintenance costs are the same, maintaining the unit in the worst condition is generally the most economical. Therefore, the maintenance priority of a local bridge deck is set to be inversely proportional to its local reliability index, the maintenance action is simplified to the number of local bridge decks to be maintained, and the corresponding maintenance action space becomes $A=[0,1,\ldots,n\times m]$, of dimension $n\times m+1$. However, an action space of dimension $n\times m+1$ is still large, and the decision network tends to converge to a local optimum during reinforcement learning training. The action space is therefore reduced further by treating every $p$ local bridge decks as a group:
$A=[0:p:\max(th),\; n\times m], \quad 0\le th\le n\times m, \quad th \bmod p=0$ (6)
where mod denotes the remainder operation and $\max(th)$ denotes the largest positive integer divisible by $p$ that does not exceed $n\times m$.
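The reduced action space of formula (6) and the priority rule can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the function names are hypothetical, the full-deck action $n\times m$ is listed only once when it is already a multiple of $p$ (an assumption), and ties between panels with equal reliability index are broken by position (also an assumption).

```python
def action_space(n, m, p):
    """Candidate numbers of panels to maintain: 0, p, 2p, ..., max(th), n*m."""
    total = n * m
    actions = list(range(0, total + 1, p))
    if actions[-1] != total:
        actions.append(total)  # always allow maintaining the whole deck
    return actions


def panels_to_maintain(beta, count):
    """Pick the `count` panels with the lowest local reliability index.

    beta is an n x m list of lists; returns a list of (row, column) pairs.
    """
    ranked = sorted((b, i, j) for i, row in enumerate(beta)
                    for j, b in enumerate(row))
    return [(i, j) for _, i, j in ranked[:count]]
```

For a 3 x 4 deck with `p = 5`, `action_space(3, 4, 5)` gives the four actions 0, 5, 10 and 12 instead of 13 separate counts, and an action value is turned into concrete panels by `panels_to_maintain`.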
Step S203, defining a comprehensive reward function considering maintenance cost and safety cost simultaneously based on the maintenance action space of the bridge deck system.
It should be noted that, because an objective of the embodiments of the present invention is to minimize the maintenance cost of the bridge deck system over its life cycle, the reward function is designed in terms of cost. The cost of a bridge during its service period comprises maintenance cost and safety cost, which constrain each other. When the maintenance cost is high, the structure is safer and the safety cost is lower; when the maintenance cost is reduced, the structure may fail, which may increase the safety cost. Therefore, the embodiment of the invention designs a comprehensive reward function that considers the maintenance cost and the safety cost simultaneously:
Figure SMS_7
where Reward is the reward function, $C_m$ is the maintenance cost, and $C_s$ is the safety cost. $a_{cost}$ is the cost corresponding to the maintenance action; $a_{cost}>0$ is proportional to the number of maintained units, and $C_{setup}$ is the maintenance start-up cost. $\beta_{sys}$ is the system reliability index of the whole bridge deck, $\Phi(-\beta_{sys})$ is the system failure probability of the whole bridge deck, $C_{sys}$ is the system failure cost of the whole bridge deck, and $\beta_T$ is the target system reliability index of the whole bridge deck. A penalty is applied when the system reliability index is lower than the target reliability index; $F$ is the penalty coefficient, and a larger $F$ means a heavier penalty when the system reliability index falls below the target value.
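A minimal Python sketch of such a reward is given here, following the flattened form stated in claim 7 (Reward = C_m + C_s). Two points are assumptions rather than statements of the patent: the start-up cost is charged only when at least one unit is maintained, and the penalty term is applied only when the system reliability index is below the target. The parameter names `unit_cost` and `penalty` are hypothetical.

```python
from statistics import NormalDist


def reward(num_units, unit_cost, c_setup, beta_sys, beta_target, c_sys, penalty):
    """Comprehensive reward combining maintenance cost and safety cost."""
    a_cost = unit_cost * num_units  # proportional to the number of units
    # Maintenance cost: assumed to be zero when no unit is maintained.
    c_m = -(a_cost + c_setup) if num_units > 0 else 0.0
    # Safety cost: expected failure cost plus a penalty for any shortfall
    # of the system reliability index below the target (assumed one-sided).
    shortfall = max(0.0, beta_target - beta_sys)
    c_s = -NormalDist().cdf(-beta_sys) * c_sys - shortfall * penalty
    return c_m + c_s
```

Both terms are non-positive, so maximizing the reward is equivalent to minimizing the combined cost; raising `penalty` makes a shortfall against the target more expensive relative to maintenance spending.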
Step S204, establishing an integral bridge deck maintenance decision network model based on deep reinforcement learning according to the state of the bridge deck system and the comprehensive reward function.
Specifically, as shown in FIG. 5, $[\beta_{ij}]_{n\times m}$ denotes the reliability matrix of the local bridge decks, of dimension $n\times m$; $[T_{ij}]_{n\times m}$ denotes the service time of each local bridge deck since its most recent maintenance, of dimension $n\times m$; $\beta_{sys}$ denotes the system reliability index of the whole bridge deck; $T_{sys}$ denotes the service time of the bridge; and $N_{action}$ denotes the number of maintenance actions.
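One simple way to present these quantities to a decision network is to flatten them into a single input vector. The Python sketch below is only an assumption about the encoding, since the network architecture of FIG. 5 is not described in this text; the function name `encode_state` is hypothetical.

```python
def encode_state(beta, service_time, beta_sys, t_sys):
    """Flatten the deck state into a vector of length 2*n*m + 2.

    beta and service_time are n x m lists of lists; beta_sys and t_sys
    are scalars for the whole deck.
    """
    flat = [b for row in beta for b in row]
    flat += [t for row in service_time for t in row]
    flat += [beta_sys, t_sys]
    return flat
```

A network with $N_{action}$ outputs could then score each candidate action for this vector; how the scores are produced and trained is left open here.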
In step S3, the deep reinforcement learning-based overall bridge deck maintenance decision network model is trained until convergence, and the reliability index, the local reliability index, and the service time of the bridge deck system are input into the trained deep reinforcement learning-based overall bridge deck maintenance decision network model to obtain a bridge maintenance action result.
In summary, compared with conventional bridge maintenance decision techniques, the bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability provided by the embodiment of the invention has the following effects:
(1) The whole bridge deck is modeled as a redundant system, and the influence of the reliability of different local bridge decks on the system reliability is taken into account, which solves the problem that conventional methods cannot perform multi-scale evaluation for maintenance decisions covering the whole bridge and its multiple components;
(2) The bridge maintenance decision network takes the system reliability of the whole bridge deck, the reliability matrix of the local bridge decks, the service time and other variables as input; compared with conventional methods, it better accounts for the degradation of the bridge's service state over time and is better suited to practical engineering applications;
(3) A simplified system maintenance action space is designed, based on the criterion that the maintenance priority of a local component is inversely proportional to the reliability index of the local bridge deck; this agrees better with engineering practice, greatly reduces the system maintenance action space, and improves learning efficiency;
(4) A comprehensive reward function that considers both maintenance cost and safety cost is designed, which agrees better with engineering practice and avoids compromising structural safety or incurring excessive maintenance cost as a result of considering only one of the two costs;
(5) A bridge deck maintenance decision method based on deep reinforcement learning is provided that makes end-to-end decisions directly from the current state of the system, and the model is capable of self-simulation, self-learning, self-evolution and self-updating.
The bridge intelligent maintenance decision system based on deep reinforcement learning and system reliability provided by the embodiment of the invention is described next with reference to the attached drawings.
FIG. 6 shows a bridge intelligent maintenance decision system based on deep reinforcement learning and system reliability according to an embodiment of the present invention.
As shown in fig. 6, the system 10 includes: a calculation module 100, a construction module 200, a training and output module 300.
The calculation module 100 is configured to construct an overall bridge deck redundancy system model, decompose the model into a series of small-scale local bridge decks, and calculate the reliability probability and reliability index of the overall bridge deck system based on the failure probabilities of the local bridge decks and redundancy system reliability theory. The building module 200 is configured to design a comprehensive reward function based on the maintenance cost and the safety cost according to the reliability index, so as to build an overall bridge deck maintenance decision network model based on deep reinforcement learning. The training and output module 300 is configured to train the deep reinforcement learning-based overall bridge deck maintenance decision network model until convergence, and to input the reliability index of the bridge deck system, the local reliability index, and the service time into the trained model to obtain a bridge maintenance action result.
It should be noted that the foregoing explanation focusing on the embodiment of the bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability is also applicable to the system of the embodiment of the present invention, and the implementation principle is similar, and is not described herein again.
Compared with conventional bridge maintenance decision techniques, the bridge intelligent maintenance decision system based on deep reinforcement learning and system reliability provided by the embodiment of the invention achieves the same effects (1) to (5) described above for the method.
In order to implement the foregoing embodiment, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability as in the foregoing embodiment is implemented.
In order to implement the foregoing embodiment, the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability according to the foregoing embodiment.
In the description of this specification, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict each other.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, a plurality of steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability is characterized by comprising the following steps:
S1, constructing an integral bridge deck redundancy system model, decomposing the integral bridge deck redundancy system model into a series of small-scale local bridge decks, and calculating the reliability probability and reliability index of an integral bridge deck system based on the failure probability of the local bridge decks and the redundancy system reliability theory;
S2, designing a comprehensive reward function based on maintenance cost and safety cost according to the reliability index so as to establish an integral bridge deck maintenance decision network model based on deep reinforcement learning, wherein step S2 specifically comprises the following steps:
step S201, defining a bridge deck system state, including a reliability matrix, a reliability index and service time of a local bridge deck;
step S202, presetting the maintenance priority of local bridge decks in inverse proportion to the local reliability indexes of the local bridge decks, and simplifying maintenance actions into the number of the local bridge decks to define a maintenance action space of a bridge deck system;
step S203, defining a comprehensive reward function considering maintenance cost and safety cost simultaneously based on the maintenance action space of the bridge deck system;
step S204, establishing the integral bridge deck maintenance decision network model based on the deep reinforcement learning according to the bridge deck system state and the comprehensive reward function;
and S3, training the integral bridge deck maintenance decision network model based on the deep reinforcement learning until convergence, and inputting the reliability index, the local reliability index and the service time of the bridge deck system into the trained integral bridge deck maintenance decision network model based on the deep reinforcement learning to obtain a bridge maintenance action result.
2. The bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability as claimed in claim 1, wherein the step S1 specifically comprises:
step S101, establishing the integral bridge deck redundancy system model and decomposing the integral bridge deck redundancy system model into a series of small-scale local bridge decks;
step S102, designing a system failure criterion for the whole bridge deck based on the series of small-scale local bridge decks;
step S103, constructing a system state space according to the system failure criterion;
step S104, calculating a system state transition probability matrix according to the system state space;
and S105, calculating the reliability probability and reliability index of the whole bridge deck system according to the system state space.
3. The bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability as claimed in claim 2, wherein the system failure criterion is that, in a two-dimensional regular system composed of units in n rows and m columns, the system fails if k or more units fail within any block of r consecutive rows and s consecutive columns.
4. The bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability according to claim 2, wherein the system state space is:
Figure FDA0004078688280000021
wherein $[\lambda_{ij}]_{n\times(s-1)}$ is a subsystem consisting of local bridge decks in $n$ rows and $(s-1)$ columns; $\lambda_{ij}$ is the state of the local bridge deck in row $i$ and column $j$, with $\lambda_{ij}\in\{0,1\}$, where 0 denotes that the local bridge deck is safe and 1 denotes that it has failed; $S$ is the safety state set of the subsystem, whose $i$-th element is denoted $S_i$; and $F$ is the failure state set of the subsystem.
5. The bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability as claimed in claim 2, wherein the system state transition probability matrix is:
Figure FDA0004078688280000022
wherein $T$ is the transition probability matrix of the system states; $N$ is the transition probability matrix between system safety states, of dimension $d_s\times d_s$, and $N_{i,j}$ is the probability of transition from the $i$-th state to the $j$-th state in $S$; $C$ is the probability matrix for transition from a system safety state to the failure state, of dimension $d_s\times 1$, and $C_{i,1}$ is the probability that the $i$-th state in $S$ transitions to the failure state; $\mathbf{0}$ is a matrix of zeros of dimension $1\times d_s$, indicating that the failure state cannot transition to a safety state; and 1 indicates that the failure state can only transition to the failure state.
6. The bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability as claimed in claim 1, wherein the bridge deck system maintenance action space is:
$A=[0:p:\max(th),\; n\times m], \quad 0\le th\le n\times m, \quad th \bmod p=0$
wherein mod is the remainder operation, $\max(th)$ is the largest positive integer that is divisible by $p$ and does not exceed $n\times m$, and $n\times m$ is the number of local bridge decks.
7. The bridge intelligent maintenance decision method based on deep reinforcement learning and system reliability as claimed in claim 1, wherein the comprehensive reward function is:
$\mathrm{Reward}=C_m+C_s$
$C_m=-a_{cost}-C_{setup}$
$C_s=-\Phi(-\beta_{sys})\,C_{sys}-(\beta_T-\beta_{sys})\,F$
wherein Reward is the comprehensive reward function, $C_m$ is the maintenance cost, $C_s$ is the safety cost, $a_{cost}$ is the cost corresponding to the maintenance action, $a_{cost}>0$ is proportional to the number of maintained units, $C_{setup}$ is the maintenance start-up cost, $\beta_{sys}$ is the system reliability index of the whole bridge deck, $\Phi(-\beta_{sys})$ is the system failure probability of the whole bridge deck, $C_{sys}$ is the system failure cost of the whole bridge deck, $\beta_T$ is the target system reliability index of the whole bridge deck, and $F$ is the penalty coefficient.
8. A bridge intelligent maintenance decision system based on deep reinforcement learning and system reliability, characterized by comprising:
the computing module is used for constructing a redundancy system model of the whole bridge deck and decomposing the redundancy system model into a series of small-scale local bridge decks, and computing the reliability probability and reliability index of the whole bridge deck system based on the failure probability of the local bridge decks and the reliability theory of the redundancy system;
the building module is used for designing a comprehensive reward function based on maintenance cost and safety cost according to the reliability index so as to build an integral bridge deck maintenance decision network model based on deep reinforcement learning, and specifically comprises the following steps:
defining the system state of the bridge deck, including the reliability matrix, reliability index and service time of the local bridge deck;
presetting the maintenance priority of the local bridge deck to be inversely proportional to the local reliability index of the local bridge deck, and simplifying the maintenance action into the number of the local bridge deck to be maintained so as to define the maintenance action space of the bridge deck system;
defining a comprehensive reward function based on the bridge deck system maintenance action space, wherein the comprehensive reward function simultaneously considers maintenance cost and safety cost;
establishing the integral bridge deck maintenance decision network model based on the deep reinforcement learning according to the bridge deck system state and the comprehensive reward function;
and the training and output module is used for training the integral bridge deck maintenance decision network model based on the deep reinforcement learning until convergence, and inputting the reliability index, the local reliability index and the service time of the bridge deck system into the trained integral bridge deck maintenance decision network model based on the deep reinforcement learning to obtain a bridge maintenance action result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any of claims 1-7.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202210482833.0A 2022-05-05 2022-05-05 Bridge maintenance method and system based on deep reinforcement learning and system reliability Active CN115098906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210482833.0A CN115098906B (en) 2022-05-05 2022-05-05 Bridge maintenance method and system based on deep reinforcement learning and system reliability

Publications (2)

Publication Number Publication Date
CN115098906A CN115098906A (en) 2022-09-23
CN115098906B true CN115098906B (en) 2023-04-07

Family

ID=83287010

Country Status (1)

Country Link
CN (1) CN115098906B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656361A (en) * 2021-08-18 2021-11-16 国家电网公司东北分部 High-reliability data storage method and device for super-fusion power data center

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3108995B3 (en) * 2020-04-02 2022-04-01 Amadeus Reinforcement learning for website usability
US20210334441A1 (en) * 2020-04-28 2021-10-28 The Texas A&M University System Apparatus and systems for power system protective relay control using reinforcement learning
CN113673721A (en) * 2021-08-26 2021-11-19 北京航空航天大学 Cluster system preventive maintenance method based on deep reinforcement learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656361A (en) * 2021-08-18 2021-11-16 国家电网公司东北分部 High-reliability data storage method and device for super-fusion power data center

Also Published As

Publication number Publication date
CN115098906A (en) 2022-09-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant