CN111679577B - Speed tracking control method and automatic driving control system of high-speed train - Google Patents


Info

Publication number
CN111679577B
Authority
CN
China
Prior art keywords
train
speed
neural network
real
speed train
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010461495.3A
Other languages
Chinese (zh)
Other versions
CN111679577A (en)
Inventor
董海荣
高士根
王佳成
郑玥
李浥东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University
Priority to CN202010461495.3A
Publication of CN111679577A
Application granted
Publication of CN111679577B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention provides a speed tracking control method and an automatic driving control system for a high-speed train. The tracking control method designs an ATO control algorithm that is completely independent of the internal dynamic characteristics of the train control system. Based on integral reinforcement learning, it solves for the optimal train speed tracking control strategy by analyzing and using train running state data, and performs train speed tracking control according to that strategy. This addresses the degradation in control performance caused by uncertain train dynamics and keeps the train control input within a preset limit, which avoids actuator saturation. Under this tracking control method, the automatic driving control system keeps the control input within the preset limit, controls the train to run according to a given target speed-distance curve, and thereby realizes automatic driving of the high-speed train.

Description

Speed tracking control method and automatic driving control system of high-speed train
Technical Field
The invention relates to the technical field of automatic driving systems of high-speed trains, in particular to a speed tracking control method and an automatic driving control system of a high-speed train.
Background
As the core of an Automatic Train Operation (ATO) system, the ATO control algorithm plays an important role in guaranteeing the safety and reliability of train operation.
A train dynamics model is indispensable when designing an ATO control algorithm because it reflects the dynamic characteristics of the controlled object. However, high-speed railways are characterized by high operating speeds and long running times between stations, and under the influence of a complex operating environment the dynamic characteristics of a high-speed train are time-varying and uncertain. Viewed from inside the train, model parameters differ between types of motor train units because of their different traction and braking characteristics, locomotive designs and marshalling modes. Viewed from outside, the model parameters are affected by uncertain factors such as accumulated operating mileage, varying passenger load and changes in the external environment. While a high-speed train is running, the uncertain dynamics caused by these complex disturbances degrade speed tracking control performance. To address this, researchers have used techniques such as adaptive control and neural networks to estimate the unknown train model parameters, or functions containing them, and thereby improve the tracking accuracy of the control algorithm. However, limited calculation accuracy makes estimation errors unavoidable, so the ideal control effect and control performance cannot be achieved. Moreover, these techniques require complex formula derivation, which imposes additional computational load and computing speed requirements that existing on-board computers cannot meet.
Disclosure of Invention
The embodiments of the invention provide a speed tracking control method and an automatic driving control system for a high-speed train, which fundamentally address the influence of external environmental disturbances on the speed tracking control of the high-speed train and help improve its speed tracking accuracy.
To achieve this purpose, the invention adopts the following technical solutions.
A speed tracking control method of a high-speed train comprises the following steps:
S1: obtaining the real-time position p(t) and ideal position p_d(t), and the real-time speed v(t) and ideal speed v_d(t), of the high-speed train;
S2: taking the difference between the real-time position p(t) and the ideal position p_d(t) to obtain the real-time position error e_p(t); taking the difference between the real-time speed v(t) and the ideal speed v_d(t) to obtain the real-time speed error e_v(t);
S3: obtaining a high-speed train optimization control model based on the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t);
S4: solving the high-speed train optimization control model through an integral reinforcement learning algorithm based on an actor-critic neural network structure to obtain the optimal control strategy of the high-speed train; and performing speed tracking control of the high-speed train based on that optimal control strategy.
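Steps S1 to S3 can be illustrated with a minimal sketch. It assumes the errors are formed as actual minus ideal (the text only says the two values are differenced), and the function name `composite_state` and the sample numbers are hypothetical.

```python
import numpy as np

def composite_state(p, v, p_d, v_d):
    """Build X(t) = [e_p, e_v, p_d, v_d]^T from measured and ideal values.

    Assumption: errors are actual minus ideal; the source only states
    that the two quantities are differenced.
    """
    e_p = p - p_d   # real-time position error
    e_v = v - v_d   # real-time speed error
    return np.array([e_p, e_v, p_d, v_d], dtype=float)

# Hypothetical sample: train is 5 m ahead of and 0.5 m/s slower than the target.
X = composite_state(p=1205.0, v=82.5, p_d=1200.0, v_d=83.0)
```

Carrying p_d(t) and v_d(t) inside the state lets the tracking problem be posed on a single augmented system rather than on the error dynamics alone.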
Preferably, step S3 specifically includes:
S31: combining the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t) to obtain the composite system state of the high-speed train, X(t) = [e_p(t), e_v(t), p_d(t), v_d(t)]^T (1), and, according to the train dynamics model
Figure BDA0002511137910000021
establishing the composite system expression of the high-speed train
Figure BDA0002511137910000022
Figure BDA0002511137910000023
where M is the train mass, f_a is the additional running resistance, f_b is the basic running resistance, F(X(t)) is the internal dynamics of the composite system,
Figure BDA0002511137910000024
is the input dynamics of the composite system, and u_r is the control input in the reinforcement learning process;
S32: based on formulas (1) and (2), obtaining the composite system value function of the high-speed train
Figure BDA0002511137910000025
where
Figure BDA0002511137910000026
Figure BDA00025111379100000212
the discount factor satisfies 0 < γ < 1, ρ(v) is the control input limit with the speed v as the independent variable, and Q and R are the state weight matrix and the input weight matrix, respectively.
Preferably, the integral reinforcement learning algorithm based on the actor-critic neural network structure includes an actor neural network and a critic neural network, and step S4 specifically includes:
S41: setting the critic neural network weight vector as W_1 and the actor neural network weight vector as W_2; setting the basis function of the critic and actor neural networks as
Figure BDA0002511137910000027
the first derivative of the basis function
Figure BDA0002511137910000028
with respect to the composite system state X of the high-speed train is
Figure BDA0002511137910000029
S42: updating the critic neural network weights through a first adaptive law, the first adaptive law comprising
Figure BDA00025111379100000210
and
Figure BDA00025111379100000211
where α_1 > 0 is the adaptive estimation rate coefficient;
S43: updating the actor neural network weights through a second adaptive law, the second adaptive law comprising
Figure BDA0002511137910000031
Figure BDA0002511137910000032
and
Figure BDA0002511137910000033
where α_2 > 0 is the adaptive estimation rate coefficient and Y > 0 is a control constant;
S44: based on the updated critic and actor neural network weights, obtaining the integral reinforcement learning control input of the high-speed train
Figure BDA0002511137910000034
and applying this control input to the control system of the high-speed train to obtain the train running state data at time t + T, where ρ is shorthand for ρ(v);
S45: executing steps S2, S31, S42, S43 and S44 multiple times to obtain an actor neural network weight data set;
S46: analyzing the actor neural network weight data set to obtain the optimal weight vector of the actor neural network
Figure BDA0002511137910000035
and, based on the optimal weight vector
Figure BDA0002511137910000036
obtaining the optimal control strategy of the high-speed train
Figure BDA0002511137910000037
An automatic driving control system of a high-speed train applying the above speed tracking control method comprises:
a train ideal position acquisition module, used for acquiring the ideal position p_d(t) of the high-speed train in real time;
a train ideal speed acquisition module, used for acquiring the ideal speed v_d(t) of the high-speed train in real time;
a train positioning module, used for acquiring the real-time position p(t) of the high-speed train;
a train speed measuring module, used for acquiring the real-time speed v(t) of the high-speed train;
a train control system state acquisition module, in communication connection with the train ideal position acquisition module, the train ideal speed acquisition module, the train positioning module and the train speed measuring module respectively, used for taking the difference between the real-time position p(t) and the ideal position p_d(t) to obtain the real-time position error e_p(t), taking the difference between the real-time speed v(t) and the ideal speed v_d(t) to obtain the real-time speed error e_v(t), and obtaining a high-speed train optimization control model based on the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t);
a control strategy generation module, used for solving the high-speed train optimization control model through an integral reinforcement learning algorithm based on an actor-critic neural network structure to obtain the optimal control strategy of the high-speed train;
and a train control module, used for performing speed tracking control based on the optimal control strategy of the high-speed train.
Preferably, the process by which the train control system state acquisition module obtains the high-speed train optimization control model based on the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t) specifically comprises:
combining the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t) to obtain the composite system state of the high-speed train, X(t) = [e_p(t), e_v(t), p_d(t), v_d(t)]^T (1), and the composite system expression of the high-speed train
Figure BDA0002511137910000041
where u_r is the control input in the reinforcement learning process;
obtaining the composite system value function of the high-speed train based on formulas (1) and (2)
Figure BDA0002511137910000042
where
Figure BDA0002511137910000043
Figure BDA00025111379100000417
the discount factor satisfies 0 < γ < 1, ρ(v) is the control input limit with the speed v as the independent variable, and Q and R are the state weight matrix and the input weight matrix, respectively.
Preferably, the control strategy generation module includes:
a critic neural network submodule, used for updating the critic neural network weights through a first adaptive law, the first adaptive law comprising
Figure BDA0002511137910000044
Figure BDA0002511137910000045
and
Figure BDA0002511137910000046
where W_1 is the critic neural network weight vector,
Figure BDA0002511137910000047
is the basis function of the critic neural network,
Figure BDA0002511137910000048
is the first derivative of the critic neural network basis function with respect to the composite system state X of the high-speed train, and α_1 > 0 is the adaptive estimation rate coefficient;
an actor neural network submodule, used for updating the actor neural network weight vector through a second adaptive law, the second adaptive law comprising
Figure BDA0002511137910000049
and
Figure BDA00025111379100000410
where W_2 is the actor neural network weight vector,
Figure BDA00025111379100000411
is the basis function of the actor neural network,
Figure BDA00025111379100000412
is the first derivative of the actor neural network basis function with respect to the composite system state X of the high-speed train, α_2 > 0 is the adaptive estimation rate coefficient, and Y > 0 is a control constant;
a train control input quantity submodule, used for obtaining the integral reinforcement learning control input of the high-speed train based on the updated critic and actor neural network weights
Figure BDA00025111379100000413
and sending this control input to the train control module;
a neural network weight vector register, used for storing the updated actor neural network weight vectors in real time to obtain an actor neural network weight data set;
an optimal control strategy submodule, used for analyzing the actor neural network weight data set to obtain the optimal weight vector of the actor neural network
Figure BDA00025111379100000414
and, based on the optimal weight vector
Figure BDA00025111379100000415
obtaining the optimal control strategy of the high-speed train
Figure BDA00025111379100000416
As can be seen from the technical solutions provided by the embodiments of the invention, the invention provides a speed tracking control method and an automatic driving control system for a high-speed train. The tracking control method designs an ATO control algorithm that is completely independent of the internal dynamic characteristics of the train control system. Based on integral reinforcement learning, it solves for the optimal train speed tracking control strategy by analyzing and using train operation state data, and performs train speed tracking control according to that strategy. This addresses the degradation in control performance caused by uncertain train dynamics and keeps the train control input within a preset limit, which avoids actuator saturation. Under this tracking control method, the automatic driving control system keeps the control input within the preset limit, controls the train to run according to a given target speed-distance curve, and thereby realizes automatic driving of the high-speed train.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a processing flow chart of the speed tracking control method for a high-speed train according to the invention;
Fig. 2 is a block diagram of the automatic driving control system of a high-speed train according to the invention;
Fig. 3 is a logic block diagram of the automatic driving control system of a high-speed train according to the invention;
Fig. 4 is a schematic diagram of the convergence process of the actor neural network weights in the speed tracking control method for a high-speed train according to the invention;
Fig. 5 is a graph of the actual train position and the ideal train position over the whole train operation process;
Fig. 6 is a graph of the actual train speed and the ideal train speed;
Fig. 7 is a graph of the real-time train position error;
Fig. 8 is a graph of the real-time train speed error;
Fig. 9 is a diagram of the maximum output force of the motor used in the preferred embodiment of the automatic driving control system for a high-speed train according to the invention;
Fig. 10 is a graph of the actual train traction and braking force.
In the figures:
201. train ideal position acquisition module; 202. train ideal speed acquisition module; 203. train positioning module; 204. train speed measuring module; 205. train control system state acquisition module; 206. control strategy generation module; 2061. critic neural network submodule; 2062. actor neural network submodule; 2063. train control input quantity submodule; 2064. neural network weight vector register; 2065. optimal control strategy submodule; 207. train control module.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
To aid understanding of the embodiments of the invention, several specific embodiments are further explained below with reference to the drawings; these embodiments do not limit the embodiments of the invention.
Referring to fig. 1, the speed tracking control method for a high-speed train provided by the invention comprises the following steps:
S1: acquiring in real time the real-time position p(t) and ideal position p_d(t), and the real-time speed v(t) and ideal speed v_d(t), of the high-speed train;
S2: taking the difference between the real-time position p(t) and the ideal position p_d(t) to obtain the real-time position error e_p(t); taking the difference between the real-time speed v(t) and the ideal speed v_d(t) to obtain the real-time speed error e_v(t);
S3: obtaining a high-speed train optimization control model based on the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t);
S4: solving the high-speed train optimization control model through an integral reinforcement learning algorithm based on an actor-critic neural network structure to obtain the optimal control strategy of the high-speed train; and performing speed tracking control of the high-speed train based on that optimal control strategy.
In the embodiment provided by the invention, the ideal position and ideal speed of the high-speed train are obtained as follows. A target speed-distance curve of the train is calculated from the movement authority received from the radio block center and the temporary speed restriction command received from the temporary speed restriction server. The running time between stations is obtained from the dispatching command received from the dispatching center. By splitting the target speed-distance curve and combining it with the inter-station running time, the ideal position p_d(t) and ideal speed v_d(t) of the high-speed train are obtained in real time.
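One way to turn a speed-distance curve into time-indexed references is sketched below. The source does not say how the curve is split, so this assumes constant acceleration within each segment and ignores any rescaling to the scheduled inter-station running time; the function names and the sample curve are hypothetical.

```python
import numpy as np

def time_index_curve(s, v):
    """Convert a speed-distance curve (s[i], v[i]) into node times t[i].

    Assumption: acceleration is constant within each segment, so the
    segment travel time is ds divided by the mean of the end speeds.
    """
    s = np.asarray(s, dtype=float)
    v = np.asarray(v, dtype=float)
    ds = np.diff(s)
    v_mean = 0.5 * (v[:-1] + v[1:])
    if np.any(v_mean <= 0.0):
        raise ValueError("each segment needs a positive mean speed")
    return np.concatenate(([0.0], np.cumsum(ds / v_mean)))

def ideal_state(t_query, s, v, t_nodes):
    """Return (p_d, v_d) at time t_query under piecewise-constant acceleration."""
    t_query = float(np.clip(t_query, t_nodes[0], t_nodes[-1]))
    i = int(np.searchsorted(t_nodes, t_query, side="right")) - 1
    i = min(i, len(t_nodes) - 2)          # keep the last node inside the final segment
    tau = t_query - t_nodes[i]
    a = (v[i + 1] - v[i]) / (t_nodes[i + 1] - t_nodes[i])
    v_d = v[i] + a * tau
    p_d = s[i] + v[i] * tau + 0.5 * a * tau ** 2
    return p_d, v_d

# Hypothetical curve: accelerate to 20 m/s over 200 m, cruise 800 m, brake to rest over 200 m.
s_curve = [0.0, 200.0, 1000.0, 1200.0]
v_curve = [0.0, 20.0, 20.0, 0.0]
t_nodes = time_index_curve(s_curve, v_curve)
p_d, v_d = ideal_state(30.0, s_curve, v_curve, t_nodes)
```

With this sample curve the three segments take 20 s, 40 s and 20 s, so a query at 30 s falls in the cruise segment.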
Further, the step S3 specifically includes the following sub-steps:
S31: combining the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t) to obtain the composite system state of the high-speed train, X(t) = [e_p(t), e_v(t), p_d(t), v_d(t)]^T (1), and, according to the train dynamics model
Figure BDA0002511137910000071
establishing the composite system expression of the high-speed train
Figure BDA0002511137910000072
Figure BDA0002511137910000073
where M is the train mass, f_a is the additional running resistance, f_b is the basic running resistance, F(X(t)) is the internal dynamics of the composite system,
Figure BDA0002511137910000074
is the input dynamics of the composite system, and u_r is the control input in the reinforcement learning process;
S32: based on formulas (1) and (2), defining the composite system value function of the high-speed train
Figure BDA0002511137910000075
where
Figure BDA0002511137910000076
Figure BDA00025111379100000710
the discount factor satisfies 0 < γ < 1, ρ(v) is the control input limit with the speed v as the independent variable, written as ρ below for convenience, and Q and R are the state weight matrix and the input weight matrix, respectively.
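Formulas (3) and (4) appear only as images in this text, so their exact form is not reproduced here. The sketch below assumes the nonquadratic input penalty commonly used in constrained-input optimal control, U(u) = 2 ∫_0^u ρ R atanh(s/ρ) ds, together with an exponentially discounted running cost; a scalar input is assumed, and the function names are hypothetical.

```python
import numpy as np

def input_penalty(u, rho, R):
    """Closed form of U(u) = 2 * integral_0^u rho * R * atanh(s / rho) ds, for |u| < rho.

    Assumption: this integrand is a common choice for bounded inputs;
    the patent's own formula (4) is not reproduced in the text.
    """
    z = u / rho
    return 2.0 * rho * R * u * np.arctanh(z) + rho**2 * R * np.log1p(-z**2)

def discounted_cost(X_traj, u_traj, Q, R, rho, gamma, dt):
    """Riemann-sum approximation of the discounted cost over a sampled trajectory."""
    t = np.arange(len(u_traj)) * dt
    state_cost = np.einsum("ki,ij,kj->k", X_traj, Q, X_traj)   # X^T Q X per sample
    stage = state_cost + input_penalty(np.asarray(u_traj, dtype=float), rho, R)
    return float(np.sum(np.exp(-gamma * t) * stage) * dt)

# Hypothetical usage: a trajectory with zero error and zero input costs nothing.
J = discounted_cost(np.zeros((3, 4)), np.zeros(3), np.eye(4),
                    R=1e-4, rho=300.0, gamma=0.5, dt=0.1)
```

A penalty of this shape grows steeply as |u| approaches ρ, which is one way a value function can encode the input limit instead of clipping the command afterwards.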
To minimize the composite system value function defined in sub-step S32 and thereby obtain the optimal control strategy, the embodiment provided by the invention uses an integral reinforcement learning algorithm based on an actor-critic neural network structure: the critic neural network evaluates the control strategy, the actor neural network improves the control strategy according to the evaluation result, and the control strategy is reinforced at a time interval T. In this embodiment, step S4 specifically includes the following sub-steps:
S41: setting the critic neural network weight vector as W_1 and the actor neural network weight vector as W_2; setting the basis function of the critic and actor neural networks as
Figure BDA0002511137910000077
the first derivative of the basis function
Figure BDA0002511137910000078
with respect to the composite system state X of the high-speed train is
Figure BDA0002511137910000079
S42: updating the critic neural network weights through a first adaptive law, the first adaptive law comprising
Figure BDA0002511137910000081
and
Figure BDA0002511137910000082
where α_1 > 0 is the adaptive estimation rate coefficient and U is defined by formula (4) above;
S43: updating the actor neural network weights through a second adaptive law, the second adaptive law comprising
Figure BDA0002511137910000083
Figure BDA0002511137910000084
and
Figure BDA0002511137910000085
where α_2 > 0 is the adaptive estimation rate coefficient, Y > 0 is a control constant, and
Figure BDA0002511137910000086
is the input dynamics of the composite system;
S44: based on the updated critic and actor neural network weights, obtaining the integral reinforcement learning control input of the high-speed train
Figure BDA0002511137910000087
and applying this control input to the control system of the high-speed train to obtain the train operation state data at time t + T;
S45: executing steps S2, S31, S42, S43 and S44 multiple times until the preset inter-station run is completed, obtaining an actor neural network weight data set that stores the updated actor neural network weights produced by each repetition of step S44;
S46: analyzing the updated actor neural network weights in the actor neural network weight data set to obtain the optimal weight vector of the actor neural network
Figure BDA0002511137910000088
and, based on the optimal weight vector
Figure BDA0002511137910000089
obtaining the optimal control strategy of the high-speed train
Figure BDA00025111379100000810
Figure BDA00025111379100000811
and sending it to the control system of the high-speed train to perform speed tracking control of the high-speed train.
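Two parts of sub-steps S44 to S46 can be sketched without the adaptive laws, which appear only as images in this text and are not attempted here. The sketch assumes a tanh-saturated actor output, a quadratic basis, and input dynamics G = [0, 1/M, 0, 0]^T. The source also does not say how the weight data set is analyzed, so averaging the settled tail of the log is an assumption. All function names and numbers are hypothetical.

```python
import numpy as np

def basis_jacobian(X):
    """Jacobian of a hypothetical quadratic basis phi(X) = [x_i * x_j for i <= j]."""
    n = len(X)
    rows = []
    for i in range(n):
        for j in range(i, n):
            g = np.zeros(n)
            g[i] += X[j]      # d(x_i * x_j) / d x_i
            g[j] += X[i]      # d(x_i * x_j) / d x_j (doubles up when i == j)
            rows.append(g)
    return np.array(rows)     # shape (n_basis, n)

def bounded_control(X, W2, G, R, rho):
    """Assumed actor output u = -rho * tanh(G^T grad_phi^T W2 / (2 rho R)); |u| <= rho."""
    d = G @ (basis_jacobian(X).T @ W2)
    return -rho * np.tanh(d / (2.0 * rho * R))

def select_weights(weight_log, tail=0.2, tol=1e-3):
    """Average the last `tail` fraction of logged actor weights and report
    whether that tail has settled (largest per-weight spread below tol)."""
    W = np.asarray(weight_log, dtype=float)
    k = max(1, int(len(W) * tail))
    tail_W = W[-k:]
    spread = tail_W.max(axis=0) - tail_W.min(axis=0)
    return tail_W.mean(axis=0), bool(np.max(spread) < tol)

M = 4.0e5                                   # hypothetical train mass in kg
rho, R = 3.0e5, 1.0e-4                      # hypothetical input limit (N) and weight
G = np.array([0.0, 1.0 / M, 0.0, 0.0])      # assumed input dynamics
X = np.array([5.0, -0.5, 1200.0, 83.0])
W2 = np.full(10, 1.0e3)                     # 10 quadratic basis terms for a 4-state X
u = bounded_control(X, W2, G, R, rho)

# Hypothetical weight log that converges geometrically to [1, 2].
weight_log = [np.array([1.0, 2.0]) + 0.5 ** k for k in range(60)]
W_opt, settled = select_weights(weight_log)
```

Because tanh is bounded by 1 in magnitude, the commanded force in this sketch never exceeds ρ regardless of the weights, which mirrors the stated goal of avoiding actuator saturation.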
In a second aspect, the present invention provides an automatic driving control system for a high-speed train to which the above speed tracking control method is applied, as shown in fig. 2 and 3, comprising:
a train ideal position acquisition module 201, used for acquiring the ideal position p_d(t) of the high-speed train in real time;
a train ideal speed acquisition module 202, used for acquiring the ideal speed v_d(t) of the high-speed train in real time;
a train positioning module 203, used for acquiring the real-time position p(t) of the high-speed train;
a train speed measuring module 204, used for acquiring the real-time speed v(t) of the high-speed train;
a train control system state acquisition module 205, in communication connection with the train ideal position acquisition module 201, the train ideal speed acquisition module 202, the train positioning module 203 and the train speed measuring module 204 respectively, used for taking the difference between the real-time position p(t) and the ideal position p_d(t) to obtain the real-time position error e_p(t), taking the difference between the real-time speed v(t) and the ideal speed v_d(t) to obtain the real-time speed error e_v(t), and obtaining a high-speed train optimization control model based on the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t);
a control strategy generation module 206, in communication connection with the train control system state acquisition module 205, used for solving the high-speed train optimization control model through an integral reinforcement learning algorithm based on an actor-critic neural network structure to obtain the optimal control strategy of the high-speed train;
and a train control module 207, in communication connection with the control strategy generation module 206, used for performing speed tracking control based on the optimal control strategy of the high-speed train.
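The data flow between modules 201 to 207 can be sketched as a single control step. Every callable below is a hypothetical stand-in for the corresponding module, the class name `AutoDriveLoop` is invented for illustration, and the proportional policy is only a placeholder, not the patent's learned strategy.

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class AutoDriveLoop:
    """Hypothetical wiring of modules 201 to 207; each field stands in for one module."""
    ideal_position: Callable[[float], float]     # module 201
    ideal_speed: Callable[[float], float]        # module 202
    measure_position: Callable[[], float]        # module 203
    measure_speed: Callable[[], float]           # module 204
    policy: Callable[[np.ndarray], float]        # module 206
    apply_force: Callable[[float], None]         # module 207

    def step(self, t: float) -> np.ndarray:
        p_d, v_d = self.ideal_position(t), self.ideal_speed(t)
        p, v = self.measure_position(), self.measure_speed()
        # Module 205: composite state, with errors assumed to be actual minus ideal.
        X = np.array([p - p_d, v - v_d, p_d, v_d])
        self.apply_force(self.policy(X))
        return X

sent = []   # records the commands handed to the stand-in train control module
loop = AutoDriveLoop(
    ideal_position=lambda t: 20.0 * t,
    ideal_speed=lambda t: 20.0,
    measure_position=lambda: 195.0,
    measure_speed=lambda: 19.0,
    policy=lambda X: -1.0e4 * X[1],   # placeholder proportional policy
    apply_force=sent.append,
)
X = loop.step(10.0)
```

Passing the modules in as callables keeps the sketch independent of any particular positioning, speed measurement or actuation interface.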
Further, the process by which the train control system state acquisition module obtains the high-speed train optimization control model based on the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t) specifically comprises:
combining the real-time position error e_p(t), the real-time speed error e_v(t), the ideal position p_d(t) and the ideal speed v_d(t) to obtain the composite system state of the high-speed train, X(t) = [e_p(t), e_v(t), p_d(t), v_d(t)]^T (1), and the composite system expression of the high-speed train
Figure BDA0002511137910000091
where u_r is the control input in the reinforcement learning process;
obtaining the composite system value function of the high-speed train based on formulas (1) and (2)
Figure BDA0002511137910000092
where
Figure BDA0002511137910000093
Figure BDA0002511137910000094
the discount factor satisfies 0 < γ < 1, ρ(v) is the control input limit with the speed v as the independent variable, and Q and R are the state weight matrix and the input weight matrix, respectively.
Further, the control strategy generation module 206 includes,
a criticizing nervousness network submodule 2061 for updating the criticizing nervousness network weight by a first adaptive law, the first adaptive law including
Figure BDA0002511137910000095
Figure BDA0002511137910000096
And
Figure BDA0002511137910000097
Figure BDA0002511137910000098
where $W_1$ is the critic neural network weight vector,
Figure BDA0002511137910000099
is the basis function of the critic neural network,
Figure BDA00025111379100000910
is the first derivative of the critic neural network basis function
Figure BDA00025111379100000911
with respect to the high-speed train composite system state $X$, and $\alpha_1 > 0$ is the adaptive estimation rate coefficient;
an actor neural network sub-module 2062 for updating the actor neural network weight vector through a second adaptive law, the second adaptive law comprising
Figure BDA0002511137910000101
Figure BDA0002511137910000102
And
Figure BDA0002511137910000103
where $W_2$ is the actor neural network weight vector,
Figure BDA0002511137910000104
is the basis function of the actor neural network,
Figure BDA0002511137910000105
is the first derivative of the actor neural network basis function
Figure BDA0002511137910000106
with respect to the high-speed train composite system state $X$, $\alpha_2 > 0$ is the adaptive estimation rate coefficient, and $Y > 0$ is a control constant;
a train control input quantity sub-module 2063 for obtaining the integral reinforcement learning control input of the high-speed train based on the updated critic neural network weights and actor neural network weights
Figure BDA0002511137910000107
and sending the integral reinforcement learning control input of the high-speed train to the train control module 207;
a neural network weight vector register 2064, configured to store the updated actor neural network weight vector in real time, and obtain an actor neural network weight data set;
an optimal control strategy sub-module 2065 for analyzing the actor neural network weight data set to obtain the optimal weight vector of the actor neural network
Figure BDA0002511137910000108
and, based on the optimal weight vector
Figure BDA0002511137910000109
obtaining the optimal control strategy of the high-speed train
Figure BDA00025111379100001010
Figure BDA00025111379100001011
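The interplay of the weight vector register 2064 and the optimal control strategy sub-module 2065 can be sketched as below. The text does not say how the weight data set is "analyzed", so the rule used here (average the last few stored vectors once they stop changing) is purely an assumption, and the class name, function name, `window` and `tol` are hypothetical.

```python
import numpy as np


class WeightRegister:
    """Stores every updated actor weight vector, in arrival order."""

    def __init__(self):
        self._history = []

    def store(self, w2):
        self._history.append(np.array(w2, dtype=float))

    def dataset(self):
        # One row per stored weight vector.
        return np.vstack(self._history)


def converged_weights(dataset, window=10, tol=1e-3):
    """Return the mean of the last `window` rows if they agree to within
    `tol` (largest absolute deviation from their mean); otherwise None."""
    data = np.asarray(dataset, dtype=float)
    if data.shape[0] < window:
        return None
    tail = data[-window:]
    center = tail.mean(axis=0)
    if np.max(np.abs(tail - center)) > tol:
        return None
    return center
```

Returning `None` until the tail has settled keeps the caller from switching to a tracking policy built on weights that are still moving.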
The invention further provides an embodiment illustrating its working process.
The train control module is described by the following high-speed train dynamics model:
Figure BDA00025111379100001012
where $p(t)$ and $v(t)$ are the position and speed of the train, respectively; during the reinforcement learning process, $u(t)$ is $u_r$ from step S44, and during speed tracking control, $u(t)$ is the control input $u$ from step S46; the train mass is $M = 500$ t, and the Davis coefficients are $a = 7.75 \times 10^{-3}$, $b = 2.278 \times 10^{-4}$, $c = 1.66 \times 10^{-5}$. The other control parameters are:
Figure BDA00025111379100001013
Figure BDA00025111379100001014
Figure BDA00025111379100001015
where $X_1$, $X_2$, $X_3$ and $X_4$ are the components of the composite system state $X$.
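For readers who want to experiment with the quoted parameters, a minimal simulation sketch follows. The dynamics model itself is given only as a formula image, so the structure used here is an assumption: a point mass with $\dot p = v$ and $M \dot v = u - f_b - f_a$, where the basic resistance takes the usual Davis form $f_b = M(a + b v + c v^2)$. The SI units, the per-unit-mass reading of the coefficients, the forward Euler step, and all function names are likewise assumptions.

```python
# Parameters quoted in the embodiment; the units are an assumption of this sketch.
M = 500e3    # train mass: 500 t expressed in kg
A = 7.75e-3  # Davis coefficient a (assumed N/kg)
B = 2.278e-4 # Davis coefficient b (assumed N*s/(m*kg))
C = 1.66e-5  # Davis coefficient c (assumed N*s^2/(m^2*kg))


def basic_resistance(v):
    """Basic running resistance f_b in newtons, assumed M*(a + b*v + c*v^2)."""
    return M * (A + B * v + C * v * v)


def euler_step(p, v, u, dt, f_a=0.0):
    """Advance position p (m) and speed v (m/s) by dt seconds under the
    control force u (N) and additional resistance f_a (N)."""
    acceleration = (u - basic_resistance(v) - f_a) / M
    return p + v * dt, v + acceleration * dt
```

Applying a force equal to `basic_resistance(v)` holds the speed constant, which gives a quick sanity check on the sign conventions.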
Fig. 4 shows the convergence of the actor neural network weights during the integral reinforcement learning process of the present invention; the converged actor neural network weight vector is:
Figure BDA0002511137910000111
Figure BDA0002511137910000112
Figure BDA0002511137910000113
The convergence of the actor neural network weights indicates that the control strategy in integral reinforcement learning has converged to the optimal one. Speed tracking control of the high-speed train is then carried out according to the obtained optimal strategy; that is, the integral reinforcement learning process in the fourth part of the block diagram is replaced by the optimal strategy control process.
Fig. 5 shows the actual and ideal train positions over the whole train operation, and Fig. 6 shows the actual and ideal train speeds; in Figs. 5 and 6 the dashed lines show the ideal position and speed and the solid lines show the actual position and speed. Fig. 7 shows the real-time train position error and Fig. 8 shows the real-time train speed error. Figs. 7 and 8 show that the optimal speed tracking control strategy obtained by integral reinforcement learning according to the present invention achieves accurate tracking of train position and speed.
Fig. 9 shows the maximum output force of the motor used in the simulation of the present invention. The maximum motor output of the train is a function of the train speed: for 0 < v < 50 the maximum output is constant, and for v ≥ 50 the maximum output is inversely proportional to the train speed. In Fig. 10, the solid line shows the actual train traction/braking force and the dashed line shows the traction/braking force limit; under the control method designed by the present invention, the control input stays within the limit.
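The speed-dependent limit described for Fig. 9 can be written as a small function. This is a sketch under stated assumptions: the plateau value `f_const` is hypothetical (the text gives no number), the knee at 50 follows the text in whatever speed unit the figure uses, and continuity at the knee is assumed. The clipping helper is included only to show what "within the limit" means; in the described method the control law itself is stated to keep the input bounded, not an external clip.

```python
def max_force(v, f_const=300.0, v_knee=50.0):
    """Control input limit rho(v): constant below the knee speed,
    inversely proportional to speed at and above it."""
    if v < v_knee:
        return f_const
    return f_const * v_knee / v


def clip_to_limit(u, v, f_const=300.0, v_knee=50.0):
    """Clamp a traction/braking command u to the interval [-rho(v), rho(v)]."""
    rho = max_force(v, f_const, v_knee)
    return max(-rho, min(rho, u))
```

The inverse-proportional branch corresponds to a constant-power region, since force times speed stays equal to `f_const * v_knee` there.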
In conclusion, the invention provides a speed tracking control method and an automatic driving control system for a high-speed train. The tracking control method provides an automatic train operation (ATO) control algorithm that is completely independent of the internal dynamics of the train control system: based on integral reinforcement learning, it solves for the optimal train speed tracking control strategy from train running state data and performs speed tracking control according to that strategy. This addresses the loss of control performance caused by uncertainty in the train dynamics and keeps the train control input within a preset limit, so that actuator saturation is avoided. Under this tracking control method, the automatic driving control system keeps the control input within the preset limit, thereby avoiding actuator saturation, and controls the train to run along a given target speed-distance curve, realizing automatic driving of the high-speed train.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it will be clear to those skilled in the art that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present invention may be embodied as a software product, which may be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the embodiments or of some parts of the embodiments.
The embodiments in this specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described briefly, and the relevant details can be found in the corresponding parts of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (2)

1. A speed tracking control method of a high-speed train is characterized by comprising the following steps:
S1: obtaining the real-time position $p(t)$ and ideal position $p_d(t)$, and the real-time speed $v(t)$ and ideal speed $v_d(t)$, of the high-speed train;
S2: differencing the real-time position $p(t)$ and the ideal position $p_d(t)$ to obtain the real-time position error $e_p(t)$; differencing the real-time speed $v(t)$ and the ideal speed $v_d(t)$ to obtain the real-time speed error $e_v(t)$;
S3: obtaining a high-speed train optimization control model based on the real-time position error $e_p(t)$, the real-time speed error $e_v(t)$, the ideal position $p_d(t)$ and the ideal speed $v_d(t)$; specifically comprising:
S31: combining the real-time position error $e_p(t)$, the real-time speed error $e_v(t)$, the ideal position $p_d(t)$ and the ideal speed $v_d(t)$ into the high-speed train composite system state $X(t) = [e_p(t), e_v(t), p_d(t), v_d(t)]^T$ (1), and, according to the train dynamics model
Figure FDA0003155789180000011
establishing the high-speed train composite system expression
Figure FDA0003155789180000012
Figure FDA0003155789180000013
where $M$ is the train mass, $f_a$ is the additional running resistance, $f_b$ is the basic running resistance, $F(X(t))$ is the internal dynamics of the composite system,
Figure FDA0003155789180000014
is the input dynamics of the composite system, and $u_r$ is the integral reinforcement learning control input of the high-speed train;
S32: obtaining a composite system value function of the high-speed train based on formulas (1) and (2):
$V(X(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)} \left[ X^T(\tau) Q X(\tau) + U \right] d\tau$ (3)
where,
Figure FDA0003155789180000015
Figure FDA0003155789180000016
the discount factor satisfies $0 < \gamma < 1$, $\rho(v)$ is the control input limit as a function of the speed $v$, and $Q$ and $R$ are the state weight matrix and the input weight matrix, respectively;
S4: solving the high-speed train optimization control model through an integral reinforcement learning algorithm based on an actor-critic neural network structure to obtain an optimal control strategy of the high-speed train; and performing speed tracking control on the high-speed train based on the optimal control strategy of the high-speed train;
the integral reinforcement learning algorithm based on the actor-critic neural network structure includes an actor neural network and a critic neural network, and step S4 specifically comprises:
S41: setting the critic neural network weight vector to $W_1$ and the actor neural network weight vector to $W_2$; setting the basis function of the critic neural network and the actor neural network to
Figure FDA0003155789180000017
the first derivative of the basis function
Figure FDA0003155789180000018
with respect to the high-speed train composite system state $X$ being
Figure FDA0003155789180000019
S42: updating the critic neural network weights through a first adaptive law, the first adaptive law comprising
Figure FDA0003155789180000021
And
Figure FDA0003155789180000022
where $\alpha_1 > 0$ is the adaptive estimation rate coefficient;
S43: updating the actor neural network weights through a second adaptive law, the second adaptive law comprising
Figure FDA0003155789180000023
Figure FDA0003155789180000024
And
Figure FDA0003155789180000025
where $\alpha_2 > 0$ is the adaptive estimation rate coefficient and $Y > 0$ is a control constant;
S44: obtaining the integral reinforcement learning control input of the high-speed train based on the updated critic neural network weights and actor neural network weights
Figure FDA0003155789180000026
applying the control input of the high-speed train to the high-speed train control system and acquiring train running state data at time $t + T$, where $\rho$ is shorthand for $\rho(v)$;
S45: executing steps S2, S31, S42, S43 and S44 a plurality of times to obtain an actor neural network weight data set;
S46: analyzing the actor neural network weight data set to obtain the optimal weight vector of the actor neural network
Figure FDA0003155789180000027
and, based on the optimal weight vector
Figure FDA0003155789180000028
obtaining the optimal control strategy of the high-speed train
Figure FDA0003155789180000029
2. An automatic driving control system of a high-speed train, characterized by performing the method of claim 1, comprising:
a train ideal position acquisition module for acquiring the ideal position $p_d(t)$ of the high-speed train in real time;
a train ideal speed acquisition module for acquiring the ideal speed $v_d(t)$ of the high-speed train in real time;
a train positioning module for acquiring the real-time position $p(t)$ of the high-speed train;
a train speed measuring module for acquiring the real-time speed $v(t)$ of the high-speed train;
a train control system state acquisition module, which is communicatively connected to the train ideal position acquisition module, the train ideal speed acquisition module, the train positioning module and the train speed measuring module, respectively, and is configured to difference the real-time position $p(t)$ and the ideal position $p_d(t)$ to obtain the real-time position error $e_p(t)$, to difference the real-time speed $v(t)$ and the ideal speed $v_d(t)$ to obtain the real-time speed error $e_v(t)$, and to obtain a high-speed train optimization control model based on the real-time position error $e_p(t)$, the real-time speed error $e_v(t)$, the ideal position $p_d(t)$ and the ideal speed $v_d(t)$;
a control strategy generation module for solving the high-speed train optimization control model through an integral reinforcement learning algorithm based on an actor-critic neural network structure to obtain an optimal control strategy of the high-speed train;
a train control module for performing speed tracking control based on the optimal control strategy of the high-speed train;
wherein the process by which the train control system state acquisition module obtains the high-speed train optimization control model based on the real-time position error $e_p(t)$, the real-time speed error $e_v(t)$, the ideal position $p_d(t)$ and the ideal speed $v_d(t)$ specifically comprises:
combining the real-time position error $e_p(t)$, the real-time speed error $e_v(t)$, the ideal position $p_d(t)$ and the ideal speed $v_d(t)$ into the high-speed train composite system state $X(t) = [e_p(t), e_v(t), p_d(t), v_d(t)]^T$ (1), and obtaining the high-speed train composite system expression
Figure FDA0003155789180000031
where $u_r$ is the integral reinforcement learning control input of the high-speed train;
obtaining a composite system value function of the high-speed train based on formulas (1) and (2):
$V(X(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)} \left[ X^T(\tau) Q X(\tau) + U \right] d\tau$ (3)
where,
Figure FDA0003155789180000032
Figure FDA0003155789180000033
the discount factor satisfies $0 < \gamma < 1$, $\rho(v)$ is the control input limit as a function of the speed $v$, and $Q$ and $R$ are the state weight matrix and the input weight matrix, respectively;
the control strategy generation module comprises:
a critic neural network sub-module for updating the critic neural network weights through a first adaptive law, the first adaptive law comprising
Figure FDA0003155789180000034
Figure FDA0003155789180000035
And
Figure FDA0003155789180000036
where $W_1$ is the critic neural network weight vector,
Figure FDA0003155789180000037
is the basis function of the critic neural network,
Figure FDA0003155789180000038
is the first derivative of the critic neural network basis function with respect to the high-speed train composite system state $X$, and $\alpha_1 > 0$ is the adaptive estimation rate coefficient;
an actor neural network sub-module for updating the actor neural network weight vector through a second adaptive law, the second adaptive law comprising
Figure FDA0003155789180000039
And
Figure FDA00031557891800000310
where $W_2$ is the actor neural network weight vector,
Figure FDA00031557891800000311
is the basis function of the actor neural network,
Figure FDA00031557891800000312
is the first derivative of the actor neural network basis function with respect to the high-speed train composite system state $X$, $\alpha_2 > 0$ is the adaptive estimation rate coefficient, and $Y > 0$ is a control constant;
a train control input quantity sub-module for obtaining the integral reinforcement learning control input of the high-speed train based on the updated critic neural network weights and actor neural network weights
Figure FDA00031557891800000313
and sending the integral reinforcement learning control input of the high-speed train to the train control module;
the neural network weight vector register is used for storing the updated actor neural network weight vectors in real time to obtain an actor neural network weight data set;
an optimal control strategy sub-module for analyzing the actor neural network weight data set to obtain the optimal weight vector of the actor neural network
Figure FDA0003155789180000041
and, based on the optimal weight vector
Figure FDA0003155789180000042
obtaining the optimal control strategy of the high-speed train
Figure FDA0003155789180000043

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010461495.3A CN111679577B (en) 2020-05-27 2020-05-27 Speed tracking control method and automatic driving control system of high-speed train

Publications (2)

Publication Number Publication Date
CN111679577A CN111679577A (en) 2020-09-18
CN111679577B true CN111679577B (en) 2021-11-05

Family

ID=72434290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010461495.3A Active CN111679577B (en) 2020-05-27 2020-05-27 Speed tracking control method and automatic driving control system of high-speed train

Country Status (1)

Country Link
CN (1) CN111679577B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112590738B (en) * 2020-12-23 2022-03-08 交控科技股份有限公司 ATO (automatic train operation) parking control method compatible with different inter-vehicle generations
CN112947055B (en) * 2021-03-04 2022-09-09 北京交通大学 Method for tracking and controlling displacement speed of magnetic suspension train based on echo state network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011250575A (en) * 2010-05-26 2011-12-08 Toshiba Corp Train control device
CN105243430A (en) * 2015-09-07 2016-01-13 北京交通大学 Train energy-saving operation target speed curve optimization method
CN108038545A (en) * 2017-12-06 2018-05-15 湖北工业大学 Fast learning algorithm based on Actor-Critic neutral net continuous controls
CN109213148A (en) * 2018-08-03 2019-01-15 东南大学 It is a kind of based on deeply study vehicle low speed with decision-making technique of speeding
CN109835354A (en) * 2019-01-14 2019-06-04 苏州工业园区职业技术学院 A kind of bullet train speed tracking control system based on RBF
CN110059646A (en) * 2019-04-23 2019-07-26 暗物智能科技(广州)有限公司 The method and Target Searching Method of training action plan model
CN110647031A (en) * 2019-09-19 2020-01-03 北京科技大学 Anti-saturation self-adaptive pseudo PID sliding mode fault tolerance control method for high-speed train
CN111141300A (en) * 2019-12-18 2020-05-12 南京理工大学 Intelligent mobile platform map-free autonomous navigation method based on deep reinforcement learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
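Neural adaptive chaotic control with constrained input using state and output feedback; Gao Shi-gen et al.; Chin. Phys. B; 20151231; Vol. 24, No. 1; pp. 1-7 *
Research on train speed tracking control based on radial basis function neural networks; Huang Juan et al.; Journal of Lanzhou Institute of Technology; 20190430; Vol. 26, No. 2; pp. 74-78 *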

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant