CN115879620A - Demand response bus look-ahead scheduling method - Google Patents


Publication number
CN115879620A
Authority
CN
China
Prior art keywords
vehicle
period
time
station
order
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211574144.9A
Other languages
Chinese (zh)
Inventor
巫威眺
邹弘辉
卢凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202211574144.9A
Publication of CN115879620A


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a demand response bus look-ahead scheduling method. The method comprises the following steps: establishing a Markov decision process model based on a rolling time domain framework; predicting future requirements based on quantile regression, an LSTM model and a Copula function; bringing future demands into a common scheduling model, establishing a prediction-based look-ahead scheduling model, and designing error correction mechanisms for look-ahead scheduling and order cancellation respectively; designing a decision pruning strategy, and compressing a solution space to reduce the calculation time; and solving the scheduling scheme by using an approximate dynamic programming algorithm. The invention can not only reduce the cost of operators, but also obviously improve the service quality, and achieve the fine organization and management of demand response bus dispatching. Meanwhile, the algorithm has good solving characteristics and a strong practical application prospect.

Description

Demand response bus look-ahead scheduling method
Technical Field
The invention relates to dynamic scheduling technology for demand response buses, and in particular to a look-ahead scheduling method for demand response buses.
Background
Demand response buses can change their routes and schedules dynamically, which lets them offer passengers more flexible and precise travel services, and they have great development potential in future public transport systems. Fixed-route public transport tends to be more cost effective because its schedules and routes are stable. Demand response transit, by contrast, faces major obstacles: its operating cost is high, and its service level is limited by practical factors such as fleet size and passengers' individual demands. On the one hand, operating cost depends on the number of orders served and on the vehicle scheduling plan, but controlling cost can cause vehicles to arrive early or late and can lower the response rate, which degrades the service level. On the other hand, passengers' future demand is uncertain at planning time, so a schedule that suits current demand is not necessarily suitable for future demand; that is, future demand can have a negative impact on current plans. As personalized travel develops, operators increasingly need to organize and manage demand response bus scheduling in a fine-grained way, so as to reduce cost and improve service efficiency.
In the demand response bus scheduling problem, the operator divides the operating time into several periods. In each period, a scheduling plan is made for all passenger travel orders imported into the system; the plan comprises vehicle routes, departure times, waiting times and the passenger-to-vehicle assignment. Vehicles start from different yards, serve passengers according to the routes and timetable given by the plan, and finally return to their yards.
In traditional demand response bus scheduling, operators generally have to handle both static orders (reserved in advance) and dynamic orders (submitted in real time), and to produce scheduling plans and timetables in real time. In demand response bus look-ahead scheduling, current demand is processed in real time and the influence of future demand on the current scheduling decision is also considered. In addition, the system may reject orders when transport resources are insufficient, and passengers may cancel orders at short notice because they are dissatisfied with the price, the vehicle's arrival time or the vehicle's route. Although a model based on deep learning can be built to predict future demand, the predictions contain errors that can strongly affect scheduling. Moreover, the passenger flow served by demand response buses is sparse and uneven, which introduces many missing values and abnormal peaks into the prediction data and further aggravates the prediction error. A demand response bus look-ahead scheduling model therefore has to produce a scheduling plan in real time while also accounting for the influence of future demand.
Current research on demand response bus scheduling has the following shortcomings: (1) A real demand response bus scenario involves collecting and processing static and dynamic demand, dynamically inserting orders and optimizing routes, rejecting demand that exceeds capacity, and so on. Optimizing routes covers both changing existing routes and dispatching new buses, and dispatching a new bus requires planning the yard selection, vehicle path, departure time, intermediate waiting times and so on. Current research omits one or more of these elements. (2) Planning models are mostly static or pseudo-dynamic. Static models simply treat passenger demand as reservations, so that an optimal route is obtained by solving the static demand globally once; such models essentially belong to the customized bus category. Pseudo-dynamic models start from an initial route and use simple means (such as penalty functions or OD-based matching) to insert real-time dynamic demand into the route so as to approximate dynamic optimization. These means only respond to dynamic demand as far as possible; they cannot guarantee that the route is optimal again after the new demand is inserted, so dynamic optimization in the strict sense is not achieved. (3) Solution algorithms are mostly heuristic, with exact algorithms used in only a few cases. The dynamic optimization problem of demand response buses requires an algorithm that responds quickly and produces high-quality solutions. Exact algorithms are impractical because their solution time conflicts with the need for fast response, and heuristic algorithms cannot guarantee solution quality, so a fast, high-quality solution algorithm is lacking.
Disclosure of Invention
The main purpose of the invention is to overcome the defects of the prior art by providing a look-ahead scheduling method that takes predicted demand into account within a rolling time domain planning framework. By predicting future demand accurately, the method addresses the requirement that a demand response bus look-ahead scheduling model must both produce a scheduling plan in real time and account for the influence of future demand.
The purpose of the invention is realized by at least one of the following technical solutions.
A demand response bus look-ahead scheduling method comprises the following steps:
S1, establishing a Markov decision process model based on a rolling time domain framework;
S2, predicting future demand based on quantile regression, an LSTM model and a Copula function;
S3, incorporating future demand into an ordinary scheduling model, establishing a prediction-based look-ahead scheduling model, and designing error correction mechanisms for look-ahead scheduling and order cancellation respectively;
S4, designing a decision pruning strategy, and compressing the solution space to reduce the calculation time;
S5, solving the scheduling scheme by using an approximate dynamic programming algorithm.
Further, in step S1, within the rolling time domain framework, delayed batch matching is used to allocate the orders of each period to the fleet. The operating time range is divided into a set of periods P = {p | p = 1, 2, …, |P|}, each of duration T. Ride orders submitted during a period are imported into the system but held back, and are matched to vehicles only when that period ends. The matching result is either acceptance or rejection, and the passenger is informed of it after a preset buffer time t_B following the end of the period. Orders matched in one period can be executed in the next period and, according to their execution state, are divided into served orders and matched-but-unserved orders. Because the passenger has already been matched and notified, a matched-but-unserved order may not be rejected; it is reassigned to a vehicle in the next period, and the reassigned vehicle may be the same vehicle or another one;
the rolling time domain framework comprises an order stage, a planning stage, an operation stage and a prediction stage. The order stage and the prediction stage come first and import orders into the system, the planning stage then plans routes for the imported orders, and the planned routes are finally executed in the operation stage. The scheduling plan of the current period depends on the execution state of the previous period. No dynamic orders are imported from the beginning of period |P|, so every stage except the operation stage ends at the end of period |P|-1; in period |P| the vehicle v is only in the operation stage and still executes the route planned in period |P|-1, [Figure BDA0003989241130000031]. Dynamic orders imported in period p+1 may cause the route of vehicle v planned in period p, [Figure BDA0003989241130000032], to be executed only partially. In each subsequent period the operator must therefore consider not only the passengers of dynamic orders newly imported in period p from boarding station i to alighting station j, [Figure BDA0003989241130000033], but also the matched-but-unserved passengers, [Figure BDA0003989241130000034]. Because the planning stage always lags the order stage by one period, a buffer time t_B of negligible length is set at the end of each period for planning the dynamic orders; the planned routes are executed in the operation stage immediately after the buffer time.
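The delayed batch matching described above can be illustrated with a minimal Python sketch. The order layout `(order_id, submit_time)`, the function names and the numeric values are assumptions made for illustration only; the patent text does not specify a data format.

```python
from collections import defaultdict

def period_of(submit_time, T):
    """Return the 1-based period in which an order submitted at submit_time is imported."""
    return int(submit_time // T) + 1

def batch_orders(orders, T, t_B):
    """Group orders by import period and compute when each batch is notified.

    orders: list of (order_id, submit_time) tuples (hypothetical layout).
    Returns {period: (notify_time, [order_ids])}.
    """
    batches = defaultdict(list)
    for order_id, submit_time in orders:
        batches[period_of(submit_time, T)].append(order_id)
    # Matching happens when the period ends; passengers are informed t_B later.
    return {p: (p * T + t_B, ids) for p, ids in sorted(batches.items())}

# Illustrative values: period duration T = 10, buffer time t_B = 0.5.
orders = [("r1", 2.0), ("r2", 9.5), ("r3", 10.0), ("r4", 27.0)]
schedule = batch_orders(orders, T=10.0, t_B=0.5)
```

Here orders r1 and r2 fall into period 1 and are answered together at time 10.5, which is the sense in which matching is "delayed" until the period ends.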
Further, the demand response transit (DRT) scheduling problem is modeled as a Markov decision process (MDP), as follows:
defining a state variable:
[Figure BDA0003989241130000035]
in formula (1),
[Figure BDA0003989241130000036] is the state variable of phase k of period p;
[Figure BDA0003989241130000037] is the state of the boarding station of order r in phase k of period p: if the passenger has already boarded, then [Figure BDA0003989241130000038]; otherwise [Figure BDA0003989241130000039];
[Figure BDA00039892411300000310] is the state of the alighting station of order r in phase k of period p: if the passenger has arrived, then [Figure BDA00039892411300000311]; otherwise [Figure BDA00039892411300000312];
[Figure BDA00039892411300000313] is the position of vehicle v in phase k of period p;
[Figure BDA00039892411300000314] is the remaining travel distance of vehicle v in phase k of period p;
[Figure BDA00039892411300000315] is the remaining passenger capacity of vehicle v in phase k of period p;
[Figure BDA00039892411300000316] is the accumulated travel time of vehicle v in phase k of period p;
[Figure BDA00039892411300000317] is the over-period time of vehicle v in period p: if the travel time of the vehicle exceeds one period duration T, then [Figure BDA00039892411300000318]; otherwise [Figure BDA00039892411300000319];
[Figure BDA00039892411300000320] is the driving route of vehicle v in period p;
the value of |K_p| in each period is as follows:
[Figure BDA00039892411300000321]
the planned route [Figure BDA00039892411300000322] may take longer than one period duration T, so the execution time may cover only some of the stations in the planned route; that is, the vehicle cannot finish the route planned in period p during the next period p+1, which would leave the state variable information inconsistent. To solve this problem, the over-period time of vehicle v in phase k of period p, [Figure BDA00039892411300000323] [Figure BDA00039892411300000324], is introduced: if the travel time of the vehicle exceeds one period duration T, then [Figure BDA00039892411300000325]; otherwise [Figure BDA00039892411300000326];
defining a decision variable:
[Figure BDA00039892411300000327]
in formula (3), if a vehicle v located in yard m is used in phase k of period p, then [Figure BDA00039892411300000328]; otherwise [Figure BDA00039892411300000329]; if order r is assigned to vehicle v in phase k of period p, then [Figure BDA00039892411300000330]; otherwise [Figure BDA00039892411300000331]; if vehicle v travels from station i to station j in phase k of period p, then [Figure BDA00039892411300000332]; otherwise [Figure BDA00039892411300000333]; [Figure BDA00039892411300000334] is the waiting time of vehicle v at station j in phase k of period p;
the scenarios are constructed as follows:
scenario 1:
[Figure BDA0003989241130000041]
this scenario indicates that a vehicle v in the yard waits for a period of time [Figure BDA0003989241130000042]. Because the vehicle waits in the yard, the station j it heads to in the current phase is still the yard m_v to which vehicle v belongs.
If the vehicle would reach boarding station i_r before the earliest boarding time ET_r declared by order r, it would have to wait at station i_r; the vehicle is therefore made to wait in the yard instead, and the waiting time is calculated as follows:
[Figure BDA0003989241130000043]
in formula (4), d_ij is the shortest distance from station i to station j; [Figure BDA0003989241130000044] is the running speed of the vehicle, and [Figure BDA0003989241130000045] is the accumulated travel time of vehicle v in phase k of period p;
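Formula (4) itself is not legible in this text, so the following Python sketch is only one plausible reading of the surrounding description: the vehicle waits in the yard just long enough that, after driving the shortest distance at the running speed, it reaches the boarding station no earlier than ET_r. The function name and the `max(0, ...)` form are assumptions, not the patent's stated formula.

```python
def yard_waiting_time(earliest_boarding, accumulated_time, distance, speed):
    """Waiting time in the yard so the vehicle does not arrive before ET_r.

    Assumed form: max(0, ET_r - (accumulated travel time + d_ij / speed)).
    """
    arrival_if_leaving_now = accumulated_time + distance / speed
    return max(0.0, earliest_boarding - arrival_if_leaving_now)

# Illustrative values: ET_r = 30, accumulated time 5, distance 10, speed 0.5.
wait = yard_waiting_time(earliest_boarding=30.0, accumulated_time=5.0,
                         distance=10.0, speed=0.5)
# A vehicle that would already arrive after ET_r does not wait at all.
no_wait = yard_waiting_time(earliest_boarding=20.0, accumulated_time=5.0,
                            distance=10.0, speed=0.5)
```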
scenario 2:
[Figure BDA0003989241130000046]
this scenario indicates that an in-transit vehicle v travels from station i to station j, where it may wait; an in-transit vehicle v heading for station j is in one of the following states:
1) the vehicle is empty and heads to a boarding station;
2) the vehicle carries passengers and heads to a boarding station;
3) the vehicle heads to an alighting station;
because no time window is specified for alighting stations, the third state is not discussed. Suppose the vehicle carries passengers when it heads to a boarding station: if it arrives there before the earliest time ET_r, it must wait for a while, but waiting at a boarding station is undesirable for the passengers already on board. More importantly, within the rolling time domain planning framework of DRT scheduling, if the waiting time [Figure BDA0003989241130000047] of vehicle v at station j in phase k of period p is long enough, the period duration T may be exceeded, which invalidates the decisions of that period. Therefore, when passengers are on board, boarding stations that would require waiting should be avoided. To this end, the following restriction is introduced:
before a scheduling decision is made, the component [Figure BDA0003989241130000048] of the state variable already records the driving route of vehicle v in period p. If [Figure BDA0003989241130000049] contains the alighting station corresponding to every served boarding station, i.e. no passenger is on board, the vehicle may go to any boarding station, and the waiting time [Figure BDA00039892411300000410] of vehicle v at station j in phase k of period p is calculated by formula (4). Otherwise, passengers are on board, and the vehicle is only allowed to go to boarding stations that it reaches within their time window; a planned route in which the vehicle arrives early at a boarding station while carrying passengers is invalid;
scenario 3:
Figure BDA0003989241130000051
this scenario indicates that the vehicle v returns to its yard m_v.
Further, a constraint is defined:
Figure BDA0003989241130000052
Figure BDA0003989241130000053
Figure BDA0003989241130000054
Figure BDA0003989241130000055
Figure BDA0003989241130000056
Figure BDA0003989241130000057
Figure BDA0003989241130000058
Figure BDA0003989241130000059
Figure BDA00039892411300000510
Figure BDA00039892411300000511
Figure BDA00039892411300000512
Figure BDA00039892411300000513
Figure BDA00039892411300000514
Figure BDA00039892411300000515
Figure BDA00039892411300000516
wherein constraint (5) limits the number of available vehicles in each yard to its maximum; constraint (6) requires that each vehicle be used at most once per period; constraint (7) ensures that all submitted orders are imported into the system; constraint (8) ensures that all orders have been processed by the beginning of period |P|; constraint (9) ensures that every imported order has been carried from its boarding station to the corresponding alighting station by the final phase [Figure BDA00039892411300000520]; constraint (10) indicates that, from period 1 to period |P|-1, at most two decisions can be made for order r: the route decision [Figure BDA00039892411300000517] of each phase contains only one boarding or alighting station of order r, and each order can be served by only one vehicle, so there are two decisions [Figure BDA00039892411300000518] associated with order r in each period; and because [Figure BDA00039892411300000519] and [Figure BDA0003989241130000061] form a tuple, there are also at most two decisions [Figure BDA0003989241130000062]; constraint (11) indicates that no decision is made within period |P|; constraints (12) and (13) ensure that the assigned orders do not exceed the remaining travel distance and the remaining capacity of the vehicle, respectively; constraint (14) requires that the lateness of a vehicle not exceed the threshold [Figure BDA0003989241130000063]; constraint (15) bounds the over-period time; constraints (16) to (19) define the domains of the decision variables: if a vehicle v located in yard m is used in phase k of period p, then [Figure BDA0003989241130000064]; otherwise [Figure BDA0003989241130000065]; if vehicle v travels from station i to station j in phase k of period p, then [Figure BDA0003989241130000066]; otherwise [Figure BDA0003989241130000067].
Defining a state transition equation:
(3) Phase state transition equation [Figure BDA0003989241130000068]
The phase state transition equation [Figure BDA0003989241130000069] describes how each attribute of the state variable changes under different decision variables; its purpose is to plan the route of each period;
the transfer function of each attribute of the state variable [Figure BDA00039892411300000610] of phase k of period p depends on the scenario of the decision variables, as follows:
Figure BDA00039892411300000611
Figure BDA00039892411300000612
Figure BDA00039892411300000613
Figure BDA00039892411300000614
Figure BDA00039892411300000615
Figure BDA00039892411300000616
in formulas (21) and (22), [Figure BDA00039892411300000617] and [Figure BDA00039892411300000618] are updated, respectively, once the vehicle is dispatched to a boarding station or an alighting station; formula (23) specifies the position of vehicle v; in formula (24), the remaining travel distance of vehicle v is unchanged in scenario 1, is reduced by the shortest distance between station i and station j in scenario 2, and is reset to the maximum travel distance in scenario 3, when the vehicle returns to the yard; in formula (25), the remaining capacity of vehicle v is unchanged in scenario 1, is reduced by the number of passengers of order r in scenario 2, and is reset to the maximum passenger capacity in scenario 3, when the vehicle returns to the yard; in formula (26), the accumulated travel time of vehicle v increases by the waiting time at the yard in scenario 1, by the travel time between station i and station j in scenario 2, and by the travel time, waiting time and service time in scenario 3;
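The per-scenario updates of remaining travel distance, remaining capacity and accumulated travel time described for formulas (24) to (26) can be sketched as follows. This is a minimal illustration: the `VehicleState` class, the function names and the numbers are hypothetical, and the order-status attributes of formulas (21) and (22) are omitted.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class VehicleState:
    position: str
    remaining_distance: float
    remaining_capacity: int
    accumulated_time: float

def wait_in_yard(state, waiting_time):
    """Scenario 1: distance and capacity unchanged, time grows by the yard wait."""
    return replace(state, accumulated_time=state.accumulated_time + waiting_time)

def travel(state, to_station, distance, travel_time, boarding_passengers=0):
    """Scenario 2: move from station i to station j and pick up an order's passengers."""
    return replace(
        state,
        position=to_station,
        remaining_distance=state.remaining_distance - distance,
        remaining_capacity=state.remaining_capacity - boarding_passengers,
        accumulated_time=state.accumulated_time + travel_time,
    )

def return_to_yard(state, yard, max_distance, max_capacity,
                   travel_time, waiting_time=0.0, service_time=0.0):
    """Scenario 3: distance and capacity are reset to their maxima at the yard."""
    return replace(
        state,
        position=yard,
        remaining_distance=max_distance,
        remaining_capacity=max_capacity,
        accumulated_time=state.accumulated_time + travel_time + waiting_time + service_time,
    )

s0 = VehicleState("yard", 100.0, 20, 0.0)
s1 = wait_in_yard(s0, 5.0)
s2 = travel(s1, "A", distance=12.0, travel_time=15.0, boarding_passengers=3)
s3 = return_to_yard(s2, "yard", 100.0, 20, travel_time=10.0)
```

Keeping the state immutable makes each phase's decision a pure function of the previous state, which matches the Markov property the model relies on.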
(4) Period state transition equation
Although the phase state transition equation [Figure BDA0003989241130000071] uses the state variable [Figure BDA0003989241130000072] of phase k of period p to plan the route of each period, the route [Figure BDA0003989241130000073] planned for vehicle v in period p has not yet been executed, so the state variable [Figure BDA00039892411300000721] does not change. The driving route [Figure BDA0003989241130000074] of vehicle v in period p+1 can only be determined after the route [Figure BDA0003989241130000075] planned in period p has been executed within the duration T of period p+1. A period state transition equation [Figure BDA0003989241130000076] is therefore required. The external information G_p for the change from period p to period p+1 can be obtained from the planned route [Figure BDA0003989241130000077] and the period duration T, and the initial state variable [Figure BDA0003989241130000078] of the next period can be calculated iteratively through the period state transition equation [Figure BDA0003989241130000079]. On this principle, the external information G_p between periods is obtained as follows;
the travel time actually available to vehicle v in period p is [Figure BDA00039892411300000710]; the external information G_p is the length of route that can be executed within the available travel time [Figure BDA00039892411300000711];
specifically, the station that the vehicle has reached when the available travel time [Figure BDA00039892411300000712] runs out is determined first; that station and each station passed on the way form the driving route [Figure BDA00039892411300000713]; the state variable [Figure BDA00039892411300000714] of the next period can then be determined by calculating the changes that the driving route makes to [Figure BDA00039892411300000715].
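Determining how much of a planned route is executed within the available travel time can be sketched as a prefix scan over the route legs. The representation of a route as a station list with per-leg times, and the rule that only fully completed legs count, are assumptions made for this illustration.

```python
def executed_route(planned_route, leg_times, available_time):
    """Return the stations reached within available_time and the time used.

    planned_route: stations in visiting order, starting at the current position.
    leg_times[i]: time from planned_route[i] to planned_route[i + 1]
                  (assumed to include any waiting on that leg).
    """
    executed = [planned_route[0]]
    elapsed = 0.0
    for station, leg in zip(planned_route[1:], leg_times):
        if elapsed + leg > available_time:
            break  # the next station cannot be reached before time runs out
        elapsed += leg
        executed.append(station)
    return executed, elapsed

# Illustrative values: three legs of 4, 5 and 3 time units, 10 units available.
route, used = executed_route(["yard", "A", "B", "C"], [4.0, 5.0, 3.0],
                             available_time=10.0)
```

The stations left over after the executed prefix correspond to the matched-but-unserved part of the plan that has to be re-planned in the next period.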
Further, an objective function is defined:
the cost of DRT scheduling includes fixed transportation cost, variable transportation cost and time penalty cost; the fixed transportation cost is related to the number of vehicles used, and the calculation formula is as follows:
Figure BDA00039892411300000716
in the above expression, α_f represents the fixed transportation cost per vehicle. The variable transportation cost depends on the distance traveled; since a new path segment is created in each phase k, the variable transportation cost can be calculated as follows:
Figure BDA00039892411300000717
the time window violation penalty cost covers early and late arrivals; the late-arrival penalty coefficient α_l and the early-arrival penalty coefficient α_e satisfy α le; if the vehicle arrives within the specified time window, the penalty cost is 0; otherwise, the penalty cost depends on the length of the violation; the early-arrival and late-arrival times of order r are calculated as [Figure BDA00039892411300000718] and [Figure BDA00039892411300000719]
[Figure BDA00039892411300000720]
the time window violation penalty cost of each phase is calculated as follows:
Figure BDA0003989241130000081
the cost function of the DRT model accumulates the cost newly generated in each phase k; the cost of phase k of period p equals the cost of the previous phase, [Figure BDA0003989241130000082], plus the variable transportation cost [Figure BDA0003989241130000083] and the penalty cost [Figure BDA0003989241130000084] incurred in that phase, as follows:
Figure BDA0003989241130000085
since the fixed transportation cost is calculated at the final phase [Figure BDA0003989241130000086], the fixed transportation cost CF is added to the cost of that final phase, as follows:
Figure BDA0003989241130000087
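The cost recursion described above can be sketched in Python. The exact formulas are not legible in this text, so several pieces are assumptions: early and late times are taken as the distance of the arrival time outside the time window, the variable cost is taken as a hypothetical per-distance rate times distance, and all numeric coefficients are illustrative.

```python
def fixed_cost(alpha_f, vehicles_used):
    """Fixed transportation cost: cost per vehicle times number of vehicles used."""
    return alpha_f * vehicles_used

def time_window_penalty(arrival, earliest, latest, alpha_e, alpha_l):
    """Penalty is 0 inside [earliest, latest], else proportional to the violation."""
    early = max(0.0, earliest - arrival)  # assumed form of the early-arrival time
    late = max(0.0, arrival - latest)     # assumed form of the late-arrival time
    return alpha_e * early + alpha_l * late

def phase_cost(previous_cost, variable_cost, penalty_cost):
    """Cost of a phase = cost of the previous phase + newly incurred costs."""
    return previous_cost + variable_cost + penalty_cost

def final_cost(last_phase_cost, alpha_f, vehicles_used):
    """The fixed cost is added once, at the final phase."""
    return last_phase_cost + fixed_cost(alpha_f, vehicles_used)

RATE = 2.0  # hypothetical variable cost per unit distance
c1 = phase_cost(0.0, RATE * 6.0,
                time_window_penalty(8.0, 10.0, 15.0, alpha_e=1.0, alpha_l=3.0))
c2 = phase_cost(c1, RATE * 4.0,
                time_window_penalty(17.0, 10.0, 15.0, alpha_e=1.0, alpha_l=3.0))
total = final_cost(c2, alpha_f=50.0, vehicles_used=1)
```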
to represent the DRT problem as an SADP model, the following optimal state cost function is proposed based on the Bellman equation:
[Figure BDA0003989241130000088]
where [Figure BDA0003989241130000089] is the optimal state cost function when the path policy [Figure BDA00039892411300000810] is executed in phase k of period p;
under look-ahead scheduling, the predicted random order matrix [Figure BDA00039892411300000811] and [Figure BDA00039892411300000812] need to be included in the objective function; to ensure the iterative stability of the SADP algorithm, let [Figure BDA00039892411300000813] denote the execution of the path policy [Figure BDA00039892411300000814] in phase k of period p, [Figure BDA00039892411300000815]; the objective function is then modified as follows:
[Figure BDA00039892411300000816]
passenger-vehicle assignment is crucial to route planning;
an order rejection mechanism is proposed, with the following rules:
once the orders cannot all be matched, so that [Figure BDA00039892411300000817], the planning process is terminated; the boarding station i_r and the alighting station j_r of each order r that cannot be matched are treated as potential rejection stations, and the problem is solved again after they are removed one by one, until the remaining orders can all be matched.
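The rejection rule above amounts to a remove-and-re-solve loop. The sketch below assumes a hypothetical callable `can_match_all` that stands in for the planner and reports which orders it failed to match; the toy capacity-based solver exists only to make the example runnable and is not the patent's matching procedure.

```python
def reject_until_matchable(orders, can_match_all):
    """Remove unmatched orders one at a time and re-solve until all remaining fit.

    can_match_all(orders) -> (ok, unmatched_orders); hypothetical planner interface.
    """
    accepted = list(orders)
    rejected = []
    while accepted:
        ok, unmatched = can_match_all(accepted)
        if ok:
            break
        victim = unmatched[0]  # remove one potential rejection at a time
        accepted.remove(victim)
        rejected.append(victim)
    return accepted, rejected

def toy_solver(orders, capacity=5):
    """Toy stand-in planner: greedily packs (name, passengers) into one vehicle."""
    load, unmatched = 0, []
    for name, size in orders:
        if load + size <= capacity:
            load += size
        else:
            unmatched.append((name, size))
    return (not unmatched), unmatched

accepted, rejected = reject_until_matchable(
    [("r1", 2), ("r2", 3), ("r3", 2), ("r4", 1)], toy_solver)
```

Re-solving after every single removal, rather than dropping all unmatched orders at once, gives later orders a chance to fit once an earlier one has been rejected.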
Further, in step S2, future demand is predicted based on quantile regression, an LSTM model and a Copula function, specifically as follows:
the future demand of each origin-destination (OD) pair is predicted with conditional quantile equations at different quantile levels, and an empirical distribution is constructed to obtain an interval estimate;
Figure BDA0003989241130000091
where ζ denotes the equation coefficients at quantile level τ; solving for the optimal ζ by linear programming gives the conditional τ-quantile equation [Figure BDA0003989241130000092]. Equation (34) can be solved separately for each quantile level τ, which gives [Figure BDA0003989241130000093] at the different quantile levels. To predict the future demand [Figure BDA0003989241130000094], the historical data of periods 1 to p, [Figure BDA0003989241130000095], are used as the training data set. Because future demand must be predicted at several quantile levels, let [Figure BDA0003989241130000096] denote the OD demand at quantile level τ_l, with the level set τ_l = {5%, 25%, 50%, 75%, 95%}; the loss function of the LSTM model is therefore replaced by the quantile regression objective in equation (35), as follows:
Figure BDA0003989241130000097
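Equation (35) is not legible here. A common form of the quantile regression objective is the pinball loss, and the sketch below assumes that form; it is written in plain Python rather than as an LSTM training loop, and the function names are illustrative.

```python
QUANTILE_LEVELS = (0.05, 0.25, 0.50, 0.75, 0.95)

def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for one observation at quantile level tau."""
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)

def multi_quantile_loss(y_true, preds_by_level):
    """Average pinball loss over all samples and quantile levels.

    preds_by_level: {tau: [prediction for each sample]} (hypothetical layout).
    """
    total, count = 0.0, 0
    for tau, preds in preds_by_level.items():
        for y, q in zip(y_true, preds):
            total += pinball_loss(y, q, tau)
            count += 1
    return total / count

# At tau = 0.95, under-prediction is penalized far more than over-prediction.
under = pinball_loss(10.0, 8.0, 0.95)
over = pinball_loss(10.0, 12.0, 0.95)
```

Training one output per level in `QUANTILE_LEVELS` with this loss yields the quantile samples from which the empirical distribution is later built.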
in this way, quantile samples [Figure BDA0003989241130000098] of the future demand of each OD pair are obtained. The demand [Figure BDA0003989241130000099] follows the distribution described by the quantile samples, i.e. [Figure BDA00039892411300000910], which can be expressed as an empirical distribution as follows:
[Figure BDA00039892411300000911]
thus, by sampling from the empirical distribution [Figure BDA00039892411300000912], a random predicted value [Figure BDA00039892411300000913] can be obtained for each OD pair, as follows:
Figure BDA00039892411300000914
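Drawing a random demand from an empirical distribution defined by predicted quantiles can be sketched with inverse-CDF sampling. The linear interpolation between quantile levels and the clamping of the tails to the 5% and 95% values are assumptions of this sketch, as are the example quantile values.

```python
import random

def sample_from_quantiles(levels, values, rng):
    """Inverse-CDF sampling with linear interpolation between predicted quantiles."""
    u = rng.random()
    if u <= levels[0]:
        return values[0]   # clamp the lower tail (assumption)
    if u >= levels[-1]:
        return values[-1]  # clamp the upper tail (assumption)
    for k in range(len(levels) - 1):
        if levels[k] <= u <= levels[k + 1]:
            span = levels[k + 1] - levels[k]
            return values[k] + (values[k + 1] - values[k]) * (u - levels[k]) / span

rng = random.Random(0)
levels = [0.05, 0.25, 0.50, 0.75, 0.95]
# Hypothetical predicted quantiles of next-period demand for two OD pairs.
quantiles_by_od = {("A", "B"): [0, 1, 2, 4, 7], ("B", "A"): [0, 0, 1, 2, 3]}
od_matrix = {od: sample_from_quantiles(levels, q, rng)
             for od, q in quantiles_by_od.items()}
```

Calling the sampler repeatedly for every OD pair gives the multiple prediction samples from which a random OD matrix is assembled.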
by repeating the random sampling process, multiple prediction samples can be obtained for each OD pair; sampling all OD pairs thus gives the random OD matrix of period p,
[Figure BDA00039892411300000915]
whose dimension is |S|^2, as follows:
Figure BDA00039892411300000916
the marginal distributions of different OD pairs may be correlated, and a joint distribution can capture this correlation;
therefore, the marginal distributions are combined into a joint demand distribution by means of a copula, as follows:
let [Figure BDA0003989241130000101] denote the cumulative distribution functions (CDFs) of the demands [Figure BDA0003989241130000102] of the individual OD pairs; the multivariate CDF of the random vector [Figure BDA0003989241130000103] composed of these variables can be expressed as follows:
[Figure BDA0003989241130000104]
for each OD pair, a random demand [Figure BDA0003989241130000106] can be obtained by sampling from the empirical distribution [Figure BDA0003989241130000105]. Given the marginal distribution [Figure BDA0003989241130000107] of each OD pair, i.e. the CDF of [Figure BDA0003989241130000108], Sklar's theorem states that the joint distribution of the random vector [Figure BDA0003989241130000109] can be defined in terms of the marginal distributions [Figure BDA00039892411300001010] of its components and a Copula function, as follows:
Figure BDA00039892411300001011
a Gaussian copula is used, as follows:
Figure BDA00039892411300001012
where C_G denotes the Gaussian copula and Φ_σ denotes the joint CDF of a multivariate normal distribution with covariance matrix σ and mean 0;
using C_G, a future demand sample can be drawn from the joint distribution of
Figure BDA00039892411300001013
as follows: a sample
Figure BDA00039892411300001014
is first drawn from C_G and then converted into
Figure BDA00039892411300001015
finally, the selected sample is mapped back through the inverse of the marginal CDFs, as follows:
Figure BDA00039892411300001016
in the end, the joint distribution function Φ_σ is obtained; substituting
Figure BDA00039892411300001017
into Φ_σ yields the joint occurrence probability
Figure BDA00039892411300001018
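The copula-based sampling described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the function names, the representation of each marginal distribution as a sorted list of quantile predictions, and the linear interpolation between quantiles are all hypothetical choices made for the example.

```python
import math
import random
from statistics import NormalDist


def cholesky(cov):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def inverse_empirical_cdf(quantile_values, u):
    """Map u in [0, 1] to a demand value by interpolating sorted quantile predictions."""
    pos = u * (len(quantile_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(quantile_values) - 1)
    frac = pos - lo
    return quantile_values[lo] + frac * (quantile_values[hi] - quantile_values[lo])


def sample_joint_demand(marginal_quantiles, corr, rng):
    """Draw one demand vector: Gaussian-copula dependence, empirical marginals."""
    low = cholesky(corr)
    # Independent standard normals, then correlated normals via the Cholesky factor.
    z = [rng.gauss(0.0, 1.0) for _ in corr]
    correlated = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]
    # The standard normal CDF turns each component into a uniform variable.
    std_normal = NormalDist()
    uniforms = [std_normal.cdf(x) for x in correlated]
    # The inverse marginal CDFs turn the uniforms into demand values.
    return [inverse_empirical_cdf(q, u) for q, u in zip(marginal_quantiles, uniforms)]
```

For instance, `sample_joint_demand([[0, 1, 2, 3, 4], [10, 12, 14, 16, 18]], [[1.0, 0.8], [0.8, 1.0]], random.Random(0))` returns one demand value per OD pair, each within the range of its own quantile predictions.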
Further, in step S3, the future demand is incorporated into the ordinary scheduling model, a prediction-based look-ahead scheduling model is established, and error correction mechanisms are designed for look-ahead scheduling and for order cancellation respectively, specifically as follows:
unlike ordinary scheduling, look-ahead scheduling takes into account not only
Figure BDA00039892411300001019
and
Figure BDA00039892411300001020
but also the predicted future demand of the upcoming period p + 1; in this scheduling mode, the future demand of the next period
Figure BDA00039892411300001021
is predicted in advance at period p and optimized together with
Figure BDA00039892411300001022
and
Figure BDA00039892411300001023
therefore, the training used to predict future demand should be carried out periodically on a rolling basis; at period p, in order to predict the future demand of period p + 1, the training data set should include the demands of the previous periods and the current period, i.e.,
Figure BDA00039892411300001024
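The rolling training scheme can be illustrated with a short sketch. The function name and the representation of the demand history as a plain list indexed by period are assumptions made for this example only.

```python
def rolling_training_sets(demand_by_period):
    """Yield (target_period, training_data) pairs.

    To predict period p + 1, the training data contains the demands of
    periods 1..p (previous periods plus the current one). Periods are 1-based.
    """
    for p in range(1, len(demand_by_period) + 1):
        yield p + 1, demand_by_period[:p]
```

With a history of three observed periods, the generator produces the training sets used to predict periods 2, 3 and 4 in turn, each one extending the previous set by the most recent period.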
Further, in step S3, a correction mechanism is proposed to reduce the negative effect of prediction errors;
because the vehicle executes the planned path of the look-ahead schedule made at the end of period p
Figure BDA00039892411300001025
the route cannot be adjusted within the duration T of the next period; thus, only when vehicle v arrives at boarding station i_r can it be checked whether the remaining passenger capacity
Figure BDA00039892411300001026
can accommodate the actual number of passengers;
the prediction error can therefore be classified into two types: the predicted value is larger than the actual value, i.e.,
Figure BDA00039892411300001027
or smaller than the actual value, i.e.,
Figure BDA00039892411300001028
in the former case, the remaining passenger capacity can accommodate the actual number of passengers, so it is only necessary to update the original remaining passenger capacity
Figure BDA00039892411300001029
to the correct value
Figure BDA0003989241130000111
in the latter case, the planned route becomes invalid;
to avoid wasting the travel distance and travel time already consumed, for a separable order the vehicle picks up only
Figure BDA0003989241130000112
passengers, while the surplus
Figure BDA0003989241130000113
passengers are treated as a new dynamic order that, at the end of period p + 1, is imported and planned together with
Figure BDA0003989241130000114
in contrast, for an inseparable order, if the remaining passenger capacity is insufficient, the order must be rejected;
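The capacity check performed when the vehicle reaches the boarding station can be sketched as below. The names `BoardingOutcome`, `correct_at_boarding` and `seats_free` (seats available on arrival, before the order boards) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BoardingOutcome:
    boarded: int             # passengers picked up now
    carried_over: int        # surplus passengers re-imported as a new dynamic order
    rejected: int            # passengers of an order that had to be rejected
    remaining_capacity: int  # corrected remaining capacity after boarding


def correct_at_boarding(seats_free, actual, separable):
    """Correct a demand prediction error once the actual passenger count is known."""
    if actual <= seats_free:
        # The vehicle can hold everyone: only the remaining capacity is updated.
        return BoardingOutcome(actual, 0, 0, seats_free - actual)
    if separable:
        # Separable order: board as many as fit, carry the surplus over.
        return BoardingOutcome(seats_free, actual - seats_free, 0, 0)
    # Inseparable order with insufficient capacity: reject the whole order.
    return BoardingOutcome(0, 0, actual, seats_free)
```

In this sketch the carried-over passengers would be handed to the planner as a new dynamic order at the end of period p + 1; that hand-off is outside the scope of the example.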
for order cancellation, the prediction error can likewise be classified into two types:
the future demand is predicted not to be cancelled, but is actually cancelled;
the future demand is predicted to be cancelled, but is actually not cancelled;
for the former, the erroneous state variables can be updated directly; for the latter, there are three states: 'has not yet arrived at the boarding point', 'has just arrived at the boarding point', and 'has already passed the boarding point'; the prediction errors of the first two states can be corrected by direct updating, whereas for the 'has already passed the boarding point' state another vehicle needs to be dispatched to serve the order at the end of period p + 1.
Further, in step S4, a decision pruning strategy is designed to compress the solution space and reduce the calculation time, specifically as follows:
the lower-bound pruning of the vehicle's remaining range is as follows:
since vehicle v must transport all on-board passengers to their corresponding alighting points, it is useful to judge, before the vehicle travels along arc (i, j) ∈ A to a new boarding point j = i_r, whether its remaining cruising range can meet the mileage requirement of the subsequent path;
Theorem 1: when vehicle v goes to boarding point j = i_r at stage k of period p, the remaining cruising range always has a lower bound
Figure BDA0003989241130000115
Theorem 2: compared with the standard dynamic programming algorithm, the state space saved over all stages of period p after adopting the pruning operation is
Figure BDA0003989241130000116
the waiting time is optimized as follows:
the vehicle can wait while it is in the parking lot so as to optimize its departure time and avoid incurring a large time penalty cost when traveling to the first boarding point. When the vehicle waits, the boarding point i_r to be served has already been decided, but the path after that boarding point cannot yet be observed, so it is the arrival time of the vehicle at the boarding point that is optimized; the optimal value of the waiting time
Figure BDA0003989241130000117
is calculated as shown in equation (43):
Figure BDA0003989241130000118
by equation (44), the optimal penalty cost
Figure BDA0003989241130000119
is calculated as follows:
Figure BDA0003989241130000121
the return-to-parking-lot pruning is as follows:
when the vehicle has just visited an alighting point and no passengers remain on board, the vehicle faces two decisions:
(e) go to another boarding point;
(f) return to its home parking lot m_v.
The pruning strategy is as follows: when
Figure BDA0003989241130000122
holds, decision (f) is made; when
Figure BDA0003989241130000123
holds, decision (e) is made.
Further, in step S5, the scheduling scheme is solved by using an approximate dynamic programming algorithm, specifically as follows:
in iteration
Figure BDA0003989241130000124
the operator uses the value function obtained after iteration
Figure BDA0003989241130000125
namely
Figure BDA0003989241130000126
to make the decision
Figure BDA0003989241130000127
Figure BDA0003989241130000128
the value estimate of the post-decision state is approximated by temporal-difference (TD) updating; after
Figure BDA0003989241130000129
iterations,
Figure BDA00039892411300001210
converges to
Figure BDA00039892411300001211
Figure BDA00039892411300001212
when λ = 0 in TD(λ), the special case TD(0) is obtained, as shown in equation (48):
Figure BDA00039892411300001213
because the unbiased sample estimate of the final stage
Figure BDA00039892411300001214
involves the value function of the initial stage of the next period
Figure BDA00039892411300001215
whereas, owing to incomplete execution of the planned path, the state
Figure BDA00039892411300001216
is often not equal to
Figure BDA00039892411300001217
direct iteration would cause a large deviation in
Figure BDA00039892411300001218
so a value function rolling strategy needs to be adopted;
suppose that in iteration
Figure BDA00039892411300001235
the planned path of period p
Figure BDA00039892411300001219
is actually executed by the vehicle only up to
Figure BDA00039892411300001220
then the initial state of period p + 1 is in fact equal to
Figure BDA00039892411300001221
that is,
Figure BDA00039892411300001222
to keep the updates of the state value function stable across the end of one period and the beginning of the next, after iteration
Figure BDA00039892411300001223
the value of state
Figure BDA00039892411300001224
is rolled to
Figure BDA00039892411300001225
as in equation (49):
Figure BDA00039892411300001226
in iteration
Figure BDA00039892411300001227
although the final state
Figure BDA00039892411300001228
is related to
Figure BDA00039892411300001229
the initial state of the next period
Figure BDA00039892411300001230
inherits the state of the current period
Figure BDA00039892411300001231
so its calculation should add the value obtained when, in state
Figure BDA00039892411300001232
the decision
Figure BDA00039892411300001233
is made:
Figure BDA00039892411300001234
similarly, the value function of
Figure BDA0003989241130000131
can be approximated with the TD(λ) and TD(0) methods, as in equation (51):
Figure BDA0003989241130000132
at the initial stage of period p, n random OD matrices of period p + 1 given by the prediction model are introduced, and the sample decisions made under the n random demands
Figure BDA0003989241130000133
are as follows:
Figure BDA0003989241130000134
when the samples have run from state
Figure BDA0003989241130000135
to
Figure BDA0003989241130000136
n sample planned paths are formed; to evaluate the impact of random demand on each sample planned path, the value of each sample path is approximated using the TD(λ) and TD(0) methods:
Figure BDA0003989241130000137
because the random OD matrices
Figure BDA0003989241130000138
differ in their joint occurrence probabilities
Figure BDA0003989241130000139
the minimum expected value function is searched among the value functions
Figure BDA00039892411300001310
and this minimum and its corresponding random demand are taken, respectively, as the value function under the look-ahead strategy and the optimal random demand of period p + 1:
Figure BDA00039892411300001311
Compared with the prior art, the invention has the following advantages and effects:
the invention establishes a Markov decision process model under a rolling time domain framework; predicts future demand based on quantile regression, an LSTM model and a copula function; incorporates the future demand into the ordinary scheduling model to establish a prediction-based look-ahead scheduling model, with error correction mechanisms designed for look-ahead scheduling and for order cancellation respectively; designs a decision pruning strategy that compresses the solution space to reduce the calculation time; and solves the scheduling scheme by using an approximate dynamic programming algorithm. Fine-grained organization and management of demand response bus scheduling is thereby achieved, reducing cost and improving service efficiency.
Drawings
FIG. 1 is a flow chart of a demand response bus look-ahead scheduling method in an embodiment of the present invention;
FIG. 2 is a flow chart of scheduling in an embodiment of the present invention;
FIG. 3 is a diagram of a rolling time domain scheduling framework in an embodiment of the invention;
FIG. 4 is a schematic diagram of the supercycle time in an embodiment of the invention;
FIG. 5 is a schematic diagram of decision variables in an embodiment of the present invention;
FIG. 6 is a diagram of the value function rolling and look-ahead strategy in an embodiment of the invention;
FIG. 7 is a distribution diagram of the bus stations and parking lots of the "Pin Bus" project in an embodiment of the present invention;
FIG. 8 is a schematic diagram of the EAUC calculation results for the "Pin Bus" project in an embodiment of the present invention;
FIG. 9 is a schematic diagram of the EAPM calculation results for the "Pin Bus" project in an embodiment of the present invention;
FIG. 10 is a schematic diagram of the MUR calculation results for the "Pin Bus" project in an embodiment of the present invention;
FIG. 11 is a schematic diagram of the RR calculation results for the "Pin Bus" project in an embodiment of the present invention;
FIG. 12 is a schematic diagram of the LR calculation results for the "Pin Bus" project in an embodiment of the present invention;
FIG. 13 is a schematic diagram of the ALT calculation results for the "Pin Bus" project in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto. First, in order to facilitate the following description of the mathematical model, table 1 lists symbolic variables related to the present invention.
TABLE 1 symbolic variables
Figure BDA0003989241130000141
Figure BDA0003989241130000151
Figure BDA0003989241130000161
As shown in FIG. 1 and FIG. 2, a demand response bus look-ahead scheduling method comprises the following steps:
S1, establishing a Markov decision process model based on a rolling time domain framework;
as shown in FIG. 3, the rolling time domain framework includes an order stage, a planning stage, an operation stage and a prediction stage: the order stage and the prediction stage import orders into the system, the planning stage plans routes for the imported orders, and the planned routes are executed in the operation stage; the scheduling plan of the current period is related to the execution state of the previous period; no dynamic orders are imported at the beginning of period |P|, and in period |P| vehicle v, which is only in the operation stage, executes the planned route
Figure BDA0003989241130000162
a dynamic order imported at period p + 1 causes the planned route of vehicle v at period p
Figure BDA0003989241130000163
to be executed incompletely; in each subsequent period, the operator must therefore consider not only the passengers of the dynamic orders newly imported in period p who travel from boarding station i to alighting station j
Figure BDA0003989241130000164
but also the matched but unserved passengers
Figure BDA0003989241130000165
since the planning stage always lags behind the order stage by one period, a buffer time t_B of negligible length is set at the end of each period to plan the dynamic orders; the planned route is executed in the operation stage immediately after the buffer time.
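The timing of the rolling framework can be sketched as follows. This is an illustrative reading of the description rather than the patented procedure: the function names are hypothetical, orders are represented as `(order_id, request_time)` pairs, and it is assumed that orders requested during the last period |P| are not imported because that period only executes previously planned routes.

```python
def period_of(request_time, start_time, period_length, num_periods):
    """Return the 1-based period containing request_time, or None if not imported."""
    offset = request_time - start_time
    if offset < 0:
        return None
    p = int(offset // period_length) + 1
    # Assumption: nothing is imported during the last period |P| (operation only).
    return p if p < num_periods else None


def batch_orders(orders, start_time, period_length, num_periods):
    """Group (order_id, request_time) pairs by the period whose end triggers planning."""
    batches = {p: [] for p in range(1, num_periods)}
    for order_id, request_time in orders:
        p = period_of(request_time, start_time, period_length, num_periods)
        if p is not None:
            batches[p].append(order_id)
    return batches


def planning_and_operation_windows(p, start_time, period_length, buffer_time):
    """Orders of period p are planned in the buffer after p ends, then executed."""
    period_end = start_time + p * period_length
    planning = (period_end, period_end + buffer_time)
    operation = (period_end + buffer_time, period_end + period_length)
    return planning, operation
```

With T = 10 and t_B = 2, the orders collected during period 1 are planned in the window (10, 12) and executed in the window (12, 20), which corresponds to the one-period lag between the order stage and the planning stage.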
Further, the DRT scheduling problem is modeled as a Markov Decision Process (MDP), as follows:
defining the state variables:
Figure BDA0003989241130000166
the specific value of |K_p| in each period is as follows:
Figure BDA0003989241130000167
as shown in FIG. 4, the planned route of period 1 is
Figure BDA0003989241130000168
but when execution of period 2 is due to begin, the vehicle is still between station 2 and station 3. That is, the system would import new orders and regenerate the route before the vehicle has arrived at station 3. A reasonable improvement is to let the vehicle continue to station 3 along the previously planned route, so that the travel route of period 1 changes from
Figure BDA0003989241130000171
to
Figure BDA0003989241130000172
where the extra time consumed in traveling on to station 3 is recorded as the supercycle time
Figure BDA0003989241130000173
Defining decision variables:
Figure BDA0003989241130000174
as shown in fig. 5, the decision scenario is constructed as follows:
scenario 1:
Figure BDA0003989241130000175
scenario 2:
Figure BDA0003989241130000176
scenario 3:
Figure BDA0003989241130000177
if the vehicle would reach the boarding station of order r before the earliest boarding time ET_r declared by the order, it has to wait at boarding station i_r; it is more reasonable to let the vehicle wait in the parking lot instead, and the waiting time is calculated as follows:
Figure BDA0003989241130000178
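The idea of shifting the wait from the boarding station to the parking lot can be illustrated with a simple sketch. The formula of the invention is given in the figure above and is not reproduced here; the expression below is an assumed simplification in which the vehicle delays its departure just enough to arrive at the earliest boarding time, and the function and parameter names are hypothetical.

```python
def depot_waiting_time(now, travel_time, earliest_boarding_time):
    """Waiting time in the parking lot so that arrival is not before ET_r.

    Assumption: leaving at `now` the vehicle would arrive at `now + travel_time`;
    if that is earlier than ET_r, the difference is spent waiting in the lot.
    """
    return max(0.0, earliest_boarding_time - (now + travel_time))
```

If the vehicle would already arrive at or after the earliest boarding time, the function returns zero and the vehicle departs immediately.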
further, an objective function is defined:
the cost of DRT scheduling includes fixed transportation cost, variable transportation cost and time penalty cost; the fixed transportation cost is related to the number of vehicles used, and the calculation formula is as follows:
Figure BDA0003989241130000179
in the above expression, β_f represents the fixed transportation cost of each vehicle.
The cost of variable transportation for each stage k can be calculated as follows:
Figure BDA00039892411300001710
the time penalty cost per phase k can be calculated as follows:
Figure BDA00039892411300001711
the cost function of the DRT model depends on the cost newly incurred at each stage k; the cost at stage k of period p equals the cost of the previous stage
Figure BDA00039892411300001712
plus the variable transportation cost incurred at that stage
Figure BDA00039892411300001713
and the penalty cost
Figure BDA00039892411300001714
as follows:
Figure BDA0003989241130000181
since the fixed transportation cost is calculated at the final stage of the final period
Figure BDA0003989241130000182
the fixed transportation cost CF is added to the cost of that final stage, as follows:
Figure BDA0003989241130000183
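The stage-by-stage cost recursion can be sketched as below. The sketch flattens all stages of all periods into one sequence and assumes that the fixed cost CF equals the number of vehicles used multiplied by β_f; the function and parameter names are hypothetical.

```python
def accumulate_costs(variable_costs, penalty_costs, num_vehicles_used, fixed_cost_per_vehicle):
    """Cumulative cost per stage; the fixed cost is added only at the final stage."""
    cumulative = []
    running = 0.0
    for cv, cp in zip(variable_costs, penalty_costs):
        # Cost of this stage = cost of previous stage + variable cost + penalty cost.
        running += cv + cp
        cumulative.append(running)
    if cumulative:
        # Fixed transportation cost CF, assumed to be vehicles used times beta_f.
        cumulative[-1] += num_vehicles_used * fixed_cost_per_vehicle
    return cumulative
```

For two stages with variable costs 1.0 and 2.0, penalty costs 0.5 and 0.0, and two vehicles at a fixed cost of 10.0 each, the cumulative costs are 1.5 after the first stage and 23.5 after the final stage.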
to express the DRT problem as an SADP model, the following optimal state value function is proposed based on the Bellman equation:
Figure BDA0003989241130000184
where
Figure BDA0003989241130000185
is the optimal state value function obtained when stage k of period p executes the path policy
Figure BDA0003989241130000186
Under look-ahead scheduling, the predicted random order matrix
Figure BDA0003989241130000187
and its occurrence probability
Figure BDA0003989241130000188
need to be included in the objective function; let
Figure BDA0003989241130000189
denote the value obtained when stage k of period p executes the path policy
Figure BDA00039892411300001810
the objective function is then modified as follows:
Figure BDA00039892411300001811
an order rejection mechanism is proposed, whose rules are as follows:
once the orders cannot all be matched, so that
Figure BDA00039892411300001812
the planning process is terminated; the boarding station i_r and alighting station j_r of each order r that cannot be matched are regarded as potential rejection stations, and the problem is solved again after removing them one by one, until all remaining orders can be matched.
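The reject-and-resolve loop can be sketched as follows. The planner itself is not part of the example: `solve` is a hypothetical callback that takes the list of active orders and returns the subset it could not match, and the rule of rejecting the first unmatched order in list order is an assumption made for the sketch.

```python
def plan_with_rejection(orders, solve):
    """Remove unmatched orders one at a time and re-solve until all remaining match.

    solve(active) must return the set of orders in `active` that cannot be matched.
    Returns (accepted_orders, rejected_orders).
    """
    active = list(orders)
    rejected = []
    unmatched = set(solve(active)) & set(active)
    while unmatched:
        # Pick one potential rejection candidate, remove it, and solve again.
        candidate = next(o for o in active if o in unmatched)
        active.remove(candidate)
        rejected.append(candidate)
        unmatched = set(solve(active)) & set(active)
    return active, rejected
```

Because one order is removed per pass and the planner is re-run each time, an order that was unmatched in an early pass can still be accepted later if removing another order frees enough capacity.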
S2, predicting future demand based on quantile regression, an LSTM model and a copula function, specifically as follows:
the future demand of each OD pair is predicted with conditional quantile equations at different quantile levels, so as to construct an empirical distribution and realize interval estimation; the optimization model of quantile regression is modified into formula (35), as follows:
Figure BDA00039892411300001813
an LSTM model is adopted to predict the future demand
Figure BDA00039892411300001814
and the loss function is modified as follows:
Figure BDA00039892411300001815
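For orientation, the standard quantile (pinball) loss that quantile regression models are usually trained with is sketched below. The modified loss of the invention is the one given in the figure above; this sketch shows only the textbook form, and the function names and the dictionary layout of the predictions are assumptions.

```python
def pinball_loss(y_true, y_pred, tau):
    """Standard pinball loss for a single observation at quantile level tau."""
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(y_true, y_pred_by_tau):
    """Average pinball loss over all samples and all quantile levels.

    y_pred_by_tau maps each quantile level tau to a list of predictions,
    aligned with the observations in y_true.
    """
    total = 0.0
    count = 0
    for tau, preds in y_pred_by_tau.items():
        for y, y_hat in zip(y_true, preds):
            total += pinball_loss(y, y_hat, tau)
            count += 1
    return total / count if count else 0.0
```

At a high quantile level such as 0.9, under-prediction is penalized much more heavily than over-prediction, which is what pushes the fitted value toward the upper part of the demand distribution.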
in this way, the quantile samples of the future demand of each OD pair
Figure BDA00039892411300001816
can be obtained; the demand
Figure BDA00039892411300001817
follows the distribution represented by these quantile samples, i.e.,
Figure BDA0003989241130000191
which can be expressed as an empirical distribution as follows:
Figure BDA0003989241130000192
by sampling from the empirical distribution
Figure BDA0003989241130000193
a random predicted value of each OD pair
Figure BDA0003989241130000194
can be obtained, as follows:
Figure BDA0003989241130000195
by repeating the random sampling process, a plurality of prediction samples of each OD pair can be obtained; thus, by sampling all OD pairs, a random OD matrix of period p is obtained
Figure BDA0003989241130000196
its dimension is |S|^2, as follows:
Figure BDA0003989241130000197
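Building one random OD matrix from the per-pair quantile samples can be sketched as below. The sketch treats the quantile predictions of each OD pair as equally likely outcomes and samples each pair independently, which ignores the correlation later introduced by the copula; the function name and the dictionary-based matrix are hypothetical.

```python
import random


def sample_od_matrix(quantile_samples, stations, rng):
    """Draw one random OD matrix with |S| x |S| entries.

    quantile_samples maps (origin, destination) to a list of quantile
    predictions; pairs without predictions are given zero demand.
    """
    matrix = {}
    for i in stations:
        for j in stations:
            values = quantile_samples.get((i, j))
            matrix[(i, j)] = rng.choice(values) if values else 0
    return matrix
```

Calling the function repeatedly with the same random generator yields the multiple prediction samples of period p mentioned above.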
the marginal distributions of different OD pairs are merged into a joint demand distribution by using a copula, as follows:
Figure BDA0003989241130000198
a Gaussian copula is used, as follows:
Figure BDA0003989241130000199
the selected sample is mapped back through the inverse of the marginal CDFs, as follows:
Figure BDA00039892411300001910
finally, the joint distribution function Φ_σ is obtained; substituting
Figure BDA00039892411300001911
into Φ_σ yields the joint occurrence probability
Figure BDA00039892411300001912
S3, incorporating the future demand into the ordinary scheduling model, establishing a prediction-based look-ahead scheduling model, and designing error correction mechanisms for look-ahead scheduling and for order cancellation respectively, specifically as follows:
look-ahead scheduling simultaneously optimizes
Figure BDA00039892411300001913
and the predicted future demand of the upcoming period p + 1; as shown in FIG. 3, the future demand of the next period
Figure BDA00039892411300001914
is predicted in advance at period p and optimized together with
Figure BDA00039892411300001915
and
Figure BDA00039892411300001916
Meanwhile, the prediction error is corrected when the dynamic orders of the next period are imported; taking period 2 as an example, the dynamic orders
Figure BDA00039892411300001917
are first used to correct the demand predicted for time period [t_0 − T, t_0]
Figure BDA00039892411300001918
and the matched but unserved orders
Figure BDA00039892411300001919
are obtained; at the same time, the future demand of time period [t_0, t_0 + T]
Figure BDA00039892411300001920
is predicted; then
Figure BDA00039892411300001921
and
Figure BDA00039892411300001922
are planned together in time period [t_0, t_0 + t_B], after which the vehicles execute the plan in time period [t_0 + t_B, t_0 + T];
further, in step S3, a correction mechanism is proposed to reduce the negative effect of prediction errors;
because the vehicle executes the planned path of the look-ahead schedule made at the end of period p
Figure BDA00039892411300001923
the route cannot be adjusted within the duration T of the next period; thus, only when vehicle v arrives at boarding station i_r can it be checked whether the remaining passenger capacity
Figure BDA0003989241130000201
can accommodate the actual number of passengers;
the prediction error can therefore be classified into two types: the predicted value is larger than the actual value, i.e.,
Figure BDA0003989241130000202
or smaller than the actual value, i.e.,
Figure BDA0003989241130000203
in the former case, the remaining passenger capacity can accommodate the actual number of passengers, so it is only necessary to update the original remaining passenger capacity
Figure BDA0003989241130000204
to the correct value
Figure BDA0003989241130000205
in the latter case, the planned route becomes invalid;
to avoid wasting the travel distance and travel time already consumed, for a separable order the vehicle picks up only
Figure BDA0003989241130000206
passengers, while the surplus
Figure BDA0003989241130000207
passengers are treated as a new dynamic order that, at the end of period p + 1, is imported and planned together with
Figure BDA0003989241130000208
in contrast, for an inseparable order, if the remaining passenger capacity is insufficient, the order must be rejected;
for order cancellation, the prediction error can likewise be classified into two types:
the future demand is predicted not to be cancelled, but is actually cancelled;
the future demand is predicted to be cancelled, but is actually not cancelled;
for the former, the erroneous state variables can be updated directly; for the latter, there are three states: 'has not yet arrived at the boarding point', 'has just arrived at the boarding point', and 'has already passed the boarding point'; the prediction errors of the first two states can be corrected by direct updating, whereas for the 'has already passed the boarding point' state another vehicle needs to be dispatched to serve the order at the end of period p + 1.
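The case analysis for cancellation errors can be written as a small decision function. The enum, the function name and the returned action labels are hypothetical and serve only to make the branching explicit.

```python
from enum import Enum


class VehiclePosition(Enum):
    BEFORE_BOARDING_POINT = 1
    AT_BOARDING_POINT = 2
    PASSED_BOARDING_POINT = 3


def correct_cancellation(predicted_cancelled, actually_cancelled, position):
    """Return the corrective action for a cancellation prediction error."""
    if predicted_cancelled == actually_cancelled:
        return "no_correction"
    if not predicted_cancelled and actually_cancelled:
        # Predicted to be served but actually cancelled: update the state variables.
        return "update_state"
    # Predicted cancelled but the passenger still needs service.
    if position in (VehiclePosition.BEFORE_BOARDING_POINT, VehiclePosition.AT_BOARDING_POINT):
        return "update_state"
    # The vehicle has already passed the boarding point.
    return "dispatch_other_vehicle_at_end_of_next_period"
```

Only the last branch creates extra work for the planner, because the missed passenger has to be served by another vehicle dispatched at the end of period p + 1.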
S4, designing a decision pruning strategy that compresses the solution space to reduce the calculation time, specifically as follows:
the lower-bound pruning of the vehicle's remaining range is as follows:
the lower bound of the remaining range is uniformly expressed as shown in formula (20).
Figure BDA0003989241130000209
Lower-bound pruning of the vehicle's remaining range eliminates the decisions for which the remaining range would fall below the lower bound; because such decisions cannot lie on the optimal path, the optimal solution is guaranteed not to be pruned.
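The pruning test can be illustrated with a deliberately simple bound. The lower bound of the invention is formula (20) in the figure above; the bound used below is an assumed substitute that is valid whenever distances satisfy the triangle inequality: after reaching the next stop, the vehicle must still reach every on-board passenger's alighting point and then return to the parking lot. All names are hypothetical.

```python
def range_lower_bound(next_stop, onboard_dropoffs, depot, dist):
    """Distance the vehicle must at least cover after reaching next_stop."""
    if not onboard_dropoffs:
        return dist(next_stop, depot)
    # Any route must pass each alighting point before returning to the depot.
    return max(dist(next_stop, d) + dist(d, depot) for d in onboard_dropoffs)


def feasible_moves(current, candidates, remaining_range, onboard_dropoffs, depot, dist):
    """Keep only candidate stops whose required distance fits the remaining range."""
    keep = []
    for j in candidates:
        needed = dist(current, j) + range_lower_bound(j, onboard_dropoffs, depot, dist)
        if needed <= remaining_range:
            keep.append(j)
    return keep
```

Because the bound never overestimates the distance actually required, a pruned move could not have been completed within the remaining range, so no feasible path is discarded by this check.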
the waiting time is optimized as follows:
when the vehicle is in the parking lot, its arrival time at the boarding station can be optimized by waiting. The optimal value of the waiting time
Figure BDA00039892411300002010
is calculated as shown in equation (21):
Figure BDA00039892411300002011
by equation (22), the optimal penalty cost
Figure BDA00039892411300002012
is calculated as follows:
Figure BDA0003989241130000211
the return-to-parking-lot pruning is as follows:
when
Figure BDA0003989241130000212
holds, decision (f), returning to the parking lot, is made; when
Figure BDA0003989241130000213
holds, decision (e), going to another boarding point, is made.
S5, solving the scheduling scheme by using an approximate dynamic programming algorithm, specifically as follows:
in iteration
Figure BDA0003989241130000214
the operator uses the value function obtained after iteration
Figure BDA0003989241130000215
namely
Figure BDA0003989241130000216
to make the decision
Figure BDA0003989241130000217
Figure BDA0003989241130000218
the value estimate of the post-decision state is approximated by temporal-difference (TD) updating; after
Figure BDA0003989241130000219
iterations,
Figure BDA00039892411300002110
converges to
Figure BDA00039892411300002111
Figure BDA00039892411300002112
when λ = 0 in TD(λ), the special case TD(0) is obtained, as shown in equation (25):
Figure BDA00039892411300002113
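For reference, the textbook TD(0) update on a tabular value function is sketched below. Equation (25) of the invention is the one shown in the figure above and may differ in its details (for example in how post-decision states and step sizes are defined); the dictionary-based value table, the step size and the discount parameter are assumptions of this sketch.

```python
def td0_update(values, state, next_state, stage_cost, step_size, discount=1.0):
    """One TD(0) update: move V(state) toward stage_cost + discount * V(next_state)."""
    target = stage_cost + discount * values.get(next_state, 0.0)
    old = values.get(state, 0.0)
    values[state] = old + step_size * (target - old)
    return values[state]
```

With a step size of 0.5, an unvisited state whose successor has value 10.0 and whose stage cost is 2.0 moves halfway toward the target 12.0, giving 6.0; repeating the update over many iterations is what lets the estimates settle.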
to keep the updates of the state value function stable across the end of one period and the beginning of the next, after iteration
Figure BDA00039892411300002114
the value of state
Figure BDA00039892411300002115
is rolled to
Figure BDA00039892411300002116
as in equation (26):
Figure BDA00039892411300002117
in iteration
Figure BDA00039892411300002118
although the final state
Figure BDA00039892411300002119
is related to
Figure BDA00039892411300002120
the initial state of the next period
Figure BDA00039892411300002121
inherits the state of the current period
Figure BDA00039892411300002122
so its calculation should add the value obtained when, in state
Figure BDA00039892411300002123
the decision
Figure BDA00039892411300002124
is made:
Figure BDA00039892411300002125
similarly, the value function of
Figure BDA00039892411300002126
can be approximated with the TD(λ) and TD(0) methods, as in equation (28):
Figure BDA00039892411300002127
All steps of the cost function rolling strategy are summarized in table 2.
TABLE 2 Pseudo-code of the value function rolling strategy
Figure BDA00039892411300002128
Figure BDA0003989241130000221
at the initial stage of period p, n random OD matrices of period p + 1 given by the prediction model are introduced, and the sample decisions made under the n random demands
Figure BDA0003989241130000222
are as follows:
Figure BDA0003989241130000223
when the samples have run from state
Figure BDA0003989241130000224
to
Figure BDA0003989241130000225
n sample planned paths are formed; to evaluate the impact of random demand on each sample planned path, the value of each sample path is approximated using the TD(λ) and TD(0) methods:
Figure BDA0003989241130000226
because the random OD matrices
Figure BDA0003989241130000227
differ in their joint occurrence probabilities
Figure BDA0003989241130000228
the minimum expected value function is searched among the value functions of the n samples
Figure BDA0003989241130000229
and this minimum and its corresponding random demand are taken, respectively, as the value function under the look-ahead strategy and the optimal random demand of period p + 1:
Figure BDA00039892411300002210
Example 1:
This embodiment uses a real demand response bus project, the "Pin Bus" project in Guangzhou University City, for which the Guangzhou public transportation company has established 68 boarding stations and 100 alighting stations in the region. The spatial distribution of the bus stations and parking lots is shown in FIG. 7.
FIG. 8 to FIG. 13 compare the actual operation, the ordinary operation and the look-ahead operation of the "Pin Bus" project on six indexes. Both the ordinary operation and the look-ahead operation outperform the actual operation on the six indexes, except that on a few days the actual operation is slightly better than the ordinary operation in late rate and average late time. Averaged over 7 days, the improvement of the look-ahead operation over the actual operation is as follows: the number of vehicles is reduced by 11.43%, the effective passenger average cost is reduced by 38.35%, the effective passenger average mileage is reduced by 23.76%, the mileage utilization rate is increased by 20.05%, the response rate is increased by 13.30%, the late rate is reduced by 14.46%, and the average late time is reduced by 1.81 minutes. Over the same 7 days, the overall change of the "LSTM + Quantile + Copula" model relative to the "LSTM + Quantile" model is as follows: the effective passenger average cost (EAUC) increases by 3.73%, the effective passenger average mileage (EAPM) decreases by 11.31%, the mileage utilization rate (MUR) increases by 6.26%, the response rate (RR) increases by 14.09%, the late rate (LR) increases by 0.13%, and the average late time (ALT) increases by 0.02 min. Although the copula worsens the late rate and the average late time, and thereby the effective passenger average cost, this negative effect is almost negligible. By contrast, because the copula captures spatial correlation, the improvement in effective passenger average mileage, mileage utilization rate and response rate is large, which shows that strengthening the prediction model's use of spatial features can improve the service efficiency of look-ahead operation at a small increase in system cost.
In addition, over 7 days the overall improvement of the "Quantile + LSTM" model relative to the "LSTM" model is as follows: the effective passenger average cost is reduced by 4.19%, the effective passenger average mileage is reduced by 17.91%, the mileage utilization rate is increased by 17.41%, the response rate is increased by 7.67%, the late rate is reduced by 0.40%, and the average late time is reduced by 0.06 min. Quantile interval estimation mitigates the prediction uncertainty of sparse passenger flow and reduces the prediction bias caused by missing values and abnormal peaks. Replacing the point estimates of the "LSTM" model with interval estimates therefore effectively addresses the challenge of sparse order data.
The results show that the look-ahead scheduling method and the solution algorithm proposed by the invention perform well and can be extended to practical applications.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (10)

1. A demand response bus look-ahead scheduling method is characterized by comprising the following steps:
S1, establishing a Markov decision process model based on a rolling horizon framework;
S2, predicting future demand based on quantile regression, an LSTM model and a Copula function;
S3, incorporating the future demand into an ordinary scheduling model, establishing a prediction-based look-ahead scheduling model, and designing error correction mechanisms for look-ahead scheduling and order cancellation respectively;
S4, designing a decision pruning strategy, and compressing a decoding space to reduce the calculation time;
S5, solving the scheduling scheme by using an approximate dynamic programming algorithm.
2. The demand response bus look-ahead scheduling method according to claim 1, wherein in step S1, delayed batch matching is adopted in the rolling horizon framework to assign the orders of each period to the fleet: the operating time range is divided into a set of periods P = {p | p = 1, 2, …, |P|}, each of duration T; the ride orders of any period are imported into the system after a delay, and the imported orders are matched to vehicles when the period ends; each order is either accepted or rejected, and the passenger is notified of the result after a preset buffer time t_B following the end of the period; the orders matched in each period can be executed in the next period, and according to their execution state they are divided into served orders and matched-but-unserved orders; since the passenger has already been matched by the system and notified, a matched-but-unserved order may not be rejected and is reassigned to a vehicle in the next period, the reassigned vehicle being either the same vehicle or another vehicle;
the rolling horizon framework comprises an order stage, a planning stage, an operation stage and a prediction stage; the order stage and the prediction stage are carried out first to import orders into the system, the planning stage then plans routes for the imported orders, and finally the planned routes are executed in the operation stage; the scheduling plan of the current period depends on the execution state of the previous period; no dynamic order is imported at the beginning of period |P|, so all stages except the operation stage end at the end of period |P|-1, and in period |P| vehicle v, being only in the operation stage, executes the route planned in period |P|-1,
Figure FDA0003989241120000011
A dynamic order imported in period p+1 causes the route of vehicle v planned in period p,
Figure FDA0003989241120000012
to be executed only partially; in each subsequent period, the operator must consider not only the passengers of dynamic orders newly imported in period p, travelling from boarding station i to alighting station j,
Figure FDA0003989241120000013
but also the passengers travelling from boarding station i to alighting station j who have been matched but not served up to period p,
Figure FDA0003989241120000014
Since the planning stage always lags the order stage by one period, a buffer time t_B of negligible length is set at the end of each period to plan the dynamic orders; the planned route is executed in the operation stage immediately after the buffer time.
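The delayed batch matching described in claim 2 can be illustrated with a minimal sketch. The function names (`period_of`, `batch_orders`, `notify_time`) and the sample data are hypothetical and are not taken from the patent; the sketch only shows how orders might be grouped by period and when the matching result would be announced, assuming periods of equal duration T counted from the start of operation.

```python
from collections import defaultdict

def period_of(submit_time, start_time, period_length):
    """Return the 1-based period p whose time span contains submit_time."""
    return int((submit_time - start_time) // period_length) + 1

def batch_orders(orders, start_time, period_length):
    """Group (order_id, submit_time) pairs by period; a batch is matched when its period ends."""
    batches = defaultdict(list)
    for order_id, submit_time in orders:
        batches[period_of(submit_time, start_time, period_length)].append(order_id)
    return dict(batches)

def notify_time(p, start_time, period_length, buffer_time):
    """Time at which passengers of period p learn the result: end of period p plus t_B."""
    return start_time + p * period_length + buffer_time

# Hypothetical orders: (id, submission time in minutes), with T = 10 min and t_B = 0.5 min.
orders = [("r1", 2.0), ("r2", 9.5), ("r3", 10.0), ("r4", 17.0)]
batches = batch_orders(orders, start_time=0.0, period_length=10.0)
first_notice = notify_time(1, start_time=0.0, period_length=10.0, buffer_time=0.5)
```

Orders submitted during a period are held until the period ends, which is what makes the matching "delayed"; the route planned in the buffer time is then executed in the following period.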
3. The demand-response bus look-ahead scheduling method of claim 2, wherein the demand-response bus (DRT) scheduling problem is a variant of the pickup and delivery vehicle routing problem with time windows, which further considers multiple yards, departure time optimization, waiting time optimization, dynamic orders, passenger-vehicle matching and an order rejection mechanism; in the present invention, the DRT scheduling problem is modeled as a Markov decision process (MDP), as follows:
defining a state variable:
Figure FDA0003989241120000021
in formula (1),
Figure FDA0003989241120000022
denotes the state variable of stage k of period p;
Figure FDA0003989241120000023
denotes the boarding-station state of order r in stage k of period p; if the passenger has already boarded, then
Figure FDA0003989241120000024
otherwise,
Figure FDA0003989241120000025
Figure FDA0003989241120000026
denotes the alighting-station state of order r in stage k of period p; if the passenger has arrived, then
Figure FDA0003989241120000027
otherwise,
Figure FDA0003989241120000028
Figure FDA0003989241120000029
denotes the position of vehicle v in stage k of period p;
Figure FDA00039892411200000210
denotes the remaining driving range of vehicle v in stage k of period p;
Figure FDA00039892411200000211
denotes the remaining passenger capacity of vehicle v in stage k of period p;
Figure FDA00039892411200000212
denotes the accumulated travel time of vehicle v in stage k of period p;
Figure FDA00039892411200000213
denotes the over-cycle time of vehicle v in period p; if the travel time of the vehicle exceeds one cycle time T, then
Figure FDA00039892411200000214
otherwise,
Figure FDA00039892411200000215
Figure FDA00039892411200000216
denotes the travel route of vehicle v in period p;
the value of the total number of stages |K_p| of period p is then derived; to simplify the notation, the symbol k stands for k_p, i.e., k = k_p; the symbol
Figure FDA00039892411200000217
denotes the final stage |K_p| of each period,
Figure FDA00039892411200000218
For stage k, the system assigns order r to vehicle v, generates the external information
Figure FDA00039892411200000219
and obtains the state variable of the next stage through the stage transition equation
Figure FDA00039892411200000220
in each period, routes are planned for all imported orders; if order r is imported in period p, then
Figure FDA00039892411200000221
otherwise,
Figure FDA00039892411200000222
for example, in period 1, since no imported order has yet been served by a vehicle, the final number of stages |K_p| equals the number of stations of the static orders; from period 2 to period |P|, |K_p| is a derived variable whose value in each period equals the number of unserved stations of all imported orders, as follows:
Figure FDA00039892411200000223
since the travel time of the planned route
Figure FDA00039892411200000224
may exceed one cycle time T, the execution time may cover only part of the stations in the planned route, i.e., the vehicle cannot finish executing the route planned in period p within the next period p+1, which would corrupt the state variable information; to solve this problem, the over-cycle time of vehicle v in stage k of period p,
Figure FDA00039892411200000225
is introduced: if the travel time of the vehicle exceeds one cycle time T, then
Figure FDA00039892411200000226
otherwise,
Figure FDA00039892411200000227
Defining a decision variable:
Figure FDA00039892411200000228
in formula (3), if the vehicle v located at yard m is used in stage k of period p, then
Figure FDA00039892411200000229
otherwise,
Figure FDA00039892411200000230
if order r is assigned to vehicle v in stage k of period p, then
Figure FDA00039892411200000231
otherwise,
Figure FDA00039892411200000232
if vehicle v travels from station i to station j in stage k of period p, then
Figure FDA00039892411200000233
otherwise,
Figure FDA00039892411200000234
Figure FDA00039892411200000235
denotes the waiting time of vehicle v at station j in stage k of period p;
the construction scenario is as follows:
scenario 1:
Figure FDA0003989241120000031
this scenario indicates that vehicle v waits in the yard for a period of time
Figure FDA0003989241120000032
Since the vehicle waits in the yard, the station j to which the vehicle heads at the current stage is still the yard m_v to which vehicle v belongs;
if the vehicle reaches the boarding station before the earliest boarding time ET_r declared by order r, it has to wait at station i_r; it is therefore more reasonable to have the vehicle wait in the yard, and the waiting time is calculated as follows:
Figure FDA0003989241120000033
in formula (4), d_ij denotes the shortest distance from station i to station j;
Figure FDA00039892411200000311
denotes the vehicle running speed, and
Figure FDA0003989241120000034
denotes the accumulated travel time of vehicle v in stage k of period p;
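Formula (4) is available here only as an image, so the following is a sketch under an explicit assumption rather than the claimed formula: it assumes that the yard waiting time is the non-negative gap between the earliest boarding time ET_r and the arrival time the vehicle would have if it departed immediately (accumulated travel time plus d_ij divided by the running speed). The function name and the sample values are hypothetical.

```python
def yard_waiting_time(earliest_boarding, accumulated_time, distance, speed):
    """Assumed form of the yard waiting time: wait just long enough to arrive at ET_r.

    earliest_boarding: ET_r declared by the order
    accumulated_time:  accumulated travel time of the vehicle so far
    distance:          shortest distance d_ij from the yard to the boarding station
    speed:             vehicle running speed
    """
    arrival_without_waiting = accumulated_time + distance / speed
    # Never negative: a vehicle that would already be on time or late does not wait.
    return max(0.0, earliest_boarding - arrival_without_waiting)

# Hypothetical values: 10 km at 0.5 km/min takes 20 min; ET_r is at minute 30.
wait_needed = yard_waiting_time(30.0, 0.0, 10.0, 0.5)
no_wait = yard_waiting_time(15.0, 0.0, 10.0, 0.5)
```

Waiting in the yard rather than at the boarding station leaves the arrival time unchanged while keeping the vehicle available for re-planning until it departs.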
Scenario 2:
Figure FDA0003989241120000035
this scenario indicates that a dispatched vehicle v travels from station i to station j, where it may wait; a vehicle v heading for station j may be in one of the following states:
1) the vehicle is empty before heading to a boarding station;
2) the vehicle has passengers on board before heading to a boarding station;
3) the vehicle is heading to an alighting station;
since no time window is specified for alighting stations, the third state is not discussed; suppose that passengers are on board before the vehicle heads to a boarding station: if the vehicle arrives at the boarding station before the earliest time ET_r, it must wait for a period of time; however, waiting at a station is undesirable for the passengers on board; more importantly, in the rolling horizon planning framework of DRT scheduling, if the waiting time of vehicle v at station j in stage k of period p,
Figure FDA0003989241120000036
is long enough, the cycle time T may be exceeded, invalidating the decision of that period; therefore, when passengers are on board, boarding stations that require waiting should be avoided; for this purpose, the following restriction is introduced:
in the state variable, before a scheduling decision is made,
Figure FDA0003989241120000037
has already recorded the travel route of vehicle v in period p; if
Figure FDA0003989241120000038
contains the alighting station corresponding to every served boarding station, i.e., no passenger is on board, the vehicle can head to any boarding station, and the waiting time at station j in stage k of period p,
Figure FDA0003989241120000039
is calculated by formula (4); otherwise, passengers are on board, and the vehicle is only allowed to head to boarding stations that it reaches within the time window; if the vehicle arrives at a station early while passengers are on board, this indicates that the planned route of the vehicle is wrong;
scenario 3:
Figure FDA00039892411200000310
this scenario indicates that vehicle v returns to its yard m_v.
4. The demand response bus look-ahead scheduling method of claim 3, wherein a constraint condition is defined:
Figure FDA0003989241120000041
Figure FDA0003989241120000042
Figure FDA0003989241120000043
Figure FDA0003989241120000044
Figure FDA0003989241120000045
Figure FDA0003989241120000046
Figure FDA0003989241120000047
Figure FDA0003989241120000048
Figure FDA0003989241120000049
Figure FDA00039892411200000410
Figure FDA00039892411200000411
Figure FDA00039892411200000412
Figure FDA00039892411200000413
Figure FDA00039892411200000414
Figure FDA00039892411200000415
wherein constraint (5) limits the vehicles used from each yard to its maximum number of available vehicles; constraint (6) requires that each vehicle be used at most once per period; constraint (7) ensures that all submitted orders are imported into the system; constraint (8) ensures that all orders have been processed by the start of period |P|; constraint (9) ensures that all imported orders are delivered from the boarding station to the corresponding alighting station by the final stage
Figure FDA00039892411200000416
constraint (10) indicates that from period 1 to period |P|-1, order r can be decided at most twice; since the route decision of each stage,
Figure FDA00039892411200000417
contains only one boarding or alighting station of order r and each order can be served by only one vehicle, there are two decisions
Figure FDA00039892411200000418
associated with order r in each period; since
Figure FDA00039892411200000419
and
Figure FDA00039892411200000420
form a tuple,
Figure FDA00039892411200000421
also has at most two decisions; constraint (11) indicates that no decision is made within period |P|; constraints (12) and (13) ensure that the assigned orders do not exceed the remaining travel distance and the remaining capacity of the vehicle, respectively; constraint (14) requires that the lateness of the vehicle not exceed a threshold
Figure FDA00039892411200000422
constraint (15) bounds the over-cycle time; constraints (16)-(19) describe the domains of the decision variables: if the vehicle v located at yard m is used in stage k of period p, then
Figure FDA0003989241120000051
otherwise,
Figure FDA0003989241120000052
if vehicle v travels from station i to station j in stage k of period p, then
Figure FDA0003989241120000053
otherwise,
Figure FDA0003989241120000054
Defining a state transition equation:
(1) Phase state transition equation
Figure FDA0003989241120000055
The stage state transition equation
Figure FDA0003989241120000056
describes how each attribute of the state variable changes under different decision variables, and serves to plan the route of each period;
for the state variable of stage k of period p,
Figure FDA0003989241120000057
the transfer function of each attribute depends on the scenario of the decision variables, as follows:
Figure FDA0003989241120000058
Figure FDA0003989241120000059
Figure FDA00039892411200000510
Figure FDA00039892411200000511
Figure FDA00039892411200000512
Figure FDA00039892411200000513
in equations (21) and (22), once the vehicle is dispatched to the boarding station or the alighting station,
Figure FDA00039892411200000514
and
Figure FDA00039892411200000515
are updated, respectively; equation (23) specifies the position of vehicle v; in equation (24), for scenario 1, the remaining travel distance of vehicle v is unchanged; for scenario 2, the remaining travel distance of vehicle v is reduced by the shortest distance between station i and station j; for scenario 3, the remaining travel distance is reset to the maximum travel distance when the vehicle returns to the yard; in equation (25), for scenario 1, the remaining capacity of vehicle v is unchanged; for scenario 2, the remaining capacity of vehicle v is reduced by the number of passengers of order r; for scenario 3, the capacity is reset to the maximum passenger capacity when the vehicle returns to the yard; in equation (26), for scenario 1, the cumulative travel time of vehicle v increases by the waiting time at the yard; for scenario 2, the cumulative travel time of vehicle v increases by the travel time between station i and station j; for scenario 3, the cumulative travel time increases by the travel time, waiting time and service time;
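The verbal description of equations (23) to (26) can be made concrete with a small sketch of the vehicle part of the stage transition. Equations (21) to (26) themselves are images in this text, so the code follows only the prose above; the class and function names are hypothetical, the order status flags of equations (21) and (22) are omitted, and only boarding (not alighting) changes the capacity, as stated for scenario 2.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class VehicleState:
    position: str
    remaining_distance: float
    remaining_capacity: int
    travel_time: float

def wait_in_yard(state, wait):
    """Scenario 1: distance and capacity unchanged, time grows by the yard waiting time."""
    return replace(state, travel_time=state.travel_time + wait)

def drive_to_station(state, station, distance, speed, boarding=0):
    """Scenario 2: move along arc (i, j); distance shrinks by d_ij, capacity by boarders."""
    return replace(
        state,
        position=station,
        remaining_distance=state.remaining_distance - distance,
        remaining_capacity=state.remaining_capacity - boarding,
        travel_time=state.travel_time + distance / speed,
    )

def return_to_yard(state, yard, distance, speed, max_distance, max_capacity,
                   wait=0.0, service=0.0):
    """Scenario 3: range and capacity are reset; time grows by travel, waiting and service time."""
    return replace(
        state,
        position=yard,
        remaining_distance=max_distance,
        remaining_capacity=max_capacity,
        travel_time=state.travel_time + distance / speed + wait + service,
    )

# Hypothetical vehicle: 100 km range, 8 seats, speed 0.5 km/min.
s0 = VehicleState("yard", 100.0, 8, 0.0)
s1 = wait_in_yard(s0, 5.0)
s2 = drive_to_station(s1, "A", 10.0, 0.5, boarding=2)
s3 = return_to_yard(s2, "yard", 10.0, 0.5, max_distance=100.0, max_capacity=8)
```

Using an immutable state makes each stage transition a pure function, which matches the way a dynamic programming recursion enumerates successor states.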
(2) Periodic state transition equation
Although the stage state transition equation
Figure FDA0003989241120000061
uses the state variable of stage k of period p,
Figure FDA0003989241120000062
to plan the route of each period, the route planned for vehicle v in period p,
Figure FDA0003989241120000063
has not yet been executed, so the state variable
Figure FDA0003989241120000064
is unchanged; the travel route of vehicle v in period p+1,
Figure FDA0003989241120000065
can be determined only after the planned route of period p,
Figure FDA0003989241120000066
has been executed within the time T of period p+1; therefore, a period state transition equation
Figure FDA0003989241120000067
is required; the external information G_p for the transition from period p to period p+1 can be obtained from the planned route
Figure FDA0003989241120000068
and the cycle time T; the initial state variable
Figure FDA0003989241120000069
can then be computed iteratively through the period state transition equation
Figure FDA00039892411200000610
based on this principle, the external information G_p between periods is obtained as follows;
the actual available travel time of vehicle v in period p is
Figure FDA00039892411200000611
The external information G_p is the length of route that can be executed within the available travel time
Figure FDA00039892411200000612
specifically, it is first determined which station the vehicle has reached when the available travel time
Figure FDA00039892411200000613
is used up; that station and the stations passed on the way form the travel route
Figure FDA00039892411200000614
The state variable of the next period,
Figure FDA00039892411200000615
can then be determined by calculating the change that the travel route causes to
Figure FDA00039892411200000616
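The step of finding how much of a planned route fits into the available travel time can be sketched as follows. This is an illustration only: the function name `executed_prefix` and the data are hypothetical, waiting and service times are ignored, and a leg is counted as executed only if it can be completed within the time budget, which is an assumption about how partially driven legs are treated.

```python
def executed_prefix(route, leg_times, available_time):
    """Split a planned route into the part executed within available_time and the rest.

    route:     stations [s0, s1, ...], where s0 is the vehicle's current position
    leg_times: leg_times[i] is the travel time from route[i] to route[i + 1]
    Returns (executed stations, unexecuted stations, time used).
    """
    executed = [route[0]]
    used = 0.0
    for station, leg in zip(route[1:], leg_times):
        if used + leg > available_time:
            break  # the next station cannot be reached within this period
        used += leg
        executed.append(station)
    return executed, route[len(executed):], used

# Hypothetical planned route with leg times in minutes and 10 min of available time.
done, pending, used = executed_prefix(["yard", "A", "B", "C"], [4.0, 5.0, 6.0], 10.0)
```

The executed part plays the role of the travel route carried into the next period's state, and the pending stations are the ones that must be re-planned together with newly imported orders.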
5. The demand response bus look-ahead scheduling method of claim 4, wherein an objective function is defined:
the cost of DRT scheduling includes fixed transportation cost, variable transportation cost and time penalty cost; the fixed transportation cost is related to the number of vehicles used, and the calculation formula is as follows:
Figure FDA00039892411200000617
in the above expression, β_f denotes the fixed transportation cost of each vehicle; the variable transportation cost depends on the distance traveled; since a new path segment is generated at each stage k, the variable transportation cost can be calculated as follows:
Figure FDA00039892411200000618
the time window violation penalty cost covers early and late arrivals; in general, late arrival is more serious than early arrival, so the late-arrival penalty coefficient β_l and the early-arrival penalty coefficient β_e satisfy β_l > β_e; if the arrival time of the vehicle is within the specified time window, the penalty cost is 0; otherwise, the penalty cost depends on the length of the violation; the early and late arrival times of order r are calculated as
Figure FDA00039892411200000619
and
Figure FDA00039892411200000620
and the time window violation penalty cost of each stage is calculated as follows:
Figure FDA00039892411200000621
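As an illustration of the time window penalty described above, the following sketch assumes a penalty that is linear in the length of the violation, with coefficient β_e for early arrival and β_l for late arrival. The exact formula is an image in this text, so the linear form, the function name and the sample coefficients are assumptions.

```python
def time_window_penalty(arrival, earliest, latest, beta_early, beta_late):
    """Assumed linear penalty: zero inside [earliest, latest], proportional to the violation outside."""
    early_by = max(0.0, earliest - arrival)  # how long before the window the vehicle arrives
    late_by = max(0.0, arrival - latest)     # how long after the window the vehicle arrives
    return beta_early * early_by + beta_late * late_by

# Hypothetical window [10, 20] min with beta_e = 1 and beta_l = 2 (so that beta_l > beta_e).
penalty_early = time_window_penalty(8.0, 10.0, 20.0, 1.0, 2.0)
penalty_inside = time_window_penalty(15.0, 10.0, 20.0, 1.0, 2.0)
penalty_late = time_window_penalty(23.0, 10.0, 20.0, 1.0, 2.0)
```

With β_l > β_e, a late arrival of a given length always costs more than an early arrival of the same length, which reflects the statement that lateness is the more serious violation.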
the cost function of the DRT model is related to the cost newly generated at each stage k; the cost of stage k of period p equals the cost of the previous stage
Figure FDA00039892411200000622
plus the variable transportation cost incurred at that stage
Figure FDA00039892411200000623
and the penalty cost
Figure FDA00039892411200000624
as follows:
Figure FDA0003989241120000071
since the fixed transportation cost is calculated at the final stage of the period,
Figure FDA0003989241120000072
the fixed transportation cost CF is added to the cost of the final stage, as follows:
Figure FDA0003989241120000073
to represent the DRT problem as an SADP model, the following optimal state cost function is proposed based on the Bellman equation:
Figure FDA0003989241120000074
wherein
Figure FDA0003989241120000075
is the optimal state cost function of executing the path policy
Figure FDA0003989241120000076
at stage k of period p;
under look-ahead scheduling, the predicted random order matrix
Figure FDA0003989241120000077
together with
Figure FDA0003989241120000078
needs to be included in the objective function; however, only the future demand of period p+1 can be predicted, not that of all subsequent periods (from period p+1 to period |P|); to ensure the iterative stability of the SADP algorithm,
Figure FDA0003989241120000079
is used to denote execution of the path policy
Figure FDA00039892411200000710
at stage k of period p, and the objective function is modified as follows:
Figure FDA00039892411200000711
Passenger-vehicle assignment is crucial to route planning;
however, due to supply limitations, such as fleet size and vehicle remaining passenger capacity, imported orders cannot always be fully matched; an order rejection mechanism is proposed, the rules of which are described as follows:
when all imported orders can be matched within period p, the decision variables
Figure FDA00039892411200000712
can be iterated through the stage state transition equation
Figure FDA00039892411200000713
to the final state variable
Figure FDA00039892411200000714
namely
Figure FDA00039892411200000715
however, once the orders cannot be completely matched, so that
Figure FDA00039892411200000716
the planning process is terminated; the boarding station i_r and the alighting station j_r of each order r that cannot be matched are then regarded as potential rejection stations, and the solution is repeated after removing them one by one until the remaining orders can be matched.
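The remove-and-retry logic of the order rejection mechanism can be sketched as below. The patent does not give the procedure in code, so this is only one possible reading: `can_match_all` is a hypothetical feasibility check standing in for a full planning run, every currently accepted order is treated as a rejection candidate, and the fallback used when no single removal suffices is an assumption.

```python
def plan_with_rejection(orders, can_match_all):
    """Reject orders one by one until the remaining orders can all be matched.

    orders:        list of order ids imported in the period
    can_match_all: callable taking a list of order ids and returning True if a
                   complete plan exists for them (hypothetical stand-in for planning)
    Returns (accepted orders, rejected orders).
    """
    accepted = list(orders)
    rejected = []
    while accepted and not can_match_all(accepted):
        for candidate in accepted:
            trial = [o for o in accepted if o != candidate]
            if can_match_all(trial):
                accepted = trial
                rejected.append(candidate)
                break
        else:
            # No single removal restores feasibility: drop the last order and retry.
            rejected.append(accepted.pop())
    return accepted, rejected

# Toy feasibility check: total passengers must fit a hypothetical capacity of 6 seats.
sizes = {"r1": 3, "r2": 4, "r3": 2}
accepted, rejected = plan_with_rejection(
    ["r1", "r2", "r3"], lambda ids: sum(sizes[i] for i in ids) <= 6
)
```

In a real planner the feasibility check would be the stage-by-stage iteration to the final state, so each retry is expensive; the sketch only shows the control flow.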
6. The demand response bus look-ahead scheduling method according to claim 1, wherein in step S2, the future demand is predicted based on quantile regression, an LSTM model, and a Copula function, specifically as follows:
predicting the future requirement of each OD pair through a conditional quantile equation under different quantile levels, thereby constructing empirical distribution to realize interval estimation;
in general, a regression problem can be expressed as an optimization model, in which a functional relationship
Figure FDA00039892411200000717
is sought that minimizes the difference between the expected demand and the real demand
Figure FDA00039892411200000718
namely:
Figure FDA0003989241120000081
wherein
Figure FDA0003989241120000082
is the conditional expectation equation of the dependent variable
Figure FDA0003989241120000083
however, equation (34) applies only when the residuals of the sample points on both sides are weighted equally, i.e., median (τ = 0.5) regression; for regression at quantile τ, depending on the magnitude relationship between
Figure FDA0003989241120000084
and
Figure FDA0003989241120000085
the residuals must be weighted by τ or 1-τ; therefore, the optimization model of quantile regression is modified to equation (35), as follows:
Figure FDA0003989241120000086
where ζ denotes the equation coefficients when the quantile level equals τ; once the optimal ζ is obtained by linear programming, the conditional τ-quantile equation
Figure FDA0003989241120000087
is obtained; equation (35) can be solved separately for each quantile τ, which yields
Figure FDA0003989241120000088
at the different quantile levels;
to predict the future demand
Figure FDA0003989241120000089
the historical data of period 1 to period p,
Figure FDA00039892411200000810
are used as the training data set; since future demand must be predicted at different quantile levels, let
Figure FDA00039892411200000811
denote the OD demand at quantile level τ_l, with the set of quantile levels τ_l ∈ {5%, 25%, 50%, 75%, 95%}; therefore, the loss function of the LSTM model is replaced by the optimization objective of quantile regression in equation (36), as follows:
Figure FDA00039892411200000812
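Equation (36) is an image in this text. The standard loss used in quantile regression is the pinball (check) loss, which weights positive residuals by τ and negative residuals by 1-τ; the sketch below shows that loss and a sum over the five quantile levels. Whether equation (36) sums, averages or otherwise combines the levels is not visible here, so the combination in `multi_quantile_loss` is an assumption, and both function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss at quantile level tau over paired observations and predictions."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff >= 0) is weighted by tau, over-prediction by 1 - tau.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

def multi_quantile_loss(y_true, preds_by_tau):
    """Assumed combination: sum of the pinball losses over all quantile levels."""
    return sum(pinball_loss(y_true, preds, tau) for tau, preds in preds_by_tau.items())

# Hypothetical demand observations and per-level predictions for one OD pair.
observed = [2.0, 4.0]
median_loss = pinball_loss(observed, [3.0, 3.0], 0.5)
under_loss = pinball_loss([4.0], [3.0], 0.9)
over_loss = pinball_loss([2.0], [3.0], 0.9)
combined = multi_quantile_loss(observed, {0.05: [1.0, 2.0], 0.5: [3.0, 3.0], 0.95: [4.0, 5.0]})
```

At τ = 0.5 the loss is half the mean absolute error, which is why median regression treats residuals on both sides equally, while at τ = 0.9 an under-prediction costs nine times as much as an over-prediction of the same size.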
in this way, the quantile samples of the future demand of each OD pair
Figure FDA00039892411200000813
are obtained; the demand
Figure FDA00039892411200000814
follows the distribution of the quantile samples, i.e.,
Figure FDA00039892411200000815
which can be expressed as an empirical distribution as follows:
Figure FDA00039892411200000816
thus, by sampling from the empirical distribution
Figure FDA00039892411200000817
a random predicted value
Figure FDA00039892411200000818
of each OD pair can be obtained, as follows:
Figure FDA00039892411200000819
by repeating the random sampling process, multiple prediction samples of each OD pair can be obtained; thus, by sampling all OD pairs, the random OD matrix of period p
Figure FDA0003989241120000091
is obtained, whose dimension is |S|^2, as follows:
Figure FDA0003989241120000092
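The sampling step can be illustrated with a small sketch. Equations (37) to (39) are images here, so the construction below is an assumption: it treats the five predicted quantiles of an OD pair as points of an empirical CDF, interpolates linearly between them, and clamps draws that fall outside the 5% to 95% range. The names `sample_from_quantiles` and `sample_od_matrix` and the sample values are hypothetical.

```python
import random

LEVELS = [0.05, 0.25, 0.50, 0.75, 0.95]

def sample_from_quantiles(levels, values, rng):
    """Inverse-CDF sampling with linear interpolation between predicted quantiles."""
    u = rng.random()
    if u <= levels[0]:
        return values[0]   # clamp the lower tail to the lowest predicted quantile
    if u >= levels[-1]:
        return values[-1]  # clamp the upper tail to the highest predicted quantile
    for k in range(len(levels) - 1):
        if levels[k] <= u <= levels[k + 1]:
            share = (u - levels[k]) / (levels[k + 1] - levels[k])
            return values[k] + share * (values[k + 1] - values[k])

def sample_od_matrix(quantiles_by_od, levels, n_stations, rng):
    """Draw one random demand per OD pair and place it in an n_stations x n_stations matrix."""
    matrix = [[0.0] * n_stations for _ in range(n_stations)]
    for (origin, destination), values in quantiles_by_od.items():
        matrix[origin][destination] = sample_from_quantiles(levels, values, rng)
    return matrix

# Hypothetical quantile predictions for a single OD pair (station 0 to station 1).
rng = random.Random(0)
od_matrix = sample_od_matrix({(0, 1): [0.0, 1.0, 2.0, 4.0, 6.0]}, LEVELS, 2, rng)
```

Calling `sample_od_matrix` repeatedly yields the multiple prediction samples mentioned above; in practice the draws would probably be rounded to whole passengers, which this sketch leaves out.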
the marginal distributions of different OD pairs may be correlated, and a joint distribution can capture this correlation;
therefore, the marginal distributions are merged into a joint demand distribution by using a copula, as follows:
let
Figure FDA0003989241120000093
denote the cumulative distribution functions (CDFs) of the demand
Figure FDA0003989241120000094
of each OD pair; based on the multivariate CDF, the random vector
Figure FDA0003989241120000095
consisting of the respective variables can be expressed as follows:
Figure FDA0003989241120000096
for each OD pair, sampling from the empirical distribution
Figure FDA0003989241120000097
yields a random demand
Figure FDA0003989241120000098
which gives the marginal distribution of each OD pair
Figure FDA0003989241120000099
i.e.,
Figure FDA00039892411200000910
according to Sklar's theorem, the joint distribution of the random vector
Figure FDA00039892411200000911
can be defined by the marginal distributions
Figure FDA00039892411200000912
of the respective variables and a copula function, as follows:
Figure FDA00039892411200000913
A Gaussian copula is used, as follows:
Figure FDA00039892411200000914
wherein C_G denotes the Gaussian copula and Φ_σ denotes the joint CDF of a multivariate normal distribution with covariance matrix σ and mean 0; note that C_G has no closed-form expression and must be approximated by numerical integration;
using C_G, future demand samples can be drawn from the joint distribution
Figure FDA00039892411200000915
to do so,
Figure FDA00039892411200000916
is first sampled from C_G, the sample is then converted to
Figure FDA00039892411200000917
and finally mapped to the demand sample through the inverse of the marginal CDFs, as follows:
Figure FDA00039892411200000918
finally, with the joint distribution function Φ_σ obtained, substituting
Figure FDA00039892411200000919
into Φ_σ gives the joint occurrence probability
Figure FDA00039892411200000920
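The three sampling steps (draw correlated normals, convert them to uniforms, map them through the inverse marginal CDFs) can be sketched for two OD pairs. This is a minimal illustration, not the patented procedure: it handles only the bivariate case with a single correlation coefficient `rho`, uses the raw prediction samples of each OD pair as its empirical marginal, and all function names and data are hypothetical.

```python
import math
import random
from statistics import NormalDist

def gaussian_copula_pair(rho, rng):
    """Draw (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    # Cholesky factor of a 2x2 correlation matrix gives the correlated second normal.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    standard_normal = NormalDist()
    return standard_normal.cdf(z1), standard_normal.cdf(z2)

def inverse_empirical_cdf(sorted_samples, u):
    """Smallest sample whose empirical CDF value is at least u."""
    n = len(sorted_samples)
    index = min(n - 1, max(0, math.ceil(u * n) - 1))
    return sorted_samples[index]

def sample_joint_demand(samples_a, samples_b, rho, rng):
    """Draw one correlated demand pair for two OD pairs from their empirical marginals."""
    u1, u2 = gaussian_copula_pair(rho, rng)
    return (inverse_empirical_cdf(sorted(samples_a), u1),
            inverse_empirical_cdf(sorted(samples_b), u2))

# Hypothetical prediction samples for two OD pairs.
rng = random.Random(1)
demand_a = [0, 1, 1, 2, 3]
demand_b = [0, 1, 1, 2, 3]
pair = sample_joint_demand(demand_a, demand_b, rho=0.8, rng=rng)
comonotone = sample_joint_demand(demand_a, demand_b, rho=1.0, rng=random.Random(2))
```

The copula carries only the dependence structure, so each OD pair keeps its own marginal distribution while the draws move together according to `rho`; for more than two OD pairs the same idea applies with a full covariance matrix and its Cholesky factor.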
7. The demand response bus look-ahead scheduling method according to claim 1, wherein in step S3, future demands are brought into a common scheduling model, a look-ahead scheduling model based on prediction is established, and error correction mechanisms are respectively designed for look-ahead scheduling and order cancellation, specifically as follows:
unlike ordinary scheduling, look-ahead scheduling takes into account not only
Figure FDA00039892411200000921
and
Figure FDA00039892411200000922
but also the predicted future demand of the upcoming period p+1; in this scheduling mode, the future demand of the next period
Figure FDA00039892411200000923
is predicted in advance in period p and optimized together with
Figure FDA00039892411200000924
and
Figure FDA00039892411200000925
therefore, the training for predicting future demand should be carried out periodically on a rolling basis; for example, in period p, in order to predict the future demand of period p+1, the training data set should include the demand of the previous periods and the current period, i.e.,
Figure FDA0003989241120000101
When the dynamic orders of the next period are imported, the prediction error is corrected; in period 2, first, according to the dynamic orders
Figure FDA0003989241120000102
the value of
Figure FDA0003989241120000103
for the time period [t_0 - T, t_0] is corrected, and the matched-but-unserved orders
Figure FDA0003989241120000104
are obtained; at the same time, the future demand of the time period [t_0, t_0 + T],
Figure FDA0003989241120000105
is predicted; then
Figure FDA0003989241120000106
and
Figure FDA0003989241120000107
are planned together in the time period [t_0, t_0 + t_B], and the vehicles then execute the plan in the time period [t_0 + t_B, t_0 + T];
although the predicted future demand
Figure FDA0003989241120000108
is treated as deterministic when planning at the end of period p, the quantile regression and LSTM model predictions may contain errors that greatly affect the plan; assume that the remaining passenger capacity of vehicle v is
Figure FDA0003989241120000109
and that the predicted number of passengers from boarding station i_r to alighting station j_r is c_r = 1; because of the prediction error, the real number is known only when the real orders
Figure FDA00039892411200001010
are imported at the end of period p+1; if the real number of passengers exceeds the remaining passenger capacity of the vehicle, for example c_r = 2, the planned route
Figure FDA00039892411200001011
becomes invalid, which leads to wasted remaining travel distance and to deterioration of subsequent decisions.
8. The demand response bus look-ahead scheduling method according to claim 7, wherein in step S3, a correction mechanism is proposed to reduce the negative effect of the prediction error;
since the vehicle executes the path planned by look-ahead scheduling at the end of period p,
Figure FDA00039892411200001012
the route cannot be adjusted within the time T of the next period; thus, only when vehicle v arrives at boarding station i_r can it be checked whether the remaining passenger capacity
Figure FDA00039892411200001013
can accommodate the real number of passengers;
thus, prediction errors fall into two types: the predicted value is larger than the actual value, i.e.,
Figure FDA00039892411200001014
or smaller than the actual value, i.e.,
Figure FDA00039892411200001015
in the former case, the remaining passenger capacity can accommodate the passengers, so only the original remaining passenger capacity
Figure FDA00039892411200001016
needs to be updated to the correct value
Figure FDA00039892411200001017
in the latter case, the planned route is invalid;
to avoid wasting the travel distance and travel time already consumed, for orders that can be split only
Figure FDA00039892411200001018
passengers are accepted, while the surplus
Figure FDA00039892411200001019
passengers form a new dynamic order that is imported at the end of period p+1 and planned together with
Figure FDA00039892411200001020
in contrast, for an order that cannot be split, if the remaining passenger capacity is insufficient, the order must be rejected;
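The correction rule for prediction errors can be summarized in a short sketch. The function name and return convention are hypothetical, and `seats_free` is assumed to be the number of free seats when the vehicle reaches the boarding station, before the order boards; the patent's own bookkeeping of the remaining capacity is given only as images.

```python
def correct_boarding(seats_free, actual_passengers, splittable):
    """Decide what happens when the real passenger count is revealed at the boarding station.

    Returns (boarded, carried_over, rejected):
      boarded      passengers picked up now
      carried_over passengers turned into a new dynamic order for the next planning round
      rejected     passengers of an order that cannot be split and does not fit
    """
    if actual_passengers <= seats_free:
        return actual_passengers, 0, 0          # everything fits; only the capacity value changes
    if splittable:
        return seats_free, actual_passengers - seats_free, 0
    return 0, 0, actual_passengers              # order cannot be split and must be rejected

# The example from the text: one seat left, one passenger predicted, two passengers in reality.
split_case = correct_boarding(1, 2, splittable=True)
whole_case = correct_boarding(1, 2, splittable=False)
fits_case = correct_boarding(3, 2, splittable=False)
```

Splitting keeps the distance and time already spent on reaching the station useful, at the price of creating a follow-up order for the passengers left behind.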
in practice, passengers may cancel orders at short notice because they are dissatisfied with changes in price, vehicle arrival time or vehicle route; unlike order rejection, which the operator actively carries out in time when planning the route
Figure FDA00039892411200001021
the cancellation of the reserved dynamic orders
Figure FDA00039892411200001022
is initiated by the passenger at short notice; before being cancelled, the order has already been planned into a vehicle route by the system; to ensure the stability of the vehicle routes, a cancelled order should be deleted from the planned route in real time;
if orders are cancelled before the system plans a route, they can simply be removed from the imported dynamic orders; under look-ahead scheduling, once the planned route
Figure FDA00039892411200001023
has been planned, the vehicle may or may not yet have arrived at the station of an order cancelled by the passenger; similar to the error correction mechanism for predicted orders, the invention updates the incorrect state variables; under look-ahead scheduling,
Figure FDA0003989241120000111
and
Figure FDA0003989241120000112
are both imported into the system at the end of period p, and passengers cancel the orders
Figure FDA0003989241120000113
and
Figure FDA0003989241120000114
in period p+1 and period p+2, respectively; thus, there are two cases under look-ahead scheduling, namely:
a) the predicted future demand is not cancelled, but the actual order is cancelled;
b) the predicted future demand is cancelled, but the actual order is not cancelled;
for case a), the predicted future demand is not cancelled, which means that the passenger
Figure FDA0003989241120000119
has already been planned into the route by the system; the vehicle may have just arrived or may not yet have arrived at the boarding station; thus, the wrong state variables can be updated directly;
for case b), the predicted future demand is cancelled, i.e., the order is absent from the planned route; however, since the actual order is not cancelled, the vehicle is in one of three states: 'not yet arrived at the boarding point', 'just arrived at the boarding point', or 'already past the boarding point'; the prediction errors of the first two states can be corrected directly as in case a); for the 'already past the boarding point' state, another vehicle must be dispatched at the end of period p+1 to serve the order.
9. The demand response bus look-ahead scheduling method according to claim 1, wherein in step S4, pruning strategies for the decisions are designed and the decoding space is compressed to reduce the calculation time, specifically comprising:
the lower bound pruning of the vehicle residual distance is as follows:
since vehicle v must deliver all on-board passengers to their corresponding drop-off points, before it proceeds along arc (i, j) ∈ A to a new pick-up point j = i_r it is worthwhile to check whether the vehicle's remaining cruising distance can meet the mileage requirement of the subsequent path; the following theorems are given:
Theorem 1: when vehicle v goes to pick-up point j = i_r at stage k of period p, its remaining cruising distance always has a lower bound
Figure FDA0003989241120000115
Theorem 2: compared with the standard dynamic programming algorithm, the state space saved over all stages of period p after the pruning operation is adopted is
Figure FDA0003989241120000116
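The idea behind the lower-bound pruning can be illustrated with a minimal feasibility check. The exact bound of Theorem 1 is given only in the formula image, so this sketch assumes that a lower bound on the distance still required after reaching the pick-up point is supplied by the caller; the function and parameter names are hypothetical.

```python
def can_visit_new_pickup(remaining_range: float,
                         dist_to_pickup: float,
                         min_subsequent_dist: float) -> bool:
    """Keep arc (i, j) only if the range left after driving to pick-up j
    still covers an assumed lower bound on the subsequent distance."""
    return remaining_range - dist_to_pickup >= min_subsequent_dist


def feasible_pickups(remaining_range: float,
                     dist_from_current: dict,
                     min_subsequent: dict) -> list:
    """Return the candidate pick-up points that survive the pruning."""
    return [j for j, d in dist_from_current.items()
            if can_visit_new_pickup(remaining_range, d, min_subsequent[j])]
```

Candidates that fail the check are never expanded, which is how the pruning reduces the number of states explored in each stage.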
The waiting time is optimized specifically as follows:
the vehicle may wait while it is in the parking lot so that its departure time is optimized, which avoids a large time-penalty cost when it travels to its first boarding point; because the boarding point i_r to be served after waiting has already been decided, while the path after that boarding point cannot yet be observed, the optimization targets the vehicle's arrival time at that boarding point; the optimal value of the waiting time
Figure FDA0003989241120000117
is calculated as shown in equation (45):
Figure FDA0003989241120000118
the optimal penalty cost
Figure FDA0003989241120000121
is calculated by equation (46), specifically:
Figure FDA0003989241120000122
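A minimal sketch of the waiting-time idea follows. Equations (45) and (46) are available only as images, so this sketch assumes a soft time window [earliest, latest] at the first boarding point with linear early and late penalties and free waiting in the parking lot; all names and the penalty form are assumptions.

```python
def optimal_wait(now: float, travel_time: float, earliest: float) -> float:
    """Wait in the parking lot just long enough to avoid arriving early."""
    return max(0.0, earliest - (now + travel_time))


def time_penalty(arrival: float, earliest: float, latest: float,
                 early_rate: float, late_rate: float) -> float:
    """Assumed linear penalty for arriving outside the time window."""
    early = max(0.0, earliest - arrival)
    late = max(0.0, arrival - latest)
    return early_rate * early + late_rate * late
```

Under these assumptions, a vehicle that would arrive 15 minutes early waits 15 minutes and incurs no penalty, while a vehicle that is already late departs immediately and pays only the lateness penalty.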
the return-to-parking-lot pruning is as follows:
when a vehicle is visiting a drop-off point and there are no remaining passengers on board, the vehicle faces two decisions:
(e) go to another boarding point;
(f) return to its home parking lot m_v;
because the lower-bound pruning of the vehicle's remaining distance already eliminates some infeasible vehicle paths, path (e) in this subsection presupposes that the remaining distance is sufficient, and the vehicle decision is pruned and optimized from the cost perspective;
without considering the time-window penalty, the cost of path (f) must be higher than that of path (e), even if the fixed usage cost of a vehicle is ignored; in actual operation, however, the time penalty must be considered: if the vehicle would make the passenger at the boarding point wait too long, it is better to dispatch a new vehicle from the parking lot to serve that passenger, so that the time cost is reduced and the total cost is lower than that of the former; based on this principle, the pruning strategy is as follows: when
Figure FDA0003989241120000123
holds, path (f) is chosen; when
Figure FDA0003989241120000124
holds, path (e) is chosen.
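The cost comparison behind this pruning rule can be sketched as follows. The actual threshold conditions are contained in the formula images, so this sketch assumes a simple cost model (distance cost, a fixed cost for the newly dispatched vehicle, and a linear lateness penalty) and assumes the new vehicle leaves the parking lot immediately; every name and parameter is hypothetical.

```python
def late_penalty(arrival: float, latest: float, rate: float) -> float:
    """Assumed linear penalty for arriving after the latest service time."""
    return rate * max(0.0, arrival - latest)


def prune_return_decision(now: float, d_ij: float, d_i_depot: float,
                          d_depot_j: float, speed: float, latest: float,
                          unit_cost: float, fixed_cost: float,
                          late_rate: float) -> str:
    """Return "e" (go on to boarding point j) or "f" (return to the
    parking lot and let a new vehicle serve j) under the assumed costs."""
    cost_e = unit_cost * d_ij + late_penalty(now + d_ij / speed, latest, late_rate)
    cost_f = (unit_cost * (d_i_depot + d_depot_j) + fixed_cost
              + late_penalty(now + d_depot_j / speed, latest, late_rate))
    return "f" if cost_f < cost_e else "e"
```

With a distant boarding point and a tight time window, the lateness penalty dominates and the sketch selects path (f); with a nearby boarding point it selects path (e).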
10. The demand response bus look-ahead scheduling method according to any one of claims 1 to 9, wherein in step S5, the scheduling scheme is solved by using an approximate dynamic programming algorithm, specifically as follows:
in general, the value function approximation (VFA) strategy of period p starts from an initial state and iterates along the simulated sample planned paths
Figure FDA0003989241120000125
times; in iteration
Figure FDA0003989241120000126
the unbiased sample estimate of the value of state
Figure FDA0003989241120000127
is calculated as follows:
Figure FDA0003989241120000128
in iteration
Figure FDA0003989241120000129
the value function obtained after iteration
Figure FDA00039892411200001210
namely
Figure FDA00039892411200001211
is used to make the decision
Figure FDA00039892411200001212
Figure FDA00039892411200001213
the estimated value of the post-decision state is approximated with the temporal-difference (TD) update method; over
Figure FDA00039892411200001214
iterations,
Figure FDA00039892411200001215
converges to
Figure FDA00039892411200001216
Figure FDA00039892411200001217
When λ = 0 in TD(λ), the special case shown in equation (50) is obtained:
Figure FDA00039892411200001218
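For orientation, the generic textbook form of a tabular TD(λ) update along one simulated path is sketched below. It is not a reconstruction of equations (49) and (50), which are available only as images; the step size `alpha`, the discount `gamma`, the use of accumulating eligibility traces, and all names are assumptions.

```python
def td_lambda_update(values: dict, path: list, costs: list,
                     alpha: float, lam: float, gamma: float = 1.0) -> dict:
    """One backward-view TD(lambda) pass over a simulated sample path.

    path  : visited states s_0 .. s_K
    costs : costs[k] is the cost incurred when moving from s_k to s_{k+1}
    values: current value estimates (missing states count as 0.0)
    """
    v = dict(values)
    trace = {}
    for k, cost in enumerate(costs):
        s, s_next = path[k], path[k + 1]
        # Temporal-difference error of the observed transition.
        delta = cost + gamma * v.get(s_next, 0.0) - v.get(s, 0.0)
        # Decay all eligibility traces, then mark the current state.
        for key in trace:
            trace[key] *= gamma * lam
        trace[s] = trace.get(s, 0.0) + 1.0
        # Spread the error over all previously visited states.
        for key, e in trace.items():
            v[key] = v.get(key, 0.0) + alpha * delta * e
    return v
```

With `lam=0.0` only the state just left is updated, which is the TD(0) special case; with `lam=1.0` earlier states on the path also receive the full later errors.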
because the unbiased sample estimate of the end stage
Figure FDA0003989241120000131
involves the value function of the initial stage of the next period
Figure FDA0003989241120000132
whereas the state
Figure FDA0003989241120000133
is often not the post-decision state of
Figure FDA0003989241120000134
direct iteration combined with incomplete execution of the planned path causes
Figure FDA0003989241120000135
to deviate substantially, so a value function rolling strategy needs to be adopted;
suppose that in iteration
Figure FDA0003989241120000136
the planned path of period p
Figure FDA0003989241120000137
is actually executed by the vehicle up to
Figure FDA0003989241120000138
then the initial state of period p+1 is essentially equal to
Figure FDA0003989241120000139
i.e.,
Figure FDA00039892411200001310
To keep the state value function updates stable across the end and the beginning of adjacent periods, after iteration
Figure FDA00039892411200001311
the value of state
Figure FDA00039892411200001312
is rolled to
Figure FDA00039892411200001313
as in equation (51):
Figure FDA00039892411200001314
in iteration
Figure FDA00039892411200001315
the end state
Figure FDA00039892411200001316
is related to
Figure FDA00039892411200001317
but the initial state of the next period
Figure FDA00039892411200001318
inherits the state of the current period
Figure FDA00039892411200001319
therefore, the value of
Figure FDA00039892411200001320
under the decision
Figure FDA00039892411200001321
should be calculated:
Figure FDA00039892411200001322
similarly, the value function approximation of
Figure FDA00039892411200001323
can use the TD(λ) and TD(0) methods in equation (53):
Figure FDA00039892411200001324
at the initial stage of period p, n random OD matrices for period p+1 given by a prediction model are imported, and sample decisions under the n random demands
Figure FDA00039892411200001325
are then made, specifically:
Figure FDA00039892411200001326
when the samples run from state
Figure FDA00039892411200001327
to
Figure FDA00039892411200001328
n sample planned paths are formed; to evaluate the impact of random demand on each sample planned path, the value of each sample path is approximated using the TD(λ) and TD(0) methods:
Figure FDA00039892411200001329
because the random OD matrices
Figure FDA00039892411200001330
and their probabilities of occurrence
Figure FDA00039892411200001331
differ, the minimum expected value function is sought among the value functions
Figure FDA00039892411200001332
and that function and its corresponding random demand are taken, respectively, as the value function under the look-ahead strategy and the optimal random demand of period p+1:
Figure FDA00039892411200001333
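The selection over the n sampled demand scenarios might look like the following sketch. The exact expression is contained in the formula image, so this sketch assumes that the expected value of a scenario is its approximated path value weighted by its probability of occurrence; that weighting and the function name are assumptions.

```python
def select_lookahead_scenario(values: list, probabilities: list) -> tuple:
    """Return (index, expected_value) of the scenario whose
    probability-weighted value (an assumed criterion) is smallest."""
    if not values or len(values) != len(probabilities):
        raise ValueError("values and probabilities must be non-empty and equally long")
    expected = [p * v for p, v in zip(probabilities, values)]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected[best]
```

The returned index identifies the random OD matrix that would be treated as the optimal random demand of period p+1, and the returned value plays the role of the value function under the look-ahead strategy.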
CN202211574144.9A 2022-12-08 2022-12-08 Demand response bus look-ahead scheduling method Pending CN115879620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211574144.9A CN115879620A (en) 2022-12-08 2022-12-08 Demand response bus look-ahead scheduling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211574144.9A CN115879620A (en) 2022-12-08 2022-12-08 Demand response bus look-ahead scheduling method

Publications (1)

Publication Number Publication Date
CN115879620A (en) 2023-03-31

Family

ID=85766596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211574144.9A Pending CN115879620A (en) 2022-12-08 2022-12-08 Demand response bus look-ahead scheduling method

Country Status (1)

Country Link
CN (1) CN115879620A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116432887A (en) * 2023-06-15 2023-07-14 华侨大学 Dynamic demand response type bus route optimization method, equipment and medium
CN117789955A (en) * 2024-02-28 2024-03-29 济南大学 Medical service distribution and path planning method, system, equipment and medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116432887A (en) * 2023-06-15 2023-07-14 华侨大学 Dynamic demand response type bus route optimization method, equipment and medium
CN116432887B (en) * 2023-06-15 2023-09-05 华侨大学 Dynamic demand response type bus route optimization method, equipment and medium
CN117789955A (en) * 2024-02-28 2024-03-29 济南大学 Medical service distribution and path planning method, system, equipment and medium
CN117789955B (en) * 2024-02-28 2024-05-03 济南大学 Medical service distribution and path planning method, system, equipment and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination