CN109269516A - Dynamic route guidance method based on multi-objective Sarsa learning - Google Patents

Dynamic route guidance method based on multi-objective Sarsa learning

Info

Publication number
CN109269516A
Authority
CN
China
Prior art keywords
traffic
preference
driver
target
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810992284.5A
Other languages
Chinese (zh)
Other versions
CN109269516B (en)
Inventor
文峰
封筱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Ligong University
Original Assignee
Shenyang Ligong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Ligong University
Priority to CN201810992284.5A
Publication of CN109269516A
Application granted
Publication of CN109269516B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3453 Special cost functions, i.e. other than distance or default speed limit of road segments
    • G01C21/3492 Special cost functions, i.e. other than distance or default speed limit of road segments employing speed data or traffic data, e.g. real-time or historical
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/34 Route searching; Route guidance
    • G01C21/3407 Route searching; Route guidance specially adapted for specific applications
    • G01C21/3415 Dynamic re-routing, e.g. recalculating the route when the user deviates from calculated route or after detecting real-time traffic data or accidents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q10/047 Optimisation of routes or paths, e.g. travelling salesman problem
    • G06Q50/40

Abstract

The present invention proposes a dynamic route guidance method based on multi-objective Sarsa learning. The process comprises information initialization; information update; and guidance path computation, which includes normalizing the Q vector table, calculating a scalar value based on the driver's preference, calculating the Boltzmann probability distribution, and selecting by roulette-wheel selection the next road segment that matches the driver's personal preference, repeated until the vehicle arrives at its destination. The method optimizes vehicle routes according to the current state of the traffic system, improving traffic system efficiency and alleviating congestion. From a practical standpoint, it performs dynamic route guidance for several guidance objectives at once, which better matches real-world guidance needs. By taking drivers' guidance preferences into account, it provides each driver with a dynamic guidance route that fits personal preference, which raises the acceptance rate of guidance routes, further improves the traffic efficiency of the system, and alleviates congestion.

Description

Dynamic route guidance method based on multi-objective Sarsa learning
Technical field
The invention belongs to the field of intelligent transportation technology, and in particular relates to a dynamic route guidance method based on multi-objective Sarsa learning.
Background
In recent years, with the rapid development of China's society and economy, private car ownership has risen steadily, and the resulting problems, including growing urban traffic pressure, congestion, gridlock, and frequent traffic accidents, have become increasingly serious. In addition, drivers, as important participants in the traffic system, often have several guidance objectives at once when travelling to a destination and hold different preferences for different objectives. Whether a driver's personal preferences are considered has a large effect on how readily guidance information is accepted, and therefore on the traffic efficiency of the system. It is therefore necessary to achieve efficient, dynamic route guidance that both alleviates congestion and satisfies individual driver preferences.
Reinforcement learning has strong adaptivity and self-learning capability. Without prior knowledge or an explicit model, it can continually adjust its control policy as the system environment changes and learn from the system's dynamic information, which meets the control requirements of highly stochastic, complex traffic guidance systems. Sarsa learning, an on-policy reinforcement learning algorithm, is particularly well suited to optimal path search and dynamic vehicle guidance in traffic guidance systems that are complex, changeable, and strongly real-time.
Most route guidance models and algorithms proposed so far are single-objective methods built only on link travel time, ignoring real-world guidance needs and drivers' personal preferences. Multi-objective reinforcement learning is commonly used to solve such multi-objective optimization problems, and methods for finding its optimal solution set fall mainly into single-policy methods and multi-policy methods. Compared with single-policy methods, however, multi-policy methods learn a whole set of optimal solutions at every interaction with the environment in order to approximate the Pareto front, which requires a great deal of computation time and a correspondingly large amount of computation. When multi-policy methods are used in on-policy learning, the computation involved and the time needed to store the solution set make such methods unsuitable for dynamic route guidance systems. Single-policy multi-objective Sarsa learning is therefore suitable for the dynamic route guidance problem in which driver preferences are considered on top of multiple guidance objectives.
Summary of the invention
In view of the above technical problems, the object of the present invention is to provide a dynamic route guidance method based on multi-objective Sarsa learning. It makes full use of real-time traffic data and drivers' personal preference information so that, while providing each driver with route guidance according to personal preference, it coordinates traffic flow across the whole system, alleviates congestion, and improves the throughput efficiency of the traffic system.
The technical solution adopted is a dynamic route guidance method based on multi-objective Sarsa learning, comprising Steps 1 to 3:
Step 1: Information initialization, comprising Steps 1.1 to 1.3:
Step 1.1: Confirm the guidance objectives: select one or more of minimizing travel time, minimizing travel distance, and minimizing cost;
Step 1.2: For the guidance objectives, the traffic information center uses a Q-value-based dynamic programming algorithm, together with the road network information in the geographic information database and the historically collected static data for each road segment, to initialize the Q vector table of each candidate destination for each guidance objective on the road network; each Q vector table corresponds to one candidate destination;
Step 1.3: Set the update interval T at which the traffic information center publishes Q-value information;
The road network information includes: road network topology, segment length, and number of lanes;
The static data for each road segment includes: historical vehicle travel time, distance, and cost;
Step 2: Information update, comprising: defining the guidance objective weights, calculating the current road network traffic congestion coefficient, and updating the Q vector table by the Sarsa learning method every interval T:
(1) Define the guidance objective weights:
Record the current information of all vehicles in the road network, the real-time traffic information of the road segment each vehicle has just traversed, and the preference of each driver currently in the road network. Assuming there are n guidance objectives, the preference of each driver is written as the weight vector $\omega = (\omega_1, \ldots, \omega_n)$, where $\omega_o \in [0,1]$ is the preference weight of the o-th guidance objective. The weight of each guidance objective is defined as follows:
Each driver defines how much they care about each guidance objective; this is the driver's preference, recorded as the weights;
The current information of each vehicle includes: its position, its intended destination, and all next traffic nodes it can reach;
The real-time traffic information of the current road segment includes: travel time, distance, and cost;
(2) Calculate the current road network traffic congestion coefficient: count the number of vehicles NV in the current road network and calculate the current road network traffic congestion coefficient ε from it:
[Formula not reproduced in this text.]
where β and γ are parameters. The traffic congestion coefficient ε represents the current traffic condition of the traffic system: ε increases as the total number of vehicles NV in the road network increases, so a larger ε means heavier congestion, and vice versa.
(3) Every interval T, update the Q vector table by the Sarsa learning method: every interval T, using the real-time information obtained in (1) for the vehicles on each road segment closest to the update time, together with the next road segment assigned in Steps 3.3 and 3.4, update the Q vector table of the corresponding destination separately for each guidance objective o according to the Sarsa learning method. The Sarsa update formula is as follows:
[Formula not reproduced in this text.]
where $Q_d^o(i,j)$ is the Q value, for guidance objective o, of travelling from traffic node i via adjacent traffic node j toward destination d; k is a traffic node adjacent to traffic node j; α is the learning rate; and the reward term is the actual reward value obtained by vehicle v on traversing segment $s_{ij}$;
The actual reward value is one of travel time, distance, or cost; only one is selected.
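The update formula itself did not survive in this text, so the following Python sketch only illustrates one plausible reading of the step. It assumes an undiscounted Sarsa update applied independently to each guidance objective, of the form Q(i,j) ← (1 − α)·Q(i,j) + α·(r + Q(j,k)). The function name `sarsa_update`, the absence of a discount factor, and the numeric values are all assumptions made for illustration, not the patented formula.

```python
def sarsa_update(q_ij, reward, q_jk, alpha):
    """Assumed per-objective Sarsa step: blend the old estimate with reward + next Q.

    q_ij   -- current Q vector of segment (i, j), one entry per objective
    reward -- observed cost of traversing (i, j), one entry per objective
    q_jk   -- Q vector of the next segment (j, k) actually assigned (on-policy)
    alpha  -- learning rate in [0, 1]
    """
    return tuple(
        (1 - alpha) * q + alpha * (r + q_next)
        for q, r, q_next in zip(q_ij, reward, q_jk)
    )


# Hypothetical values: objectives are (travel time in s, cost).
updated = sarsa_update(
    q_ij=(250.0, 21.0), reward=(60.0, 2.0), q_jk=(200.0, 20.0), alpha=0.7
)
print(updated)
```

Because the next segment (j, k) is the one the guidance policy actually assigned, the update stays on-policy, which is the property the text attributes to Sarsa learning.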
Step 3: Guidance path computation, comprising Steps 3.1 to 3.5:
Step 3.1: Normalize the Q vector table: based on the Q vector table updated in Step 2, normalize the Q values of each guidance objective separately using min-max (deviation) normalization:
$$\bar{Q}_d^o(i,j) = \frac{Q_d^o(i,j) - Q_{d,\min}^o}{Q_{d,\max}^o - Q_{d,\min}^o}$$
where $\bar{Q}_d^o(i,j)$ is the normalized Q value of guidance objective o for reaching destination d via segment $s_{ij}$, and $Q_{d,\min}^o$ and $Q_{d,\max}^o$ are the minimum and maximum of the Q values of all segments for destination d and guidance objective o.
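As a minimal sketch of the normalization in Step 3.1, the Python snippet below applies min-max scaling to the Q values of one guidance objective for one destination. The dictionary layout, the function name `min_max_normalize`, and the sample travel times are hypothetical; mapping all values to 0 when every segment has the same Q value is also an assumption, since the text does not say how that case is handled.

```python
def min_max_normalize(q_values):
    """Scale the Q values of one objective into [0, 1] by min-max normalization.

    q_values -- {segment: Q value} for a single destination and objective
    """
    lo, hi = min(q_values.values()), max(q_values.values())
    if hi == lo:
        # Assumption: with no spread, treat every segment as equally good.
        return {segment: 0.0 for segment in q_values}
    return {segment: (q - lo) / (hi - lo) for segment, q in q_values.items()}


# Hypothetical travel-time Q values (seconds) for three segments.
times = {("i", "j"): 250.0, ("j", "k"): 200.0, ("j", "m"): 300.0}
normalized = min_max_normalize(times)
print(normalized)
```

Normalizing each objective separately is what lets seconds and currency units be combined later in a single weighted sum.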
Step 3.2: Calculate the scalar value based on driver preference: using the driver's preference obtained in Step 2, that is, the weight vector ω, and the Q vector table normalized in Step 3.1, apply the linear scalarization function to convert the Q vectors of all segments adjacent to the traffic node where the vehicle is currently located, in the Q vector table for destination d, into preference-based scalar values $SQ_d(i,j)$:
$$SQ_d(i,j) = \sum_{o=1}^{n} \omega_o \, \bar{Q}_d^o(i,j)$$
where n is the number of guidance objectives, $\omega_o$ is the preference weight of objective o, and $\bar{Q}_d^o(i,j)$ is the normalized Q value of objective o for reaching destination d via segment $s_{ij}$;
Step 3.3: Calculate the Boltzmann probability distribution: from the current vehicle information obtained in Step 2, use the preference-based scalar values $SQ_d(i,j)$ to calculate the Boltzmann probability distribution over the segments adjacent to the current traffic node. The formula is as follows:
[Formula not reproduced in this text.]
where $P_d(i,j)$ is the probability that a vehicle bound for destination d selects segment $s_{ij}$; i and j are traffic nodes; A(i) is the set of end nodes of the segments that start at traffic node i, that is, the end nodes of the segments adjacent to the current node as given by the road network topology; ε is the traffic congestion coefficient; and $ESQ_d(i)$ is the average of the preference-based scalar values $SQ_d(i,\cdot)$ from the segments around node i to destination d.
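Since the exact expression is not available here, the sketch below shows only one hypothetical way the named quantities could combine: a softmax over negated scalar values with temperature ε·ESQ, so that segments with a lower scalar value (less time or cost) get a higher probability and a larger ε flattens the distribution. The function name, the temperature form, and the sample values are assumptions and are not claimed to reproduce the patent's numbers.

```python
import math


def boltzmann_probabilities(sq, epsilon):
    """Assumed Boltzmann form: P(j) proportional to exp(-SQ(j) / (epsilon * mean SQ)).

    sq      -- {next node: preference-based scalar value SQ_d(i, j)}
    epsilon -- traffic congestion coefficient, used here as a temperature scale
    """
    mean_sq = sum(sq.values()) / len(sq)
    temperature = epsilon * mean_sq
    if temperature <= 0:
        # Degenerate case (assumption): fall back to a uniform choice.
        return {node: 1.0 / len(sq) for node in sq}
    weights = {node: math.exp(-value / temperature) for node, value in sq.items()}
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


# Hypothetical scalar values for three candidate next nodes.
probs = boltzmann_probabilities({"a": 0.2, "b": 0.3, "c": 0.4}, epsilon=0.5)
print(probs)
```

Under this reading, heavier congestion spreads vehicles over more alternatives, while light traffic concentrates them on the best-scoring segment.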
Step 3.4: Select the next road segment matching the driver's personal preference: based on the Boltzmann probability distribution over segments calculated in Step 3.3, select the next road segment for the driver by roulette-wheel selection;
Step 3.5: If the vehicle has not reached its destination, repeat Steps 3.2 to 3.3 until the vehicle reaches its destination.
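Roulette-wheel selection, as used in Step 3.4, can be sketched as follows: draw a uniform random number and walk the cumulative probabilities until it is exceeded. The function name `roulette_select` and the `rng` parameter are illustrative choices; the probabilities are the three values given in the embodiment's worked example.

```python
import random


def roulette_select(probabilities, rng=random):
    """Pick one key with probability proportional to its value."""
    if not probabilities:
        raise ValueError("no candidate segments")
    total = sum(probabilities.values())
    threshold = rng.random() * total  # spin the wheel
    cumulative = 0.0
    chosen = None
    for segment, p in probabilities.items():
        cumulative += p
        chosen = segment
        if threshold < cumulative:
            break
    # If round-off leaves the threshold above the final sum, the last item is kept.
    return chosen


# Selection probabilities for next nodes k, k', k'' from the worked example.
probs = {"k": 0.3705, "k'": 0.3387, "k''": 0.2908}
next_node = roulette_select(probs, rng=random.Random(42))
print(next_node)
```

Passing a seeded `random.Random` instance makes a simulation run repeatable, while the default module-level generator gives a different draw each time.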
Beneficial effects:
1. The dynamic route guidance method based on multi-objective Sarsa learning makes full use of real-time information from the current traffic system and optimizes vehicle routes according to current traffic conditions, improving traffic system efficiency and alleviating congestion.
2. From a practical standpoint, the method performs dynamic route guidance for multiple guidance objectives, which better matches real-world guidance needs.
3. The method takes drivers' guidance preferences into account and provides each driver with a dynamic guidance route that fits personal preference, raising the acceptance rate of guidance routes and thereby further improving the traffic efficiency of the system and alleviating congestion.
Brief description of the drawings
Fig. 1 is a flow chart of the dynamic route guidance method based on multi-objective Sarsa learning according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of dynamic route guidance according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of vehicle route computation according to an embodiment of the present invention;
Fig. 4 is a schematic comparison of traffic congestion between an embodiment of the present invention and a traditional guidance method.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and a specific embodiment. The information exchange between the dynamic route guidance system and the vehicles is shown in Fig. 2. Vehicles in the road network send data such as their position, destination, and personal preference to the dynamic route guidance system. The system uses these data, together with the collected real-time traffic conditions of the road network, to compute a guidance route matching personal preference with the route guidance algorithm and sends it to the vehicle, completing the information exchange between the two sides. The dynamic route guidance method based on multi-objective Sarsa learning comprises Steps 1 to 3, as shown in Fig. 1:
Step 1: Information initialization, comprising Steps 1.1 to 1.3:
Step 1.1: Confirm the guidance objectives: travel time (time) and travel cost (cost);
Step 1.2: For the guidance objectives, the traffic information center uses a Q-value-based dynamic programming algorithm, together with the road network information in the geographic information database and the historically collected static data for each road segment, to initialize the Q vector table of each candidate destination for each guidance objective on the road network; each Q vector table corresponds to one candidate destination. First, the Q vectors of the segments around each possible destination d are initialized as follows:
[Formula not reproduced in this text.]
where $Q_d(i,j)$ is the initial Q vector for reaching destination d via segment $s_{ij}$; i and j are traffic nodes; $time_{ij}$ and $cost_{ij}$ are the time and cost for a vehicle to traverse segment $s_{ij}$; D is the set of destinations; A(i) is the set of end nodes of the segments starting at traffic node i; and B(i) is the set of start nodes of the segments ending at traffic node i.
Then the Q vectors of all segments for destination d are updated over multiple iterations, using the following update formula:
[Formula not reproduced in this text.]
where $Q_d^{(n)}(i,j)$ is the Q vector obtained at the n-th iteration for segment $s_{ij}$ with destination d, and k is a traffic node adjacent to traffic node j.
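The initialization and iteration formulas are not preserved in this text. A common Q-value-based dynamic programming scheme that fits the description is sketched below for a single objective: segments entering the destination start at their own cost, and every other segment is repeatedly set to its own cost plus the smallest Q value among the segments leaving its end node. This recurrence, the function name `init_q_table`, and the toy network are assumptions for illustration only.

```python
def init_q_table(links, dest, max_iters=100):
    """Assumed cost-to-go initialization for one objective and one destination.

    links -- {(i, j): static cost of segment (i, j)}
    dest  -- destination node d
    Returns {(i, j): estimated cost from i to dest when leaving via (i, j)}.
    """
    inf = float("inf")
    # Segments that end at the destination are initialized to their own cost.
    q = {(i, j): (c if j == dest else inf) for (i, j), c in links.items()}
    for _ in range(max_iters):
        changed = False
        for (i, j), c in links.items():
            if j == dest:
                continue
            # Best continuation among the segments that start at node j.
            best_next = min((q[(a, b)] for (a, b) in links if a == j), default=inf)
            new_value = c + best_next
            if new_value != q[(i, j)]:
                q[(i, j)] = new_value
                changed = True
        if not changed:
            break  # values have converged
    return q


# Hypothetical five-segment network with destination "d".
network = {("a", "b"): 2.0, ("b", "d"): 3.0, ("a", "d"): 10.0,
           ("a", "c"): 1.0, ("c", "d"): 4.0}
q_table = init_q_table(network, dest="d")
print(q_table)
```

Running the same routine once per objective (time, cost) would yield the per-objective entries of a Q vector table under this assumption.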
Step 1.3: Set the update interval T at which the traffic information center publishes Q-value information;
The road network information includes: road network topology, segment length, and number of lanes;
The static data for each road segment includes: historical vehicle travel time and cost;
As shown in Fig. 3, taking a vehicle v whose destination is d and which is located at traffic node j as an example, the dynamic guidance process is as follows:
Step 2: Information update, comprising: defining the guidance objective weights, calculating the current road network traffic congestion coefficient, and updating the Q vector table by the Sarsa learning method every interval T:
(1) Define the guidance objective weights: record the current information of all vehicles in the road network, the real-time traffic information of the road segment each vehicle has just traversed, and the preference of each driver currently in the road network. Assuming there are n guidance objectives, the preference of each driver is written as the weight vector $\omega = (\omega_1, \ldots, \omega_n)$, where $\omega_o \in [0,1]$ is the preference weight of the o-th guidance objective. The weight of each guidance objective is defined as follows:
The current information of each vehicle includes: its position, its intended destination, and all next traffic nodes it can reach;
The real-time traffic information of the current road segment includes: travel time and cost;
Each driver defines how much they care about each guidance objective; this is the driver's preference, recorded as the weights;
Record the current information of vehicle v: position, traffic node j; intended destination, traffic node d; all reachable next traffic nodes, k, k′, k″; the real-time traffic information of the segment $s_{ij}$ just traversed, such as travel time and cost; and the driver's preference. The driver's preference weight vector is ω = (0.8, 0.2), where 0.8 and 0.2 are the preference weights for the time and cost guidance objectives respectively.
(2) Calculate the current road network traffic congestion coefficient: count the number of vehicles NV in the current road network and calculate the current road network traffic congestion coefficient ε from it:
[Formula not reproduced in this text.]
where β and γ are parameters. The traffic congestion coefficient ε represents the current traffic condition of the traffic system: ε increases as the total number of vehicles NV in the road network increases, so a larger ε means heavier congestion, and vice versa. Here β and γ are set to 0.3 and 0.005 respectively; assuming the number of vehicles NV in the current road network is 500, ε = 0.8.
(3) Every interval T, obtain through (1) the real-time information of the vehicles on each road segment closest to the update time, for example the immediate travel time and immediate cost of vehicle v on segment $s_{ij}$, together with the next road segment $s_{jk}$ assigned by the route selection method of Step 3. Assume the learning rate α is 0.7 and that the values of $Q_d(i,j)$ and $Q_d(j,k)$ in the current Q vector table are (250 s, 21 $) and (200 s, 20 $) respectively. The Q vector table of destination d is then updated separately for each guidance objective according to the Sarsa learning method. The Sarsa learning formula is as follows:
[Formula not reproduced in this text.]
where $Q_d(i,j)$ is the Q vector for travelling from traffic node i via adjacent traffic node j to destination d.
Step 3: Guidance path computation, comprising Steps 3.1 to 3.5:
Step 3.1: Normalize the Q vector table: based on the Q vector table updated in Step 2, normalize the Q values of each guidance objective separately using min-max (deviation) normalization, which resolves the problem that different guidance objectives have different units and dimensions:
$$\bar{Q}_d^o(i,j) = \frac{Q_d^o(i,j) - Q_{d,\min}^o}{Q_{d,\max}^o - Q_{d,\min}^o}$$
where $\bar{Q}_d^o(i,j)$ is the normalized Q value of guidance objective o for reaching destination d via segment $s_{ij}$, and $Q_{d,\min}^o$ and $Q_{d,\max}^o$ are the minimum and maximum of the Q values of all segments for destination d and guidance objective o.
Based on the values in the Q vector table and Step 2, the normalized Q vector is obtained, and this value is used to update the normalized Q vector of segment $s_{ij}$ in the normalized Q vector table for destination d.
Step 3.2: Calculate the scalar value based on driver preference: using the driver's preference obtained in Step 2, that is, the weight vector ω, and the Q vector table normalized in Step 3.1, apply the linear scalarization function to convert the Q vectors of all segments adjacent to the traffic node where vehicle v is currently located, in the Q vector table for destination d, into preference-based scalar values $SQ_d(i,j)$. Following Fig. 3, the calculation is:
$SQ_d(j, k) = 0.8 \times 0.195 + 0.2 \times 0.388 = 0.2336$
$SQ_d(j, k') = 0.8 \times 0.253 + 0.2 \times 0.306 = 0.2636$
$SQ_d(j, k'') = 0.8 \times 0.310 + 0.2 \times 0.306 = 0.3092$
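The three weighted sums above can be checked with a short Python sketch of linear scalarization. The weights and normalized Q vectors are the ones stated in the example; the function name `scalarize` and the dictionary layout are illustrative.

```python
def scalarize(weights, normalized_q):
    """Linear scalarization: weighted sum of one segment's normalized Q vector."""
    return sum(w * q for w, q in zip(weights, normalized_q))


# Driver preference for (time, cost) and normalized Q vectors of the three
# candidate segments leaving node j, as given in the worked example.
weights = (0.8, 0.2)
normalized = {
    "k": (0.195, 0.388),
    "k'": (0.253, 0.306),
    "k''": (0.310, 0.306),
}

sq = {node: round(scalarize(weights, q), 4) for node, q in normalized.items()}
print(sq)  # {'k': 0.2336, "k'": 0.2636, "k''": 0.3092}
```

With time weighted at 0.8, segment (j, k) scores lowest even though its cost component is the highest of the three, which shows how the preference vector steers the ranking.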
Step 3.3: Calculate the Boltzmann probability distribution: from the current vehicle information obtained in Step 2, use the preference-based scalar values $SQ_d(i,j)$ to calculate the Boltzmann probability distribution over the segments adjacent to the current traffic node. The formula is as follows:
[Formula not reproduced in this text.]
where $P_d(i,j)$ is the probability that a vehicle bound for destination d selects segment $s_{ij}$; i and j are traffic nodes; A(i) is the set of end nodes of the segments that start at traffic node i, that is, the end nodes of the segments adjacent to the current node as given by the road network topology; ε is the traffic congestion coefficient; and $ESQ_d(i)$ is the average of the preference-based scalar values $SQ_d(i,\cdot)$ from the segments around node i to destination d.
The calculation gives $P_d(j, k) = 0.3705$, $P_d(j, k') = 0.3387$, and $P_d(j, k'') = 0.2908$.
Step 3.4: Select the next road segment matching the driver's personal preference: based on the Boltzmann probability distribution over segments calculated in Step 3.3, select the next road segment for the driver by roulette-wheel selection;
Step 3.5: If the vehicle has not reached its destination, repeat Steps 3.2 to 3.3 until the vehicle reaches its destination.
Fig. 4 compares traffic congestion under the present invention and under a traditional guidance method. The horizontal axis is the simulation time step and the vertical axis is the total number of vehicles currently in the road network; the more vehicles, the more congested the network. In the comparison, Dijk denotes the traditional route guidance method and SMOSWU denotes the method of the present invention. Compared with the traditional Dijkstra route guidance method, the dynamic route guidance method based on multi-objective Sarsa learning proposed by the present invention makes full use of real-time traffic information while taking users' personal preferences into account, improving the efficiency of the traffic system and effectively relieving congestion.

Claims (6)

1. A dynamic route guidance method based on multi-objective Sarsa learning, characterized by comprising the following process:
Step 1: Information initialization, comprising Steps 1.1 to 1.3:
Step 1.1: Confirm the guidance objectives: select one or more of minimizing travel time, minimizing travel distance, and minimizing cost;
Step 1.2: For the guidance objectives, the traffic information center uses a Q-value-based dynamic programming algorithm, together with the road network information in the geographic information database and the historically collected static data for each road segment, to initialize the Q vector table of each candidate destination for each guidance objective on the road network; each Q vector table corresponds to one candidate destination;
Step 1.3: Set the update interval T at which the traffic information center publishes Q-value information;
Step 2: Information update, comprising: defining the guidance objective weights, calculating the current road network traffic congestion coefficient, and updating the Q vector table by the Sarsa learning method every interval T:
(1) Define the guidance objective weights:
Record the current information of all vehicles in the road network, the real-time traffic information of the road segment each vehicle has just traversed, and the preference of each driver in the road network. Assuming there are n guidance objectives, the preference of each driver is written as the weight vector $\omega = (\omega_1, \ldots, \omega_n)$, where $\omega_o \in [0,1]$ is the preference weight of the o-th guidance objective. The weight of each guidance objective is defined as follows:
Each driver defines how much they care about each guidance objective; this is the driver's preference, recorded as the weights;
(2) Calculate the current road network traffic congestion coefficient: count the number of vehicles NV in the current road network and calculate the current road network traffic congestion coefficient ε from it:
[Formula not reproduced in this text.]
where β and γ are parameters, and the traffic congestion coefficient ε represents the current traffic condition of the traffic system;
(3) Every interval T, update the Q vector table by the Sarsa learning method: every interval T, using the real-time information obtained in (1) for the vehicles on each road segment closest to the update time, together with the next road segment assigned in Steps 3.3 and 3.4, update the Q vector table of the corresponding destination separately for each guidance objective o according to the Sarsa learning method. The Sarsa update formula is as follows:
[Formula not reproduced in this text.]
where $Q_d^o(i,j)$ is the Q value, for guidance objective o, of travelling from traffic node i via adjacent traffic node j toward destination d; k is a traffic node adjacent to traffic node j; α is the learning rate; and the reward term is the actual reward value obtained by vehicle v on traversing segment $s_{ij}$;
Step 3: Guidance path computation, comprising Steps 3.1 to 3.5:
Step 3.1: Normalize the Q vector table: based on the Q vector table updated in Step 2, normalize the Q values of each guidance objective separately using min-max (deviation) normalization:
$$\bar{Q}_d^o(i,j) = \frac{Q_d^o(i,j) - Q_{d,\min}^o}{Q_{d,\max}^o - Q_{d,\min}^o}$$
where $\bar{Q}_d^o(i,j)$ is the normalized Q value of guidance objective o for reaching destination d via segment $s_{ij}$, and $Q_{d,\min}^o$ and $Q_{d,\max}^o$ are the minimum and maximum of the Q values of all segments for destination d and guidance objective o;
Step 3.2: Calculate the scalar value based on driver preference: using the driver's preference obtained in Step 2, that is, the weight vector ω, and the Q vector table normalized in Step 3.1, apply the linear scalarization function to convert the Q vectors of all segments adjacent to the traffic node where the vehicle is currently located, in the Q vector table for destination d, into preference-based scalar values $SQ_d(i,j)$:
$$SQ_d(i,j) = \sum_{o=1}^{n} \omega_o \, \bar{Q}_d^o(i,j)$$
where n is the number of guidance objectives, $\omega_o$ is the preference weight of objective o, and $\bar{Q}_d^o(i,j)$ is the normalized Q value of objective o for reaching destination d via segment $s_{ij}$;
Step 3.3: Calculate the Boltzmann probability distribution: from the current vehicle information obtained in Step 2, use the preference-based scalar values $SQ_d(i,j)$ to calculate the Boltzmann probability distribution over the segments adjacent to the current traffic node. The formula is as follows:
[Formula not reproduced in this text.]
where $P_d(i,j)$ is the probability that a vehicle bound for destination d selects segment $s_{ij}$; i and j are traffic nodes; A(i) is the set of end nodes of the segments that start at traffic node i, that is, the end nodes of the segments adjacent to the current node as given by the road network topology; ε is the traffic congestion coefficient; and $ESQ_d(i)$ is the average of the preference-based scalar values $SQ_d(i,\cdot)$ from the segments around node i to destination d;
Step 3.4: Select the next road segment matching the driver's personal preference: based on the Boltzmann probability distribution over segments calculated in Step 3.3, select the next road segment for the driver by roulette-wheel selection;
Step 3.5: If the vehicle has not reached its destination, repeat Steps 3.2 to 3.3 until the vehicle reaches its destination.
2. The dynamic route guidance method based on multi-objective Sarsa learning according to claim 1, characterized in that the road network information in Step 1 includes: road network topology, segment length, and number of lanes.
3. The dynamic route guidance method based on multi-objective Sarsa learning according to claim 1, characterized in that the static data for each road segment in Step 1 includes: historical vehicle travel time, distance, and cost.
4. The dynamic route guidance method based on multi-objective Sarsa learning according to claim 1, characterized in that the current information of each vehicle in Step 2 includes: its position, its intended destination, and all next traffic nodes it can reach.
5. The dynamic route guidance method based on multi-objective Sarsa learning according to claim 1, characterized in that the real-time traffic information of the current road segment in Step 2 includes: travel time, distance, and cost.
6. The dynamic route guidance method based on multi-objective Sarsa learning according to claim 1, characterized in that the actual reward value in Step 2 is one of travel time, distance, or cost; only one is selected.
CN201810992284.5A 2018-08-29 2018-08-29 Dynamic path induction method based on multi-target Sarsa learning Active CN109269516B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810992284.5A CN109269516B (en) 2018-08-29 2018-08-29 Dynamic path induction method based on multi-target Sarsa learning

Publications (2)

Publication Number Publication Date
CN109269516A true CN109269516A (en) 2019-01-25
CN109269516B CN109269516B (en) 2022-03-04

Family

ID=65154604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810992284.5A Active CN109269516B (en) 2018-08-29 2018-08-29 Dynamic path induction method based on multi-target Sarsa learning

Country Status (1)

Country Link
CN (1) CN109269516B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104658297A (en) * 2015-02-04 2015-05-27 沈阳理工大学 Central type dynamic path inducing method based on Sarsa learning
CN106096756A (en) * 2016-05-31 2016-11-09 武汉大学 A kind of urban road network dynamic realtime Multiple Intersections routing resource
CN107977738A (en) * 2017-11-21 2018-05-01 合肥工业大学 A kind of multiobjective optimization control method for conveyer belt feed processing station system
US10024675B2 (en) * 2016-05-10 2018-07-17 Microsoft Technology Licensing, Llc Enhanced user efficiency in route planning using route preferences
CN108389419A (en) * 2018-03-02 2018-08-10 辽宁工业大学 A kind of Dynamic Route Guidance Method of Vehicle

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110631599A (en) * 2019-08-29 2019-12-31 重庆长安汽车股份有限公司 Navigation method, system, server and automobile based on air pollution
CN114664086A (en) * 2019-12-18 2022-06-24 北京嘀嘀无限科技发展有限公司 Method and device for controlling information release, electronic equipment and storage medium
CN114664086B (en) * 2019-12-18 2023-11-24 北京嘀嘀无限科技发展有限公司 Method, device, electronic equipment and storage medium for controlling information release
CN112039767A (en) * 2020-08-11 2020-12-04 山东大学 Multi-data center energy-saving routing method and system based on reinforcement learning
CN113503888A (en) * 2021-07-09 2021-10-15 复旦大学 Dynamic path guiding method based on traffic information physical system
CN114267176A (en) * 2021-12-24 2022-04-01 中电金信软件有限公司 Navigation method, navigation device, electronic equipment and computer readable storage medium
CN114459498A (en) * 2022-03-14 2022-05-10 南京理工大学 New energy vehicle charging station selection and self-adaptive navigation method based on reinforcement learning

Also Published As

Publication number Publication date
CN109269516B (en) 2022-03-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant