CN110267193B - Vehicle position tracking method based on Markov decision process model - Google Patents

Vehicle position tracking method based on Markov decision process model

Info

Publication number
CN110267193B
Authority
CN
China
Prior art keywords
value
cluster
sensor cluster
sensor
target vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910458141.0A
Other languages
Chinese (zh)
Other versions
CN110267193A (en)
Inventor
张�杰
李骏
邢志超
邵雨蒙
梁腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-05-29
Filing date
2019-05-29
Publication date
2021-02-12
Application filed by Nanjing University of Science and Technology
Priority to CN201910458141.0A
Publication of CN110267193A
Application granted
Publication of CN110267193B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02: Services making use of location information
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30: Services specially adapted for particular environments, situations or purposes
    • H04W4/40: Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W64/00: Locating users or terminals or network equipment for network management purposes, e.g. mobility management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Traffic Control Systems (AREA)
  • Navigation (AREA)

Abstract

The invention discloses a vehicle position tracking method based on a Markov decision process model, which comprises the following steps: establishing a two-dimensional road network model; defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining the optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking; and accurately tracking the target vehicle by using a Gaussian weight positioning algorithm based on RSSI. The invention realizes the accurate positioning of the vehicle and provides help for the effective implementation of vehicle position tracking.

Description

Vehicle position tracking method based on Markov decision process model
Technical Field
The invention relates to the technical field of target tracking, in particular to a vehicle position tracking method based on a Markov decision process model.
Background
The Internet of Vehicles uses GPS, on-board terminals and similar devices, together with wireless communication, to make vehicle information available on a user information platform. Vehicle position information and historical driving data provided by these devices are stored in the cloud, where analysis such as data fusion and data mining supports services such as better positioning and road matching, so that users can better understand road traffic conditions, plan routes sensibly and relieve traffic pressure. Real-time position information can also be used to give early warning of road traffic conditions. All of these services depend on vehicle position data, which is strongly uncertain and random.
Existing vehicle positioning methods include vehicle motion trajectory models, discrete linear error models and models based on optimization control algorithms. All of them analyze the target vehicle rather than starting from the sensors, and they suffer from strong uncertainty and randomness.
Disclosure of Invention
The invention aims to provide a vehicle position tracking method based on a Markov decision process model, which is used for accurately tracking the position of a target vehicle in real time.
The technical scheme for realizing the purpose of the invention is as follows: a vehicle position tracking method based on a Markov decision process model comprises the following steps:
step 1, establishing a two-dimensional road network model;
step 2, defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining an optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking;
step 3, accurately tracking the target vehicle by using a Gaussian weight positioning algorithm based on the RSSI.
Compared with the prior art, the invention has the following notable advantages: the optimal action sequence of the sensor cluster is obtained by establishing a two-dimensional road network model and a Markov decision process model; the sensor cluster uses the optimal action sequence to transition to the optimal state, which realizes preliminary tracking; and on this basis, the position coordinates of the target vehicle are accurately tracked by a Gaussian weight positioning algorithm based on RSSI.
Drawings
Fig. 1 is a flow chart of the vehicle position tracking method based on a Markov decision process model according to the invention.
Fig. 2 is a diagram of sensor cluster tracking.
Fig. 3 is a graph comparing the RSSI-based Gaussian weight positioning algorithm with a conventional positioning algorithm.
Detailed Description
The invention provides a vehicle positioning algorithm based on a Markov decision process (MDP). When the target vehicle is at a given coordinate point, the sensor cluster, rather than the vehicle, is taken as the object being modelled: a Markov decision process is established and the Q-learning algorithm from reinforcement learning is used to obtain the optimal action sequence of the sensor cluster, which realizes preliminary target tracking; on this basis, the target vehicle is accurately tracked and positioned by a Gaussian weight positioning algorithm based on RSSI.
As shown in fig. 1, the positioning method includes the following steps:
step 1, establishing a two-dimensional road network model; the method specifically comprises the following steps:
The actual road map is projected into a Cartesian coordinate system on a two-dimensional plane. Roads are divided into three main types: (1) a single road parallel to the X-axis or the Y-axis, represented by its start and end coordinates; (2) two mutually perpendicular roads, parallel to the X-axis and the Y-axis respectively, represented by their start and end coordinates and the coordinates of their intersection; (3) a single road parallel to neither axis, represented by its start and end coordinates together with the two intersection points of the lines drawn through its start and end points parallel to the X-axis and the Y-axis. More complex roads can be decomposed into combinations of these three types.
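The three road types can be captured in a small data model. The sketch below is illustrative only: the class names, field names and the `auxiliary_points` helper are hypothetical and do not come from the patent, and the "two intersection points" of an oblique road are read as the corners where the axis-parallel lines through its two endpoints meet.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


def is_axis_parallel(start: Point, end: Point) -> bool:
    """True when the segment is parallel to the X-axis or the Y-axis."""
    return start[0] == end[0] or start[1] == end[1]


@dataclass(frozen=True)
class AxisParallelRoad:
    """Type 1: a single axis-parallel road, given by its two endpoints."""
    start: Point
    end: Point


@dataclass(frozen=True)
class PerpendicularRoads:
    """Type 2: two perpendicular axis-parallel roads plus their intersection."""
    first: AxisParallelRoad
    second: AxisParallelRoad
    intersection: Point


@dataclass(frozen=True)
class ObliqueRoad:
    """Type 3: a single road parallel to neither axis."""
    start: Point
    end: Point

    def auxiliary_points(self) -> Tuple[Point, Point]:
        # The vertical line through one endpoint meets the horizontal line
        # through the other endpoint at these two corner points.
        (x1, y1), (x2, y2) = self.start, self.end
        return (x1, y2), (x2, y1)


road = ObliqueRoad(start=(0.0, 0.0), end=(3.0, 4.0))
print(road.auxiliary_points())
```

A more complex road would then be a list of these three kinds of objects, matching the statement that complex roads decompose into combinations of the three types.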
Step 2, defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining an optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking; the method specifically comprises the following steps:
s21, the cluster value of each sensor cluster can be 0 or 1; when the cluster value is 0, the sensor cluster is in a dormant state, and when the cluster value is 1, the sensor cluster is in a working state; the binary number combination formed by the cluster values of each sensor cluster is the state of the sensor cluster; the action sub-value of each sensor cluster may take 0 or 1; when the corresponding action sub value of the sensor cluster is 1, the state is changed, and when the corresponding action sub value is 0, the state is kept unchanged; the binary number combination formed by each action sub-value is the action of the sensor cluster; the states and the transitions between the states satisfy:
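$$s_{t+1} = s_t \wedge a_k, \quad k = 0, 1, \dots, N$$
where $s_t$ is the state value of the sensor cluster at the current moment, $s_{t+1}$ is the state value of the sensor cluster at the next moment, $a_k$ is the action value taken, and $N$ is the number of elements in the state set or action set. Given the rule stated in S21 (an action sub-value of 1 changes the cluster value and 0 keeps it), the operator $\wedge$ acts bit by bit as an exclusive OR.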
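As a minimal sketch of the state transition, the states and actions can be stored as bit masks, one bit per sensor cluster. This assumes the transition operator toggles exactly the clusters whose action sub-value is 1, as the verbal rule in S21 states, so it is written here as Python's bitwise XOR; the function names and the choice of four clusters (taken from the example later in the text) are illustrative.

```python
NUM_CLUSTERS = 4  # four sensor clusters, as in the example section


def apply_action(state: int, action: int) -> int:
    """Toggle every cluster whose action sub-value is 1; keep the others."""
    return state ^ action


def to_bits(value: int, width: int = NUM_CLUSTERS) -> str:
    """Render a state or action as a fixed-width binary string."""
    return format(value, f"0{width}b")


state = 0b0000                          # all clusters dormant
state = apply_action(state, 0b0110)     # wake the two middle clusters
print(to_bits(state))
state = apply_action(state, 0b0100)     # put one of them back to sleep
print(to_bits(state))
```

Under this encoding both the state set and the action set have 2^4 = 16 elements, and the all-zero action leaves the state unchanged.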
S22, define the direct reward when the target vehicle is at a given coordinate: when the target vehicle is inside the working range of a sensor cluster whose cluster value is 1, the direct reward is positive; when it is inside the working range of a sensor cluster whose cluster value is 0, the direct reward is negative; when it is outside the working range of a sensor cluster whose cluster value is 1, the direct reward is negative; when it is outside the working range of a sensor cluster whose cluster value is 0, the direct reward is 0. A Markov decision process is then established, and each Q value is calculated with the Q-learning algorithm from reinforcement learning:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$
where $s_t$ is the state value of the sensor cluster at the current moment, $s_{t+1}$ is the state value of the sensor cluster at the next moment, $a_t$ is the action value taken at the current moment, $a'$ is any action element in the action set, $r$ is the direct reward, $\alpha$ is the learning rate, and $\gamma$ is the reward discount value. Iterating this calculation yields the final Q table, from which the optimal action sequence of the sensor cluster for the current coordinate of the target vehicle is obtained. The sensor cluster uses this sequence to transition to the optimal state, in which sensors close to the vehicle are working and sensors far from it are dormant, which realizes preliminary tracking.
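The reward rule and the Q-table iteration can be sketched with tabular Q-learning. Everything numeric below is an assumption made for illustration: the cluster centres, the working radius, the reward magnitudes of +1 and -1, the per-cluster summation of rewards, the epsilon-greedy exploration and the hyperparameters are not given in the patent. Transitions are deterministic, in line with the example's assumption that the state transition probability is approximately 1, and the state update uses the bit-toggling reading of S21.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CLUSTERS = 4
N = 2 ** N_CLUSTERS                      # size of the state set and action set
# Hypothetical cluster centres and working radius (not from the patent).
CENTERS = np.array([[2.0, 3.0], [6.0, 3.0], [2.0, 8.0], [6.0, 8.0]])
RADIUS = 2.0
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2    # assumed hyperparameters


def direct_reward(state: int, target) -> float:
    """Sum the per-cluster rewards of S22 (assumed magnitudes +1 / -1 / 0)."""
    in_range = np.linalg.norm(CENTERS - np.asarray(target), axis=1) <= RADIUS
    reward = 0.0
    for i in range(N_CLUSTERS):
        working = (state >> i) & 1       # bit i is the value of cluster i
        if in_range[i]:
            reward += 1.0 if working else -1.0
        elif working:
            reward -= 1.0                # working, but the vehicle is out of range
    return reward


def train(target, episodes: int = 500, steps: int = 10) -> np.ndarray:
    """Fill a Q table with the standard Q-learning update."""
    q = np.zeros((N, N))
    for _ in range(episodes):
        s = 0                            # initial state 0000
        for _ in range(steps):
            if rng.random() < EPSILON:
                a = int(rng.integers(N))         # explore
            else:
                a = int(np.argmax(q[s]))         # exploit
            s_next = s ^ a                       # toggle the selected clusters
            r = direct_reward(s_next, target)
            q[s, a] += ALPHA * (r + GAMMA * q[s_next].max() - q[s, a])
            s = s_next
    return q


q_table = train(target=(2.1, 3.2))
best_action = int(np.argmax(q_table[0]))
next_state = 0 ^ best_action
print(format(next_state, "04b"))
```

With these assumed positions only the first cluster covers the target at (2.1, 3.2), so the greedy action from the all-dormant state should wake that cluster alone. Following the greedy action step by step from the Q table gives the optimal action sequence described above.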
Step 3, accurately tracking the target vehicle by utilizing a Gaussian weight positioning algorithm based on RSSI (received signal strength indicator); the method specifically comprises the following steps:
S31, for each sensor in a sensor cluster in the working state, the distance $d_i$ to the target vehicle is obtained with the RSSI ranging formula, and the resulting distances are sorted in ascending order. The total number of sensors in the sensor cluster is $3N$, and the position of the sensor corresponding to the $i$-th sorted distance is $(x_i, y_i)$, $i = 1, 2, \dots, 3N$. Letting the position of the target vehicle be $(x, y)$, we obtain:
$$(x - x_i)^2 + (y - y_i)^2 = d_i^2, \quad i = 1, 2, \dots, 3N$$
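The text does not spell out the RSSI ranging formula that produces the distances. A commonly used choice is the log-distance path-loss model, and the sketch below assumes it; the reference power at 1 m, the path-loss exponent, the sensor names and the readings are all hypothetical values chosen for illustration.

```python
import math


def rssi_to_distance(rssi_dbm: float,
                     rssi_at_1m: float = -40.0,
                     path_loss_exponent: float = 2.0) -> float:
    """Invert RSSI(d) = RSSI(1 m) - 10 * n * log10(d) to get d in metres."""
    return 10.0 ** ((rssi_at_1m - rssi_dbm) / (10.0 * path_loss_exponent))


# Hypothetical readings from three working sensors, in dBm.
readings = {"s1": -52.0, "s2": -46.0, "s3": -60.0}

# Sort sensors by estimated distance, smallest first, as S31 requires.
ranked = sorted(readings, key=lambda name: rssi_to_distance(readings[name]))
for name in ranked:
    print(name, round(rssi_to_distance(readings[name]), 2))
```

In practice the two model parameters would be calibrated for the deployment environment, since RSSI varies strongly with obstacles and multipath.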
Taking the equations three at a time, in sorted order, gives N groups in total; each group simplifies to:
Figure BDA0002077225640000033
Solving each group by the least-squares method gives the position coordinates of the target vehicle:
Figure BDA0002077225640000034
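One common way to solve a group of three range equations by least squares is to subtract the last equation from the other two, which removes the quadratic terms and leaves a linear system. The sketch below assumes that linearisation; the patent's own simplified form is shown only as a figure and may be arranged differently. The function name and the test positions are hypothetical.

```python
import numpy as np


def trilaterate(sensors, distances) -> np.ndarray:
    """Least-squares position from sensor coordinates and measured ranges.

    Subtracting the last circle equation from the others gives, for each i,
    2(x_i - x_r) x + 2(y_i - y_r) y
        = x_i^2 - x_r^2 + y_i^2 - y_r^2 + d_r^2 - d_i^2,
    where r indexes the last (reference) sensor.
    """
    p = np.asarray(sensors, dtype=float)
    d = np.asarray(distances, dtype=float)
    ref, d_ref = p[-1], d[-1]
    a = 2.0 * (p[:-1] - ref)
    b = (p[:-1, 0] ** 2 - ref[0] ** 2
         + p[:-1, 1] ** 2 - ref[1] ** 2
         + d_ref ** 2 - d[:-1] ** 2)
    solution, *_ = np.linalg.lstsq(a, b, rcond=None)
    return solution


# Noise-free check: three non-collinear sensors around a known position.
true_position = np.array([2.1, 3.2])
sensors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 5.0]])
distances = np.linalg.norm(sensors - true_position, axis=1)
estimate = trilaterate(sensors, distances)
print(np.round(estimate, 3))
```

The three sensors of a group must not lie on one line, otherwise the linear system is rank deficient and the position is not uniquely determined.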
There are N coordinates in total. Each coordinate corresponds to an average distance:
Figure BDA0002077225640000041
The weight of each coordinate point is defined by a Gaussian function as follows:
Figure BDA0002077225640000042
where $\sigma$ is the influence degree of the coordinate point, with a value range of [0.1, 0.2].
The coordinates of the final target vehicle are:
Figure BDA0002077225640000043
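The weighting step can be sketched as follows. The exact formulas appear only as figures, so three things are assumed here and may differ from the patent: the average distance of a group is the mean of its three sorted distances, the weight is an unnormalised Gaussian exp(-d^2 / (2 sigma^2)) of that average, and the final coordinate is the weight-normalised sum of the N group estimates. All function names and sample numbers are hypothetical.

```python
import numpy as np


def group_mean_distances(sorted_distances) -> np.ndarray:
    """Mean distance of each consecutive group of three sorted ranges."""
    d = np.asarray(sorted_distances, dtype=float)
    return d.reshape(-1, 3).mean(axis=1)


def gaussian_weights(mean_distances, sigma: float = 0.15) -> np.ndarray:
    """Normalised Gaussian weights; closer groups receive larger weights."""
    d = np.asarray(mean_distances, dtype=float)
    log_w = -(d ** 2) / (2.0 * sigma ** 2)
    log_w -= log_w.max()        # shift avoids underflow; cancels on normalising
    w = np.exp(log_w)
    return w / w.sum()


def fuse(estimates, mean_distances, sigma: float = 0.15) -> np.ndarray:
    """Weighted average of the N per-group position estimates."""
    w = gaussian_weights(mean_distances, sigma)
    return w @ np.asarray(estimates, dtype=float)


# Hypothetical input: two groups (N = 2) built from six sorted distances.
mean_d = group_mean_distances([0.10, 0.15, 0.20, 0.30, 0.35, 0.40])
estimates = [[2.08, 3.21], [2.30, 3.05]]
print(np.round(gaussian_weights(mean_d), 3))
print(np.round(fuse(estimates, mean_d), 3))
```

With sigma in [0.1, 0.2] the weights fall off quickly with distance, so the groups built from the nearest sensors dominate the final estimate; the log-space shift keeps the normalisation well defined even when every raw weight would underflow.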
The present invention is described in detail below with reference to an example and the accompanying drawings.
Examples
The method is implemented in PyCharm. Let the actual position of the target vehicle map to (2.1, 3.2) in the two-dimensional road network model. There are four sensor clusters, and the initial state of the sensor cluster is {0000}. The state transition probability of a sensor cluster is assumed to be approximately 1.
Fig. 2 is a diagram of sensor cluster tracking. The black circle is the target vehicle; each square is a sensor cluster, which is dormant when the square is black and working when the square is white. It can be observed that, as the target vehicle moves, the sensor cluster keeps using the optimal action sequence to transition to the optimal state, which realizes preliminary tracking.
Fig. 3 compares the RSSI-based Gaussian weight positioning algorithm with the conventional three-point positioning algorithm; each algorithm is simulated 100 times to obtain target vehicle coordinate points. It can be observed that the results of the RSSI-based Gaussian weight positioning algorithm are significantly better than those of the conventional positioning algorithm.

Claims (2)

1. A vehicle position tracking method based on a Markov decision process model is characterized by comprising the following steps:
step 1, establishing a two-dimensional road network model;
step 2, defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining an optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking; the specific process is as follows:
s21, the cluster value of each sensor cluster can be 0 or 1; when the cluster value is 0, the sensor cluster is in a dormant state, and when the cluster value is 1, the sensor cluster is in a working state; the binary number combination formed by the cluster values of each sensor cluster is the state of the sensor cluster; the action sub-value of each sensor cluster may take 0 or 1; when the corresponding action sub value of the sensor cluster is 1, the state is changed, and when the corresponding action sub value is 0, the state is kept unchanged; the binary number combination formed by each action sub-value is the action of the sensor cluster; the states and the transitions between the states satisfy:
$$s_{t+1} = s_t \wedge a_t$$
$s_t$ is the state value of the sensor cluster at the current moment, $s_{t+1}$ is the state value of the sensor cluster at the next moment, and $a_t$ is the action value taken at the current moment;
S22, defining a direct reward when the target vehicle is at a given coordinate: when the target vehicle is inside the working range of a sensor cluster whose cluster value is 1, the direct reward is positive; when the target vehicle is inside the working range of a sensor cluster whose cluster value is 0, the direct reward is negative; when the target vehicle is outside the working range of a sensor cluster whose cluster value is 1, the direct reward is negative; when the target vehicle is outside the working range of a sensor cluster whose cluster value is 0, the direct reward is 0; establishing a Markov decision process model, and calculating each Q value by using the Q-learning algorithm in reinforcement learning:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$
$a'$ is any action element in the action set, $r$ is the direct reward, $\alpha$ is the learning rate, and $\gamma$ is the reward discount value; obtaining a final Q table through iterative calculation, and obtaining an optimal action sequence of the sensor cluster under the current coordinate of the target vehicle according to the Q table;
step 3, accurately tracking the target vehicle by using a Gaussian weight positioning algorithm based on RSSI (received signal strength indicator), which specifically comprises the following steps:
S31, for each sensor in a sensor cluster in the working state, obtaining the distance $d_i$ to the target vehicle by using the RSSI ranging formula, and sorting the obtained distances in ascending order; the total number of sensors in the sensor cluster is $3N$, and the position coordinate of the sensor corresponding to the $i$-th sorted distance is $(x_i, y_i)$, $i = 1, 2, \dots, 3N$; letting the position coordinates of the target vehicle be $(x, y)$, we can obtain:
$$(x - x_i)^2 + (y - y_i)^2 = d_i^2, \quad i = 1, 2, \dots, 3N$$
three equations are taken in sequence for N times in total, and the simplification is as follows:
Figure FDA0002838513060000022
obtaining specific position coordinates of target vehicle by using least square method
Figure FDA0002838513060000023
N coordinates are total; each coordinate corresponds to an average distance:
Figure FDA0002838513060000024
defining the weight of each coordinate point by a Gaussian function as follows:
Figure FDA0002838513060000025
$\sigma$ is the influence degree of the coordinate point, with a value range of [0.1, 0.2];
the coordinates of the final target vehicle are:
Figure FDA0002838513060000026
2. The Markov decision process model-based vehicle position tracking method of claim 1, wherein the specific process of step 1 is as follows:
projecting an actual road map into a Cartesian coordinate system on a two-dimensional plane; roads are divided into three main types, including: a single road parallel to the X-axis or the Y-axis, represented by its start and end coordinates; two mutually perpendicular roads, parallel to the X-axis and the Y-axis respectively, represented by their start and end coordinates and the coordinates of their intersection; a single road parallel to neither the X-axis nor the Y-axis, represented by its start and end coordinates together with the coordinates of the two intersection points of the lines drawn through its start and end points parallel to the X-axis and the Y-axis; other, more complex roads can be decomposed into combinations of the above three road types.
CN201910458141.0A 2019-05-29 2019-05-29 Vehicle position tracking method based on Markov decision process model Active CN110267193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910458141.0A CN110267193B (en) 2019-05-29 2019-05-29 Vehicle position tracking method based on Markov decision process model

Publications (2)

Publication Number Publication Date
CN110267193A CN110267193A (en) 2019-09-20
CN110267193B true CN110267193B (en) 2021-02-12

Family

ID=67915828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910458141.0A Active CN110267193B (en) 2019-05-29 2019-05-29 Vehicle position tracking method based on Markov decision process model

Country Status (1)

Country Link
CN (1) CN110267193B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111752274B (en) * 2020-06-17 2022-06-24 杭州电子科技大学 Laser AGV path tracking control method based on reinforcement learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102883429A (en) * 2012-08-30 2013-01-16 北京航空航天大学 Method and device for tracking move object in sensor network based on directional antenna
CN103152819A (en) * 2013-01-29 2013-06-12 浙江大学 Dim target tracking method based on underwater wireless sensor network
CN105788263A (en) * 2016-04-27 2016-07-20 大连理工大学 Method for predicating road jam through mobile phone information
CN109005512A (en) * 2018-06-26 2018-12-14 西北工业大学 A kind of position predicting method towards specified time interval

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944234B (en) * 2010-07-23 2012-07-25 中国科学院研究生院 Multi-object tracking method and device driven by characteristic trace
US9226110B2 (en) * 2012-03-31 2015-12-29 Groupon, Inc. Method and system for determining location of mobile device
CN102685886A (en) * 2012-04-16 2012-09-19 浙江大学城市学院 Indoor positioning method applied to mobile sensing network
CN103853908B (en) * 2012-12-04 2017-11-14 中国科学院沈阳自动化研究所 A kind of maneuvering target tracking method of adaptive interaction formula multi-model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
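Likelihood-Based Data Association for Extended Object Tracking Using Sampling Methods; Karl Granstrom et al.; IEEE Transactions on Intelligent Vehicles; 20171229; Vol. 3 (No. 1); full text *
Research on Target Tracking Algorithms in Sensor Networks; Liu Xu (柳絮); China Master's Theses Full-text Database (中国优秀硕士学位论文全文数据库); 20120615; full text *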

Also Published As

Publication number Publication date
CN110267193A (en) 2019-09-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant