CN110267193B - Vehicle position tracking method based on Markov decision process model - Google Patents
Vehicle position tracking method based on Markov decision process model
- Publication number
- CN110267193B (application CN201910458141.0A)
- Authority
- CN
- China
- Prior art keywords
- value
- cluster
- sensor cluster
- sensor
- target vehicle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/40—Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W64/00—Locating users or terminals or network equipment for network management purposes, e.g. mobility management
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Traffic Control Systems (AREA)
- Navigation (AREA)
Abstract
The invention discloses a vehicle position tracking method based on a Markov decision process model, which comprises the following steps: establishing a two-dimensional road network model; defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining the optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking; and accurately tracking the target vehicle by using a Gaussian weight positioning algorithm based on RSSI. The invention realizes the accurate positioning of the vehicle and provides help for the effective implementation of vehicle position tracking.
Description
Technical Field
The invention relates to the technical field of target tracking, in particular to a vehicle position tracking method based on a Markov decision process model.
Background
The Internet of Vehicles uses GPS, on-board terminals and other devices, together with wireless communication technology, to make effective use of vehicle-related information on a user information platform. It stores the vehicle position information and historical driving data provided by these devices in the cloud and performs analysis such as data fusion and data mining, in order to offer users better services such as positioning and road matching, so that users can better understand road traffic conditions, plan their routes sensibly, and relieve traffic pressure. Real-time position information can also be used to give early warning of road traffic conditions. All of this relies on vehicle position data, which is strongly uncertain and random.
Existing vehicle positioning methods include vehicle motion trajectory models, discrete linear error models, and models based on optimal control algorithms. All of these are based on analysis of the target vehicle rather than starting from the sensors, and they exhibit strong uncertainty and randomness.
Disclosure of Invention
The invention aims to provide a vehicle position tracking method based on a Markov decision process model, which is used for accurately tracking the position of a target vehicle in real time.
The technical scheme for realizing the purpose of the invention is as follows: a vehicle position tracking method based on a Markov decision process model comprises the following steps:
step 1, establishing a two-dimensional road network model;
step 2, defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining an optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking;
step 3, accurately tracking the target vehicle by using a Gaussian weight positioning algorithm based on the RSSI (received signal strength indicator).
Compared with the prior art, the invention has the following notable advantages: the optimal action sequence of the sensor cluster is obtained by establishing a two-dimensional road network model and a Markov decision process model; the sensor cluster uses the optimal action sequence to transition to the optimal state, achieving preliminary tracking; and on this basis, the position coordinates of the target vehicle are tracked accurately with the RSSI-based Gaussian weight positioning algorithm.
Drawings
Fig. 1 is a flow chart of the vehicle position tracking method based on a Markov decision process model according to the invention.
Fig. 2 is a diagram of sensor cluster tracking.
Fig. 3 is a graph comparing the RSSI-based Gaussian weight positioning algorithm with a conventional positioning algorithm.
Detailed Description
The invention provides a vehicle positioning algorithm based on a Markov decision process (MDP). When the target vehicle is located at a given coordinate point, the sensor cluster is taken as the object of study: a Markov decision process is established and the Q-learning algorithm of reinforcement learning is used to obtain the optimal action sequence of the sensor cluster, which achieves preliminary target tracking. On this basis, the target vehicle is accurately tracked and positioned with the RSSI-based Gaussian weight positioning algorithm.
As shown in fig. 1, the positioning method includes the following steps:
step 1, establishing a two-dimensional road network model; the method specifically comprises the following steps:
The actual road map is projected into a Cartesian rectangular coordinate system on a two-dimensional plane. Roads are divided into three main types: a single road parallel to the X axis or the Y axis, represented by its head and tail coordinates; two mutually perpendicular roads, parallel to the X axis and the Y axis respectively, represented by their head and tail coordinates and their intersection coordinates; and a single road parallel to neither the X axis nor the Y axis, represented by its head and tail coordinates and by the two intersection points of the extension lines that pass through the head and tail points parallel to the X axis and the Y axis. All other, more complex roads can be decomposed into combinations of these three road types.
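The three road types can be sketched as small data structures. This is only an illustrative sketch: the class names, the `make_road` helper and the sample coordinates are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Point = Tuple[float, float]


@dataclass(frozen=True)
class AxisParallelRoad:
    """Type 1: a single road parallel to the X or Y axis, given by head and tail."""
    head: Point
    tail: Point


@dataclass(frozen=True)
class PerpendicularRoads:
    """Type 2: two perpendicular axis-parallel roads plus their intersection."""
    first: AxisParallelRoad
    second: AxisParallelRoad
    intersection: Point


@dataclass(frozen=True)
class ObliqueRoad:
    """Type 3: a single road parallel to neither axis, given by head and tail."""
    head: Point
    tail: Point

    def extension_intersections(self) -> Tuple[Point, Point]:
        # Lines through head and tail parallel to the axes meet at two corner points.
        (x1, y1), (x2, y2) = self.head, self.tail
        return (x2, y1), (x1, y2)


def make_road(head: Point, tail: Point) -> Union[AxisParallelRoad, ObliqueRoad]:
    """Choose type 1 or type 3 for one straight segment (hypothetical helper)."""
    (x1, y1), (x2, y2) = head, tail
    if x1 == x2 or y1 == y2:
        return AxisParallelRoad(head, tail)
    return ObliqueRoad(head, tail)


road = make_road((0.0, 0.0), (4.0, 3.0))
print(type(road).__name__, road.extension_intersections())
```

A more complex road would then be stored as a list of such objects, in line with the statement that other roads decompose into these three types.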
Step 2, defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining an optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking; the method specifically comprises the following steps:
S21, the cluster value of each sensor cluster is 0 or 1: a cluster value of 0 means the sensor cluster is dormant, and a cluster value of 1 means it is working. The binary number formed by the cluster values of all sensor clusters is the state of the sensor clusters. The action sub-value of each sensor cluster is likewise 0 or 1: an action sub-value of 1 changes the state of the corresponding sensor cluster, and an action sub-value of 0 leaves it unchanged. The binary number formed by all action sub-values is the action of the sensor clusters. The states and the transitions between them satisfy:
$$s_{t+1} = s_t \wedge a_k, \quad k = 0, 1, \ldots, N$$
where $s_t$ is the state value of the sensor cluster at the current moment, $s_{t+1}$ is the state value of the sensor cluster at the next moment, $a_k$ is the action value taken, and $N$ is the number of elements in the state set or action set.
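The transition rule can be sketched in a few lines. The sketch assumes that the operator in the formula is read as bitwise exclusive-or, since that is the operation the verbal rule describes (a sub-value of 1 flips the cluster, a sub-value of 0 keeps it). The function names and the four-cluster width are illustrative choices.

```python
N_CLUSTERS = 4  # illustrative width; the worked example in the text also uses four clusters


def to_bits(value: int, n: int = N_CLUSTERS) -> str:
    """Render a state or action as a fixed-width binary string."""
    return format(value, f"0{n}b")


def transition(state: int, action: int) -> int:
    """Flip every cluster whose action sub-value is 1 and keep the others.

    Assumption: the transition operator is bitwise XOR, matching the verbal rule.
    """
    return state ^ action


state = 0b0000                        # all clusters dormant
state = transition(state, 0b0110)     # wake the two middle clusters
print(to_bits(state))                 # 0110
state = transition(state, 0b0100)     # put one of them back to sleep
print(to_bits(state))                 # 0010
```

With this reading, the action that moves any state `s` to a desired state `g` is simply `s ^ g`, and the all-zero action leaves the state unchanged.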
S22, a direct reward is defined for the target vehicle at a given coordinate: when the target vehicle is inside the working range of a sensor cluster whose cluster value is 1, the direct reward is positive; when the target vehicle is inside the working range of a sensor cluster whose cluster value is 0, the direct reward is negative; when the target vehicle is outside the working range of a sensor cluster whose cluster value is 1, the direct reward is negative; and when the target vehicle is outside the working range of a sensor cluster whose cluster value is 0, the direct reward is 0. A Markov decision process is established, and each Q value is calculated with the Q-learning algorithm of reinforcement learning:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$
where $s_t$ is the state value of the sensor cluster at the current moment, $s_{t+1}$ is the state value of the sensor cluster at the next moment, $a_t$ is the action value taken at the current moment, $a'$ is any action element in the action set, $r$ is the direct reward, $\alpha$ is the learning rate, and $\gamma$ is the reward discount value. The final Q table is obtained through iterative calculation. From the Q table, the optimal action sequence of the sensor cluster for the current coordinate of the target vehicle is obtained, and the sensor cluster uses this sequence to transition to the optimal state, in which sensors close to the vehicle are working and sensors far from the vehicle are dormant, thereby achieving preliminary tracking.
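A minimal tabular Q-learning sketch of this step follows. Several details are assumptions made only for illustration, because the text does not fix them: the cluster centres and working radius, the reward magnitudes (+1, -1, 0, summed over clusters and evaluated on the next state), the exclusive-or transition, the uniformly random exploration policy, and the values of α, γ and the step count.

```python
import math
import random

# Hypothetical layout: cluster centres and a shared working radius (not from the source).
CENTERS = [(2.0, 3.0), (3.0, 3.0), (6.0, 3.0), (2.0, 7.0)]
RADIUS = 1.5
N = len(CENTERS)       # number of sensor clusters
N_STATES = 1 << N      # 2^N binary states, and equally many actions


def cluster_value(state: int, i: int) -> int:
    """Cluster value of cluster i (0 dormant, 1 working); cluster 0 is the leftmost bit."""
    return (state >> (N - 1 - i)) & 1


def reward(state: int, target) -> float:
    """Sum of per-cluster direct rewards; the magnitudes +1 / -1 / 0 are assumed."""
    total = 0.0
    for i, (cx, cy) in enumerate(CENTERS):
        in_range = math.hypot(target[0] - cx, target[1] - cy) <= RADIUS
        working = cluster_value(state, i) == 1
        if in_range and working:
            total += 1.0   # vehicle covered by a working cluster
        elif in_range or working:
            total -= 1.0   # covered but dormant, or working without covering
        # out of range and dormant contributes 0
    return total


def train(target, steps=100_000, alpha=0.5, gamma=0.8, seed=0):
    """Tabular Q-learning with uniformly random exploration (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0] * N_STATES for _ in range(N_STATES)]
    s = 0  # start with every cluster dormant
    for _ in range(steps):
        a = rng.randrange(N_STATES)
        s_next = s ^ a  # deterministic transition, assumed to be bitwise XOR
        r = reward(s_next, target)
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next
    return q


def greedy_action(q, state: int) -> int:
    """Action with the largest Q value in the given state."""
    return max(range(N_STATES), key=lambda a: q[state][a])


q_table = train((2.1, 3.2))
best = greedy_action(q_table, 0b0000)
print(format(best, "04b"), format(0b0000 ^ best, "04b"))
```

Because the assumed transition is deterministic, the greedy action read from the learned table takes the clusters to the highest-reward state in a single step; a stochastic transition model would need a longer action sequence and more training.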
Step 3, accurately tracking the target vehicle by utilizing a Gaussian weight positioning algorithm based on RSSI (received signal strength indicator); the method specifically comprises the following steps:
S31, for each sensor in a sensor cluster that is in the working state, the distance $d_i$ to the target vehicle is obtained with the RSSI ranging formula, and the resulting set of distances is sorted in ascending order. Let the total number of sensors in the sensor cluster be $3N$, let the position coordinates of the sensor corresponding to each sorted distance be $(x_i, y_i)$, $i = 1, 2, \ldots, 3N$, and let the specific position coordinates of the target vehicle be $(x, y)$; then:
$$(x - x_i)^2 + (y - y_i)^2 = d_i^2, \quad i = 1, 2, \ldots, 3N$$
Taking three equations at a time in sequence, $N$ times in total, and simplifying gives:
The specific position coordinates of the target vehicle are then obtained for each group by the least squares method, giving $N$ coordinates in total. Each coordinate corresponds to an average distance:
The weight of each coordinate point is defined by a Gaussian function as follows:
where $\sigma$ is the degree of influence of the coordinate point, with a value range of [0.1, 0.2].
The final coordinates of the target vehicle are:
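The positioning step can be sketched as below. The formulas for ranging, averaging and weighting are not reproduced in the text, so the sketch fills them with common choices that should be read as assumptions: a log-distance path-loss model for ranging (with hypothetical parameters `rssi_at_1m` and `path_loss_exp`), linearisation by subtracting the last circle equation of each group, the arithmetic mean of the three distances, and a weight of the form exp(-d̄²/(2σ²)).

```python
import math
from typing import List, Sequence, Tuple

Sensor = Tuple[float, float, float]  # (x_i, y_i, d_i)


def rssi_to_distance(rssi: float, rssi_at_1m: float = -40.0,
                     path_loss_exp: float = 2.0) -> float:
    """Assumed log-distance ranging model; both parameters are hypothetical."""
    return 10 ** ((rssi_at_1m - rssi) / (10.0 * path_loss_exp))


def trilaterate(group: Sequence[Sensor]) -> Tuple[float, float]:
    """Least-squares position from one group of three circle equations."""
    x3, y3, d3 = group[-1]
    rows, rhs = [], []
    for xi, yi, di in group[:-1]:
        # Subtracting the last circle equation removes the quadratic terms.
        rows.append((2 * (xi - x3), 2 * (yi - y3)))
        rhs.append(xi**2 - x3**2 + yi**2 - y3**2 + d3**2 - di**2)
    # Normal equations (A^T A) p = A^T b, solved with Cramer's rule.
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("sensors in this group are collinear")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


def locate(sensors: List[Sensor], sigma: float = 0.15) -> Tuple[float, float]:
    """Gaussian-weighted combination of the per-group position estimates."""
    ordered = sorted(sensors, key=lambda s: s[2])           # ascending distance
    usable = len(ordered) - len(ordered) % 3
    groups = [ordered[i:i + 3] for i in range(0, usable, 3)]
    points = [trilaterate(g) for g in groups]
    mean_sq = [(sum(s[2] for s in g) / 3.0) ** 2 for g in groups]
    base = min(mean_sq)
    # Shifting by the smallest exponent avoids underflow and leaves the
    # normalised weights unchanged.
    weights = [math.exp(-(m - base) / (2 * sigma**2)) for m in mean_sq]
    total = sum(weights)
    x = sum(w * p[0] for w, p in zip(weights, points)) / total
    y = sum(w * p[1] for w, p in zip(weights, points)) / total
    return x, y


true_pos = (2.1, 3.2)  # target position from the worked example in the text
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 5.0), (5.0, 5.0), (2.0, 8.0), (8.0, 3.0)]
readings = [(ax, ay, math.hypot(true_pos[0] - ax, true_pos[1] - ay)) for ax, ay in anchors]
print(locate(readings))
```

With σ in the stated range and distances of order one, this particular weight form concentrates almost all weight on the nearest group; a different normalisation of the distances would spread the weight more evenly.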
The present invention is described in detail below with reference to an example and the accompanying drawings.
Examples
The method is implemented in PyCharm. Let the mapped position of the target vehicle's actual position in the two-dimensional road network model be (2.1, 3.2). There are four sensor clusters, and the initial state of the sensor clusters is {0000}. The state transition probability of the sensor clusters is assumed to be approximately 1.
Fig. 2 is a diagram of sensor cluster tracking. The black circle is the target vehicle; each square is a sensor cluster, which is dormant when the square is black and working when the square is white. It can be observed that, as the target vehicle moves, the sensor clusters continuously use the optimal action sequence to transition to the optimal state, thereby achieving preliminary tracking.
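The behaviour described for Fig. 2 can be mimicked with a short sketch that follows a vehicle along a path and switches clusters on and off. To keep it brief, the optimal state is computed directly from geometry rather than read from a learned Q table, and the cluster centres, working radius and path are hypothetical values chosen for illustration.

```python
import math

# Hypothetical cluster centres along a road and a shared working radius.
CENTERS = [(1.0, 1.0), (3.0, 1.0), (5.0, 1.0), (7.0, 1.0)]
RADIUS = 1.2


def optimal_state(target) -> int:
    """State in which exactly the clusters covering the target are working."""
    bits = 0
    for cx, cy in CENTERS:
        in_range = math.hypot(target[0] - cx, target[1] - cy) <= RADIUS
        bits = (bits << 1) | int(in_range)
    return bits


def track(path, state: int = 0):
    """For each position, find the action that reaches the optimal state."""
    log = []
    for pos in path:
        goal = optimal_state(pos)
        action = state ^ goal   # sub-value 1 for every cluster that must flip
        state ^= action         # apply the action (assumed XOR transition)
        log.append((pos, format(action, "04b"), format(state, "04b")))
    return log


path = [(1.0, 1.0), (2.0, 1.0), (4.0, 1.0), (7.0, 1.0)]
for pos, action, state in track(path):
    print(pos, action, state)
# (1.0, 1.0) 1000 1000
# (2.0, 1.0) 0100 1100
# (4.0, 1.0) 1010 0110
# (7.0, 1.0) 0111 0001
```

Each printed row gives the vehicle position, the action taken and the resulting state, so the working clusters can be seen to move along with the vehicle.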
Fig. 3 compares the RSSI-based Gaussian weight positioning algorithm with a conventional three-point positioning algorithm; each algorithm is simulated 100 times to obtain target vehicle coordinate points. It can be observed that the results of the RSSI-based Gaussian weight positioning algorithm are significantly better than those of the conventional positioning algorithm.
Claims (2)
1. A vehicle position tracking method based on a Markov decision process model is characterized by comprising the following steps:
step 1, establishing a two-dimensional road network model;
step 2, defining the state, action and reward of the sensor cluster, establishing a Markov decision process model, and obtaining an optimal action sequence of the sensor cluster by utilizing reinforcement learning to realize preliminary tracking; the specific process is as follows:
S21, the cluster value of each sensor cluster is 0 or 1; a cluster value of 0 means the sensor cluster is dormant, and a cluster value of 1 means it is working; the binary number formed by the cluster values of all sensor clusters is the state of the sensor clusters; the action sub-value of each sensor cluster is likewise 0 or 1; an action sub-value of 1 changes the state of the corresponding sensor cluster, and an action sub-value of 0 leaves it unchanged; the binary number formed by all action sub-values is the action of the sensor clusters; the states and the transitions between them satisfy:
$$s_{t+1} = s_t \wedge a_t$$
where $s_t$ is the state value of the sensor cluster at the current moment, $s_{t+1}$ is the state value of the sensor cluster at the next moment, and $a_t$ is the action value taken at the current moment;
S22, defining a direct reward for the target vehicle at a given coordinate: when the target vehicle is inside the working range of a sensor cluster whose cluster value is 1, the direct reward is positive; when the target vehicle is inside the working range of a sensor cluster whose cluster value is 0, the direct reward is negative; when the target vehicle is outside the working range of a sensor cluster whose cluster value is 1, the direct reward is negative; when the target vehicle is outside the working range of a sensor cluster whose cluster value is 0, the direct reward is 0; establishing a Markov decision process model, and calculating each Q value with the Q-learning algorithm of reinforcement learning:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$
where $a'$ is any action element in the action set, $r$ is the direct reward, $\alpha$ is the learning rate, and $\gamma$ is the reward discount value; obtaining a final Q table through iterative calculation, and obtaining from the Q table the optimal action sequence of the sensor cluster for the current coordinate of the target vehicle;
step 3, accurately tracking the target vehicle by using a Gaussian weight positioning algorithm based on RSSI (received signal strength indicator), which specifically comprises the following steps:
S31, for each sensor in a sensor cluster that is in the working state, obtaining the distance $d_i$ to the target vehicle with the RSSI ranging formula, and sorting the resulting set of distances in ascending order; letting the total number of sensors in the sensor cluster be $3N$, the position coordinates of the sensor corresponding to each sorted distance be $(x_i, y_i)$, $i = 1, 2, \ldots, 3N$, and the specific position coordinates of the target vehicle be $(x, y)$, so that:
$$(x - x_i)^2 + (y - y_i)^2 = d_i^2, \quad i = 1, 2, \ldots, 3N$$
taking three equations at a time in sequence, $N$ times in total, and simplifying to obtain:
obtaining the specific position coordinates of the target vehicle for each group by the least squares method, giving $N$ coordinates in total; each coordinate corresponds to an average distance:
defining the weight of each coordinate point by a Gaussian function as follows:
where $\sigma$ is the degree of influence of the coordinate point, with a value range of [0.1, 0.2];
the final coordinates of the target vehicle are:
2. the Markov decision process model-based vehicle position tracking method of claim 1, wherein the specific process of step 1 is as follows:
projecting an actual road map into a Cartesian rectangular coordinate system on a two-dimensional plane; roads are divided into three main types: a single road parallel to the X axis or the Y axis, represented by its head and tail coordinates; two mutually perpendicular roads, parallel to the X axis and the Y axis respectively, represented by their head and tail coordinates and their intersection coordinates; and a single road parallel to neither the X axis nor the Y axis, represented by its head and tail coordinates and by the coordinates of the two intersection points of the extension lines that pass through the head and tail points parallel to the X axis and the Y axis; all other, more complex roads can be decomposed into combinations of these three road types.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910458141.0A CN110267193B (en) | 2019-05-29 | 2019-05-29 | Vehicle position tracking method based on Markov decision process model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110267193A (en) | 2019-09-20 |
CN110267193B (en) | 2021-02-12 |
Family
ID=67915828
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910458141.0A Active CN110267193B (en) | 2019-05-29 | 2019-05-29 | Vehicle position tracking method based on Markov decision process model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110267193B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111752274B (en) * | 2020-06-17 | 2022-06-24 | 杭州电子科技大学 | Laser AGV path tracking control method based on reinforcement learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102883429A (en) * | 2012-08-30 | 2013-01-16 | 北京航空航天大学 | Method and device for tracking move object in sensor network based on directional antenna |
CN103152819A (en) * | 2013-01-29 | 2013-06-12 | 浙江大学 | Dim target tracking method based on underwater wireless sensor network |
CN105788263A (en) * | 2016-04-27 | 2016-07-20 | 大连理工大学 | Method for predicating road jam through mobile phone information |
CN109005512A (en) * | 2018-06-26 | 2018-12-14 | 西北工业大学 | A kind of position predicting method towards specified time interval |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101944234B (en) * | 2010-07-23 | 2012-07-25 | 中国科学院研究生院 | Multi-object tracking method and device driven by characteristic trace |
US9226110B2 (en) * | 2012-03-31 | 2015-12-29 | Groupon, Inc. | Method and system for determining location of mobile device |
CN102685886A (en) * | 2012-04-16 | 2012-09-19 | 浙江大学城市学院 | Indoor positioning method applied to mobile sensing network |
CN103853908B (en) * | 2012-12-04 | 2017-11-14 | 中国科学院沈阳自动化研究所 | A kind of maneuvering target tracking method of adaptive interaction formula multi-model |
Non-Patent Citations (2)
Title |
---|
Likelihood-Based Data Association for Extended Object Tracking Using Sampling Methods; Karl Granstrom et al.; IEEE Transactions on Intelligent Vehicles; 2017-12-29; Vol. 3, No. 1; full text * |
Research on Target Tracking Algorithms in Sensor Networks; Liu Xu; China Master's Theses Full-text Database; 2012-06-15; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN110267193A (en) | 2019-09-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108256577B (en) | Obstacle clustering method based on multi-line laser radar | |
CN107239076B (en) | AGV laser SLAM method based on virtual scanning and distance measurement matching | |
CN109166140B (en) | Vehicle motion track estimation method and system based on multi-line laser radar | |
CN108759833A (en) | A kind of intelligent vehicle localization method based on priori map | |
CN107967486B (en) | Method for recognizing behaviors of surrounding vehicles | |
US11199850B2 (en) | Estimation device, control method, program and storage medium | |
CN109941274B (en) | Parking method and system based on radar ranging identification shore bridge, server and medium | |
CN106873599A (en) | Unmanned bicycle paths planning method based on ant group algorithm and polar coordinate transform | |
CN108036794A (en) | A kind of high accuracy map generation system and generation method | |
CN110488234A (en) | Outer ginseng scaling method, device, equipment and the medium of vehicle-mounted millimeter wave radar | |
CN106289296A (en) | A kind of method and apparatus of road guide | |
CN104023394A (en) | WSN positioning method based on self-adaptation inertia weight | |
CN110285817B (en) | Complex road network map matching method based on self-adaptive D-S evidence theory | |
CN105704652A (en) | Method for building and optimizing fingerprint database in WLAN/Bluetooth positioning processes | |
CN104869639A (en) | Indoor positioning method and device | |
CN110515055A (en) | The method positioned using radius chess game optimization laser radar | |
CN107132504A (en) | Location tracking device, method and electronic equipment based on particle filter | |
EP3699642A1 (en) | Vehicle positioning method and apparatus | |
CN104507097A (en) | Semi-supervised training method based on WiFi (wireless fidelity) position fingerprints | |
CN105120479A (en) | Signal strength difference correction method of Wi-Fi signals between terminals | |
CN109583312A (en) | Lane detection method, apparatus, equipment and storage medium | |
CN110267193B (en) | Vehicle position tracking method based on Markov decision process model | |
CN111325187B (en) | Lane position identification method and device | |
CN102981160B (en) | Method and device for ascertaining aerial target track | |
CN108871365A (en) | Method for estimating state and system under a kind of constraint of course |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||