CN112098993A - Multi-target tracking data association method and system - Google Patents

Publication number
CN112098993A
Authority
CN
China
Prior art keywords
target
state
value
association
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010971580.4A
Other languages
Chinese (zh)
Inventor
王超
曲承志
李斌
贲驰
张艳
陈金涛
张鑫
苏东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China North Industries Corp
Original Assignee
China North Industries Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China North Industries Corp
Priority to CN202010971580.4A
Publication of CN112098993A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/66 Radar-tracking systems; Analogous systems
    • G01S13/72 Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar
    • G01S13/723 Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar by using numerical data
    • G01S13/726 Multiple target tracking
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/02 Systems using reflection of radio waves, e.g. primary radar systems; Analogous systems
    • G01S13/50 Systems of measurement based on relative movement of target
    • G01S13/58 Velocity or trajectory determination systems; Sense-of-movement determination systems
    • G01S13/60 Velocity or trajectory determination systems; Sense-of-movement determination systems wherein the transmitter and receiver are mounted on the moving object, e.g. for determining ground speed, drift angle, ground track
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The invention discloses a multi-target tracking data association method and system. Based on the characteristics of multi-target tracking data association, the method treats track association over an initial time period in which the target measurement points are known as a reinforcement learning training process: random clutter is generated around each one-step known measurement point, and the clutter together with the known measurement point is taken as the set of radar measurement points. Candidate measurement points are screened from these measurement points by a tracking gate; data association is then performed on all candidate measurement points using motion matching and reinforcement learning, according to the target's motion characteristics, the matching degree, and the position distribution of the candidates; the association result is checked against the one-step known measurement point, and the experience matrix of the reinforcement learning data association model is trained. For track points of targets that have entered the clutter area, data association is performed with the trained model combined with motion matching, and the experience matrix is optimized continuously according to the association results until track association is complete. The method addresses the low correct association rate and high computational complexity of existing methods, improving the former and reducing the latter.

Description

Multi-target tracking data association method and system
Technical Field
The invention relates to the technical field of multi-target tracking, in particular to a multi-target tracking data association method and system. The method is suitable for multi-target tracking data association in a multi-clutter environment.
Background
The basic concept of multi-target tracking was first proposed by Wax in 1955. In 1964, Sittler studied multi-target tracking theory and the data association problem in depth and made pioneering progress; however, maneuvering target tracking theory did not attract real attention until the early 1970s. During that period, multi-target tracking technology achieved a breakthrough, marked by the combination of data association with Kalman filtering pioneered by Bar-Shalom and Singer. Nevertheless, data association for target tracking in a dense clutter environment remains a difficult problem in the field: besides real measurements, the signals captured by a radar include spurious measurements caused by clutter, which makes accurate association of targets difficult.
In research on multi-target tracking data association in cluttered environments, the nearest neighbor (NN) data association method is the simplest solution, but its correct association rate in clutter is low. The joint probabilistic data association (JPDA) method calculates association probabilities by combining all targets and measurements into hypothesized joint events, and can handle the multi-target measurement association problem in a clutter environment well.
Disclosure of Invention
The invention provides a multi-target tracking data association method and system that overcome defects of the prior art such as a low correct association rate, high computational complexity, and combinatorial explosion when calculating association probabilities, thereby improving the correct association rate and reducing the computational complexity.
In order to achieve the above object, the present invention provides a multi-target tracking data association method, which comprises the following steps:
S1, constructing a reinforcement learning data association model that predicts the target position at the current moment from the target's state at the previous moment and its motion attributes;
S2, simulating random clutter points around the known measurement point of the target at the current moment, and obtaining the in-gate candidate measurement points and their position distribution according to the set tracking gate;
S3, selecting a weight from the experience matrix of the association model according to the distribution of the candidate measurement points, and obtaining the association probability of each candidate measurement point according to the influence of the weight on the state matching degree between the candidate measurement points and the target, and the motion matching degree between the candidate measurement points and the target;
S4, obtaining a one-step estimate of the actual state of the target at the current moment from the one-step known measurement point of the target at the current moment, and performing plot-to-track association;
S5, obtaining a one-step estimate of the simulated state of the target at the current moment from the association probabilities and the candidate measurement points, and training the experience matrix with the Euclidean distance between the simulated-state and actual-state one-step estimates as the loss;
S6, repeating steps S2-S5 until all known measurement points in the initial time period have been associated and used for training, yielding a trained model;
S7, taking the data points collected by the radar after the target enters the clutter area as measurement points, obtaining the in-gate candidate measurement points and their distribution according to the set tracking gate, obtaining a one-step state estimate of the target by combining the trained model with motion matching, and performing plot-to-track association on that estimate; then calculating the one-step state prediction of the target for the next moment, optimizing the experience matrix of the trained model with the Mahalanobis distance between the one-step observation prediction of the target's one-step state prediction and the one-step observation prediction of its one-step state estimate as the loss, and repeating association and optimization until track association is complete.
In order to achieve the above object, the present invention further provides a multi-target tracking data association system, which includes a processor and a memory, wherein the memory stores a multi-target tracking data association program, and the processor executes the steps of the method when running the multi-target tracking data association program.
The multi-target tracking data association method and system provided by the invention use a data association algorithm based on a reinforcement learning model and motion matching. Based on the characteristics of multi-target tracking data association, track association over the initial time period in which the target measurements are known is treated as the reinforcement learning training process, and track association after the targets enter the clutter area is treated as the reinforcement learning association process. Because association events between every target and every measurement need not be enumerated, the algorithm reduces computational complexity in dense clutter, maintains a high calculation speed, and avoids combinatorial explosion. The method calculates the association probability between target and measurement by combining reinforcement learning with motion matching, taking into account the motion and state characteristics of the target and the distribution of the in-gate measurements, which effectively improves the association accuracy of multi-target tracking data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the structures shown in the drawings without creative efforts.
FIG. 1 is a flowchart of a radar multi-target tracking data association method based on reinforcement learning and motion matching according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the true tracks and clutter areas of two targets when clutter is low according to the present invention;
FIG. 3 is a schematic diagram of the comparison simulation of the true track and the estimated track of two targets when clutter is low according to the present invention;
FIG. 4 is a schematic diagram of the true tracks and clutter areas of two targets when clutter is high in the present invention;
FIG. 5 is a schematic diagram of the comparison simulation of the real track and the estimated track of two targets when clutter is high in the present invention;
FIG. 6 is a schematic diagram of the real tracks and clutter areas of two targets when clutter is dense according to the present invention;
FIG. 7 is a schematic diagram of the comparison simulation of the real track and the estimated track of two targets when the clutter is dense.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that all the directional indicators (such as up, down, left, right, front, and rear … …) in the embodiment of the present invention are only used to explain the relative position relationship between the components, the movement situation, etc. in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indicator is changed accordingly.
In addition, the descriptions related to "first", "second", etc. in the present invention are only for descriptive purposes and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "connected," "secured," and the like are to be construed broadly, and for example, "secured" may be a fixed connection, a removable connection, or an integral part; the connection can be mechanical connection, electrical connection, physical connection or wireless communication connection; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, but it must be based on the realization of those skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination of technical solutions should not be considered to exist, and is not within the protection scope of the present invention.
Example one
As shown in fig. 1-7, an embodiment of the present invention provides a multi-target tracking data association method, where the scheme is not only applicable to data association in a single-target tracking process, but also applicable to data association in two or more target tracking processes, as shown in fig. 1, specifically including:
step S1, constructing a reinforcement learning data association model for predicting the target position at the current moment by combining the last moment state and the motion attribute of the target;
in the training process, the radar's real measurement data for the target over a determined time period (containing no clutter measurements) and the real track point data of the target over that period are known; in the association process, the radar's measurement data for the target after the determined time period (containing both clutter points and real data) are known. The constructed model uses reinforcement learning to learn the association characteristics among the real data, so that it can distinguish clutter points from real data in the clutter area and estimate the target track there;
step S2, simulating random clutter points around the known measurement point of the target at the current moment, and obtaining the in-gate candidate measurement points and their position distribution according to the set tracking gate;
the in-gate candidate measurement points and their distribution are determined from the set tracking gate and the Mahalanobis distance between each random clutter point and the one-step predicted point of target t, which the reinforcement learning data association model computes from the previous moment k-1; limiting the range of clutter points considered in this way preserves the calculation speed;
step S3, selecting a weight from the experience matrix of the association model according to the distribution of the candidate measurement points, and obtaining the association probability of each candidate measurement point according to the influence of the weight on the state matching degree between the candidate measurement points and the target, and the motion matching degree between the candidate measurement points and the target;
the position distribution of the in-gate candidate measurement points is related to the motion attributes of the target at the previous moment (position, velocity direction and magnitude, etc.). According to the motion-matching parameters, the optimal action is selected from the experience matrix to match an action weight, and the association probability is calculated; it represents the probability that the target moves, at the next moment, to the track point position corresponding to the candidate point. A weight is matched and an association probability is calculated for each in-gate candidate measurement point.
Step S4, obtaining a one-step estimate of the actual state of the target at the current moment from the one-step known measurement point of the target at the current moment, and performing plot-to-track association;
for example: at time k, Kalman filtering is applied to the one-step known measurement $Z_t(k|k)$ to obtain the one-step estimate of the actual state
Figure BDA0002684258990000061
Plot-to-track association is then performed on the one-step estimate of the target's actual state at time k
Figure BDA0002684258990000062
At time k+1, Kalman filtering is applied to the one-step known measurement $Z_t(k+1|k+1)$ to obtain the one-step estimate of the actual state
Figure BDA0002684258990000063
Plot-to-track association is then performed on the one-step estimate of the target's actual state at time k+1
Figure BDA0002684258990000064
This process is repeated, performing plot-to-track association on each one-step state estimate obtained in the training process;
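The Kalman filtering referred to in step S4 can be sketched as follows. This is a minimal illustration only, not the patented implementation: it assumes a constant-velocity state [position, velocity], a scalar position measurement (observation matrix H = [1, 0]), and illustrative values for the time step `dt`, process noise `q` and observation noise `r`, none of which are specified in this section.

```python
def kalman_step(x, P, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update cycle for the state x = [position, velocity].

    P is the 2x2 state covariance (list of lists) and z is a scalar position
    measurement. dt, q and r are assumed values chosen for illustration.
    """
    # Predict with F = [[1, dt], [0, 1]] and process noise Q = q * I (assumed).
    xp = [x[0] + dt * x[1], x[1]]
    a, b, c, d = P[0][0], P[0][1], P[1][0], P[1][1]
    Pp = [[a + dt * (b + c) + dt * dt * d + q, b + dt * d],
          [c + dt * d, d + q]]
    # Update with H = [1, 0]: the innovation covariance S is a scalar.
    S = Pp[0][0] + r
    K = [Pp[0][0] / S, Pp[1][0] / S]          # Kalman gain
    nu = z - xp[0]                            # innovation
    x_new = [xp[0] + K[0] * nu, xp[1] + K[1] * nu]
    # Covariance update P = (I - K H) Pp.
    P_new = [[(1 - K[0]) * Pp[0][0], (1 - K[0]) * Pp[0][1]],
             [Pp[1][0] - K[1] * Pp[0][0], Pp[1][1] - K[1] * Pp[0][1]]]
    return x_new, P_new
```

Calling `kalman_step` once per known measurement in the initial time period would yield the sequence of actual-state one-step estimates on which association is performed.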
step S5, obtaining a one-step estimate of the simulated state of the target at the current moment from the association probabilities and the candidate measurement points, and training the experience matrix with the Euclidean distance between the simulated-state and actual-state one-step estimates as the loss;
calculating the Kalman gain $K_t(k)$ and the one-step state covariance estimate $P_t(k|k)$ of target t at the current time k, and from these calculating the one-step estimate $X_t(k|k)$ of the simulated state of target t at time k; the one-step estimate of the target's simulated state is obtained from all in-gate candidate measurement points at the current moment, combined with their association probabilities; this estimate is then passed to the optimization module of the reinforcement learning data association model to obtain the experience matrix;
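One way to combine all in-gate candidates with their association probabilities, as step S5 describes, is a probability-weighted innovation in the style of probabilistic data association. The sketch below is an assumption about the form of that combination, since this section does not give the expression; it reuses the scalar-measurement, two-element state convention (H = [1, 0]) and omits the covariance estimate for brevity. The function name `weighted_update` is hypothetical.

```python
def weighted_update(xp, Pp, z_pred, S, candidates, betas):
    """Probability-weighted state update for x = [position, velocity].

    xp, Pp : one-step predicted state and 2x2 covariance
    z_pred : one-step predicted (scalar) measurement
    S      : scalar innovation covariance
    betas  : association probability of each in-gate candidate
             (assumed to sum to at most 1)
    """
    K = [Pp[0][0] / S, Pp[1][0] / S]  # Kalman gain for H = [1, 0]
    # Combined innovation: probability-weighted sum of candidate residuals.
    nu = sum(b * (z - z_pred) for z, b in zip(candidates, betas))
    return [xp[0] + K[0] * nu, xp[1] + K[1] * nu]
```

With a single candidate of probability 1 this reduces to the ordinary Kalman state update; with symmetric candidates of equal probability the residuals cancel and the prediction is returned unchanged.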
step S6, training the experience matrix with the Euclidean distance between the actual-state one-step estimate obtained in step S4 and the simulated-state one-step estimate obtained in step S5 as the loss, and repeating steps S2-S5 until all known measurement points in the initial time period have been associated and used for training, yielding the trained model;
for example: the experience matrix is trained with the Euclidean distance between the one-step estimate of the target's actual state at time k
Figure BDA0002684258990000065
and the one-step estimate of the simulated state $X_t(k|k)$ as the loss, completing the k-th round of training. Steps S2-S6 are then repeated to obtain the one-step estimate $X_t(k+1|k+1)$ of the target's simulated state at time k+1, and the Euclidean distance between this estimate and the one-step estimate of the target's actual state at time k+1
Figure BDA0002684258990000066
is used as the loss to train the experience matrix from the previous round again, completing the (k+1)-th round. This cycle continues until the track point at the last moment of the known initial time period has been associated and used for training as a known measurement point, yielding the trained model;
step S7, taking the data points collected by the radar after the target enters the clutter area as measurement points, obtaining the in-gate candidate measurement points and their distribution according to the set tracking gate, obtaining a one-step state estimate of the target by combining the trained model with motion matching, and performing plot-to-track association on that estimate; then calculating the Mahalanobis distance between the one-step observation prediction of the target's one-step state prediction and the one-step observation prediction of its one-step state estimate, optimizing the experience matrix of the trained model with this Mahalanobis distance as the loss, and repeating association and optimization until track association is complete. Motion matching here refers to the motion matching degree used in the association probability calculation.
That is, step S2 is executed again, with the measurement points collected after the target enters the clutter area taking the place of the simulated clutter points, to obtain the in-gate candidate measurement points and their distribution; steps S3 and S5 are executed to obtain the one-step state estimate of the target at the current moment; this estimate is passed to the experience matrix optimization module of the trained model to obtain the experience matrix, and plot-to-track association is performed on it to obtain the one-step state prediction of the target for the next moment. The experience matrix of the trained model is optimized with the Mahalanobis distance between the one-step observation prediction of the target's one-step state prediction and the one-step observation prediction of its one-step state estimate as the loss, and the association and optimization processes are repeated until track association is complete.
Firstly, determining a multi-target tracking data association initial condition, regarding track association of an initial time period known by target measurement as a reinforcement learning training process according to the multi-target tracking data association characteristic, and regarding track association of a subsequent target entering a clutter area as a reinforcement learning association process; in the training process, random clutter is generated near a known target measurement point, data association is carried out by combining motion matching with reinforcement learning, and an association result is checked according to a known target measurement value so as to train a reinforcement learning experience matrix; and in the association process, performing data association by combining motion matching according to the trained reinforcement learning experience matrix, and continuously optimizing the experience matrix according to the association result until track association is completed. The invention utilizes the mode of combining motion matching with reinforcement learning to carry out data association, and can obtain accurate association results on the basis of ensuring the calculation speed.
Specifically: after a target enters the clutter area, the measurement points of target t at the current moment are acquired; the Mahalanobis distance between each measurement point and the one-step predicted point that the trained data association model computes for target t is calculated, and the in-gate candidate measurement points and their distribution are determined using the set tracking gate; according to the distribution of the candidate measurement points within the gate of target t, the optimal action is selected from the experience matrix (Q-table) of the reinforcement learning data association model at time k for all candidate measurement points of target t, action matching is performed, and the association probabilities are calculated; from the one-step state prediction of target t
Figure BDA0002684258990000081
and the one-step state covariance prediction
Figure BDA0002684258990000082
the Kalman gain $K_t(k)$ and the one-step state covariance estimate $P_t(k|k)$ are calculated, and from these the one-step state estimate $X_t(k|k)$ of target t; the next quantities calculated are the one-step state prediction of target t
Figure BDA0002684258990000083
its one-step observation prediction
Figure BDA0002684258990000084
and the one-step observation prediction of the one-step state estimate $X_t(k|k)$
Figure BDA0002684258990000085
The Mahalanobis distance between
Figure BDA0002684258990000086
and
Figure BDA0002684258990000087
is taken as the cost $f_t(k)$, from which the reinforcement learning reward factor $r_t(k)$ is calculated; the Q-table is optimized according to this reward factor, and the process repeats until track association is complete.
According to the characteristics of multi-target tracking data association, track association over the initial time period in which the target measurements are known is treated as the reinforcement learning training process, that is, the training that takes place before the trained model is obtained; track association after the targets enter the clutter area is treated as the reinforcement learning association process, that is, the optimization that takes place after the trained model is obtained. The target measurement here refers to the data that remain after clutter is removed from the actual measurement data obtained by the radar sensor; the track of the target over a given time period can be calculated from these data.
Preferably, the step S1 of constructing the reinforcement learning data association model includes:
step S11: determining initial conditions of multi-target tracking data association;
determining the known target measurements of the initial time period $Z_t(k|k)$, $k = 1, \dots, K_{train}$, and the clutter-area measurements $Z(k)$; determining the state transition matrix $F_t(k)$, observation matrix $H_t(k)$, process noise covariance matrix $Q_t(k)$ and observation noise covariance matrix $R_t(k)$ of target t at time k; and calculating the one-step state prediction of target t at time k
Figure BDA0002684258990000088
the one-step observation prediction
Figure BDA0002684258990000089
the one-step state covariance prediction
Figure BDA00026842589900000810
and the innovation covariance matrix $S_t(k)$. The one-step state prediction of target t at time k
Figure BDA00026842589900000811
is the prediction, made from time k-1, of the state (position, velocity, acceleration, etc.) of target t at time k; the one-step observation prediction
Figure BDA00026842589900000812
is the one-step prediction of the position at which the radar will observe target t at time k; the one-step state covariance prediction
Figure BDA00026842589900000813
is the one-step prediction of the state covariance of target t at time k; and $S_t(k)$ is the innovation covariance matrix of target t at time k;
the one-step state prediction of target t at time k
Figure BDA0002684258990000091
the one-step observation prediction
Figure BDA0002684258990000092
the one-step state covariance prediction
Figure BDA0002684258990000093
and the innovation covariance matrix $S_t(k)$ are calculated, respectively, as:
Figure BDA0002684258990000094
Figure BDA0002684258990000095
Figure BDA0002684258990000096
Figure BDA0002684258990000097
where $F_t(k)$ is the state transition matrix of target t at time k, $H_t(k)$ is the observation matrix of target t at time k, $Q_t(k)$ is the process noise covariance matrix of target t at time k, $R_t(k)$ is the observation noise covariance matrix of target t at time k, $X_t(k-1|k-1)$ is the one-step estimate of the simulated state of target t at time k-1, and $P_t(k-1|k-1)$ is the one-step estimate of the state covariance of target t at time k-1.
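The four expressions themselves appear only as images in this text. Given the quantities defined above, the standard Kalman prediction step would read as follows; this is a reconstruction under that assumption, and the hatted symbols are notation chosen here rather than taken from the source:

```latex
\begin{aligned}
\hat{X}_t(k \mid k-1) &= F_t(k)\, X_t(k-1 \mid k-1) \\
\hat{Z}_t(k \mid k-1) &= H_t(k)\, \hat{X}_t(k \mid k-1) \\
\hat{P}_t(k \mid k-1) &= F_t(k)\, P_t(k-1 \mid k-1)\, F_t(k)^{\mathsf{T}} + Q_t(k) \\
S_t(k) &= H_t(k)\, \hat{P}_t(k \mid k-1)\, H_t(k)^{\mathsf{T}} + R_t(k)
\end{aligned}
```

The first two lines give the one-step state and observation predictions, the third the one-step state covariance prediction, and the fourth the innovation covariance matrix.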
Step S12: setting the reinforcement learning discount factor λ and learning rate γ, and establishing the experience matrix (Q-table) of the reinforcement learning model, in which the state s is the measurement distribution and the action a is the weight selection of the experience matrix; the Q-table is initialized to a zero matrix.
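A zero-initialized Q-table of the kind described in step S12 can be sketched as below. The update rule shown is the standard tabular Q-learning rule, used here as an assumption because this section does not state the patent's own update; the state labels, the number of actions, and the function names are hypothetical. The arguments `discount` and `learning_rate` play the roles of λ and γ in the text.

```python
from collections import defaultdict


def make_q_table(n_actions):
    """Q-table whose rows start at zero and are created lazily per state."""
    return defaultdict(lambda: [0.0] * n_actions)


def best_action(q, state):
    """Index of the highest-valued action (weight choice) for a state."""
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


def q_update(q, state, action, reward, next_state, discount, learning_rate):
    """Standard tabular Q-learning update (assumed, not from the source)."""
    target = reward + discount * max(q[next_state])
    q[state][action] += learning_rate * (target - q[state][action])
```

Here a state would be some hashable encoding of the in-gate measurement distribution, and each action index would correspond to one candidate weight.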
Preferably, the step of simulating random clutter points in S2 comprises:
in the training process, generating clutter $Z_{flase,i}(k)$ around the one-step known measurement value $Z_t(k|k)$, $k = 1, ..., K_{train}$, of target t at time k:
$Z_{flase,i}(k) = Z_t(k|k) + l - 2l \cdot \mathrm{rand}_{0,1}$ (1);
where l is the equivalent square side length of the elliptical wave gate, i = 1, 2, ..., num_flase with num_flase the number of clutter points, $\mathrm{rand}_{0,1}$ is a random number between 0 and 1, $K_{train}$ is the upper limit of the training process, and t = 1, 2, ... indexes the targets.
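Equation (1) can be sketched in a few lines. This is an illustrative sketch only: it assumes a fresh random number is drawn independently for every coordinate of every clutter point, which the text does not state, and the function name and numeric values are hypothetical.

```python
import random


def generate_clutter(z_known, l, num_flase, rng):
    """Generate num_flase clutter points around a known measurement, following equation (1)."""
    clutter = []
    for _ in range(num_flase):
        # Each coordinate is Z + l - 2*l*rand, i.e. uniform in (Z - l, Z + l]
        point = [z + l - 2.0 * l * rng.random() for z in z_known]
        clutter.append(point)
    return clutter


rng = random.Random(0)           # seeded only to make the sketch repeatable
z_known = [100.0, 50.0]          # hypothetical known measurement (x, y) in m
points = generate_clutter(z_known, l=5.0, num_flase=10, rng=rng)
print(len(points))
```

Because `rand` lies between 0 and 1, every generated point stays within a distance l of the known measurement along each axis.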
Preferably, the step of obtaining the intra-gate candidate measurement points and their position distribution in S2 includes:
determining the intra-gate candidate measurement points and their distribution from the wave gate and the Mahalanobis distance between each measurement point and the target's one-step predicted point from the previous time; the measurements are fed into the wave gate detection module to obtain the intra-gate candidate measurements and their position distribution;
determining the measurement values Z(k):
Figure BDA0002684258990000101
for each measurement Z(k) at time k, the wave gate detection module calculates the Mahalanobis distance $g_t(k)$ to the one-step predicted measurement value of target t,
Figure BDA0002684258990000102
and, if this distance does not exceed the wave gate threshold, the measurement point lies inside the wave gate and is retained as a candidate measurement of target t, denoted
Figure BDA0002684258990000103
The Mahalanobis distance $g_t(k)$ is:
Figure BDA0002684258990000104
If $g_t(k)$ satisfies the following condition, the measurement is retained as a candidate measurement of target t:
$g_t(k) \le \zeta$ (4);
where ζ is the gate threshold.
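The gate test of condition (4) can be illustrated as follows. This sketch assumes two-dimensional measurements and uses the quadratic form $v^T S^{-1} v$ of the residual v as the gate statistic; whether the patent's equation (3) takes a square root of this form cannot be recovered from the text, so that choice, the function names and the numbers are assumptions.

```python
def gate_statistic_2d(z, z_pred, S):
    """Quadratic form v^T S^-1 v for a 2-D residual v = z - z_pred."""
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    # Apply the closed-form inverse of the 2x2 matrix S to the residual
    ix = (d * dx - b * dy) / det
    iy = (-c * dx + a * dy) / det
    return dx * ix + dy * iy


def gate(measurements, z_pred, S, zeta):
    """Keep the measurements whose gate statistic does not exceed zeta (condition (4))."""
    return [z for z in measurements if gate_statistic_2d(z, z_pred, S) <= zeta]


z_pred = [0.0, 0.0]                       # hypothetical one-step predicted measurement
S = [[4.0, 0.0], [0.0, 1.0]]              # hypothetical innovation covariance
measurements = [[2.0, 0.0], [0.0, 3.0], [4.0, 0.0], [3.0, 2.0]]
candidates = gate(measurements, z_pred, S, zeta=4.0)
print(candidates)
```

A measurement exactly on the gate boundary is kept, in line with the non-strict inequality in condition (4).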
To account for the event that none of the candidate measurements in the wave gate of target t is the true target measurement, an echo is randomly generated around the one-step predicted measurement value of target t
Figure BDA0002684258990000105
and added to the candidate measurements
Figure BDA0002684258990000106
Figure BDA0002684258990000107
Figure BDA0002684258990000108
The association probability of this echo is regarded as the probability that all intra-gate measurements are clutter. With the one-step predicted measurement value of target t
Figure BDA0002684258990000109
as the origin, a two-dimensional rectangular coordinate system is established to divide the wave gate into 4 areas; taking ζ/2 as the boundary, the wave gate is further divided into a central area and an edge area, giving 8 areas in total. The distribution of the candidate measurements in the wave gate is then calculated from the positional relation between each candidate measurement of the target and the one-step predicted measurement value.
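The eight-area partition can be sketched as a quadrant index combined with a centre/edge flag. The region numbering, the treatment of points lying exactly on an axis, and the encoding of the distribution as a count per region are all assumptions made for illustration; the text only fixes the four quadrants and the ζ/2 boundary.

```python
def region_index(z, z_pred, g, zeta):
    """Return a region label 0..7 for one candidate measurement.

    Labels 0-3 are the four quadrants of the central area (g <= zeta/2);
    labels 4-7 are the same quadrants in the edge area.
    """
    dx = z[0] - z_pred[0]
    dy = z[1] - z_pred[1]
    if dx >= 0 and dy >= 0:
        quadrant = 0
    elif dx < 0 and dy >= 0:
        quadrant = 1
    elif dx < 0 and dy < 0:
        quadrant = 2
    else:
        quadrant = 3
    ring = 0 if g <= zeta / 2.0 else 1   # 0 = central area, 1 = edge area
    return quadrant + 4 * ring


def gate_distribution(candidates, z_pred, gate_values, zeta):
    """Count how many candidate measurements fall into each of the 8 regions."""
    counts = [0] * 8
    for z, g in zip(candidates, gate_values):
        counts[region_index(z, z_pred, g, zeta)] += 1
    return counts


# Hypothetical candidates with their already computed gate distances
counts = gate_distribution(
    [[1.0, 1.0], [-2.0, 0.5], [0.5, -3.0]], [0.0, 0.0], [0.8, 3.1, 3.9], zeta=4.0)
print(counts)
```

A count vector of this kind is one possible way to turn the position distribution into a discrete state for the Q-table.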
Preferably, step S3 of calculating the association probability between each candidate measurement point of target t and the target,
Figure BDA00026842589900001010
includes:
Step S31: selecting a weight in the experience matrix of the association model according to the distribution of the candidate measurement points. Specifically:
according to the distribution of the candidate measurements in each target's wave gate, the best action best_action, i.e. the action with the largest Q value, is selected in the corresponding state of the Q-table:
best_action=max[Q(current s,all actions)] (6);
where current s is the current state, each state corresponding to a position distribution of the measurements in the wave gate; all actions denotes the set of all actions, each action representing a weight selection:
Figure BDA0002684258990000111
wherein Δ is a scaling factor;
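Equation (6) is read here as choosing the index of the largest Q value in the row of the current state. The sketch below assumes a Q-table stored as a list of rows, one row per state; the table size of 8 states by 3 actions is hypothetical, and the mapping from an action to a weight through Δ is not reproduced because its formula is not recoverable from the text.

```python
def best_action(q_table, state):
    """Return the index of the action with the largest Q value in the given state."""
    row = q_table[state]
    # max() returns the first maximal index, so ties go to the lowest action index
    return max(range(len(row)), key=lambda a: row[a])


# Hypothetical table: 8 states (gate distributions) x 3 actions (weight choices),
# initialised to zero as in step S12
q_table = [[0.0] * 3 for _ in range(8)]
q_table[2][1] = 0.5

print(best_action(q_table, 2), best_action(q_table, 0))
```

With an all-zero table every action ties, so some tie-breaking rule is needed; picking the lowest index is only one possible choice.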
Step S32: calculating the Euclidean distance between each candidate measurement point and the target's one-step observation prediction to obtain the state matching degree between each candidate measurement point and the target. The state matching degree reflects how well a candidate measurement point, taken as real radar measurement data, matches the position of the target's actual track point. Specifically:
calculating the Euclidean distance between all candidate measurement values of target t at time k and the one-step observation prediction
Figure BDA0002684258990000112
denoted
Figure BDA0002684258990000113
Figure BDA0002684258990000114
Figure BDA0002684258990000115
Step S33: obtaining the fluctuation influence of the weight on the state matching degree through an algebraic operation between the selected weight and the Euclidean distance; here multiplication is used, specifically:
Figure BDA0002684258990000116
Step S34: calculating the Euclidean distance between each candidate measurement point and the target's three-step observation prediction to obtain the motion matching degree between each candidate measurement point and the target. The observation prediction from time k-3 to time k is used; because this prediction follows the motion characteristics of the target, the distance between a measurement and the three-step observation prediction reflects whether, and by how much, the measured position departs from the target's motion characteristics. Specifically:
with weight selection finished, motion matching is performed on all candidate measurement values of target t at time k, and the association probability of each candidate measurement point of target t is calculated
Figure BDA0002684258990000117
calculating the three-step state prediction of the track point $X_t(k-3|k-3)$ of target t at time k-3:
Figure BDA0002684258990000118
Figure BDA0002684258990000121
calculating the Euclidean distance between all candidate measurement values of target t at time k and the three-step observation prediction
Figure BDA0002684258990000122
denoted
Figure BDA0002684258990000123
Figure BDA0002684258990000124
Figure BDA0002684258990000125
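The idea of motion matching in step S34 can be sketched as follows: propagate the state estimated at time k-3 forward three steps, project it into measurement space, and take the Euclidean distance to each candidate. The sketch assumes a time-invariant state transition matrix applied three times with no process noise term; the constant-velocity model, the function names and the numbers are hypothetical.

```python
import math


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def three_step_observation_prediction(F, H, x_km3):
    """Propagate the state from k-3 to k and project it into measurement space."""
    x = x_km3
    for _ in range(3):            # k-3 -> k-2 -> k-1 -> k
        x = matvec(F, x)
    return matvec(H, x)


def motion_distances(candidates, z_pred3):
    """Euclidean distance of every candidate to the three-step observation prediction."""
    return [math.dist(z, z_pred3) for z in candidates]


# Hypothetical 2-D constant-velocity model: state = [x, vx, y, vy], step = 1 s
F = [[1.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 0.0, 1.0]]
H = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
x_km3 = [0.0, 1.0, 0.0, 2.0]

z_pred3 = three_step_observation_prediction(F, H, x_km3)
dists = motion_distances([[3.0, 6.0], [6.0, 10.0]], z_pred3)
print(z_pred3, dists)
```

A candidate that continues the target's motion sits close to the three-step prediction, while clutter that merely happens to be near the one-step prediction usually does not.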
Step S35: obtaining the association probability of each candidate measurement point from the motion matching degree and the fluctuated state matching degree. Specifically:
the association probability between all candidate measurement values of target t at time k and target t,
Figure BDA0002684258990000126
is:
Figure BDA0002684258990000127
Figure BDA0002684258990000128
Preferably, the step of obtaining the one-step state estimate in S4 includes:
calculating the Kalman gain $K_t(k)$ and the one-step state covariance estimate $P_t(k|k)$ of target t at time k:
Figure BDA0002684258990000129
Figure BDA00026842589900001210
calculating the one-step state estimate $X_t(k|k)$ of target t at time k:
Figure BDA00026842589900001211
Preferably, the step of training the experience matrix in S5 includes:
the one-step state estimate $X_t(k|k)$ and the one-step state covariance estimate $P_t(k|k)$ are used for point track-track association and are fed into the reinforcement learning experience matrix optimization module to train the Q-table. In the training process, the one-step state estimate $X_t(k|k)$ is fed into the reinforcement learning experience matrix optimization module to train the Q-table but is not used for point track-track association. Instead, Kalman filtering is performed on the one-step known measurement values $Z_t(k|k)$, $k = 1, ..., K_{train}$, and the resulting one-step state estimate of the training process,
Figure BDA0002684258990000131
is used for point track-track association:
Figure BDA0002684258990000132
The Euclidean distance between $X_t(k|k)$ and
Figure BDA0002684258990000133
is regarded as the cost
Figure BDA0002684258990000134
Figure BDA0002684258990000135
Calculating the reinforcement learning reward factor
Figure BDA0002684258990000136
Figure BDA0002684258990000137
Training the Q-table according to the reinforcement learning reward factor:
Figure BDA0002684258990000138
where $Q_t(s_i, a_j)$ is the Q value corresponding to selecting action $a_j$ when the measurement of target t at time k is in state $s_i$, λ is the learning factor, γ is the discount factor, and
Figure BDA0002684258990000139
is the maximum Q value when the measurement of target t at time k is in state $s_i$.
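For orientation, the textbook tabular Q-learning update with learning factor λ and discount factor γ is sketched below. The patent's own update, equation (14), is not recoverable from the text, so this standard form is a stand-in rather than the patented formula; in particular, which state supplies the maximum Q value is an assumption, as are the function name and the numbers.

```python
def q_update(q_table, s, a, reward, s_next, lam, gamma):
    """Textbook Q-learning update: Q(s,a) += lam * (r + gamma * max Q(s_next, .) - Q(s,a))."""
    best_next = max(q_table[s_next])
    q_table[s][a] += lam * (reward + gamma * best_next - q_table[s][a])
    return q_table[s][a]


# Hypothetical table: 8 states x 3 actions, zero-initialised as in step S12
q_table = [[0.0] * 3 for _ in range(8)]
q_table[1][2] = 1.0

new_q = q_update(q_table, s=0, a=1, reward=1.0, s_next=1, lam=0.5, gamma=0.9)
print(new_q)
```

A larger reward factor pulls the chosen weight's Q value up, so a weight that produced a low cost becomes more likely to be selected the next time the same gate distribution occurs.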
Preferably, the step of optimizing the experience matrix of the training model in S7 includes:
for the association process, from the one-step state prediction of target t
Figure BDA00026842589900001310
calculating its one-step observation prediction
Figure BDA00026842589900001311
and, from the one-step state estimate $X_t(k|k)$ of target t, its one-step observation prediction
Figure BDA00026842589900001312
Figure BDA00026842589900001313
Figure BDA00026842589900001314
Figure BDA00026842589900001315
Figure BDA00026842589900001316
The Mahalanobis distance between
Figure BDA00026842589900001317
and
Figure BDA00026842589900001318
is regarded as the cost $f_t(k)$:
$S_t(k+1) = H_t(k+1) \cdot P_t(k|k) \cdot H_t(k+1)^T$ (19);
Figure BDA00026842589900001319
Calculating the reinforcement learning reward factor $r_t(k)$:
Figure BDA0002684258990000141
Optimizing the Q-table according to the reinforcement learning reward factor:
Figure BDA0002684258990000142
The parameters in equation (22) have the same meaning as those in equation (14) above.
This completes the radar multi-target tracking data association algorithm based on reinforcement learning and motion matching.
The effect of the present invention is further verified and explained by the following simulation experiment.
(I) Simulation experiment data
The accuracy of the method is verified by a simulation experiment; the experimental data parameters are as follows:
Figure BDA0002684258990000143
(II) Simulation results and analysis
The simulation results of the invention are shown in Figs. 2 to 7. Figs. 2 and 4 show the real tracks of the two targets and the clutter area when clutter is sparse and dense, respectively; Figs. 3 and 5 compare the real tracks and the estimated tracks of the two targets when clutter is sparse and dense, respectively. The horizontal and vertical coordinates are the positions in the X and Y directions, in m. As can be seen from Figs. 2 and 4, because the measurements of the two targets cross and the clutter is tightly clustered, a conventional data association algorithm can hardly associate and estimate the target tracks accurately. As can be seen from Figs. 3 and 5, the method of the invention accurately separates the target measurements from the clutter, which ensures high association accuracy.
As can be seen from Fig. 6, as the number of clutter points in the clutter region increases further, the clutter around the target measurement point trace becomes very dense. In this case a conventional nearest neighbor algorithm gives a large estimation error, and a conventional joint probability data association algorithm suffers combinatorial explosion, so that association fails. By combining experience matching with reinforcement learning, the method calculates the association probability efficiently, and the simulation result in Fig. 7 verifies the effectiveness of the processing method.
In conclusion, the simulation experiment verifies the correctness, the effectiveness and the reliability of the method.
Embodiment two
Based on the first embodiment, the invention provides a multi-target tracking data association system, which comprises a memory and a processor, wherein the memory stores a multi-target tracking data association program, and the processor executes the steps of any embodiment of the method when running the multi-target tracking data association program.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all modifications and equivalents of the present invention, which are made by the contents of the present specification and the accompanying drawings, or directly/indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A multi-target tracking data association method is characterized by comprising the following steps:
S1, constructing a reinforcement learning data association model for predicting the target position at the current time by combining the state at the previous time and the motion attributes of the target;
S2, simulating random clutter points around the known measurement point of the target at the current time, and obtaining the intra-gate candidate measurement points and their position distribution according to the set wave gate;
S3, selecting a weight in the experience matrix of the association model according to the distribution of the candidate measurement points, and obtaining the association probability of each candidate measurement point from the fluctuation influence of the weight on the state matching degree between the candidate measurement points and the target and from the motion matching degree between the candidate measurement points and the target;
S4, obtaining a one-step estimate of the actual state of the target at the current time from the one-step known measurement point of the target at the current time, and performing point track-track association;
S5, obtaining a one-step estimate of the simulated state of the target at the current time from the association probabilities and the candidate measurement points, and training the experience matrix with the Euclidean distance between the simulated-state one-step estimate and the actual-state one-step estimate as the loss;
S6, repeating steps S2 to S5 until all known measurement points in the initial time period have been associated and trained, thereby obtaining a training model;
S7, taking the data points collected by the radar after the target enters the clutter area as measurement points, obtaining the intra-gate candidate measurement points and their distribution according to the set wave gate, obtaining a one-step state estimate of the target by combining the training model and motion matching, and performing point track-track association with the one-step state estimate; calculating the one-step state prediction of the target at the next time, optimizing the experience matrix of the training model with, as the loss, the Mahalanobis distance between the one-step observation prediction of the target's one-step state prediction and the one-step observation prediction of the target's one-step state estimate, and repeating association and optimization until track association is completed.
2. The multi-target tracking data association method according to claim 1, wherein the step S1 specifically includes:
step S11: determining initial conditions of multi-target tracking data association;
determining the known target measurement values $Z_t(k|k)$, $k = 1, ..., K_{train}$, of the starting time period and the clutter region measurement values $Z(k)$; determining the state transition matrix $F_t(k)$, the observation matrix $H_t(k)$, the process noise covariance matrix $Q_t(k)$ and the observation noise covariance matrix $R_t(k)$ of target t at time k; and calculating the one-step state prediction $\hat{X}_t(k|k-1)$, the one-step observation prediction $\hat{Z}_t(k|k-1)$, the one-step state covariance prediction $P_t(k|k-1)$ and the innovation covariance matrix $S_t(k)$ of target t at time k, whose calculation expressions are:
$\hat{X}_t(k|k-1) = F_t(k) \cdot X_t(k-1|k-1)$
$\hat{Z}_t(k|k-1) = H_t(k) \cdot \hat{X}_t(k|k-1)$
$P_t(k|k-1) = F_t(k) \cdot P_t(k-1|k-1) \cdot F_t(k)^T + Q_t(k)$
$S_t(k) = H_t(k) \cdot P_t(k|k-1) \cdot H_t(k)^T + R_t(k)$
where $F_t(k)$ is the state transition matrix of target t at time k, $H_t(k)$ is the observation matrix of target t at time k, $Q_t(k)$ is the process noise covariance matrix of target t at time k, $R_t(k)$ is the observation noise covariance matrix of target t at time k, $X_t(k-1|k-1)$ is the one-step estimate of the simulated state of target t at time k-1, and $P_t(k-1|k-1)$ is the one-step estimate of the state covariance of target t at time k-1;
step S12: setting the reinforcement learning learning factor λ and discount factor γ, and establishing the experience matrix (Q-table) of the reinforcement learning model, where the state s is the measurement distribution and the action a is the weight selection of the experience matrix; the Q-table is initialized to a zero matrix.
3. The multi-target tracking data association method as claimed in claim 2, wherein the step of simulating random clutter points in step S2 comprises:
generating clutter $Z_{flase,i}(k)$ around the known measurement value $Z_t(k|k)$, $k = 1, ..., K_{train}$, of target t at time k:
$Z_{flase,i}(k) = Z_t(k|k) + l - 2l \cdot \mathrm{rand}_{0,1}$ (1);
where l is the equivalent square side length of the elliptical wave gate, i = 1, 2, ..., num_flase with num_flase the number of clutter points, $\mathrm{rand}_{0,1}$ is a random number between 0 and 1, $K_{train}$ is the upper limit of the training process, and t = 1, 2, ... indexes the targets.
4. The multi-target tracking data association method as claimed in claim 3, wherein the step of obtaining the intra-gate candidate measurement points and the position distribution in step S2 comprises:
determining candidate measuring points and distribution in the gate according to the Mahalanobis distance between each measuring point and the target one-step predicting point at the previous moment and the wave gate;
determination of the measurement values Z (k):
Figure FDA0002684258980000031
calculating, for each measurement value Z(k) at time k, the Mahalanobis distance to the one-step predicted measurement value of target t
Figure FDA0002684258980000032
this Mahalanobis distance $g_t(k)$ being:
Figure FDA0002684258980000033
If $g_t(k)$ satisfies the following condition, the measurement is retained as a candidate measurement of target t and denoted
Figure FDA0002684258980000034
$g_t(k) \le \zeta$ (4);
where ζ is the gate threshold;
randomly generating an echo around the one-step predicted measurement value of target t
Figure FDA0002684258980000035
and adding it to the candidate measurements
Figure FDA0002684258980000036
Figure FDA0002684258980000037
Figure FDA0002684258980000038
the association probability of this echo being regarded as the probability that all intra-gate measurements are clutter; with the one-step predicted measurement value of target t
Figure FDA0002684258980000039
as the origin, establishing a two-dimensional rectangular coordinate system to divide the wave gate into 4 areas, further dividing the wave gate into a central area and an edge area with ζ/2 as the boundary, so that the wave gate is divided into 8 areas in total, and calculating the distribution of the candidate measurements in the wave gate from the positional relation between each candidate measurement point of the target and the one-step predicted measurement value.
5. The multi-target tracking data association method according to claim 4, wherein the step S3 specifically includes:
s31, selecting a weight in an experience matrix of an association model according to the distribution of the candidate measuring points;
step S32, calculating Euclidean distance between each candidate measuring point and one-step observation predicted value of the target to obtain the state matching degree of each candidate measuring point and the target;
step S33, obtaining the fluctuation influence of the weight value on the state matching degree by algebraic operation of the selected weight value and the Euclidean distance;
step S34, calculating Euclidean distance between each candidate measuring point and the target three-step observation predicted value to obtain the motion matching degree of each candidate measuring point and the target;
and step S35, obtaining the association probability of each candidate measuring point according to the motion matching degree and the state matching degree after fluctuation.
6. The multi-target tracking data association method according to claim 5, wherein the step S31 is specifically:
selecting the best action best_action, i.e. the action with the largest Q value, in the state of the Q-table corresponding to the distribution of the candidate measurement points in each target's wave gate:
best_action=max[Q(current s,all actions)] (6);
where current s is the current state, each state corresponding to a position distribution of the measurements in the wave gate; all actions denotes the set of all actions, each action representing a weight selection:
Figure FDA0002684258980000041
where Δ is a scaling factor; weight selection is finished according to this mapping relation;
step S32 specifically includes: calculating the Euclidean distance between all candidate measurement values of target t at time k and the one-step observation prediction
Figure FDA0002684258980000042
denoted
Figure FDA0002684258980000043
Figure FDA0002684258980000044
Figure FDA0002684258980000045
In step S33, multiplication is selected as the algebraic operation, specifically:
Figure FDA0002684258980000046
step S34 specifically includes:
calculating the three-step state prediction of the track point $X_t(k-3|k-3)$ of target t at time k-3:
Figure FDA0002684258980000047
Figure FDA0002684258980000051
calculating the Euclidean distance between all candidate measurement values of target t at time k and the three-step observation prediction
Figure FDA0002684258980000052
denoted
Figure FDA0002684258980000053
Figure FDA0002684258980000054
Figure FDA0002684258980000055
The step S35 specifically includes:
obtaining the association probability of all candidate measurement values of the target t at the moment k and the target t
Figure FDA0002684258980000056
Figure FDA0002684258980000057
Figure FDA0002684258980000058
7. The multi-target tracking data association method according to claim 6, wherein the step S4 specifically includes:
calculating the Kalman gain $K_t(k)$ and the one-step state covariance estimate $P_t(k|k)$ of target t at time k:
Figure FDA0002684258980000059
Figure FDA00026842589800000510
calculating the one-step state estimate $X_t(k|k)$ of target t at time k:
Figure FDA00026842589800000511
8. The multi-target tracking data association method according to claim 7, wherein the step S5 specifically includes:
the one-step state estimate $X_t(k|k)$ and the one-step state covariance estimate $P_t(k|k)$ are used for point track-track association and are fed into the reinforcement learning experience matrix optimization module to train the Q-table:
Kalman filtering is performed on the one-step known measurement values $Z_t(k|k)$, $k = 1, ..., K_{train}$, and the resulting one-step state estimate of the training process,
Figure FDA0002684258980000061
is used for point track-track association:
Figure FDA0002684258980000062
The Euclidean distance between $X_t(k|k)$ and
Figure FDA0002684258980000063
is regarded as the cost
Figure FDA0002684258980000064
Figure FDA0002684258980000065
Calculating the reinforcement learning reward factor
Figure FDA0002684258980000066
Figure FDA0002684258980000067
Training the Q-table according to the reinforcement learning reward factor:
Figure FDA0002684258980000068
where $Q_t(s_i, a_j)$ is the Q value corresponding to selecting action $a_j$ when the measurement of target t at time k is in state $s_i$, λ is the learning factor, γ is the discount factor, and
Figure FDA00026842589800000618
is the maximum Q value when the measurement of target t at time k is in state $s_i$.
9. The multi-target tracking data association method according to claim 8, wherein the step S7 specifically includes:
for the association process, from the state prediction
Figure FDA0002684258980000069
calculating its one-step observation prediction
Figure FDA00026842589800000610
and, from the estimate $X_t(k|k)$, its one-step observation prediction
Figure FDA00026842589800000611
Figure FDA00026842589800000612
Figure FDA00026842589800000613
Figure FDA00026842589800000614
Figure FDA00026842589800000615
The Mahalanobis distance between
Figure FDA00026842589800000616
and
Figure FDA00026842589800000617
is regarded as the cost $f_t(k)$:
$S_t(k+1) = H_t(k+1) \cdot P_t(k|k) \cdot H_t(k+1)^T$ (19);
Figure FDA0002684258980000071
Calculating the reinforcement learning reward factor $r_t(k)$:
Figure FDA0002684258980000072
Optimizing the Q-table according to the reinforcement learning reward factor:
Figure FDA0002684258980000073
10. a multi-target tracking data association system, comprising a processor and a memory, wherein the memory stores a multi-target tracking data association program, and the processor executes the steps of the multi-target tracking data association method according to any one of claims 1 to 9 when running the multi-target tracking data association program.
CN202010971580.4A 2020-09-16 2020-09-16 Multi-target tracking data association method and system Pending CN112098993A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010971580.4A CN112098993A (en) 2020-09-16 2020-09-16 Multi-target tracking data association method and system

Publications (1)

Publication Number Publication Date
CN112098993A true CN112098993A (en) 2020-12-18

Family

ID=73759875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010971580.4A Pending CN112098993A (en) 2020-09-16 2020-09-16 Multi-target tracking data association method and system

Country Status (1)

Country Link
CN (1) CN112098993A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109901153A (en) * 2019-03-29 2019-06-18 西安电子科技大学 Targetpath optimization method based on information entropy weight and nearest-neighbor data correlation
CN110824467A (en) * 2019-11-15 2020-02-21 中山大学 Multi-target tracking data association method and system
CN111007495A (en) * 2019-12-10 2020-04-14 西安电子科技大学 Target track optimization method based on double-fusion maximum entropy fuzzy clustering JPDA

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095401A (en) * 2021-04-12 2021-07-09 吉林大学 Multi-sensor multi-target association tracking method
CN113191427A (en) * 2021-04-29 2021-07-30 无锡物联网创新中心有限公司 Multi-target vehicle tracking method and related device
CN113191427B (en) * 2021-04-29 2022-08-23 无锡物联网创新中心有限公司 Multi-target vehicle tracking method and related device
CN113340308A (en) * 2021-05-31 2021-09-03 西安电子科技大学 Correction logic law flight path starting method based on self-reporting point
CN113701758A (en) * 2021-08-23 2021-11-26 中国北方工业有限公司 Multi-target data association method and system based on biological search algorithm
CN113985406A (en) * 2021-12-24 2022-01-28 中船(浙江)海洋科技有限公司 Target track splicing method for marine radar
CN113985406B (en) * 2021-12-24 2022-05-10 中船(浙江)海洋科技有限公司 Target track splicing method for marine radar
CN116628448A (en) * 2023-05-26 2023-08-22 兰州理工大学 Sensor management method based on deep reinforcement learning in extended target
CN116628448B (en) * 2023-05-26 2023-11-28 兰州理工大学 Sensor management method based on deep reinforcement learning in extended target

Similar Documents

Publication Publication Date Title
CN110824467B (en) Multi-target tracking data association method and system
CN112098993A (en) Multi-target tracking data association method and system
CN109901153B (en) Target track optimization method based on information entropy weight and nearest neighbor data association
CN105137418B (en) Multiple target tracking and data interconnection method based on complete adjacent fuzzy clustering
CN113091738B (en) Mobile robot map construction method based on visual inertial navigation fusion and related equipment
CN106980114A (en) Target Track of Passive Radar method
CN106872955A (en) Radar Multi Target tracking optimization method based on Joint Probabilistic Data Association algorithm
CN104035083B (en) A kind of radar target tracking method based on measurement conversion
CN105510896B (en) A kind of weighted nearest neighbor numeric field data correlating method of centralization multi-radar data processing
CN104155650A (en) Object tracking method based on trace point quality evaluation by entropy weight method
CN104199022B (en) Target modal estimation based near-space hypersonic velocity target tracking method
CN103729859A (en) Probability nearest neighbor domain multi-target tracking method based on fuzzy clustering
CN107526070A (en) The multipath fusion multiple target tracking algorithm of sky-wave OTH radar
CN106932771A (en) A kind of radar simulation targetpath tracking and system
CN104156984A (en) PHD (Probability Hypothesis Density) method for multi-target tracking in uneven clutter environment
CN107656265A (en) Particle filter fusion method for tracking short flight path before multi frame detection
CN106054151A (en) Radar multi-target tracking optimization method based on data correlation method
CN103985120A (en) Remote sensing image multi-objective association method
CN111830501B (en) HRRP history feature assisted signal fuzzy data association method and system
CN110501671A (en) A kind of method for tracking target and device based on measurement distribution
CN109509207B (en) Method for seamless tracking of point target and extended target
CN105424043A (en) Motion state estimation method based on maneuver judgment
Xia et al. Extended object tracking with automotive radar using learned structural measurement model
CN108712725B (en) SLAM method based on rodent model and WIFI fingerprint
CN104182652B (en) Typical motor formation target tracking modeling method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination