CN116385493A - Multi-moving-object detection and track prediction method in field environment

Multi-moving-object detection and track prediction method in field environment

Info

Publication number
CN116385493A
CN116385493A
Authority
CN
China
Prior art keywords
track
point cloud
dimensional
tracking
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310381363.3A
Other languages
Chinese (zh)
Inventor
王煜
黄时星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Bote Intelligent Technology Anhui Co ltd
Original Assignee
Zhongke Bote Intelligent Technology Anhui Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Bote Intelligent Technology Anhui Co ltd filed Critical Zhongke Bote Intelligent Technology Anhui Co ltd
Priority to CN202310381363.3A priority Critical patent/CN116385493A/en
Publication of CN116385493A publication Critical patent/CN116385493A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Radar Systems Or Details Thereof (AREA)
  • Aiming, Guidance, Guns With A Light Source, Armor, Camouflage, And Targets (AREA)

Abstract

The invention discloses a multi-moving-object detection and track prediction method in a field environment, comprising the following steps: point cloud image fusion three-dimensional target detection, three-dimensional multi-target tracking, three-dimensional track prediction, and static map establishment with a dynamic SLAM algorithm. The beneficial effects of the invention are that the method can provide the future tracks of moving objects and establish a static map on a self-made field environment dataset; it verifies the high accuracy of the point cloud image fusion three-dimensional target detection algorithm, the stable tracking effect of the multi-target tracking algorithm, the precision of the three-dimensional track prediction, and the high preservation rate of static points and high removal rate of dynamic points in the established static map; and it can run in real time on the experimental equipment.

Description

Multi-moving-object detection and track prediction method in field environment
Technical Field
The invention relates to the technical field of track prediction methods, in particular to a multi-moving-object detection and track prediction method in a field environment.
Background
With the rapid development of robot technology, robots are widely applied in industrial manufacturing, military operations and civilian life. In an actual field environment, an operator can remotely control a ground mobile robot to complete tasks such as material transport. However, this approach demands considerable ground mobile robot control experience from the operator. If the ground mobile robot can automatically follow a target object, the control experience required of the operator is greatly reduced.
The Chinese patent application publication CN115601397A discloses a ship track tracking and prediction method based on a monocular camera, comprising: step 1, constructing a target ship image dataset; step 2, labeling and training the dataset; step 3, evaluating and analyzing the model training result; step 4, a target ship tracking experiment based on the YOLOv5 and DeepSORT algorithms; step 5, target ship visual positioning and data preprocessing; step 6, constructing an LSTM track prediction model; step 7, selecting and evaluating the time step of the prediction model; step 8, predicting and verifying the target ship track. That method uses the YOLOv5 algorithm to identify the target ship from images acquired by the camera; since there are few interfering objects on the sea surface, a ship can be identified from images alone. In a field environment, however, there are many interfering objects with varied surface colors and texture features, so image recognition alone can hardly meet the requirements of target identification and track prediction: the accuracy is poor and the target identification error rate is high.
Some papers have realized target detection and trajectory prediction methods for automatic driving in urban environments. For example, the article "Fast and Furious: real time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net", published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, discloses a deep learning neural network designed to complete the tasks of target detection, tracking and track prediction. However, that method is designed for automatic driving datasets in urban environments, does not consider rugged terrain, and completes the task using only three-dimensional laser radar point clouds, so it cannot solve the problem of identifying a target object among interfering objects, and the tracking effect of its multi-target tracking algorithm is unstable.
The Chinese invention patent application publication CN114972911A discloses an output data collection and processing method for an automatic driving perception algorithm model, an electronic device, a storage medium and a vehicle. It addresses how to effectively collect a supplementary training set for a preset scene so as to improve the performance of the perception algorithm model: the target perception results output by the perception algorithm model are ordered in time to obtain a plurality of perception data frames, whether each perception data frame matches the preset scene is judged, and the data acquired by the vehicle-mounted sensors in the time windows of the matching perception frames are used as the supplementary training set to train the perception algorithm model, which can improve its performance more effectively. However, in the process of tracking a target object, moving objects can occlude one another, and a large number of dynamic points may remain in the point cloud map. Simply using the information from a point cloud image fusion three-dimensional target detection module to remove dynamic points is insufficient, because the target detection algorithm may miss a moving object in some data frame, so the corresponding dynamic points cannot be removed. Moreover, the residual from a feature point to its corresponding plane then becomes too large, making the estimation of the point cloud frame pose unreasonable.
Therefore, the field environment target tracking and prediction systems in the prior art suffer from poor target identification accuracy, unstable multi-target tracking and a low removal rate of dynamic points.
Disclosure of Invention
In order to solve the above technical problems, the invention aims to provide a multi-moving-object detection and track prediction method in a field environment that offers good target recognition accuracy, a stable multi-target tracking effect and a high dynamic point removal rate.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
the method for detecting and predicting the multiple moving targets in the field environment comprises the following steps:
s1, detecting a three-dimensional target by fusing point cloud images: detecting and obtaining point cloud data by using a three-dimensional detector, wherein the point cloud data obtain point cloud target detection data through a point cloud target detection algorithm; obtaining image data by using an image sensor, wherein the image data obtains image target detection data through an image target detection algorithm; matching the point cloud target detection data with the image target detection data by using a Hungary matching algorithm, and outputting matching result data;
s2, three-dimensional multi-target tracking: creating a corresponding tracking example according to the matching result data of the point cloud image fusion three-dimensional target detection, constructing a three-dimensional Kalman filter to predict the position of the moving object at the next moment, and using a Hungary matching algorithm to complete matching of the target detection result and the predicted value at the next moment so as to complete multi-target tracking;
s3, three-dimensional track prediction: the method comprises the steps of combining a deep learning track prediction algorithm and a Kalman filtering algorithm to infer a target motion track, wherein the deep learning track prediction algorithm is used for inferring a future track of a target based on a historical track of the target in the past for a period of time, calculating an error of the future track by using the Kalman filtering algorithm as a reference, and completing prediction of a three-dimensional track by using the Kalman filtering algorithm when the error of the future track is larger than a preset first threshold value;
s4, establishing a static map by a dynamic SLAM algorithm: taking the last point cloud point of the point cloud data as a reference, and obtaining the pose of the three-dimensional detector corresponding to each point cloud point by using IMU integral interpolation to finish the de-distortion of the point cloud; and similarly, obtaining the initial pose of the point cloud data, removing dynamic points in the point cloud data according to the dynamic object position information obtained by a sensing algorithm, performing grid voxel downsampling, and then constructing an incremental map by using ikd-Tree to finish the establishment of a static map.
By such arrangement: the problem that targets cannot be stably detected and tracked because of the many interfering objects in the field environment is solved, the stability of real-time operation of the system is effectively guaranteed, and the multi-target tracking effect is stable. The system can complete three-dimensional track prediction at different stages and under different conditions, and the predicted results of the deep-learning-based track prediction algorithm can be evaluated to obtain the optimal predicted track, so the accuracy of target motion track prediction is high. The position information of the moving objects of the previous frame is used to remove the dynamic objects from the current three-dimensional laser radar point cloud frame, achieving a high dynamic point removal rate. Parallel computation of the CIA-SSD network process, the YOLOv5 process and the SLAM process is realized, ensuring the real-time running stability of the whole system.
Preferably, in the step S1, the method further includes the following steps:
The three-dimensional detector obtains point cloud target detection data through a CIA-SSD network, and the image sensor obtains image target detection data through a YOLOv5 network. The homogeneous coordinates of the target object center point in the image sensor coordinate system are calculated as P_C = T^C_L · P_L, wherein T^C_L denotes the homogeneous transformation matrix of the three-dimensional detector coordinate system with respect to the image sensor coordinate system and P_L denotes the homogeneous coordinates of the target object center point in the three-dimensional detector coordinate system. A two-dimensional bounding box of the moving object is obtained with the YOLOv5 network, and the center point coordinate p_img of the two-dimensional bounding box is obtained; the Euclidean distance between p_L and p_img is then found to construct a cost matrix, and the Hungarian matching algorithm is used to match the point cloud target detection data with the image target detection data.
By such arrangement: the matching of CIA-SSD point cloud target detection results with YOLOv5 image target detection results is achieved.
Preferably, in the step S2, the method further includes the following steps:
The current position P_pre of the tracking instance is predicted with the Kalman filtering algorithm, and a reasonable observation value P_L of the tracking instance is obtained through the Hungarian matching algorithm.
By such arrangement: real-time updating of the tracking instance is realized, enabling the three-dimensional multi-target tracking function.
Preferably, in the step S2, the method further includes the following steps:
Let p_pre be the projection of the tracking instance position prediction on the image, and p_img the moving object center point obtained by the YOLOv5 image target detection network; if there is a redundant p_img that is not matched, go to step S2.1; if both p_pre and p_L, and p_img and p_L, are matched, go to step S2.2; if no observation data exist, go to step S2.3; if the categories of p_img and p_pre do not match, discard all observation data at that moment;
S2.1, create a tracking instance for the target; the tracking instance is in the initial state;
S2.2, take P_L as the observation value of the corresponding tracking instance; the tracking instance enters the tracking state after obtaining 3 frames of observation data;
S2.3, use the predicted value P_pre as the current position of the tracking instance; after the current position has been obtained this way 5 consecutive times without observation data, delete the tracking instance, which enters the extinction state.
By such arrangement: the state of the tracking instance can serve as the basis for deciding the next operation on it, which helps improve the system's detection, tracking and prediction of targets.
Preferably, in the step S3, the method further includes the following steps:
The deep learning track prediction algorithm adopts a memory-augmented neural network: a historical track encoder network is used to extract the historical track feature h_i from the historical track, a future track encoder network is used to extract the future track feature f_i from the future track, the h_i and f_i features are concatenated and input into a decoder network, and the future track of the moving object is recovered.
By such arrangement: the accuracy of the memory-enhanced neural network trajectory prediction can be ensured.
Preferably, in the step S2, the method further includes the following steps:
The acceleration, velocity and position of the tracking instance are estimated using a three-dimensional Kalman filter.
By such arrangement: further monitoring and analysis of the state of the tracking instance is facilitated.
Preferably, in the step S3, the method further includes the following steps:
The three-dimensional Kalman filter is used to estimate the acceleration, velocity and position of the tracking instance and to calculate the track point for a future period; taking this Kalman-filter track point as the reference, the track point errors over that period are calculated for the several future tracks, the future track with the smallest error is selected as the optimal track, and when the error of the optimal track is larger than the preset first threshold, the future track is invalidated.
By such arrangement: the accuracy of future tracks can be further ensured, and the advantage of higher accuracy of target motion track prediction is achieved.
Preferably, in the step S4, the method further includes the following steps:
IMU pre-integration is used to obtain the pose change between two adjacent frames of three-dimensional laser radar point cloud; the pose change is used as a constraint to construct a factor graph, and after the factor graph is optimized, the corresponding state quantities, such as the IMU zero-bias value, are obtained, completing the initialization of the IMU.
By such arrangement: the accuracy of the point cloud de-distortion performed by IMU-integration-based pose interpolation can be guaranteed.
Preferably, in the step S4, the method further includes the following steps:
After the incremental map is constructed with ikd-Tree, several map points nearest to each feature point in the point cloud data are found in the ikd-Tree incremental map; whether these map points form a plane is judged, and if they do, the residual from the feature point to the plane where the map points lie is solved.
By such arrangement: by solving the residual error, the pose estimation of the current three-dimensional detector can be completed, and the positioning accuracy of the whole system is ensured.
Preferably, in the step S4, the method further includes the following steps:
The value S is calculated according to formula (7) from the coordinates of the i-th feature point of the point cloud frame, as the residual from the feature point to the nearest plane of the incremental map; if S is larger than a preset second threshold, the feature point is removed.
By such arrangement: whether the feature point is a dynamic point or a noise point can be judged, achieving a higher dynamic point removal rate while preserving static points as much as possible.
Compared with the prior art, the invention has the beneficial technical effects that:
1. and the point cloud target detection data obtained by the three-dimensional laser radar and the image target detection data obtained by the camera are subjected to fusion matching through a Hungary matching algorithm, and matching result data can be output. Meanwhile, the parallel computation of the CIA-SSD point cloud target detection process and the YOLOv5 image target detection process is realized, the problem that targets cannot be stably detected and tracked due to more field environment interferents is solved, the stability of real-time operation of a system can be effectively ensured, and the stability of multi-target tracking is improved while higher detection accuracy is achieved.
2. The deep learning-based trajectory prediction algorithm requires a historical trajectory of the moving object over a period of time to complete the inference. If the number of the historical track points of the moving object cannot meet the requirement, the deep learning neural network cannot extract effective features. However, when the actual system is running, the tracking instance is created early, and the number of the historical track points cannot meet the requirement. At the moment, three-dimensional track prediction is completed by using a Kalman filtering algorithm, so that the three-dimensional track prediction can be completed by the system under different conditions at different stages. In addition, the Kalman filtering algorithm predicts the moving object with small track error in a short time, and can evaluate the predicted result of the track prediction algorithm based on the deep learning to obtain the optimal predicted track, thereby achieving the advantage of high accuracy of target motion track prediction.
3. The method is characterized in that the method is not complete by simply using a point cloud image fusion three-dimensional target detection module to obtain information to remove dynamic points, and the corresponding dynamic points cannot be removed because the target detection algorithm can generate the condition that a certain frame of data moving object is missed. In order to solve the problem, the invention designs a dynamic object information collection strategy, which is used for acquiring the target detection result of the previous frame of data on one hand and acquiring the corresponding position estimation value from the tracking example which is not matched with the observed value of the previous frame but is not deleted on the other hand. Since the relative speed between the ground mobile robot and the moving object is not large in the course of the ground mobile robot tracking the target object. Therefore, the position change of the moving object between the adjacent data frames is not great, the position information of the moving object of the previous frame can be used for removing the dynamic object of the current three-dimensional laser radar point cloud frame, and the advantage of higher removal rate of the dynamic point is achieved. The parallel computation of the CIA-SSD network process, the YOLOv5 process and the SLAM process is realized, and the real-time running stability of the whole system is ensured.
4. The invention can provide future tracks of moving objects and establish a static map on a homemade field environment data set, and verifies the high accuracy of a point cloud image fusion three-dimensional target detection algorithm, the stable tracking effect of a multi-target tracking algorithm, the precision of three-dimensional track prediction, the high preservation rate of static points of the established static map and the high removal rate of dynamic points. Can run in real time on experimental equipment. The system has important significance for solving the problem of tracking the target object by the ground mobile robot.
Drawings
FIG. 1 is a schematic diagram of a system flow of a method for detecting multiple moving targets and predicting trajectories in a field environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a life cycle management process of a matching policy and tracking instance according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a trace-instance state transition flow in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the structure of a historical track encoder network, a future track encoder network, and a decoder network in an embodiment of the present invention;
FIG. 5 is a flow chart of a dynamic SLAM algorithm in an embodiment of the present invention.
Detailed Description
The present invention will be further described in detail with reference to the following examples, for the purpose of making the objects, technical solutions and advantages of the present invention more apparent, but the scope of the present invention is not limited to the following specific examples.
Referring to fig. 1, the method for detecting multiple moving targets and predicting trajectories in a field environment is used for detecting objects around a ground mobile robot, predicting relative movement trajectories and establishing a static map, and comprises the following steps:
s1, detecting a three-dimensional target by fusing point cloud images: detecting and obtaining point cloud data by using a three-dimensional detector, wherein the point cloud data is original data obtained by the three-dimensional detector, and obtaining point cloud target detection data by a point cloud target detection algorithm; obtaining image data by using an image sensor, wherein the image data obtains image target detection data through an image target detection algorithm; and matching the point cloud target detection data with the image target detection data by using a Hungary matching algorithm, and outputting matching result data. The three-dimensional detector is a three-dimensional laser radar, and the image sensor is a camera.
The three-dimensional detector obtains point cloud target detection data through a CIA-SSD network, and the image sensor obtains point cloud target detection data through a YOLO network v 5, obtaining image target detection data by a network, and calculating to obtain the coordinates of the center point of the moving object under the coordinate system of the image sensor by the formula (1):
Figure BDA0004172346070000071
calculating homogeneous coordinates of the target object in the image coordinate system through a formula (2):
Figure BDA0004172346070000072
let p L = (u, v) represents the projection of the moving object center point onto the image inferred from the CIA-SSD network, where (x C ,y C ,z C 1) represents homogeneous coordinates of a center point of a target object in an image sensor coordinate system,
Figure BDA0004172346070000073
representing homogeneous transformation matrix of three-dimensional detector coordinate system under image sensor coordinate system, P L The method is characterized in that the method comprises the steps of representing homogeneous coordinates of a target object center point under a three-dimensional detector coordinate system, (u, v, 1) representing homogeneous coordinates of the target object center point under an image coordinate system, and K representing an internal reference matrix of the monocular camera. Obtaining a two-dimensional bounding box of a moving object by using a YOLOv5 network, and then obtaining a center point coordinate p of the two-dimensional bounding box img Then find p L And p is as follows img And constructing a cost matrix by using the Euclidean distance between the CIA and the SSD, and matching the CIA-SSD point cloud target detection result with the YOLOv5 image target detection result by using a Hungary matching algorithm to match the point cloud target detection data with the image target detection data.
S2, three-dimensional multi-target tracking: a corresponding tracking instance is created according to the matching result data of the point cloud image fusion three-dimensional target detection, a three-dimensional Kalman filter is constructed to predict the position of the moving object at the next moment, and the Hungarian matching algorithm is used to match the target detection result at the next moment with the predicted value, completing the multi-target tracking.
Since the moving object moves over rugged terrain, three-dimensional kinematics must be used to model it. Taking the position, velocity and acceleration of the tracking instance as the state quantity, the following state equation (3) is obtained from the kinematic relations:
P_{t+1} = P_t + v_t·Δt + (1/2)·a_t·Δt², v_{t+1} = v_t + a_t·Δt, a_{t+1} = a_t ③
wherein (P_t, v_t, a_t) denote the position, velocity and acceleration of the tracking instance, Δt denotes the time interval, and the process noise Q obeys a Gaussian distribution.
Since only the position of the target object can be obtained from the point cloud image fusion three-dimensional target detection, the observation equation (4) is established as:
P_w = P_t + R ④
wherein P_w denotes the position of the tracking instance in the map coordinate system, and the observation noise R obeys a Gaussian distribution.
Since a moving object appearing in the detection range of the sensor usually moves at an essentially constant velocity while the target object is being tracked, the invention takes the average velocity v_avg of moving objects as the initial velocity of a newly found tracking instance. After the tracking instance has obtained 3 observations and accumulated 3 frames of historical track points, a new velocity estimate v_ls is obtained by a least-squares algorithm. The difference Δv between v_ls and the velocity estimate of the Kalman filtering algorithm is calculated; if the modulus of Δv is greater than a preset third threshold, v_ls is used to update the tracking instance state, and iterative computation continues with the Kalman filtering algorithm. The acceleration, velocity and position of the tracking instance are estimated with the three-dimensional Kalman filter.
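A minimal sketch of such a three-dimensional constant-acceleration Kalman filter follows, assuming numpy; the noise magnitudes q_var and r_var are illustrative placeholders, not values from the patent.

```python
import numpy as np

class Track3DKalman:
    """Constant-acceleration Kalman filter over state [P, v, a] in 3D (9-dim)."""

    def __init__(self, p0, v0, dt, q_var=1.0, r_var=0.1):
        self.x = np.zeros(9)
        self.x[:3], self.x[3:6] = p0, v0           # initial position / velocity
        self.P = np.eye(9)                          # state covariance
        I3 = np.eye(3)
        # State transition per (3): P' = P + v*dt + 0.5*a*dt^2, v' = v + a*dt
        self.F = np.block([[I3, dt * I3, 0.5 * dt**2 * I3],
                           [np.zeros((3, 3)), I3, dt * I3],
                           [np.zeros((3, 6)), I3]])
        self.H = np.hstack([I3, np.zeros((3, 6))])  # per (4): position observed
        self.Q = q_var * np.eye(9)                  # process noise (Gaussian)
        self.R = r_var * np.eye(3)                  # observation noise (Gaussian)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]                           # predicted position P_pre

    def update(self, z):
        """z: observed 3D position P_L of the matched detection."""
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(9) - K @ self.H) @ self.P
```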
The current position P_pre of a tracking instance is predicted with the Kalman filtering algorithm, and a reasonable observation value P_L of the tracking instance is obtained through the Hungarian matching algorithm, realizing real-time updating of the tracking instance so that the three-dimensional multi-target tracking function can be realized.
The algorithmic framework for the matching strategy and tracking instance life cycle management is shown in FIG. 2, where p_pre is the projection of the tracking instance position prediction on the image and p_img is the moving object center point obtained by the YOLOv5 image target detection network. If there is a redundant p_img matched to nothing, condition 1 is met and the method goes to step S2.1; if both p_pre and p_L, and p_img and p_L, are matched, condition 2 is met and the method goes to step S2.2; if no observation data exist, condition 3 is met and the method goes to step S2.3; if the categories of p_img and p_pre do not match, all observation data at that moment are discarded. The life cycle management strategy of the tracking instance is as follows (a sketch of this state machine follows the list below), and the state transitions of the tracking instance are shown in FIG. 3, where the states mainly comprise an initial state, a tracking state and an extinction state:
S2.1, create a tracking instance for the target; the tracking instance is in the initial state;
S2.2, take P_L as the observation value of the corresponding tracking instance; the tracking instance enters the tracking state after obtaining 3 frames of observation data;
S2.3, use the predicted value P_pre as the current position of the tracking instance; after the current position has been obtained this way 5 consecutive times without observation data, delete the tracking instance, which enters the extinction state.
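A minimal sketch of this life cycle management follows, reusing the Track3DKalman sketch above; the confirm/delete counts of 3 observations and 5 misses come from the text, while the class and method names are illustrative assumptions.

```python
from enum import Enum

class TrackState(Enum):
    INITIAL = 1
    TRACKING = 2
    EXTINCT = 3

class TrackInstance:
    """INITIAL -> TRACKING after 3 observations; EXTINCT (deleted)
    after 5 consecutive prediction-only updates with no observation."""

    CONFIRM_HITS = 3   # frames of observation data to enter the tracking state
    MAX_MISSES = 5     # consecutive misses before deletion

    def __init__(self, kalman):
        self.kf = kalman
        self.state = TrackState.INITIAL
        self.hits = 0
        self.misses = 0

    def on_observation(self, P_L):
        """Condition 2: matched observation P_L updates the filter."""
        self.kf.predict()
        self.kf.update(P_L)
        self.hits += 1
        self.misses = 0
        if self.hits >= self.CONFIRM_HITS:
            self.state = TrackState.TRACKING

    def on_miss(self):
        """Condition 3: no observation; use the predicted value P_pre."""
        P_pre = self.kf.predict()
        self.misses += 1
        if self.misses >= self.MAX_MISSES:
            self.state = TrackState.EXTINCT   # to be deleted by the manager
        return P_pre
```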
S3, three-dimensional track prediction: a deep learning track prediction algorithm and a Kalman filtering algorithm are combined to infer the target motion track; the deep learning track prediction algorithm infers the future track of the target based on its historical track over a past period, the error of the future track is calculated with the Kalman filtering algorithm as reference, and when the error of the future track is larger than a preset first threshold, the future track is invalid and the Kalman filtering algorithm is used to complete the three-dimensional track prediction.
The deep learning track prediction algorithm adopts a memory-augmented neural network: a historical track encoder network extracts the historical track feature h_i from the historical track, a future track encoder network extracts the future track feature f_i from the future track, and the h_i and f_i features are concatenated and input into a decoder network, which recovers the future track of the moving object. The network structures of the encoder and decoder networks are shown in FIG. 4; the encoder network is divided into a historical track encoder network and a future track encoder network, which have the same structure but learn different data.
The controller of the memory-augmented neural network is composed of a fully connected neural network, and its training process uses the encoder networks and the decoder network. First, part of the training data is selected and passed through the historical track encoder network and the future track encoder network to obtain the historical track feature memory and the future track feature memory. Formal training then begins: the historical track feature encoder produces a historical track feature h, the 5 historical track feature vectors most similar to h in the historical track feature memory are found by cosine distance, and the 5 corresponding future track feature vectors are retrieved. h is concatenated with each of the 5 future track feature vectors, and 5 future tracks are obtained through the decoder network.
The error rate of the predicted track is calculated by formula (5):
E_rate = 1 − (1/N) · Σ_{t=1}^{N} I(||P̂_t^F − P_t^F|| ≤ Th_t) ⑤
wherein I(·) is an indicator function that is 1 when the error between the predicted three-dimensional track point P̂_t^F and the true value P_t^F is within a threshold Th_t, and 0 otherwise. The threshold Th_t grows as the prediction time increases, and is set as Th_1s = 0.5 m, Th_2s = 1.0 m, Th_3s = 1.5 m, Th_4s = 2.0 m.
The maximum error rate E_rate_max among the 5 predicted tracks is selected to construct the loss function L_c of the memory-augmented neural network controller, as shown in formula (6):
L_c = E_rate_max · (1 − P(w)) + (1 − E_rate_max) · P(w) ⑥
wherein P(w) is the probability of writing to memory output by the controller. When P(w) > 0.5, the corresponding historical track feature and future track feature are written into the memory. While the controller is trained, the historical track feature memory and the future track feature memory keep growing for use in network inference. During training of the memory-augmented neural network, noise is randomly added to the historical track points as a data augmentation means, improving the generalization ability of the network and avoiding overfitting.
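A minimal sketch of the error rate (5) and controller loss (6) follows, using PyTorch; the four thresholds come from the text, while the assumption that each track holds 4 points sampled at 1 s steps is illustrative only.

```python
import torch

def error_rate(pred, truth, thresholds):
    """Fraction of predicted track points whose error exceeds Th_t, per (5).

    pred, truth: (T, 3) predicted / ground-truth track points.
    thresholds: (T,) per-step thresholds Th_t.
    """
    err = torch.linalg.norm(pred - truth, dim=1)
    return 1.0 - (err <= thresholds).float().mean()

def controller_loss(pred_tracks, truth, p_write):
    """L_c = E_rate_max * (1 - P(w)) + (1 - E_rate_max) * P(w), per (6).

    pred_tracks: (5, T, 3) candidate future tracks from the decoder.
    p_write: scalar write probability P(w) output by the controller;
    only p_write carries gradient here, the error rates act as constants.
    """
    th = torch.tensor([0.5, 1.0, 1.5, 2.0])   # Th_1s..Th_4s from the text
    rates = torch.stack([error_rate(p, truth, th) for p in pred_tracks])
    e_max = rates.max()
    return e_max * (1.0 - p_write) + (1.0 - e_max) * p_write
```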
During inference, the 5 corresponding future track feature memories are still selected and the decoder then outputs 5 future tracks; these 5 tracks must be evaluated and the optimal future track selected as the final result. The invention uses the three-dimensional Kalman filter to estimate the acceleration, velocity and position of the tracking instance and to calculate the track point 1 second in the future; taking this Kalman-filter track point as the reference, the 1-second track point errors of the 5 future tracks are calculated, the future track with the smallest error is selected as the optimal track, and when the error of the optimal track is larger than the preset first threshold, the method switches to completing the three-dimensional track prediction with the Kalman filtering algorithm.
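A minimal sketch of this evaluation step follows; it assumes the first point of each candidate track is the 1-second point, which is an illustrative layout rather than something fixed by the patent.

```python
import numpy as np

def select_future_track(candidates, kf_point_1s, first_threshold):
    """Evaluate candidate tracks against the Kalman-filter 1 s prediction.

    candidates: (5, T, 3) future tracks from the memory-augmented network.
    kf_point_1s: (3,) Kalman-filter track point 1 s in the future.
    Returns (best_track_or_None, fall_back_to_kalman).
    """
    errs = np.linalg.norm(candidates[:, 0, :] - kf_point_1s, axis=1)
    best = int(np.argmin(errs))
    if errs[best] > first_threshold:
        # even the optimal deep-learning track disagrees with the filter:
        # invalidate it and complete the prediction with the Kalman filter
        return None, True
    return candidates[best], False
```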
S4, establishing a static map with the dynamic SLAM algorithm: taking the last point cloud point of the point cloud data as reference, the pose of the three-dimensional detector corresponding to each point cloud point is obtained by IMU integral interpolation to complete the de-distortion of the point cloud; the initial pose of the point cloud data is obtained in the same way, the dynamic points in the point cloud data are removed according to the dynamic object position information obtained by the sensing algorithm, grid voxel downsampling is performed, and an incremental map is then constructed with ikd-Tree to complete the establishment of the static map. The flow chart of the dynamic SLAM algorithm is shown in FIG. 5. IMU pre-integration is used to obtain the pose change between two adjacent frames of three-dimensional laser radar point cloud; this pose change is also used as a constraint to construct a factor graph, and after the factor graph is optimized, the corresponding state quantities, such as the IMU zero-bias value, are obtained, completing the initialization of the IMU. The points in a three-dimensional laser radar point cloud are not all scanned at the same moment: while the three-dimensional laser radar scans the surrounding environment, the motion of the ground mobile robot means that the laser radar coordinate systems corresponding to point cloud points scanned at different moments are not necessarily at the same pose, which distorts the three-dimensional laser radar point cloud. The invention uses a pose interpolation method based on IMU integration to remove this motion distortion: taking the last point cloud point of the three-dimensional laser radar point cloud frame as reference, the laser radar pose corresponding to each point cloud point is obtained by IMU integral interpolation, thereby completing the de-distortion of the point cloud.
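A minimal sketch of this per-point de-distortion follows; it assumes a pose_at(t) callable produced by the IMU integration pipeline, and the frame layout is an illustrative assumption.

```python
import numpy as np

def undistort_frame(points, timestamps, pose_at, t_ref):
    """Re-express every scanned point in the laser radar pose of the last point.

    points: (N, 3) raw laser radar points; timestamps: (N,) per-point scan times.
    pose_at(t): returns a (4, 4) world-from-lidar pose from IMU-integral
    interpolation (assumed provided by the IMU initialization/integration).
    t_ref: timestamp of the last point cloud point of the frame.
    """
    T_ref_inv = np.linalg.inv(pose_at(t_ref))
    undistorted = np.empty_like(points)
    for i, (p, t) in enumerate(zip(points, timestamps)):
        T = T_ref_inv @ pose_at(t)        # pose of point i in the reference frame
        undistorted[i] = (T @ np.append(p, 1.0))[:3]
    return undistorted
```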
After the three-dimensional laser radar point cloud is de-distorted, the initial pose of the point cloud frame is obtained by the IMU integral pose interpolation method. The dynamic points in the three-dimensional laser radar point cloud are then removed according to the dynamic object position information obtained by the sensing algorithm, and grid voxel downsampling is performed, completing the processing of the three-dimensional laser radar point cloud frame.
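A minimal sketch of this dynamic point removal follows; the axis-aligned boxes and the margin value are illustrative assumptions, not details fixed by the patent.

```python
import numpy as np

def remove_dynamic_points(points, dynamic_boxes, margin=0.3):
    """Drop laser radar points inside any dynamic-object bounding box.

    points: (N, 3) de-distorted points in the map frame.
    dynamic_boxes: list of (center (3,), size (3,)) pairs collected from the
    previous frame's detections and from unmatched-but-alive tracking
    instances, per the dynamic object information collection strategy.
    margin: extra box padding in metres (an assumed value).
    """
    keep = np.ones(len(points), dtype=bool)
    for center, size in dynamic_boxes:
        half = size / 2.0 + margin
        inside = np.all(np.abs(points - center) <= half, axis=1)
        keep &= ~inside                   # mark points inside the box
    return points[keep]
```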
After the incremental map is constructed with ikd-Tree, the 5 map points nearest to each feature point in the point cloud data are found in the ikd-Tree incremental map; whether these 5 map points form a plane is judged, and if they do, the residual from the feature point to the plane where the 5 map points lie is solved. If the dynamic points have not been removed, the residual from a feature point to its corresponding plane is typically relatively large, and it is not reasonable to use such feature points to estimate the point cloud frame pose. Accordingly, the corresponding value S is calculated with formula (7) from the coordinates of the i-th feature point of the point cloud frame, as the residual from the feature point to the nearest plane of the incremental map. If S is larger than the preset second threshold, that residual is considered unreasonable: the feature point is a dynamic point or a noise point and is removed, thereby realizing the removal of dynamic points.
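A minimal sketch of the plane judgment and residual test follows; a scipy cKDTree stands in for ikd-Tree, and the planarity tolerance and second-threshold values are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def is_dynamic_or_noise(feature_pt, map_points, tree,
                        second_threshold=0.2, plane_tol=0.05, k=5):
    """Plane judgment and residual test for one feature point.

    feature_pt: (3,) feature point of the point cloud frame.
    map_points: (M, 3) incremental map points; tree: cKDTree over them
    (standing in for ikd-Tree). Returns True if the residual S exceeds
    the second threshold (dynamic point or noise point: remove it).
    """
    _, idx = tree.query(feature_pt, k=k)
    pts = map_points[idx]
    centroid = pts.mean(axis=0)
    # least-squares plane normal: smallest singular vector of centred points
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    # plane judgment: all k map points must lie close to the fitted plane
    if np.max(np.abs((pts - centroid) @ normal)) > plane_tol:
        return False                       # not a plane: no residual test
    S = abs((feature_pt - centroid) @ normal)   # point-to-plane residual S
    return S > second_threshold
```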
This embodiment has the following advantages:
the point cloud target detection data obtained by the three-dimensional laser radar and the image target detection data obtained by the camera are subjected to fusion matching through a Hungary matching algorithm, and matching result data can be output, so that parallel calculation of a CIA-SSD point cloud target detection process and a YOLOv5 image target detection process is realized, the problem that targets cannot be stably detected and tracked due to more field environment interferents is solved, the stability of real-time operation of a system can be effectively ensured, and the stability of multi-target tracking is improved while higher detection accuracy is achieved.
The deep learning-based trajectory prediction algorithm requires a historical trajectory of the moving object over a period of time to complete the inference. If the number of the historical track points of the moving object cannot meet the requirement, the deep learning neural network cannot extract effective features. However, when the actual system is running, the tracking instance is created early, and the number of the historical track points cannot meet the requirement. At the moment, three-dimensional track prediction is completed by using a Kalman filtering algorithm, so that the three-dimensional track prediction can be completed by the system under different conditions at different stages. In addition, the Kalman filtering algorithm predicts the moving object with small track error in a short time, and can evaluate the predicted result of the track prediction algorithm based on the deep learning to obtain the optimal predicted track, thereby achieving the advantage of high accuracy of target motion track prediction.
The method is characterized in that the method is not complete by simply using a point cloud image fusion three-dimensional target detection module to obtain information to remove dynamic points, and the corresponding dynamic points cannot be removed because the target detection algorithm can generate the condition that a certain frame of data moving object is missed. In order to solve the problem, the invention designs a dynamic object information collection strategy, which is used for acquiring the target detection result of the previous frame of data on one hand and acquiring the corresponding position estimation value from the tracking example which is not matched with the observed value of the previous frame but is not deleted on the other hand. Since the relative speed between the ground mobile robot and the moving object is not large in the course of the ground mobile robot tracking the target object. Therefore, the position change of the moving object between the adjacent data frames is not great, the position information of the moving object of the previous frame can be used for removing the dynamic object of the current three-dimensional laser radar point cloud frame, and the advantage of higher removal rate of the dynamic point is achieved. The parallel computation of the CIA-SSD network process, the YOLOv5 process and the SLAM process is realized, and the real-time running stability of the whole system is ensured.
The invention can provide future tracks of moving objects and establish a static map on a homemade field environment data set, and verifies the high accuracy of a point cloud image fusion three-dimensional target detection algorithm, the stable tracking effect of a multi-target tracking algorithm, the precision of three-dimensional track prediction, the high preservation rate of static points of the established static map and the high removal rate of dynamic points. The test device can run in real time on test equipment (CPU is AMD Ryzen 93950X, GPU is NVIDIAGeForceRTX 3080). The system has important significance for solving the problem of tracking the target object by the ground mobile robot.
The corresponding matching values are obtained through the Hungarian matching algorithm, and the state of the tracking instance is further judged through the life cycle management strategy of the tracking instance, so that the state of the tracking instance can serve as the basis for deciding the next operation on it, which benefits the detection, tracking and prediction of targets by the system.
The accuracy of the memory-augmented neural network track prediction can be ensured by training a historical track encoder network for extracting the historical track features from the historical track, a future track encoder network for extracting the future track features from the future track, and a decoder network for decoding and recovering the future track features into the future track.
By initializing the IMU, the accuracy of completing the point cloud de-distortion in the pose interpolation based on IMU integration can be ensured.
The acceleration, the speed and the position of the tracking instance are estimated by using the three-dimensional Kalman filter, so that estimated data of the tracking instance can be further provided, and the state of the tracking instance can be further monitored and analyzed.
The future track is evaluated through the three-dimensional Kalman filter, so that the accuracy of the future track can be further ensured, and the advantage of higher accuracy of target motion track prediction is achieved.
By solving the residual, whether the residual value from a feature point in the point cloud data to the nearest plane of the incremental map is reasonable can be judged, and on that basis whether the feature point is a dynamic point or a noise point; this improves the dynamic point removal rate and ensures the real-time running stability of the whole system. The pose estimation of the current three-dimensional detector can also be completed, ensuring the positioning accuracy of the whole system.
If the dynamic points are not removed, the residual from a feature point to its corresponding plane is typically relatively large, and it is not reasonable to use such feature points to estimate the point cloud frame pose. Accordingly, the corresponding value is calculated with formula (7); if the value is larger than the second threshold, the residual from the feature point to the nearest plane is considered unreasonable, and the feature point is a dynamic point or a noise point that needs to be removed. A method that directly removes all point cloud points near the position of a dynamic object clearly removes some ground points, so holes and similar defects appear in the ground area. The invention solves this problem by using ikd-Tree to construct an incremental map: the holes are filled in by the next three-dimensional laser radar point cloud keyframe, so that static points are preserved as much as possible. This achieves a high dynamic point removal rate while preserving static points as far as possible.
Variations and modifications to the above would be obvious to persons skilled in the art to which the invention pertains from the foregoing description and teachings. Therefore, the invention is not limited to the specific embodiments disclosed and described above, but some modifications and changes of the invention should be also included in the scope of the claims of the invention. In addition, although specific terms are used in the present specification, these terms are for convenience of description only and do not constitute any limitation on the invention.

Claims (10)

1. A multi-moving-object detection and track prediction method in a field environment, characterized by comprising the following steps:
S1, point cloud image fusion three-dimensional target detection: point cloud data are obtained by detection with a three-dimensional detector, and point cloud target detection data are obtained from the point cloud data through a point cloud target detection algorithm; image data are obtained with an image sensor, and image target detection data are obtained from the image data through an image target detection algorithm; the point cloud target detection data and the image target detection data are matched using the Hungarian matching algorithm, and matching result data are output;
S2, three-dimensional multi-target tracking: a corresponding tracking instance is created according to the matching result data of the point cloud image fusion three-dimensional target detection, a three-dimensional Kalman filter is constructed to predict the position of the moving object at the next moment, and the Hungarian matching algorithm is used to match the target detection result at the next moment with the predicted value, completing the multi-target tracking;
S3, three-dimensional track prediction: a deep learning track prediction algorithm and a Kalman filtering algorithm are combined to infer the target motion track; the deep learning track prediction algorithm infers the future track of the target based on its historical track over a past period, the error of the future track is calculated with the Kalman filtering algorithm as reference, and when the error of the future track is larger than a preset first threshold, the prediction of the three-dimensional track is completed with the Kalman filtering algorithm;
S4, establishing a static map with a dynamic SLAM algorithm: taking the last point cloud point of the point cloud data as reference, the pose of the three-dimensional detector corresponding to each point cloud point is obtained by IMU integral interpolation to complete the de-distortion of the point cloud; the initial pose of the point cloud data is obtained in the same way, the dynamic points in the point cloud data are removed according to the dynamic object position information obtained by the sensing algorithm, grid voxel downsampling is performed, and an incremental map is then constructed with ikd-Tree to complete the establishment of the static map.
2. The multi-moving-object detection and track prediction method in a field environment according to claim 1, characterized in that, in the step S1, the method further comprises the following steps:
The three-dimensional detector obtains point cloud target detection data through a CIA-SSD network, and the image sensor obtains image target detection data through a YOLOv5 network. The homogeneous coordinates of the target object center point in the image sensor coordinate system are calculated as P_C = T^C_L · P_L, wherein T^C_L denotes the homogeneous transformation matrix of the three-dimensional detector coordinate system with respect to the image sensor coordinate system and P_L denotes the homogeneous coordinates of the target object center point in the three-dimensional detector coordinate system. A two-dimensional bounding box of the moving object is obtained with the YOLOv5 network, and the center point coordinate p_img of the two-dimensional bounding box is obtained; the Euclidean distance between p_L and p_img is then found to construct a cost matrix, and the Hungarian matching algorithm is used to match the point cloud target detection data with the image target detection data.
3. The multi-moving-object detection and track prediction method in a field environment according to claim 1, characterized in that, in the step S2, the method further comprises the following steps:
The current position P_pre of the tracking instance is predicted with the Kalman filtering algorithm, and a reasonable observation value P_L of the tracking instance is obtained through the Hungarian matching algorithm.
4. The multi-moving-object detection and track prediction method in a field environment according to claim 3, characterized in that, in the step S2, the method further comprises the following steps:
Let p_pre be the projection of the tracking instance position prediction on the image, and p_img the moving object center point obtained by the YOLOv5 image target detection network; if there is a redundant p_img that is not matched, go to step S2.1; if both p_pre and p_L, and p_img and p_L, are matched, go to step S2.2; if no observation data exist, go to step S2.3; if the categories of p_img and p_pre do not match, discard all observation data at that moment;
S2.1, create a tracking instance for the target; the tracking instance is in the initial state;
S2.2, take P_L as the observation value of the corresponding tracking instance; the tracking instance enters the tracking state after obtaining 3 frames of observation data;
S2.3, use the predicted value P_pre as the current position of the tracking instance; after the current position has been obtained this way 5 consecutive times without observation data, delete the tracking instance, which enters the extinction state.
5. The multi-moving-object detection and track prediction method in a field environment according to claim 1, characterized in that, in the step S3, the method further comprises the following steps:
The deep learning track prediction algorithm adopts a memory-augmented neural network: a historical track encoder network is used to extract the historical track feature h_i from the historical track, a future track encoder network is used to extract the future track feature f_i from the future track, the h_i and f_i features are concatenated and input into a decoder network, and the future track of the moving object is recovered.
6. The multi-moving-object detection and track prediction method in a field environment according to claim 5, characterized in that, in the step S2, the method further comprises the following steps:
The acceleration, velocity and position of the tracking instance are estimated using a three-dimensional Kalman filter.
7. The multi-moving-object detection and track prediction method in a field environment according to claim 6, characterized in that, in the step S3, the method further comprises the following steps:
The three-dimensional Kalman filter is used to estimate the acceleration, velocity and position of the tracking instance and to calculate the track point for a future period; taking this Kalman-filter track point as the reference, the track point errors over that period are calculated for the future tracks predicted by the memory-augmented neural network, the future track with the smallest error is selected as the optimal track, and when the error of the optimal track is larger than the preset first threshold, the future track is invalidated.
8. The multi-moving-object detection and track prediction method in a field environment according to claim 1, characterized in that, in the step S4, the method further comprises the following steps:
IMU pre-integration is used to obtain the pose change between two adjacent frames of three-dimensional laser radar point cloud; the pose change is used as a constraint to construct a factor graph, and after the factor graph is optimized, the corresponding state quantities, such as the IMU zero-bias value, are obtained, completing the initialization of the IMU.
9. The multi-moving-object detection and track prediction method in a field environment according to claim 1, characterized in that, in the step S4, the method further comprises the following steps:
After the incremental map is constructed with ikd-Tree, several map points nearest to each feature point in the point cloud data are found in the ikd-Tree incremental map; whether these map points form a plane is judged, and if they do, the residual from the feature point to the plane where the map points lie is solved.
10. The multi-moving-object detection and track prediction method in a field environment according to claim 9, characterized in that, in the step S4, the method further comprises the following steps:
The value S is calculated according to formula (7) from the coordinates of the i-th feature point of the point cloud frame, as the residual from the feature point to the nearest plane of the incremental map; if S is larger than a preset second threshold, the feature point is removed.
CN202310381363.3A 2023-04-11 2023-04-11 Multi-moving-object detection and track prediction method in field environment Pending CN116385493A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310381363.3A CN116385493A (en) 2023-04-11 2023-04-11 Multi-moving-object detection and track prediction method in field environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310381363.3A CN116385493A (en) 2023-04-11 2023-04-11 Multi-moving-object detection and track prediction method in field environment

Publications (1)

Publication Number Publication Date
CN116385493A true CN116385493A (en) 2023-07-04

Family

ID=86963157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310381363.3A Pending CN116385493A (en) 2023-04-11 2023-04-11 Multi-moving-object detection and track prediction method in field environment

Country Status (1)

Country Link
CN (1) CN116385493A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117475397A (en) * 2023-12-26 2024-01-30 安徽蔚来智驾科技有限公司 Target annotation data acquisition method, medium and device based on multi-mode sensor
CN117475397B (en) * 2023-12-26 2024-03-22 安徽蔚来智驾科技有限公司 Target annotation data acquisition method, medium and device based on multi-mode sensor
CN117934549A (en) * 2024-01-16 2024-04-26 重庆大学 3D multi-target tracking method based on probability distribution guiding data association


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination