CN111693972B - Vehicle position and speed estimation method based on binocular sequence images - Google Patents
Vehicle position and speed estimation method based on binocular sequence images
- Publication number: CN111693972B
- Application number: CN202010475838.1A
- Authority: CN (China)
- Legal status: Active
Classifications
- G01S11/12—Systems for determining distance or velocity not using reflection or reradiation, using electromagnetic waves other than radio waves
- G06N20/00—Machine learning
- G06T5/20—Image enhancement or restoration using local operators
- G06T5/30—Erosion or dilatation, e.g. thinning
- G06T5/70—Denoising; smoothing
- G06T7/215—Motion-based segmentation
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T2207/10028—Range image; depth image; 3D point clouds
- G06T2207/10032—Satellite or aerial image; remote sensing
- G06T2207/10044—Radar image
- G06T2207/20081—Training; learning
- G06T2207/30252—Vehicle exterior; vicinity of vehicle
Abstract
The invention discloses a vehicle position and speed estimation method based on binocular sequence images, comprising the following steps: S1, acquiring a depth map and a point cloud map with a ZED binocular camera; S2, realizing background subtraction through a KNN algorithm and identifying the moving target and the static background environment in the sequence images; S3, detecting the edge points of the moving target, drawing a rectangular identification frame positioned on the moving target, and tracking the moving target in real time; S4, removing rectangular identification frames that appear on non-moving targets because of error factors such as light and shadow; S5, calculating the pixel coordinates of the central point of the effective rectangular identification frame locked on the moving target; S6, obtaining the (X, Y, Z) three-dimensional space coordinates of that point; and S7, estimating the position and speed at the next moment with a Kalman filtering algorithm from the three-dimensional space coordinates of the previous frame and the current frame. The method has a simple algorithm principle and low computational complexity, is suitable for the technical field of simulation, and can be used by unmanned vehicles to detect the position and speed of vehicles travelling in the same or the opposite direction.
Description
Technical Field
The invention relates to the technical field of intelligent automobile positioning, in particular to a vehicle position and speed estimation method based on binocular sequence images.
Background Art
The measurement, calculation, and prediction of vehicle position and speed are of great importance in the field of unmanned driving control. Calculating and predicting position and speed from sequence images acquired by a binocular camera ensures relatively accurate predicted values while keeping the cost of use under control. To estimate vehicle speed and position from a binocular image sequence, the sequence images acquired by a binocular camera are used: background subtraction is realized by means of a KNN algorithm with relevant optimization adjustments, and after the relevant values are acquired, they are processed with a Kalman filtering algorithm. That is, the position and speed at the next moment are predicted from the change in position of the feature points between two consecutive frames of the real-time image.
At present, mature vehicle speed detection technologies include ground magnetic-induction-coil detection, infrared and laser-ray speed detection, radar speed detection, video-based speed detection, and the like. Video detection has attracted attention because its detection speed is relatively high, a large amount of information can be obtained from the video at the same time, and the cost of the technology is low. In a video-based detection method, the relevant moving target must be obtained from the video frames, its position coordinate information detected, and the speed calculated. The relatively mature vehicle speed measurement method based on the motion equation and KLT feature-point tracking, proposed in 2010 by Madasu, Hanmandlu, et al., can obtain both the position information and the speed information of the vehicle.
In similar video detection methods, a monocular camera is commonly used: real-time video is read through the monocular camera, and the position change of the moving target is obtained by comparing the known camera state with the two-dimensional image sequence of consecutive frames. However, this method cannot directly obtain three-dimensional coordinates, and so has certain limitations in use.
Disclosure of Invention
In order to reduce the cost of a traditional vehicle-mounted positioning system mainly based on a laser radar or a combined navigation system, the invention provides a vehicle position and speed estimation method based on a binocular sequence image, which comprises the following steps:
s1, continuously acquiring sequence images in a visual range by using a ZED binocular camera to obtain a depth map and a point cloud map;
the ZED binocular camera can completely acquire the geometric information of the moving object. The traditional camera for vehicle speed measurement is generally a monocular camera, and although the mode of acquiring moving object information is similar to that of a binocular camera, the acquired information is extremely limited. Because of the lack of image depth information, a monocular camera can only acquire (x, y) coordinate information in an image coordinate system, but cannot acquire (x, y, z) three-dimensional coordinates of a certain moving object or a certain characteristic point in the three-dimensional coordinate system, so that the monocular camera has great limitations on calculation and prediction in the aspects of speed, position and distance, and has requirements on the placement angle and the like of the camera.
The binocular stereo vision adopted by the binocular camera can solve the problem, the three-dimensional position information of the moving object is completely recorded by utilizing the three-dimensional stereo coordinate system established at the camera, and the method has obvious help effect on the requirements of prediction and calculation of the position and the speed by utilizing a Kalman filtering method.
Binocular stereo vision measures the coordinates of a target point based on the parallax principle. First, a three-dimensional coordinate system with the camera center point as the origin is established, and the center distance b between the two cameras is determined. The images acquired from the two cameras are then matched to obtain the projections M_1 and M_2 of the same feature point M in the two camera coordinate systems, with coordinate values (x_1, y_1) and (x_2, y_2). The three-dimensional coordinates (x, y, z) of the feature point in the camera coordinate system can then be obtained by calculation; in particular, for focal length f, the depth follows from the disparity d = x_1 − x_2 as z = f·b / d.
The three-dimensional coordinates acquired through the ZED binocular camera are uniformly stored in a Point cloud type built in the ZED, and the coordinate system can be freely selected to be established based on a left eye or a right eye (the default is the right eye). Meanwhile, the transformation between the three-dimensional coordinate system and the image coordinate system can be realized, namely:
and (x, y) coordinates of the dynamic target in an image coordinate system, which are obtained through the image acquired by the left eye camera, are substituted into a retrieval function of the three-dimensional point cloud picture, so that the coordinate system conversion can be completed, and the (x, y, z) coordinates of the target point in the left eye camera coordinate system are returned for algebraic calculation in Kalman filtering.
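The coordinate-system conversion described above can be sketched as follows. This is a minimal pure-Python illustration of the disparity-to-depth relation and pinhole back-projection, with assumed example intrinsics (focal length fx, fy and principal point cx, cy); the actual pipeline obtains (x, y, z) directly from the ZED point-cloud retrieval function.

```python
def depth_from_disparity(f, b, d):
    """Depth from the parallax principle: z = f*b/d, for focal length f
    (pixels), baseline b (meters), and disparity d = x_1 - x_2 (pixels)."""
    return f * b / d

def pixel_to_camera_coords(u, v, z, fx, fy, cx, cy):
    """Back-project image pixel (u, v) with known depth z into camera
    coordinates. A stand-in for the ZED point-cloud lookup: the SDK returns
    (x, y, z) directly, while here the pinhole model reconstructs it from
    assumed intrinsics fx, fy, cx, cy."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)

# 12 px of disparity at f = 700 px, baseline 0.12 m:
z = depth_from_disparity(f=700.0, b=0.12, d=12.0)   # ~7.0 m
print(pixel_to_camera_coords(780.0, 360.0, z, 700.0, 700.0, 640.0, 360.0))
# x ~ 1.4 m, y = 0, z ~ 7 m
```
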
S2, background subtraction is achieved through a KNN algorithm, and a moving target and a static background environment in the sequence image are identified;
the K nearest neighbor algorithm (KNN) belongs to the data mining classification technology and is characterized in that: and judging the class of the sample according to the class of the nearest neighbor sample of the sample, wherein the selected neighbor is the sample which is correctly classified.
The principle of the KNN algorithm in background modeling is as follows: the class of the feature value at a coordinate (x, y) is predicted, and its category is determined from the feature-value categories of the K nearest points. First, the distance between the target point and each neighboring point is calculated by the (Euclidean) distance formula

d = √((x_1 − x_2)² + (y_1 − y_2)²),

then the points for comparison are screened through the K value, and finally the category of the target point is determined.
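The nearest-neighbor voting just described can be sketched in a few lines. This is a generic illustration of the KNN classification principle with hypothetical feature values and labels, not the OpenCV KNN background-subtractor implementation:

```python
from collections import Counter
import math

def knn_classify(sample, labeled_points, k):
    """Classify `sample` by majority vote among its k nearest labeled points.
    labeled_points is a list of (feature_vector, label) pairs; distance is
    the Euclidean distance between feature vectors."""
    neighbors = sorted(labeled_points, key=lambda p: math.dist(sample, p[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# A pixel whose feature value sits near the stored background samples is
# voted "background"; one far from all of them is voted "foreground".
data = [((10.0, 10.0), "background"), ((11.0, 9.0), "background"),
        ((12.0, 11.0), "background"), ((200.0, 190.0), "foreground"),
        ((205.0, 195.0), "foreground")]
print(knn_classify((12.0, 10.0), data, 3))   # -> background
```
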
The KNN algorithm has the advantages of high accuracy, stable effect, a simple idea, and a clear process. Although it has no training process and its calculation can be complex, this complexity has no appreciable effect on camera pixel samples of small capacity and can be ignored. Because of its excellent classification capability, the KNN algorithm is well suited to classifying and distinguishing moving objects from the background, and it is therefore adopted here.
Since the KNN algorithm belongs to the category of machine learning, in the first few frames after the program starts running the algorithm needs to train adaptively on the images, and the background and the moving target cannot be separated automatically until a suitable background has been found. However, this adaptive training time is very short, and once the data processed during the adaptation phase are discarded, the subsequent functions are not noticeably affected: this phase has no influence on the tracking of the moving target or on the subsequent coordinate calculation and Kalman filtering, so the task of tracking the moving target through background-elimination modeling can be completed effectively.
The background subtraction algorithm is a very classical moving-target detection method: a background model of the static scene is obtained by background modeling, a difference operation is then performed between the image features of the current frame and the previously stored background model, and the resulting region is kept as the moving region of the moving target, completing the identification and tracking of the moving object. The principle of the background subtraction algorithm is

M_object(x, y) = 1, if |f_c(x, y) − f_b(x, y)| > T_h; otherwise 0

where (x, y) are the coordinates of the corresponding pixel in the image coordinate system, f_c(x, y) is the feature value of the pixel at (x, y) in the current frame, f_b(x, y) is the feature value of the pixel at (x, y) in the background model, T_h is the threshold set for determining moving targets, and M_object(x, y) is the image obtained after black-and-white binarization and corresponding processing of the difference between the current frame and the background. When the difference between a pixel's feature value and the background feature value exceeds the threshold, the point is judged to belong to a moving object; otherwise it is judged to be a background point.
Background elimination modeling can separate the moving target from the background feature points so as to obtain the required part (in this project, the point cloud coordinates of the moving target). The project realizes background elimination with the KNN algorithm, implemented in C++ through the apply function, and after denoising by black-and-white binarization, erosion, and dilation, a relatively complete image with the moving target separated from the background is obtained (the moving target is displayed in white, the background in black).
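The thresholded difference between the current frame and the background model can be sketched as follows, with frames simplified to plain 2-D lists of grayscale values, a stand-in for the KNN-maintained background model of the real pipeline:

```python
def background_subtract(current, background, threshold):
    """Per-pixel difference of the current frame against the background
    model, thresholded as in the M_object formula: returns a binary mask
    with 255 (white) for moving-target pixels and 0 (black) for background."""
    return [[255 if abs(c - b) > threshold else 0
             for c, b in zip(cur_row, bg_row)]
            for cur_row, bg_row in zip(current, background)]

bg  = [[10, 10, 10], [10, 10, 10]]
cur = [[10, 90, 10], [10, 95, 12]]   # a bright object enters column 1
print(background_subtract(cur, bg, 30))
# -> [[0, 255, 0], [0, 255, 0]]
```
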
S3, detecting edge points of the moving target, drawing a rectangular identification frame positioned on the moving target, and tracking the moving target in real time;
the images are analyzed through a Find contacts contour detection function in an OPENCV library, and the moving objects can be visualized through a rectangle function, so that the moving objects can be tracked in real time.
S4, setting a rectangular identification frame screening mechanism, and removing rectangular identification frames appearing on the non-moving target caused by error factors such as light and shadow;
the moving target identification is ensured to be accurate by setting a screening function to screen the identification range.
S5, calculating the pixel coordinates X′ and Y′ of the central point of the effective rectangular identification frame locked on the moving target;
the visualized image is saved based on the pixel coordinates (x, y) of the center point of the image coordinate system.
S6, acquiring the corresponding real coordinates X and Y and the depth coordinate Z from the obtained X′ and Y′ by means of the ZED point cloud coordinates, that is, acquiring the (X, Y, Z) three-dimensional space coordinates of the point;
and converting the pixel coordinates (x, y) into three-dimensional coordinates (x, y, z) of the camera through the coordinate conversion mode in the step S1.
And S7, estimating the position and the speed of the next moment by using a Kalman filtering algorithm according to the three-dimensional space coordinates of the previous frame and the current frame.
The core idea of the Kalman filtering algorithm is prediction with measurement feedback. It consists of two parts: the first is the linear system state prediction equation, and the second is the linear system observation equation.
The linear system state prediction equation can be expressed as:
x_k = A x_{k−1} + B u_{k−1} + ω_{k−1}

wherein: p(ω) ~ N(0, Q)

In the equation, x_k and x_{k−1} represent the true values of the state at times k and (k−1), u_{k−1} denotes the control input at time (k−1), ω_{k−1} represents the process excitation noise, A represents the state transition coefficient matrix (n×n), B represents the gain matrix of the optional control input, and Q represents the covariance matrix of the process excitation noise.
The linear system observation equation can be expressed as:
z_k = H x_k + v_k

wherein: p(v) ~ N(0, R)

In the equation, z_k is the observed value at time k, v_k is the observation noise, H denotes the measurement coefficient matrix (an m×n matrix), and R denotes the measurement noise covariance matrix.
To obtain the optimal estimate, both the a priori estimated covariance and the a posteriori estimated covariance at time k are needed.

The expression of the a posteriori estimated covariance at time k is:

P_k = E[e_k e_k^T], with e_k = x_k − x̂_k

The a posteriori estimate at time k can be expressed through the a priori estimate and the Kalman gain:

x̂_k = x̂_k^- + K_k (z_k − H x̂_k^-)

Thereby, it is possible to obtain:

e_k = x_k − x̂_k = (I − K_k H)(x_k − x̂_k^-) − K_k v_k

The a priori estimated covariance expression at time k is:

P_k^- = E[e_k^- (e_k^-)^T], with e_k^- = x_k − x̂_k^-

From the two formulas above:

P_k = P_k^- − K_k H P_k^- − P_k^- H^T K_k^T + K_k (H P_k^- H^T + R) K_k^T

Let the trace of the a posteriori estimated covariance matrix at time k be denoted T[P_k], so that:

T[P_k] = T[P_k^-] − 2 T[K_k H P_k^-] + T[K_k (H P_k^- H^T + R) K_k^T]

The sum of the diagonal elements of P_k is the mean square error. Deriving the mean square error with respect to the unknown K_k and setting the derivative equal to 0 determines the value of the Kalman gain K_k and completes the minimum-mean-square-error estimation of the model, so that the error of the a posteriori estimate is small and closer to the true value of the state.

Minimizing the mean square error determines the expression of the Kalman gain:

K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}

As can be seen, the larger P_k^- is, the larger K_k is: the larger the a priori estimation error, the more the reliable measurement feedback is needed, and the model automatically increases the Kalman gain to make a more accurate estimate. When P_k^- is 0, K_k is 0: the a priori estimate has no error, an accurate a posteriori estimate is obtained by prediction alone, no measurement feedback is needed, and the model automatically sets the weight of the measurement feedback to 0.
Therefore, the final equations of the Kalman filtering algorithm can be obtained:

(1) Time update equations

x̂_k^- = A x̂_{k−1} + B u_{k−1}
P_k^- = A P_{k−1} A^T + Q

(2) State update equations

K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}
x̂_k = x̂_k^- + K_k (z_k − H x̂_k^-)
P_k = (I − K_k H) P_k^-
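The time-update and state-update equations can be exercised with a minimal scalar constant-velocity filter: the state is [position, velocity] along one axis, A = [[1, dt], [0, 1]], H = [1, 0], and the noise levels q and r are illustrative assumptions, not values from the patent.

```python
def kalman_1d(zs, dt, q=1e-3, r=0.25):
    """Scalar constant-velocity Kalman filter along one coordinate axis.
    Process noise Q = q*I, measurement noise variance R = r.
    Returns the (position, velocity) estimate after each measurement."""
    x = [zs[0], 0.0]                       # posterior state estimate
    P = [[1.0, 0.0], [0.0, 1.0]]           # posterior covariance
    out = []
    for z in zs:
        # Time update: x^- = A x,  P^- = A P A^T + Q
        x = [x[0] + dt * x[1], x[1]]
        P = [[P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q,
              P[0][1] + dt * P[1][1]],
             [P[1][0] + dt * P[1][1],
              P[1][1] + q]]
        # State update: K = P^- H^T (H P^- H^T + R)^-1, with H = [1, 0]
        s = P[0][0] + r
        K = [P[0][0] / s, P[1][0] / s]
        y = z - x[0]                       # innovation z_k - H x^-
        x = [x[0] + K[0] * y, x[1] + K[1] * y]
        # P = (I - K H) P^-
        P = [[(1.0 - K[0]) * P[0][0], (1.0 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
        out.append((x[0], x[1]))
    return out

# A trolley moving at 2 m/s, sampled every 0.1 s: the velocity estimate
# converges toward 2.0 without ever being measured directly.
track = kalman_1d([2.0 * 0.1 * k for k in range(100)], dt=0.1)
print(track[-1])
```
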
the beneficial effects of the invention are as follows:
For the estimation of vehicle position and speed, sequence images acquired with a binocular camera are used: background subtraction is realized by means of a KNN algorithm with relevant optimization adjustments, and after the relevant values are acquired they are processed with a Kalman filtering algorithm. That is, the position and speed at the next moment are predicted from the change in position of the feature points between two consecutive frames of the real-time image. Experiments prove that the proposed prediction method has high reliability within a certain range. The method has a simple algorithm principle, low computational complexity, and, thanks to the binocular camera, a low cost of use. It is suitable for the technical field of simulation, can realize calculation and prediction of vehicle speed and position in a real environment, can be used by unmanned vehicles to detect the position and speed of vehicles travelling in the same or the opposite direction, and can also be used in actual scenes such as vehicle speed measurement at intersections. It therefore has good application prospects.
Drawings
FIG. 1 is a flow chart of the algorithm of the present invention;
FIG. 2 (a) is a comparison graph (X direction) of the real value and the predicted value of the three-dimensional space position of the experimental trolley of the invention;
FIG. 2 (b) is a comparison graph (Y direction) of the real value and the predicted value of the three-dimensional spatial position of the experimental trolley;
FIG. 2 (c) is a comparison graph (Z direction) of the real value and the predicted value of the three-dimensional space position of the experimental trolley of the invention;
FIG. 3 (a) is a comparison graph (X direction) of the real value and the predicted value of the three-dimensional direction speed of the experimental trolley;
FIG. 3 (b) is a comparison graph (Y direction) of the real value and the predicted value of the three-dimensional direction speed of the experimental trolley;
FIG. 3 (c) is a comparison graph (Z direction) of the real value and the predicted value of the three-dimensional direction speed of the experimental trolley of the invention;
fig. 4 is a working schematic diagram of the binocular camera of the present invention.
Examples
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as merely illustrative of the invention and not as limiting the scope of the invention.
The general design idea of the invention is as follows: the method comprises the steps of continuously acquiring sequence images by using a ZED binocular camera, obtaining a depth map and a point cloud map, realizing Background Subtraction (BSM) through a KNN algorithm, locking a moving target by using a rectangular identification frame, realizing accurate tracking of a moving vehicle, obtaining three-dimensional space coordinates of points on the moving target, obtaining a real-time moving state of the vehicle, and realizing more accurate prediction of the impending moving state of the vehicle by using a Kalman filtering algorithm.
As shown in the figure, the embodiment of the invention discloses a vehicle position and speed estimation method based on binocular sequence images, and the embodiment of the invention comprises the following steps:
s1, using a ZED binocular camera to continuously collect sequence images in a visual range to obtain a depth map and a point cloud map
The experimental trolley moves freely and passes in front of the camera while the ZED binocular camera continuously acquires the image sequence. After processing and calculation, the three-dimensional coordinates acquired through the ZED binocular camera are uniformly stored in the Point cloud type built into the ZED, and whether this coordinate system is established based on the left eye or the right eye (the default is the right eye) can be freely chosen. At the same time, conversion between the three-dimensional coordinate system and the image coordinate system is realized: the (x, y) coordinates of the dynamic target in the image coordinate system, obtained from the image acquired by the left-eye camera, are substituted into the retrieval function of the three-dimensional point cloud map to complete the coordinate-system conversion and return the (x, y, z) coordinates of the target point in the left-eye camera coordinate system for the algebraic calculation in the Kalman filtering.
S2, background subtraction is achieved through a KNN algorithm, and a moving target and a static background environment in the sequence image are identified;
background elimination is realized by utilizing a KNN algorithm, separation of a background and a moving trolley is realized by an apply function, a background model under a static scene is obtained in a background modeling mode after denoising processing of black-white binarization, corrosion and expansion is carried out, then difference operation is carried out by means of image characteristics under a current frame and a previously stored background model, and an obtained area is used as a moving object of a moving area to be stored, so that identification and tracking of the moving object are completed, and a relatively complete image with the moving object separated from the background is obtained.
Background subtraction is realized by the KNN algorithm; that is, moving-target identification is performed by the background subtraction method, where background-elimination modeling is realized either with a Gaussian mixture model in an image segmentation mode or with the machine-learning KNN algorithm.
the KNN algorithm predicts the characteristic value type under one coordinate (x, y) during background modeling, determines the category of the KNN algorithm according to the characteristic value category of K points closest to the KNN algorithm, and firstly determines the category of the KNN algorithm according to a distance formula
Calculating the distance between the target point and the adjacent point, screening points for comparison through a K value, and finally determining the type of the target point;
the background subtraction algorithm is specifically
(x, y) -coordinate values of corresponding pixel points in the image coordinate system, f c (x, y) -feature value of pixel at (x, y) coordinate under current frame, f b (x, y) -eigenvalues of pixel points at (x, y) coordinates on background modeling, T h Setting a threshold value for determining moving objects, M object (x, y) -obtaining an image after black-and-white binarization and corresponding processing are carried out on the current frame and the background difference image; and when the difference value between the characteristic value of the pixel point and the characteristic value of the background point is judged to be larger than the threshold value, the point is judged to be a moving object, otherwise, the point is judged to be the background point.
Background elimination is realized with the KNN algorithm in C++ through the apply function, and after denoising by black-and-white binarization, erosion, and dilation, a relatively complete image with the moving target separated from the background can be obtained.
S3, detecting edge points of the moving target, drawing a rectangular recognition frame positioned on the moving target, and tracking the moving target in real time;
the image is analyzed through a Find contacts contour detection function in an OPENCV library, and the moving target is processed visually through a rectangle function, so that the real-time tracking effect of the moving trolley is realized.
S4, setting a rectangular identification frame screening mechanism, and removing rectangular identification frames appearing on the non-moving target caused by error factors such as light and shadow;
the screening and identifying range is screened by setting the screening function, so that the identifying frame is accurately and correctly selected on the contour of the moving trolley without obvious deviation and error.
S5, calculating the pixel coordinates X′ and Y′ of the central point of the effective rectangular recognition frame locked on the moving target;
and saving the visual trolley tracking image based on the pixel coordinates (x, y) of the central point of the image coordinate system for later acquiring three-dimensional coordinates.
S6, acquiring the corresponding real coordinates X and Y and the depth coordinate Z from the obtained X′ and Y′ by means of the ZED point cloud coordinates, that is, acquiring the (X, Y, Z) three-dimensional space coordinates of the point;
and (4) converting the pixel coordinates (x, y) of the center of the moving trolley into real three-dimensional coordinates (x, y, z) of the trolley position under the camera coordinate system through the coordinate conversion mode in the step (S1).
And S7, estimating the position and the speed of the next moment by using a Kalman filtering algorithm according to the three-dimensional space coordinates of the previous frame and the current frame.
The Kalman filtering algorithm consists of two parts: the first is the linear system state prediction equation, and the second is the linear system observation equation.
The linear system state prediction equation can be expressed as:
x_k = A·x_{k-1} + B·u_{k-1} + ω_{k-1}, where p(ω) ~ N(0, Q).
In this equation, x_k and x_{k-1} denote the true state values at times k and (k-1); u_{k-1} denotes the control input at time (k-1); ω_{k-1} denotes the process excitation noise; A denotes the state transition matrix (n × n); B denotes the gain matrix of the optional control input; and Q denotes the covariance matrix of the process excitation noise;
the linear system observation equation can be expressed as:
z_k = H·x_k + v_k, where p(v) ~ N(0, R).
In this equation, z_k is the observed value at time k; v_k is the observation noise; H denotes the measurement matrix (an m × n matrix); and R denotes the measurement noise covariance matrix.
The posterior estimated covariance at time k is:
P_k = E[(x_k − x̂_k)(x_k − x̂_k)^T]
The posterior estimate at time k is expressed through the prior estimate and the Kalman gain:
x̂_k = x̂_k^- + K_k·(z_k − H·x̂_k^-)
Thereby, it is possible to obtain:
P_k = (I − K_k·H)·P_k^-·(I − K_k·H)^T + K_k·R·K_k^T
The prior estimated covariance at time k is:
P_k^- = A·P_{k-1}·A^T + Q
Combining the two formulas gives:
P_k = P_k^- − K_k·H·P_k^- − P_k^-·H^T·K_k^T + K_k·(H·P_k^-·H^T + R)·K_k^T
let us denote the trace of the a posteriori estimated covariance matrix at time k as T [ Pk ], so there is:
P k the sum of the diagonal elements is the mean square error. The mean square error is derived from the unknown quantity K, the value of the Kalman gain K can be determined by making the derivative function equal to 0, and the minimum mean square error estimation of the model is completed, so that the error of the posterior estimation is small and is closer to the true value of the state.
Solving for the minimum mean square error determines the expression of the Kalman gain:
K_k = P_k^-·H^T·(H·P_k^-·H^T + R)^{-1}
The final equations of the Kalman filter algorithm are then:
(1) Time update equations:
x̂_k^- = A·x̂_{k-1} + B·u_{k-1}
P_k^- = A·P_{k-1}·A^T + Q
(2) State update equations:
K_k = P_k^-·H^T·(H·P_k^-·H^T + R)^{-1}
x̂_k = x̂_k^- + K_k·(z_k − H·x̂_k^-)
P_k = (I − K_k·H)·P_k^-
The real coordinate values, the measured speed value, and the time interval between frames are substituted into these equations; the position and speed of the trolley at the next moment are predicted, the predicted data are stored, and they are compared with the real data computed at the next moment for posterior correction.
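The prediction-correction cycle described above can be sketched as a minimal 1-D constant-velocity Kalman filter applied per coordinate axis. The state is (position p, velocity v) with A = [[1, dt], [0, 1]] and H = [1, 0] (only position is measured from the point cloud); the noise parameters q and r are assumed tuning values, not taken from the patent.

```cpp
// Minimal per-axis Kalman filter with a constant-velocity model,
// illustrating the time-update / state-update equations above.
struct Kalman1D {
    double p = 0, v = 0;                    // state estimate (position, velocity)
    double P[2][2] = {{1, 0}, {0, 1}};      // estimate covariance
    double q, r;                            // process / measurement noise

    Kalman1D(double q_, double r_) : q(q_), r(r_) {}

    void predict(double dt) {               // time update: x^- = A x, P^- = A P A^T + Q
        p += v * dt;
        double P00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q;
        double P01 = P[0][1] + dt * P[1][1];
        double P10 = P[1][0] + dt * P[1][1];
        double P11 = P[1][1] + q;
        P[0][0] = P00; P[0][1] = P01; P[1][0] = P10; P[1][1] = P11;
    }

    void update(double z) {                 // state update with measurement z
        double S  = P[0][0] + r;            // innovation covariance H P^- H^T + R
        double K0 = P[0][0] / S;            // Kalman gain K = P^- H^T S^{-1}
        double K1 = P[1][0] / S;
        double y  = z - p;                  // innovation z - H x^-
        p += K0 * y;
        v += K1 * y;
        double P00 = (1 - K0) * P[0][0];    // P = (I - K H) P^-
        double P01 = (1 - K0) * P[0][1];
        double P10 = P[1][0] - K1 * P[0][0];
        double P11 = P[1][1] - K1 * P[0][1];
        P[0][0] = P00; P[0][1] = P01; P[1][0] = P10; P[1][1] = P11;
    }
};
```

Fed with positions from successive frames, the filter's state after predict() is exactly the "next-moment" position and speed estimate described in step S7, and update() performs the posterior correction against the newly measured coordinates.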
As shown in fig. 4, binocular stereo vision measures the coordinates of a target point based on the principle of parallax. A three-dimensional coordinate system with the camera center point as the origin is established, and the center distance b between the two cameras is determined. The images acquired from the two cameras are then matched to obtain the projections M_1 and M_2 of the same feature point M in the two camera coordinate systems, with coordinate values (x_1, y_1) and (x_2, y_2); the three-dimensional coordinates (x, y, z) of the feature point in the camera coordinate system can then be obtained by calculation.
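The parallax relation can be stated numerically: for rectified cameras with focal length f (in pixels) and baseline b, depth follows z = f·b / (x_1 − x_2). The ZED SDK performs this matching internally; the sketch below uses assumed example values.

```cpp
// Depth from stereo disparity for a rectified camera pair.
// f: focal length in pixels, b: baseline in metres.
struct Stereo { double f, b; };

double depthFromDisparity(const Stereo& s, double x1, double x2) {
    double d = x1 - x2;                      // disparity in pixels
    return (d > 0) ? s.f * s.b / d : -1.0;   // -1 signals an invalid disparity
}
```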
The technical means disclosed in the scheme of the present invention are not limited to those disclosed in the above embodiments, but also include technical solutions formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.
Claims (8)
1. A vehicle position and speed estimation method based on binocular sequence images, characterized by comprising the following steps:
s1, continuously acquiring sequence images in a visual range by using a ZED binocular camera to obtain a depth map and a point cloud map;
s2, background subtraction is achieved through a KNN algorithm, and a moving target and a static background environment in the sequence image are identified;
s3, detecting edge points of the moving target, drawing a rectangular identification frame positioned on the moving target, and tracking the moving target in real time;
s4, setting a rectangular recognition frame screening mechanism to remove a rectangular recognition frame appearing on the non-moving target caused by light and shadow error factors;
s5, calculating pixel coordinates X 'and Y' of the central point of the effective rectangular identification frame locked on the moving target;
s6, acquiring corresponding real coordinates X and Y and a depth coordinate Z by using the obtained X 'and Y' and by means of a ZED point cloud coordinate, namely acquiring a (X, Y, Z) three-dimensional space coordinate of the point;
and S7, estimating the position and the speed of the next moment by using a Kalman filtering algorithm according to the three-dimensional space coordinates of the previous frame and the current frame.
2. The method according to claim 1, wherein in the step S1, the ZED binocular camera completely acquires the geometric information of the moving object;
the three-dimensional coordinates obtained through the ZED binocular camera are uniformly stored in the Point Cloud type built into the ZED, realizing the conversion between the three-dimensional coordinate system and the image coordinate system, namely:
the (x, y) coordinates of the dynamic target in the image coordinate system, obtained from the image acquired by the left-eye camera, are substituted into the retrieval function of the three-dimensional point cloud map; the coordinate system conversion is thereby completed, and the (x, y, z) coordinates of the dynamic target point in the left-eye camera coordinate system are returned for the algebraic calculation in Kalman filtering.
3. The method of claim 1, wherein: in the step S2, background elimination is realized by using a KNN algorithm, and separation of the background from the moving vehicle is realized by the apply function; denoising by black-and-white binarization, erosion, and dilation is performed; a background model of the static scene is obtained by a background modeling method; a difference operation is performed between the image features of the current frame and the previously stored background model, and the resulting region is stored as the moving region of the moving target, thereby completing identification and tracking of the moving object and obtaining a relatively complete image in which the moving target is separated from the background.
4. The method of claim 1, wherein: in the step S3, the image is analyzed by the findContours contour detection function in the OpenCV library, and the moving object is visualized by the rectangle function, so as to realize real-time tracking of the moving object.
5. The method of claim 1, wherein: in the step S4, the identification range is screened by setting the screening function, so as to ensure that the moving object is accurately identified.
6. The method of claim 1, wherein: in step S5, the visualized image is stored based on the pixel coordinates (x, y) of the center point of the image coordinate system.
7. The method of claim 6, wherein: in step S6, the pixel coordinates (x, y) are converted into three-dimensional camera coordinates (x, y, z).
8. The method of claim 7, wherein: in the step S7, according to the three-dimensional coordinates (x, y, z) obtained in the step S6, the final equations of the Kalman filter algorithm are used:
(1) Time update equations: x̂_k^- = A·x̂_{k-1} + B·u_{k-1}; P_k^- = A·P_{k-1}·A^T + Q
(2) State update equations: K_k = P_k^-·H^T·(H·P_k^-·H^T + R)^{-1}; x̂_k = x̂_k^- + K_k·(z_k − H·x̂_k^-); P_k = (I − K_k·H)·P_k^-
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010475838.1A CN111693972B (en) | 2020-05-29 | 2020-05-29 | Vehicle position and speed estimation method based on binocular sequence images |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111693972A CN111693972A (en) | 2020-09-22 |
CN111693972B true CN111693972B (en) | 2022-11-15 |
Family
ID=72478862
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||