CN110717445B - Front vehicle distance tracking system and method for automatic driving - Google Patents

Front vehicle distance tracking system and method for automatic driving

Info

Publication number
CN110717445B
CN110717445B (application number CN201910953010.XA)
Authority
CN
China
Prior art keywords
vehicle
camera
contour
image
coordinate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910953010.XA
Other languages
Chinese (zh)
Other versions
CN110717445A (en)
Inventor
高跃
魏宇轩
赵曦滨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201910953010.XA priority Critical patent/CN110717445B/en
Publication of CN110717445A publication Critical patent/CN110717445A/en
Application granted granted Critical
Publication of CN110717445B publication Critical patent/CN110717445B/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/584 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S11/00 Systems for determining distance or velocity not using reflection or reradiation
    • G01S11/12 Systems for determining distance or velocity not using reflection or reradiation using electromagnetic waves other than radio waves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Electromagnetism (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a front vehicle distance tracking system and method for automatic driving. The system comprises: a data acquisition unit for acquiring an image sequence at equal time intervals from the camera; a vehicle detection unit for extracting the contour frame line of each front vehicle from an image; a coordinate positioning unit for calculating the real position of the front vehicle from the pixel coordinates of the vehicle contour frame line in the image and the camera parameters; and a vehicle tracking unit, which identifies the same vehicle across multiple images by linking the vehicle contour frame lines of consecutive images in the sequence and numbers each vehicle. Based on the vehicle positions calculated by the coordinate positioning unit and the vehicle numbers assigned by the vehicle tracking unit, the system converts the input image sequence into an XML-format file containing the distance sequence of each front vehicle. With this technical scheme, the positions of front vehicles on the road can be detected and tracked using a single vehicle-mounted camera, which facilitates road perception, automatic obstacle avoidance and decision assistance in an automatic driving system.

Description

Front vehicle distance tracking system and method for automatic driving
Technical Field
The invention relates to the technical field of vision-assisted automatic driving, in particular to a front vehicle distance tracking system for automatic driving and a vehicle distance measuring method based on a two-dimensional image.
Background
Automatic driving based on visual assistance is the mainstream solution in current automatic driving technology. Cameras and lidars are the two most commonly used vehicle-mounted vision sensors: cameras collect two-dimensional image data, and lidars collect three-dimensional point cloud data. Compared with a camera, a lidar is currently expensive to manufacture, so the camera is the lowest-cost option for realizing vision-assisted automatic driving.
On an automatic driving platform of a vehicle-mounted camera, image-oriented road condition recognition and understanding are core problems of automatic driving. Automatic decision-making technologies such as vehicle avoidance and path planning related to automatic driving all need to acquire position information of a front vehicle. Meanwhile, by tracking the vehicles on the road ahead, the system can further serve scene understanding and driving decision tasks such as driving behavior prejudgment, early braking and the like of the vehicles ahead. Therefore, the image-based front vehicle distance tracking method and system are key technologies for automatic driving.
In the prior art, traditional graphics operators are usually adopted to extract front vehicle information from the image. This approach cannot adapt well to complex road conditions and easily produces misjudgments at night, under mottled tree shadows, on textured road surfaces, or when vehicles occlude each other. Meanwhile, the prior art mostly ignores the correlation between multiple frames: images of the same vehicle are not matched across consecutive frames, and continuous tracking of the front vehicle's position is neglected, which limits the applicability of the ranging results. Specifically, traditional front vehicle visual ranging methods face the following difficulties when assisting automatic driving decisions:
1) complexity: the application scenes of automatic driving are highly complicated; images shot under different road conditions differ in illumination, road texture and occlusion, and traditional methods lack robustness in strongly interfered, complex scenes;
2) representation capability: existing front vehicle ranging methods do not integrate vehicle tracking technology, cannot string multiple instantaneous front vehicle coordinates into a front vehicle trajectory, and therefore cannot estimate the driving behavior of the front vehicle from its trajectory.
Disclosure of Invention
The invention aims to track the trajectory of the front vehicle and calculate its position based on the image sequence shot by the vehicle-mounted camera, so as to assist automatic driving decisions.
The technical scheme of the invention provides a front vehicle distance tracking system for automatic driving, which is characterized by comprising the following components: the system comprises a data acquisition unit, a vehicle detection unit, a coordinate positioning unit and a vehicle tracking unit;
the data acquisition unit is used for extracting an image sequence with equal time intervals from the camera and simultaneously recording parameters of the camera, wherein the parameters comprise internal parameters representing the focal length, the projection center, the inclination and the distortion of the camera and external parameters representing the position translation and the position rotation of the camera;
the vehicle detection unit comprises a plurality of vehicle detection modules, wherein each vehicle detection module is responsible for processing a single image; the vehicle detection module identifies vehicles in the images by using a classifier constructed by a deep neural network model, and fits a contour outline of the vehicle by using a regressor, so that the front vehicle is positioned on a single image;
the coordinate positioning unit comprises a plurality of coordinate positioning modules, wherein each coordinate positioning module is responsible for processing a single vehicle outline frame line; for each vehicle contour outline, the coordinate positioning module transforms the coordinates of the contour outline pixels of the front vehicle into the coordinates of the vehicle body where the vehicle-mounted camera is located by a geometric transformation method according to the pixels of the four vertexes of the outline and the camera parameters obtained by the data acquisition unit, so as to determine the spatial position of the front vehicle relative to the vehicle;
the vehicle tracking unit is used for processing and identifying the same vehicle in the multi-frame images to form a driving track of each vehicle; the vehicle tracking unit identifies the same vehicle appearing in different pictures, gives the same unique id number to the vehicle, and connects the driving tracks in series, thereby realizing the front vehicle tracking.
Further, the vehicle detection unit comprises a candidate region generation module, a vehicle discrimination module and a contour outline regression module;
the candidate region generation module stores a plurality of anchor points with different sizes, and each anchor point is a rectangular region composed of a plurality of pixels;
the vehicle distinguishing module inputs the candidate area which is generated by the candidate area generating module and possibly contains the vehicle and outputs information whether the candidate area contains the vehicle or not;
and the outline regression module finely adjusts the outline of the vehicle in the picture on the basis of the candidate region coordinates according to the mask region convolution neural network regressor.
Further, the vehicle detection unit calculates the contour frame line of each vehicle in the image using a mask region convolutional neural network; the calculation of the mask region convolutional neural network is divided into three steps (an illustrative sketch follows the list):
step 1, candidate region extraction: several preset anchor points of different sizes traverse the whole image from left to right and from top to bottom, and the positions of rectangular blocks that can serve as vehicle contour frame lines are calculated;
step 2, regional object classification: for each candidate region, a convolutional neural network extracts the visual features of the region, and a multilayer perceptron judges the category of the object in the region from the visual features;
step 3, contour frame line coordinate refinement: for each candidate region, a neural network regresses the offset of the candidate region's frame line relative to the detection target's frame line, so as to further fit the detection target's contour frame line.
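As a concrete illustration of these three steps, the sketch below runs an off-the-shelf pretrained Mask R-CNN from torchvision and converts its detections into (x, y, w, h) contour frame lines. The 0.7 score threshold and the COCO class index 3 for "car" are assumptions for illustration; the patent does not specify a particular weight set or threshold.

```python
# Minimal sketch: vehicle contour frame lines from a pretrained Mask R-CNN.
# Assumes torchvision >= 0.13 (weights="DEFAULT"); COCO class 3 is "car".
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_vehicles(image_path, score_thresh=0.7):
    """Return contour frame lines (x, y, w, h) for vehicles in one picture."""
    img = to_tensor(Image.open(image_path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]          # dict: boxes, labels, scores, masks
    frames = []
    for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
        if label.item() == 3 and score.item() >= score_thresh:
            x1, y1, x2, y2 = box.tolist()
            frames.append((x1, y1, x2 - x1, y2 - y1))
    return frames
```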
Further, the coordinate locating unit is configured to:
according to the coordinates of the vehicle in the image, which are obtained by the vehicle detection unit, the coordinates of the vehicle in a world coordinate system are obtained through a camera coordinate-world coordinate transformation formula, wherein the coordinate transformation formula is as follows:
s·m = A[R|t]M
wherein M = [x_w, y_w, z_w]^T is the three-dimensional coordinate in the world coordinate system, m = [x_i, y_i]^T is the two-dimensional coordinate of the bottom center point of the vehicle contour frame line detected in the image, and R and t are respectively the rotation matrix and the translation vector in the camera external parameter matrix; A is the camera internal parameter matrix, with A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1, and all remaining entries of A equal to 0; wherein f_x, f_y, c_x, c_y are respectively the focal length of the camera in the x-axis direction, the focal length in the y-axis direction, the optical center in the x-axis direction and the optical center in the y-axis direction; s is the depth of field. Here m, A, R, t are known quantities that can be obtained from the data acquisition unit, and s, M are the unknown quantities to be solved.
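For concreteness, here is a small NumPy sketch of the forward model s·m = A[R|t]M; the intrinsic values are placeholders, not calibrated parameters:

```python
# Sketch of the pinhole projection s*m = A[R|t]M defined above.
# fx, fy, cx, cy are assumed placeholder intrinsics.
import numpy as np

fx, fy, cx, cy = 1000.0, 1000.0, 640.0, 360.0
A = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project(M, R, t):
    """World point M = [xw, yw, zw] -> pixel m = [xi, yi] and depth s."""
    p = A @ (R @ np.asarray(M, dtype=float) + t)   # equals s * [xi, yi, 1]
    s = p[2]
    return p[:2] / s, s
```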
Further, the calculation of the coordinate positioning unit is divided into 2 steps:
step 1, depth of field estimation: a point on the ground on the horizontal line through the bottom of the vehicle frame line is taken in the image; its two-dimensional coordinate in the image is m_g, and the z-direction component z_w of its world coordinate is known to be 0. Since the ground point and the bottom center point of the vehicle frame line lie on the same horizontal line of the image, the depth of field s of the two pixel points is the same, from which the first linear equation system (e31) can be derived:
s·m_g = A[R|t]M_0   (e31)
step 2, world coordinate solving: applying the coordinate transformation formula to the bottom center point m of the vehicle frame line gives the second linear equation system (e32):
s·m = A[R|t]M   (e32)
By combining the two systems, the known quantities in equation system (e31) can be used to eliminate the unknown depth of field s in equation system (e32), thereby obtaining the world coordinate M of the bottom center point of the vehicle frame line.
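Because the bottom center point of the frame line lies on the ground plane (z_w = 0), the two systems collapse into a single invertible 3×3 homography. The sketch below uses that standard construction; it is one plausible realization of the elimination described above, with the calibration inputs assumed to be available:

```python
# Sketch: back-project the frame-line bottom center to the road plane.
# With zw = 0, s*[xi, yi, 1]^T = A[r1 r2 t][xw, yw, 1]^T, so the 3x3
# matrix H = A[r1 r2 t] can simply be inverted (standard construction).
import numpy as np

def ground_point(m, A, R, t):
    """Pixel m = (xi, yi) of the frame-line bottom center -> world (X, Y)."""
    H = A @ np.column_stack([R[:, 0], R[:, 1], t])   # drop r3 since zw = 0
    w = np.linalg.inv(H) @ np.array([m[0], m[1], 1.0])
    return w[0] / w[2], w[1] / w[2]
```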
Further, the vehicle tracking unit comprises a distance calculation module and a distance matching module.
The distance calculation module calculates, for each pair of adjacent images, the pixel distances from the centers of the contour frame lines of the first frame to the centers of the contour frame lines of the second frame, based on the per-vehicle contour frame lines of the two frames; the pair of inter-frame contour frame lines with the smallest distance is regarded as the contour frame lines of the same vehicle in the two frames.
The distance matching module matches the closest vehicle contour frame lines of two adjacent pictures in the image sequence first, following the nearest-match principle. The matched pair of adjacent vehicle contour frame lines is assigned the same vehicle ID mark, the matched frame lines are then removed, and matching continues among the remaining frame lines of the two adjacent pictures by the same nearest principle, until all vehicle contour frame lines of one of the two adjacent pictures have been matched. A minimal sketch of this greedy matching follows.
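The sketch assumes contour frame lines are (x, y, w, h) tuples and vehicle IDs are integers; the helper names are illustrative:

```python
# Sketch of nearest-first (greedy) matching of contour frame lines
# between two adjacent frames.
import math

def center(b):
    x, y, w, h = b
    return (x + w / 2, y + h / 2)

def match_frames(prev, curr):
    """prev: [(frame_line, vehicle_id)]; curr: [frame_line].
    Returns [(curr_index, vehicle_id)] for the matched frame lines."""
    pairs = sorted(
        (math.dist(center(b1), center(b2)), i, j)
        for i, (b1, _) in enumerate(prev)
        for j, b2 in enumerate(curr))
    used_i, used_j, matched = set(), set(), []
    for _, i, j in pairs:                 # closest pairs first
        if i in used_i or j in used_j:
            continue                      # each frame line matched at most once
        used_i.add(i)
        used_j.add(j)
        matched.append((j, prev[i][1]))   # inherit the vehicle ID
    return matched
```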
The invention also provides a tracking method using the front vehicle distance tracking system for automatic driving, which specifically comprises the following steps:
step 1: calibrating a camera, namely calibrating the position parameters and the optical parameters of the vehicle-mounted camera and recording the position parameters and the optical parameters in system data acquisition software;
the camera position parameters comprise the distances from the fixed position of the camera to the vehicle head, the vehicle chassis and two sides of the vehicle body and the three-dimensional angle of the camera relative to the vehicle chassis;
step 2: the method comprises the following steps of identifying and positioning a front vehicle, wherein the identification and positioning of the front vehicle are realized through a vehicle detection unit and a coordinate positioning unit;
and step 3: tracking the trajectory of the front vehicle, in which vehicles that appear repeatedly in the image sequence are identified by the vehicle tracking and positioning unit, each distinct vehicle appearing in the image sequence is given a different unique ID to distinguish it, and the trajectory sequence of each vehicle is output to an XML file.
The invention has the following beneficial effects. The mask region neural network is used to identify vehicle contours in the image, which improves the robustness of vehicle detection in complex road scenes. Through coordinate transformation based on horizon-line pixel points, the three-dimensional coordinate of the vehicle is recovered from the image more accurately. Through the vehicle tracking unit, the system not only measures the distance of the front vehicle but also tracks it, so the motion trajectory of the front vehicle can be better grasped and its direction of movement anticipated, making the host vehicle's driving decisions more anticipatory and safer.
Drawings
The advantages of the above and/or additional aspects of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic block diagram of a leading vehicle distance tracking method and system for autonomous driving according to one embodiment of the present invention;
FIG. 2 is a schematic diagram of a coordinate transformation calculation process according to one embodiment of the invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
The embodiment is as follows:
embodiments of the present invention will be described below with reference to fig. 1 to 2.
As shown in fig. 1, the present embodiment provides a leading vehicle distance tracking system 100 for automatic driving, including: a data acquisition unit 10, a vehicle detection unit 20, a coordinate locating unit 30 and a vehicle tracking unit 40.
The data acquisition unit is used for extracting an image sequence with equal time intervals from the camera while recording the parameters of the camera, which comprise the internal parameters representing the focal length, projection center, inclination and distortion of the camera and the external parameters representing the position translation and rotation of the camera. In this embodiment, the data acquisition unit 10 comprises data acquisition hardware and data acquisition software. The data acquisition hardware is a camera fixed on the top of the vehicle; the lens orientation of the camera is parallel to the vehicle chassis, and the distances from the camera to the vehicle head, the chassis and the two sides of the vehicle body are measured and recorded in the data acquisition software. While the vehicle is driving, the camera is turned on to shoot the road conditions, and the captured image sequence is transmitted to the subsequent units of the system through the data acquisition software.
The data acquisition software is used for transmitting the shot road condition image sequence information from the camera and recording the parameter information of the camera, and provides data support for the processing of the subsequent units of the system. Specifically, the data acquisition software is divided into an image sequence acquisition module and a camera parameter acquisition module.
The image sequence acquisition module is used to acquire the road condition image sequence from the camera and transmit it to the subsequent units. A vehicle-mounted front camera shoots a video of the road conditions ahead, including roads, vehicles and pedestrians. The recorded video is cut into an image sequence with equal intervals according to a fixed frame rate. The image sequence contains a number of pictures, denoted "picture 1", "picture 2", ..., "picture n" in fig. 1, where n represents the total number of pictures in the image sequence. Within the image sequence the pictures follow chronological order, and the shooting-time intervals of any two adjacent pictures are equal.
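A small sketch of this slicing step using OpenCV; the 10 Hz sampling rate and the output file naming are assumptions for illustration:

```python
# Sketch: cut the on-board video into an equal-interval image sequence.
import cv2

def video_to_sequence(video_path, out_dir, sample_hz=10.0):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS is unknown
    step = max(1, round(fps / sample_hz))     # keep every step-th frame
    idx = kept = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:                   # equal time intervals
            cv2.imwrite(f"{out_dir}/picture_{kept + 1:05d}.png", frame)
            kept += 1
        idx += 1
    cap.release()
    return kept                               # n, the sequence length
```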
The camera parameter acquisition module records the external parameters and internal parameters of the camera and transmits them to the subsequent units. Specifically, the external parameters of the camera are the spatial position parameters of the camera placed on the vehicle body, stored as a rotation matrix and a translation vector; the internal parameters of the camera are the optical parameters of the camera itself, including the focal length and the optical center components along the x-axis and y-axis directions.
The vehicle detection unit 20 comprises a plurality of vehicle detection modules, wherein each vehicle detection module is responsible for processing a single image; the vehicle detection module identifies vehicles in the images by using a classifier constructed by a deep neural network model, and fits a contour outline of the vehicle by using a regressor, so that the front vehicle is positioned on a single image;
in this embodiment, the vehicle detection unit 20 includes: the n vehicle detection subunits, denoted in fig. 1 by "vehicle detection unit 21", "vehicle detection unit 22" … … "vehicle detection unit 2 n", where n again denotes the total number of pictures in the image sequence to be processed. In particular, one vehicle detection subunit is responsible for processing one picture in the image sequence. Each vehicle detection subunit has the exact same configuration, except that the pictures processed are different.
Specifically, one vehicle detection unit 20 is composed of the following functional modules: the device comprises a candidate region generation module, a vehicle discrimination module and a contour outline regression module.
The candidate region generation module is configured to: the candidate area generation module stores a plurality of anchor points with different sizes, and each anchor point is a rectangular area formed by a plurality of pixels. The candidate area generation module sequentially moves the anchor points from the upper left corner of the picture to the lower right corner of the picture from left to right and from top to bottom, and moves the anchor points one pixel unit at a time. And at each position where the anchor point moves, the candidate area generation module judges whether the rectangular area covered by the anchor point possibly contains the vehicle or not according to the pixel characteristics, and if the rectangular area covered by the anchor point possibly contains the vehicle, the position of the anchor point in the picture is recorded as the candidate area.
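A schematic sketch of the sliding-anchor scan described above; the anchor sizes are assumptions, the one-pixel stride follows the text, and looks_like_vehicle stands in for the pixel-feature test, which in practice is the network's region proposal score:

```python
# Sketch of the sliding-anchor candidate region generation.
def candidate_regions(img_w, img_h, looks_like_vehicle,
                      anchor_sizes=((64, 64), (128, 96), (256, 128))):
    """Yield (x, y, w, h) anchor placements that pass the feature test."""
    for w, h in anchor_sizes:                 # several anchor sizes
        for y in range(img_h - h + 1):        # top-to-bottom, one pixel at a time
            for x in range(img_w - w + 1):    # left-to-right, one pixel at a time
                if looks_like_vehicle(x, y, w, h):
                    yield (x, y, w, h)
```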
The vehicle discrimination module is configured to: the vehicle determination module inputs the candidate region that is generated by the candidate region generation module and is likely to contain the vehicle, and outputs information as to whether the candidate region contains the vehicle. Further, the vehicle distinguishing module extracts and classifies the pixel characteristics of the candidate region according to the mask region convolutional neural network classifier, and determines whether the candidate region contains the vehicle according to the probability of each class output by the classifier. Specifically, if the probability of the category "vehicle" is the highest among the probabilities output by the classifier, the candidate region is considered to contain a vehicle, otherwise, the candidate region is considered to contain no vehicle, and subsequent processing is not required.
The contour frame line regression module is configured to fine-tune the contour frame line of the vehicle in the picture on the basis of the candidate region coordinates, according to the mask region convolutional neural network regressor. Further, each contour frame line is represented by the coordinates (x, y) of its upper-left corner together with its width and height (w, h). Each picture fig is converted by the vehicle detection unit into a set of contour frame line parameters, one per vehicle in the picture: ((x_1, y_1, w_1, h_1), (x_2, y_2, w_2, h_2), ..., (x_k, y_k, w_k, h_k)), where the variable k indicates that picture fig contains k vehicles; k may take different values for different pictures of the image sequence.
The vehicle detection unit calculates the contour frame line of each vehicle in the image using a mask region convolutional neural network; the calculation of the mask region convolutional neural network is divided into three steps:
step 1, candidate region extraction: several preset anchor points of different sizes traverse the whole image from left to right and from top to bottom, and the positions of rectangular blocks that can serve as vehicle contour frame lines are calculated;
step 2, regional object classification: for each candidate region, a convolutional neural network extracts the visual features of the region, and a multilayer perceptron judges the category of the object in the region from the visual features;
step 3, contour frame line coordinate refinement: for each candidate region, a neural network regresses the offset of the candidate region's frame line relative to the detection target's frame line, so as to further fit the detection target's contour frame line.
The coordinate locating unit 30 includes a number of coordinate locating modules, each of which is responsible for processing a single vehicle contour outline; for each vehicle contour frame line, the coordinate positioning module transforms the contour frame line pixel coordinates of the front vehicle into the vehicle body coordinates of the vehicle-mounted camera through a geometric transformation method according to the pixels of the four vertexes of the frame line and the camera parameters obtained by the data acquisition unit (10), so that the spatial position of the front vehicle relative to the vehicle is determined;
in this embodiment, the coordinate locating unit 30 includes: n coordinate locating subunits, which are indicated in fig. 1 by the "coordinate locating unit 31" and the "coordinate locating unit 32" … … "coordinate locating unit 3 n", where n again indicates the total number of pictures in the image sequence to be processed. Specifically, a coordinate positioning subunit is responsible for processing a group of contour outline parameter sets obtained by a picture in the image sequence through a vehicle detection unit. Each coordinate positioning subunit has the exact same configuration except that the set of outline frame parameters being processed is different.
Specifically, the coordinate locating unit is configured to: according to the coordinates of the vehicle outline frame line in the image, which are obtained by the vehicle detection unit, the coordinates of the vehicle in a world coordinate system are obtained through a camera coordinate-world coordinate transformation formula, wherein the coordinate transformation formula is as follows:
s·m = A[R|t]M
wherein M = [x_w, y_w, z_w]^T is the three-dimensional coordinate in the world coordinate system, m = [x_i, y_i]^T is the two-dimensional coordinate of the point in the picture shot by the camera, and R and t are respectively the rotation matrix and the translation vector in the camera external parameter matrix. A is the camera internal parameter matrix, with A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1 and all remaining entries of A equal to 0, where f_x, f_y, c_x, c_y are respectively the focal length of the camera in the x-axis direction, the focal length in the y-axis direction, the optical center in the x-axis direction and the optical center in the y-axis direction. s is the depth of field.
According to the above formula, the calculation of the coordinate positioning unit is divided into 2 steps: a depth of field estimation step and a world coordinate solving step. The depth of field estimation step takes a reference point on the ground on the horizontal line through the bottom of the vehicle frame line in the image; its two-dimensional coordinate in the image is m_g, and the z-direction component z_w of its world coordinate is 0. The ground point and the bottom center point of the vehicle frame line lie on the same horizontal line of the image, so the depth of field s of the two pixel points is the same. From this, the first linear equation system (e31) can be derived:
s·m_g = A[R|t]M_0   (e31)
wherein M_0 = [x_0, y_0, 0]^T is the three-dimensional coordinate of the reference point in the world coordinate system, m_g = [x_g, y_g]^T is its two-dimensional coordinate in the picture shot by the camera, and A, R, t and s are as defined above.
The world coordinate solving step applies the coordinate transformation formula to the bottom center point m of the vehicle frame line, giving the second linear equation system (e32):
s·m = A[R|t]M   (e32)
wherein M = [x_w, y_w, z_w]^T is the three-dimensional coordinate, in the world coordinate system, of the bottom center of the vehicle contour frame line detected in the image, m = [x_i, y_i]^T is its two-dimensional coordinate in the image, and A, R, t and s are as defined above.
By combining the two equations, the known quantity in equation set (e31) can be used to eliminate the unknown quantity depth of field s in equation set (e32) to find the world coordinate M of the bottom center point of the vehicle frame line.
Further, for each vehicle contour frame line (x, y, w, h) in each image, the coordinate of the center point of the contour bottom is calculated as m = (x + w/2, y + h). The ground reference point m_g = (x', y + h) lies on the same horizontal line as m, so the two share the same depth of field. Since m, m_g and A, R, t are all given, the simultaneous equation systems (e31, e32) can be solved for the world coordinates (X, Y, Z) of the bottom of the trailing edge of the vehicle body.
The vehicle tracking unit 40 is responsible for processing and identifying the same vehicle in the multi-frame images to form a driving track of each vehicle; the vehicle tracking unit 40 identifies the same vehicle, gives a unique id number to the same vehicle, and connects the trajectories in series, thereby realizing the tracking of the preceding vehicle.
For each pair of adjacent frames, the pixel distances from the centers of the contour frame lines of the first frame to the centers of the contour frame lines of the second frame are calculated from the per-vehicle contour frame lines of the two frames, and the closest inter-frame pair is regarded as the contour frame lines of the same vehicle in the two frames. Performing this calculation for every two adjacent images of the whole image sequence yields all inter-frame matched contour frame lines, which correspond to all the distinct vehicles in the image sequence; for each vehicle, stringing together its contour frame lines in each frame gives that vehicle's driving trajectory.
In this embodiment, the vehicle tracking unit 40 includes: the distance calculating module and the distance matching module.
Specifically, the distance calculation module calculates the pixel distance from the centers of a plurality of contour frame lines of a first frame to a plurality of contour frame lines of a second frame between two frames of images according to the contour frame lines of each vehicle of the two frames of images, and a group of inter-frame matching contour frame lines closest to each other are regarded as the contour frame lines of the two frames of images of the same vehicle. And performing distance calculation on every two adjacent pictures of the whole image sequence to obtain the distance of all corresponding vehicle outline frame lines in the two adjacent pictures.
Specifically, the distance matching module is configured to: match the closest vehicle contour frame lines of two adjacent pictures in the image sequence first, following the nearest-match principle; assign the matched pair of adjacent vehicle contour frame lines the same vehicle ID mark; then remove the matched frame lines and continue matching among the remaining contour frame lines of the two adjacent pictures by the same nearest principle, until all vehicle contour frame lines of one of the two pictures have been matched, ignoring the remaining vehicle contour frame lines of the other picture. Performing this matching operation on all adjacent picture pairs of the whole image sequence yields all inter-frame matched contour frame lines together with the unique ID of the vehicle each corresponds to. For each vehicle with a unique ID, stringing together its contour frame lines in each frame gives that vehicle's driving trajectory.
Further, the distance calculation module is configured to: for two adjacent images fig_1 and fig_2 of the same image sequence, with vehicle contour frame line sequences
(b_1^(1), b_2^(1), ..., b_k^(1)) and (b_1^(2), b_2^(2), ..., b_{k'}^(2)),
where k and k' refer to the numbers of vehicle contour frame lines calculated by the vehicle detection unit 20 for images fig_1 and fig_2 respectively, define the distance d(b_i^(1), b_j^(2)) as the pixel distance between the centers of the two frame lines. Evaluating d over all pairs gives the distances between the k × k' pairs of vehicle contour frame lines.
Further, the distance matching module is configured to: for each vehicle contour frame line b_i^(1) of image fig_1, find the matching contour frame line b_j^(2) of image fig_2 according to the nearest-distance principle. Sequentially computing the matching for all pairs of adjacent images in an image sequence yields a number of continuous matching strings of the form
(b^(t), b^(t+1), ..., b^(t+K-1)),
where K is the number of occurrences of vehicle i in the image sequence. By integrating the results of the coordinate positioning unit 30, the coordinate series of vehicle i can be obtained:
((X_t^i, Y_t^i), (X_{t+1}^i, Y_{t+1}^i), ..., (X_T^i, Y_T^i))
And outputting the coordinate sequence in an XML format as a final output result of the front vehicle distance tracking system.
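One plausible way to serialize the coordinate series to XML with the Python standard library; the element and attribute names are assumptions, since the exact schema is not specified here:

```python
# Sketch: write {vehicle_id: [(t, X, Y), ...]} trajectories to an XML file.
import xml.etree.ElementTree as ET

def tracks_to_xml(tracks, out_path):
    root = ET.Element("vehicles")
    for vid, series in tracks.items():
        v = ET.SubElement(root, "vehicle", id=str(vid))
        for t, x, y in series:
            ET.SubElement(v, "point", t=str(t), x=f"{x:.2f}", y=f"{y:.2f}")
    ET.ElementTree(root).write(out_path, encoding="utf-8",
                               xml_declaration=True)
```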
The embodiment also provides a front vehicle distance tracking method for automatic driving, which specifically comprises the following steps:
step 1: and (3) calibrating a camera, wherein the camera calibration step requires that the position parameters and the optical parameters of the vehicle-mounted camera are calibrated and recorded in system data acquisition software.
Specifically, the camera position parameters comprise the distances from the camera's mounting position to the vehicle head, the vehicle chassis and the two sides of the vehicle body, together with the three-dimensional orientation angle of the camera relative to the vehicle chassis. The distances from the camera to the vehicle head, the chassis and the two sides of the body are represented by the translation vector t in the camera external parameter matrix, and the orientation angle of the camera relative to the chassis is represented by the rotation matrix R in the camera external parameter matrix.
The optical parameters of the camera are represented by the camera internal parameter matrix A, where A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1 and all remaining entries of A are 0, wherein f_x, f_y, c_x, c_y are respectively the focal length of the camera in the x-axis direction, the focal length in the y-axis direction, the optical center in the x-axis direction and the optical center in the y-axis direction.
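The internal parameter matrix A can be estimated with a standard checkerboard calibration, for example with OpenCV; the board geometry, square size and file paths below are assumptions, as the patent does not prescribe a particular calibration procedure:

```python
# Sketch: checkerboard calibration of the internal parameter matrix A.
import glob
import cv2
import numpy as np

def calibrate(pattern=(9, 6), square=0.025, images="calib/*.png"):
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)
    objp *= square                            # checkerboard square size (m)
    obj_pts, img_pts, size = [], [], None
    for path in glob.glob(images):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
            size = gray.shape[::-1]
    _, A, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    return A, dist    # A contains fx, fy, cx, cy; dist the distortion terms
```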
Step 2: the recognition and positioning of the preceding vehicle is realized by the vehicle detection unit 20 and the coordinate positioning unit 30.
Specifically, a set of image sequences contains n equally time-spaced pictures: "picture 1", "picture 2", ..., "picture n", and the n vehicle detection subunits correspond one-to-one to the frames of the image sequence. Each vehicle detection subunit detects the contour frame lines of all vehicles in one picture; each contour frame line is represented by the coordinates of its upper-left corner together with its width and height. For a picture containing k vehicles, the front vehicle identification and positioning step produces the following set of k contour frame lines:
((x_1, y_1, w_1, h_1), (x_2, y_2, w_2, h_2), ..., (x_k, y_k, w_k, h_k))
where x, y, w, h respectively represent the abscissa of the top-left pixel, the ordinate of the top-left pixel, the width and the height of the contour frame line, and the subscripts 1, 2, ..., k correspond to the 1st, 2nd, ..., k-th vehicles in the picture.
For each contour frame line in each frame of picture, the coordinate positioning unit first obtains, from the frame line parameters (x, y, w, h), the two-dimensional coordinate m = (x + w/2, y + h) of the bottom midpoint of the vehicle contour frame line in the picture. The coordinate positioning unit then converts this two-dimensional bottom midpoint into the three-dimensional coordinate M = (X, Y, 0) in the real-world coordinate system, where X and Y are the lateral and longitudinal distances of the front vehicle relative to the host vehicle.
And step 3: the trajectory of the vehicle in front is tracked. The vehicle tracking and positioning unit identifies vehicles that appear repeatedly in the image sequence, endows each distinct vehicle appearing in the sequence with a different unique ID to distinguish it, and outputs the trajectory sequence of each vehicle to an XML file.
Each frame of picture in the image sequence is converted into a plurality of coordinate points, which respectively represent the position of the vehicle ahead in the frame of picture, through step 2. The vehicle tracking and positioning unit identifies the same vehicle in two adjacent frames, so that the vehicle coordinates in different pictures of the whole image sequence are arranged into a plurality of continuous tracks, and each track corresponds to a front vehicle with a unique ID. One track is in the shape of
((X_t^i, Y_t^i), (X_{t+1}^i, Y_{t+1}^i), ..., (X_T^i, Y_T^i))
where X and Y represent, respectively, the lateral and longitudinal distance of the front vehicle relative to the host vehicle, the subscripts t, t+1, ..., T represent the continuous time sequence during which the vehicle is present in the camera view, and the superscript i represents the vehicle's unique ID. Each vehicle trajectory is ultimately saved to an XML file.
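A minimal illustrative XML file of this kind, with assumed tag names since the exact schema is not reproduced here, might look like:

```xml
<!-- Illustrative only: tag and attribute names are assumptions. -->
<vehicles>
  <vehicle id="1">
    <point t="12" x="-1.80" y="23.50"/>
    <point t="13" x="-1.70" y="22.90"/>
    <point t="14" x="-1.70" y="22.10"/>
  </vehicle>
  <vehicle id="2">
    <point t="13" x="2.10" y="41.00"/>
    <point t="14" x="2.00" y="40.20"/>
  </vehicle>
</vehicles>
```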
The steps in the present application may be reordered, combined, or deleted according to actual requirements.
The units in the device can be merged, divided and deleted according to actual requirements.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and is not intended to limit the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.

Claims (4)

1. A front-vehicle distance tracking system for autonomous driving, the system comprising: the system comprises a data acquisition unit (10), a vehicle detection unit (20), a coordinate positioning unit (30) and a vehicle tracking unit (40);
the data acquisition unit (10) is used for extracting an image sequence at equal time intervals from the camera and simultaneously recording parameters of the camera, wherein the parameters comprise internal parameters representing the focal length, the projection center, the inclination and the distortion of the camera and external parameters representing the position translation and the position rotation of the camera;
the vehicle detection unit (20) comprises a plurality of vehicle detection modules, wherein each vehicle detection module is responsible for processing a single image; the vehicle detection module identifies vehicles in the images by using a classifier constructed by a deep neural network model, and fits a contour outline of the vehicle by using a regressor, so that the front vehicle is positioned on a single image;
the coordinate locating unit (30) comprises a plurality of coordinate locating modules, wherein each coordinate locating module is responsible for processing a single vehicle contour outline; for each vehicle contour frame line, the coordinate positioning module transforms the contour frame line pixel coordinates of the front vehicle into the vehicle body coordinates of the vehicle-mounted camera through a geometric transformation method according to the pixels of the four vertexes of the frame line and the camera parameters obtained by the data acquisition unit (10), so that the spatial position of the front vehicle relative to the vehicle is determined;
the coordinate positioning unit (30) is configured to:
according to the coordinates of the vehicle in the image, which are obtained by the vehicle detection unit (20), the coordinates of the vehicle in a world coordinate system are obtained through a camera coordinate-world coordinate transformation formula, wherein the coordinate transformation formula is as follows:
s·m = A[R|t]M
wherein M = [x_w, y_w, z_w]^T is the three-dimensional coordinate in the world coordinate system, m = [x_i, y_i]^T is the two-dimensional coordinate of the bottom center point of the vehicle contour frame line detected in the image, R and t are respectively the rotation matrix and the translation vector in the camera external parameter matrix, A is the camera internal parameter matrix with A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1 and all remaining entries of A equal to 0, wherein f_x, f_y, c_x, c_y are respectively the focal length of the camera in the x-axis direction, the focal length in the y-axis direction, the optical center in the x-axis direction and the optical center in the y-axis direction, and s is the depth of field; wherein m, A, R, t are known quantities that can be acquired from the data acquisition unit (10), and s, M are the unknown quantities to be solved;
the calculation of the coordinate positioning unit is divided into 2 steps:
step 1, depth of field estimation: a point on the ground on the horizontal line through the bottom of the vehicle frame line is taken in the image; its two-dimensional coordinate in the image is m_g, and the z-direction component z_w of its world coordinate is known to be 0; since the ground point and the bottom center point of the vehicle frame line lie on the same horizontal line of the image, the depth of field s of the two pixel points is the same, from which the first linear equation system (e31) can be derived:
s·m_g = A[R|t]M_0   (e31)
step 2, world coordinate solving: applying the coordinate transformation formula to the bottom center point m of the vehicle frame line gives the second linear equation system (e32):
s·m = A[R|t]M   (e32)
by combining the two systems, the known quantities in equation system (e31) can be used to eliminate the unknown depth of field s in equation system (e32), thereby obtaining the world coordinate M of the bottom center point of the vehicle frame line;
The vehicle tracking unit (40) is responsible for processing and identifying the same vehicle in the multi-frame images to form the driving track of each vehicle; the vehicle tracking unit (40) identifies the same vehicle appearing in different pictures, gives a unique id number to the vehicle, and connects the driving tracks in series, thereby realizing the tracking of the front vehicle;
the vehicle tracking unit (40) comprises a distance calculation module and a distance matching module;
the distance calculation module calculates the pixel distance from the centers of a plurality of contour frame lines of a first frame to a plurality of contour frame lines of a second frame between two frames of images according to the contour frame lines of each vehicle of the two frames of images, and a group of inter-frame matching contour frame lines with the closest distance are regarded as the contour frame lines of the two frames of images of the same vehicle;
the distance matching module preferentially matches the vehicle contour frame line closest to the two adjacent images in the image sequence according to a closest matching principle, a group of adjacent vehicle contour frame lines obtained through matching are endowed with the same vehicle ID mark, then the matched vehicle contour frame lines are removed, and the matching is continued in the remaining vehicle contour frame lines in the two adjacent images according to the closest principle until all the vehicle contour frame lines in one image in the two adjacent images are completely matched.
2. The preceding vehicle distance tracking system for autonomous driving according to claim 1, characterized in that the vehicle detection unit (20) includes a candidate region generation module, a vehicle discrimination module, and a contour outline regression module;
the candidate region generation module stores a plurality of anchor points with different sizes, and each anchor point is a rectangular region composed of a plurality of pixels;
the vehicle distinguishing module inputs the candidate area which is generated by the candidate area generating module and possibly contains the vehicle and outputs information whether the candidate area contains the vehicle or not;
and the outline regression module finely adjusts the outline of the vehicle in the image on the basis of the candidate region coordinates according to the mask region convolution neural network regressor.
3. The front vehicle distance tracking system for autonomous driving of claim 2,
according to the vehicle detection unit, the contour frame line of the vehicle in the image is calculated using a mask region convolutional neural network; the calculation of the mask region convolutional neural network is divided into three steps:
step 1, candidate region extraction: several preset anchor points of different sizes traverse the whole image from left to right and from top to bottom, and the positions of rectangular blocks that can serve as vehicle contour frame lines are calculated;
step 2, regional object classification: for each candidate region, a convolutional neural network extracts the visual features of the region, and a multilayer perceptron judges the category of the object in the region from the visual features;
step 3, contour frame line coordinate refinement: for each candidate region, a neural network regresses the offset of the candidate region's frame line relative to the detection target's frame line, so as to further fit the detection target's contour frame line.
4. A method for tracking using the system for tracking a distance to a leading vehicle for automatic driving of claim 1, comprising the steps of:
step 1: calibrating a camera, namely calibrating the position parameters and the optical parameters of the vehicle-mounted camera and recording the position parameters and the optical parameters in system data acquisition software;
the camera position parameters comprise the distances from the fixed position of the camera to the vehicle head, the vehicle chassis and two sides of the vehicle body and the three-dimensional angle of the camera relative to the vehicle chassis;
step 2: the method comprises the steps of identifying and positioning a front vehicle, wherein the identification and positioning of the front vehicle are realized through a vehicle detection unit (20) and a coordinate positioning unit (30);
and step 3: tracking the trajectory of the front vehicle: vehicles that appear repeatedly in the image sequence are identified by the vehicle tracking and positioning unit, each distinct vehicle appearing in the image sequence is given a different unique ID to distinguish it, and the trajectory sequence of each vehicle is output to an XML file.
CN201910953010.XA 2019-10-09 2019-10-09 Front vehicle distance tracking system and method for automatic driving Active CN110717445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910953010.XA CN110717445B (en) 2019-10-09 2019-10-09 Front vehicle distance tracking system and method for automatic driving

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910953010.XA CN110717445B (en) 2019-10-09 2019-10-09 Front vehicle distance tracking system and method for automatic driving

Publications (2)

Publication Number Publication Date
CN110717445A CN110717445A (en) 2020-01-21
CN110717445B true CN110717445B (en) 2022-08-23

Family

ID=69212304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910953010.XA Active CN110717445B (en) 2019-10-09 2019-10-09 Front vehicle distance tracking system and method for automatic driving

Country Status (1)

Country Link
CN (1) CN110717445B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950395B (en) * 2020-07-24 2023-11-24 中南大学 Vehicle identification method and device and computer storage medium
CN112346039A (en) * 2020-09-15 2021-02-09 深圳市点创科技有限公司 Monocular camera-based vehicle distance early warning method, electronic device and storage medium
CN112733778B (en) * 2021-01-18 2021-08-10 国汽智控(北京)科技有限公司 Vehicle front guide determination method and device and computer equipment
CN113160187B (en) * 2021-04-27 2022-02-15 圣名科技(广州)有限责任公司 Fault detection method and device of equipment
CN113657265B (en) * 2021-08-16 2023-10-10 长安大学 Vehicle distance detection method, system, equipment and medium
CN115100839B (en) * 2022-07-27 2022-11-01 苏州琅日晴传媒科技有限公司 Monitoring video measured data analysis safety early warning system
CN118334619A (en) * 2024-04-11 2024-07-12 清华大学 Intelligent networking bus multi-vehicle formation sensing method and device based on monocular camera

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108230393A (en) * 2016-12-14 2018-06-29 贵港市瑞成科技有限公司 A kind of distance measuring method of intelligent vehicle forward vehicle
CN108259764A (en) * 2018-03-27 2018-07-06 百度在线网络技术(北京)有限公司 Video camera, image processing method and device applied to video camera
CN109325467A (en) * 2018-10-18 2019-02-12 广州云从人工智能技术有限公司 A kind of wireless vehicle tracking based on video detection result
KR101986592B1 (en) * 2019-04-22 2019-06-10 주식회사 펜타게이트 Recognition method of license plate number using anchor box and cnn and apparatus using thereof
CN110096960A (en) * 2019-04-03 2019-08-06 罗克佳华科技集团股份有限公司 Object detection method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3621850B1 (en) * 2017-06-05 2023-08-30 Adasky, Ltd. Shutterless far infrared (fir) camera for automotive safety and driving systems
US10474988B2 (en) * 2017-08-07 2019-11-12 Standard Cognition, Corp. Predicting inventory events using foreground/background processing

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108230393A (en) * 2016-12-14 2018-06-29 贵港市瑞成科技有限公司 A kind of distance measuring method of intelligent vehicle forward vehicle
CN108259764A (en) * 2018-03-27 2018-07-06 百度在线网络技术(北京)有限公司 Video camera, image processing method and device applied to video camera
CN109325467A (en) * 2018-10-18 2019-02-12 广州云从人工智能技术有限公司 A kind of wireless vehicle tracking based on video detection result
CN110096960A (en) * 2019-04-03 2019-08-06 罗克佳华科技集团股份有限公司 Object detection method and device
KR101986592B1 (en) * 2019-04-22 2019-06-10 주식회사 펜타게이트 Recognition method of license plate number using anchor box and cnn and apparatus using thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Vehicle Recognition and Detection Based on Improved Mask R-CNN; 白宝林 (Bai Baolin); China Master's Theses Full-text Database, Information Science and Technology; 2018-08-15; text pp. 16-34 *

Also Published As

Publication number Publication date
CN110717445A (en) 2020-01-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant