CN113436262A - Vision-based vehicle target position and attitude angle detection method - Google Patents
- Publication number
- CN113436262A (publication number); CN202110771236.5A (application number)
- Authority
- CN
- China
- Prior art keywords
- vehicle
- coordinate system
- image
- camera
- pixel
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C11/00—Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
- G01C11/04—Interpretation of pictures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
Abstract
The invention provides a vision-based method for detecting a vehicle target's position and attitude angle. A semantic segmentation method first classifies the image pixel by pixel to extract all vehicle pixels; these pixels are then processed to obtain accurate position coordinate information and attitude information for the vehicles. The invention is low-cost and highly accurate: because semantic segmentation makes a pixel-level judgment of the target, it yields accurate information about the vehicle's outer contour; good detection results can be obtained in different application settings, and remote upgrade iterations can be performed as new scenes are added.
Description
Technical Field
The invention provides a vision-based method for detecting a vehicle target's position and attitude angle, and belongs to the field of vehicle target detection.
Background
Detecting the position and orientation of a vehicle target is a difficult problem currently faced in vision-based target detection. It refers to using visual image-processing algorithms, together with an externally mounted camera, to detect and identify the position and orientation angle of specific vehicles in an image relative to the camera. The current mainstream approach detects a two-dimensional bounding box of the vehicle target in the image and estimates the target's position and attitude orientation from that box. This approach easily produces a large distance error in the detected vehicle position when converting between coordinate spaces, and estimating the target's attitude orientation is difficult. As shown in Fig. 1, because the two-dimensional bounding box cannot tightly enclose the vehicle, it cannot accurately describe the vehicle's image information.
The main conventional vision-based target detection method acquires a two-dimensional bounding box of the vehicle target in the image with a CNN convolutional neural network (YOLO, SSD, Faster R-CNN, etc.). The general flow is as follows: first, the input image undergoes preprocessing operations such as resizing; the preprocessed image is then passed through neural-network inference to obtain candidate two-dimensional bounding boxes for all vehicle targets; in a post-processing stage, duplicate bounding boxes for each vehicle target are filtered out; finally, the lower edge of the bounding box is taken as the vehicle target's grounding-point coordinate in the image and converted into the vehicle coordinate system to output the corresponding position distance. Acquiring the vehicle's attitude information from a two-dimensional bounding box is difficult.
The conventional target detection methods described in the previous paragraph have the following disadvantages:
Disadvantage 1: distance measurement of the vehicle target position is inaccurate, with large error. Conventional methods represent the detected vehicle with a two-dimensional bounding box, but the box's lower edge is not the vehicle's grounding point, so the detected position distance has a large error relative to the true value. Moreover, the bounding box often does not fit the vehicle's outer contour well and may be inconsistently sized or offset, so it cannot represent the vehicle accurately.
Disadvantage 2: the attitude orientation of the target vehicle cannot be detected effectively. Conventional methods detect only the two-dimensional extent of the vehicle target in the width and height directions, and the target's attitude orientation is hard to recover.
Disclosure of Invention
With the growing number of new-energy electric vehicles, the demand for automatic charging is increasing. An automatic charging device designed for electric vehicles saves the driver's manual operation time and provides a good user experience; it also avoids the charging-gun handle being touched by many people, which reduces how often the charging equipment must be disinfected and lowers the probability of spreading the novel coronavirus.
In the automatic charging device, vision-based (camera) recognition of the electric vehicle's charging port is the most central functional link, ensuring that the charging gun is accurately inserted into the vehicle's charging port. This function can be divided into two main steps: detecting the position and attitude of the vehicle to be charged, and identifying the hole position of the vehicle's charging port.
The detection precision of the position and attitude of the vehicle to be charged determines the success rate of identifying the charging-port hole position. Only by accurately detecting the actual position and attitude angle of the vehicle relative to the parking space, combined with the vehicle's own information (such as the position of the charging port on the vehicle body and the body height), can the accurate position of the charging port be obtained, ensuring correct identification of the charging hole position and finally guiding the charging gun to complete automatic charging.
Therefore, the invention aims to design a vision-based method for detecting the position and attitude angle of the vehicle to be charged that achieves higher distance precision for the target position, makes the target's attitude orientation easier to detect, and is finally applied to locating the vehicle's automatic charging port.
The invention remedies the shortcomings of the mainstream target detection methods and provides a new method for detecting a vehicle target's position and attitude orientation. It can greatly improve the position precision and attitude-angle precision of the vehicle target and ensure a high success rate of automatic charging.
The method comprises the following specific steps:
A vision-based detection method for a vehicle target position and attitude angle comprises the following steps:
S1, acquiring raw data:
collecting images through a camera as raw data input;
S2, image preprocessing:
the image preprocessing mainly consists of scaling and cropping the image to the required size;
S3, semantic information processing of the image:
in the vehicle detection and inference process, a CNN convolutional neural network is used to detect vehicle pixels in the image; compared with existing methods, this semantic segmentation approach can accurately extract the vehicle's pixel information.
S4, data post-processing:
when the number of vehicle pixels meets the requirement, the minimum circumscribed rectangle of the vehicle contour obtained in step S3 is extracted; this yields the vehicle's coordinate information in image space and the angle between the rectangle and the image horizontal. A relational expression between the vehicle's angle in the image and its true attitude angle is established, and this relation is then used to compute the vehicle's attitude angle in the world coordinate system from the angle measured in the image.
S5, camera calibration:
calibrating the camera by the Zhang Zhengyou method to obtain the camera's intrinsic and extrinsic parameters, and establishing the coordinate transformation relation between the pixel coordinate system and the world coordinate system.
The Zhang Zhengyou calibration method describes how the coordinates of a point P in space are converted from the world coordinate system to the pixel coordinate system on the camera's imaging plane; this conversion involves four coordinate systems: the world coordinate system, the camera coordinate system, the image coordinate system and the pixel coordinate system.
In the transformation from the world coordinate system to the camera coordinate system, P(X_W, Y_W, Z_W) denotes the coordinates of point P in the world coordinate system. R is a 3×3 rotation matrix representing the rotation of the camera coordinate system about the X, Y and Z axes relative to the world coordinate system, and T is a 3×1 offset vector representing the offset of the camera coordinate system's origin relative to the world origin, so that (X_C, Y_C, Z_C)^T = R·(X_W, Y_W, Z_W)^T + T. The coordinates of the transformed point P in the camera coordinate system are (X_C, Y_C, Z_C).
From the camera coordinate system to the image coordinate system, the 3D coordinates are projected onto the 2D image plane: x = f·X_C/Z_C, y = f·Y_C/Z_C, where Z_C denotes the depth of point P and f is the focal length of the camera.
From the image coordinate system to the pixel coordinate system, (u, v) denotes the pixel coordinates of point P, and (u0, v0) denotes the coordinates of the image-coordinate origin in the pixel coordinate system, in pixels; dx and dy indicate how many mm one column and one row represent respectively, i.e. 1 pixel = dx mm in the column direction. With fx = f/dx and fy = f/dy, the pixel coordinates are u = x/dx + u0 and v = y/dy + v0.
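The chain of transformations above (world → camera → image → pixel) can be sketched in a few lines of numpy. The rotation R, offset T, focal lengths fx, fy and principal point (u0, v0) below are illustrative assumed values, not calibration results from the patent:

```python
import numpy as np

# Assumed example calibration -- not values from the patent.
R = np.eye(3)                      # world -> camera rotation (identity here)
T = np.array([0.0, 0.0, 1.0])      # camera-origin offset, metres
fx, fy = 1000.0, 1000.0            # fx = f/dx, fy = f/dy, in pixels
u0, v0 = 640.0, 360.0              # principal point, in pixels

def world_to_pixel(Pw):
    """Project a 3D world point to pixel coordinates (u, v)."""
    Xc, Yc, Zc = R @ np.asarray(Pw, dtype=float) + T   # world -> camera
    u = fx * Xc / Zc + u0                              # camera -> image -> pixel
    v = fy * Yc / Zc + v0
    return u, v

# A point 1 m ahead in world coordinates (2 m in front of the camera).
u, v = world_to_pixel([0.0, 0.0, 1.0])
```

With the identity rotation, a point on the optical axis lands exactly on the principal point, which is a quick sanity check for any calibration of this form.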
S6, acquiring the coordinates of the vehicle charging port:
the front-most pixel set of the vehicle contour is extracted and averaged to obtain the coordinates of the vehicle head. The coordinates of the vehicle charging port are then obtained by conversion using vehicle information such as the model and the (a priori fixed) position of the charging port on the vehicle body, and the world coordinate value is computed using the calibrated camera parameter matrix.
S7, obtaining the attitude angle of the vehicle:
using the established polynomial relation between the vehicle's angle in the image and the real vehicle attitude angle, the angle measured in the image is converted into the attitude angle in the world coordinate system.
The technical effects of applying the invention are:
1. Low cost: the sensor is an ordinary camera, which is inexpensive, usable in a wide range of environments, and long-lived.
2. High detection precision: because semantic segmentation makes a pixel-level judgment of the target, it yields accurate information about the vehicle's outer contour; the position and attitude-orientation information obtained by this segmentation-based visual detection method is therefore highly accurate.
3. Wide application scenarios: because a deep-learning method is used for detection and the data set richly covers many vehicle types and scenes, good detection results can be obtained in different application settings.
4. Easy product iteration: remote upgrade iterations can be performed as new scenes are added.
Drawings
FIG. 1 shows the image acquisition process of an embodiment;
FIG. 2 shows the semantic segmentation and semantic information processing of an embodiment;
FIG. 3 shows the data post-processing and attitude-angle determination of an embodiment;
FIG. 4 shows the vehicle charging-port coordinate acquisition of an embodiment.
Detailed Description
The specific technical scheme of the invention is described below in conjunction with an embodiment.
A vision-based detection method for a vehicle target position and attitude angle comprises the following steps:
S1, acquiring raw data:
collecting images through a camera as raw data input; the camera capture and preprocessing crop are shown in FIG. 1.
S2, image preprocessing:
the image preprocessing mainly consists of scaling and cropping the image to the required size;
S3, semantic information processing of the image:
in the vehicle detection and inference process, a CNN convolutional neural network is used to detect vehicle pixels in the image; compared with existing methods, this semantic segmentation approach can accurately extract the vehicle's pixel information. FIG. 2 shows the result of semantic segmentation using the trained model;
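The pixel-level extraction that follows segmentation can be sketched as below. This assumes the network outputs an integer label map; the vehicle class id (1) and the minimum-pixel threshold (50) are illustrative assumptions, not values specified in the patent:

```python
import numpy as np

VEHICLE_CLASS = 1   # assumed class id for "vehicle" in the label map
MIN_PIXELS = 50     # assumed threshold for "number of pixels meets the requirement"

def extract_vehicle_mask(label_map):
    """Return a boolean mask of vehicle pixels, or None if too few were found."""
    mask = (label_map == VEHICLE_CLASS)
    if mask.sum() < MIN_PIXELS:
        return None     # not enough evidence of a vehicle in this frame
    return mask

# Toy label map: a 10x20 vehicle blob inside a 100x100 image.
labels = np.zeros((100, 100), dtype=np.int32)
labels[40:50, 30:50] = VEHICLE_CLASS
mask = extract_vehicle_mask(labels)
```

Gating on a minimum pixel count is one simple way to realise the "when the number of pixel points meets the requirement" condition of step S4.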
S4, data post-processing:
when the number of vehicle pixels meets the requirement, the minimum circumscribed rectangle of the vehicle contour obtained in step S3 is extracted; this yields the vehicle's coordinate information in image space and the angle between the rectangle and the image horizontal. A relational expression between the vehicle's angle in the image and its true attitude angle is established, and this relation is then used to compute the vehicle's attitude angle in the world coordinate system from the angle measured in the image. FIG. 3 shows the result of contour recognition and drawing the minimum bounding rectangle; the coordinates of the rectangle's four vertices (clockwise from the top-left corner) are [594,100], [1204,53], [1266,859], [656,905], the head centre point is [961,882], and the vehicle's tilt attitude angle is -4.3987°.
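The in-image angle of the segmented vehicle can be estimated, for example, from the principal axis of the vehicle pixels. The PCA sketch below is one hedged way to obtain the angle between the vehicle and the image horizontal; in practice a minimum-area-rectangle routine (such as OpenCV's minAreaRect) could be used instead, as the patent's rectangle-based description suggests:

```python
import numpy as np

def mask_orientation_deg(mask):
    """Angle (degrees) between the vehicle's principal axis and the image
    horizontal, estimated by PCA over the segmented pixel coordinates."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)                      # centre the point cloud
    cov = pts.T @ pts / len(pts)                 # 2x2 covariance of pixel coords
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]       # direction of largest spread
    return float(np.degrees(np.arctan2(major[1], major[0])))

# Toy mask: an axis-aligned 10x40 blob -> principal axis is horizontal.
mask = np.zeros((100, 100), dtype=bool)
mask[45:55, 20:60] = True
angle = mask_orientation_deg(mask)
```

Because a principal axis has no preferred direction, the returned angle is only meaningful modulo 180°, which is sufficient for the image-angle-to-attitude-angle relation of step S7.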
S5, camera calibration:
calibrating the camera by the Zhang Zhengyou method to obtain the camera's intrinsic and extrinsic parameters, and establishing the coordinate conversion relation between the pixel coordinate system and the world coordinate system.
S6, acquiring the coordinates of the vehicle charging port:
the front-most pixel set of the vehicle contour is extracted and averaged to obtain the coordinates of the vehicle head. The coordinates of the vehicle charging port are then obtained by conversion using vehicle information such as the model and the (a priori fixed) position of the charging port on the vehicle body, and the world coordinate value is computed using the calibrated camera parameter matrix. FIG. 4 shows the charging-port position calculated from the fixed relationship between the vehicle-head centre and the charging-port centre; its coordinate value is [888.6, -12.4].
S7, obtaining the attitude angle of the vehicle:
using the established polynomial relation between the vehicle's angle in the image and the real vehicle attitude angle, the angle measured in the image is converted into the attitude angle in the world coordinate system.
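The polynomial relation of step S7 can be sketched with numpy's polyfit. The calibration pairs below are fabricated illustrative numbers standing in for measured ground truth, and the quadratic degree is likewise an assumption:

```python
import numpy as np

# Assumed calibration pairs: (angle in image, true attitude angle), degrees.
img_angles  = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
true_angles = np.array([-12.1, -6.0, 0.1, 6.2, 12.0])

coeffs = np.polyfit(img_angles, true_angles, deg=2)   # fit a quadratic relation

def attitude_angle(image_angle_deg):
    """Map an angle measured in the image to the vehicle attitude angle
    in the world coordinate system via the fitted polynomial."""
    return float(np.polyval(coeffs, image_angle_deg))

est = attitude_angle(-4.4)   # e.g. near the -4.3987 deg measured in Fig. 3
```

Once fitted offline, the coefficients are a cheap lookup at run time, which suits the per-frame post-processing of step S4.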
Claims (4)
1. A vision-based vehicle target position and attitude angle detection method, characterized in that a semantic segmentation method classifies the pixels in an image one by one to extract the pixels of all vehicles, and these pixels are then processed to obtain accurate position coordinate information and attitude information for the vehicles.
2. The vision-based vehicle target position and attitude angle detection method according to claim 1, characterized by comprising the following steps:
S1, acquiring raw data:
collecting images through a camera as raw data input;
S2, image preprocessing:
the image preprocessing mainly consists of scaling and cropping the image to the required size;
S3, semantic information processing of the image:
in the vehicle detection and inference process, a CNN convolutional neural network is used to detect vehicle pixels in the image;
S4, data post-processing:
when the number of vehicle pixels meets the requirement, processing the vehicle contour obtained in step S3 to obtain the attitude angle of the vehicle in the world coordinate system;
S5, camera calibration:
calibrating the camera and establishing the coordinate conversion relation between the pixel coordinate system and the world coordinate system;
S6, acquiring the coordinates of the vehicle charging port:
extracting the front-most pixel set of the vehicle contour and averaging it to obtain the coordinates of the vehicle head; converting the vehicle's information to obtain the coordinates of the vehicle charging port; computing the world coordinate value using the calibrated camera parameter matrix;
S7, obtaining the attitude angle of the vehicle:
using the established polynomial relation between the vehicle's angle in the image and the real vehicle attitude angle, converting the angle measured in the image into the attitude angle in the world coordinate system.
3. The vision-based vehicle target position and attitude angle detection method according to claim 2, wherein S4, data post-processing, comprises the following process: extracting the minimum circumscribed rectangle of the vehicle contour obtained in step S3, acquiring the vehicle's coordinate information in image space, and acquiring the angle between the rectangle and the image horizontal; establishing a relational expression between the vehicle's angle in the image and its true attitude angle, and using this relation to calculate the vehicle's attitude angle in the world coordinate system from the angle in the image.
4. The vision-based vehicle target position and attitude angle detection method according to claim 2, wherein S5, camera calibration, comprises the following process: calibrating the camera by the Zhang Zhengyou method to obtain the camera's intrinsic and extrinsic parameters.
The camera calibration method comprises the following steps:
in the transformation from the world coordinate system to the camera coordinate system, P(X_W, Y_W, Z_W) denotes the coordinates of point P in the world coordinate system; R is a 3×3 rotation matrix representing the rotation of the camera coordinate system about the X, Y and Z axes in sequence relative to the world coordinate system, and T is a 3×1 offset vector representing the offset of the camera coordinate system's origin relative to the world origin, so that (X_C, Y_C, Z_C)^T = R·(X_W, Y_W, Z_W)^T + T; the coordinates of the transformed point P in the camera coordinate system are (X_C, Y_C, Z_C);
from the camera coordinate system to the image coordinate system, the 3D coordinates are projected onto the 2D image plane: x = f·X_C/Z_C, y = f·Y_C/Z_C, where Z_C denotes the depth of point P and f is the focal length of the camera;
from the image coordinate system to the pixel coordinate system, (u, v) denotes the pixel coordinates of point P, and (u0, v0) denotes the coordinates of the image-coordinate origin in the pixel coordinate system, in pixels; dx and dy indicate how many mm one column and one row represent respectively, i.e. 1 pixel = dx mm in the column direction; with fx = f/dx and fy = f/dy, u = x/dx + u0 and v = y/dy + v0.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110771236.5A CN113436262A (en) | 2021-07-08 | 2021-07-08 | Vision-based vehicle target position and attitude angle detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113436262A true CN113436262A (en) | 2021-09-24 |
Family
ID=77759502
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110771236.5A Pending CN113436262A (en) | 2021-07-08 | 2021-07-08 | Vision-based vehicle target position and attitude angle detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113436262A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN207955380U (en) * | 2018-02-10 | 2018-10-12 | 长安大学 | A kind of rechargeable stereo garage charging automatic butt jointing device |
CN110060298A (en) * | 2019-03-21 | 2019-07-26 | 径卫视觉科技(上海)有限公司 | A kind of vehicle location and attitude and heading reference system based on image and corresponding method |
CN111563469A (en) * | 2020-05-13 | 2020-08-21 | 南京师范大学 | Method and device for identifying irregular parking behaviors |
CN113033426A (en) * | 2021-03-30 | 2021-06-25 | 北京车和家信息技术有限公司 | Dynamic object labeling method, device, equipment and storage medium |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114179669A (en) * | 2021-12-21 | 2022-03-15 | 北京理工大学 | Electric vehicle charging port pose determination system and method |
CN114179669B (en) * | 2021-12-21 | 2023-10-10 | 北京理工大学 | Electric automobile charging port pose determining system and method |
CN114018215A (en) * | 2022-01-04 | 2022-02-08 | 智道网联科技(北京)有限公司 | Monocular distance measuring method, device, equipment and storage medium based on semantic segmentation |
CN114018215B (en) * | 2022-01-04 | 2022-04-12 | 智道网联科技(北京)有限公司 | Monocular distance measuring method, device, equipment and storage medium based on semantic segmentation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110223348B (en) | Robot scene self-adaptive pose estimation method based on RGB-D camera | |
CN106709950B (en) | Binocular vision-based inspection robot obstacle crossing wire positioning method | |
JP6681729B2 (en) | Method for determining 3D pose of object and 3D location of landmark point of object, and system for determining 3D pose of object and 3D location of landmark of object | |
CN108416791B (en) | Binocular vision-based parallel mechanism moving platform pose monitoring and tracking method | |
CN109523597B (en) | Method and device for calibrating external parameters of camera | |
US11407363B2 (en) | Method for calculating a tow hitch position | |
CN111210477B (en) | Method and system for positioning moving object | |
CN110070580B (en) | Local key frame matching-based SLAM quick relocation method and image processing device | |
JP2014504410A (en) | Detection and tracking of moving objects | |
CN111897349A (en) | Underwater robot autonomous obstacle avoidance method based on binocular vision | |
CN110926330B (en) | Image processing apparatus, image processing method, and program | |
CN113436262A (en) | Vision-based vehicle target position and attitude angle detection method | |
CN111145232A (en) | Three-dimensional point cloud automatic registration method based on characteristic information change degree | |
CN111738032B (en) | Vehicle driving information determination method and device and vehicle-mounted terminal | |
CN114862973B (en) | Space positioning method, device and equipment based on fixed point location and storage medium | |
CN109145902B (en) | Method for recognizing and positioning geometric identification by using generalized characteristics | |
CN113706593A (en) | Vehicle chassis point cloud fusion method suitable for vehicle geometric passing parameter detection | |
CN111583342A (en) | Target rapid positioning method and device based on binocular vision | |
CN113393524B (en) | Target pose estimation method combining deep learning and contour point cloud reconstruction | |
CN109166136B (en) | Target object following method of mobile robot based on monocular vision sensor | |
CN111198563B (en) | Terrain identification method and system for dynamic motion of foot type robot | |
US10366278B2 (en) | Curvature-based face detector | |
CN113450335A (en) | Road edge detection method, road edge detection device and road surface construction vehicle | |
CN112767481A (en) | High-precision positioning and mapping method based on visual edge features | |
Lee et al. | Visual odometry for absolute position estimation using template matching on known environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20210924 |