CN115164827A

CN115164827A - Monocular distance measurement method based on self-adaptive target detection network and license plate detection

Info

Publication number: CN115164827A
Application number: CN202210897606.4A
Authority: CN
Inventors: 凌强; 查易鑫; 李峰; 许永华
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2022-07-28
Filing date: 2022-07-28
Publication date: 2022-10-11

Abstract

The invention relates to a monocular ranging method and system based on an adaptive target detection network and license plate detection. The method comprises the following steps. S1: acquire a forward-looking traffic image of a vehicle with a camera, identify the shape of a license plate in the forward-looking traffic image using a license plate detection algorithm, and obtain the depression angle γ of the camera through the perspective transformation principle. S2: input the forward-looking traffic image into a target detection network to obtain the lower-edge center point P of the predicted target detection frame, and predict the target distance from γ and P. S3: construct a loss function from the target distance and the real distance for training the target detection network. The method detects targets and measures their distance directly on the acquired image, which avoids steps such as image fusion and reduces manufacturing cost.

Description

A Monocular Ranging Method Based on Adaptive Target Detection Network and License Plate Detection

Technical Field

The invention relates to the field of autonomous driving, and in particular to a monocular ranging method and system based on an adaptive target detection network and license plate detection.

Background

With the continuous advancement of autonomous driving technology, more and more technologies have been put into use, and a wide variety of autonomous vehicles have entered the market. Distance measurement is a very important part of autonomous driving technology. Accurately measuring the distance to obstacles is of great significance for key functions such as path planning and early-warning systems, and can even be called their cornerstone.

Ranging methods are mainly divided into active measurement and passive measurement, and active measurement is currently the main research direction of many researchers. Active measurement performs ranging with vehicle-mounted equipment such as ultrasonic sensors, lidar and cameras. Ultrasonic sensors are inexpensive, but their accuracy is poor, with large errors especially at high driving speeds, so their applicable scenarios are limited. Lidar has the highest accuracy but is expensive. Moreover, the targets measured by these methods are not easy to fuse with the images collected by a camera. A monocular camera can detect targets directly in the collected image and then measure their distance, which avoids steps such as image fusion and also reduces cost, favoring its adoption in intelligent vehicles.

Monocular vision ranging generally uses the corresponding-point calibration method to obtain the depth information of an image. This method solves the transformation between coordinate systems from the coordinates of corresponding points in the different systems. During calibration, however, equipment limitations make it impossible to record the coordinates of a point in the world coordinate system and in the image coordinate system with high precision. If these coordinates are not accurate enough, the accuracy of the resulting transformation matrix is limited, and the accuracy of the coordinate transformation results fluctuates accordingly. In addition, corresponding-point calibration is carried out with the angles and height of the camera already fixed. Whenever any camera parameter changes, the camera must be re-calibrated to obtain the transformation matrix for that specific case. The method is therefore only suitable for a camera in a fixed position. For a camera mounted on a moving carrier, the motion of the carrier changes the camera parameters, so the applicability of the method is limited. How to perform ranging while in motion has therefore become an urgent problem to be solved.

Summary of the Invention

In order to solve the above technical problems, the present invention provides a monocular ranging method and system based on an adaptive target detection network and license plate detection.

The technical solution of the present invention is a monocular ranging method based on an adaptive target detection network and license plate detection, comprising:

Step S1: acquire a forward-looking traffic image of a vehicle with a camera, identify the shape of a license plate in the forward-looking traffic image using a license plate detection algorithm, and obtain the depression angle γ of the camera through the perspective transformation principle;

Step S2: input the forward-looking traffic image into a target detection network to obtain the lower-edge center point P of the predicted target detection frame, and predict the target distance using γ and P;

Step S3: construct a loss function from the target distance and the real distance for training the target detection network.

Compared with the prior art, the present invention has the following advantages:

1. The present invention discloses a monocular ranging method based on an adaptive target detection network and license plate detection. A monocular camera detects targets directly in the collected image and then measures their distance, which avoids steps such as image fusion and also reduces cost, favoring its adoption in intelligent vehicles.

2. The present invention detects the shape of the license plate with a license plate detection network and then estimates the camera angle through the principle of perspective transformation, thereby obtaining a more accurate camera depression angle.

3. When training the target detection network, the present invention uses the ranging result measured in real time as a constraint, so that the detected imaging point better satisfies the requirements of the geometric ranging principle, which increases the accuracy of the ranging result.

Description of the Drawings

FIG. 1 is a flowchart of a monocular ranging method based on an adaptive target detection network and license plate detection in an embodiment of the present invention;

FIG. 2 is a schematic diagram of the camera imaging geometry in an embodiment of the present invention;

FIG. 3 is a schematic diagram of the visualization after the target distance is calculated in an embodiment of the present invention;

FIG. 4 is a structural block diagram of a monocular ranging system based on an adaptive target detection network and license plate detection in an embodiment of the present invention.

Detailed Description

The invention provides a monocular ranging method based on an adaptive target detection network and license plate detection, which detects targets directly in the collected image and measures their distance, avoiding steps such as image fusion and reducing cost.

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below through specific embodiments and with reference to the accompanying drawings.

Embodiment 1

As shown in FIG. 1, a monocular ranging method based on an adaptive target detection network and license plate detection provided by an embodiment of the present invention includes the following steps:

Step S1: acquire a forward-looking traffic image of a vehicle with a camera, identify the shape of a license plate in the forward-looking traffic image using a license plate detection algorithm, and obtain the depression angle γ of the camera through the perspective transformation principle;

Step S2: input the forward-looking traffic image into a target detection network to obtain the lower-edge center point P of the predicted target detection frame, and predict the target distance using γ and P;

Step S3: construct a loss function from the target distance and the real distance for training the target detection network.

In one embodiment, the above step S1 (acquire a forward-looking traffic image of a vehicle with a camera, identify the shape of a license plate in the forward-looking traffic image using a license plate detection algorithm, and obtain the depression angle γ of the camera through the perspective transformation principle) specifically includes:

Step S11: use a license plate detection network to identify the license plate in the forward-looking traffic image and obtain the license plate shape;

The forward-looking traffic image acquired by the vehicle-mounted camera is input into an already trained license plate detection network, which identifies the license plate shape of the vehicle in the forward-looking traffic image;

Step S12: calculate the perspective transformation matrix H from four pairs of points taken at the four corners of the license plate shape, and obtain the depression angle γ of the camera from H.
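The four-point computation of step S12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the corner coordinates are hypothetical, the bottom edge of the detected plate is assumed to be horizontal in the image, the 440:140 aspect ratio is an assumed plate size, and the helper names (`solve_linear`, `homography_from_points`, `rectangle_from_bottom_edge`, `apply_homography`) are invented for this example. The homography is obtained by solving the standard 8x8 linear system with the last entry of H fixed to 1.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography_from_points(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H (with H[2][2] = 1) that maps each src point to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map a pixel through H, including the projective division."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def rectangle_from_bottom_edge(bottom_left: Point, bottom_right: Point,
                               aspect_ratio: float) -> List[Point]:
    """Rectangle (tl, tr, br, bl) that shares the trapezoid's bottom edge."""
    width = bottom_right[0] - bottom_left[0]
    height = width / aspect_ratio
    top_y = bottom_left[1] - height  # image y grows downward
    return [(bottom_left[0], top_y), (bottom_right[0], top_y), bottom_right, bottom_left]


# Hypothetical detected plate corners (tl, tr, br, bl) in pixels.
trapezoid = [(110.0, 200.0), (190.0, 200.0), (200.0, 240.0), (100.0, 240.0)]
rectangle = rectangle_from_bottom_edge(trapezoid[3], trapezoid[2], aspect_ratio=440.0 / 140.0)
H = homography_from_points(trapezoid, rectangle)
```

With real detections the four corners are noisy, so a practical system would probably prefer a library routine with point normalization over this bare solver.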

Four pairs of points are taken according to the license plate shape to calculate the perspective transformation matrix H. As shown in FIG. 2, A is the plane in which the license plate lies, and B is a plane parallel to the image plane of the camera. The role of the license plate detection network is to detect the pixel coordinates of the license plate in the camera image, which usually form a trapezoid.

As can be seen from the figure, the bottom edge of the trapezoid is equal in length to the long side of the rectangular license plate. Since the aspect ratio of the license plate is known, the pixel coordinates of the rectangular license plate on plane B can be obtained from this ratio. Four pairs of corresponding points are collected in this way, and the perspective transformation matrix H of plane B under the pixel coordinates of plane A is calculated by the four-point correspondence method. After obtaining the perspective transformation matrix H between planes A and B, let the image points of the same real-world point on A and B be x_A and x_B respectively, as shown in formula (1):

where K_B and K_A are the intrinsic matrices of the cameras corresponding to the two planes.

Since the two intrinsic matrices K_B and K_A and the perspective transformation matrix H are all known, the orthogonal matrix R can be calculated. The third row of R is the unit vector of the principal axis (pointing straight ahead) of the camera corresponding to plane B, expressed in the world coordinate system (that is, the coordinate system of the camera corresponding to plane A). The angle between planes A and B can be obtained from this vector. By the geometric relationship, the angle between A and B is the depression angle γ between the camera and the horizontal plane.
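The recovery of γ from H can be sketched as follows. The sketch assumes that formula (1), which is not reproduced in this text, has the standard rotation-homography form x_B = K_B R K_A^{-1} x_A, so that R equals K_B^{-1} H K_A up to scale. It further assumes intrinsic matrices with zero skew. The function names and the numeric intrinsics are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(m: Matrix) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def intrinsics_inverse(k: Matrix) -> Matrix:
    """Closed-form inverse of a zero-skew intrinsic matrix (assumption)."""
    fx, fy, cx, cy = k[0][0], k[1][1], k[0][2], k[1][2]
    return [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy], [0.0, 0.0, 1.0]]


def rotation_from_homography(h: Matrix, k_a: Matrix, k_b: Matrix) -> Matrix:
    """R = K_B^-1 H K_A, rescaled so that det(R) = 1 (H is known only up to scale)."""
    m = matmul(matmul(intrinsics_inverse(k_b), h), k_a)
    d = det3(m)
    if d == 0.0:
        raise ValueError("singular homography")
    scale = math.copysign(abs(d) ** (1.0 / 3.0), d)
    return [[v / scale for v in row] for row in m]


def depression_angle(r: Matrix) -> float:
    """Angle between the principal axes of cameras A and B, from the third row of R."""
    return math.acos(max(-1.0, min(1.0, r[2][2])))


# Synthetic check: build H from a known 12-degree rotation about the x axis.
theta = math.radians(12.0)
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
R_true = [[1.0, 0.0, 0.0],
          [0.0, math.cos(theta), -math.sin(theta)],
          [0.0, math.sin(theta), math.cos(theta)]]
H = [[2.5 * v for v in row] for row in matmul(matmul(K, R_true), intrinsics_inverse(K))]
gamma = depression_angle(rotation_from_homography(H, K, K))
```

With a homography estimated from noisy corners, K_B^{-1} H K_A is only approximately orthogonal, so a real pipeline would likely project it onto the nearest rotation before reading off the angle.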

In one embodiment, the above step S2 (input the forward-looking traffic image into a target detection network to obtain the lower-edge center point P of the predicted target detection frame, and predict the target distance using γ and P) specifically includes:

The forward-looking traffic image is input into the target detection network YOLOv4 with pre-trained weights, which detects targets in the forward-looking traffic image and outputs the predicted target detection frame. Let the center point of the lower edge of the predicted target detection frame be P. According to formula (2), the target distance O₃P between the camera and P is predicted:

where γ is the depression angle of the camera relative to the ground, and Height is the height of the camera above the ground.
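The following sketch illustrates one common way to turn γ, Height and the image row of P into a distance. Formula (2) itself is not reproduced in this text, so this is an assumed flat-ground pinhole model and may differ from the patent's formula. The parameters `fy_px` and `cy_px` (vertical focal length and principal-point row, both in pixels) and the function name are hypothetical.

```python
import math


def ground_distance(v_px: float, gamma_rad: float, cam_height_m: float,
                    fy_px: float, cy_px: float) -> float:
    """Horizontal distance to the ground point imaged at row v_px.

    The ray through P lies below the optical axis by beta = atan((v - cy) / fy),
    and the optical axis is tilted down by gamma, so the ray meets flat ground
    at Height / tan(gamma + beta).
    """
    beta = math.atan2(v_px - cy_px, fy_px)
    angle = gamma_rad + beta
    if angle <= 0.0:
        raise ValueError("P is at or above the horizon; the ray never meets the ground")
    return cam_height_m / math.tan(angle)


# Hypothetical numbers: camera 1.5 m above the road, level optical axis,
# P imaged 150 px below the principal point with a 1000 px focal length.
d = ground_distance(v_px=510.0, gamma_rad=0.0, cam_height_m=1.5,
                    fy_px=1000.0, cy_px=360.0)
```

The straight-line distance from the camera to P under the same model would be Height / sin(γ + β). Which of the two the patent denotes by O₃P cannot be determined from this text.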

In one embodiment, the above step S3 (construct a loss function from the target distance and the real distance for training the target detection network) specifically includes:

Construct the loss function value Giou, as shown in formulas (3) to (4):

where IOU is the ratio of the overlap area of the predicted target detection frame and the real detection frame to the total area covered by the two frames (their union); p is the distance between the center of the predicted target detection frame and the center of the real detection frame; x^t and x^p are the center point coordinates of the real detection frame and the predicted target frame, respectively; GT_x is the ground truth of the target detection network; and c is a preset parameter;

x_i is the predicted distance O₃P; μ is the calibrated real distance; and N is the total number of samples.

In this step, the target detection network is retrained. During training, the difference between the ranging result x_i and the calibrated real distance μ is fed into the target detection network as a loss, so that the point P obtained by the target detection network is further corrected and brought closer to its true value.
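A loss of this kind might be assembled as in the sketch below. Formulas (3) and (4) are not reproduced in this text, so the exact combination is an assumption: the box term is modeled on the symbols described above (an IoU term plus a center-distance penalty scaled by the preset parameter c), and the ranging term is taken to be the mean squared difference between x_i and μ over N samples. The `weight` parameter and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area(a) + area(b) - inter
    return inter / union if union > 0.0 else 0.0


def center(box: Box) -> Tuple[float, float]:
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def box_term(pred: Box, truth: Box, c: float) -> float:
    """Assumed form: 1 - IoU + (p / c)^2, with p the distance between box centers."""
    p = math.dist(center(pred), center(truth))
    return 1.0 - iou(pred, truth) + (p / c) ** 2


def range_term(predicted: Sequence[float], calibrated: Sequence[float]) -> float:
    """Assumed form: mean squared error between x_i and the calibrated distance."""
    n = len(predicted)
    return sum((x - mu) ** 2 for x, mu in zip(predicted, calibrated)) / n


def total_loss(pred_boxes: Sequence[Box], true_boxes: Sequence[Box],
               predicted: Sequence[float], calibrated: Sequence[float],
               c: float, weight: float = 1.0) -> float:
    """Average box term plus a weighted ranging term (the weighting is hypothetical)."""
    boxes = sum(box_term(p, t, c) for p, t in zip(pred_boxes, true_boxes)) / len(pred_boxes)
    return boxes + weight * range_term(predicted, calibrated)
```

In an actual training loop the same terms would need to be written with differentiable tensor operations so that the ranging error can propagate back to the predicted box coordinates.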

After the target distance is calculated with the trained target detection network, OpenCV can be used to draw the result on the image for visualization. An example is shown in FIG. 3.

In order to verify the effectiveness of the method of the present invention, the results before and after applying the improvement of the present invention were compared:

Table 1: Results using the original YOLOv4 and the BAOD of the present invention

GroundTruth(m)    BEVDE+YOLOv4    BEVDE+BAOD
4.10              4.42            4.18
6.02              6.54            6.45
7.13              7.64            7.63
8.55              9.28            8.83

It can be seen that the BAOD method of the present invention is closer to the GroundTruth than the original YOLOv4.
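The comparison can be made quantitative with a short calculation. The sketch below computes the mean absolute error of each method from the four rows of Table 1. The numbers are taken from the table, while the choice of mean absolute error as the metric is made only for this illustration.

```python
# Rows of Table 1: (ground truth, BEVDE+YOLOv4, BEVDE+BAOD), all in meters.
TABLE_1 = [
    (4.10, 4.42, 4.18),
    (6.02, 6.54, 6.45),
    (7.13, 7.64, 7.63),
    (8.55, 9.28, 8.83),
]


def mean_absolute_error(truths, estimates):
    """Average of |estimate - truth| over paired values."""
    pairs = list(zip(truths, estimates))
    return sum(abs(e - t) for t, e in pairs) / len(pairs)


truth = [row[0] for row in TABLE_1]
mae_yolov4 = mean_absolute_error(truth, [row[1] for row in TABLE_1])
mae_baod = mean_absolute_error(truth, [row[2] for row in TABLE_1])
print(f"BEVDE+YOLOv4 mean absolute error: {mae_yolov4:.2f} m")
print(f"BEVDE+BAOD   mean absolute error: {mae_baod:.2f} m")
```

On these four samples the mean absolute error is about 0.52 m for BEVDE+YOLOv4 and about 0.32 m for BEVDE+BAOD.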

Table 2: Comparison before and after correcting the camera lens pose using LPD

GroundTruth(m)    MDE+YOLOv4    MDE+YOLOv4+BAOD
3.00              2.83          3.00
4.20              4.01          4.11
4.80              4.53          4.70
6.00              8.92          5.82
7.20              12.36         6.81
8.40              \             7.69
9.30              \             8.62
20.00             \             19.82
30.00             \             29.05
40.00             \             34.58
50.00             \             37.96

It can likewise be seen that the BAOD method of the present invention is closer to the GroundTruth than the original YOLOv4.
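To see how the error in Table 2 behaves with range, the relative error of each entry can be computed as in the sketch below. The values are copied from the table, entries marked "\" are represented as None, and the choice of relative error as the metric is made only for this illustration.

```python
# Rows of Table 2: (ground truth, MDE+YOLOv4, MDE+YOLOv4+BAOD) in meters.
# None stands for the entries marked "\" in the table.
TABLE_2 = [
    (3.00, 2.83, 3.00), (4.20, 4.01, 4.11), (4.80, 4.53, 4.70),
    (6.00, 8.92, 5.82), (7.20, 12.36, 6.81), (8.40, None, 7.69),
    (9.30, None, 8.62), (20.00, None, 19.82), (30.00, None, 29.05),
    (40.00, None, 34.58), (50.00, None, 37.96),
]


def relative_error(truth, measured):
    """|measured - truth| / truth, or None when no measurement is listed."""
    if measured is None:
        return None
    return abs(measured - truth) / truth


def fmt(value):
    return "n/a" if value is None else f"{100.0 * value:.1f}%"


for truth, yolo, baod in TABLE_2:
    print(f"{truth:6.2f} m  YOLOv4: {fmt(relative_error(truth, yolo)):>6}  "
          f"BAOD: {fmt(relative_error(truth, baod)):>6}")
```

For the rightmost column the relative error stays below 9% for ground-truth distances up to 30 m, then rises to roughly 13.6% at 40 m and 24.1% at 50 m.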

The invention discloses a monocular ranging method based on an adaptive target detection network and license plate detection. A monocular camera detects targets directly in the collected image and then measures their distance, which avoids steps such as image fusion and also reduces cost, favoring its adoption in intelligent vehicles. The invention detects the shape of the license plate with a license plate detection network and then estimates the camera angle through the principle of perspective transformation, thereby obtaining a more accurate camera depression angle. When training the target detection network, the invention uses the ranging result measured in real time as a constraint, so that the detected imaging point better satisfies the requirements of the geometric ranging principle, which increases the accuracy of the ranging result.

Embodiment 2

As shown in FIG. 4, an embodiment of the present invention provides a monocular ranging system based on an adaptive target detection network and license plate detection, including the following modules:

A camera depression angle acquisition module 41, used to acquire a forward-looking traffic image of a vehicle with a camera, identify the shape of a license plate in the forward-looking traffic image using a license plate detection algorithm, and obtain the depression angle γ of the camera through the perspective transformation principle;

A target distance prediction module 42, used to input the forward-looking traffic image into a target detection network to obtain the lower-edge center point P of the predicted target detection frame, and to predict the target distance using γ and P;

A target detection network training module 43, used to construct a loss function from the target distance and the real distance for training the target detection network.
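The division into modules 41 to 43 could be mirrored in software roughly as in the following sketch. The class name, the field names and the split of module 42 into a detector and a geometry function are all hypothetical. The concrete detectors are injected as callables, and the training module is reduced to computing a mean squared ranging error.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple

Image = Any  # placeholder for whatever image type the detectors accept
Point = Tuple[float, float]


@dataclass
class MonocularRangingSystem:
    # Module 41: license plate detection plus perspective transform -> gamma (radians).
    estimate_depression_angle: Callable[[Image], float]
    # Module 42, detection part: lower-edge center point P of the predicted frame.
    detect_bottom_center: Callable[[Image], Point]
    # Module 42, geometry part: distance from P and gamma.
    distance_from_point: Callable[[Point, float], float]

    def predict_distance(self, image: Image) -> float:
        """Run modules 41 and 42 on one forward-looking image."""
        gamma = self.estimate_depression_angle(image)
        p = self.detect_bottom_center(image)
        return self.distance_from_point(p, gamma)

    def training_loss(self, images: Sequence[Image],
                      true_distances: Sequence[float]) -> float:
        """Module 43 (simplified): mean squared ranging error over a batch."""
        predictions = [self.predict_distance(image) for image in images]
        return sum((x - mu) ** 2
                   for x, mu in zip(predictions, true_distances)) / len(predictions)


# Stub callables stand in for the real networks in this illustration.
system = MonocularRangingSystem(
    estimate_depression_angle=lambda image: 0.1,
    detect_bottom_center=lambda image: (320.0, 5.0),
    distance_from_point=lambda p, gamma: p[1] + gamma,
)
```

Injecting the three callables keeps the geometry testable without a trained network, which is the only reason for this particular layout.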

The above embodiments are provided only to describe the present invention and are not intended to limit its scope. The scope of the invention is defined by the appended claims. All equivalent replacements and modifications made without departing from the spirit and principle of the present invention shall fall within the scope of the present invention.

Claims

1. A monocular ranging method based on an adaptive target detection network and license plate detection, characterized by comprising the following steps:

step S1: acquiring a forward-looking traffic image of a vehicle by using a camera, identifying the shape of a license plate in the forward-looking traffic image by using a license plate detection algorithm, and obtaining a depression angle γ of the camera through a perspective transformation principle;

step S2: inputting the forward-looking traffic image into a target detection network to obtain a lower-edge center point P of a predicted target detection frame, and predicting a target distance by using γ and P;

step S3: constructing a loss function by using the target distance and the real distance for training the target detection network.

2. The monocular ranging method based on an adaptive target detection network and license plate detection according to claim 1, wherein the step S1 of acquiring a forward-looking traffic image of a vehicle by using a camera, identifying the shape of a license plate in the forward-looking traffic image by using a license plate detection algorithm, and obtaining a depression angle γ of the camera through a perspective transformation principle specifically comprises:

step S11: identifying the license plate in the forward-looking traffic image by using a license plate detection network and obtaining the shape of the license plate;

step S12: calculating a perspective transformation matrix H from four pairs of points at the four corners of the license plate shape, and obtaining the depression angle γ of the camera from H.

3. The monocular ranging method based on an adaptive target detection network and license plate detection according to claim 1, wherein the step S2 of inputting the forward-looking traffic image into a target detection network to obtain a lower-edge center point P of a predicted target detection frame, and predicting a target distance by using γ and P specifically comprises:

inputting the forward-looking traffic image into the target detection network, outputting a predicted target detection frame, setting the center point of the lower edge of the predicted target detection frame as P, and predicting the target distance O₃P between the camera and P according to formula (2):

wherein γ is the depression angle of the camera relative to the ground, and Height is the height of the camera above the ground.

4. The monocular ranging method based on an adaptive target detection network and license plate detection according to claim 1, wherein the step S3 of constructing a loss function by using the target distance and the real distance for training the target detection network specifically comprises:

constructing a loss function value Giou as shown in formulas (3) to (4):

wherein IOU is the ratio of the overlap area of the predicted target detection frame and the real detection frame to the total area covered by the two frames; p is the distance between the center of the predicted target detection frame and the center of the real detection frame; x^t and x^p are respectively the center point coordinates of the real detection frame and of the predicted target frame; GT_x is the GroundTruth of the target detection network, and c is a preset parameter;

x_i is the predicted distance O₃P, μ is the calibrated real distance, and N is the total number of samples.

5. A monocular ranging system based on an adaptive target detection network and license plate detection, characterized by comprising the following modules:

a camera depression angle acquisition module, used for acquiring a forward-looking traffic image of a vehicle by using a camera, identifying the shape of a license plate in the forward-looking traffic image by using a license plate detection algorithm, and obtaining a depression angle γ of the camera through a perspective transformation principle;

a target distance prediction module, used for inputting the forward-looking traffic image into a target detection network to obtain a lower-edge center point P of a predicted target detection frame, and predicting a target distance by using γ and P;

a target detection network training module, used for constructing a loss function by using the target distance and the real distance for training the target detection network.