CN113436217A - Unmanned vehicle environment detection method based on deep learning - Google Patents
Unmanned vehicle environment detection method based on deep learning
- Publication number
- CN113436217A CN113436217A CN202110838473.9A CN202110838473A CN113436217A CN 113436217 A CN113436217 A CN 113436217A CN 202110838473 A CN202110838473 A CN 202110838473A CN 113436217 A CN113436217 A CN 113436217A
- Authority
- CN
- China
- Prior art keywords
- image
- network
- resolution image
- gaussian
- super
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/80—Geometric correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/181—Segmentation; Edge detection involving edge growing; involving edge linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20192—Edge enhancement; Edge preservation
Abstract
The invention provides a deep-learning-based unmanned vehicle environment detection method comprising the following steps: step (1), obtaining a super-resolution image; step (2), generating a medium-resolution image with a generator, sending the generated medium-resolution image to an edge enhancement network to produce a super-resolution image, and passing the generated super-resolution image to a target detector to perform the task of classifying and localizing targets; step (3), adaptive Gaussian distribution modeling; and step (4), fusing the EEN with an adaptive-uncertainty target localization end-to-end detection model through a dynamic suppression strategy. The method adapts well to small, medium and large targets, and it is verified that when target density in the image varies widely, dynamically raising or lowering the threshold resolves the high overlap rate and high error rate in detection, demonstrating good generalization and scene adaptivity.
Description
Technical Field
The invention relates to the technical field of environment detection, in particular to an unmanned vehicle environment detection method based on deep learning.
Background
Unmanned ground vehicles are also called driverless vehicles (unmanned vehicles for short). Owing to the rapid development of novel sensors and of basic machine-learning research in recent years, civilian unmanned vehicles have become technically feasible. Research institutions and enterprises at home and abroad have invested in the research and development of intelligent or unmanned automobiles, and some claim they will achieve commercial rollout of unmanned vehicles within the next five years.
Disclosure of Invention
The invention aims to provide an unmanned vehicle environment detection method based on deep learning to solve the problems in the background technology.
The invention solves the stated technical problem by the following technical scheme. The deep-learning-based unmanned vehicle environment detection method comprises the following steps:
Step (1): input the low-resolution image into the generator G of a generative adversarial network (GAN), feed the image produced by the generator into an edge enhancement network, and then map the extracted edge features into high-resolution space with the sub-pixel-convolution upsampling operation to obtain a super-resolution image;
Step (2): generate a medium-resolution image with the generator; send the generated medium-resolution image to the edge enhancement network to produce a super-resolution image; then pass the generated super-resolution image to the target detector to perform the task of classifying and localizing targets;
Step (3): perform adaptive Gaussian distribution modeling; through a dynamic suppression strategy, raise the threshold when target density rises and mutual occlusion increases, and lower the threshold when target density is low and objects appear in isolation;
Step (4): fuse the GAN-based EEN with the adaptive-uncertainty target localization model into an end-to-end detection model.
In step (1), the constructed Mask branch uses an attention mechanism to focus the network on true edge information so as to remove noise and artifacts; the Mask branch learns from the image to detect and eliminate isolated noise, i.e., the spurious edge points produced during edge extraction.
In step (2), the edge enhancement network removes image noise and extracts high-frequency edge detail features to finally generate the super-resolution image. The discriminator of the generative adversarial network then judges the generated super-resolution image: it decides whether the image is fake, computes the difference between the ground-truth (GT) image and the intermediate super-resolution image, and back-propagates that difference to the generator G, which continues producing super-resolution images until the discriminator can no longer tell the generated image from the high-resolution image, at which point training of the whole network is complete.
In step (2), the generator branch used during network training adopts the SRResNet structure as the overall architecture: a 9 × 9 convolution layer with stride 1 extracts a primary feature map; RRDB dense residual blocks then extract image semantic information to obtain a sharper edge feature map; the edge feature map is fused with the primary features; and an upsampling operation finally yields the medium-resolution image. The medium-resolution image is sent to the EEN, which extracts edge features to obtain the super-resolution image. Finally, the SR image is input to the YOLOv3 detection network, which classifies and localizes targets to obtain the final result.
In step (3), a CNN is trained to fit a density function that serves as the supervision signal: given an input picture, it outputs the object density at every position. The density function is given by formula (1): the density of target i is defined as the maximum bounding-box IoU between target i and the other targets in the label set,

d_i := max iou(b_i, b_j), j ≠ i   (1)
thus, the dynamic suppression strategy is updated according to the density function definition using the following equations (4-19):
NM:=max(Nt,dM) (2)
where N_M denotes the adaptive threshold of target M and d_M the density of target M. The dynamic suppression strategy covers three cases: (1) when a neighbouring bounding box is far from M, i.e. iou(M, b_i) < N_M, the behaviour is consistent with the initial NMS threshold; (2) when M lies in a dense region, i.e. d_M > N_t, the density value d_M of M is used as the adaptive A-NMS threshold N_M, so neighbouring candidate regions, which may be located around M, are retained; (3) for objects in sparse regions, i.e. d_M < N_t, the NMS threshold equals N_t, which reduces false positives (FP).
In step (3), the adaptive Gaussian distribution modeling uses a Gaussian distribution function to model the predicted coordinates: the mean of the output coordinates serves as the mean of the Gaussian distribution, and the variance represents the uncertainty of the predicted localization.
The uncertainty of the box coordinates in the adaptive Gaussian modeling of step (3) can be modeled and evaluated with a Gaussian model for each of the centre coordinates, width and height. For a given test sample x, the output y is modeled by Gaussian parameters over t_x, t_y, t_w, t_h to express the localization uncertainty, as shown in the following formula:

P(y|x) = N(y; μ(x), s²(x))

where μ(x) and s²(x) are respectively the mean and variance of the box coordinate, and y is a value under that Gaussian distribution.
To predict the uncertainty of the box coordinates, the detection layer predicts, for each feature-map coordinate, a mean and a variance under the Gaussian model. Considering the nature of the Gaussian distribution and the structure of the YOLOv3 detection layer, the variance of the Gaussian distribution is constrained to lie between 0 and 1, fixing its range. The Gaussian parameters are therefore preprocessed with the following formulas, establishing four Gaussian distributions, one per coordinate.
In the detection layer, the mean of each coordinate, μ_tx, μ_ty, μ_tw, μ_th, represents the predicted coordinate of the Gaussian model, and the variance of each coordinate, Σ_tx, Σ_ty, Σ_tw, Σ_th, represents its uncertainty. Because μ_tx and μ_ty represent the centre coordinates of the box, they are processed to values between 0 and 1 with the sigmoid function; the variance of each coordinate is likewise processed to a value between 0 and 1 with the sigmoid function, representing the reliability of that coordinate. In YOLOv3, the height and width of the bounding box are handled through t_w and t_h relative to the prior (anchor) bounding boxes; i.e., the Gaussian parameters μ_tw and μ_th represent t_w and t_h in the YOLO network.
In step (4), the image edge-enhancement GAN network and AN-Gaussian YOLOv3 are jointly designed into a unified framework: the loss function of the model is redesigned, the AN-Gaussian YOLOv3 detection loss is added to the discriminant loss, and the method is tested on the COCO dataset and compared with other algorithms to verify its effectiveness.
Compared with the prior art, the invention has the following beneficial effects: the method adapts well to small, medium and large targets, and it is verified that when target density in the image varies widely, dynamically raising or lowering the threshold through the dynamic adjustment strategy resolves the high overlap rate and high error rate in detection, demonstrating good generalization and scene adaptivity.
Drawings
FIG. 1 is a schematic flow chart of step (1) of the present invention.
FIG. 2 is a schematic flow chart of step (2) of the present invention.
FIG. 3 is a schematic view of step (2) according to the present invention.
FIG. 4 is a diagram of the AN-Gaussian YOLOv3 predicted features of the present invention.
FIG. 5 is a diagram of the detection effect of the AN-Gaussian YOLOv3 algorithm in multiple scenes.
FIG. 6 is a graph of the IOU versus position uncertainty of the present invention.
Fig. 7(a) is a diagram of the effect of the uncertainty sparse scene detection of the present invention.
FIG. 7(b) is a diagram of the detection effect of the dense target scene under the Gaussian combination framework of the present invention.
FIG. 7(c) is a diagram of the effect of complex background detection in the Gaussian combination framework of the present invention.
FIG. 7(d) is a diagram of the occlusion detection effect under the Gaussian combination framework of the present invention.
FIG. 8 is a context aware skills development flow diagram of the present invention.
Fig. 9(a) is a diagram illustrating the effect of detecting an uncertain sparse scene according to the present invention.
Fig. 9(b) is a diagram of the effect of the uncertainty sparse scene detection of the present invention.
Fig. 9(c) is a diagram of the effect of the uncertainty sparse scene detection of the present invention.
Detailed Description
In the description of the present invention, it should be noted that unless otherwise specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly and may be, for example, fixedly connected, detachably connected, or integrally connected, mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements.
As shown in fig. 1 to 9, the method for detecting the environment of the unmanned vehicle based on deep learning includes the following steps:
Step (1): input the low-resolution image into the generator G of a generative adversarial network (GAN), feed the image produced by the generator into an edge enhancement network, and then map the extracted edge features into high-resolution space with the sub-pixel-convolution upsampling operation to obtain a super-resolution image;
Step (2): generate a medium-resolution image with the generator; send the generated medium-resolution image to the edge enhancement network to produce a super-resolution image; then pass the generated super-resolution image to the target detector to perform the task of classifying and localizing targets;
Step (3): perform adaptive Gaussian distribution modeling; through a dynamic suppression strategy, raise the threshold when target density rises and mutual occlusion increases, and lower the threshold when target density is low and objects appear in isolation;
Step (4): fuse the GAN-based EEN with the adaptive-uncertainty target localization model into an end-to-end detection model.
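The sub-pixel-convolution upsampling used in step (1) ends with a channel-to-space rearrangement (pixel shuffle): r² feature channels become an r-times larger spatial grid. A minimal NumPy sketch of that rearrangement, assuming a preceding convolution has already produced C·r² channels (the function name and test tensor are illustrative, not from the patent):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    This is the channel-to-space step of sub-pixel convolution; the
    convolution producing the C*r*r channels is assumed to have run already.
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0, "channel count must be divisible by r^2"
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)        # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)      # interleave: (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# a 4-channel 2x2 low-resolution map upsampled by r=2 into a 1-channel 4x4 map
lr_map = np.arange(16, dtype=np.float32).reshape(4, 2, 2)
sr_map = pixel_shuffle(lr_map, 2)
```

Each output pixel at (h·r + i, w·r + j) is taken from channel i·r + j at (h, w), matching the usual PixelShuffle convention.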
In step (1), the constructed Mask branch uses an attention mechanism to focus the network on true edge information so as to remove noise and artifacts; the Mask branch learns from the image to detect and eliminate isolated noise, i.e., the spurious edge points produced during edge extraction.
In step (2), the edge enhancement network removes image noise and extracts high-frequency edge detail features to finally generate the super-resolution image. The discriminator of the generative adversarial network then judges the generated super-resolution image: it decides whether the image is fake, computes the difference between the ground-truth (GT) image and the intermediate super-resolution image, and back-propagates that difference to the generator G, which continues producing super-resolution images until the discriminator can no longer tell the generated image from the high-resolution image, at which point training of the whole network is complete.
In step (2), the generator branch used during network training adopts the SRResNet structure as the overall architecture: a 9 × 9 convolution layer with stride 1 extracts a primary feature map; RRDB dense residual blocks then extract image semantic information to obtain a sharper edge feature map; the edge feature map is fused with the primary features; and an upsampling operation finally yields the medium-resolution image. The medium-resolution image is sent to the EEN, which extracts edge features to obtain the super-resolution image. Finally, the SR image is input to the YOLOv3 detection network, which classifies and localizes targets to obtain the final result.
The network structure of the discriminator D adopts a VGG network; the activation function between layers is LeakyReLU, and the number of base channels is 64. After feature extraction the output is 4 × 512, and a classifier finally produces the output probability value. Perceptual loss is used in network training: the mean of the difference between the VGG19 outputs for the generator's ISR image and for the HR image.
Network parameter setting and training: the network model is trained with the Adam optimizer, with β1 = 0.9 and β2 = 0.999; the generator and discriminator are alternately updated until convergence. β1 is the exponential decay rate controlling the relative weight of momentum versus the current gradient; β2 is the exponential decay rate controlling the influence of the squared gradient.
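The roles of β1 and β2 described above are easiest to see in a single scalar Adam update. This is a textbook sketch of the optimizer step, not code from the patent:

```python
def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    beta1 weights the running momentum against the current gradient;
    beta2 governs the influence of the squared gradient, matching the
    beta1=0.9, beta2=0.999 setting described above.
    """
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment (momentum) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for step t
    v_hat = v / (1.0 - beta2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v
```

In practice the framework's built-in Adam optimizer is used; alternating G/D updates simply apply such a step to each network's parameters in turn.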
During training, the G and D networks are alternately updated with Adam; the maximum number of iterations is 500,000 and the initial learning rate 0.0001, halved after 50k, 100k and 250k iterations. The residual information in the residual structure described above needs to be scaled, so the scaling factor is set to 0.2.
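The halving schedule just described (initial rate 0.0001, halved after 50k, 100k and 250k iterations) can be sketched as a small lookup; the function name is illustrative:

```python
def lr_at(iteration, base_lr=1e-4, milestones=(50_000, 100_000, 250_000)):
    """Return the learning rate in force at a given iteration:
    the base rate halved once for every milestone already passed."""
    lr = base_lr
    for m in milestones:
        if iteration >= m:
            lr *= 0.5
    return lr
```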
In step (3), a CNN is trained to fit a density function that serves as the supervision signal: given an input picture, it outputs the object density at every position. The density function is given by formula (1): the density of target i is defined as the maximum bounding-box IoU between target i and the other targets in the label set,

d_i := max iou(b_i, b_j), j ≠ i   (1)
thus, the dynamic suppression strategy is updated according to the density function definition using the following equations (4-19):
NM:=max(Nt,dM) (2)
where N_M denotes the adaptive threshold of target M and d_M the density of target M. The dynamic suppression strategy covers three cases: (1) when a neighbouring bounding box is far from M, i.e. iou(M, b_i) < N_M, the behaviour is consistent with the initial NMS threshold; (2) when M lies in a dense region, i.e. d_M > N_t (4), the density value d_M of M is used as the adaptive A-NMS threshold N_M, so neighbouring candidate regions, which may be located around M, are retained; (3) for objects in sparse regions, i.e. d_M < N_t (5), the NMS threshold equals N_t, which reduces false positives (FP).
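The three cases above all reduce to the single rule N_M := max(N_t, d_M). A minimal sketch of the threshold choice plus a plain IoU helper (function names and the (x1, y1, x2, y2) box format are assumptions, not from the patent):

```python
def adaptive_nms_threshold(base_threshold, density):
    """A-NMS threshold for a detection M: N_M := max(N_t, d_M).

    Sparse region (d_M < N_t): keep the plain NMS threshold N_t,
    which suppresses more boxes and reduces false positives.
    Dense region (d_M > N_t): raise the threshold to d_M so that
    genuine neighbours around M survive suppression.
    """
    return max(base_threshold, density)

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0
```

A neighbouring box b_i is suppressed only when iou(M, b_i) exceeds the adaptive threshold, so raising the threshold in crowded regions retains overlapping true positives.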
In step (3), the adaptive Gaussian distribution modeling uses a Gaussian distribution function to model the predicted coordinates: the mean of the output coordinates serves as the mean of the Gaussian distribution, and the variance represents the uncertainty of the predicted localization.
The uncertainty of the box coordinates in the adaptive Gaussian modeling of step (3) can be modeled and evaluated with a Gaussian model for each of the centre coordinates, width and height. For a given test sample x, the output y is modeled by Gaussian parameters over t_x, t_y, t_w, t_h to express the localization uncertainty, as shown in formula (6):

P(y|x) = N(y; μ(x), s²(x))   (6)

where μ(x) and s²(x) are respectively the mean and variance of the box coordinate, and y is a value under that Gaussian distribution.
To predict the uncertainty of the box coordinates, the detection layer predicts, for each feature-map coordinate, a mean and a variance under the Gaussian model. Considering the nature of the Gaussian distribution and the structure of the YOLOv3 detection layer, the variance of the Gaussian distribution is constrained to lie between 0 and 1, fixing its range. The Gaussian parameters are therefore preprocessed with the following formulas, establishing four Gaussian distributions, one per coordinate.
In the detection layer, the mean of each coordinate, μ_tx, μ_ty, μ_tw, μ_th, represents the predicted coordinate of the Gaussian model, and the variance of each coordinate, Σ_tx, Σ_ty, Σ_tw, Σ_th, represents its uncertainty. Because μ_tx and μ_ty represent the centre coordinates of the box, they are processed to values between 0 and 1 with the sigmoid function; the variance of each coordinate is likewise processed to a value between 0 and 1 with the sigmoid function, representing the reliability of that coordinate. In YOLOv3, the height and width of the bounding box are handled through t_w and t_h relative to the prior (anchor) bounding boxes; i.e., the Gaussian parameters μ_tw and μ_th represent t_w and t_h in the YOLO network.
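The preprocessing just described, a sigmoid on the centre-coordinate means and on all four variances with the raw width/height means left for the anchor transform, can be sketched as follows; the dictionary keys are hypothetical labels for the eight detection-layer outputs:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def preprocess_gaussian_params(raw):
    """Squash mu_tx, mu_ty and the four variances to (0, 1) with a
    sigmoid; leave mu_tw and mu_th raw, since they play the role of
    YOLOv3's tw/th against the prior (anchor) boxes."""
    out = dict(raw)
    for key in ("mu_tx", "mu_ty", "var_tx", "var_ty", "var_tw", "var_th"):
        out[key] = sigmoid(raw[key])
    return out

# illustrative raw detection-layer outputs for one box
params = preprocess_gaussian_params({
    "mu_tx": 0.0, "mu_ty": 1.2, "mu_tw": 1.5, "mu_th": -0.3,
    "var_tx": 0.0, "var_ty": 1.0, "var_tw": -2.0, "var_th": 3.0,
})
```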
The AN-Gaussian YOLOv3 algorithm flow comprises the following steps:
the R-Darknet-53 optimized network obtained by YOLO in ImageNet classification task is used as AN initialization weight parameter of AN AN-Gaussian YOLOv3 backbone network, AN Adam optimizer is used for carrying out back propagation to update network parameters, the batch size of the network is set to 64, the initial learning rate is set to 0.001, 60000 iterations are carried out, and training is carried out by reducing the learning rate by half after 10k, 20k and 250k iterations respectively; and (5) adopting multi-scale training, and adjusting the resolution after each 10 iterations.
As can be seen from the detection-effect diagram in fig. 5, the algorithm localizes the distant small targets in the upper part of the figure precisely and can extract deep semantic information to further improve generalization capability; in the middle and lower parts of the figure, where the targets to be detected are occluded and the background information resembles the target information, the model still localizes the targets accurately. Therefore, the AN-Gaussian YOLOv3 algorithm designed by the invention shows good generalization and adaptive capability for both dense and sparse targets; its positioning accuracy is greatly improved, and it localizes targets accurately even under poor illumination conditions and target occlusion. However, accuracy is affected to some extent by noise when the image resolution is low.
FIG. 6 shows the relationship between IoU and localization uncertainty on the KITTI dataset: the IoU increases as the localization uncertainty decreases, and the larger the IoU, the smaller the localization uncertainty and the closer the prediction to the ground-truth label. The composite uncertainty evaluation index of the proposed adaptive Gaussian distribution detection algorithm can therefore effectively represent the confidence of the predicted coordinates.
In step (4), the image edge-enhancement GAN network and AN-Gaussian YOLOv3 are jointly designed into a unified framework; the loss function of the model is redesigned and the AN-Gaussian YOLOv3 detection loss is added to the discriminant loss, which improves image quality and localization precision, saves hardware resources, and meets the system's real-time and high-precision requirements. Tests are carried out on the COCO dataset and compared with other algorithms to verify the effectiveness of the algorithm.
When the detection result is accurate, the loss function substitutes the predicted coordinates into the Gaussian probability density function, as shown in formula (9), to judge the accuracy of the result: the greater the probability, the more accurate the prediction.
Therefore, the box-coordinate regression loss is reconstructed as a negative log-likelihood loss over the Gaussian probability density values; L_x, L_y, L_w, L_h denote the loss sums of the respective coordinate components, as shown in formulas (10) to (11):
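A sketch of the per-coordinate negative log-likelihood described above: the ground-truth coordinate is evaluated under the predicted Gaussian and the negative log taken (the small ε guards the logarithm; any weighting terms in the patent's formulas (10) to (11) are omitted here):

```python
import math

def gaussian_nll(gt, mu, var, eps=1e-9):
    """Negative log-likelihood of a ground-truth coordinate under the
    predicted Gaussian N(mu, var) -- the building block of the
    L_x, L_y, L_w, L_h regression losses."""
    pdf = math.exp(-((gt - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return -math.log(pdf + eps)
```

A confident, accurate prediction (mean on target, small variance) yields a low loss, while being confidently wrong is penalized more heavily than being uncertain; this is what drives the learned variance to track localization quality.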
Localization uncertainty evaluation index: after the localization uncertainty of the four coordinate components is obtained through Gaussian modeling, in order to push the scores of bounding boxes with high uncertainty below the threshold and thereby improve detection precision, the following composite uncertainty evaluation index is proposed:
C_r = σ_obj × U_s_ijk   (14)
wherein:
in the formula (15), U _ sijkIs the confidence of the location of the predicted value. Because the location uncertainty is determined collectively by the location coordinates, location confidence can be measured. Multiplying the original target confidence coefficient by the positioning confidence coefficient to obtain an uncertainty comprehensive score, wherein when the positioning is accurate, the positioning confidence coefficient is close to 1; when the positioning accuracy is poor, the position confidence degree is small, the position confidence degree approaches to 0, at the moment, the original target confidence degree is multiplied by a small coefficient, so that the uncertainty comprehensive score is obviously reduced, further, the prediction result lower than the threshold value is filtered, the FP is reduced to a certain extent, and the detection accuracy is improved.
The composite loss function: the AN-Gaussian YOLOv3 detection loss is introduced into the total discriminant loss, and the total discriminant-network loss function is redesigned to optimize and update the generator network and the detection network. The total discriminant-network loss is defined as the weighted sum of the total edge-enhancement-network loss, the AN-Gaussian YOLOv3 classification loss, and the uncertainty regression loss under the Gaussian distribution:
L_all = L_G_all + ξ · L_det_AN-Gaussian-YOLOv3   (16)

L_det_AN-Gaussian-YOLOv3 = L_cls_AN-Gaussian-YOLOv3 + L_reg_AN-Gaussian-YOLOv3   (17)
Experimental setup: the network is initialized with VGG19 weights pre-trained on ImageNet, and the weighting coefficient is initialized to 1; the algorithm model is trained end-to-end, with the learning rate set to 0.0001 and halved every 50k iterations; the batch size is set to 5, and Adam is used as the optimizer to update the weights until the entire architecture converges. 23 RRDB blocks and 6 RRDB dense residual blocks of the EEN network are used in the generator G.
The EEN network consists of densely connected sub-networks and a Mask branch network. Using an attention mechanism, the constructed Mask branch focuses the network on true edge information so as to remove noise and artifacts; finally, a mapping function F is learned that reconstructs the corresponding real image from a given low-resolution input.
For a given sample I_base, edge information is first detected and extracted with the Laplacian operator. The Laplacian L(x, y) of an image I(x, y) is defined as the second-order partial derivative of the image, and E(x, y) is the extracted edge feature:

E(x, y) = L(x, y) = ∂²I/∂x² + ∂²I/∂y²
the edge information is then extracted using a jump connection and mapped into a low resolution space, as opposed to operating in a high resolution space, which reduces the amount of computation. Meanwhile, the constructed Mask branch learns the image to detect and eliminate the separated noise, namely the wrong edge point when the edge is extracted; and then mapping the extracted edge features to a high-resolution space by utilizing an up-sampling operation of sub-pixel convolution to obtain a super-resolution image.
The loss function of the EEN network is defined as an image consistency loss and an edge consistency loss. When training the network, the Charbonnier loss between the medium-resolution and high-resolution images, called the image consistency loss, represents the distance between the high-resolution and super-resolution images. This helps to obtain an image with good edge information. However, object edges may be damaged and noise introduced, so good edge information is not guaranteed. Therefore, to account for the edge error, an edge consistency loss is introduced, which evaluates the Charbonnier loss between the edge information extracted from the medium-resolution image and that extracted from the high-resolution image.
Finally, the total consistency loss is the sum of the image and edge consistency losses, as shown below.
L_EEN = L_img_cl + L_edge_cl
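The consistency losses above can be sketched with the usual Charbonnier form sqrt(diff² + ε²), a differentiable L1 variant. The value ε = 1e-3 is a common default, not taken from the patent.

```python
import numpy as np

# Sketch of the EEN consistency losses: Charbonnier loss between the
# super-resolved and high-resolution images (image consistency) and between
# their extracted edge maps (edge consistency); L_EEN is their sum.
# eps = 1e-3 is an assumed, commonly used smoothing constant.

def charbonnier(a, b, eps=1e-3):
    """Mean Charbonnier distance between two arrays."""
    return np.mean(np.sqrt((a - b) ** 2 + eps ** 2))

def een_loss(sr, hr, sr_edges, hr_edges):
    l_img = charbonnier(sr, hr)                # L_img_cl
    l_edge = charbonnier(sr_edges, hr_edges)   # L_edge_cl
    return l_img + l_edge                      # L_EEN
```

Note that for identical inputs the loss floors at eps rather than zero, which is what makes the Charbonnier form smooth at the origin.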
As can be seen from fig. 7(a), the model with the added adaptive uncertainty algorithm detects non-occluded targets well. Fig. 7(b) shows that when the target density is high, the model exceeds the model without the uncertainty algorithm in positioning accuracy, and its parameters are adjusted in time according to the target density, giving better generalization. After adding the localization uncertainty algorithm, the predicted bounding boxes accurately detect the target objects in the image; the dashed boxes represent the localization uncertainty given by the adaptive NMS-Gaussian distribution algorithm.
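The density-dependent behaviour described above follows the adaptive NMS rule N_M = max(N_t, d_M) used by the model. A minimal sketch (the base threshold N_t = 0.5 below is illustrative, not the patent's value):

```python
# Sketch of the adaptive-NMS threshold rule: each detection M receives the
# suppression threshold N_M = max(N_t, d_M). In dense regions (d_M > N_t)
# the IoU threshold rises to the local density, so neighbouring boxes are
# retained; in sparse regions the base threshold N_t applies, reducing FPs.
# N_t = 0.5 is an assumed base threshold for illustration.

def adaptive_threshold(density, base_threshold=0.5):
    """N_M = max(N_t, d_M)."""
    return max(base_threshold, density)

# dense target: the threshold rises to the target's density
print(adaptive_threshold(0.8))  # 0.8
# sparse target: the base NMS threshold applies
print(adaptive_threshold(0.2))  # 0.5
```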
Based on the ModelArts cloud AI development platform, the PyTorch model trained with the adaptive Gaussian distribution uncertainty algorithm is converted into a pb model suitable for the Ascend chip of the HiLens Kit. In HiLens Studio, a multi-language integrated development environment, a skill template based on PyTorch 1.0 and Python 3.6 is created, the ModelArts pb model is imported from the OBS storage server, and the inference logic code is written. Finally, an .om skill model is obtained in HiLens Studio; after the skill code is compiled and debugged, the skill is published, deployed and run on the end-side device HiLens Kit.
In order to test the intelligent vehicle perception system in a real outdoor environment, a simple control strategy and a lane line recognition algorithm are first designed so that the intelligent vehicle can move along the drivable area of the lane according to the detected environment information. First, the inter-frame travel distance of the intelligent vehicle is calculated from its moving speed and the frame-difference time of HiLens image processing. Then, according to the detection result of the end-to-end combined model, the system judges whether the surroundings within the next frame interval form a safe drivable area and transmits the detection result to the main control decision system, which controls the vehicle's next movement according to the detection of specific targets. Figs. 9(a), 9(b) and 9(c) show the unmanned vehicle algorithm tested in a real outdoor environment under both good weather and wind-and-snow conditions; the algorithm retains good positioning accuracy and generalization ability even in severe environments.
The intelligent vehicle is field-tested in an outdoor environment with the simple control strategy and lane line recognition algorithm. HiLens captures images of the surrounding environment ahead and feeds them into the .om detection model, whose output controls the vehicle's next action through the controller. The main control system of the vehicle drives the stepping motor and servo motor according to the corresponding action commands. Tests were conducted outdoors on a campus. In most cases, the vehicle can drive safely along the lane line and accurately locate surrounding targets; however, when the surrounding environment is complex, it cannot accurately judge traffic rules, the two-dimensional image's ability to judge spatial position is limited, and the control precision of specific target tracking tasks remains to be improved.
The foregoing shows and describes the general principles and main features of the present invention and its advantages. Those skilled in the art will understand that the present invention is not limited to the embodiments described above; the embodiments and the description merely illustrate the principle of the invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.
Claims (9)
1. An unmanned vehicle environment detection method based on deep learning, characterized in that the method comprises the following steps:
step (1), inputting an input low-resolution image into the generator G of a generative adversarial network, inputting the image generated by the generator into the edge enhancement network, and then mapping the extracted edge features to a high-resolution space by a sub-pixel convolution up-sampling operation to obtain a super-resolution image;
step (2), generating a medium-resolution image with the generator; then sending the generated medium-resolution image to the edge enhancement network to generate a super-resolution image, and transmitting the generated super-resolution image to the target detector for the target classification and localization task;
step (3), performing adaptive Gaussian distribution modeling, with a dynamic suppression strategy that increases the threshold when the target density rises and mutual occlusion increases, and decreases the threshold when the target density is low and objects appear in isolation;
and step (4), fusing the EEN and the adaptive uncertainty target localization into an end-to-end detection model based on the GAN.
2. The deep learning-based unmanned vehicle environment detection method according to claim 1, characterized in that: in the step (1), the constructed Mask branch uses an attention mechanism to focus the network on real edge information so as to remove noise and artifacts, and the constructed Mask branch learns from the image to detect and eliminate the separated noise, i.e. the false edge points produced during edge extraction.
3. The deep learning-based unmanned vehicle environment detection method according to claim 1, characterized in that: in the step (2), after the edge enhancement network removes image noise and extracts high-frequency edge detail features to generate the super-resolution image, the discriminator of the generative adversarial network judges the generated super-resolution image, determines whether it is a fake image, finds the difference between the GT image and the intermediate super-resolution image, and back-propagates the difference to the generator G to generate the super-resolution image, until the discriminator cannot distinguish the generated image from the high-resolution image; the training of the whole network is then complete.
4. The deep learning-based unmanned vehicle environment detection method according to claim 3, characterized in that: in the step (2), the generator network branch during training adopts the SRResNet structure as the overall network structure: a 9 × 9 convolution layer with stride 1 extracts a primary feature map; RRDB dense residual blocks then extract image semantic information to obtain a clearer edge feature map; feature fusion is performed on the edge feature map and the primary features; and finally a medium-resolution image is obtained through an up-sampling operation; the medium-resolution image is sent to the EEN network, and edge features are extracted to obtain the super-resolution image; finally, the SR image is input into the YOLOv3 detection network for classification and localization to obtain the final result.
5. The deep learning-based unmanned vehicle environment detection method according to claim 1, characterized in that: in the step (3), a CNN is trained to fit the density function as a supervision signal, i.e. given an input picture, it outputs the object density at each position; the density function is shown in formula (1), where the density of target i is defined as the maximum bounding-box IoU with the other targets in the label set:
d_i := max_{j≠i} iou(b_i, b_j)    (1)
thus, the dynamic suppression strategy is updated according to the density function definition using the following formula:
N_M := max(N_t, d_M)    (2)
wherein N_M represents the adaptive threshold of target M, and d_M represents the density of target M; the dynamic suppression strategy covers three cases: (1) when the neighbouring bounding box is far from M, i.e. iou(M, b_i) < N_M, the behaviour is consistent with the initial NMS threshold; (2) when M is located in a dense region, i.e. d_M > N_t, the density value of M is used as the adaptive threshold of A-NMS, N_M = d_M; thus, neighbouring candidate regions that may be located around M are retained; (3) for objects in sparse regions, i.e. d_M < N_t, the NMS threshold equals N_t, so false positives (FP) can be reduced.
6. The deep learning-based unmanned vehicle environment detection method according to claim 5, characterized in that: in the step (3), the adaptive Gaussian distribution modeling models the predicted coordinates with a Gaussian distribution function; the mean of the output coordinates is used as the mean of the Gaussian distribution, and the variance represents the uncertainty of the predicted localization.
7. The deep learning-based unmanned vehicle environment detection method according to claim 6, characterized in that: the uncertainty of the Box coordinates in the adaptive Gaussian distribution modeling in the step (3) can be modeled and evaluated by a Gaussian model for each of the centre coordinates, width and height; for a given test sample x, the output y can be modeled by Gaussian parameters over t_x, t_y, t_w, t_h to represent the localization uncertainty, as shown in the following formula: P(y|x) = N(y; μ(x), s²(x)), where μ(x) and s²(x) are respectively the mean and variance of the Box coordinates, and y is a value under the Gaussian distribution.
8. The deep learning-based unmanned vehicle environment detection method according to claim 7, characterized in that: to predict the uncertainty of the Box coordinates, the predicted feature-map coordinates are the mean and variance in the Gaussian modeling; considering the nature of the Gaussian distribution and the structure of the YOLOv3 detection layer, the variance of the Gaussian distribution is constrained to lie between 0 and 1, fixing the range of the variance; therefore, the following formulas are adopted to preprocess the Gaussian parameters and establish four Gaussian distributions;
the mean of each coordinate in the detection layer represents the predicted coordinate of the Gaussian model, and the variance of each coordinate represents the uncertainty of that coordinate; because the means of the x and y coordinates represent the centre coordinates of the box, they are processed into values between 0 and 1 by the sigmoid function; the variance of each coordinate is likewise processed into a value between 0 and 1 by the sigmoid function to represent the reliability of the coordinate; in YOLOv3, the height and width information of the bounding box is processed through the prior (anchor) bounding boxes as t_w and t_h, i.e. the corresponding Gaussian mean parameters represent t_w and t_h in the YOLO network.
9. The deep learning-based unmanned vehicle environment detection method according to claim 1, characterized in that: in the step (4), the image edge enhancement GAN network and AN-Gaussian YOLOv3 are jointly designed into a unified framework, the loss function of the model is redesigned, and the AN-Gaussian YOLOv3 detection loss is added to the discriminant loss; the method is tested on the COCO data set and compared with other algorithms to verify its effectiveness.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110838473.9A CN113436217A (en) | 2021-07-23 | 2021-07-23 | Unmanned vehicle environment detection method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113436217A true CN113436217A (en) | 2021-09-24 |
Family
ID=77761668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110838473.9A Pending CN113436217A (en) | 2021-07-23 | 2021-07-23 | Unmanned vehicle environment detection method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113436217A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113963027A (en) * | 2021-10-28 | 2022-01-21 | 广州文远知行科技有限公司 | Uncertainty detection model training method and device, and uncertainty detection method and device |
CN114581790A (en) * | 2022-03-01 | 2022-06-03 | 哈尔滨理工大学 | Small target detection method based on image enhancement and multi-feature fusion |
CN115359258A (en) * | 2022-08-26 | 2022-11-18 | 中国科学院国家空间科学中心 | Weak and small target detection method and system for component uncertainty measurement |
CN115471773A (en) * | 2022-09-16 | 2022-12-13 | 北京联合大学 | Student tracking method and system for intelligent classroom |
CN116469047A (en) * | 2023-03-20 | 2023-07-21 | 南通锡鼎智能科技有限公司 | Small target detection method and detection device for laboratory teaching |
CN117952901A (en) * | 2023-12-12 | 2024-04-30 | 中国人民解放军战略支援部队航天工程大学 | Multi-source heterogeneous image change detection method and device based on generation countermeasure network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110458758A (en) * | 2019-07-29 | 2019-11-15 | 武汉工程大学 | A kind of image super-resolution rebuilding method, system and computer storage medium |
CN111524135A (en) * | 2020-05-11 | 2020-08-11 | 安徽继远软件有限公司 | Image enhancement-based method and system for detecting defects of small hardware fittings of power transmission line |
CN111899172A (en) * | 2020-07-16 | 2020-11-06 | 武汉大学 | Vehicle target detection method oriented to remote sensing application scene |
CN112906547A (en) * | 2021-02-09 | 2021-06-04 | 哈尔滨市科佳通用机电股份有限公司 | Railway train windshield breakage fault detection method based on E-YOLO |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110458758A (en) * | 2019-07-29 | 2019-11-15 | 武汉工程大学 | A kind of image super-resolution rebuilding method, system and computer storage medium |
CN111524135A (en) * | 2020-05-11 | 2020-08-11 | 安徽继远软件有限公司 | Image enhancement-based method and system for detecting defects of small hardware fittings of power transmission line |
CN111899172A (en) * | 2020-07-16 | 2020-11-06 | 武汉大学 | Vehicle target detection method oriented to remote sensing application scene |
CN112906547A (en) * | 2021-02-09 | 2021-06-04 | 哈尔滨市科佳通用机电股份有限公司 | Railway train windshield breakage fault detection method based on E-YOLO |
Non-Patent Citations (2)
Title |
---|
JAKARIA RABBI ET AL.: "Small-Object Detection in Remote Sensing Images with End-to-End Edge-Enhanced GAN and Object Detector Network", 《REMOTE SENSING》 * |
SONGTAO LIU, DI HUANG, YUNHONG WANG: "Adaptive NMS: Refining Pedestrian Detection in a Crowd", 《ARXIV:1904.03629V1 [CS.CV] 7 APR 2019》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113963027A (en) * | 2021-10-28 | 2022-01-21 | 广州文远知行科技有限公司 | Uncertainty detection model training method and device, and uncertainty detection method and device |
CN113963027B (en) * | 2021-10-28 | 2022-09-09 | 广州文远知行科技有限公司 | Uncertainty detection model training method and device, and uncertainty detection method and device |
CN114581790A (en) * | 2022-03-01 | 2022-06-03 | 哈尔滨理工大学 | Small target detection method based on image enhancement and multi-feature fusion |
CN115359258A (en) * | 2022-08-26 | 2022-11-18 | 中国科学院国家空间科学中心 | Weak and small target detection method and system for component uncertainty measurement |
CN115471773A (en) * | 2022-09-16 | 2022-12-13 | 北京联合大学 | Student tracking method and system for intelligent classroom |
CN115471773B (en) * | 2022-09-16 | 2023-09-15 | 北京联合大学 | Intelligent classroom-oriented student tracking method and system |
CN116469047A (en) * | 2023-03-20 | 2023-07-21 | 南通锡鼎智能科技有限公司 | Small target detection method and detection device for laboratory teaching |
CN117952901A (en) * | 2023-12-12 | 2024-04-30 | 中国人民解放军战略支援部队航天工程大学 | Multi-source heterogeneous image change detection method and device based on generation countermeasure network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113436217A (en) | Unmanned vehicle environment detection method based on deep learning | |
CN110781838B (en) | Multi-mode track prediction method for pedestrians in complex scene | |
CN109800689B (en) | Target tracking method based on space-time feature fusion learning | |
CN110929578B (en) | Anti-shielding pedestrian detection method based on attention mechanism | |
CN111626128B (en) | Pedestrian detection method based on improved YOLOv3 in orchard environment | |
CN110837778A (en) | Traffic police command gesture recognition method based on skeleton joint point sequence | |
CN113076871B (en) | Fish shoal automatic detection method based on target shielding compensation | |
CN111709300B (en) | Crowd counting method based on video image | |
JP2020038660A (en) | Learning method and learning device for detecting lane by using cnn, and test method and test device using the same | |
CN110097028B (en) | Crowd abnormal event detection method based on three-dimensional pyramid image generation network | |
CN113920107A (en) | Insulator damage detection method based on improved yolov5 algorithm | |
CN114758288A (en) | Power distribution network engineering safety control detection method and device | |
CN113705636A (en) | Method and device for predicting trajectory of automatic driving vehicle and electronic equipment | |
JP2020038661A (en) | Learning method and learning device for detecting lane by using lane model, and test method and test device using the same | |
CN113409252A (en) | Obstacle detection method for overhead transmission line inspection robot | |
CN113936210A (en) | Anti-collision method for tower crane | |
CN112347930A (en) | High-resolution image scene classification method based on self-learning semi-supervised deep neural network | |
Yi et al. | End-to-end neural network for autonomous steering using lidar point cloud data | |
CN114049541A (en) | Visual scene recognition method based on structural information characteristic decoupling and knowledge migration | |
CN113177439A (en) | Method for detecting pedestrian crossing road guardrail | |
CN115731517B (en) | Crowded Crowd detection method based on crown-RetinaNet network | |
CN116664851A (en) | Automatic driving data extraction method based on artificial intelligence | |
CN112614158B (en) | Sampling frame self-adaptive multi-feature fusion online target tracking method | |
Li et al. | Detection and discrimination of obstacles to vehicle environment under convolutional neural networks | |
Shi et al. | A novel model based on deep learning for Pedestrian detection and Trajectory prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210924 |