CN115937636A - Traffic target detection method for unmanned driving based on deep learning - Google Patents


Info

Publication number
CN115937636A
CN115937636A
Authority
CN
China
Prior art keywords
target detection
model
data set
module
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211703954.XA
Other languages
Chinese (zh)
Inventor
Zhu Yongjian (朱勇建)
Li Changxu (李长旭)
Wang Dong (王栋)
Zhang Yu (张裕)
Wang Jiayu (王嘉钰)
Liu Yunxiang (刘云翔)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Technology
Original Assignee
Shanghai Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Technology filed Critical Shanghai Institute of Technology
Priority to CN202211703954.XA priority Critical patent/CN115937636A/en
Publication of CN115937636A publication Critical patent/CN115937636A/en
Pending legal-status Critical Current

Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02T: Climate change mitigation technologies related to transportation
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a deep-learning-based traffic target detection method for unmanned driving. Starting from the mature YOLOv5 model, an ACmix module, which fuses convolution with a self-attention mechanism, is inserted in front of the SPP module, and a multi-scale target detection layer is added. The BDD100K data set is downloaded and processed to construct the training, validation and test sets, which are then fed to the improved YOLOv5-based traffic target detection model for training, testing and evaluation. In the model construction stage, the introduced ACmix module makes target feature extraction more effective. In the training stage, images containing no traffic targets are deleted from the data set, preventing them from interfering with training and accelerating network convergence. In the evaluation stage, the accuracy and speed of the model are tuned by adjusting its width and depth to meet the requirements of practical application.

Description

Traffic target detection method for unmanned driving based on deep learning
Technical Field
The invention relates to a traffic target detection method for unmanned driving based on deep learning.
Background
With the continuous development of automatic driving technology, target detection methods are receiving more and more attention. Because real driving roads are complex and diverse, fast and accurate target detection plays an important role in automatic driving. In a road environment, images captured by the camera have cluttered backgrounds, traffic targets of widely varying sizes, and problems such as dynamic objects and occlusion. Deep-learning-based target detection is one of the most important research directions in computer vision. With the development of artificial intelligence and the continuous upgrading and iteration of computer hardware, target detection has gradually evolved from traditional feature-extraction methods toward deep-learning techniques. Among deep-learning detectors, single-stage models represented by YOLO are fast, reasonably accurate, compact and easy to improve, but their detection precision on small, low-resolution targets is low, and missed and false detections occur easily.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a traffic target detection method for unmanned driving based on deep learning.
In order to achieve the above purpose, the technical solution for solving the technical problem is as follows:
a traffic target detection method for unmanned driving based on deep learning comprises the following steps:
S1: downloading and processing a BDD100K data set, including a training data set and a test data set;
S2: adding a multi-scale target detection layer to the model;
S3: introducing an ACmix module into the YOLOv5 network model to enhance the feature expression and learning capacity of the model and reduce the model computation overhead.
Further, S1 includes the following: converting the downloaded BDD100K data set from json format to the txt format used by YOLOv5, and constructing the traffic target detection training and test data sets from the real-world images in the BDD100K data set; the training data consist of 100,000 images covering six sample classes (bus, car, truck, person, bike and motor) and are divided at a train:val:test ratio of 7:2:1 into a 70,000-image training set, a 20,000-image validation set and a 10,000-image test set for model training, validation and testing.
Further, in S1: bus denotes medium and large buses; car denotes passenger cars in various forms, including sedans, minibuses and SUVs; truck denotes small, medium and large trucks, including pickups; person denotes pedestrians; bike denotes bicycles; motor denotes motorcycles.
Further, the S2 includes the following contents:
S2-1, adding an upsampling layer in the upsampling module of the PAN feature fusion network of the YOLOv5 target detection model, namely a 4× upsampling scale added on top of the existing 8×, 16× and 32× scales;
S2-2, adding a Concat fusion layer in the PAN feature fusion network of the YOLOv5 target detection model, and fusing, through the added Concat fusion layer, the 4× upsampling layer added in S2-1 with the same-size feature map obtained during backbone feature extraction, generating a 4× upsampled feature map;
S2-3, adding a small target detection layer that uses the 4× upsampled feature map from S2-2 for small target detection, so that the deep-learning-based traffic target detection model for unmanned driving gains prediction layers at 4 scales for the multi-scale detection of the Head part;
and S2-4, for the small target detection layer added in S2-3, adding a group of anchor boxes of small target size, obtained with a K-means adaptive algorithm so that they match the size characteristics of small targets.
Further, in S3:
An ACmix module is introduced into the YOLOv5 network model; specifically, the ACmix module is inserted at the tail of the YOLOv5 backbone network, namely between the last CBL module and the SPP module, improving the model feature expression capability while reducing the model computation overhead.
Thanks to the above technical scheme, compared with the prior art the invention has the following advantages and positive effects: by fusing multi-scale features and exploiting the combined strengths of convolution and self-attention, accuracy is improved and the amount of computation is reduced, taking both precision and detection efficiency into account.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. In the drawings:
FIG. 1 is a flow chart of a traffic target detection method for unmanned driving based on deep learning of the present invention;
FIG. 2 is a schematic diagram of a detection network model architecture of the present invention;
fig. 3 is a schematic diagram of the ACmix structure of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, the invention discloses a traffic target detection method for unmanned driving based on deep learning, which comprises the following steps:
s1: the BDD100K data sets, including the training data set and the test data set, are downloaded and processed.
Further, S1 includes the following:
Converting the downloaded BDD100K data set from json format to the txt format used by YOLOv5, and constructing the traffic target detection training and test data sets from the real-world images in the BDD100K data set; the training data consist of 100,000 images covering six sample classes (bus, car, truck, person, bike and motor) and are divided at a train:val:test ratio of 7:2:1 into a 70,000-image training set, a 20,000-image validation set and a 10,000-image test set for model training, validation and testing.
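As an illustration, the json-to-txt conversion and the 7:2:1 split described above can be sketched in Python. The BDD100K field names (`labels`, `category`, `box2d`) and the class indices used here are assumptions for this sketch, not part of the patent:

```python
import random

# Assumed class indices for the six categories named above
CLASSES = {"bus": 0, "car": 1, "truck": 2, "person": 3, "bike": 4, "motor": 5}
IMG_W, IMG_H = 1280, 720  # BDD100K frames are 1280 x 720

def bdd_to_yolo(record):
    """Convert one parsed BDD100K image record to YOLO txt lines.

    Returns [] for an image with no traffic targets, which can then be
    dropped from the training data as the patent describes.
    """
    lines = []
    for obj in record.get("labels", []):
        if obj.get("category") not in CLASSES or "box2d" not in obj:
            continue  # skip unused categories and non-box annotations
        b = obj["box2d"]
        xc = (b["x1"] + b["x2"]) / 2 / IMG_W  # normalised centre x
        yc = (b["y1"] + b["y2"]) / 2 / IMG_H  # normalised centre y
        w = (b["x2"] - b["x1"]) / IMG_W       # normalised box width
        h = (b["y2"] - b["y1"]) / IMG_H       # normalised box height
        lines.append(f"{CLASSES[obj['category']]} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines

def split_dataset(names, seed=0):
    """Shuffle image names and split them 7:2:1 into train/val/test."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    a, b = len(names) * 7 // 10, len(names) * 9 // 10  # 70% / 90% cut points
    return names[:a], names[a:b], names[b:]
```

The per-image records would come from `json.load` over the downloaded BDD100K label file; the exact schema should be checked against the data set release actually used.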
S2: and adding a multi-scale target detection layer to the model.
Further, S2 includes the following:
S2-1: as shown in FIG. 2, an upsampling layer is added in the upsampling module of the PAN feature fusion network of the YOLOv5 target detection model, namely a 4× upsampling scale added on top of the existing 8×, 16× and 32× scales;
S2-2: a Concat fusion layer is added in the PAN feature fusion network of the YOLOv5 target detection model, and through it the 4× upsampling layer added in S2-1 is fused with the same-size feature map obtained during backbone feature extraction; 4-level Spatial Pyramid Pooling (SPP) is adopted to enlarge the receptive field, and SPP performs multi-scale feature fusion over the 4 feature maps of different sizes, realizing the multi-scale feature fusion of the Neck part; the specific Neck structure is shown in FIG. 2;
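The upsampling and Concat operations of S2-1 and S2-2 can be illustrated with a minimal pure-Python sketch, with feature maps represented as nested lists (a real implementation would operate on tensors):

```python
def upsample2x(fm):
    # Nearest-neighbour 2x upsampling of one 2-D feature map
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out

def concat_channels(a, b):
    # Channel-wise Concat of feature maps stored as [channel][h][w]:
    # the spatial sizes must match, the channel lists are simply joined
    return a + b
```

Applying `upsample2x` twice to a stride-16 map yields a stride-4 map that can be concatenated with the same-size backbone feature map, mirroring the added 4× branch.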
S2-3: a small target detection layer is added, using the 4× upsampled feature map from S2-2 for small target detection, so that the improved YOLOv5-based detection model has prediction layers at 4 scales, namely the 4×, 8×, 16× and 32× upsampling feature layers; for a 512 × 512 input image, the four feature scales obtained after adding the detection layer are the 128 × 128, 64 × 64, 32 × 32 and 16 × 16 feature layers, used to realize the multi-scale detection of the Head part; the specific Head structure is shown in FIG. 2;
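The four feature scales follow directly from the network strides; a one-line check (the stride values are the standard YOLOv5 ones plus the added stride-4 branch):

```python
def head_grid_sizes(img_size, strides=(4, 8, 16, 32)):
    # One detection grid per scale; stride 4 is the added small-target layer
    return [img_size // s for s in strides]

print(head_grid_sizes(512))  # -> [128, 64, 32, 16], matching the scales above
```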
S2-4: for the small target detection layer added in S2-3, a group of small-size anchor boxes (anchors) is added, obtained with a K-means adaptive algorithm so that they match the size characteristics of small targets; the small-scale anchors are assigned to the finer grid of the 128 × 128 feature layer added in S2-3, so that in total 12 anchors (3 per scale) cover the 4 detection scales;
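A minimal sketch of K-means anchor fitting with the usual 1 − IoU distance, in pure Python; box widths and heights are in pixels and must be positive, and a real run would feed in the BDD100K ground-truth boxes:

```python
import random

def iou_wh(box, anchor):
    # IoU between two (w, h) pairs, both treated as centred at the origin
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """K-means over (w, h) boxes using 1 - IoU as the distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # random initial anchors
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:  # assign each box to its highest-IoU anchor
            i = max(range(k), key=lambda j: iou_wh(b, anchors[j]))
            clusters[i].append(b)
        new = []
        for i, c in enumerate(clusters):
            if not c:
                new.append(anchors[i])  # keep the anchor of an empty cluster
                continue
            new.append((sum(b[0] for b in c) / len(c),
                        sum(b[1] for b in c) / len(c)))
        if new == anchors:  # converged
            break
        anchors = new
    return sorted(anchors)
```

With the extra 128 × 128 scale, the fitted anchors would be split into 4 groups of 3 by area, smallest group to the stride-4 layer.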
S3: an ACmix module is introduced into the YOLOv5 network model to enhance the feature expression and learning capacity of the model and reduce the model computation overhead.
Further, S3 includes the following:
An ACmix module is inserted between the last CBL module and the SPP module in the backbone network.
Specifically, ACmix includes two stages.
Stage I: the input features are projected by three 1 × 1 convolutions, and each projection is reshaped into N pieces, yielding a rich set of 3 × N intermediate feature maps.
Stage II: on the self-attention path, the intermediate features are gathered into N groups of 3 feature maps, one from each 1 × 1 convolution; within a group the 3 maps serve as queries, keys and values, following the traditional multi-head self-attention model.
On the convolution path with kernel size k, a lightweight fully connected layer (3N → k^2 N) generates k^2 feature maps per group, N groups in total. By shifting and aggregating the generated features, the input is processed in a convolutional manner, gathering information from local receptive fields as conventional convolution does. Finally, the outputs of the two paths are added.
In Stage I, convolution and self-attention actually share the same operation, namely projecting the input feature map with 1 × 1 convolutions. In Stage II, ACmix adds computation only through the lightweight fully connected layer and group convolution; this overhead is linear in the channel size C and smaller than the cost of Stage I.
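The complexity claim can be illustrated with rough multiply-add counts. This is a deliberate simplification that ignores the attention computation itself and the head count N; the numbers are only meant to show the linear-in-C overhead of Stage II against the quadratic-in-C Stage I projections:

```python
def stage1_flops(c, h, w):
    # Stage I: three 1x1 convolutions projecting C channels to C channels,
    # shared by the convolution and the self-attention paths -> O(C^2) per pixel
    return 3 * c * c * h * w

def stage2_fc_flops(c, h, w, k=3):
    # Stage II convolution path: lightweight fully connected layer producing
    # k^2 feature maps per group -> linear in the channel size C per pixel
    return 3 * k * k * c * h * w

# For a typical late-backbone map (C=256, 20x20), the Stage II overhead
# is far below the shared Stage I projection cost:
print(stage1_flops(256, 20, 20), stage2_fc_flops(256, 20, 20))
```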
Compared with the prior art, the invention improves the backbone and Neck networks of YOLOv5, adds a multi-scale target detection layer and introduces the ACmix module. This reduces the computation overhead of the model, increases its speed, and makes the feature extraction network pay more attention to shallow features, so that both shallow detail features and deep high-level semantic features are extracted more thoroughly and the robustness of the model is better.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A traffic target detection method for unmanned driving based on deep learning is characterized by comprising the following steps:
S1: downloading and processing a BDD100K data set, including a training data set and a test data set;
S2: adding a multi-scale target detection layer to the YOLOv5 target detection model;
S3: introducing an ACmix module into the YOLOv5 target detection model to enhance the feature expression and learning capacity of the model and reduce the model computation overhead.
2. The traffic target detection method for unmanned driving based on deep learning according to claim 1, wherein the S1 comprises:
converting the downloaded BDD100K data set from json format to the txt format used by YOLOv5, and constructing the traffic target detection training and test data sets from the real-world images in the BDD100K data set, wherein the training data consist of 100,000 images comprising six sample classes: bus, car, truck, person, bike and motor, divided at a train:val:test ratio of 7:2:1 into a 70,000-image training set, a 20,000-image validation set and a 10,000-image test set for model training, validation and testing.
3. The traffic target detection method for unmanned driving based on deep learning according to claim 2, wherein in S1,
bus denotes medium and large buses;
car denotes passenger cars in various forms, including sedans, minibuses and SUVs;
truck denotes small, medium and large trucks, including pickups;
person denotes pedestrians;
bike denotes bicycles;
motor denotes motorcycles.
4. The traffic target detection method for unmanned driving based on deep learning according to claim 1, wherein the S2 comprises:
S2-1, adding an upsampling layer in the upsampling module of the PAN feature fusion network of the YOLOv5 target detection model, namely a 4× upsampling scale added on top of the existing 8×, 16× and 32× scales;
S2-2, adding a Concat fusion layer in the PAN feature fusion network of the YOLOv5 target detection model, and fusing, through the added Concat fusion layer, the added 4× upsampling layer with the same-size feature map obtained during backbone feature extraction, generating a 4× upsampled feature map;
S2-3, adding a small target detection layer that uses the 4× upsampled feature map for small target detection, so that the deep-learning-based traffic target detection model for unmanned driving gains prediction layers at 4 scales for the multi-scale detection of the Head part;
and S2-4, for the added small target detection layer, adding a group of anchor boxes of small target size, obtained with a K-means adaptive algorithm so that they match the size characteristics of small targets.
5. The traffic target detection method for unmanned driving based on deep learning according to claim 1, wherein in S3:
the ACmix module introduced into the YOLOv5 target detection model is inserted at the tail of the YOLOv5 backbone network, namely between the last CBL module and the SPP module in the backbone network, improving the model feature expression capability and reducing the model computation overhead.
CN202211703954.XA 2022-12-29 2022-12-29 Traffic target detection method for unmanned driving based on deep learning Pending CN115937636A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211703954.XA CN115937636A (en) 2022-12-29 2022-12-29 Traffic target detection method for unmanned driving based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211703954.XA CN115937636A (en) 2022-12-29 2022-12-29 Traffic target detection method for unmanned driving based on deep learning

Publications (1)

Publication Number Publication Date
CN115937636A 2023-04-07

Family

ID=86550659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211703954.XA Pending CN115937636A (en) 2022-12-29 2022-12-29 Traffic target detection method for unmanned driving based on deep learning

Country Status (1)

Country Link
CN (1) CN115937636A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721351A (en) * 2023-07-06 2023-09-08 内蒙古电力(集团)有限责任公司内蒙古超高压供电分公司 Remote sensing intelligent extraction method for road environment characteristics in overhead line channel


Similar Documents

Publication Publication Date Title
CN110135366B (en) Shielded pedestrian re-identification method based on multi-scale generation countermeasure network
CN110222604B (en) Target identification method and device based on shared convolutional neural network
CN114202743A (en) Improved fast-RCNN-based small target detection method in automatic driving scene
CN112489072B (en) Vehicle-mounted video perception information transmission load optimization method and device
CN116342894B (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN115937636A (en) Traffic target detection method for unmanned driving based on deep learning
CN116994047A (en) Small sample image defect target detection method based on self-supervision pre-training
CN115743101A (en) Vehicle track prediction method, and track prediction model training method and device
CN114267025A (en) Traffic sign detection method based on high-resolution network and light-weight attention mechanism
CN116258940A (en) Small target detection method for multi-scale features and self-adaptive weights
CN113076988B (en) Mobile robot vision SLAM key frame self-adaptive screening method based on neural network
CN113902753A (en) Image semantic segmentation method and system based on dual-channel and self-attention mechanism
CN113901931A (en) Knowledge distillation model-based behavior recognition method for infrared and visible light videos
CN111612803A (en) Vehicle image semantic segmentation method based on image definition
CN112101113A (en) Lightweight unmanned aerial vehicle image small target detection method
CN114359689B (en) Dynamic target detection and tracking method
CN115861861A (en) Lightweight acceptance method based on unmanned aerial vehicle distribution line inspection
Jiangzhou et al. Research on real-time object detection algorithm in traffic monitoring scene
CN116189012A (en) Unmanned aerial vehicle ground small target detection method based on improved YOLOX
Shao et al. Research on yolov5 vehicle object detection algorithm based on attention mechanism
Li et al. Infrared Small Target Detection Algorithm Based on ISTD-CenterNet.
CN115359271B (en) Large-scale invariance deep space small celestial body image matching method
Wang et al. Research on Vehicle Object Detection Based on Deep Learning
CN112861733B (en) Night traffic video significance detection method based on space-time double coding
CN113076898B (en) Traffic vehicle target detection method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination