CN115937636A - Traffic target detection method for unmanned driving based on deep learning - Google Patents
- Publication number
- CN115937636A CN115937636A CN202211703954.XA CN202211703954A CN115937636A CN 115937636 A CN115937636 A CN 115937636A CN 202211703954 A CN202211703954 A CN 202211703954A CN 115937636 A CN115937636 A CN 115937636A
- Authority
- CN
- China
- Prior art keywords
- target detection
- model
- data set
- module
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
Abstract
The invention provides a deep-learning-based traffic target detection method for unmanned driving. Starting from the existing mature YOLOv5 model, an ACmix module, which fuses convolution with a self-attention mechanism, is added in front of the SPP module, and a multi-scale target detection layer is added. The BDD100K data set is downloaded and processed to construct the training, validation and test sets for model training, which are then fed to the constructed improved YOLOv5-based traffic target detection model for training, testing and evaluation. In the model construction stage, the introduced ACmix module makes target feature extraction more effective. In the training stage, images that contain no traffic targets are deleted from the data set, which prevents them from interfering with model training and accelerates network convergence. In the evaluation stage, the accuracy and speed of the model are balanced by adjusting the model's width and depth so as to meet the requirements of practical application.
Description
Technical Field
The invention relates to a traffic target detection method for unmanned driving based on deep learning.
Background
With the continuous development of automatic driving technology, target detection methods are receiving more and more attention. Because actual driving roads are complex and diverse, fast and highly accurate target detection plays an important role in automatic driving. In a road environment, the images captured by the camera have complex backgrounds, traffic targets of widely varying sizes, dynamic objects and occlusion. Deep-learning-based target detection is one of the most important research directions in computer vision. With the development of artificial intelligence technology and the continuous upgrading and iteration of computer hardware, target detection has gradually moved from traditional feature-extraction methods to deep learning techniques. Among deep learning detectors, single-stage models represented by YOLO offer high detection speed and good precision with a compact model that is easy to improve, but their detection precision for small, low-resolution targets is low, and missed and false detections occur easily.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a traffic target detection method for unmanned driving based on deep learning.
In order to achieve the above purpose, the technical solution for solving the technical problem is as follows:
a traffic target detection method for unmanned driving based on deep learning comprises the following steps:
s1: downloading and processing a BDD100K data set, including a training data set and a testing data set;
s2: adding a multi-scale target detection layer to the model;
s3: an ACmix module is introduced into the YOLOv5 network model to enhance the model's feature expression and learning capability and reduce its computational overhead.
Further, the S1 includes the following contents: converting the downloaded BDD100K data set into a txt format adopted by YOLOv5 from a json format, constructing a traffic target detection training data set and a test data set by adopting real shot images in the BDD100K data set, wherein the training data set consists of 10 ten thousand images, comprises six types of samples of bus, car, truck, person, bike and motor, and is according to train: val: test =7:2: the proportion of 1 is divided into 70000 frames of training sets, 20000 frames of verification sets and 10000 frames of test sets for model training, verification and testing.
Further, in S1: bus denotes a medium or large bus; car denotes a passenger car in its various forms, including sedans, minibuses and SUVs; truck denotes small, medium and large trucks, including pickups; person denotes a human; bike denotes a bicycle; motor denotes a motorcycle.
Further, the S2 includes the following contents:
s2-1, adding an upsampling layer in the upsampling module of the PAN feature fusion network of the YOLOv5 target detection model, namely a 4× upsampling layer added alongside the existing 8×, 16× and 32× layers;
s2-2, adding a Concat fusion layer in the PAN feature fusion network of the YOLOv5 target detection model, and fusing the 4× upsampled features from S2-1 with the same-size feature map obtained during backbone feature extraction through the added Concat layer, generating a 4× upsampled feature map;
s2-3, adding a small target detection layer that uses the 4× upsampled feature map from S2-2 for small target detection, giving the deep-learning-based traffic target detection model for unmanned driving prediction layers at 4 scales, which are used for multi-scale detection in the Head part;
and S2-4, adding a group of anchor boxes sized for small targets to match the small target detection layer added in S2-3, the anchor boxes matching small-target size characteristics being obtained with a K-means adaptive algorithm.
Further, in S3:
An ACmix module is introduced into the YOLOv5 network model; specifically, the ACmix module is inserted at the tail of the YOLOv5 backbone network, namely between the last CBL module and the SPP module, so that the feature expression capability of the model is improved and the computational overhead is reduced.
Due to the adoption of the above technical scheme, compared with the prior art, the invention has the following advantages and positive effects: by fusing multi-scale features and combining the advantages of convolution and self-attention, accuracy is improved and computation is reduced, so that both accuracy and detection efficiency are achieved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. In the drawings:
FIG. 1 is a flow chart of a traffic target detection method for unmanned driving based on deep learning of the present invention;
FIG. 2 is a schematic diagram of a detection network model architecture of the present invention;
fig. 3 is a schematic diagram of the ACmix structure of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, the invention discloses a traffic target detection method for unmanned driving based on deep learning, which comprises the following steps:
s1: the BDD100K data sets, including the training data set and the test data set, are downloaded and processed.
Further, S1 includes the following:
The downloaded BDD100K data set is converted from json format to the txt format adopted by YOLOv5, and a traffic target detection training data set and test data set are constructed from the real captured images in BDD100K. The training data consist of 100,000 images covering six sample classes: bus, car, truck, person, bike and motor, divided according to a train:val:test ratio of 7:2:1 into a 70,000-image training set, a 20,000-image validation set and a 10,000-image test set for model training, validation and testing.
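A minimal sketch of this S1 preprocessing: converting a BDD100K pixel box to YOLOv5's normalized txt label format, and splitting the image list train:val:test = 7:2:1. The field names `x1`/`y1`/`x2`/`y2` and the 1280 × 720 image size reflect the public BDD100K label schema, not the patent text, and the function names are illustrative.

```python
import random

def bdd_box_to_yolo(box, img_w=1280, img_h=720):
    """Convert a BDD100K pixel box {x1, y1, x2, y2} to the normalized
    (cx, cy, w, h) tuple that YOLOv5's txt label format expects."""
    cx = (box["x1"] + box["x2"]) / 2.0 / img_w
    cy = (box["y1"] + box["y2"]) / 2.0 / img_h
    w = (box["x2"] - box["x1"]) / img_w
    h = (box["y2"] - box["y1"]) / img_h
    return cx, cy, w, h

def split_dataset(names, seed=0):
    """Shuffle image names and split them 7:2:1 into train/val/test."""
    names = list(names)
    random.Random(seed).shuffle(names)
    n = len(names)
    n_train, n_val = int(n * 0.7), int(n * 0.2)
    return (names[:n_train],
            names[n_train:n_train + n_val],
            names[n_train + n_val:])
```

Applied to 100,000 image names, this split yields exactly the 70,000/20,000/10,000 partition described above.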
S2: and adding a multi-scale target detection layer to the model.
Further, S2 includes the following:
s2-1: as shown in fig. 2, an upsampling layer is added in the upsampling module of the PAN feature fusion network of the YOLOv5 target detection model, namely a 4× upsampling layer added alongside the existing 8×, 16× and 32× layers;
s2-2: in the PAN feature fusion network of the YOLOv5 target detection model, a Concat fusion layer is added, through which the 4× upsampled features from S2-1 are fused with the same-size feature map obtained during backbone feature extraction; 4-level Spatial Pyramid Pooling (SPP) is used to enlarge the receptive field, and SPP realizes multi-scale fusion of the 4 feature maps of different sizes, giving the Neck part its multi-scale feature fusion; the specific Neck structure is shown in FIG. 2;
s2-3: a small target detection layer is added, and the 4× upsampled feature map from S2-2 is used for small target detection; the improved YOLOv5-based detection model is given prediction layers at 4 scales, namely the 4×, 8×, 16× and 32× upsampling feature layers; for a 512 × 512 input image, the four feature scales obtained after adding the detection layer are a 128 × 128 feature layer, a 64 × 64 feature layer, a 32 × 32 feature layer and a 16 × 16 feature layer, which realize multi-scale detection in the Head part; the specific Head structure is shown in fig. 2;
s2-4: to match the small target detection layer added in S2-3, a group of anchor boxes (anchors) sized for small targets is added, and anchor boxes matching small-target size characteristics are obtained with a K-means adaptive algorithm; corresponding to the 128 × 128 feature layer added in S2-3, small-scale anchors are assigned to the finely divided grid, so that the anchors are extended to 12 groups across the 4 detection scales;
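The anchor-selection step of S2-4 can be sketched as a plain k-means over ground-truth box widths and heights. This is an illustrative simplification: it uses a Euclidean distance, whereas YOLOv5's own autoanchor clusters with a 1 − IoU style metric and refines the result genetically; the function name is assumed.

```python
import numpy as np

def kmeans_anchors(wh, k=12, iters=50, seed=0):
    """Cluster ground-truth box (width, height) pairs into k anchor sizes.
    Plain Euclidean k-means sketch; not YOLOv5's actual autoanchor."""
    wh = np.asarray(wh, dtype=float)
    rng = np.random.default_rng(seed)
    # initialize centers from randomly chosen boxes
    centers = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        # squared Euclidean distance from every box to every center
        d = ((wh[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            members = wh[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    # sort anchors by area: the smallest go to the added 4x small-target layer
    return centers[np.argsort(centers.prod(1))]
```

With k = 12 the sorted result maps naturally onto the 12 anchor groups spread over the 4 detection scales described above.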
s3: an ACmix module is introduced into the YOLOv5 network model to enhance the model's feature expression and learning capability and reduce its computational overhead.
Further, S3 includes the following:
and inserting an ACmix module between the last CBL module and the SPP module in the backbone network.
Specifically, ACmix comprises two stages.
Stage I: the input features are projected by three 1 × 1 convolutions and each projection is reshaped into N pieces. In this way, a rich set of intermediate features containing 3 × N feature maps is obtained.
Stage II: for the self-attention path, the intermediate features are gathered into N groups, each group containing 3 feature maps, one from each 1 × 1 convolution. The corresponding 3 feature maps serve as queries, keys and values, respectively, following the traditional multi-head self-attention model.
For the convolution path with kernel size k, a lightweight fully connected layer (mapping 3N feature maps to k²N) generates k² feature maps in each of the N groups. By shifting and aggregating the generated features, the input features are processed in a convolutional manner, gathering information from local receptive fields as a conventional convolution does. Finally, the outputs of the two paths are added.
In stage I, convolution and self-attention actually share the same operation, projecting the input feature map with 1 × 1 convolutions. In stage II, ACmix introduces additional computational overhead through the lightweight fully connected layer and group convolution, whose computational complexity is linear in the channel size C and smaller than that of stage I.
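The two-stage structure can be illustrated with a minimal NumPy sketch: stage I's shared 1 × 1 projections (a 1 × 1 convolution is just a matmul over channels) and a single-head, global-softmax version of the stage II attention path. The windowing, multi-head grouping, convolution path and learned shift aggregation of the real ACmix are omitted, and all weight names are illustrative.

```python
import numpy as np

def acmix_stage1(x, wq, wk, wv):
    """Stage I: project the input feature map with three 1x1 convolutions,
    producing the query/key/value intermediates shared by both paths.
    x: (C, H, W); wq/wk/wv: (C_out, C)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                 # 1x1 conv == channel matmul
    q, k, v = (m @ flat for m in (wq, wk, wv))
    return (q.reshape(-1, h, w),
            k.reshape(-1, h, w),
            v.reshape(-1, h, w))

def attention_path(q, k, v):
    """Stage II, self-attention branch: single head with a global softmax
    over all spatial positions (a simplification of ACmix's windowed,
    multi-head formulation)."""
    c, h, w = q.shape
    qf, kf, vf = (t.reshape(c, h * w) for t in (q, k, v))
    logits = qf.T @ kf / np.sqrt(c)            # (HW, HW) attention scores
    a = np.exp(logits - logits.max(1, keepdims=True))
    a /= a.sum(1, keepdims=True)               # row-wise softmax
    return (vf @ a.T).reshape(c, h, w)         # weighted sum of values
```

In the full module the convolution path reuses the same q/k/v intermediates through the lightweight fully connected layer, which is where the shared stage I computation pays off.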
Compared with the prior art, the invention improves the backbone and Neck networks of YOLOv5, adds a multi-scale target detection layer and introduces the ACmix module. This reduces model computation and improves speed, makes the feature extraction network pay more attention to shallow features so that both shallow detail features and deep high-level semantic features are extracted more thoroughly, and gives the model better robustness.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (5)
1. A traffic target detection method for unmanned driving based on deep learning is characterized by comprising the following steps:
s1: download and process BDD100K data sets, including: training a data set and testing the data set;
s2: adding a multi-scale target detection layer to the YOLOv5 target detection model;
s3: an ACmix module is introduced into a YOLOv5 target detection model to enhance the feature expression and learning capacity of the model and reduce the model operation overhead.
2. The traffic target detection method for unmanned driving based on deep learning according to claim 1, wherein the S1 comprises:
converting the downloaded BDD100K data set from json format to the txt format adopted by YOLOv5, and constructing a traffic target detection training data set and test data set from the real captured images in BDD100K, wherein the training data set consists of 100,000 images comprising six sample classes: bus, car, truck, person, bike and motor, divided according to a train:val:test ratio of 7:2:1 into a 70,000-image training set, a 20,000-image validation set and a 10,000-image test set for model training, validation and testing.
3. The traffic target detection method for unmanned driving based on deep learning according to claim 2, wherein, in S1,
the bus is a medium bus or a large bus;
car is a passenger car in its various forms, including sedans, minibuses and SUVs;
truck is a small, medium or large truck, including pickups;
person is a human;
bike is a bicycle;
the motor is a motorcycle.
4. The deep-learning-based traffic target detection method for unmanned driving according to claim 1, wherein the S2 comprises:
s2-1, adding an upsampling layer in the upsampling module of the PAN feature fusion network of the YOLOv5 target detection model, namely a 4× upsampling layer added alongside the existing 8×, 16× and 32× layers;
s2-2, adding a Concat fusion layer in the PAN feature fusion network of the YOLOv5 target detection model, and fusing the added 4× upsampled features with the same-size feature map obtained during backbone feature extraction through the added Concat layer, generating a 4× upsampled feature map;
s2-3, adding a small target detection layer that uses the 4× upsampled feature map for small target detection, giving the deep-learning-based traffic target detection model for unmanned driving prediction layers at 4 scales, which are used for multi-scale detection in the Head part;
and S2-4, adding a group of anchor boxes sized for small targets to match the added small target detection layer, the anchor boxes matching small-target size characteristics being obtained with a K-means adaptive algorithm.
5. The deep-learning-based traffic target detection method for unmanned driving according to claim 1, wherein, in S3:
an ACmix module is introduced into the YOLOv5 target detection model, wherein the ACmix module is inserted at the tail of the YOLOv5 backbone network, namely between the last CBL module and the SPP module, so that the feature expression capability of the model is improved and the computational overhead is reduced.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211703954.XA CN115937636A (en) | 2022-12-29 | 2022-12-29 | Traffic target detection method for unmanned driving based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211703954.XA CN115937636A (en) | 2022-12-29 | 2022-12-29 | Traffic target detection method for unmanned driving based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115937636A true CN115937636A (en) | 2023-04-07 |
Family
ID=86550659
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211703954.XA Pending CN115937636A (en) | 2022-12-29 | 2022-12-29 | Traffic target detection method for unmanned driving based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115937636A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116721351A (en) * | 2023-07-06 | 2023-09-08 | 内蒙古电力(集团)有限责任公司内蒙古超高压供电分公司 | Remote sensing intelligent extraction method for road environment characteristics in overhead line channel |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110135366B (en) | Shielded pedestrian re-identification method based on multi-scale generation countermeasure network | |
CN110222604B (en) | Target identification method and device based on shared convolutional neural network | |
CN114202743A (en) | Improved fast-RCNN-based small target detection method in automatic driving scene | |
CN112489072B (en) | Vehicle-mounted video perception information transmission load optimization method and device | |
CN116342894B (en) | GIS infrared feature recognition system and method based on improved YOLOv5 | |
CN115937636A (en) | Traffic target detection method for unmanned driving based on deep learning | |
CN116994047A (en) | Small sample image defect target detection method based on self-supervision pre-training | |
CN115743101A (en) | Vehicle track prediction method, and track prediction model training method and device | |
CN114267025A (en) | Traffic sign detection method based on high-resolution network and light-weight attention mechanism | |
CN116258940A (en) | Small target detection method for multi-scale features and self-adaptive weights | |
CN113076988B (en) | Mobile robot vision SLAM key frame self-adaptive screening method based on neural network | |
CN113902753A (en) | Image semantic segmentation method and system based on dual-channel and self-attention mechanism | |
CN113901931A (en) | Knowledge distillation model-based behavior recognition method for infrared and visible light videos | |
CN111612803A (en) | Vehicle image semantic segmentation method based on image definition | |
CN112101113A (en) | Lightweight unmanned aerial vehicle image small target detection method | |
CN114359689B (en) | Dynamic target detection and tracking method | |
CN115861861A (en) | Lightweight acceptance method based on unmanned aerial vehicle distribution line inspection | |
Jiangzhou et al. | Research on real-time object detection algorithm in traffic monitoring scene | |
CN116189012A (en) | Unmanned aerial vehicle ground small target detection method based on improved YOLOX | |
Shao et al. | Research on yolov5 vehicle object detection algorithm based on attention mechanism | |
Li et al. | Infrared Small Target Detection Algorithm Based on ISTD-CenterNet. | |
CN115359271B (en) | Large-scale invariance deep space small celestial body image matching method | |
Wang et al. | Research on Vehicle Object Detection Based on Deep Learning | |
CN112861733B (en) | Night traffic video significance detection method based on space-time double coding | |
CN113076898B (en) | Traffic vehicle target detection method, device, equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||